minyoung huh (jacob)

I am a PhD student working on machine learning and computer vision at MIT CSAIL / Embodied Intelligence, working with Phillip Isola and Pulkit Agrawal. I received my Bachelor's degree from UC Berkeley, where I was advised by Alexei (Alyosha) Efros at Berkeley AI Research (BAIR). Prior to joining BAIR, I worked with Maysam Chamanzar and Michel Maharbiz at the Swarm Lab.

I spent some time at Google Research, Facebook (Meta) Research, Adobe Research, and Snap Research.

My research interests lie in creating more intelligent and efficient machine learning algorithms. Specifically, I am interested in deep learning theory, generative models, and continual learning.

Email  /  Google Scholar  /  Github


Totems: Physical Objects for Verifying Visual Integrity
Jingwei Ma, Lucy Chai, Minyoung Huh, Tongzhou Wang, Phillip Isola, Antonio Torralba
ECCV, 2022

We introduce a new approach to image forensics: placing physical refractive objects, which we call totems, into a scene so as to protect any photograph taken of that scene. Totems bend and redirect light rays, thus providing multiple, albeit distorted, views of the scene within a single image, which we unscramble to reconstruct the underlying 3D scene.

The Low-Rank Simplicity Bias in Deep Networks
Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola
arXiv, 2021

We conjecture that deep networks are implicitly biased to find lower-rank solutions and that these are the solutions that generalize well. We further demonstrate that linear over-parameterization can be used as an implicit regularizer to improve generalization without changing the effective model capacity.

Learning to Ground Multi-Agent Communication with Autoencoders
Toru Lin, Minyoung Huh, Chris Stauffer, Ser-Nam Lim, Phillip Isola
NeurIPS, 2021

We demonstrate a simple way to ground language in learned representations, which facilitates decentralized multi-agent communication and coordination. We find that a standard representation learning algorithm -- autoencoding -- is sufficient for arriving at a grounded common language.

Transforming and Projecting Images into Class-conditional Generative Networks
Minyoung Huh, Richard Zhang, Jun-Yan Zhu, Sylvain Paris, Aaron Hertzmann
ECCV, 2020 (Oral Presentation)

We propose a method for projecting images into the space of generative neural networks. We optimize for a transformation to counteract the model biases in generative neural networks. We further propose the use of hybrid (gradient + CMA-ES) optimization to improve model inversion.

Feedback Adversarial Learning: Spatial Feedback for Improving Generative Adversarial Networks
Minyoung Huh, Shao-Hua Sun, Ning Zhang
CVPR, 2019

We propose a feedback adversarial learning (FAL) framework that improves existing generative adversarial networks by incorporating spatial feedback from the discriminator into the generative process.

Fighting Fake News: Detecting Malicious Image Manipulations via Learned Self-Consistency
Minyoung Huh*, Andrew Liu*, Alexei A. Efros, Andrew Owens (* equal contribution)
ECCV, 2018

We propose a learning algorithm for detecting visual image manipulations that is trained using only a large dataset of real photographs. The algorithm uses the automatically recorded photo EXIF metadata as a supervisory signal for training a model to determine whether an image is self-consistent.

Multi-view to Novel view: Synthesizing Views via Self-Learned Confidence
Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, Joseph J. Lim
ECCV, 2018

We propose an end-to-end trainable framework that learns to exploit multiple viewpoints to synthesize a novel view without any 3D supervision. Our model consists of a flow prediction module and a pixel generation module to directly leverage information presented in source views as well as hallucinate missing pixels from statistical priors.

What makes ImageNet good for Transfer Learning?
Minyoung Huh, Pulkit Agrawal, Alexei A. Efros
NeurIPS workshop, 2016

This work empirically investigates what makes ImageNet good for transfer learning, examining the importance of the number of examples, the number of classes, the balance between images-per-class and classes, and the role of fine- and coarse-grained recognition. We pre-train CNN features on various subsets of the ImageNet dataset and evaluate transfer performance on a variety of standard vision tasks.

Ultrasonic sculpting of virtual, steerable optical waveguides in tissue
Maysam Chamanzar, Matteo Giuseppe Scopelliti, Julien Bloch, Ninh Do, Minyoung Huh, Dongjin Seo, Jillian Iafrati, Vikaas S. Sohal, Mohammad-Reza Alam, Michel M. Maharbiz
Nature Communications, 2019

We demonstrate that ultrasound can be used to define and steer the trajectory of light within scattering media by exploiting local pressure differences created by acoustic waves that result in refractive index contrasts.

Virtual Acousto-optic Beam Paths for Steerable Deep-tissue Optical Stimulation and Imaging
Maysam Chamanzar, Minyoung Huh, Ninh Do, Mohammad-Reza Alam, Michel M. Maharbiz
CLEO, 2016 & SfN, 2016

We present the first non-invasive methodology for delivering and steering light deep inside the brain by creating reconfigurable light paths with ultrasonic waves that modulate the refractive and diffractive properties of the medium.


MIT Fall 2021 - Deep Learning

MIT IAP 2021 - Deep Learning for Control

UC Berkeley 2016 - Designing Information Devices and Systems II


A fun little The New Yorker article on image generation that I participated in (very briefly).

The website template is from Jon Barron.