Zhang, Yuting Ph.D.
Senior Applied Scientist
Computer Vision Science @ Amazon Web Services (AWS)
––––––––––––––––––––––––––––––––––
Short Bio
Since August 2018, I have been an applied scientist on the Rekognition and Textract team at Amazon Web Services, where I develop machine-learning solutions to joint vision and language problems. Before that, I was a postdoctoral fellow (January 2016 – July 2018) and a visiting Ph.D. student (September 2013 – December 2015) working with Honglak Lee at the University of Michigan, Ann Arbor. I received my Ph.D. from Zhejiang University in December 2015, advised by Gang Pan. In 2012, I also worked with Kui (Chris) Jia and Yi Ma at the Advanced Digital Sciences Center (Singapore), UIUC.
Publications & Preprints
DocTr: Document Transformer for Structured Information Extraction in Documents
Haofu Liao, Aruni Roychowdhury, Weijian Li, Ankan Bansal, Yuting Zhang, Zhuowen Tu, Ravi Kumar Satzoda, R. Manmatha, Vijay Mahadevan
International Conference on Computer Vision (ICCV), October 2023.
[paper]
[arXiv]
@inproceedings{2023-iccv-doctr, author={Haofu Liao and Aruni Roychowdhury and Weijian Li and Ankan Bansal and Yuting Zhang and Zhuowen Tu and Ravi Kumar Satzoda and R. Manmatha and Vijay Mahadevan}, booktitle={International Conference on Computer Vision ({ICCV})}, title={{DocTr}: Document Transformer for Structured Information Extraction in Documents}, year={2023}, month={October}, url={http://www.ytzhang.net/files/publications/2023-iccv-doctr.pdf}, arxiv={2307.07929} }
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Jiang Liu, Hui Ding, Zhaowei Cai, Yuting Zhang, Vijay Mahadevan, Ravi Kumar Satzoda, R. Manmatha
Conference on Computer Vision and Pattern Recognition (CVPR), June 2023.
[paper]
[arXiv]
@inproceedings{2023-cvpr-polyformer, author={Jiang Liu and Hui Ding and Zhaowei Cai and Yuting Zhang and Vijay Mahadevan and Ravi Kumar Satzoda and R. Manmatha}, booktitle={Conference on Computer Vision and Pattern Recognition ({CVPR})}, title={{PolyFormer}: Referring Image Segmentation as Sequential Polygon Generation}, year={2023}, month={June}, url={http://www.ytzhang.net/files/publications/2023-cvpr-polyformer.pdf}, arxiv={2302.07387} }
Visual Relationship Detection Using Part-and-Sum Transformers with Composite Queries
Qi Dong, Zhuowen Tu, Haofu Liao, Yuting Zhang, Vijay Mahadevan, Stefano Soatto
International Conference on Computer Vision (ICCV), October 2021.
[paper (supp)]
[arXiv]
@inproceedings{2021-iccv-PST, author={Qi Dong and Zhuowen Tu and Haofu Liao and Yuting Zhang and Vijay Mahadevan and Stefano Soatto}, booktitle={International Conference on Computer Vision ({ICCV})}, title={Visual Relationship Detection Using Part-and-Sum Transformers with Composite Queries}, year={2021}, month={October}, url={http://www.ytzhang.net/files/publications/2021-iccv-PST.pdf}, arxiv={2105.02170} }
Humble Teachers Teach Better Students for Semi-Supervised Object Detection
Yihe Tang, Weifeng Chen, Yijun Luo, Yuting Zhang
Conference on Computer Vision and Pattern Recognition (CVPR), June 2021.
[paper]
[arXiv]
@inproceedings{2021-cvpr-humble-teacher, author={Yihe Tang and Weifeng Chen and Yijun Luo and Yuting Zhang}, booktitle={Conference on Computer Vision and Pattern Recognition ({CVPR})}, title={Humble Teachers Teach Better Students for Semi-Supervised Object Detection}, year={2021}, month={June}, url={http://www.ytzhang.net/files/publications/2021-cvpr-humble-teacher.pdf}, arxiv={2106.10456} }
Dynamic Grown Generative Adversarial Networks
Lanlan Liu, Yuting Zhang, Jia Deng, Stefano Soatto
AAAI Conference on Artificial Intelligence (AAAI), February 2021.
[paper]
[arXiv]
@inproceedings{2021-aaai-dggan, author={Lanlan Liu and Yuting Zhang and Jia Deng and Stefano Soatto}, booktitle={AAAI Conference on Artificial Intelligence ({AAAI})}, title={Dynamic Grown Generative Adversarial Networks}, year={2021}, month={February}, url={http://www.ytzhang.net/files/publications/2021-aaai-dggan.pdf}, arxiv={2106.08505} }
Visual Question Answering on Image Sets
Ankan Bansal, Yuting Zhang, Rama Chellappa
European Conference on Computer Vision (ECCV), August 2020.
[paper (main, appendices)]
[arXiv]
[project (ISVQA dataset and baseline code)]
@inproceedings{2020-eccv-isvqa, author={Ankan Bansal and Yuting Zhang and Rama Chellappa}, booktitle={European Conference on Computer Vision ({ECCV})}, title={Visual Question Answering on Image Sets}, year={2020}, month={August}, url={http://www.ytzhang.net/files/publications/2020-eccv-isvqa.pdf}, arxiv={2008.11976} }
Unsupervised Discovery of Object Landmarks as Structural Representations
Yuting Zhang, Yijie Guo, Yixin Jin, Yijun Luo, Zhiyuan He, Honglak Lee
Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
Oral presentation
[paper (main, appendices, supp-videos .tar.gz)]
[arXiv]
[project (code & results)]
[poster]
[slides]
[oral presentation .mp4]
@inproceedings{2018-cvpr-lmdis-rep, author={Yuting Zhang and Yijie Guo and Yixin Jin and Yijun Luo and Zhiyuan He and Honglak Lee}, booktitle={Conference on Computer Vision and Pattern Recognition ({CVPR})}, title={Unsupervised Discovery of Object Landmarks as Structural Representations}, year={2018}, month={June}, url={http://www.ytzhang.net/files/publications/2018-cvpr-lmdis-rep.pdf}, arxiv={1804.04412} }
Hierarchical Novelty Detection for Visual Object Recognition
Kibok Lee, Kimin Lee, Kyle Min, Yuting Zhang, Jinwoo Shin, Honglak Lee
Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[paper (supp)]
[arXiv]
[code]
@inproceedings{2018-cvpr-hierachical-novelty, author={Kibok Lee and Kimin Lee and Kyle Min and Yuting Zhang and Jinwoo Shin and Honglak Lee}, booktitle={Conference on Computer Vision and Pattern Recognition ({CVPR})}, title={Hierarchical Novelty Detection for Visual Object Recognition}, year={2018}, month={June}, url={http://www.ytzhang.net/files/publications/2018-cvpr-hierachical-novelty.pdf}, arxiv={1804.00722} }
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Yuting Zhang, Luyao Yuan, Yijie Guo, Zhiyuan He, I-An Huang, Honglak Lee
Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
Spotlight presentation
[paper 5M (high-res 24M)]
[arXiv]
[data & development toolbox]
[project (TensorFlow & Caffe Code)]
[slides (spotlight video)]
[poster]
@inproceedings{2017-cvpr-dbnet, author={Yuting Zhang and Luyao Yuan and Yijie Guo and Zhiyuan He and I-{An} Huang and Honglak Lee}, booktitle={Conference on Computer Vision and Pattern Recognition ({CVPR})}, title={Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries}, year={2017}, month={July}, url={http://www.ytzhang.net/files/publications/2017-cvpr-dbnet.pdf}, arxiv={1704.03944} }
Towards Understanding the Invertibility of Convolutional Neural Networks
Anna C. Gilbert, Yi Zhang, Kibok Lee, Yuting Zhang, Honglak Lee
International Joint Conference on Artificial Intelligence (IJCAI), August 2017.
[paper]
[arXiv]
@inproceedings{2017-ijcai-CNNmrip, author={Anna C. Gilbert and Yi Zhang and Kibok Lee and Yuting Zhang and Honglak Lee}, booktitle={International Joint Conference on Artificial Intelligence ({IJCAI})}, title={Towards Understanding the Invertibility of Convolutional Neural Networks}, year={2017}, month={August}, url={http://www.ytzhang.net/files/publications/2017-ijcai-CNNmrip.pdf}, arxiv={1705.08664} }
Augmenting Supervised Neural Networks with Unsupervised Objectives for Large-Scale Image Classification
Yuting Zhang, Kibok Lee, Honglak Lee
International Conference on Machine Learning (ICML), June 2016.
[paper (main, supp.)]
[arXiv]
[code & model]
[slides]
[poster]
[more image reconstruction examples]
@inproceedings{2016-icml-recon-dec, author={Yuting Zhang and Kibok Lee and Honglak Lee}, booktitle={International Conference on Machine Learning ({ICML})}, title={Augmenting Supervised Neural Networks with Unsupervised Objectives for Large-Scale Image Classification}, year={2016}, month={June}, url={http://www.ytzhang.net/files/publications/2016-icml-recon-dec.pdf}, arxiv={1606.06582}, pages={612-621} }
Deep Visual Analogy-Making
Scott Reed, Yi Zhang, Yuting Zhang, Honglak Lee
Advances in Neural Information Processing Systems (NIPS), December 2015.
Oral presentation
[paper]
[code]
[data]
@inproceedings{2015-nips-analogy, author={Scott Reed and Yi Zhang and Yuting Zhang and Honglak Lee}, booktitle={Advances in Neural Information Processing Systems ({NIPS})}, title={Deep Visual Analogy-Making}, year={2015}, month={December}, url={http://www.ytzhang.net/files/publications/2015-nips-analogy.pdf} }
Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
Yuting Zhang, Kihyuk Sohn, Ruben Villegas, Gang Pan, Honglak Lee
Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
doi: 10.1109/CVPR.2015.7298621
Oral presentation & 1st Winner of CV Community Top Paper Award: CVPR 2015 (OpenCV’s People’s Vote Winning Papers) [link]
[paper (main, supp.)]
[arXiv]
[project (code & model)]
[slides 7M (high-res 45M)]
[poster]
@inproceedings{2015-cvpr-det, author={Yuting Zhang and Kihyuk Sohn and Ruben Villegas and Gang Pan and Honglak Lee}, booktitle={Conference on Computer Vision and Pattern Recognition ({CVPR})}, title={Improving Object Detection with Deep Convolutional Networks via {Bayesian} Optimization and Structured Prediction}, year={2015}, month={June}, url={http://www.ytzhang.net/files/publications/2015-cvpr-det.pdf}, arxiv={1504.03293}, pages={249-258}, doi={10.1109/CVPR.2015.7298621} }
Single Sample Face Recognition via Learning Deep Supervised Autoencoders
Shenghua Gao, Yuting Zhang, Kui Jia, Jiwen Lu, Yingying Zhang
IEEE Transactions on Information Forensics and Security, vol. 10, no. 10, pp. 2108-2118, October 2015.
doi: 10.1109/TIFS.2015.2446438
[paper]
@article{2015-tifs-sup-ae, author={Shenghua Gao and Yuting Zhang and Kui Jia and Jiwen Lu and Yingying Zhang}, title={Single Sample Face Recognition via Learning Deep Supervised Autoencoders}, year={2015}, month={October}, url={http://www.ytzhang.net/files/publications/2015-tifs-sup-ae.pdf}, pages={2108-2118}, volume={10}, number={10}, doi={10.1109/TIFS.2015.2446438}, journal={{IEEE} Transactions on Information Forensics and Security}, issn={1556-6013} }
Robust Face Recognition by Constrained Part-based Alignment
Yuting Zhang, Kui Jia, Yueming Wang, Gang Pan, Tsung-Han Chan, Yi Ma
ArXiv preprint, 2015.
[paper]
[arXiv]
@article{2015-preprint-cpa, author={Yuting Zhang and Kui Jia and Yueming Wang and Gang Pan and Tsung-Han Chan and Yi Ma}, title={Robust Face Recognition by Constrained Part-based Alignment}, year={2015}, url={http://www.ytzhang.net/files/publications/2015-preprint-cpa.pdf}, arxiv={1501.04717}, journal={ArXiv preprint} }
Accelerometer-based Gait Recognition by Sparse Representation of Signature Points with Clusters
Yuting Zhang, Gang Pan, Kui Jia, Minlong Lu, Yueming Wang, Zhaohui Wu
IEEE Transactions on Cybernetics, vol. 45, no. 9, pp. 1864-1875, September 2015.
doi: 10.1109/TCYB.2014.2361287
[paper]
[dataset]
[code]
@article{2015-tcyb-gait, author={Yuting Zhang and Gang Pan and Kui Jia and Minlong Lu and Yueming Wang and Zhaohui Wu}, title={Accelerometer-based Gait Recognition by Sparse Representation of Signature Points with Clusters}, year={2015}, month={September}, url={http://www.ytzhang.net/files/publications/2015-tcyb-gait.pdf}, pages={1864-1875}, volume={45}, number={9}, doi={10.1109/TCYB.2014.2361287}, journal={{IEEE} Transactions on Cybernetics}, issn={2168-2267} }
Learning to Disentangle Factors of Variation with Manifold Interaction
Scott Reed, Kihyuk Sohn, Yuting Zhang, Honglak Lee
International Conference on Machine Learning (ICML), 2014.
[paper]
[code]
@inproceedings{2014-icml-disentangling, author={Scott Reed and Kihyuk Sohn and Yuting Zhang and Honglak Lee}, booktitle={International Conference on Machine Learning ({ICML})}, title={Learning to Disentangle Factors of Variation with Manifold Interaction}, year={2014}, url={http://www.ytzhang.net/files/publications/2014-icml-disentangling.pdf} }
L1-Norm Latent SVM for Compact Features in Object Detection
Min Tan, Gang Pan, Yueming Wang, Yuting Zhang, Zhaohui Wu
Neurocomputing, vol. 139, pp. 56-64, 2014.
doi: 10.1016/j.neucom.2013.09.054
@article{2014-neurocomp-l1lsvm, author={Min Tan and Gang Pan and Yueming Wang and Yuting Zhang and Zhaohui Wu}, title={L1-Norm Latent {SVM} for Compact Features in Object Detection}, year={2014}, pages={56-64}, volume={139}, doi={10.1016/j.neucom.2013.09.054}, journal={Neurocomputing}, issn={0925-2312} }
Efficient Computation of Histograms on Densely Overlapped Polygonal Regions
Yuting Zhang, Yueming Wang, Gang Pan, Zhaohui Wu
Neurocomputing, vol. 118, pp. 141-149, 2013.
doi: 10.1016/j.neucom.2013.02.027
[paper]
[code]
@article{2013-neurocomp-polyhist, author={Yuting Zhang and Yueming Wang and Gang Pan and Zhaohui Wu}, title={Efficient Computation of Histograms on Densely Overlapped Polygonal Regions}, year={2013}, url={http://www.ytzhang.net/files/publications/2013-neurocomp-polyhist.pdf}, pages={141-149}, volume={118}, doi={10.1016/j.neucom.2013.02.027}, journal={Neurocomputing}, issn={0925-2312} }
GPU-Accelerated Parallel Realistic 3D Facial Expression Synthesis
Song Han, Gang Pan, Junkang Fu, Yuting Zhang
Journal of Computer-Aided Design and Computer Graphics (Chinese), vol. 23, no. 5, pp. 747-755, May 2011.
@article{2011-jcadcg-gpu-face, author={Song Han and Gang Pan and Junkang Fu and Yuting Zhang}, title={{GPU}-Accelerated Parallel Realistic {3D} Facial Expression Synthesis}, year={2011}, month={May}, pages={747-755}, volume={23}, number={5}, journal={Journal of Computer-Aided Design and Computer Graphics (Chinese)} }
Removal of 3D Facial Expressions: a Learning-based Approach
Gang Pan, Song Han, Zhaohui Wu, Yuting Zhang
Conference on Computer Vision and Pattern Recognition (CVPR), June 2010.
[paper]
@inproceedings{2010-cvpr-3dface, author={Gang Pan and Song Han and Zhaohui Wu and Yuting Zhang}, booktitle={Conference on Computer Vision and Pattern Recognition ({CVPR})}, title={Removal of {3D} Facial Expressions: a Learning-based Approach}, year={2010}, month={June}, url={http://www.ytzhang.net/files/publications/2010-cvpr-3dface.pdf}, pages={2614-2621} }
Accelerometer-based Gait Recognition via Voting by Signature Points
Gang Pan, Yuting Zhang, Zhaohui Wu
Electronics Letters, vol. 45, no. 22, pp. 1116-1118, October 2009.
doi: 10.1049/el.2009.2301
PRC Patent: 200910153244.2
[paper]
[related slides]
@article{2009-el-gait, author={Gang Pan and Yuting Zhang and Zhaohui Wu}, title={Accelerometer-based Gait Recognition via Voting by Signature Points}, year={2009}, month={October}, url={http://www.ytzhang.net/files/publications/2009-el-gait.pdf}, pages={1116-1118}, volume={45}, number={22}, doi={10.1049/el.2009.2301}, journal={Electronics Letters}, issn={0013-5194} }
Professional Activities
- Organizer of NeurIPS 2020 EXPO Demonstration – “AWS Computer Vision Science”
- Organizer of CVPR 2020 Workshop on Text and Documents in the Deep Learning Era
- Conference Reviewer / PC member:
  - CVPR 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024
  - ICCV 2017, 2019, 2021, 2023
  - ECCV 2018, 2020, 2022
  - NeurIPS 2016, 2017, 2018, 2019, 2020, 2021, 2022
  - ICML 2016, 2017, 2018, 2019, 2021, 2022, 2023
  - ICLR 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023
  - IJCAI 2016, 2017, 2018 (distinguished PC), 2019, 2020
  - AAAI 2018, 2020
  - AISTATS 2017, 2018, 2019, 2020, 2021, 2022
- Journal Reviewer:
  - IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  - IEEE Transactions on Image Processing (TIP)
  - Neural Computation
Talks
- Guest Lectures: Artificial Neural Networks and Deep Learning
  In EECS 592 (AI Foundations), University of Michigan, Nov 27 & Nov 29, Fall 2017.
- Guest Lectures: Artificial Neural Networks and Deep Learning
  In EECS 492 (Introduction to Artificial Intelligence), University of Michigan, Nov 30 & Dec 5, Fall 2017.
- Invited Talk: Object Detection Using Deep Neural Networks
  In a2-dlearn2016 (official website), Ann Arbor, MI, USA, Nov 2016.
- Invited Talk: Accelerometer-based gait recognition [slides]
  In IWCST'11 (BUAA-Tsukuba-ZJU workshop), Beijing, China, Oct 2011.