This is the project page for the following paper:
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Yuting Zhang, Luyao Yuan, Yijie Guo, Zhiyuan He, I-An Huang, Honglak Lee
Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
Spotlight presentation
[paper 5M (high-res 24M)]
[arXiv]
[data & development toolbox]
[project (TensorFlow & Caffe Code)]
[slides (spotlight video)]
[poster]
@inproceedings{2017-cvpr-dbnet,
  author={Yuting Zhang and Luyao Yuan and Yijie Guo and Zhiyuan He and I-{An} Huang and Honglak Lee},
  booktitle={Conference on Computer Vision and Pattern Recognition ({CVPR})},
  title={Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries},
  year={2017},
  month={July},
  url={http://www.ytzhang.net/files/publications/2017-cvpr-dbnet.pdf},
  arxiv={1704.03944}
}
Code
We release code and models in both MATLAB+Caffe (used to obtain the results in our paper) and Python+TensorFlow. The code is available from our GitHub pages:
- DBNet Code & Models (Caffe+MATLAB, initial implementation): https://github.com/YutingZhang/dbnet-caffe-matlab
- DBNet Code & Models (TensorFlow+Python, replication): https://github.com/yuanluya/dbnet_tensorflow
We also provide an independent development and evaluation toolbox for visual localization and detection with natural language queries:
- Data and development toolbox: https://github.com/YutingZhang/nlvd_evaluation
Spotlight video