Education

  • 2014-2019 Ph.D. Candidate, Northwestern Polytechnical University
  • 2010-2014 B.S. Honors College, Northwestern Polytechnical University

Work Experience

  • 2020-Present Assistant Professor, Renmin University of China
  • 2019-2020 Research Scientist, Baidu Research

Research Interests

Machine Multimodal Perception and Learning: exploring the open problems and methods of multimodal information (such as image, sound, and touch) for machine perception, reasoning, and understanding, with the goal of equipping machines with “multisensory cognitive ability”.

Prospective Students/Staff

Curious about the world around them, self-driven, and aiming to do interesting, meaningful, and valuable research.

Publications

2022
SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance
Xinchi Zhou, Dongzhan Zhou, Wanli Ouyang, Hang Zhou, Di Hu
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Exploiting Visual Context Semantics for Sound Source Localization
Xinchi Zhou, Dongzhan Zhou, Di Hu, Hang Zhou, Wanli Ouyang
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Self-supervised Learning for Heterogeneous Audiovisual Scene Analysis
Di Hu, Zheng Wang, Feiping Nie, Rong Wang, Xuelong Li
TMM

Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li*, Yake Wei*, Yapeng Tian*, Chenliang Xu, Ji-Rong Wen, Di Hu
CVPR (ORAL)

Balanced Multimodal Learning via On-the-fly Gradient Modulation
Xiaokang Peng*, Yake Wei*, Andong Deng, Dong Wang, Di Hu
CVPR (ORAL)

SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation
Dongzhan Zhou, Xinchi Zhou, Di Hu*, Hang Zhou, Lei Bai, Ziwei Liu, Wanli Ouyang
AAAI

Visual Sound Localization in-the-Wild by Cross-Modal Interference Erasing
Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou
AAAI

2021
Class-aware Sounding Objects Localization via Audiovisual Correspondence
Di Hu, Yake Wei, Rui Qian, Weiyao Lin, Ruihua Song, Ji-Rong Wen
TPAMI

Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
Yapeng Tian, Di Hu*, Chenliang Xu*
CVPR

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification
Zechen Bai, Zhigang Wang, Jian Wang, Di Hu*, Errui Ding*
CVPR

Temporal Relational Modeling with Self-Supervision for Action Segmentation
Dong Wang, Di Hu*, Xingjian Li, Dejing Dou
AAAI

2020
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching
Di Hu, Rui Qian, Minyue Jiang, Xiao Tan, Shilei Wen, Errui Ding, Weiyao Lin, Dejing Dou
NeurIPS

A Two-Stage Framework for Multiple Sound-Source Localization
Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu, Weiyao Lin
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2020.

Co-Learn Sounding Object Visual Grounding and Visually Indicated Sound Separation in A Cycle
Yapeng Tian, Di Hu, Chenliang Xu
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2020.

Does Ambient Sound Help? - Audiovisual Crowd Counting
Di Hu, Lichao Mou, Qingzhong Wang, Junyu Gao, Yuansheng Hua, Dejing Dou, and Xiaoxiang Zhu
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2020.

Heterogeneous Scene Analysis via Self-supervised Audiovisual Learning
Di Hu, Zheng Wang, Haoyi Xiong, Dong Wang, Feiping Nie, and Dejing Dou
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2020.

Multiple Sound Sources Localization from Coarse to Fine
Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu, and Weiyao Lin
In Proceedings of the European Conference on Computer Vision (ECCV), 2020.

Cross-Task Transfer for Multimodal Aerial Scene Recognition
Di Hu, Xuhong Li, Lichao Mou, Pu Jin, Dong Chen, Liping Jing, Xiaoxiang Zhu, and Dejing Dou
In Proceedings of the European Conference on Computer Vision (ECCV), 2020.

2019
Dense Multimodal Fusion for Hierarchically Joint Representation
Di Hu, Chengze Wang, Feiping Nie, and Xuelong Li
In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.

Listen to the Image
Di Hu, Dong Wang, Feiping Nie, Qi Wang, and Xuelong Li
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. (CCF A)

Deep Multimodal Clustering for Unsupervised Audiovisual Learning
Di Hu, Feiping Nie, and Xuelong Li
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. (CCF A)

Deep Linear Discriminant Analysis Hashing
Di Hu, Feiping Nie, and Xuelong Li
Sci Sin Inform, 2019. (CCF A)

2018
Deep Binary Reconstruction for Cross-modal Hashing
Di Hu, Feiping Nie, and Xuelong Li
IEEE Trans. Multimedia (TMM), 2018.

Discrete Spectral Hashing for Efficient Similarity Retrieval
Di Hu, Feiping Nie, and Xuelong Li
IEEE Trans. Image Processing (TIP), 2018. (CCF A)

2017
Large Graph Hashing with Spectral Rotation
Xuelong Li, Di Hu, and Feiping Nie
In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2017. (CCF A)

Deep Binary Reconstruction for Cross-modal Hashing
Xuelong Li, Di Hu, and Feiping Nie
In Proceedings of the ACM Conference on Multimedia (ACMMM), 2017. (CCF A)

Image2song: Song Retrieval via Bridging Image Content and Lyric Words
Xuelong Li, Di Hu, and Xiaoqiang Lu
In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. (CCF A)

2016
Temporal Multimodal Learning in Audiovisual Speech Recognition
Di Hu, Xuelong Li, and Xiaoqiang Lu
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. (CCF A)

Multimodal Learning via Exploring Deep Semantic Similarity
Di Hu, Xiaoqiang Lu, and Xuelong Li
In Proceedings of the ACM Conference on Multimedia (ACMMM), 2016. (CCF A)

Honors and Awards

  • 2020.9 Won the 2020 CAAI Outstanding Doctoral Dissertation Award
  • 2019.8 Selected by the『AIDU』Talent Recruitment Project of Baidu
  • 2019.8 Won the 2019 ACM Xi'an Doctoral Dissertation Award
  • 2019.5 Selected by the CVPR 2019 Doctoral Consortium

Services

  • Journal Reviewer: TIP, TKDE, TMM, Neurocomputing
  • Conference Program Committee: NeurIPS 2020, CVPR 2018 2020, ICCV 2019, ECCV 2020, AAAI 2018 2020, ACCV 2018 2020
  • Co-organizer: ICDM 2019 Tutorial on Automated Deep Learning: Theory, Algorithms, Platforms, and Applications

Contact

Tel:

Email: dihu[at]ruc.edu.cn

Website: https://dtaoo.github.io/

Address: