Hao Feng (冯浩)

Researcher @ Tencent Hunyuan
Multimodal Foundation Model · Ph.D. from USTC
haof@mail.ustc.edu.cn

About Me

I received my Ph.D. from the University of Science and Technology of China (USTC), where my advisors were Prof. Houqiang Li and Prof. Wengang Zhou. I obtained my Bachelor's degree from Xidian University in 2018. I currently work on multimodal foundation models at Tencent Hunyuan. Previously, I worked on Document AI and multimodal LLMs at ByteDance.

Publications

Dolphin
Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting
Hao Feng, Shu Wei, Xiang Fei, Wei Shi, Yingdong Han, Lei Liao, Jinghui Lu, Binghong Wu, Qi Liu, Chunhui Lin, Jingqun Tang, Hao Liu, Can Huang
ACL 2025
DocScanner
DocScanner: Robust Document Image Rectification with Progressive Learning
Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li
IJCV 2025
DocPedia
DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding
Hao Feng, Qi Liu, Hao Liu, Jingqun Tang, Wengang Zhou, Houqiang Li, Can Huang
SCIS 2024
DeepEraser
DeepEraser: Deep Iterative Context Mining for Generic Text Eraser
Hao Feng, Wendi Wang, Shaokai Liu, Jiajun Deng, Wengang Zhou, Houqiang Li
TMM 2024
PolySnake
Recurrent Generic Contour-based Instance Segmentation with Progressive Learning
Hao Feng, Keyi Zhou, Wengang Zhou, Yufei Yin, Jiajun Deng, Qi Sun, Houqiang Li
TCSVT 2024
DocTr++
Deep Unrestricted Document Image Rectification
Hao Feng, Shaokai Liu, Jiajun Deng, Wengang Zhou, Houqiang Li
TMM 2023
SimFIR
SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning
Hao Feng, Wendi Wang, Jiajun Deng, Wengang Zhou, Li Li, Houqiang Li
ICCV 2023
DocGeoNet
Geometric Representation Learning for Document Image Rectification
Hao Feng, Wengang Zhou, Jiajun Deng, Yuechen Wang, Houqiang Li
ECCV 2022
DocTr
DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction
Hao Feng, Yuechen Wang, Wengang Zhou, Jiajun Deng, Houqiang Li
ACM MM 2021 Oral
RDTR
Model-Aware Pre-Training for Radial Distortion Rectification
Wendi Wang, Hao Feng, Wengang Zhou, Zhaokang Liao, Houqiang Li
TIP 2023
DRNet
Rethinking Supervision in Document Unwarping: A Self-consistent Flow-free Approach
Shaokai Liu, Hao Feng, Wengang Zhou
TCSVT 2023
IP-SLT
Sign Language Translation with Iterative Prototype
Huijie Yao, Wengang Zhou, Hao Feng, Hezhen Hu, Hao Zhou, Houqiang Li
ICCV 2023
PRNet
Progressive Recurrent Network for Shadow Removal
Yonghui Wang, Wengang Zhou, Hao Feng, Li Li, Houqiang Li
CVIU 2023

Manuscripts

UniDoc
UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding
Hao Feng, Zijian Wang, Jingqun Tang, Jinghui Lu, Hao Liu, Wengang Zhou, Houqiang Li, Can Huang
Preprint
TextCoT
TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding
Bozhi Luan, Hao Feng, Hong Chen, Wengang Zhou, Houqiang Li
Preprint

Academic Services

Invited reviewer for journals and conferences including TMM, TCSVT, CVPR, ICCV, ECCV, AAAI, ACM MM, ACL, and SIGGRAPH.
