Jihyung Kil

Ph.D. Candidate
Computer Science
The Ohio State University
Email: kil.5@osu.edu

LinkedIn / Twitter / Google Scholar / GitHub

I am a Ph.D. candidate in Computer Science and Engineering at The Ohio State University, advised by Wei-Lun (Harry) Chao.
I am interested in machine learning and its applications to vision-language problems. Recently, I have focused on the following topics:

  • Multimodal Foundation Models
  • Multimodal Agents (Web Agents, Embodied AI)
  • Visual Question Answering, Scene-Text Understanding
  • Few/Zero-Shot Learning

🔥 I am on the job market in 2024. Please contact me if you are interested in my research. 🔥

Experience

Amazon Alexa AI              Research Intern               May 2023 - Dec 2023
Google Research              Research Intern               May 2022 - Nov 2022
The Ohio State University    Graduate Research Assistant   Aug 2018 - Present

News
Mar, 2024 I was selected for the Doctoral Consortium at CVPR 2024.
Feb, 2024 Our Dual-VCR on Web Navigation accepted to CVPR 2024.
Jan, 2024 Our II-MMR on Visual Question Answering now available on arXiv.
Jan, 2024 Our SeeAct on Web Navigation now available on arXiv.
Jul, 2023 Our PreSTU on Scene-Text Understanding accepted to ICCV 2023.
Mar, 2022 Our M-Track on Vision and Language Navigation accepted to CVPR 2022.
Dec, 2021 Our team was selected to participate in the Amazon Alexa Prize SimBot Challenge.
Aug, 2021 Our SimpleAug on Visual Question Answering accepted to EMNLP 2021.
Apr, 2021 Our paper on Zero-Shot Learning accepted to NAACL 2021.
Research
  1. preprint
    II-MMR: Identifying and Improving Multi-modal Multi-hop Reasoning in Visual Question Answering
    Jihyung Kil, Farideh Tavazoee, Dongyeop Kang, Joo-Kyung Kim
    arXiv 2024
  2. preprint
    GPT-4V(ision) is a Generalist Web Agent, if Grounded
    Boyuan Zheng, Boyu Gou, Jihyung Kil, Huan Sun, Yu Su
    arXiv 2024
  3. CVPR
    Dual-View Visual Contextualization for Web Navigation
    Jihyung Kil, Chan Hee Song, Boyuan Zheng, Xiang Deng, Yu Su, Wei-Lun Chao
    CVPR 2024
  4. ICCV
    PreSTU: Pre-Training for Scene-Text Understanding
    Jihyung Kil, Soravit Changpinyo, Xi Chen, Hexiang Hu, Sebastian Goodman, Wei-Lun Chao, Radu Soricut
    ICCV 2023
  5. CVPR
    One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones
    Chan Hee Song, Jihyung Kil, Tai-Yu Pan, Brian M Sadler, Wei-Lun Chao, Yu Su
    CVPR 2022
  6. EMNLP
    Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering
    Jihyung Kil, Cheng Zhang, Dong Xuan, Wei-Lun Chao
    EMNLP 2021
  7. NAACL
    Revisiting Document Representations for Large-Scale Zero-Shot Learning
    Jihyung Kil, Wei-Lun Chao
    NAACL 2021