[Seminar] Grounding World Simulation Models in a Real-World Metropolis
* Seminar Overview
Date : Wednesday, April 8, 2026, 16:00–17:00
Venue : ECC Room B131
Contact : Prof. Jiyoung Lee (Major in Artificial Intelligence) / lee.jiyoung@ewha.ac.kr
* Seminar Topic
Seoul World Model: Grounding World Simulation Models in a Real-World Metropolis
(※ See the detailed abstract below)
Recent advances in generative world models have enabled the synthesis of visually plausible environments, yet most approaches remain limited to fully imagined worlds without grounding in physical reality. In this talk, we present Seoul World Model (SWM), a city-scale world simulation model that generates videos of a real-world metropolis by leveraging large-scale street-view data from Seoul. SWM formulates video generation as a retrieval-augmented process, where nearby street-view images are used to anchor autoregressive generation to real-world geometry and appearance. This grounding, however, introduces unique challenges, including temporal misalignment between retrieved references and dynamic scenes, limited trajectory diversity, and sparsity in vehicle-captured data. To address these issues, we propose cross-temporal pairing for robust supervision, a view interpolation pipeline that transforms sparse street-view images into coherent training videos, and a Virtual Lookahead Sink mechanism that stabilizes long-horizon generation through continuous future re-grounding. We evaluate SWM across multiple cities, including Busan and Ann Arbor, demonstrating its ability to generate spatially faithful and temporally consistent urban videos over long trajectories spanning hundreds of meters. Beyond passive generation, SWM supports diverse camera movements and text-driven scenario control, opening new possibilities for real-world grounded simulation in applications such as autonomous driving, urban planning, and scenario forecasting.
Video demos and additional materials are available at: https://seoul-world-model.github.io
* Speaker Information
Jin-Hwa Kim
Leader of Generation Research, NAVER AI Lab (since Aug 2021)
Guest Associate Professor, SNU AIIS (since Aug 2022)
(See the full biography below)
Jin-Hwa Kim has been the Leader of Generation Research at NAVER AI Lab since August 2021 and a Guest Associate Professor at the Artificial Intelligence Institute of Seoul National University (SNU AIIS) since August 2022. His research covers multimodal deep learning, multimodal generation, ethical and safe AI, and related topics. In 2018, he received a Ph.D. from Seoul National University under the supervision of Professor Byoung-Tak Zhang for his work on “Multimodal Deep Learning for Visually-grounded Reasoning.” He received the 2017 Google Ph.D. Fellowship in Machine Learning in September 2017, as well as a Ph.D. Completion Scholarship from Seoul National University, and was a runner-up in the VQA Challenge 2018 at the CVPR 2018 VQA Challenge and Visual Dialog Workshop. He was a Research Intern at Facebook AI Research (Menlo Park, CA), mentored by Yuandong Tian, Devi Parikh, and Dhruv Batra, from January to May 2017.