Cross-View Geo-localization: Past, Present, and Future

Speaker: Chen Chen

Time: 8:40 - 9:20

Abstract:

Cross-view geo-localization is the task of determining the geographic location of a ground-level image by matching it against geo-referenced aerial or satellite imagery. Over the years, the field has evolved from early approaches based on hand-crafted visual descriptors such as HOG and GIST to methods that rely on deep learning and latent feature representations. This talk will provide an overview of the major developments and breakthroughs in cross-view geo-localization research, tracing the evolution of key techniques and state-of-the-art models. It will also outline future research directions, such as integrating Large Multimodal Models (LMMs) or enabling advanced robotic navigation with Vision-Language-Action (VLA) models. The session aims to offer foundational knowledge for researchers new to the field while providing insights for those already experienced in this area.