Date: February 28, 2025
Time: TBD
Location: TBD


Tutorial Description

The availability of geospatial data from diverse sources and perspectives has expanded rapidly in recent years, opening opportunities across many fields. Cross-view geo-localization, the task of matching ground-level images with aerial or satellite views to pinpoint geographic locations, has become an important research area because of its applications in autonomous navigation, urban planning, and augmented reality. Despite substantial progress, the field still faces significant challenges, including handling extreme viewpoint variations, ensuring reliability, and integrating multimodal information across perspectives. Recent developments in Generative AI (GenAI), including Large Vision-Language Models (LVLMs), have introduced more general approaches to incorporating multimodal data. These advances not only improve performance but also change how multimodal information is used in real-world settings. As the field evolves rapidly, this tutorial gives researchers, especially those new to the domain, an opportunity to learn the latest methodologies, explore recent datasets, and understand the emerging trends that will shape the future of cross-view geo-localization.


Organizers


Speakers


Schedule


Covered Publications

  • X. Zhang, X. Li, W. Sultani, Y. Zhou, and S. Wshah, “Cross-View Geo-Localization via Learning Disentangled Geometric Layout Correspondence,” AAAI, vol. 37, no. 3, pp. 3480–3488, Jun. 2023, doi: 10.1609/aaai.v37i3.25457.
  • X. Zhang, W. Sultani, and S. Wshah, “Cross-View Image Sequence Geo-Localization,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023, pp. 2914–2923.
  • A. Arrabi, X. Zhang, W. Sultani, C. Chen, and S. Wshah, “Cross-View Meets Diffusion: Aerial Image Synthesis with Geometry and Text Guidance,” arXiv:2408.04224, Aug. 20, 2024. doi: 10.48550/arXiv.2408.04224.
  • N. C. Mithun, K. S. Minhas, H.-P. Chiu, T. Oskiper, M. Sizintsev, S. Samarasekera, and R. Kumar, “Cross-View Visual Geo-Localization for Outdoor Augmented Reality,” in 2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR), Mar. 2023, pp. 493–502. doi: 10.1109/VR55154.2023.00064.
  • X. Zhang, X. Li, W. Sultani, C. Chen, and S. Wshah, “GeoDTR+: Toward Generic Cross-View Geolocalization via Geometric Disentanglement,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–19, 2024, doi: 10.1109/TPAMI.2024.3443652.
  • D. Wilson, X. Zhang, W. Sultani, and S. Wshah, “Image and Object Geo-Localization,” Int J Comput Vis, vol. 132, no. 4, pp. 1350–1392, Apr. 2024, doi: 10.1007/s11263-023-01942-3.
  • M. Chu, Z. Zheng, W. Ji, T. Wang, and T.-S. Chua, “Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching.”
  • Z. Zheng, Y. Wei, and Y. Yang, “University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization,” in Proceedings of the 28th ACM International Conference on Multimedia (MM ’20), New York, NY, USA: Association for Computing Machinery, Oct. 2020, pp. 1395–1403. doi: 10.1145/3394171.3413896.

Content: Xiaohan Zhang 2024.
