An Awesome Collection of Urban Foundation Models (UFMs).
🌟 2024-05: Urban Foundation Models: A Survey has been accepted as a Tutorial Track Paper at KDD'24 and will be published in the conference proceedings. Additionally, we will host a tutorial on Urban Foundation Models at the KDD'24 conference. More details can be found on the tutorial website.
Urban Foundation Models (UFMs) are a family of large-scale models pre-trained on vast amounts of multi-source, multi-granularity, and multimodal urban data. They acquire general-purpose capabilities during pre-training and exhibit emergent abilities and adaptability across a range of urban application domains, such as transportation, urban planning, energy management, environmental monitoring, and public safety and security.
Urban Foundation Models: A Survey
Authors: Weijia Zhang, Jindong Han, Zhao Xu, Hang Ni, Hao Liu, Hui Xiong
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Authors: Weijia Zhang, Jindong Han, Zhao Xu, Hang Ni, Hao Liu, Hui Xiong
🌟 If you find this resource helpful, please consider starring this repository and citing our survey paper:
@inproceedings{ufmsurvey-kdd2024,
title={Urban Foundation Models: A Survey},
author={Zhang, Weijia and Han, Jindong and Xu, Zhao and Ni, Hang and Liu, Hao and Xiong, Hui},
booktitle={Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
pages={6633--6643},
year={2024}
}
@misc{zhang2024urban,
title={Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models},
author={Weijia Zhang and Jindong Han and Zhao Xu and Hang Ni and Hao Liu and Hui Xiong},
year={2024},
eprint={2402.01749},
archivePrefix={arXiv},
primaryClass={cs.CY}
}
- (KDD'22) ERNIE-GeoL: A Geography-and-Language Pre-trained Model and its Applications in Baidu Maps [paper]
- (arXiv 2023.10) Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning [paper]
- (arXiv 2023.05) GPT4GEO: How a Language Model Sees the World's Geography [paper]
- (arXiv 2023.05) On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence [paper]
- (arXiv 2023.05) ChatGPT is on the Horizon: Could a Large Language Model be Suitable for Intelligent Traffic Safety Research and Applications? [paper]
- (Urban Informatics) Towards Human-AI Collaborative Urban Science Research Enabled by Pre-trained Large Language Models [paper]
- (ICLR'24) GeoLLM: Extracting Geospatial Knowledge from Large Language Models [paper]
- (GIScience'23) Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations [paper]
- (SIGSPATIAL'23) Are Large Language Models Geospatially Knowledgeable? [paper]
- (SIGSPATIAL'23) Towards Understanding the Geospatial Skills of ChatGPT: Taking a Geographic Information Systems (GIS) Exam [paper]
- (arXiv 2024.06) CityGPT: Empowering Urban Spatial Cognition of Large Language Models [paper]
- (arXiv 2024.06) UrbanLLM: Autonomous Urban Activity Planning and Management with Large Language Models [paper]
- (arXiv 2024.03) LAMP: A Language Model on the Map [paper]
- (WSDM'24) K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization [paper]
- (EMNLP'23) GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding [paper]
- (KDD'23) QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search [paper]
- (TOIS'23) Improving First-stage Retrieval of Point-of-interest Search by Pre-training Models [paper]
- (EMNLP'22) SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation [paper]
- (WWW'23) Knowledge-infused Contrastive Learning for Urban Imagery-based Socioeconomic Prediction [paper]
- (CIKM'22) Predicting Multi-level Socioeconomic Indicators from Structural Urban Imagery [paper]
- (AAAI'20) Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding [paper]
- (TGRS'24) Change-Agent: Toward Interactive Comprehensive Remote Sensing Change Interpretation and Analysis [paper]
- (JSTARS'24) A Billion-scale Foundation Model for Remote Sensing Images [paper]
- (TGRS'23) A Decoupling Paradigm With Prompt Learning for Remote Sensing Image Change Captioning [paper]
- (TGRS'23) Foundation Model-Based Multimodal Remote Sensing Data Classification [paper]
- (TGRS'23) RingMo-Sense: Remote Sensing Foundation Model for Spatiotemporal Prediction via Spatiotemporal Evolution Disentangling [paper]
- (ICCV'23) Towards Geospatial Foundation Models via Continual Pretraining [paper]
- (ICCV'23) Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning [paper]
- (ICML'23) CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [paper]
- (TGRS'22) Advancing Plain Vision Transformer Toward Remote Sensing Foundation Model [paper]
- (TGRS'22) RingMo: A Remote Sensing Foundation Model With Masked Image Modeling [paper]
- (arXiv 2024.05) Aurora: A Foundation Model of the Atmosphere [paper]
- (arXiv 2023.04) FengWu: Pushing the Skillful Global Medium-range Weather Forecast beyond 10 Days Lead [paper]
- (arXiv 2023.04) W-MAE: Pre-trained Weather Model with Masked Autoencoder for Multi-variable Weather Forecasting [paper]
- (arXiv 2022.02) FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators [paper]
- (Nature'23) Accurate Medium-range Global Weather Forecasting with 3D Neural Networks [paper]
- (ICML'23) ClimaX: A Foundation Model for Weather and Climate [paper]
- (TGRS'24) RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model [paper]
- (NeurIPS'23) SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model [paper]
- (arXiv 2023.11) GeoSAM: Fine-tuning SAM with Sparse and Dense Visual Prompting for Automated Segmentation of Mobility Infrastructure [paper]
- (arXiv 2023.02) Learning Generalized Zero-Shot Learners for Open-Domain Image Geolocalization [paper]
- (NeurIPS'23) GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization [paper]
- (TGRS'23) RingMo-SAM: A Foundation Model for Segment Anything in Multimodal Remote-Sensing Images [paper]
- (IJAEOG'22) Migratable Urban Street Scene Sensing Method based on Vision Language Pre-trained Model [paper]
- (KDD'23) LightPath: Lightweight and scalable path representation learning [paper]
- (ICDM'23) Self-supervised Pre-training for Robust and Generic Spatial-Temporal Representations [paper]
- (TKDE'23) Pre-Training General Trajectory Embeddings With Maximum Multi-View Entropy Coding [paper]
- (ICDE'23) Self-supervised trajectory representation learning with temporal regularities and travel semantics [paper]
- (WWW'24) More Than Routing: Joint GPS and Route Modeling for Refine Trajectory Representation Learning [paper]
- (VLDBJ'22) Unified route representation learning for multi-modal transportation recommendation with spatiotemporal pre-training [paper]
- (CIKM'21) Robust road network representation learning: When traffic patterns meet traveling semantics [paper]
- (IJCAI'21) Unsupervised path representation learning with curriculum negative sampling [paper]
- (TIST'20) Trembr: Exploring road networks for trajectory representation learning [paper]
- (ICDE'18) Deep representation learning for trajectory similarity computation [paper]
- (IJCNN'17) Trajectory clustering via deep representation learning [paper]
- (arXiv 2024.08) TrajFM: A Vehicle Trajectory Foundation Model for Region and Task Transferability [paper]
- (AAAI'23) Contrastive pre-training with adversarial perturbations for check-in sequence representation learning [paper]
- (KBS'21) Self-supervised human mobility learning for next location prediction and trajectory classification [paper]
- (AAAI'21) Pre-training context and time aware location embeddings from spatial-temporal trajectories for user next location prediction [paper]
- (KDD'20) Learning to simulate human mobility [paper]
- (ToW'23) Pre-Training Across Different Cities for Next POI Recommendation [paper]
- (TIST'23) Doing more with less: overcoming data scarcity for POI recommendation via cross-region transfer [paper]
- (CIKM'21) Region invariant normalizing flows for mobility transfer [paper]
- (arXiv 2023.11) Exploring Large Language Models for Human Mobility Prediction under Public Events [paper]
- (arXiv 2023.10) Large Language Models for Spatial Trajectory Patterns Mining [paper]
- (arXiv 2023.10) GPT-Driver: Learning to drive with GPT [paper]
- (arXiv 2023.10) LanguageMPC: Large language models as decision makers for autonomous driving [paper]
- (arXiv 2023.09) Can you text what is happening? Integrating pre-trained language encoders into trajectory prediction models for autonomous driving [paper]
- (arXiv 2023.08) Where would I go next? Large language models as human mobility predictors [paper]
- (SIGSPATIAL'22) Leveraging language foundation models for human mobility forecasting [paper]
- (arXiv 2024.03) DrPlanner: Diagnosis and Repair of Motion Planners Using Large Language Models [paper]
- (ICML'24) A decoder-only foundation model for time-series forecasting [paper]
- (arXiv 2024.03) UniTS: Building a Unified Time Series Model [paper]
- (arXiv 2024.02) Timer: Transformers for Time Series Analysis at Scale [paper]
- (arXiv 2024.02) Generative Pretrained Hierarchical Transformer for Time Series Forecasting [paper]
- (arXiv 2024.02) TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling [paper]
- (arXiv 2024.01) TTMs: Fast Multi-level Tiny Time Mixers for Improved Zero-shot and Few-shot Forecasting of Multivariate Time Series [paper]
- (arXiv 2024.01) HiMTM: Hierarchical multi-scale masked time series modeling for long-term forecasting [paper]
- (arXiv 2023.12) Prompt-based Domain Discrimination for Multi-source Time Series Domain Adaptation [paper]
- (arXiv 2023.11) PT-Tuning: Bridging the Gap between Time Series Masked Reconstruction and Forecasting via Prompt Token Tuning [paper]
- (arXiv 2023.10) UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting [paper]
- (arXiv 2023.03) SimTS: Rethinking Contrastive Representation Learning for Time Series Forecasting [paper]
- (arXiv 2023.01) Ti-MAE: Self-Supervised Masked Time Series Autoencoders [paper]
- (NeurIPS'23) ForecastPFN: Synthetically-trained zero-shot forecasting [paper]
- (NeurIPS'23) SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling [paper]
- (NeurIPS'23) Lag-Llama: Towards foundation models for time series forecasting [paper]
- (ICLR'23) A Time Series is Worth 64 Words: Long-term Forecasting with Transformers [paper]
- (KDD'23) TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting [paper]
- (AAAI'22) TS2Vec: Towards Universal Representation of Time Series [paper]
- (ICLR'22) CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting [paper]
- (TNNLS'22) Self-Supervised Autoregressive Domain Adaptation for Time Series Data [paper]
- (IJCAI'21) Time-Series Representation Learning via Temporal and Contextual Contrasting [paper]
- (ICLR'21) Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding [paper]
- (AAAI'21) Meta-Learning Framework with Applications to Zero-Shot Time-Series Forecasting [paper]
- (AAAI'21) Time Series Domain Adaptation via Sparse Associative Structure Alignment [paper]
- (KDD'21) A Transformer-based Framework for Multivariate Time Series Representation Learning [paper]
- (KDD'20) Multi-Source Deep Domain Adaptation with Weak Supervision for Time-Series Sensor Data [paper]
- (NeurIPS'19) Unsupervised Scalable Representation Learning for Multivariate Time Series [paper]
- (KDD'24) UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction [paper]
- (NeurIPS'23) GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks [paper]
- (CIKM'23) Mask- and Contrast-Enhanced Spatio-Temporal Learning for Urban Flow Prediction [paper]
- (CIKM'23) Cross-city Few-Shot Traffic Forecasting via Traffic Pattern Bank [paper]
- (KDD'23) Transferable Graph Structure Learning for Graph-based Traffic Forecasting Across Cities [paper]
- (KDD'22) Selective Cross-City Transfer Learning for Traffic Prediction via Source City Region Re-Weighting [paper]
- (WSDM'22) ST-GSP: Spatial-Temporal Global Semantic Representation Learning for Urban Flow Prediction [paper]
- (SIGSPATIAL'22) When Do Contrastive Learning Signals Help Spatio-Temporal Graph Forecasting? [paper]
- (KDD'22) Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting [paper]
- (WWW'19) Learning from Multiple Cities: A Meta-Learning Approach for Spatial-Temporal Prediction [paper]
- (IJCAI'18) Cross-City Transfer Learning for Deep Spatio-Temporal Prediction [paper]
- (arXiv 2023.12) Prompt-based Domain Discrimination for Multi-source Time Series Domain Adaptation [paper]
- (arXiv 2023.11) PT-Tuning: Bridging the Gap between Time Series Masked Reconstruction and Forecasting via Prompt Token Tuning [paper]
- (arXiv 2023.05) Spatial-temporal Prompt Learning for Federated Weather Forecasting [paper]
- (CIKM'23) PromptST: Prompt-Enhanced Spatio-Temporal Multi-Attribute Prediction [paper]
- (IJCAI'23) Prompt Federated Learning for Weather Forecasting: Toward Foundation Models on Meteorological Data [paper]
- (TKDE'22) PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting [paper]
- (NeurIPS'23) Large Language Models Are Zero-Shot Time Series Forecasters [paper]
- (arXiv 2024.02) AutoTimes: Autoregressive Time Series Forecasters via Large Language Models [paper]
- (ICLR'24) TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting [paper]
- (arXiv 2024.03) TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models [paper]
- (arXiv 2024.01) How can large language models understand spatial-temporal data? [paper]
- (arXiv 2024.01) Spatial-temporal large language model for traffic prediction [paper]
- (arXiv 2023.11) One Fits All: Universal Time Series Analysis by Pretrained LM and Specially Designed Adaptors [paper]
- (arXiv 2023.11) GATGPT: A Pre-trained Large Language Model with Graph Attention Network for Spatiotemporal Imputation [paper]
- (arXiv 2023.08) LLM4TS: Two-Stage Fine-Tuning for Time-Series Forecasting with Pre-Trained LLMs [paper]
- (NeurIPS'23) One Fits All: Power General Time Series Analysis by Pretrained LM [paper]
- (arXiv 2024.08) Empowering Pre-Trained Language Models for Spatio-Temporal Forecasting via Decoupling Enhanced Discrete Reprogramming [paper]
- (KDD'24) UrbanGPT: Spatio-Temporal Large Language Models [paper]
- (ICLR'24) Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [paper]
- (arXiv 2023.08) TEST: Text Prototype Aligned Embedding to Activate LLM’s Ability for Time Series [paper]
- (KDD'24) ReFound: Crafting a Foundation Model for Urban Region Understanding upon Language and Visual Foundations [paper]
- (WWW'24) UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web [paper]
- (TITS'23) Parallel Transportation in TransVerse: From Foundation Models to DeCAST [paper]
- (arXiv 2023.12) AllSpark: A Multimodal Spatiotemporal General Model [paper]
- (arXiv 2023.10) City Foundation Models for Learning General Purpose Representations from OpenStreetMap [paper]
- (arXiv 2024.02) LLMLight: Large Language Models as Traffic Signal Control Agents [paper]
- (arXiv 2023.09) TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models [paper]
- (arXiv 2023.07) GeoGPT: Understanding and Processing Geospatial Tasks through An Autonomous GPT [paper]
- (ICML'24) GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model [paper]
- (AAAI'24) VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View [paper]
- (arXiv 2024.02) TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation [paper]
- (arXiv 2023.12) Urban Generative Intelligence (UGI): A Foundational Platform for Agents in Embodied City Environment [paper]
- (EDBT'23) Spatial Structure-Aware Road Network Embedding via Graph Contrastive Learning [paper]
- (CIKM'21) GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale [paper]
- (arXiv 2023.12) Large Language Models as Traffic Signal Control Agents: Capacity and Opportunity [paper]
- (arXiv 2023.08) LLM powered sim-to-real transfer for traffic signal control [paper]
- (arXiv 2023.06) Can ChatGPT Enable ITS? The Case of Mixed Traffic Control via Reinforcement Learning [paper]
👍 Contributions to this repository are welcome!
If you have come across relevant resources, feel free to open an issue or submit a pull request.
- (*conference|journal*) paper_name [[pdf](link)][[code](link)]