No One Knows the State of the Art in Geospatial Foundation Models
Authors: Isaac Corley, Nils Lehmann, Caleb Robinson, Gabriel Tseng, Anthony Fuller, Hamed Alemohammad, Evan Shelhamer, Jennifer Marcus, Hannah Kerner
Published: 2026-05-12 (arXiv:2605.12678)
Code: https://github.com/taylor-geospatial/gfm-leaderboard
Abstract
AI-generated summary: Geospatial foundation models lack standardized evaluation and reporting practices, creating inconsistency in performance comparisons and limiting reproducibility across studies.
Geospatial foundation models (GFMs) have been proposed as generalizable backbones for disaster response, land-cover mapping, food-security monitoring, and other high-stakes Earth-observation tasks. Yet the published work about these models does not give reviewers or users enough information to tell which model fits a given task. We argue that nobody knows what the current state of the art is in geospatial foundation models. The methods may be useful, but the GFM literature does not standardize evaluations, training and testing protocols, released weights, or pretraining controls well enough for anyone to compare or rank them. In a 152-paper audit, we find 46 cross-paper disagreements of at least 10 points for the same model, benchmark, and protocol; 94/126 papers with extractable pretraining data use a configuration no other paper uses; and 39% of GFM papers release no model weights. This lack of community standards can be solved. We propose six concrete expectations: named-license weight release, shared core evaluations, copied-versus-rerun baseline annotations, variance reporting, one shared evaluation harness, and data-vs-architecture-vs-algorithm controls. These gaps are a coordination failure, not a fault of any individual lab; the authors of this paper, like many others in the GFM community, have contributed to them. Rather than just critiquing the community, we aim to provide concrete steps toward a shared understanding of how to innovate GFMs.
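To make the audit statistic concrete, here is a minimal sketch of how cross-paper disagreements of at least 10 points could be counted. It is not the authors' audit code: the records, model names, benchmark names, and the `find_disagreements` helper are all hypothetical, and it assumes a disagreement means that two or more papers report scores for the same model, benchmark, and protocol whose maximum and minimum differ by at least the threshold. The paper may count disagreements differently (for example, per pair of papers).

```python
from collections import defaultdict

# Hypothetical toy records, not data from the paper:
# (paper, model, benchmark, protocol, reported score in points).
REPORTED = [
    ("paper_a", "model_x", "bench_1", "linear_probe", 71.0),
    ("paper_b", "model_x", "bench_1", "linear_probe", 83.5),
    ("paper_c", "model_x", "bench_1", "fine_tune", 90.0),
    ("paper_a", "model_y", "bench_2", "fine_tune", 64.2),
    ("paper_d", "model_y", "bench_2", "fine_tune", 66.0),
]


def find_disagreements(records, threshold=10.0):
    """Return {(model, benchmark, protocol): spread} for flagged groups.

    A group is flagged when at least two distinct papers report a score
    for it and the max-minus-min spread is >= threshold (assumed definition).
    """
    groups = defaultdict(list)
    for paper, model, benchmark, protocol, score in records:
        groups[(model, benchmark, protocol)].append((paper, score))

    flagged = {}
    for key, entries in groups.items():
        papers = {paper for paper, _ in entries}
        if len(papers) < 2:
            continue  # a single paper cannot disagree with itself
        scores = [score for _, score in entries]
        spread = max(scores) - min(scores)
        if spread >= threshold:
            flagged[key] = spread
    return flagged


disagreements = find_disagreements(REPORTED)
for (model, benchmark, protocol), spread in disagreements.items():
    print(f"{model} on {benchmark} ({protocol}): spread of {spread:.1f} points")
```

Grouping on protocol as well as model and benchmark matters here: a linear-probe score and a fine-tuning score for the same model are expected to differ, so only like-for-like results are compared.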
Community
GFM papers can't be meaningfully compared because evals, weights, and pretraining configs are all over the place. Our 152-paper audit found 46 same-model/benchmark disagreements of 10+ points and 94/126 papers using unique pretraining setups. The paper proposes six concrete fixes (weight release, shared evals, baseline annotations, variance reporting, one harness, data-vs-arch-vs-algo controls) framed as a coordination problem the whole community owns, not a callout.
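This is an automated message from the Librarian Bot. The following papers, recommended by the Semantic Scholar API, are similar to this paper:
- GeoSANE: Learning Geospatial Representations from Models, Not Data (2026), https://huggingface.co/papers/2603.23408
- Low-Rank Adaptation of Geospatial Foundation Models for Wildfire Mapping Using Sentinel-2 Data (2026), https://huggingface.co/papers/2605.04989
- GeoMeld: Toward Semantically Grounded Foundation Models for Remote Sensing (2026), https://huggingface.co/papers/2604.10591
- HighFM: Towards a Foundation Model for Learning Representations from High-Frequency Earth Observation Data (2026), https://huggingface.co/papers/2604.04306
- Agentic AI for Remote Sensing: Technical Challenges and Research Directions (2026), https://huggingface.co/papers/2604.24919
- Location Is All You Need: Continuous Spatiotemporal Neural Representations of Earth Observation Data (2026), https://huggingface.co/papers/2604.07092
- GeoR-Bench: Evaluating Geoscience Visual Reasoning (2026), https://huggingface.co/papers/2605.11541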