Freeing the Law with LOCUS: A Local Ordinance Corpus for the United States
Authors: Denis Peskoff, Joe Barrow, Christopher Vu, Diag Davenport
Published: 2026-06-17 (arXiv:2606.19334)
Comment from Joe Barrow: Every law in the US is technically public, but hasn't been in practice until now. In this paper we build, analyze, and release LOCUS, a dataset of 2.2 million city and county laws, available on HF: https://huggingface.co/datasets/LocalLaws/LOCUS-v1
Abstract
A comprehensive corpus and access layer for U.S. local ordinance codes has been developed to enable machine-readable legal AI research, addressing the lack of authoritative legal text at scale for local regulations.
Progress in legal AI increasingly depends on access to authoritative legal text at scale. Yet one of the most consequential layers of American law remains largely absent from existing machine-readable corpora: local ordinances. Local codes govern zoning, housing, business licensing, public health, noise, animal control, and many other domains of everyday regulation, but they are fragmented across vendor platforms designed for human browsing rather than bulk research access. We introduce LOCUS (the Local Ordinance Corpus for the United States), a comprehensive corpus and county-harmonized access layer for U.S. municipal and county ordinance codes. The raw corpus, available for release to researchers, contains codes from 9,239 cities and counties and represents nearly all publicly available municipal and county ordinance codes. A smaller county-harmonized LOCUS access layer provides coverage for the largest 2,309 of 3,144 U.S. counties, accounting for a majority of the population. We use OCR to handle the myriad document formats that have kept the law from being a public resource. We release the corpus with coverage metadata to support reproducibility, downstream legal AI research, and the incremental expansion of machine-readable access to local law. We train a collection of ModernBERT-based classifiers and scorers to facilitate analyzing U.S. local law along several dimensions, such as opacity and paternalism, that have not previously been studied at this scale. LOCUS-v1 and its derivative models are available at: https://huggingface.co/datasets/LocalLaws/LOCUS-v1
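As a quick check on the coverage figures quoted in the abstract, the share of U.S. counties reached by the county-harmonized layer can be worked out directly. The sketch below is only illustrative arithmetic: the helper name `coverage_share` and the constant names are hypothetical and not part of LOCUS. It measures the share of counties, not the share of population, which the abstract describes only as "a majority".

```python
# Figures quoted in the abstract; the constant names are illustrative only.
TOTAL_US_COUNTIES = 3_144
HARMONIZED_COUNTIES = 2_309


def coverage_share(covered: int, total: int) -> float:
    """Return covered / total as a fraction, rejecting inconsistent counts."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= covered <= total:
        raise ValueError("covered must be between 0 and total")
    return covered / total


county_share = coverage_share(HARMONIZED_COUNTIES, TOTAL_US_COUNTIES)
print(f"County-level coverage: {county_share:.1%}")  # prints "County-level coverage: 73.4%"
```

So the harmonized layer spans roughly three quarters of counties by count. Because it targets the largest counties, its population share is presumably higher, though the abstract does not give that number.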