ReSyn: A Generalized Recursive Regular Expression Synthesis Framework
Authors: Seongmin Kim, Hyunjoon Cheon, Su-Hyeon Kim, Yo-Sub Han, Sang-Ki Ko
Published: 2026-06-13
Abstract
ReSyn is a divide-and-conquer framework that improves regex synthesis accuracy by decomposing complex problems into sub-problems. It is paired with Set2Regex, a parameter-efficient synthesizer that captures the permutation invariance of the examples.
Existing Programming-By-Example (PBE) systems often rely on simplified benchmarks that fail to capture the high structural complexity of real-world regexes, such as deeper nesting and frequent use of union operations. To overcome the resulting performance drop, we propose ReSyn, a synthesizer-agnostic divide-and-conquer framework that decomposes a complex synthesis problem into manageable sub-problems. We also introduce Set2Regex, a parameter-efficient synthesizer that captures the permutation invariance of examples. Experimental results demonstrate that ReSyn significantly boosts accuracy across various synthesizers, and that its combination with Set2Regex establishes a new state of the art on a challenging real-world benchmark. The complete source code, datasets, and pre-trained model checkpoints are publicly available at https://github.com/mrseongminkim/ReSyn.
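To make the divide-and-conquer idea concrete, here is a minimal toy sketch in Python. It is not the ReSyn implementation: the helpers toy_partition and toy_synthesize are hypothetical stand-ins invented for illustration, and partitioning by string length is an assumed heuristic, not the paper's method. It only shows the general shape: split the examples into sub-problems, solve each with a pluggable base synthesizer, and join the results with a union.

```python
import re
import string
from collections import defaultdict


def toy_synthesize(examples):
    """Hypothetical base synthesizer: generalize each position to a
    character class. Assumes all examples have the same length."""
    pattern = []
    for chars in zip(*examples):
        if all(c in string.digits for c in chars):
            pattern.append(r"\d")
        elif all(c in string.ascii_letters for c in chars):
            pattern.append("[A-Za-z]")
        else:
            # Fall back to an explicit class of the characters seen here.
            escaped = sorted(re.escape(c) for c in set(chars))
            pattern.append("[" + "".join(escaped) + "]")
    return "".join(pattern)


def toy_partition(examples):
    """Hypothetical partitioner: group examples by length (an assumed
    heuristic chosen so each group fits toy_synthesize)."""
    groups = defaultdict(list)
    for s in examples:
        groups[len(s)].append(s)
    return [groups[k] for k in sorted(groups)]


def divide_and_conquer(examples, synthesize=toy_synthesize):
    """Split the problem, solve each sub-problem, join with a union."""
    sub_regexes = [synthesize(part) for part in toy_partition(examples)]
    if len(sub_regexes) == 1:
        return sub_regexes[0]
    return "|".join(f"({r})" for r in sub_regexes)


examples = ["ab12", "cd34", "x-1", "y-2"]
regex = divide_and_conquer(examples)

# Every positive example must be accepted by the combined regex.
assert all(re.fullmatch(regex, s) for s in examples)
```

Because the base synthesizer is passed in as a parameter, any other synthesizer with the same signature could be swapped in, which is the sense in which a framework like this can be synthesizer-agnostic.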
Community
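We've released everything needed to use and build on ReSyn from the Hub:
📚 Dataset: https://huggingface.co/datasets/mrseongminkim/ReSyn
🤖 Pre-trained components (loadable via from_pretrained): Set2Regex (https://huggingface.co/mrseongminkim/ReSyn-Set2Regex) · Router (https://huggingface.co/mrseongminkim/ReSyn-Router) · Partitioner (https://huggingface.co/mrseongminkim/ReSyn-Partitioner) · Segmenter (https://huggingface.co/mrseongminkim/ReSyn-Segmenter)
🤖 Prax baseline: ReSyn-byt5-small (https://huggingface.co/mrseongminkim/ReSyn-byt5-small)
💻 Code: https://github.com/mrseongminkim/ReSyn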
Cite arxiv.org/abs/2603.24624 in a Space README.md to link it from this page.