Scalable Inference-Time Annealing with Surrogate Likelihood Estimators
Authors: Daniel Peñaherrera, Rishal Aggarwal, David Ryan Koes
Published: 2026-06-01 (arXiv:2605.31498)
Abstract
A scalable inference-time annealing method that uses flow-based models with energy-based surrogates to efficiently sample Boltzmann distributions without costly divergence calculations.
A long-standing challenge in computational chemistry and biophysics is efficiently sampling the Boltzmann distribution of molecules. Generative models have been proposed to address the limitations of conventional sampling techniques by eliminating the computational cost of simulation. A promising direction is to iteratively fine-tune diffusion models along a temperature ladder, with training data generated via importance sampling during inference-time annealing. Unfortunately, these methods require computing the divergence of the score field to estimate importance weights, rendering them intractable for larger systems. Here we present scalable inference-time annealing (SITA), which retrains flow-based models to generate samples at progressively lower temperatures, using an energy-based model to provide fast surrogate likelihoods. We demonstrate state-of-the-art performance on both Alanine Dipeptide and Alanine Tripeptide while avoiding costly divergence terms. Our code is available at https://github.com/countrsignal/sita.git
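To make the reweighting idea concrete, here is a minimal sketch of one annealing step using generic self-normalized importance sampling. It is not the SITA implementation: every function name is hypothetical, the potential is a toy 1D harmonic well, and the "surrogate" log-likelihood is simply the exact unnormalized high-temperature log-density standing in for a learned energy-based model. Samples drawn at a higher temperature are weighted by w ∝ exp(-β_target U(x)) / q(x) and then resampled to approximate the lower-temperature ensemble.

```python
import numpy as np


def log_importance_weights(energy, surrogate_log_q, beta_target):
    """Unnormalized log-weights: log w = -beta_target * U(x) - log q(x)."""
    return -beta_target * energy - surrogate_log_q


def normalize_log_weights(log_w):
    """Self-normalize weights, shifting by the max for numerical stability."""
    w = np.exp(log_w - np.max(log_w))
    return w / w.sum()


def effective_sample_size(weights):
    """Kish effective sample size of normalized weights."""
    return 1.0 / np.sum(weights ** 2)


def resample(samples, weights, rng):
    """Multinomial resampling: draw indices with probability equal to the weights."""
    idx = rng.choice(len(samples), size=len(samples), replace=True, p=weights)
    return samples[idx]


# Toy system (assumption): U(x) = 0.5 * x^2, so the Boltzmann distribution
# at inverse temperature beta is a Gaussian with variance 1 / beta.
rng = np.random.default_rng(0)
beta_high_T, beta_low_T = 0.5, 1.0  # higher temperature means smaller beta

# Stand-in for samples produced by a generative model trained at the high temperature.
x = rng.normal(0.0, np.sqrt(1.0 / beta_high_T), size=20_000)
energy = 0.5 * x ** 2

# Stand-in for a learned surrogate log-likelihood (unnormalized is fine,
# because self-normalization cancels any constant offset).
surrogate_log_q = -beta_high_T * energy

weights = normalize_log_weights(
    log_importance_weights(energy, surrogate_log_q, beta_low_T)
)
ess = effective_sample_size(weights)
x_low_T = resample(x, weights, rng)  # approximate samples at the lower temperature
```

In an iterative scheme of the kind the abstract describes, the resampled set would serve as training data for the model at the next rung of the temperature ladder. The effective sample size is a common diagnostic for whether adjacent temperatures are spaced closely enough for the weights to stay well behaved.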
Community
We present SITA, a method for bootstrapping flow matching models to sample the conformational ensemble of molecules. SITA accomplishes this via a temperature-annealed importance sampling scheme. In the absence of tractable exact likelihoods for diffusion and flow matching models, we incorporate surrogate likelihood estimators in the form of energy-based models to facilitate the estimation of importance weights. SITA achieves state-of-the-art performance on Alanine Dipeptide and Alanine Tripeptide.