r/MachineLearning

500 articles archived

r/MachineLearning community 4d ago

I stopped trusting model benchmarks and started running my own eval set, here is what changed [D]

Three things broke my faith in published benchmarks recently. One, Kimi K2.7 Code shipped with plus 21.8 percent on Kimi Code Bench v2, plus 11 percent on Program Bench, plus 31.5 percent on MLS Bench Lite. All three are Moonshot's own benchmarks. None were submitted to DeepSWE,…

23
r/MachineLearning community 5d ago

Any ideas for unconventional ML projects? [D]

Hey everyone, I'm a stats student and I'm struggling to come up with a personal machine learning project. I just can't seem to find an idea that genuinely sparks my curiosity, and that's usually how I learn best. For example, back when I was learning SQL, I got so obsessed with…

6
r/MachineLearning community 5d ago

Xperience-10M Download Help [D]

Hi, I really really need access to Xperience-10M for a deadline which is very soon. https://huggingface.co/datasets/ropedia-ai/xperience-10m Unfortunately, it looks like the owners have stopped approving people to download the dataset. I filled out the form a few weeks ago but…

27
r/MachineLearning community 5d ago

My toy spiking network completely flunked NARMA-10, but a simple neuroscience trick unlocked a 15x compute bargain. [D]

(Disclaimer: This post was drafted with the help of AI to keep it concise, but the research and work are entirely mine.) I’ve been building a spiking neural network (SNN) engine from scratch on my laptop as a solo project. To see if it was actually tracking anything useful…

27
r/MachineLearning community 5d ago

MuJoCo derived Simulator for High Fidelity Vision RL training natively on GPU [D]

Hi everyone, For the past couple of weeks I have been working on a simulator project considering the shortcomings of MuJoCo. There are things that people like and also don't like about MuJoCo, like the CPU dependency on MuJoCo which makes the simulation not parallelizable beyond…

31
r/MachineLearning community 5d ago

High Dimensional, Dynamic Rotary Positional Embedding [P]

At the end of my last post, I presented an idea: what if I used the core of my last project, the cumulative matrix product, and repurposed it as a positional embedding? I just finished fleshing out the math behind HDD-RoPE and training a model with this positional embedding…

31
r/MachineLearning community 5d ago

Find the best open-source OCR models in one place at Papers with Code [P]

Hi, I've created an overview of the most important OCR benchmarks, along with the top open models and links to their papers and code: https://paperswithcode.co/tasks/ocr. This week, new OCR models were released by Baidu and Mistral. Baidu released Unlimited OCR, a 3B-parameter…

27
r/MachineLearning community 5d ago

I made a superhuman Generals.io agent with self-play RL [P]

Hi everyone, I trained a self-play RL agent for Generals.io that reached a superhuman level and ranked #1 on the human 1v1 leaderboard. It began as my master's thesis, where the goal was to beat a prior algorithm-based agent. We succeeded using behavior cloning, RL fine-tuning and…

6
r/MachineLearning community 5d ago

The verifier based vs verifier free test time scaling result is older than people act, and it keeps getting confirmed [D]

The Setlur et al. result that scaling test-time compute without verification or RL is provably suboptimal keeps showing up in my reading, and I think it deserves more weight than the "yet another scaling paper" treatment it got. The core claim is that verifier-based methods, RL or…

13
r/MachineLearning community 5d ago

I compiled LLM inference pricing across 7 providers: the caching numbers are surprising (spreadsheet included) [R]

I've been comparing GPU/LLM providers for a side project and ended up with way too many browser tabs and spreadsheets. So I decided to pull the public pricing data into one sheet and compare it side by side. A quick disclaimer: this is not benchmark data. I didn't run latency…

32
r/MachineLearning community 5d ago

Could it be that there aren’t really any medical LLM APIs available right now? [D]

As part of my ablations, I want to generate text with a medical-oriented LLM, and I was surprised to find no exposed APIs for this kind of model. I found models like MedGemma and BioMistral on Hugging Face, but they don’t seem to offer public APIs, and I really don’t want to…

24
r/MachineLearning community 6d ago

DeepSWE: new benchmark looking at how well today's frontier models can actually write code [R]

DeepSWE delivers four advances over existing public benchmarks: Contamination free: Tasks are written from scratch, not adapted from existing commits or PRs, so no model has seen the solution during pretraining. High diversity: Tasks span a broad pool of 91 repositories across 5…

9
r/MachineLearning community 6d ago

Will I be desk rejected for this? [R]

So I submitted a paper to a conference and went one line over on a two-column submission, so literally half a line over the page limit. I'm really paranoid that this will be a desk rejection. Has anyone ever had this happen before? Will it be desk rejected?

18
r/MachineLearning community 6d ago

MICCAI grant results [D]

Did you guys get your MICCAI grant results? I have not received any email. Does that mean I was not accepted? submitted by /u/CrazyIndependent7436

28
r/MachineLearning community 6d ago

WACV supp. mat. video [R]

Hello, the WACV submission deadline is at the end of this week, good luck everyone! Does anyone know what the expected format/duration of the video for the supp. mat. is? The guidelines only mention: The supplementary material can be either PDF or ZIP only (maximum…

14
r/MachineLearning community 6d ago

What's your biggest pain point when choosing between cloud GPU providers for LLM inference? [R]

Trying to understand how other people make this decision. Do you compare $/hr, $/token, throughput, reliability? Is there a tool or resource you rely on, or are you just doing the math manually? Asking because I'm an ML engineer who's been doing this in spreadsheets and…

14
r/MachineLearning community 6d ago

Are model security risks (extraction, poisoning) actually being tested in production? [R]

I talk to a lot of ML teams who ship models but skip any adversarial testing before deployment. It feels like security review for models is way behind where it is for regular software. Anyone here actually doing this at their job? submitted by /u/Xorphian

14
r/MachineLearning community 6d ago

Found a potential mistake in an ICLR 2026 blogpost [D]

I think I found a mistake in an ICLR 2026 blog post. I created an issue and have been trying to contact the author and organizers, but I haven't received a response after several weeks. Could anyone please take a look and let me know your thoughts? (I'm just curious and would…

26
r/MachineLearning community 7d ago

Just landed a Computer Vision internship, here's the preparation list I used [D]

Hey everyone, I recently landed a Computer Vision internship after prepping with this checklist I put together. It starts with core math and ML fundamentals, then moves into the specialized CV topics that actually come up in interviews. I compressed it into just 7 days due to…

25
r/MachineLearning community 7d ago

Non-deterministic Vulnerability Detection Benchmark System [P]

I work in firmware adjacent to AI, so not an ML guy exactly, so that's why I've come here. For work we got a bit concerned about Mythos and all the hype made me explore some benchmarking work. I now have this pretty cool benchmark that's about 80% done sitting around and haven't…

26
r/MachineLearning community 7d ago

Syntactically robust NLI for semantics of imperfectly generated text? [R]

Hi all, I'm looking for literature on relatively specific tooling. In autoregressive LLMs, there is substantial published work that used NLI on sub-claims produced by LLMs to gauge correctness of LLM answers. In diffusion (or D-) LLMs, the SoTA model generations that I see…

37
r/MachineLearning community 7d ago

Recommendations for speech annotation tools [D]

I'm looking for human-in-the-loop platforms that let you automatically transcribe audio, then manually fix the transcriptions and fine-tune the model. Is there a locally installable platform (not an online service) for doing this?

11
r/MachineLearning community 7d ago

About ML research collab group post [D]

Hi, I'm thinking of building a small community of 10-15 people where we can help each other learn something new. The primary focus will be on ML research and open-source projects. If you're interested, DM me. Knowledge of machine learning is a plus, as I want to keep this a…

16
r/MachineLearning community 7d ago

Some new updates to Papers with Code [P]

Hi folks, Niels here from the open-source team at Hugging Face. I continue working on a revival of paperswithcode.co as we're back to the "age of research" per Ilya Sutskever! Hence, it's important to discover each other's research and build on each other's work, so we can…

38
r/MachineLearning community 8d ago

[ECCV 2026] Paper Decision Appeals Discussion [D]

With the release of meta-reviews, ECCV sent out a google form for dissatisfied authors to submit an appeal for the following reasons: Policy errors, e.g., reviewers or Area Chairs applied a policy that does not exist, or reviewers or Area Chairs applied policies that are not…

18
r/MachineLearning community 8d ago

An Update on Matrix Recurrent Units, an Attention Alternative [R]

I recently revisited my matrix recurrent units algorithm (the MRU), a novel linear-time sequence architecture I created as an alternative to attention. I explain it in depth at the repo, but the gist is the MRU works by transforming the embedding into an input state matrix,…

29
r/MachineLearning community 8d ago

Data-centric debugging for teams training neural nets [P]

We just did a big revamp of WeightsLab and wanted to share it here. If you’ve ever spent hours debugging a training run only to discover it was a data problem all along, this is for you. WeightsLab lets you pause training mid-run, inspect your live loss signals, and catch…

29
r/MachineLearning community 8d ago

Best current methods for finetuning whisper on domain specific vocabulary? [P]

Hey everyone, I’m wondering whether there are any newer or more effective methods for fine-tuning Whisper on domain-specific speech. I’m working on a project where the model needs to reliably detect certain specific words and technical terms. The vocabulary and context are…

4
r/MachineLearning community 8d ago

EMA on LoRA? [R]

Hi guys, does anyone know of papers where EMA on LoRA adapters has been used successfully? I'm interested in cases where the EMA adapter acts as a self-teacher generating soft labels for the trainable adapter. On-policy self-distillation [1] uses EMA for the teacher. However, they…

20
r/MachineLearning community 8d ago

A slightly improved DVD-JEPA demo [P]

Hey! I came across this post, which I found quite neat as a minimal demonstration of JEPA. However, as the comments pointed out, there was some room for improvement. So I added a few things such as environment noise and a fair* comparison to a pixel-space baseline. I think the…

19
r/MachineLearning community 8d ago

I released a softmax-free attention model at GPT-2 Medium scale (~354M params, 11.5B tokens): structural sparsity + tile-skipping kernels for long-context VRAM savings. Open weights + custom Triton kernels [R]

submitted by /u/NonGameCatharsis

29
r/MachineLearning community 8d ago

Looking for an ML/data collaborator, open to any project idea [P]

I want to team up on an ML project, no fixed idea yet. Open to whatever's interesting: NLP, CV, time series, whatever you're into. Looking for: anyone with an idea (or without, we can think of something together) + an ML engineer to build it with. Goal: my goal is to strengthen my…

33
r/MachineLearning community 9d ago

Python packages for particle swarms, genetic algorithms. Scikit-opt maybe? [D]

I'm working with a client on a curve-fitting optimization problem. They are currently using a constrained Levenberg-Marquardt optimizer for their task, which is complex, slow, and sometimes gets stuck in local minima. I suggested using particle swarm optimization (PSO), and the…

17
r/MachineLearning community 9d ago

Studying FLUX in diffusers library was hard, so I built a smaller open-source version [P]

If you've tried to study modern diffusion models by digging through the official diffusers library, you know it can be overwhelming with its complexity and abstractions. I wanted to simplify FLUX diffusion models, so I built minFLUX: a PyTorch implementation focused on its core…

38
r/MachineLearning community 9d ago

TSAuditor: A time-series auditing framework [P]

This happened a few months ago when I was working on an analysis project that dealt with time-series data. The dataset was large (10 years of data). I was using a standard profiling tool to check the pipeline. Everything looked fine because the tool reported 3% missing data rate…

29
r/MachineLearning community 9d ago

American businesses are using Chinese AI again? [N]

https://econlab.substack.com/p/top-saas-vendors-on-ramp-june-2026 submitted by /u/NoVillage8460

36
r/MachineLearning community 9d ago

Hi Reddit, I posted my Build Your Own LLM workshop to Youtube teaching ML, LLM and math intuition [P]

Hi internet friends, I recorded a workshop about building your own LLM without any math / ML prerequisites. It covers everything from machine learning fundamentals, deep neural networks, transformer architecture, and pre/post-training. The only prerequisite is being comfortable…

5
r/MachineLearning community 9d ago

Would you let an ML PhD student graduate without a top-tier paper? [D]

Suppose you’re a PhD advisor in machine learning. Your student has been in the program for 4 years, has done solid work, and has a coherent thesis direction, but they haven’t published in an A* ML venue or top journal. No NeurIPS/ICML/ICLR/CVPR/etc., and no equivalent top venue in…

11
r/MachineLearning community 9d ago

An open handbook on LLM inference at scale (GPU internals, KV cache, batching, vLLM/SGLang/TensorRT-LLM) [P]

I've been working through the internals of LLM inference and writing up what I learn as an open, in-progress handbook. Just wrapped another chapter on GPU execution and memory internals: why a GPU sits mostly idle during inference, how the memory hierarchy gates throughput, and…

13
r/MachineLearning community 9d ago

DVD-JEPA: an open-source, fully-reproducible JEPA world model [P]

A paper currently trending on paperswithcode.co in the "Anomaly Detection" category is DVD-JEPA. https://i.redd.it/r6fd8n3d4f8h1.gif Here is the short summary: Most attempts to learn a world model from video try to predict the next frame pixel-by-pixel, and drown in detail that…

11
r/MachineLearning community 9d ago

Time Series Modeling Needs a Dynamical Systems Perspective [R]

In our #ICML2026 position paper we argue a dynamical systems perspective is needed to drive time series (TS) modeling forward: https://arxiv.org/abs/2602.16864 Essentially all time series in nature and engineering come from some underlying dynamical system (DS), mostly chaotic…

31
r/MachineLearning community 9d ago

Built a Global AQ (PM2.5) Forecaster ML Model [P]

Hey everyone, I’ve been building an end-to-end Air Quality (PM2.5) forecasting pipeline for 4 countries (US, UK, India, Australia) using 1.6M+ rows of OpenAQ and NASA weather data. The problem I hit (the variance trap): My V7 model was a standard stateless Gradient Boosting…

23
r/MachineLearning community 10d ago

How to access the Books3 dataset for research purposes? [R]

As per the title, how can I access the Books3 dataset for research purposes? submitted by /u/xolmnyc

17
r/MachineLearning community 10d ago

Top notch best modern Probability or Statistics Books to get started with ML? [D]

Please recommend some of the best modern books on probability and/or statistics for building the probabilistic intuition and mindset needed to excel at ML, either covering beginner to advanced in one book or listed separately by level. submitted by /u/c_carav_io

9
r/MachineLearning community 10d ago

Built a local ML pipeline that blocks risky commits before they leave your machine [P]

I'm a recent CS grad trying to break into ML engineering, and I just finished the first version of a side project I've been working on. Posting it here because I want people who know this space better than me to poke holes in it. The idea started from that feeling every dev has…

4
r/MachineLearning community 10d ago

Dealing with a messy prescriptive monolith. How do you survive this? [D]

Months ago, I got my first maintenance project. Before this, I had only built new solutions from scratch and maintained my own code. But maintaining someone else's system feels completely different. It’s a prescriptive recommendation system that uses XGBoost models and…

20
r/MachineLearning community 10d ago

Best library for releasing my research optimization algorithm? [D]

Hi all! I have developed a research optimizer (QQN, Quadratic Quasi-Newton) and published a paper on it where I am able to, but I would really like to make the algorithm itself easily available to the community for evaluation. I have Rust, Java, and JavaScript implementations,…

36
r/MachineLearning community 10d ago

How does torch.compile() achieve massive speedups despite highly optimized NumPy functions? [D]

I was pondering this question and decided to dive deep into torch.compile. It was a lot of fun learning about operator fusion as the central idea behind torch.compile. So I created a tiny version of torch.compile in 500 lines of Python and a notebook showing how this works:…

8
r/MachineLearning community 11d ago

Fearless Concurrency on the GPU: Safe GPU inference in Rust, competitive with vLLM/SGLang [R]

I maintain cuTile Rust and just posted the paper "Fearless Concurrency on the GPU." As more GPU code gets AI-generated, the bottleneck moves from writing it to trusting it. cuTile Rust lets you write or generate GPU kernels whose memory safety and data-race freedom are verified…

29
r/MachineLearning community 11d ago

Neuron Populations Exhibit Divergent Selectivity with Scale [R]

Hi! We just released a paper where we study “Rosetta Neurons”: universal neurons across different neural networks, and their relationship to scaling laws, specialization, and monosemanticity. Would love to kick off a discussion and get the community's thoughts. Main Findings: We…

11
