r/MachineLearning

r/MachineLearning community 22d ago

Software and ops skills for data scientists [D]

With more software engineers entering data science and AI, I feel it's equally important for a person with a data and AI background to dive into software development to survive and thrive in industry. I know it's a very broad question, so suggestions with broad subjects, topics…

9
r/MachineLearning community 22d ago

M5 Air 24GB or M5 Pro 16GB for SWE + ML? [D]

Hi folks, deciding between these two Mac options has been a challenge for me, so please help. I know a Mac is not even necessary for this, but just help me decide between these two options. For reference, I'm a SWE student looking forward to going deep into ML and data science…

11
r/MachineLearning community 22d ago

For those using Google Colab, what features did you wish it had? [D]

Hi everyone, I'm an undergraduate student and ML researcher at UC Berkeley. My colleagues and I are working on a project that hopes to fix some of the problems users face with Colab. What are the features you wish it had as an ML professional, researcher, or enthusiast? What're…

32
r/MachineLearning community 22d ago

Two independent ML/CV researchers (M.Eng, ex-research-institute in Europe) looking for an arXiv cs.CV endorser for a nearly finished paper. Happy to share the full draft, repo, or talk collaboration [D]

Hey everyone, hope this is okay to post here. My co-author and I are currently between institutional affiliations, which means we don't have the academic email arXiv needs for an endorsement. We're hoping to find someone in cs.CV willing to take a quick look at our paper and…

29
r/MachineLearning community 23d ago

Research collection of arXiv whitepapers [R]

I read and collected arXiv whitepapers starting after the launch of ChatGPT. I copied and pasted excerpts into Word to track them, then migrated to Obsidian. That vault of some 1700 papers is now online. I figured it was time to see if others would find the collection useful. My…

38
r/MachineLearning community 23d ago

Got told my open-source model experiments are too scattered. I'm organizing a journal to provide clarity before structuring the first git release. Is this readable for ML folks who aren’t in mech interp? Open to ANY feedback [D]

# Results Journal: Qwen3.5-35B-A3B E114 as a Generated-Register Routing Signal Date: 2026-06-06 This is an experiment-history document, not a publication claim. It states the current best evidence for the strongest positive result in the Qwen3.5-35B-A3B set, the narrow…

20
r/MachineLearning community 23d ago

ML reading group to read recent interesting and trending papers from ICML/ICLR/NeurIPS [D]

Hi, I am a PhD student trying to run an ML reading group focused on interpretability and robustness every weekend. It's always nice to hear different takes and opinions on a paper, and this discussion group could serve that purpose. If you are a fellow PhD student or an ML…

15
r/MachineLearning community 23d ago

Sources for ML news? [D]

I need a break from social media and all the bots. Aside from arXiv, are there any sources that do a good job of aggregating the good stuff and filtering out all the junk? Submitted by /u/Tiny_Arugula_5648.

17
r/MachineLearning community 23d ago

Training-free graph SSL matches GCN with 5× fewer labels — live demo [P]

Hi all, I have been working on this method, based on a hunch, together with several LLMs for quite some time. At first I engineered it myself while I was still learning supervised ML, but the hunch took me into semi-supervised ML, and quite deep at that. I then became an LLM orchestrator…

23
r/MachineLearning community 23d ago

Does it make sense to use alternative quantizations of QAT models? [D]

From TF's website: Quantization aware training emulates inference-time quantization, creating a model that downstream tools will use to produce actually quantized models. So is it designed to work with a very specific quantization method (for Gemma-4, presumably, Google's own)?…

17
r/MachineLearning community 23d ago

Anyone here with experience submitting to Nature Machine Intelligence? [R]

I'm planning to submit a paper to NMI, but this will be my first paper to a Nature-like venue. Would love a quick chat with anyone who has experience. My paper is specifically geared towards signal processing with ML for a specific subfield of engineering. But can be…

10
r/MachineLearning community 24d ago

Using FC26 to simulate the World Cup? [D]

Maybe this should be asked in the FC26 game subreddit, but I'm not sure. Anyway, I just saw a video of someone predicting the winner of the World Cup using the simulate match feature in the game, but he only did it once. Would running this feature 100-1000 times give a significant…

23
r/MachineLearning community 24d ago

What laptop do you suggest I buy? [D]

Guys, a question for those experienced in the space: I'm actually confused at this point. I work around ML, data science, analytics, engineering, research and general programming. What laptop or workstation would you advise? I need speed, high performance, durability and cost…

8
r/MachineLearning community 24d ago

Building a Custom Drones MuJoCo Environment [P]

Hi all, Lately I have been working on creating a package for Multi Agent RL based drone environments with different objectives, all bundled into a single GitHub repository: tau-intelligence/MuJoCo-drones-gym. I am currently trying to organize things for RL community people, with…

31
r/MachineLearning community 24d ago

TinyTPU: SystemVerilog systolic array compiled to WASM, running live in browser - RTL golden-verified against numpy [P]

Most explanations of TPUs and systolic arrays are either hand-wavy diagrams or papers. I wanted to see the thing actually run, so I built it. TinyTPU is a 4×4 weight-stationary systolic array in real SystemVerilog, compiled to WebAssembly, with a step-by-step browser…

32
r/MachineLearning community 24d ago

ICML non-archival workshop - worth attending? [D]

I have a paper accepted at a non-archival ICML workshop this year, and I am trying to decide whether it is worth registering and attending. By coincidence, I will already be in Seoul around that time, but I would have to pay the workshop registration fee (~$400) out of my own…

34
r/MachineLearning community 24d ago

I'm looking to join/form a team working on physical AI robotics challenge [P]

Hey all, I'm a robotics engineer by training turned ML/AI engineer because of passion right after school. I want to start combining these skills together and I think a competition is the best way of doing it. Here's an example of a challenge I'm talking about to set expectations…

18
r/MachineLearning community 24d ago

How do you identify researchers who are good? [D]

About 10 years ago, I got into the basics of ML (like regression, KNNs, LVQs) and read a few papers before taking a break a few years back. It feels like there are a lot of researchers in AI now. How do you identify the ones who are actually solid vs those who (forgive my…

19
r/MachineLearning community 24d ago

Benchmark: ONNX Runtime vs HF Transformers vs GGUF for Parakeet TDT 0.6B on CPU-only hardware [D]

Sharing a small CPU inference benchmark for nvidia/parakeet-tdt-0.6b-v3 that turned up a result I didn't expect going in. Setup: 2 x86-64 vCPUs (AVX2/FMA), 7.7GB RAM, no GPU. Test audio: 16.78s Harvard sentences at 16kHz mono. Results: Inference path RTF Peak Memory CPU…

26
r/MachineLearning community 24d ago

An autonomous research agent was the #1 contributor in OpenAI's Hiring Competition Parameter Golf (by merged records) [R]

An autonomous research agent ended up with more merged leaderboard records than any individual human contributor in OpenAI's spring hiring competition, Parameter…

27
r/MachineLearning community 24d ago

Are We Underestimating Small Edge AI Models? [D]

A lot of recent discussion around Edge AI focuses on running increasingly larger local LLMs. Meanwhile modern smartphones already have enough compute for many practical computer vision tasks that don't require massive models at all. I recently built and released an Android…

7
r/MachineLearning community 25d ago

Would you say capture-time semantic annotation for robot trajectories is a solved problem? [R]

It seems raw teleoperation data (RGB + joint states) structurally lacks affordance, contact intent, and embodiment-specific kinematic context (information that can't be reliably recovered post hoc once the demonstration is recorded). Most current approaches either filter/clean…

11
r/MachineLearning community 25d ago

Is it allowed to use OpenAI API outputs to create a silver code dataset or benchmark for a specific Python library? [D]

Hello everyone, Is it allowed to use OpenAI API outputs to create a silver code dataset or benchmark for a specific Python library? I am working on a project idea related to library-specific code generation. The concrete case is a specific Python library used in a…

18
r/MachineLearning community 25d ago

Scrap the LLMs. Scoring 4.76% on the brand new ARC-3 using pure code, a 2012 AMD CPU, and zero AI tokens. [P]

Hey everyone, The ARC Prize 2026 just launched the interactive ARC-AGI-3 track, and the collective AI world is panic-renting massive H100 clusters trying to get multi-billion parameter LLMs to navigate these dynamic environments. Predictably, out-of-the-box LLMs are faceplanting…

31
r/MachineLearning community 25d ago

[R] Measuring the Symmetry–Data Exchange Rate

The prediction that equivariance reduces sample complexity by a factor of |G| appears in roughly every paper on geometric deep learning and is measured as an actual scaling law in roughly none of them. This paper does the measurement. The methodology is the interesting part.…

9
r/MachineLearning community 25d ago

How do ML researchers actually use AI tools to improve their writing? [D]

As an ML researcher, how do you use AI tools in your daily work? Do you mostly use them to clean up grammar and wording, or also to rewrite, structure, or draft technical text? Submitted by /u/Hope999991.

5
r/MachineLearning community 25d ago

We built a source-available LLM reliability library (free for research / personal / internal eval) that can cut inference cost by half at matched quality, and you adopt it by changing one import [P] [R]

TL;DR: Reliability techniques (methods that boost an LLM's correctness by spending extra inference, e.g., retries with feedback, ensembling, generator/critic refinement, verification passes, difficulty-aware routing) are scattered across the literature, each in its own…

10
r/MachineLearning community 25d ago

Stop using print() to debug your agents. Here's a 60-second alternative. [P]

Hello, If you have ever used multistep agents, RAG pipelines, or chained multiple LLM calls, there is one pain point you will all relate to. When an agent gets stuck in an infinite loop, suddenly hallucinates on the third step, or is quietly burning through OpenAI API credits...…

20
r/MachineLearning community 25d ago

Faithful uncertainty in LLM agents: calibration vs utility tradeoff in practice [D]

The Google paper on metacognition for hallucination reduction makes a distinction that is underappreciated in benchmarks. Calibration is not about being right more often. It is about matching confidence to correctness. A perfectly calibrated model can still be wrong twenty five…

26
r/MachineLearning community 25d ago

KVarN: Variance-Normalized KV-Cache Quantization [R]

Excited to share some of my own work here :) KVarN is our new KV-cache quantization method. Very briefly, we combine Hadamard rotations with variance normalization on both axes of the K and V matrices, then round to nearest. Simple, but works very well, especially for…

21
r/MachineLearning community 25d ago

On-policy distillation: one of the hottest terms on PapersWithCode [R]

Hi, Niels here from the open-source team at Hugging Face. At paperswithcode.co I am trying to make it easier for people to learn about the newest techniques used across AI papers. One of the hottest terms in AI research that I've recently added is on-policy distillation, also…

27
r/MachineLearning community 25d ago

ICML financial aid [D]

Hello, I am curious about the selection criteria for ICML financial aid. If anyone has been granted financial aid, would you mind sharing your profile? Somehow being a Black woman (2 underrepresented groups) with one paper accepted at the main conference and two papers accepted…

7
r/MachineLearning community 25d ago

How Do You Handle Ablation Studies When the Original Model Is Already Trained? [R]

I'm running into an issue with an ablation study for a paper I'm preparing. I trained a model. The model achieved my best result, and I saved the trained checkpoint (.pth file). Now my supervisor wants me to perform an ablation study by removing components and seeing how it impacts…

29
r/MachineLearning community 26d ago

Embedding space [D]

Hello everyone, I’m relatively new to this area of machine learning and currently experimenting with Variational Autoencoders (VAEs) to build an embedding space for an image dataset. The images have different spatial dimensions, so I cannot easily standardize them to a fixed size.…

11
r/MachineLearning community 26d ago

Repo for implementations of various Transformer Attn mechanisms [P]

Initially, I developed this so I can easily switch between different Attention mechanisms for my Small Language Model (SLM) experiments and benchmarking. However, I also realized that these implementations can be applicable in Computer Vision, modernize Vision Encoders, RL, and…

14
r/MachineLearning community 26d ago

Research in Image/Video Gen AI models [D]

I've been going down a rabbit hole with image/video generation/editing models for a few months now, started with playing around with Stable Diffusion and ComfyUI, then got genuinely hooked on understanding why things work, not just that they do. I have an Engineering background…

20
r/MachineLearning community 26d ago

In current ML systems, where is the main bottleneck: dataset quality or model architecture improvements? [D]

A lot of recent progress in ML appears to come from scaling existing architectures rather than introducing fundamentally new ones. At the same time, there’s increasing emphasis on dataset quality, curation, and synthetic data pipelines. In practice, I’m trying to understand how…

13
r/MachineLearning community 26d ago

Best Visual Reasoning Model in 2026 (Including APIs) [D]

For example, suppose I have a one-hour video and I provide it to ChatGPT or another AI model. If I ask complex reasoning questions about the video, which models are best suited for long-horizon video understanding and reasoning? Which models can produce the most reliable answers…

38
r/MachineLearning community 26d ago

I have done a ML Project as a Novice [P]

Hi there! I am about to complete my MSc in Business Analytics and am planning to do some real-life projects to attract recruiters. I am sharing one such project here: FIFA World Cup 2026 Prediction: https://amit-world-cup-2026-simulator.streamlit.app/ Project Overview Large…

5
r/MachineLearning community 26d ago

Has anyone heard back from the Citadel ICML travel grant? [D]

It’s confusing because they said applicants will be notified on 3rd June but also said you’ll be notified 2-4 weeks after the deadline (29th May). Submitted by /u/Smol_pp001.

6
r/MachineLearning community 26d ago

First paper acceptance (ICML Workshop), should I attend? [D]

I just finished my first year of undergrad, and I got my first first-author paper accepted to an ICML workshop! Super stoked, especially since I was lowk a crashout in high school. I wanted to know if it is worth it for me to go. It's quite expensive, and I will be the only one…

30
r/MachineLearning community 26d ago

NeurIPS Reciprocal Reviewers be careful in reviewing with LLMs [D]

As the title says. I am not a reciprocal reviewer, but I just noticed a clever prompt injection, like they did in ICML, for our submission. Submitted by /u/Massive-Bobcat-5363.

18
r/MachineLearning community 26d ago

How are production ML systems typically handling distribution shift over time? [D]

In deployed ML systems, data distribution drift seems unavoidable over longer time horizons. I’m trying to understand what approaches are commonly used in practice: Continuous retraining pipelines (fixed intervals vs trigger-based) Online monitoring for feature or prediction…

25
r/MachineLearning community 26d ago

NeurIPS used uncalibrated AI detector for desk rejections [D]

I recently had a submission desk-rejected from the NeurIPS 2026 Position Paper Track for an alleged AI-policy violation. After corresponding with the track leadership and reading their public blog post, I think the broader methodological issue is worth discussing here. The track…

13
r/MachineLearning community 26d ago

Analysis of AlphaZero training data [D]

I am trying to train an AlphaZero model for Othello on a 6x6 board. Having been warned that too little exploration during data generation can lead to models being overconfident and trapped in some tight region of the search tree, I started with the value c_puct = 4.0, and then…

35
r/MachineLearning community 26d ago

A semantic tokenization scheme where token geometry reflects semantic relationships [R]

I have been thinking about an alternative tokenization and representation scheme for language models and would be interested in hearing whether similar ideas have been explored before, as well as potential advantages or flaws. The core observation is that modern tokenizers (BPE,…

30
r/MachineLearning community 26d ago

Encodec.cpp, a portable C++ implementation of Meta's EnCodec using Eigen [P]

I built a C++ implementation of Meta’s EnCodec using Eigen. GitHub: https://github.com/pfeatherstone/encodec.cpp Motivation: - A lightweight implementation of EnCodec with no runtime dependencies, in C++ - No ML runtime - Easy integration in CMake project - Maximum performance…

7
r/MachineLearning community 26d ago

TorchDAE: Implicit DAE Solvers with Index Reduction and Adjoint Sensitivity [P]

Hello everyone, I've been working on a PyTorch library for solving Differential Algebraic Equations (DAEs) that supports vectorized execution and GPU acceleration. The library implements several algorithms that are not currently available in the Python ecosystem, including…

27
r/MachineLearning community 27d ago

MiniMax dropped a new attention architecture. [N]

It contains something interesting about context windows. They’re natively scaling to 1M tokens with MiniMax Sparse Attention (MSA), bypassing standard quadratic complexity by completely restructuring the memory access patterns at the operator level. Instead of relying on…

26
r/MachineLearning community 27d ago

Thoughts on Logical Intelligence’s Kona [D]

Sometime late last year a company called Logical Intelligence developed an EBM called Kona. What do people make of the company’s claims that they have a close-to-functioning EBM? And if true, what impact would this have on existing AI?   submitted by   /u/Treey1234…

24
