r/MachineLearning
r/MachineLearning community 1mo ago
Social Simulation with LLMs - Fidelity in Applications (CFP @ COLM'26) [R]
🌟 Announcing the 2nd Workshop on Social Simulation with LLMs (Social Sim'26) @ COLM 📣 Welcoming Submissions! Submission here:. 🗓️ Deadline: June 23, 2026 (AoE) This year's theme is "Fidelity in Applications", moving beyond compelling demos toward evaluation, robustness,…
11 -
r/MachineLearning community 1mo ago
ACM MM 2026 review discussion [D]
The AC email says the rebuttal period runs from the 28th to the 4th, and the website lists June 4th as the deadline, so I created this post for discussion. I know it's an MM conference and less about ML, but I think many people here are still submitting there. …
32 -
r/MachineLearning community 1mo ago
Training GPT-like model on non-language series [R]
I am responsible for a research project that is supposed to train a GPT-like model (Transformer-decoder) with 100M, 250M and 500M model variants. # params ## training dataset - 750M tokens - vocabulary is ~15k to ~100k tokens (depends on tokenizer settings) - ~3% of the…
29 -
r/MachineLearning community 1mo ago
Diffusion models for sketch-guided trajectory simulation [R]
Blog post: https://wezteoh.github.io/posts/diffusion-for-sketch-guided-trajectory-simulation/ During NBA games, coaches often sketch attacking plays on a whiteboard and mentally simulate how teammates and defenders might react. In this project, I explored using diffusion models…
30 -
r/MachineLearning community 1mo ago
STEM PhDs transitioning to MLE/Data [R]
I'm hoping for some advice from any former PhDs outside of machine learning. If you made it into machine learning engineering and/or data science, what was the key for you? Any tips for this job market? It seems like non-computer-science PhDs are especially in trouble at the…
38 -
r/MachineLearning community 1mo ago
Should I attend ICML as a junior? [D]
I am a junior in college, and have two accepted workshop papers at ICML 2026. Some background: I had an accepted workshop paper last year at ICLR, but couldn't attend due to a rejected visa, which led to all the more disappointment. So this year I was VERY eager to attend, and…
4 -
r/MachineLearning community 1mo ago
What 1000+ Harness Experiments Taught Me About Self-Improving Agents [R]
I recently wanted to see whether an AI agent could self-improve a harness to solve terminal bench tasks. It’s possible for an AI agent to propose a meaningful one-time change to the harness, but after experimenting with this for a couple of weeks, I think the continuous…
35 -
r/MachineLearning community 1mo ago
AI-generated CUDA kernels silently break training and inference [R]
Last month NVIDIA released SOL-ExecBench, a new benchmark of 235 production CUDA kernels lifted from DeepSeek, Qwen, Gemma, and Kimi. We took several top-ranked AI-generated submissions and tried using them in production workloads. Many of them broke, sometimes in surprising…
14 -
r/MachineLearning community 1mo ago
Best Text to Text Translation Model? [D]
I'm working on a project that translates any language into English. So far, I've tried NMT models like NLLB, MADLAD, and SeamlessM4T v2. The main issue is that they struggle with proper nouns such as: - names - places - dates - organizations I also tried LLMs like Gemma 4, Qwen…
22 -
r/MachineLearning community 1mo ago
EMA-Gated Temporal Sequence Compression in Vision Transformers [P]
Vision Transformers waste 90% of their compute recalculating stationary asphalt. NeuroFlow tracks semantic surprise in embedding space, physically eliminating background tokens before the encoder. NeuroFlow is a dynamic routing framework for Vision Transformer video inference.…
34 -
r/MachineLearning community 1mo ago
Profiling PyTorch training without accidentally stalling the GPU [D]
Profiling PyTorch training has an interesting measurement problem: the more you measure, the more you can change the behavior of the run itself. A simple example is torch.cuda.synchronize(). It gives cleaner timing boundaries, but it also inserts synchronization points into an…
13 -
r/MachineLearning community 1mo ago
A Tiny Open-Source Self-Driving AI That Runs on a Phone [P]
https://preview.redd.it/ww14mzr2fm3h1.png?width=1890&format=png&auto=webp&s=79873d47ae79c7815ca3e7e91fd43141632174f5 https://www.youtube.com/watch?v=rr_uS4bf0B4&feature=youtu.be trained a 7MB open-source L4 self-driving AI that learns navigation, lane following, and drift…
11 -
r/MachineLearning community 1mo ago
What to use for Sign Language Recognition [R]
Hi everyone, I'm finishing up the proposal for my undergraduate computer science thesis on sign language recognition, specifically Filipino Sign Language, and I want to ask which architecture would be best for my methodology. Right now I'm considering Mediapipe Holistic +…
32 -
r/MachineLearning community 1mo ago
GNN Model For Fraud Detection Isn't Performing Well [R]
We're writing a research paper on an explainable fraud detection GNN model, and as a first step we're creating a basic Graph Neural Network for it. We're using the most famous dataset available on this topic, i.e. the IEEE CIS Fraud Detection Dataset, and implemented all necessary…
7 -
r/MachineLearning community 1mo ago
Trouble exploring AI/ML, idk where to begin [D]
So, as the title says. Context: I am a sophomore in computer science. I have prior knowledge of maths (especially the topics relevant to ML) and am good enough with numpy and pandas. I don't really know where to start. On the internet, every second guy is trying to make me earn 100k/year in 3 months…
24 -
r/MachineLearning community 1mo ago
I have a couple of technical questions about my LLM router [P]
I am a CS undergrad and I think token economics is the next big problem for companies. I am building an LLM router specifically for code and codebases. The routing is not actually done by a heavily fine-tuned LLM (existing solutions already do this). Using a bit of a different…
11 -
r/MachineLearning community 1mo ago
Dlib or PyTorch for CNNs? [D]
I’m currently studying ML, more specifically convolutional neural networks (CNNs) for finding patterns in images. Right now, I’m trying to develop a model that can solve the “Where’s Waldo?” challenge. However, I currently have a question: what would be the best option for…
31 -
r/MachineLearning community 1mo ago
Where do you go for serious AI research discussion online? [D]
Looking for communities where people actually dig into ML/AI research, not hype, not "look what I built with an LLM API," but discussions about papers, training dynamics, debugging real models, infra problems, that kind of thing. I'm specifically interested in places where you…
15 -
r/MachineLearning community 1mo ago
Aiki, my local Wikipedia Retrieval-Augmented Generation system [R]
Hey, I built Aiki, a lightweight tool that lets you chat with Wikipedia locally. What it does: - Downloads and chunks Wikipedia articles (you can choose articles by name, with the option of also downloading similar topics) - Uses a custom TF-IDF + cosine…
23 -
r/MachineLearning community 1mo ago
Is the AI inference platform space really that saturated now? [D]
I'm thinking of expanding an on-device inference SDK into a full-blown AI inference platform, and I'm seeing more and more inference platforms popping up. I've been talking with a VC from Seattle/NY. Is this space really that saturated? submitted by /u/kampak212 …
35 -
r/MachineLearning community 1mo ago
Delta Attention Residuals [R]
We're excited to release Delta Attention Residuals, a drop-in upgrade to residual connections that learns which past layers to route from, without the routing collapse that breaks prior cross-layer attention at scale. 🚀 Attention Residuals route over…
9 -
r/MachineLearning community 1mo ago
Anyone heard from ICML about Oral decisions yet? [D]
hi all, my paper received a spotlight from ICML. they told us that we would receive decisions as to whether our paper would get an oral by the end of the month with the implication that we wouldn’t receive a notification if we didn’t get it; I was just wondering if anyone has…
30 -
r/MachineLearning community 1mo ago
I’m building an open-source decision layer above AI agents [P]
Hi everyone, I’m Jia, the creator of Spice. I’ve been working on an open-source project called Spice. The simplest way to describe it is: Spice is a decision layer above agents. Most agent systems today are very focused on execution. They are getting better at doing tasks after…
30 -
r/MachineLearning community 1mo ago
Call for Papers - Workshop on Efficient Reasoning at COLM 2026 [R]
🌟 Announcing the 2nd Workshop on Efficient Reasoning (ER) at @colm2026 — Oct 9! 📣 We welcome submissions! Submit your work here: https://openreview.net/group?id=colmweb.org/COLM/2026/Workshop/Efficient_Reasoning 🗓️ Deadline: July 12, 2026 (AoE) 🔗 Website:…
11