r/MachineLearning
500 articles archived
r/MachineLearning community 22d ago
Software and ops skills for data scientists [D]
With more software engineers entering data science and AI, I feel it's equally important for a person with a data and AI background to dive into software development to survive and thrive in industry. I know it's a very broad question, so suggestions with broad subjects, topics…
9 -
r/MachineLearning community 22d ago
M5 Air 24GB or M5 Pro 16GB for SWE + ML? [D]
Hi folks, deciding between these two Mac options has been a challenge for me, so please help. I know a Mac is not even necessary for this, but just help me decide between these two options. For reference, I'm a SWE student and looking forward to going deep into ML and data science…
11 -
r/MachineLearning community 22d ago
For those using Google Colab, what features did you wish it had? [D]
Hi everyone, I'm an undergraduate student and ML researcher at UC Berkeley. My colleagues and I are working on a project that hopes to fix some of the problems users face with Colab. What are the features you wish it had as an ML professional, researcher, or enthusiast? What're…
32 -
r/MachineLearning community 23d ago
Research collection of arXiv whitepapers [R]
I read and collected arXiv whitepapers starting after the launch of ChatGPT. I copied and pasted excerpts into Word to track them, then migrated to Obsidian. That vault of some 1700 papers is now online. I figured it was time to see if others would find the collection useful. My…
38 -
r/MachineLearning community 23d ago
Sources for ML news? [D]
I need a break from social media and all the bots. Aside from arXiv, are there any sources that do a good job of aggregating the good stuff and filtering out all the junk? submitted by /u/Tiny_Arugula_5648
17 -
r/MachineLearning community 23d ago
Does it make sense to use alternative quantizations of QAT models? [D]
From TF's website: Quantization aware training emulates inference-time quantization, creating a model that downstream tools will use to produce actually quantized models. So is it designed to work with a very specific quantization method (for Gemma-4, presumably, Google's own)?…
17 -
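The quoted description can be made concrete with a minimal sketch of "fake" quantization, the quantize-then-dequantize step that QAT typically applies in the forward pass. The symmetric per-tensor scheme, the function name `fake_quantize`, and the example weights are assumptions for illustration, not what TensorFlow or any Gemma release actually uses.

```python
def fake_quantize(values, bits=8):
    """Snap values to a symmetric integer grid, then map them back to floats."""
    qmax = 2 ** (bits - 1) - 1
    # Per-tensor scale chosen so the largest magnitude lands on the grid edge.
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


if __name__ == "__main__":
    weights = [0.82, -0.31, 0.05, -1.20]  # hypothetical weights
    for bits in (8, 4):
        print(bits, [round(w, 3) for w in fake_quantize(weights, bits)])
```

Under this sketch, a model trained against one grid has only seen that grid's rounding error, which is why swapping in a different scheme at export time is a fair thing to question.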
r/MachineLearning community 24d ago
Using FC26 to simulate the World Cup? [D]
Maybe this should be asked in the FC26 game subreddit, but I'm not sure. Anyway, I just saw a video of someone predicting the winner of the World Cup using the simulate match feature in the game, but he only did it once. Would running this feature 100-1000 times give a significant…
23 -
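Whether 100 to 1000 repetitions is enough comes down to the standard error of an estimated win probability. A minimal sketch, assuming each simulated tournament is an independent draw; `TRUE_P` is a made-up win probability standing in for the game's simulate feature, which is not modelled here.

```python
import math
import random


def standard_error(p: float, n: int) -> float:
    """Standard error of a win-rate estimate from n independent simulations."""
    return math.sqrt(p * (1.0 - p) / n)


def estimate_win_rate(true_p: float, n: int, rng: random.Random) -> float:
    """Run n hypothetical tournaments and return the observed win fraction."""
    wins = sum(1 for _ in range(n) if rng.random() < true_p)
    return wins / n


if __name__ == "__main__":
    TRUE_P = 0.15  # hypothetical chance that one team wins the tournament
    rng = random.Random(0)
    for n in (1, 100, 1000):
        est = estimate_win_rate(TRUE_P, n, rng)
        half_width = 1.96 * standard_error(TRUE_P, n)
        print(f"n={n:5d}  estimate={est:.3f}  approx. 95% half-width=+/-{half_width:.3f}")
```

Under these assumptions the approximate 95% half-width shrinks from about 0.07 at 100 runs to about 0.022 at 1000 runs, so a single run says almost nothing. Note that this only measures the noise of the game's own simulator, not how well the game matches real football.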
r/MachineLearning community 24d ago
What laptop do you suggest I buy? [D]
Guys, for those experienced in the space: I'm actually confused at this point. I work around ML, data science, analytics, engineering, research and general programming. What laptop or workstation would you advise me on? I need speed, high performance, durability and cost…
8 -
r/MachineLearning community 24d ago
Building a Custom Drones MuJoCo Environment [P]
Hi all, Lately I have been working on creating a package for Multi Agent RL based drone environments with different objectives, all bundled into a single GitHub repository: tau-intelligence/MuJoCo-drones-gym. I am currently trying to organize things for RL community people, with…
31 -
r/MachineLearning community 24d ago
ICML non-archival workshop - worth attending? [D]
I have a paper accepted at a non-archival ICML workshop this year, and I am trying to decide whether it is worth registering and attending. By coincidence, I will already be in Seoul around that time, but I would have to pay the workshop registration fee (~$400) out of my own…
34 -
r/MachineLearning community 24d ago
How do you identify researchers who are good? [D]
About 10 years ago, I got into the basics of ML (like regression, KNNs, LVQs) and read a few papers before taking a break a few years back. It feels like there are a lot of researchers in AI now. How do you identify the ones who are actually solid vs those who (forgive my…
19 -
r/MachineLearning community 24d ago
Are We Underestimating Small Edge AI Models? [D]
A lot of recent discussion around Edge AI focuses on running increasingly larger local LLMs. Meanwhile modern smartphones already have enough compute for many practical computer vision tasks that don't require massive models at all. I recently built and released an Android…
7 -
r/MachineLearning community 25d ago
[R] Measuring the Symmetry–Data Exchange Rate
The prediction that equivariance reduces sample complexity by a factor of |G| appears in roughly every paper on geometric deep learning and is measured as an actual scaling law in roughly none of them. This paper does the measurement. The methodology is the interesting part.…
9 -
r/MachineLearning community 25d ago
How do ML researchers actually use AI tools to improve their writing? [D]
As an ML researcher, how do you use AI tools in your daily work? Do you mostly use them to clean up grammar and wording, or also to rewrite, structure, or draft technical text? submitted by /u/Hope999991
5 -
r/MachineLearning community 25d ago
KVarN: Variance-Normalized KV-Cache Quantization [R]
Excited to share some of my own work here :) KVarN is our new KV-Cache quantization method. Very briefly, we combine Hadamard rotations with variance-normalization on both axes of the K and V matrices, then round to nearest. Simple, but works very well, especially for…
21 -
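The three ingredients named in the KVarN excerpt (Hadamard rotation, variance normalization on both axes, round to nearest) can be sketched as follows. The excerpt is cut off, so this is only an illustrative reading and not the authors' implementation: the order of the row and column scaling, the symmetric integer grid, and every function name here are assumptions.

```python
import math


def hadamard(n: int) -> list[list[float]]:
    """Orthonormal Hadamard matrix via the Sylvester construction (n a power of two)."""
    h = [[1.0]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[x * scale for x in row] for row in h]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def std(values) -> float:
    """Population standard deviation, falling back to 1.0 for constant input."""
    values = list(values)
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0


def quantize(k, bits=4):
    """Rotate, scale rows then columns by their std, round to nearest (assumed order)."""
    rotated = matmul(k, hadamard(len(k[0])))
    row_s = [std(row) for row in rotated]
    scaled = [[x / s for x in row] for row, s in zip(rotated, row_s)]
    col_s = [std(col) for col in zip(*scaled)]
    scaled = [[x / s for x, s in zip(row, col_s)] for row in scaled]
    qmax = 2 ** (bits - 1) - 1
    step = max(abs(x) for row in scaled for x in row) / qmax or 1.0
    codes = [[max(-qmax, min(qmax, round(x / step))) for x in row] for row in scaled]
    return codes, row_s, col_s, step


def dequantize(codes, row_s, col_s, step):
    """Undo the scaling and the rotation to get an approximation of the input."""
    scaled = [[c * step * cs * rs for c, cs in zip(row, col_s)]
              for row, rs in zip(codes, row_s)]
    h_t = [list(col) for col in zip(*hadamard(len(codes[0])))]
    return matmul(scaled, h_t)
```

A call such as `dequantize(*quantize(k, bits=4))` returns an approximation of `k`. The motivation in this sketch is that rotation and per-axis scaling spread values more evenly before rounding, so a single rounding grid wastes fewer levels on outliers.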
r/MachineLearning community 25d ago
On-policy distillation: one of the hottest terms on PapersWithCode [R]
Hi, Niels here from the open-source team at Hugging Face. At paperswithcode.co I am trying to make it easier for people to learn about the newest techniques used across AI papers. One of the hottest terms in AI research that I've recently added is On-policy distillation, also…
27 -
r/MachineLearning community 25d ago
ICML financial aid [D]
Hello, I am curious about the selection criteria for ICML financial aid. If anyone has been granted financial aid, would you mind sharing your profile? Somehow being a black woman (2 underrepresented groups) with one paper accepted at the main conference and two papers accepted…
7 -
r/MachineLearning community 26d ago
Embedding space [D]
Hello everyone, I’m relatively new to this area of machine learning and currently experimenting with Variational Autoencoders (VAEs) to build an embedding space for an image dataset. The images have different spatial dimensions, and I cannot easily standardize them to a fixed size.…
11 -
r/MachineLearning community 26d ago
Repo for implementations of various Transformer Attn mechanisms [P]
Initially, I developed this so I could easily switch between different Attention mechanisms for my Small Language Model (SLM) experiments and benchmarking. However, I also realized that these implementations can be applied in Computer Vision, to modernize Vision Encoders, in RL, and…
14 -
r/MachineLearning community 26d ago
Research in Image/Video Gen AI models [D]
I've been going down a rabbit hole with image/video generation/editing models for a few months now. I started by playing around with Stable Diffusion and ComfyUI, then got genuinely hooked on understanding why things work, not just that they do. I have an Engineering background…
20 -
r/MachineLearning community 26d ago
Best Visual Reasoning Model in 2026 (Including APIs) [D]
For example, suppose I have a one-hour video and I provide it to ChatGPT or another AI model. If I ask complex reasoning questions about the video, which models are best suited for long-horizon video understanding and reasoning? Which models can produce the most reliable answers…
38 -
r/MachineLearning community 26d ago
I have done an ML Project as a Novice [P]
Hi there! I am going to complete my MSc in Business Analytics and am planning to do some real-life projects to attract recruiters. I am sharing one such project here: FIFA World Cup 2026 Prediction: https://amit-world-cup-2026-simulator.streamlit.app/ Project Overview Large…
5 -
r/MachineLearning community 26d ago
Has anyone heard back from the Citadel ICML travel grant? [D]
It’s confusing because they said applicants will be notified on 3rd June but also said you’ll be notified 2-4 weeks after the deadline (29th May). submitted by /u/Smol_pp001
6 -
r/MachineLearning community 26d ago
First paper acceptance (ICML Workshop), should I attend? [D]
I just finished my first year of undergrad, and I got my first first-author paper accepted to an ICML workshop! Super stoked, especially since I was lowk a crashout in high school. I wanted to know if it is worth it for me to go. It's quite expensive, and I will be the only one…
30 -
r/MachineLearning community 26d ago
NeurIPS Reciprocal Reviewers be careful in reviewing with LLMs [D]
As the title says. I am not a reciprocal reviewer but I just noticed a clever prompt injection like they did in ICML for our submission. submitted by /u/Massive-Bobcat-5363
18 -
r/MachineLearning community 26d ago
NeurIPS used uncalibrated AI detector for desk rejections [D]
I recently had a submission desk-rejected from the NeurIPS 2026 Position Paper Track for an alleged AI-policy violation. After corresponding with the track leadership and reading their public blog post, I think the broader methodological issue is worth discussing here. The track…
13 -
r/MachineLearning community 26d ago
Analysis of AlphaZero training data [D]
I am trying to train an AlphaZero model for Othello on a 6x6 board. Having been warned that too little exploration during data generation can lead to models being overconfident and trapped in some tight region of the search tree, I started with the value c_puct = 4.0, and then…
35 -
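For context on what `c_puct` controls, here is a minimal sketch of the PUCT selection rule used in AlphaZero-style search, where each child is scored as Q + c_puct · P · sqrt(N_parent) / (1 + N_child). The node statistics and the helper names `puct_score` and `select_child` are made up for illustration and are not taken from the poster's code.

```python
import math


def puct_score(q: float, prior: float, parent_visits: int, child_visits: int,
               c_puct: float) -> float:
    """Value estimate plus an exploration bonus that decays with child visits."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration


def select_child(children: list, c_puct: float) -> int:
    """Return the index of the child with the highest PUCT score."""
    parent_visits = sum(c["visits"] for c in children)
    scores = [puct_score(c["q"], c["prior"], parent_visits, c["visits"], c_puct)
              for c in children]
    return max(range(len(children)), key=scores.__getitem__)


if __name__ == "__main__":
    # Hypothetical node: move 0 is well explored and looks good,
    # move 1 is barely explored but has a reasonable prior.
    children = [
        {"q": 0.30, "prior": 0.6, "visits": 40},
        {"q": 0.00, "prior": 0.4, "visits": 5},
    ]
    for c in (0.5, 4.0):
        print(f"c_puct={c}: selects move {select_child(children, c)}")
```

With these made-up numbers, `c_puct = 0.5` keeps selecting the well-explored move 0, while `c_puct = 4.0` switches to the less-visited move 1, which is the extra exploration the post describes aiming for.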
r/MachineLearning community 27d ago
MiniMax dropped a new attention architecture. [N]
It contains something interesting about context windows. They’re natively scaling to 1M tokens with MiniMax Sparse Attention (MSA), bypassing standard quadratic complexity by completely restructuring the memory access patterns at the operator level. Instead of relying on…
26 -
r/MachineLearning community 27d ago
Thoughts on Logical Intelligence’s Kona [D]
Sometime late last year a company called Logical Intelligence developed an EBM called Kona. What do people make of the company’s claims that they have a close-to-functioning EBM? And if true, what impact would this have on existing AI? submitted by /u/Treey1234…
24