r/MachineLearning · June 15, 2026 · 1 min read

Embedded/edge ML folks: what actually eats the most time, getting data or cleaning/labeling it (time-series sensor data, not computer vision/audio)? [D]

Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.

I'm trying to understand where people doing sensor-based ML on microcontrollers (IMU, accelerometer, vibration, that kind of time-series data) actually lose the most time.

When you've built something like this, what was the bottleneck:

Getting enough real-world data in the first place?
Cleaning / labeling / organizing the data you have?
Actually building and training the model?
Getting it optimized and deployed on the device?

I'm working on a project that aims to eliminate some of these pain points, and I wanted some validation on this topic before I add more features. It is essentially Edge Impulse, but hardware-agnostic, gen-AI-native, and targeted at time-series data. I'm still trying to figure out which vertical would be best, as there are many to choose from.

submitted by /u/No-Bug-4879
