Embedded/edge ML folks: what actually eats the most time ,getting data, or cleaning/labeling it (time series sensor data, not computer vision/audio)? [D]
Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.
I'm trying to understand where people doing sensor based ML on microcontrollers (IMU, accelerometer, vibration ,that kind of time-series data) actually lose the most time.
When you've built something like this, what was the bottleneck:
- Getting enough real world data in the first place?
- Cleaning / labeling / organizing the data you have?
- Actually building and training the model?
- Getting it optimized and deployed on the device?
I am working on a project that aims to eliminate some of these pains and wanted to get some validation on this topic first before I go and add more features. It is essentially edge impulse, but hardware agnostic, gen ai native, and targeted for time series data. I am still trying to figure out what the best vertical would be as there are many to choose from.
[link] [comments]
More from r/MachineLearning
-
Loss functions in Instance Representation Learning [R]
Jun 29
-
Price elasticity model [R]
Jun 29
-
Rejected MICCAI paper: workshop -> journal/conference or directly journal/conference [R]
Jun 29
-
I built a demo agricultural planning system with an AI advisor for small-scale farmers in Nicaragua using NASA data [p]
Jun 29
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.