r/MachineLearning · May 23, 2026 · 2 min read

LQS v3.1 — an open methodology for rating AI training data (multi-oracle consensus + signed certificates) [P]

Solo author here. I spent the last six months building (and then sunsetting) a marketplace for AI training data. The marketplace failed for an interesting reason: the actual bottleneck isn't supply. There's tons of data. The bottleneck is that buyers can't independently evaluate quality, and nothing occupies the rating-authority position. Tools like Cleanlab and Galileo are diagnostics run by the data owner, not third-party attestations a procurement team or model risk officer can cite.

So I rebuilt the whole thing as the rating layer. The methodology is published with a DOI (10.5281/zenodo.20278981, CC BY 4.0) — full v3.1 paper, every dimension defined.

What's in v3.1:

- 19 dimensions: label correctness, coverage, leakage, contamination, plausibility, oracle agreement, conformal coverage, downstream projection, adversarial stability, subgroup equity, license clarity, provenance chain, and more
- 7-oracle consensus across the score, with oracle_agreement itself being a scored dimension (i.e., the score knows when the score is uncertain)
- Outcome Registry: downstream signals feed back to recalibrate oracle credibility, so the rating learns from real-world quality outcomes, not just inter-rater agreement
- Ed25519-signed certificates auditors can verify offline against the published public key (no API call needed)
- Public LQS Index: 11 tickers, ~263 datasets scored, daily rebalance, free API
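
To make the "agreement is itself a scored dimension" idea concrete, here is a minimal sketch. It is not the aggregation from the v3.1 paper: the function name `consensus`, the per-oracle credibility weights, and the spread-based agreement formula are all illustrative assumptions, and scores are assumed to lie in [0, 1].

```python
from statistics import pstdev


def consensus(oracle_scores, credibility=None):
    """Return (consensus_score, agreement) for per-oracle scores in [0, 1].

    Hypothetical sketch: a credibility-weighted mean for the score, and
    1 minus the normalized spread of the raw scores for agreement.
    """
    if not oracle_scores:
        raise ValueError("need at least one oracle score")

    credibility = credibility or {}
    # Oracles without an explicit credibility get a default weight of 1.0.
    weights = {name: credibility.get(name, 1.0) for name in oracle_scores}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("credibility weights must sum to a positive value")

    score = sum(oracle_scores[n] * weights[n] for n in oracle_scores) / total

    # For values in [0, 1] the population standard deviation is at most 0.5
    # (half the oracles at 0, half at 1), so dividing by 0.5 maps it to [0, 1].
    spread = pstdev(list(oracle_scores.values()))
    agreement = max(0.0, 1.0 - spread / 0.5)
    return score, agreement


if __name__ == "__main__":
    # Hypothetical oracle names and scores, for illustration only.
    scores = {"oracle_a": 0.82, "oracle_b": 0.79, "oracle_c": 0.40}
    print(consensus(scores, credibility={"oracle_c": 0.5}))
```

In a design like this, a low agreement value pulls the overall rating down even when the mean looks fine, which is one way for a score to "know" it is uncertain. Feeding outcome data back would then amount to updating the `credibility` weights over time.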

This is genuinely pre-revenue (zero acquired customers — being honest with you, not posturing). What I'd actually value from this sub:

- Methodology review. The paper is open. If any dimension definitions are wrong, weights are gameable, or the oracle aggregation is misspecified, I want to know now before this gets cited.
- Adversarial datasets. If you have a dataset where you think the LQS would score it wrong (either direction), I'll score it free and we can publish the disagreement.
- Comparable systems I should be citing. I'm aware of Cleanlab, Galileo, the FT Spectrum project. What else?

Free score for any public dataset: labelsets.ai/rate

Paper: https://doi.org/10.5281/zenodo.20278981

Happy to AMA on the architecture, conformal intervals, the marketplace pivot, or anything else.

submitted by /u/plomii