r/MachineLearning

Why our #1 LightGBM feature by importance made predictions worse [D]

Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.

We recently hit a classic gradient boosting trap with our pricing engine (Flyback), and I wanted to share the ablation data. We run LightGBM quantile regression to forecast secondary-market watch prices.

We engineered a variant-conditioned Bayesian target encoder to isolate within-reference pricing dynamics. LightGBM absolutely loved it. Across all our multi-seed runs, it ranked #1 in feature importance at q90 by a wide margin, with gain several times that of the next-highest feature.
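
For readers who have not used one, here is a minimal sketch of a shrinkage-style ("Bayesian") target encoder. It is not our production encoder: the `(reference, variant)` key, the `prior_weight` pseudo-count, the function names, and the toy prices are all assumptions made up for illustration. Each key is encoded as a blend of its own mean target and the global mean, so sparsely observed keys are pulled toward the prior.

```python
from collections import defaultdict


def fit_smoothed_target_encoder(keys, targets, prior_weight=10.0):
    """Fit a shrinkage target encoder; returns (encoding dict, global mean).

    prior_weight is a hypothetical pseudo-count: it acts like that many
    extra observations sitting at the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, y in zip(keys, targets):
        sums[key] += y
        counts[key] += 1
    encoding = {
        key: (sums[key] + prior_weight * global_mean) / (counts[key] + prior_weight)
        for key in counts
    }
    return encoding, global_mean


def encode(keys, encoding, global_mean):
    """Map keys to encoded values; unseen keys fall back to the global mean."""
    return [encoding.get(key, global_mean) for key in keys]


# Toy, made-up data: keys are (reference, variant) pairs, targets are prices.
train_keys = [("ref_a", "blue"), ("ref_a", "blue"), ("ref_a", "black"), ("ref_b", "steel")]
train_prices = [100.0, 120.0, 90.0, 200.0]

encoding, global_mean = fit_smoothed_target_encoder(train_keys, train_prices, prior_weight=2.0)
encoded_unseen = encode([("ref_c", "gold")], encoding, global_mean)
```

In practice the encoder should be fit out-of-fold (on training folds only, then applied to the held-out fold), otherwise each row's own label leaks into its feature.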

But when we ran a strict 4-seed × 3-variant ablation on the hold-out set, the picture inverted. Test MAPE worsened by 0.28pp, and the between-variant delta was 7x the within-variant standard deviation, so the regression was well outside seed-to-seed noise. The encoder was finding splits that looked effective in training but completely failed to generalize, because the signal it was learning was driven by irreducible label variance: unobserved factors like condition nuance, seller behavior, and timing that no feature can capture.
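
To make the "delta vs. seed noise" comparison concrete, here is a rough sketch of one way to compute it. The variant names and MAPE values are placeholders, not our results, and the pooled standard deviation is an assumed definition of "within-variant standard deviation"; other definitions are reasonable too.

```python
from math import sqrt
from statistics import mean, stdev


def ablation_summary(mape_by_variant, baseline, candidate):
    """Compare a candidate variant against a baseline across seeds.

    mape_by_variant maps variant name -> list of test MAPE values, one per seed.
    Returns (delta, within_std, ratio) where:
      delta      = mean candidate MAPE - mean baseline MAPE
      within_std = pooled sample std across seeds (equal seed counts assumed)
      ratio      = delta / within_std
    """
    delta = mean(mape_by_variant[candidate]) - mean(mape_by_variant[baseline])
    per_variant_variances = [stdev(runs) ** 2 for runs in mape_by_variant.values()]
    within_std = sqrt(mean(per_variant_variances))
    return delta, within_std, delta / within_std


# Placeholder numbers only: 3 hypothetical variants x 4 seeds, MAPE in percent.
mape_by_variant = {
    "variant_a": [10.0, 10.2, 9.8, 10.0],   # hypothetical baseline
    "variant_b": [10.5, 10.7, 10.3, 10.5],  # hypothetical "with encoder"
    "variant_c": [10.1, 10.3, 9.9, 10.1],
}

delta, within_std, ratio = ablation_summary(mape_by_variant, "variant_a", "variant_b")
```

With MAPE in percent, `delta` is in percentage points. A ratio well above 1 means the gap between variants is larger than what reshuffling the seed alone would typically produce.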

I wrote a full post breaking down the architecture, the ablation methodology, and the mechanism behind the divergence.

Happy to discuss LightGBM split mechanics, target encoding leakage, or the ablation setup.

Full post and ablation results: https://flyback.ai/engineering/target-encoding-divergence

submitted by /u/Nj-yeti
