r/MachineLearning

ML for UFC predictions: logistic regression vs random forest? [P]

Hello everyone, I am pretty new to anything ML-related, so bear with me.

I’ve been working on a UFC fight prediction project in Python using pandas + scikit-learn. Right now I’m using logistic regression since the output is binary (fighter A wins or fighter B wins). The features come from historical UFC data: striking accuracy, takedown averages, reach, height, and age. From those I generate predicted win probabilities for individual fights and for parlays. I'd like to push this project further so it can assist with round robin betting.

One thing I’ve noticed is that the model's parlays tend to just stack the highest-probability fighters, which made me start thinking more about the difference between raw win probability and actual betting value (EV). I also already knew that the relationship between MMA stats and fight outcomes is very nonlinear. For example, age might barely matter until a certain threshold, takedown stats may matter much more depending on matchup style, and certain combinations of traits seem more important than the individual stats themselves.
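To make the probability vs. value distinction concrete, here is a minimal sketch in plain Python. It assumes decimal odds, and every number in it is made up for illustration (nothing here comes from real fights or a real sportsbook). The function names `implied_probability`, `expected_value`, and `parlay` are hypothetical helpers, and the parlay math assumes the fights are independent.

```python
def implied_probability(decimal_odds: float) -> float:
    """Win probability implied by decimal odds (includes the bookmaker's margin)."""
    return 1.0 / decimal_odds


def expected_value(p_win: float, decimal_odds: float, stake: float = 1.0) -> float:
    """Expected profit: win (odds - 1) * stake with probability p_win, else lose the stake."""
    return p_win * (decimal_odds - 1.0) * stake - (1.0 - p_win) * stake


def parlay(legs):
    """Combine legs [(p_win, decimal_odds), ...] into one bet, assuming independent fights."""
    p_total, odds_total = 1.0, 1.0
    for p_leg, odds_leg in legs:
        p_total *= p_leg
        odds_total *= odds_leg
    return p_total, odds_total


# Hypothetical (model probability, decimal odds) pairs, for illustration only.
favorite = (0.80, 1.20)  # model says 80%, but the odds imply about 83%
underdog = (0.40, 3.00)  # model says 40%, but the odds imply about 33%

ev_favorite = expected_value(*favorite)
ev_underdog = expected_value(*underdog)

# Stacking two such favorites: more likely to hit than the underdog, still negative EV.
p_parlay, odds_parlay = parlay([favorite, favorite])
ev_parlay = expected_value(p_parlay, odds_parlay)

print(f"favorite: implied={implied_probability(favorite[1]):.3f} EV={ev_favorite:+.3f}")
print(f"underdog: implied={implied_probability(underdog[1]):.3f} EV={ev_underdog:+.3f}")
print(f"parlay of two favorites: p={p_parlay:.2f} odds={odds_parlay:.2f} EV={ev_parlay:+.3f}")
```

With these made-up numbers the 80% favorite is a losing bet per unit staked while the 40% underdog is a winning one, because what matters is the gap between the model's probability and the implied probability, not the probability by itself. That only holds if the model's probabilities are well calibrated, which is worth checking before trusting any EV figure.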

Because of that, I’m wondering whether random forests (or another tree-based model) would make more sense than logistic regression for capturing those thresholds and interactions. I'm still trying to fully grasp how random forests work, so I may be off base here. Anyway, I'm just trying to have fun with this project, and I’d genuinely appreciate input from anyone.
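One way to explore this question is to fit both models on the same split and compare them on probability quality rather than accuracy. The sketch below uses scikit-learn on synthetic data that deliberately contains a threshold effect and an interaction like the ones described above. The feature names (`age_diff`, `td_diff`, `opp_td_def`), the coefficients, and the hyperparameters are all assumptions for illustration, not real UFC data or tuned settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 4000

# Synthetic, hypothetical features (fighter A minus fighter B where it says "diff").
age_diff = rng.normal(0.0, 4.0, n)      # age difference in years
td_diff = rng.normal(0.0, 1.5, n)       # takedown average difference
opp_td_def = rng.uniform(0.0, 1.0, n)   # opponent takedown defence rate

# Threshold effect: age only hurts once fighter A is 5+ years older.
# Interaction: a takedown edge only matters against weak takedown defence.
logit = -1.5 * (age_diff > 5) + 1.2 * td_diff * (opp_td_def < 0.4)
p_a_wins = 1.0 / (1.0 + np.exp(-logit))
y = rng.binomial(1, p_a_wins)
X = np.column_stack([age_diff, td_diff, opp_td_def])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # min_samples_leaf keeps leaf probabilities away from exactly 0 or 1.
    "random_forest": RandomForestClassifier(
        n_estimators=300, min_samples_leaf=20, random_state=0
    ),
}

# Log loss scores the predicted probabilities themselves (lower is better).
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    scores[name] = log_loss(y_test, proba)
    print(f"{name}: log loss = {scores[name]:.4f}")
```

Because the synthetic data was built around a threshold and an interaction, any gap between the two scores here says nothing about real fights. On real data, a split by fight date (train on older fights, test on newer ones) is a safer comparison than a random split, and logistic regression can also pick up these effects if the threshold and interaction terms are added as engineered features.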

submitted by /u/xoVinny-