Enter the champions below, or generate a random set of champions.
Champions on Blue Side:
Champions on Red Side:
Anyone with the slightest experience with ARAM knows that there are many good (and bad) champions on this map. In particular, ranged champions with long-range poke and/or sustain tend to be favoured in ARAM. To quickly show that this is indeed the case, here are the top and bottom 5 champions in ARAM on Patch 6.15 by win rate after removing mirror matches:
Bottom 5:
Champion | Winrate |
Ryze | 36.25% |
Evelynn | 36.68% |
LeBlanc | 37.80% |
Rek'Sai | 38.39% |
Kha'Zix | 38.78% |
Top 5:
Champion | Winrate |
Swain | 61.22% |
Teemo | 61.32% |
Galio | 62.86% |
Sona | 63.30% |
Ziggs | 64.10% |
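For readers who want to reproduce this kind of table, here is a minimal sketch of a win-rate computation that skips mirror matches. It assumes that "mirror match" means the same champion appears on both teams, and that each game is stored as a `(blue_team, red_team, blue_won)` tuple; that layout, the function name and the tiny two-champion teams are illustrative only, not the format of the original data.

```python
from collections import Counter

def champion_win_rates(games):
    """Win rate per champion, skipping a champion's mirrored appearances.

    games: iterable of (blue_team, red_team, blue_won) tuples (assumed layout).
    """
    played = Counter()
    won = Counter()
    for blue, red, blue_won in games:
        # A champion on both sides wins and loses the same game, so that
        # game carries no information about the champion's strength.
        mirrored = set(blue) & set(red)
        for team, team_won in ((blue, blue_won), (red, not blue_won)):
            for champ in team:
                if champ in mirrored:
                    continue
                played[champ] += 1
                won[champ] += int(team_won)
    return {champ: won[champ] / played[champ] for champ in played}

# Toy data with two champions per side (real ARAM teams have five).
toy_games = [
    (("Ziggs", "Sona"), ("Ryze", "Teemo"), True),
    (("Ziggs", "Ryze"), ("Ziggs", "Sona"), True),   # Ziggs is mirrored here
    (("Ryze", "Teemo"), ("Sona", "Ziggs"), False),
]

if __name__ == "__main__":
    ranked = sorted(champion_win_rates(toy_games).items(), key=lambda kv: kv[1])
    for champ, rate in ranked:
        print(f"{champ} | {rate:.2%} |")
```

Without the mirror filter, the second toy game would count Ziggs once as a win and once as a loss, pulling the estimate toward 50%.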
With such a discrepancy in power between different champions, and given that the champions chosen in ARAM are random, it is natural to ask how much of the game is decided as soon as the champions are selected. In other words, given the champions locked in for both sides and no other information, how well can we predict the outcome of the game?
By constructing a predictive model using machine learning techniques, I have discovered that I can predict the outcome of ARAM games in Patch 6.15 with around 66% accuracy. You can play with my predictive model above, where you can enter the champions and see the predicted outcome.
Warning: technical descriptions of the model ahead.
The methodology is very standard: I collected around 160k ARAM games from the NA server on Patch 6.15, split the data into a training set and a testing set (in a 3:1 ratio), trained several machine learning models on the training set, and finally computed the prediction error on the testing set.
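The pipeline described above can be sketched in a few dozen lines of plain Python. This is an illustration under stated assumptions, not the code behind the results in this post. The feature encoding (+1 for a blue-side champion, -1 for a red-side one), the L2 penalty strength, the gradient-descent settings and the simulated games are all made up for the example.

```python
import math
import random

def encode(blue, red, index):
    """Signed indicators (assumed encoding): +1 blue side, -1 red side."""
    x = [0.0] * len(index)
    for champ in blue:
        x[index[champ]] += 1.0
    for champ in red:
        x[index[champ]] -= 1.0
    return x

def split_3_to_1(rows, seed=0):
    """Shuffle, then keep three quarters for training and one for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = 3 * len(rows) // 4
    return rows[:cut], rows[cut:]

def sigmoid(z):
    # Written in two branches so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, l2=0.01, lr=0.5, epochs=200):
    """Batch gradient descent on mean log-loss plus (l2 / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [l2 * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def accuracy(w, b, X, y):
    hits = 0
    for xi, yi in zip(X, y):
        predicted_win = b + sum(wj * xj for wj, xj in zip(w, xi)) > 0
        hits += int(predicted_win == (yi == 1))
    return hits / len(y)

def run_demo(seed=0, n_games=800, team_size=3):
    """Simulate games from made-up champion strengths, then train and test."""
    rng = random.Random(seed)
    champs = [f"champ_{i}" for i in range(10)]  # hypothetical roster
    index = {c: i for i, c in enumerate(champs)}
    strength = {c: -1.5 + 3.0 * i / 9 for c, i in index.items()}  # made up
    rows = []
    for _ in range(n_games):
        picked = rng.sample(champs, 2 * team_size)
        blue, red = picked[:team_size], picked[team_size:]
        logit = sum(strength[c] for c in blue) - sum(strength[c] for c in red)
        blue_won = 1 if rng.random() < sigmoid(logit) else 0
        rows.append((encode(blue, red, index), blue_won))
    train, test = split_3_to_1(rows, seed)
    w, b = fit_logistic([x for x, _ in train], [y for _, y in train])
    acc = accuracy(w, b, [x for x, _ in test], [y for _, y in test])
    return len(train), len(test), acc

if __name__ == "__main__":
    print(run_demo())
```

Under this particular encoding every row sums to zero, because both teams have the same size. The columns are therefore linearly dependent, which is one way a design matrix can end up rank deficient, and a small L2 penalty keeps the fitted weights well defined.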
Several different models were tried, including logistic regression, random forest, XGBoost (XGB), and some simple multilayer perceptrons (MLPs). Somewhat surprisingly, the logistic regression model performed remarkably well against the other models. A small amount of regularization was needed for the logistic regression since the covariate matrix was rank deficient.
The result from the testing set is as follows:
Confusion Matrix and Statistics
Prediction | Reference: LOSS | Reference: WIN |
LOSS | 13617 | 7174 |
WIN | 7371 | 14621 |
Statistic | Value |
Accuracy | 0.66 |
95% CI | (0.6555, 0.6645) |
No Information Rate | 0.5094 |
P-Value [Acc > NIR] | <2e-16 |
Kappa | 0.3197 |
Mcnemar's Test P-Value | 0.1041 |
Sensitivity | 0.6488 |
Specificity | 0.6708 |
Pos Pred Value | 0.6549 |
Neg Pred Value | 0.6648 |
Prevalence | 0.4906 |
Detection Rate | 0.3183 |
Detection Prevalence | 0.4860 |
Balanced Accuracy | 0.6598 |
'Positive' Class | LOSS |
This seems very good for a model that sees only the champion selections: 66% accuracy against a no-information rate of about 51%. The ROC curve is as follows:
[Figure: ROC curve]
There may still be room for improvement.
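As a sanity check, most of the summary statistics reported with the confusion matrix can be recomputed from its four counts alone. The sketch below does that arithmetic with LOSS as the positive class, as in the reported output. The function name and the dictionary keys are my own; the formulas are the standard definitions.

```python
def confusion_stats(tp, fn, fp, tn):
    """Standard summary statistics for a 2x2 confusion matrix.

    tp: predicted positive, actually positive
    fn: predicted negative, actually positive
    fp: predicted positive, actually negative
    tn: predicted negative, actually negative
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    prevalence = (tp + fn) / total
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return {
        "accuracy": accuracy,
        "no_information_rate": max(prevalence, 1 - prevalence),
        "kappa": (accuracy - expected) / (1 - expected),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "pos_pred_value": tp / (tp + fp),
        "neg_pred_value": tn / (tn + fn),
        "prevalence": prevalence,
        "detection_rate": tp / total,
        "detection_prevalence": (tp + fp) / total,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }

# Counts from the testing set, with LOSS as the positive class:
# predicted LOSS / actual LOSS = 13617, predicted LOSS / actual WIN = 7174,
# predicted WIN / actual LOSS = 7371, predicted WIN / actual WIN = 14621.
stats = confusion_stats(tp=13617, fn=7371, fp=7174, tn=14621)

if __name__ == "__main__":
    for name, value in stats.items():
        print(f"{name:>22}: {value:.4f}")
```

The recomputed values agree with the reported figures to within rounding. Note that the no-information rate is simply the share of the more common outcome, so 66% accuracy is a gain of roughly 15 percentage points over always guessing that outcome.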