Wednesday, May 6, 2015

The NA server starts 150K+ new games per hour during peak periods almost every day

How popular is League of Legends?

Below is a graph for NA server activity in terms of (estimated) number of games per hour on April 14th, 2015.


As you can see, between 2AM and 8AM (keep in mind this is west coast time - PDT), fewer than 50K games start per hour. Even so, according to my rough estimate, around 500 games still start every minute on the NA server during these off-peak hours - which is very impressive considering that a game of League of Legends usually lasts 20-30 minutes or more and involves 10 players.

On the other hand, during peak periods of the day there can easily be more than 150K new games starting in an hour. This usually happens between 7PM and 10PM. For example, on April 14th (which is a Tuesday, in case you are interested), about 270K games started on the NA server between 8PM and 9PM. That's a lot of games!

There does seem to be some variation in server activity. Here's a similar but bigger plot for most of April:

It seems that there is some day-to-day variation. Since Patch 5.7 was launched on April 8th, it is not too surprising that we see some spikes in activity right after the new patch. It would also be interesting to know whether or not the server activity correlates with LCS/LCK/LPL in any shape or form. If you are interested in analyzing this, I have provided the raw data here.


A note on how the number of games per hour is estimated: each game on the server has a unique Match ID associated with it, which is assumed to auto-increment with time. Therefore, if I can sample, say, 10000 games that started in an hour and find the smallest and the largest Match ID in this sample, then the difference between the two gives an estimate of the total number of games that started during this hour.
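The estimate described in this note can be sketched in a few lines of Python. This is only an illustration, not the code used for the plots: the function name is made up, and it assumes Match IDs are consecutive with no gaps and are not shared with other servers or queues. The second value applies a small correction for the fact that a sample rarely contains the very first and very last game of the hour; with thousands of samples it barely matters.

```python
def estimate_games_in_hour(sampled_match_ids):
    """Estimate how many games started in one hour from a sample of Match IDs.

    Hypothetical helper: assumes Match IDs auto-increment by one per game.
    Returns (naive_estimate, corrected_estimate).
    """
    ids = list(sampled_match_ids)
    n = len(ids)
    if n < 2:
        raise ValueError("need at least two sampled Match IDs")

    span = max(ids) - min(ids)

    # Naive estimate: every ID between the smallest and the largest sampled
    # ID belongs to a game that started in this hour.
    naive = span + 1

    # For a uniform random sample of size n, the expected span is roughly
    # (n - 1) / (n + 1) of the true range, so scale the span back up.
    corrected = span * (n + 1) / (n - 1)
    return naive, corrected
```

With only three sampled IDs, say 1000, 1500 and 2000, the two estimates differ a lot (1001 versus 2000.0); with 10000 samples they are nearly identical.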

Saturday, May 2, 2015

This blog, goldper10, reddit bot, and the future

Dear reader,

Since earlier this year, I have also started writing for goldper10.com. Here are the articles I've posted for goldper10 so far:

Nemesis Draft - Win Rates and Analysis

How Win Rate Lies To You

Analysis on Teleport and Smite Top Lane in Solo Queue

I have also made a reddit bot which garnered some attention. This bot reads a player's match history and suggests other champions the player should try - using association rule mining. I had great fun making it, and I definitely plan to make other kinds of bots in the future that also use neat machine learning / data mining techniques on League of Legends data. Additionally, I want to thank everyone who provided me with great feedback (especially on the ARAM issue), and I am very happy that many people enjoyed using the bot.
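For readers who have not met association rules before, here is a minimal Python sketch of the idea. It is not the bot's actual code: the function names, thresholds, and toy data are all hypothetical, and it only mines simple one-to-one rules ("players who play A also play B"), whereas full association rule mining also handles larger sets of champions.

```python
from collections import Counter
from itertools import permutations


def mine_rules(players, min_support=0.3, min_confidence=0.6):
    """Mine rules 'plays a -> also plays b' from per-player champion sets.

    Returns a list of (a, b, support, confidence) tuples.
    """
    n = len(players)
    single = Counter()  # number of players who play each champion
    pair = Counter()    # number of players who play both a and b
    for champs in players:
        champs = set(champs)
        single.update(champs)
        pair.update(permutations(sorted(champs), 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n             # share of all players with both
        confidence = count / single[a]  # share of a-players who also play b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


def suggest(rules, played):
    """Suggest champions the player does not play yet, best rule first."""
    played = set(played)
    scores = {}
    for a, b, _support, confidence in rules:
        if a in played and b not in played:
            scores[b] = max(scores.get(b, 0.0), confidence)
    return sorted(scores, key=lambda champ: (-scores[champ], champ))


# Toy, made-up champion pools for four players.
toy_players = [{"Ahri", "Lux"}, {"Ahri", "Lux", "Zed"}, {"Zed", "Talon"}, {"Ahri", "Lux"}]
toy_rules = mine_rules(toy_players, min_support=0.5, min_confidence=0.6)
```

In this toy data, a player who only plays Ahri would be pointed towards Lux, because every Ahri player in the sample also plays Lux.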

I want to thank all my supporters so far (especially /u/x-o). It has been a bumpy road.

As far as the future is concerned, I will keep writing for goldper10.com as well as producing more content right here - ideally on a weekly basis. I also want this blog to consolidate all of my content such that everything is self-contained right here.

Thank you again,

Sufficiency

Monday, April 20, 2015

How I mine and analyze Riot's data

I seem to get this question a lot, so I figure I should write a short piece about this.

Disclaimer: I am not a computer scientist, software developer, or a DBA. I just try to make things work with my hacky ways.

First of all, the Riot API provides a way to download game data from Riot's servers. You need to read their documentation to figure out what kinds of data are available, but the short answer is that there is a LOT of data available, and it does take some bandwidth if you mine continuously just below Riot's rate limit.

All of Riot's data are in the JSON format. There are a lot of tools which will allow you to parse JSON, but I am currently using a Java package called ulti. With this and my Java code, I request data from Riot's servers while staying barely below the threshold they set out.
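Staying just below a request threshold can be done with a small sliding-window limiter. The sketch below is in Python rather than Java and is not the code described above; the class name is made up, and the actual limits for an API key have to be taken from Riot's documentation. The clock and sleep functions are parameters only so the logic can be tried out without really waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_requests` calls in any `window` seconds (hypothetical helper)."""

    def __init__(self, max_requests, window, clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests still inside the window

    def wait(self):
        """Block until one more request may be sent, then record it."""
        while True:
            now = self.clock()
            # Forget requests that have already left the window.
            while self.sent and now - self.sent[0] >= self.window:
                self.sent.popleft()
            if len(self.sent) < self.max_requests:
                self.sent.append(now)
                return
            # Window is full: sleep until the oldest request leaves it.
            self.sleep(self.window - (now - self.sent[0]))
```

Calling `limiter.wait()` right before every HTTP request keeps the request rate at or below the configured limit, for example `RateLimiter(10, 10.0)` for an assumed limit of 10 requests per 10 seconds.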

On a side note, I am sure other programming languages like Python can do the job just as well, if not better. It just so happens that I have fairly strong roots in C/C++ and I enjoy coding in C-like languages.

Once the data is collected, I save it in a PostgreSQL database. If you are also mining data from Riot, I highly recommend that you set up a database for storage, as the data can easily get out of hand if it is stored in plain text. Currently, my database is about 400GB in size; if it were entirely exported to CSV files, I would not be surprised if they took more than 4TB of storage.

My disk usage from the data I mined since January, 2015

Databases also make retrieval easier, since SQL queries are fairly efficient.

When it comes to actual data analysis, I use a combination of SQL (simple loading and aggregation), R, and Excel. Loading data and doing simple aggregations is easy; just write some SQL queries.
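As an illustration of what such an aggregation query can look like, here is a small self-contained sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL, and the `participants` table, its columns, and the handful of rows are all made up for the example; the real schema is not shown in this post.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostgreSQL one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE participants (match_id INTEGER, champion TEXT, won INTEGER)"
)

# Toy rows, not real data: one row per player per game, won is 1 or 0.
conn.executemany(
    "INSERT INTO participants VALUES (?, ?, ?)",
    [
        (1, "Sona", 1), (1, "Bard", 0),
        (2, "Sona", 1), (2, "Zed", 0),
        (3, "Sona", 0), (3, "Zed", 1),
    ],
)

# Games played and win rate per champion, highest win rate first.
rows = conn.execute(
    """
    SELECT champion, COUNT(*) AS games, AVG(won) AS win_rate
    FROM participants
    GROUP BY champion
    ORDER BY win_rate DESC, champion
    """
).fetchall()
```

The same `GROUP BY` query should work essentially unchanged on PostgreSQL, though there `AVG` over an integer column returns a numeric rather than a float.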




R, on the other hand, is slower, but it is packed with more statistical tools that allow more insightful analysis.


Excel is another tool that I often use for quick-and-dirty plotting. Not every plot I make is sophisticated; sometimes I just want a quick bar chart and Excel does the job really well.


So there you have it. I hope it helps!



Wednesday, April 15, 2015

Flash on D or F - revisited

Let us revisit this question. Flash on D or on F? It seems that Flash on D is slightly more popular. But perhaps more interestingly, it is possible that players who put Flash on D have different lane preferences compared to players who put Flash on F. I will also give my theory on this phenomenon near the end of this blog entry.

I analyzed about 370k ranked solo queue games played on Patch 5.6 and found about 600k unique players. Here's the frequency table for these players:

               Flash on D   Flash on F   Not Identifiable
# of Players   301269       290322       10947
Percentage     50.0%        48.2%        1.8%

In summary, there are slightly more players who use D for Flash than F. Note that the "not identifiable" 1.8% of the players are due to a combination of inconsistent use of D and F for Flash and/or running no Flash at all.
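One way to assign each player to a group is sketched below in Python. This is an assumption about the procedure rather than the exact rule used for the table: it presumes each game record gives the player's two summoner spell slots, that the first slot is the D key and the second is the F key, and that Flash has spell ID 4 (worth checking against Riot's static data). A player is labelled only if Flash always sits on the same key.

```python
FLASH_ID = 4  # assumed summoner spell ID for Flash


def flash_key(games):
    """Classify one player from their games.

    `games` is an iterable of (spell1_id, spell2_id) pairs, where spell1 is
    assumed to be the D slot and spell2 the F slot. Returns "D", "F", or
    None when the player is not identifiable (never takes Flash, or uses
    both keys across games).
    """
    keys = set()
    for spell1, spell2 in games:
        if spell1 == FLASH_ID:
            keys.add("D")
        elif spell2 == FLASH_ID:
            keys.add("F")
    return keys.pop() if len(keys) == 1 else None
```

For example, a player with games `[(4, 14), (4, 11)]` is a "D" player, while `[(4, 14), (14, 4)]` is not identifiable.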

More interestingly, it is possible that players who use D for Flash have slightly different lane preferences compared to players who use F for Flash - although the difference in preference is fairly small. As you can see below, players with Flash on D choose to play mid lane 19.8% of the time; for players with Flash on F, it's 20.5%.

Player \ Chance to play    BOTTOM   JUNGLE   MIDDLE   TOP
Players with Flash on D    40.2%    20.6%    19.8%    19.5%
Players with Flash on F    40.0%    19.8%    20.5%    19.6%

Again, the difference is small, but a chi-squared test of independence shows that it is statistically significant, so the choice of D or F for Flash and the choice of lane are probably not independent[1].
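For reference, a chi-squared test of independence on a table like this can be sketched in plain Python. The per-group game counts behind the percentages are not given in this post, so the example applies the published lane shares to a HYPOTHETICAL 100,000 games per group; the result is therefore only an illustration, not a reproduction of the actual test. The critical value 7.815 is the standard 5% cutoff for 3 degrees of freedom.

```python
def chi_squared(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


CRITICAL_3DOF_5PCT = 7.815  # chi-squared critical value, 3 dof, alpha = 0.05

# Lane shares from the table above (BOTTOM, JUNGLE, MIDDLE, TOP), applied to
# a hypothetical number of games per group. The real counts are not known.
games_per_group = 100_000
shares = [
    [0.402, 0.206, 0.198, 0.195],  # Flash on D
    [0.400, 0.198, 0.205, 0.196],  # Flash on F
]
counts = [[round(share * games_per_group) for share in row] for row in shares]

stat, dof = chi_squared(counts)
significant = stat > CRITICAL_3DOF_5PCT
```

Keep in mind that the test treats every game as an independent observation, which is questionable here because the same player contributes many games.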

I have a theory for the result above.

First of all, why do some players choose D and some players choose F for Flash? Well, when you play your first game on a level 1 account, the game places Heal on D and Ghost on F by default. Flash is not available for a level 1 account; but over time you will gain the option to use Flash and you will be faced with the dilemma of assigning it to D or F.

I think at this point, the decision to use D or F for Flash depends on whether the player thinks Flash is similar to Ghost or not. If the player treats Flash like a movement spell similar to Ghost, I think it is natural to keep Flash on F. On the other hand, if the player treats Flash differently from Ghost, it makes more sense to bind it to a different button.

Therefore, the choice of using D or F for Flash may be demonstrating different modes of thinking for players - and perhaps the button choice is a reflection of the player's personality. Because of this, we also see that players who choose D for Flash have slightly different lane choices compared to players who choose F for Flash.

Then again, this is just my theory. It would be cool to be able to do some personality tests on players and see how personality correlates with lane choices though!


Footnote
[1] Feel free to rip me apart on how I inappropriately used the chi-squared test here.

Thursday, April 9, 2015

URF Win Rate 2015

95k games collected via Riot API.
Sorted by win rate highest to lowest.
For win rates, mirror match-ups are filtered out.
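A sketch of how popularity and win rate can be computed with mirror match-ups filtered out is given below. It is a hypothetical Python illustration, not the code behind the table: it assumes popularity means the share of games in which a champion appears on at least one team, and that a game is skipped for a champion's win rate when that champion is on both teams.

```python
from collections import defaultdict


def champion_stats(games):
    """Compute {champion: (popularity, win_rate)} from finished games.

    `games` is an iterable of (winning_team_champions, losing_team_champions).
    win_rate is None if the champion only ever appeared in mirror match-ups.
    """
    picked = defaultdict(int)   # games in which the champion appears at all
    counted = defaultdict(int)  # games that count towards the win rate
    wins = defaultdict(int)
    total = 0

    for winners, losers in games:
        total += 1
        winners, losers = set(winners), set(losers)
        for champ in winners | losers:
            picked[champ] += 1
        # Champions on exactly one team: mirror match-ups are skipped.
        for champ in winners ^ losers:
            counted[champ] += 1
            if champ in winners:
                wins[champ] += 1

    return {
        champ: (
            picked[champ] / total,
            wins[champ] / counted[champ] if counted[champ] else None,
        )
        for champ in picked
    }
```

A game with Zed on both teams still counts towards Zed's popularity, but it says nothing about whether Zed is strong, so it is left out of his win rate.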

Champion Popularity Win Rate
Sona 27.04% 69.03%
Galio 5.27% 64.13%
Wukong 14.29% 58.76%
Malzahar 12.14% 56.79%
Skarner 2.92% 56.79%
Jax 6.67% 56.69%
Maokai 12.43% 55.58%
Diana 6.31% 55.36%
Evelynn 11.62% 55.23%
Karthus 11.36% 55.16%
Fiora 6.47% 55.02%
Hecarim 23.67% 54.97%
Zed 32.42% 54.97%
Orianna 4.67% 54.72%
Xin Zhao 9.03% 54.72%
Poppy 5.26% 54.49%
Vladimir 10.77% 54.12%
Shaco 27.51% 54.09%
Ahri 14.07% 53.65%
Zyra 4.17% 53.41%
Sivir 5.64% 53.32%
Master Yi 17.00% 53.29%
Soraka 8.08% 53.24%
Ezreal 39.49% 53.13%
Nami 7.57% 53.06%
Morgana 13.67% 53.03%
Jayce 11.85% 52.95%
Nocturne 1.65% 52.80%
Karma 8.43% 52.66%
Fizz 15.71% 52.63%
Tryndamere 5.72% 52.62%
Shen 6.40% 52.55%
Lulu 5.77% 52.48%
Kog'Maw 5.76% 52.29%
Urgot 7.59% 51.96%
Kennen 3.87% 51.85%
Syndra 6.49% 51.75%
Annie 9.86% 51.71%
Talon 7.36% 51.70%
Malphite 10.73% 51.56%
Kayle 5.05% 51.47%
Cho'Gath 8.81% 51.47%
Sejuani 2.85% 51.41%
Janna 2.99% 51.18%
Rumble 3.13% 51.07%
Twisted Fate 7.28% 50.83%
Yorick 3.85% 50.83%
Blitzcrank 23.23% 50.72%
Amumu 7.76% 50.60%
Irelia 2.82% 50.40%
Swain 6.20% 50.37%
Mordekaiser 4.80% 50.12%
Graves 4.10% 50.08%
Dr. Mundo 5.03% 50.02%
Nautilus 5.36% 50.02%
Gangplank 11.51% 50.00%
Katarina 17.83% 49.89%
Varus 3.60% 49.81%
Alistar 12.67% 49.80%
Azir 8.99% 49.65%
Darius 5.21% 49.65%
Lux 20.67% 49.40%
Nasus 13.10% 49.22%
Jarvan IV 3.04% 49.08%
Nidalee 30.39% 48.98%
Quinn 1.70% 48.60%
Warwick 4.10% 48.45%
Veigar 8.41% 48.36%
Heimerdinger 5.26% 48.29%
Zac 1.53% 48.15%
Taric 3.03% 48.07%
Draven 1.87% 48.05%
Teemo 19.35% 48.01%
Riven 14.79% 47.93%
Rammus 2.59% 47.82%
Ziggs 10.12% 47.76%
Ashe 10.57% 47.31%
Ryze 10.71% 46.44%
Volibear 1.92% 46.38%
Vi 6.12% 46.35%
Aatrox 1.53% 46.31%
Vel'Koz 8.72% 46.31%
Leona 8.39% 45.95%
Udyr 5.59% 45.77%
Rengar 5.01% 45.66%
Kha'Zix 4.40% 45.60%
Renekton 1.68% 45.45%
Shyvana 0.96% 45.42%
Lucian 13.78% 45.29%
Trundle 1.21% 45.19%
Pantheon 6.21% 44.96%
Lissandra 3.98% 44.95%
Miss Fortune 2.50% 44.93%
Fiddlesticks 6.01% 44.92%
Xerath 6.07% 44.65%
Braum 1.55% 44.59%
Caitlyn 3.22% 44.54%
Sion 4.49% 44.52%
Jinx 15.27% 44.35%
Brand 6.16% 44.17%
Singed 2.96% 44.07%
Viktor 3.91% 44.05%
Gragas 3.17% 43.35%
Olaf 1.87% 43.27%
Twitch 2.27% 43.21%
Tristana 3.26% 42.81%
Vayne 5.22% 42.78%
Kassadin 12.51% 42.61%
LeBlanc 20.49% 42.41%
Garen 3.64% 42.01%
Nunu 4.16% 41.96%
Akali 6.91% 41.71%
Rek'Sai 3.29% 41.59%
Lee Sin 7.02% 41.52%
Anivia 3.36% 41.43%
Corki 3.24% 40.90%
Elise 0.97% 39.96%
Yasuo 6.30% 39.44%
Cassiopeia 3.30% 38.82%
Zilean 4.50% 38.66%
Gnar 2.51% 38.66%
Kalista 2.08% 36.82%
Thresh 5.44% 34.32%
Bard 11.81% 30.75%


Blue side win rate: 52%

I think Riot deserves some praise here. They spent a lot of time tuning many champions individually so they do not need to be disabled. In URF 2014, 9 champions were either initially or later disabled due to their gross power in URF (including Kassadin, Ryze, Sona, Hecarim, Kayle, Soraka, Nidalee, and Alistar). This year, no champions have been disabled (yet) after global healing/shield adjustments and some individual adjustments.

The URF mode win rate from last year can be found here.

Wednesday, December 17, 2014

Legend of the Poro King Mode Statistics!

75k games collected.

Mirror matchups are filtered out for the computation of champion win rate (not for popularity).

Champion Win % Popularity
Sona 65.57% 22.45%
Heimerdinger 61.14% 11.18%
Janna 60.49% 7.49%
Kog'Maw 60.43% 5.99%
Sion 60.40% 7.68%
Swain 58.74% 4.16%
Galio 58.61% 9.20%
Maokai 58.12% 7.54%
Ziggs 57.64% 28.80%
Zyra 57.12% 4.49%
Vladimir 57.02% 4.25%
Karthus 56.95% 7.14%
Rammus 56.91% 3.17%
Wukong 56.27% 13.88%
Leona 55.96% 6.08%
Caitlyn 55.92% 12.59%
Alistar 55.90% 13.67%
Singed 55.80% 8.22%
Lux 55.75% 25.45%
Taric 55.61% 1.84%
Varus 55.58% 9.45%
Fiddlesticks 55.28% 16.28%
Talon 54.73% 7.98%
Sivir 54.53% 14.19%
Amumu 53.87% 10.60%
Garen 53.65% 10.65%
Malzahar 53.45% 13.51%
Vel'Koz 53.30% 10.50%
Miss Fortune 53.23% 9.77%
Teemo 53.21% 15.51%
Master Yi 53.13% 13.99%
Jinx 53.09% 17.70%
Katarina 52.84% 11.63%
Soraka 51.94% 8.37%
Fiora 51.91% 8.23%
Ashe 51.91% 14.17%
Poppy 51.74% 3.34%
Yorick 51.73% 1.28%
Draven 51.36% 3.47%
Trundle 51.28% 1.28%
Gangplank 51.27% 9.00%
Brand 51.24% 6.71%
Mordekaiser 50.95% 2.26%
Sejuani 50.94% 13.93%
Graves 50.43% 3.65%
Morgana 50.34% 14.96%
Jarvan IV 50.28% 5.65%
Jayce 50.26% 14.81%
Nautilus 50.24% 3.82%
Anivia 50.14% 3.02%
Dr. Mundo 50.00% 4.40%
Annie 49.47% 7.53%
Darius 49.43% 9.43%
Warwick 49.40% 2.65%
Diana 49.28% 2.87%
Nami 49.21% 4.44%
Blitzcrank 49.20% 18.67%
Viktor 49.16% 3.34%
Ezreal 49.15% 27.94%
Nasus 48.89% 2.89%
Xin Zhao 48.76% 4.63%
Karma 48.59% 5.53%
Twisted Fate 48.33% 6.61%
Olaf 48.19% 1.97%
Pantheon 48.15% 7.16%
Irelia 48.07% 2.78%
Gnar 47.79% 10.32%
Malphite 47.46% 10.22%
Volibear 47.36% 4.28%
Shen 47.35% 2.70%
Azir 47.18% 10.35%
Ryze 47.11% 2.86%
Xerath 47.11% 21.07%
Renekton 47.03% 2.10%
Cho'Gath 46.97% 8.27%
Cassiopeia 46.95% 2.77%
Ahri 46.93% 8.53%
Jax 46.87% 5.26%
Hecarim 46.76% 2.34%
Fizz 46.53% 11.60%
Corki 46.20% 4.11%
Quinn 45.92% 2.02%
Rek'Sai 45.76% 13.38%
Rumble 45.76% 3.46%
Yasuo 45.54% 17.67%
Braum 45.52% 6.19%
Shyvana 44.98% 2.36%
Vi 44.74% 7.41%
Lissandra 44.71% 5.88%
Vayne 44.52% 5.95%
Veigar 44.52% 10.81%
Skarner 44.42% 3.92%
Nidalee 44.42% 24.56%
Orianna 44.39% 23.58%
Lucian 44.35% 15.40%
Kayle 44.18% 3.24%
Twitch 44.14% 3.69%
Shaco 43.97% 6.61%
Riven 43.76% 9.55%
Nunu 43.41% 4.27%
Kennen 43.25% 4.33%
Urgot 43.23% 0.94%
Nocturne 43.21% 0.76%
Zed 42.86% 14.02%
Lulu 42.71% 11.87%
Tryndamere 42.63% 3.47%
Gragas 42.50% 4.47%
Udyr 42.25% 1.66%
Akali 42.11% 7.43%
Rengar 41.85% 2.43%
Tristana 41.74% 7.59%
Kassadin 41.67% 2.60%
Kalista 40.83% 12.42%
Thresh 40.69% 6.53%
Syndra 40.62% 4.22%
Zac 40.45% 6.82%
Zilean 40.20% 8.58%
Aatrox 39.86% 1.40%
Lee Sin 39.72% 10.08%
Kha'Zix 39.27% 2.87%
Elise 38.97% 0.61%
LeBlanc 36.72% 9.80%
Evelynn 34.34% 0.52%



Side        Win Rate
Blue Side   58.47%
Red Side    41.53%

I wish I had made up the 58.47% win rate on blue side. Unfortunately, it's real.
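A quick sanity check that a gap this large is not just noise: under the assumption that each game is an independent coin flip with a 50% chance for blue side, the z-score of the observed win rate can be computed as below. The function name is hypothetical and the game count is the rounded 75k quoted above.

```python
import math


def z_score(win_rate, games, null_rate=0.5):
    """z-score of an observed win rate against a null win rate (normal approximation)."""
    standard_error = math.sqrt(null_rate * (1 - null_rate) / games)
    return (win_rate - null_rate) / standard_error


# Blue side win rate from the table, with roughly 75k games collected.
z = z_score(0.5847, 75_000)
```

The result is a z-score of about 46, far beyond anything that random fluctuation around 50% would produce.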