Boom Beach Data Science
Outline
- Boom Beach Background
- Data science in a game team
- 90% of the daily work
- Task force operation difficulty predictor
- 5% of the daily work
I Boom Beach
Come with a plan or leave in defeat!
- Combat strategy game set in a tropical archipelago
- Attack hundreds of unique island bases
- Explore a huge tropical archipelago
- Raid other players’ bases
- Defeat the evil Blackguard
II Data science in a game team
- One data scientist in each (live) game team
- Support the game development and maintenance work with data and stats
- Very important: responsible for interpreting the statistics
- Work is fast-paced, mostly short-term (same day) tasks
What’s a data scientist
Data scientist is an overloaded term
Game team data scientist: descriptive/exploratory work
- What’s happening out there in the world among our players?
- How are the players responding to the game product?
Support decision making
- Fact checking before decisions
- Learning after decisions
Practically “ad hoc” work
- Write-only programming :)
- Interactive querying and visualization
- Writing reports
- Relatively little A/B testing
- (Almost) no data products
What’s a game team
A game team is a small (fewer than 20 people) team responsible for the game product
- Our games are services that are continuously being improved
Game design principles
- Making the games fun
- Long lasting
- Listening to players
- Playing the game yourself
Game development is an artisanal craft
- Everything comes from the experience and skill of the team
- FALSE: “Supercell games are based on a data-driven formula”
- Instead: The best people make the best games
- Data scientist is only needed after a game goes to beta
- A game has to exist, and have players, to be able to generate any data
Customers of analytics
- Game teams
- Finance
- Marketing
- Player support
- Community
- Leadership
- Everybody in the company!
Example KPIs
So LTV increased because D7–180 retentions have improved while ARPDAU has stayed flat, or what’s the reason?
– someone somewhere
Retention: how many players return to the game d days after installation
- Day-1 retention, day-7 retention, …
- 40%, 20%, 10% retention for D1, D7, D30 used to be considered “good enough”
- Compare: Clash of Clans has 10% D720 retention
LTV: lifetime value, essentially ARPU at 180 or 360 days after installation
ARPU: average revenue per user, over d days after installation
- Day-1 ARPU, day-7 ARPU, …
ARPDAU: average revenue per DAU
DAU: daily active users
Plus others, ad infinitum
- MAU: monthly active users
- Revenue: daily, weekly, monthly, …
- Session count, length
- New players: how many players installed the game
- Concurrent players: how many players are logged in simultaneously
- PLTV: predicted LTV, expected value of LTV in the future
Sliced and diced
By country, state, platform, marketing network, language, …
SELECT SUM(x), AVG(y) FROM table GROUP BY z
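A minimal pandas sketch of one such sliced KPI, here day-7 retention by country. The table layout and column names are assumptions made up for the example, and (as the warnings below point out) the exact retention definition varies between organizations.

import pandas as pd

# Toy data; in practice these rows come from the analytics warehouse.
installs = pd.DataFrame({
    "player_id": [1, 2, 3, 4],
    "install_date": pd.to_datetime(["2015-01-01"] * 4),
    "country": ["FI", "FI", "US", "US"],
})
logins = pd.DataFrame({
    "player_id": [1, 2, 3, 1],
    "login_date": pd.to_datetime(["2015-01-08", "2015-01-02",
                                  "2015-01-08", "2015-01-02"]),
})

# Day-7 retention by country: share of installs that log in exactly 7 days later.
merged = logins.merge(installs, on="player_id")
merged["day"] = (merged["login_date"] - merged["install_date"]).dt.days
d7_players = merged.loc[merged["day"] == 7, "player_id"].unique()
installs["retained_d7"] = installs["player_id"].isin(d7_players)
print(installs.groupby("country")["retained_d7"].mean())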
Some warnings
- KPIs are defined differently in different organizations
- KPIs are computed differently in different organizations
- The data pipeline will influence how the KPIs turn out
Example game analytics questions
- What are our DAU and revenue going to be six months ahead?
- Are the Dr. Terror levels and Task Force operations challenging enough for endgame players?
- Was the $1 price point worth introducing?
- What are the win rates by troops and troop combinations, and their recent trends?
- How many players churn after game updates?
- Is the PvP matchmaking working as intended?
- How many bases do the players have on their maps?
- How is the Arabic localization usage picking up?
- Could you sample nice task force attacks to be replayed in the company lobby daily?
- Could you generate a leaderboard of top Chinese players?
- How important is player-vs-player (PvP) compared to player-vs-environment (PvE)?
- Which troop combinations are being used the most and the least?
- Are new troops replacing some older ones?
- How many players are playing the in-game events (Dr. Terror, Gearheart, Hammerman attack)?
- Are the Power Bases well-balanced?
- Is the tutorial funnel working or is there a problem?
- What’s the outcome of the recent TV campaigns?
- How many riflemen were deployed during first year of Boom?
- 118,000,000,000 (of which only 36% survived)
- How many resources do players have, gain, and consume by HQ level?
- One integer overflow bug was first found by staring at data
- How many players logged in between 11:55–15:35 EET?
- Are the push notifications valuable?
You probably got the point
…that the list is endless
Meta-questions
- Why is metric X changing?
- Usually asked during the update cycle
- Why is metric X not changing?
- Usually asked after a game update
- Why is metric X so good/bad in market Y?
III Task Force operation difficulty predictor
- Currently the only in-game data product
Task Forces
- Collaborative gameplay feature
- Players can form Task Forces with up to 50 members in each
- Task Forces organize collaborative attacks against the evil Blackguard
- Each operation is run by one task force against one target
The problem
- Predict, before the attack starts, how difficult a given operation will be for a given Task Force
The solution overview
Inputs:
- XP levels of all Task Force members: list of up to 50 integers, each between 12 and 62
- Operation tier: integer between 1 and 20
There are in total an astronomically large number of different task force combinations.
Output:
Success probability (win rate), bucketed into a difficulty label: too easy, easy, normal, hard, impossible
Algorithm:
Logistic regression
The solution, with details
Inputs:
Encode operation tier and player XP level into a feature vector:
- Both the operation tier and XP levels are encoded as one-hot vectors and concatenated together.
- Also, we add 20 × 51 interaction features for all combinations of operation tiers and XP levels
- Note also that the linear model includes a bias (intercept) term
Labels come from whether past task force operations were successful or not: $y_i = 1$ for a successful operation, $y_i = 0$ otherwise
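A sketch of this feature construction in numpy. The exact block layout, and the choice to count task force members per XP level rather than concatenate per-member one-hot vectors, are assumptions for illustration; the slides only specify one-hot tier and XP blocks plus the 20 × 51 interaction features.

import numpy as np

N_TIERS = 20                     # operation tiers 1..20
XP_MIN, XP_MAX = 12, 62
N_XP = XP_MAX - XP_MIN + 1       # 51 possible XP levels

def encode(tier, member_xp_levels):
    """One-hot tier + member counts per XP level + all tier/XP interactions."""
    tier_onehot = np.zeros(N_TIERS)
    tier_onehot[tier - 1] = 1.0

    xp_counts = np.zeros(N_XP)
    for xp in member_xp_levels:
        xp_counts[xp - XP_MIN] += 1.0

    # 20 x 51 interaction block: tier indicator times XP-level counts.
    interactions = np.outer(tier_onehot, xp_counts).ravel()

    # The bias term is handled by the model's intercept.
    return np.concatenate([tier_onehot, xp_counts, interactions])

# One training example: a tier-7 operation with three members; label 1 = success.
x, y = encode(7, [25, 31, 62]), 1
print(x.shape)   # (20 + 51 + 20 * 51,) = (1091,)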
Output:
The logistic regressor predicts the win rate of a given task force in each operation: $\hat{p}_i = \sigma(w^\top x_i + b) = 1 / (1 + e^{-(w^\top x_i + b)})$
Algorithm:
Logistic regression, loss function:
$L(w, b) = -\sum_i \left[ y_i \log \hat{p}_i + (1 - y_i) \log(1 - \hat{p}_i) \right] + \lambda \lVert w \rVert^2$, with $\hat{p}_i = \sigma(w^\top x_i + b)$
- Choose the weights that minimize the loss
- Set the regularizer parameter $\lambda$ with held-out validation (in scikit-learn, C is the inverse regularization strength)
- Plug and play:
from sklearn.linear_model import LogisticRegression

# C is the inverse regularization strength; X holds the encoded features,
# y the 0/1 operation outcomes
model = LogisticRegression(C=1e3)
model.fit(X, y)
Done!?
Show it to the team, who are not happy
Instead, additional requirements emerge
The problem, take two
- The difficulty goes up with the operation tier (win rate: tier 1 > … > tier 20)
- The difficulty goes down with XP levels (win rate: XP 1 < … < XP 62)
- The difficulty goes down with more players (win rate: 1 player < … < 50 players)
The solution, take two
Transform the solution into a constrained optimization problem.
Inputs
Design the feature encoding so that the monotonicity requirements become nonnegativity constraints on the weight vector:
Note that win rates should go up with decreasing tier and increasing XP.
For example:
- The weight of the tier 1 feature represents how much easier tier 1 is compared to tier 2
- The weight of the XP 62 feature represents how much better an XP 62 player is compared to an XP 61 player
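The slides do not spell out the encoding itself, but the weight interpretations above point at a cumulative ("thermometer") encoding; the sketch below is one such construction and its details are an assumption, not the production code.

import numpy as np

N_TIERS = 20
XP_MIN, XP_MAX = 12, 62

def encode_monotone(tier, member_xp_levels):
    """Cumulative encoding that turns monotonicity into w >= 0.

    Tier feature k (k = 1..19) is 1 if the operation tier is <= k, so its
    weight is the score advantage (and hence win-rate gain) of tier k over
    tier k + 1. XP feature l (l = 12..62) counts members with XP >= l, so
    its weight is the gain of XP level l over l - 1; the l = 12 entry is
    just the member count, so its weight captures the value of an extra member.
    """
    tier_feats = np.array([1.0 if tier <= k else 0.0
                           for k in range(1, N_TIERS)])              # 19 features
    xp_feats = np.array([float(sum(xp >= l for xp in member_xp_levels))
                         for l in range(XP_MIN, XP_MAX + 1)])        # 51 features
    return np.concatenate([tier_feats, xp_feats])

x = encode_monotone(7, [25, 31, 62])
print(x.shape)   # (19 + 51,) = (70,)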
Outputs
Same training labels, same predictions (win rate)
Algorithm
Same logistic regression, but with constraints: the weights of the monotonicity features must satisfy $w_j \ge 0$
- There is no such thing as constrained logistic regression in scikit-learn
The scipy.optimize module has high-quality constrained optimization routines, and we need the gradient of the loss function for those: $\nabla_w L = \sum_i (\hat{p}_i - y_i)\, x_i + 2\lambda w$
Then, run L-BFGS-B with the above constraints and the logistic loss and gradient functions.
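A sketch of the constrained fit: the logistic loss and gradient above, with the nonnegativity constraints enforced as L-BFGS-B box bounds via scipy.optimize. The synthetic data and the regularization strength lam are placeholders for illustration.

import numpy as np
from scipy.optimize import minimize

def fit_constrained_logreg(X, y, lam=1e-3):
    """L2-regularized logistic regression with w >= 0 via L-BFGS-B bounds.

    X: (n, d) feature matrix (e.g. the cumulative encoding above),
    y: (n,) array of 0/1 operation outcomes. The bias is left unconstrained;
    lam is a placeholder regularization strength.
    """
    n, d = X.shape

    def loss_and_grad(params):
        w, b = params[:d], params[d]
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))        # predicted win rates
        eps = 1e-12                                   # avoid log(0)
        loss = -np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        loss += lam * np.dot(w, w)
        grad_w = X.T @ (p - y) + 2 * lam * w          # gradient w.r.t. weights
        grad_b = np.sum(p - y)                        # gradient w.r.t. bias
        return loss, np.append(grad_w, grad_b)

    bounds = [(0.0, None)] * d + [(None, None)]       # w >= 0, bias free
    res = minimize(loss_and_grad, np.zeros(d + 1), jac=True,
                   method="L-BFGS-B", bounds=bounds)
    return res.x[:d], res.x[d]

# Tiny synthetic example: outcomes that improve with the feature sum.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 10)).astype(float)
y = (X.sum(axis=1) + rng.normal(0, 1, 200) > 5).astype(float)
w, b = fit_constrained_logreg(X, y)
print(w.round(2), round(b, 2))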
Done!!
Practicalities
- Training examples extracted from game events using Pig on EMR
- Model fitting done on laptop
- Weight vectors deployed as a hardcoded Java class, compiled into the game server
- A simple web page/javascript implementation is good for development, testing, and also selling the result
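Purely illustrative: one way the fitted weights could be rendered into the hardcoded Java class mentioned above; the class and field names are invented for this sketch, not the actual game-server code.

def weights_to_java(w, b, class_name="OperationDifficultyModel"):
    """Render fitted weights as Java source with hardcoded constants.

    class_name and the field names are hypothetical; the real deployment
    only needs the numbers compiled into the game server.
    """
    values = ", ".join(f"{v:.6f}" for v in w)
    return (
        f"public final class {class_name} {{\n"
        f"    public static final double BIAS = {b:.6f};\n"
        f"    public static final double[] WEIGHTS = {{ {values} }};\n"
        f"}}\n"
    )

print(weights_to_java([0.12, 0.034, 0.0], -1.5))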
Bonus question
- What to do when the team wants to introduce two new operation tiers?