Case Balance

Case study

Reverse engineering balance of a live game

Introduction

I joined the War Robots team in 2017 as a balance game designer. At the time there were no analytical balance tools at all. New content was made "by analogy" with existing content that already worked. All live balance changes were done "by ear": okay, this gun is overperforming, let's reduce its damage by 5%; if that's not enough, we'll take off another 5% next time.

So one of my first big tasks was to design a balance system for the game. What I had was a live game and a set of analytical data from a live server. What I delivered in the end was a set of automated balance calculators for the team, a whole new approach to balance strategy on the project, and pipeline improvements.

The body of this article is nerdy math. You can skip to the bottom to read the conclusion and how it affected our work pipelines and project performance.

Guns balance

The main metric for gun balance was 'damage done over lifetime'. As War Robots combat is sustained, with limited to no ways to heal damage back, damage done represents efficiency well enough. In shooters with health regeneration you would need to take lethality into account as well, or instead.

So the question is: what are the components of that damage over a match? First, it's (DPS x Lifetime). Lifetime is a constant here, meaning the average lifetime across all robots. So we only need to deconstruct DPS. There are several ways to calculate DPS. One is damage per bullet multiplied by rate of fire. But then we need to take weapon reload duration into account, and then damage lost to misses. So, the terminology here:

DPS = damage_per_bullet * rate_of_fire = damage_per_bullet * bullets / unload_time - how much damage the gun deals with one full clip over the time it takes to empty the clip.
DPS' = damage_per_clip / (unload_time + reload_time) - in War Robots balance one clip is never enough to kill a target, so taking both unload and reload time is more accurate within this balance model. DPS' is calculated over the full shoot-reload cycle.
TrueDPS = DPS' * accuracy% - due to bullet spread there are damage losses. Lasers and snipers have no spread, so their accuracy is 100% and TrueDPS = DPS'.

This is the first part of our formula.
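The three definitions above can be sketched in a few lines of Python. This is only an illustration of the formulas as stated; the function names and the sample numbers are assumptions, not values from the game.

```python
def burst_dps(damage_per_bullet, bullets_per_clip, unload_time):
    """DPS: damage of one full clip over the time it takes to empty it."""
    return damage_per_bullet * bullets_per_clip / unload_time


def cycle_dps(damage_per_bullet, bullets_per_clip, unload_time, reload_time):
    """DPS': damage of one clip over the full shoot-reload cycle."""
    damage_per_clip = damage_per_bullet * bullets_per_clip
    return damage_per_clip / (unload_time + reload_time)


def true_dps(dps_prime, accuracy):
    """TrueDPS: DPS' scaled by accuracy, given as a fraction in [0, 1]."""
    return dps_prime * accuracy


# Hypothetical gun: 100 damage per bullet, 20 bullets,
# 4 s to empty the clip, 6 s to reload, 50% accuracy.
print(burst_dps(100, 20, 4))                   # 500.0
print(cycle_dps(100, 20, 4, 6))                # 200.0
print(true_dps(cycle_dps(100, 20, 4, 6), 0.5)) # 100.0
```

Note how much the reload time matters in this made-up example: the same gun looks like 500 DPS while firing but only 200 over the full cycle.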

This is a good estimate, but it didn't match analytical data from the live server. First, we need to take accuracy into account, and do it on average, as we don't track the hit rate of guns. This can be solved geometrically. Bullet spread is a cone whose cross-section grows with shooting distance. The probability to hit is the ratio between the size of the target and the size of the spread cone at that distance. If the target covers 2/3 of the cone, there's about a 67% chance to hit it and therefore about 67% accuracy.

The spread cone angle is constant and can be calculated analytically. But the apparent size of the target varies with the distance to it, so accuracy depends on that distance. What is the average combat distance? It's fair to assume that combat distance follows a normal distribution: the attacker tries to get closer to the target to increase accuracy, the defender tries to get away, and there are a number of other factors. So combat distance would be normally distributed between a minimum of 0 and a maximum set in the weapon parameters (in WR shooting distance is limited by the weapon).

To get one number for weapon accuracy we need to calculate the expected value for that weapon: accuracy at each distance multiplied by the probability of being at that distance, summed over all distances. For example
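A minimal sketch of that expected-value calculation follows. Several things here are assumptions made for illustration and are not from the original model: the target is treated as a circle, hit chance is taken as the ratio of target area to spread-circle area (capped at 100%), and the distance distribution is a normal curve centered at half the weapon range with a standard deviation of a quarter of the range, cut off at 0 and at the maximum range.

```python
import math


def hit_probability(distance, target_radius, spread_half_angle_deg):
    """Chance to hit a circular target at a given distance (assumed geometry)."""
    spread_radius = distance * math.tan(math.radians(spread_half_angle_deg))
    if spread_radius <= target_radius:
        return 1.0  # the whole spread circle fits inside the target
    return (target_radius / spread_radius) ** 2  # ratio of the two areas


def expected_accuracy(max_range, target_radius, spread_half_angle_deg,
                      mean=None, sigma=None, steps=1000):
    """Average hit chance over a normal distance distribution cut to [0, max_range]."""
    mean = max_range / 2 if mean is None else mean      # assumed center
    sigma = max_range / 4 if sigma is None else sigma   # assumed width
    total_weight = 0.0
    weighted_hits = 0.0
    for i in range(steps):
        distance = (i + 0.5) * max_range / steps        # midpoint of each slice
        weight = math.exp(-0.5 * ((distance - mean) / sigma) ** 2)
        total_weight += weight
        weighted_hits += weight * hit_probability(
            distance, target_radius, spread_half_angle_deg)
    # Dividing by the total weight renormalizes the truncated distribution.
    return weighted_hits / total_weight


# Hypothetical weapon: 500 m range, 3 m target radius, 2 degree spread half-angle.
print(round(expected_accuracy(500.0, 3.0, 2.0), 3))
```

With real data the mean and spread of combat distance would be fitted rather than guessed, but the structure of the calculation stays the same.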

To be fair, this was completely unnecessary. Estimates like 25%, 50%, 75%, and 100% accuracy worked just fine. But I needed exact values to verify that first.

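By now the weapon DPS formula looks like this:

TrueDPS = damage_per_clip / (unload_time + reload_time) * accuracy%, where accuracy% is the expected accuracy over combat distances.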

This is a good enough approximation, and for some weapons it worked, but not for all. There were more unknown components. The first is firing distance. In War Robots weapons have limited range, beyond which the projectile just disappears. It's fair to assume that the longer the range is, the more opportunities the weapon has to deal damage. So it should influence damage done somehow.

Long story short, distance affects DPS linearly. In War Robots possible firing distances range from 1 meter to 1000 meters, and
f(Distance) = Distance / 1000.

Weapons with a 1000m range are able to shoot virtually always, so their damage over lifetime doesn't drop. But weapons with a 500m range are able to shoot in only half of the situations, and weapons with a 300m range in 30% of situations.
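As a small sketch, the range factor can be written as a function. The name range_factor and the clamping to the 0-1000 m interval are assumptions added for safety; the linear relation itself is the one given above.

```python
MAX_RANGE_M = 1000.0  # longest firing distance in the game, per the text


def range_factor(weapon_range_m):
    """Share of combat situations in which the weapon can shoot (linear in range)."""
    clamped = max(0.0, min(weapon_range_m, MAX_RANGE_M))
    return clamped / MAX_RANGE_M


for weapon_range in (1000, 500, 300):
    print(weapon_range, range_factor(weapon_range))
```

Multiplying TrueDPS by this factor gives the range-adjusted value used in the rest of the model.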

The approximation became better but was still not good enough. It now worked for more weapons with different firing ranges. For some weapons the approximation was correct, but for others it was off by exactly a factor of 2. The final component turned out to be the possibility of the target firing back.

The thing about a gunfight is that usually when you deal damage, you receive damage as well. On average you receive the same damage you deal. Therefore your lifetime is half as long compared to using weapons that do not put you in danger.

So there are classes of indirect, or safe, weapons. To account for that, their DPS should be 2 times lower compared to riskier guns. What are indirect weapons? Those are homing weapons that allow shooting from behind cover, snipers that outrange all other guns (with range >800), and hit-n-run weapons.

The latter is the most interesting. In my math model it's expressed as the fraction unload_time / reload_time.
If this number is lower than a threshold, I consider the weapon a burst weapon and lower its DPS. Otherwise, it's a sustain weapon. A burst weapon means that its user is able to dish out damage quickly and then hide behind cover to reload, giving the opponent no time to deal damage back. I called it burst_factor. And this was the final component of the guns math model.
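The classification of safe weapons can be sketched as below. The threshold value of 0.5 is purely hypothetical, since the actual threshold is not given here; the 800 m sniper cutoff and the factor of 2 come from the text, and the function names are made up for the example.

```python
BURST_THRESHOLD = 0.5       # hypothetical: the real threshold is not stated
SNIPER_RANGE_M = 800.0      # snipers outrange other guns above this range
SAFE_DPS_MULTIPLIER = 0.5   # safe weapons get half the DPS of risky ones


def burst_factor(unload_time, reload_time):
    """unload_time / reload_time; low values mean a hit-n-run (burst) weapon."""
    if reload_time <= 0:
        return float("inf")  # no reload at all: treat as a pure sustain weapon
    return unload_time / reload_time


def is_safe_weapon(range_m, unload_time, reload_time, homing=False,
                   burst_threshold=BURST_THRESHOLD):
    """True for homing weapons, snipers, and burst weapons."""
    return (homing
            or range_m > SNIPER_RANGE_M
            or burst_factor(unload_time, reload_time) < burst_threshold)


def dps_budget_multiplier(range_m, unload_time, reload_time, homing=False):
    """DPS multiplier for the weapon class: 0.5 for safe weapons, 1.0 otherwise."""
    if is_safe_weapon(range_m, unload_time, reload_time, homing):
        return SAFE_DPS_MULTIPLIER
    return 1.0


print(dps_budget_multiplier(300, 2, 10))   # burst weapon -> 0.5
print(dps_budget_multiplier(500, 10, 5))   # sustain weapon -> 1.0
```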

Returning to our initial balance metric - damage done over lifetime. It's equal to TrueDPS * 60 (with the range and safe-weapon adjustments above applied), as the average lifetime of our robots was 60 seconds. And this approximation worked perfectly. My calculations matched retrospective data for all but 1-2 special guns, and the model was able to predict the performance of newly designed guns based on their balance parameters alone.
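A rough sketch of how such a check could look in code is below. It assumes the DPS passed in already includes all adjustments, and both the target damage budget and the 5% tolerance are hypothetical numbers chosen for illustration.

```python
AVERAGE_LIFETIME_S = 60.0  # average robot lifetime, per the text


def damage_over_lifetime(adjusted_dps, lifetime_s=AVERAGE_LIFETIME_S):
    """Predicted damage a gun deals over one average robot lifetime."""
    return adjusted_dps * lifetime_s


def balance_ratio(adjusted_dps, target_damage, lifetime_s=AVERAGE_LIFETIME_S):
    """Predicted damage relative to the intended budget (1.0 = on target)."""
    return damage_over_lifetime(adjusted_dps, lifetime_s) / target_damage


def is_balanced(adjusted_dps, target_damage, tolerance=0.05):
    """True if the gun lands within the (assumed) tolerance of its budget."""
    return abs(balance_ratio(adjusted_dps, target_damage) - 1.0) <= tolerance


# Hypothetical budget of 6000 damage per lifetime.
print(damage_over_lifetime(100))   # 6000.0
print(is_balanced(100, 6000))      # True
print(is_balanced(120, 6000))      # False, about 20% over budget
```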

Having this solid math, it was easy to make an Excel balance calculator where designers could enter their intended gun parameters and the calc would tell them if the gun was balanced. That allowed any designer, even with zero skill in math or balance, to create balanced content based on their design intention (defined range, rate of fire, and other gun parameters).

Mechs balance

The second big part of this story is mech balance. To some degree, it's a more complicated topic. A mech is a complex entity. Guns had only one thing - damage. But mechs have a few orthogonal parameters - HP, speed, firepower, and ability. Abilities can be drastically different in their effects. But in fact, all of this is perfectly solvable with math and a common understanding of balance.

So, our mechs have 4 components - HP, firepower, speed, and ability. In fact, we can reduce that to 3. Any ability can be converted to a basic stat. For example, 60% haste with a duration of 5 seconds and a cooldown of 15 seconds can be converted to speed as haste% * duration / (duration + cooldown).
So 60% haste as above equals a 15% permanent move speed boost (60% * 5 / 20). This assumes that the player uses the ability optimally, which is generally true.
One way or another, all possible ability effects can be converted to HP, damage, or speed. Being untargetable is mathematically equal to having some additional HP. The ability to fly means that your linear speed is higher, as you can cut corners and reach parts of the map faster, and so on.
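The haste conversion can be sketched as a tiny helper. The function name is an assumption, and the formula treats the ability as used on cooldown, as the text does.

```python
def permanent_equivalent(magnitude, duration_s, cooldown_s):
    """Convert a temporary bonus into an always-on bonus of equal average value.

    Assumes the ability is triggered as soon as it comes off cooldown, so it
    is active for duration / (duration + cooldown) of the time.
    """
    uptime = duration_s / (duration_s + cooldown_s)
    return magnitude * uptime


# 60% haste for 5 s with a 15 s cooldown: uptime is 5 / 20 = 25%,
# so the result is about 0.15, i.e. a 15% permanent speed boost.
print(permanent_equivalent(0.60, 5, 15))
```

The same helper applies to any effect with a magnitude, duration, and cooldown, once that effect has been expressed as HP, damage, or speed.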

I won't describe this topic fully as it's a common thing about transitive mechanics. There's a solid article about this in the Ian Schreiber lectures.

So, now we have HP, firepower, and speed, all adjusted by the mech's ability, and we need to establish some relations between them. The first step is to select a common metric for all mechs, and that metric is win rate. Generally, we want all mechs to have a 50% win rate, with some leeway. So I considered mechs that already had a 50% win rate balanced. Those became the anchor mechs against which to compare the balance state of other mechs.

Fortunately, at the time we had around 30-40 different mechs in the game, and some of them were quite similar to each other in their parameters. For example, there are two mechs and the difference between them is that one has 20k more HP while the other has 5 kmph more speed, but conveniently they had a similar win rate. This way I was able to establish that 5 kmph of speed was equal to 20k HP. Comparing different mechs to each other this way, I was able to establish all the relations between mech parameters. This comparison analysis technique is described in the same Schreiber lecture I linked above.

In addition, I found the "cost" of those parameters the same way. There were pairs of mechs where one was just strictly better than the other, like all stats identical except one has 20k HP more. So the resulting relationships were
1 light weapon slot = 20k HP = 5 kmph = 1% win rate.

For each parameter there were average anchor values. For example, the average mech has 100k HP, 40 kmph, and 4 points of firepower. A build like this resulted in a 50% win rate. Increasing its HP by 20k, or giving it an ability equivalent to 20k HP, increased the win rate to 51%. In this model a mech works as the sum of the 3 parameters.
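The additive model can be sketched as follows. It assumes the anchor build sits at a 50% win rate (consistent with +20k HP giving 51%), that one point of firepower corresponds to one light weapon slot, and that ability effects have already been folded into the three stats; the names are illustrative.

```python
# Anchor (average) mech and its win rate, in percent.
ANCHOR_HP = 100_000
ANCHOR_SPEED_KMPH = 40
ANCHOR_FIREPOWER = 4
ANCHOR_WIN_RATE = 50.0

# Exchange rates: each of these is worth 1% of win rate.
HP_PER_POINT = 20_000
SPEED_KMPH_PER_POINT = 5
FIREPOWER_PER_POINT = 1  # assumed: 1 firepower point = 1 light weapon slot


def predicted_win_rate(hp, speed_kmph, firepower):
    """Predicted win rate (%) as the anchor value plus the sum of stat deltas."""
    points = ((hp - ANCHOR_HP) / HP_PER_POINT
              + (speed_kmph - ANCHOR_SPEED_KMPH) / SPEED_KMPH_PER_POINT
              + (firepower - ANCHOR_FIREPOWER) / FIREPOWER_PER_POINT)
    return ANCHOR_WIN_RATE + points


print(predicted_win_rate(100_000, 40, 4))  # 50.0, the anchor build
print(predicted_win_rate(120_000, 40, 4))  # 51.0, +20k HP
print(predicted_win_rate(80_000, 45, 4))   # 50.0, HP traded for speed
```

The last line shows the trade-off idea: giving up 20k HP for 5 kmph keeps the mech at the same predicted win rate.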

And yet again, having solid math that was proven by retrospective live data allowed me to create an Excel balance calc for mechs. Any GD could enter HP, speed, firepower, ability type, ability value, and its duration/CD, and the calc would return the result, showing whether the mech is in balance and how close it is to the intended power budget, like 90% of power or 132% of power.

Conclusion

A lot of nerdy stuff. But all in all, what was the value of it?

1. Firstly, the automated balance calculators I made. Having those meant that any designer could balance their own content, eliminating the dedicated balance designer from the content production pipeline. Conveniently, it freed my hands and allowed me to dedicate time to more interesting and difficult projects, effectively advancing my career.
2. Those balance calculators had a lot of predictive power. Not ideal, but very good. Before them, balance was done via test servers: we hosted closed beta tests (CBTs) and gathered telemetry data. But some content was especially tricky and required 4-5 weekly iterations in those CBTs. My calcs reduced the number of iterations to 1-2, allowing content to spend less time in the production pipeline and making the pipeline more predictable, as math models are much more stable than guesstimating.
3. Obviously, better balance means happier users and better product metrics.
4. Having a clean balance allowed us to develop a content tier system, where different tiers of content had different power budgets and different costs. The thing about balance here is that all prices were perceived as fair by players, as no content was off balance. When it costs $100, it punches for $100. That leads to better sales and gacha engagement.
5. Having content tiers allowed us to introduce color gradation of content in the UI, explaining point 4 to users and leading to better engagement with the shop.