Results

Humanness results

Humanness % is calculated by dividing the number of times a player (bot or human) was judged to be human by the total number of times it was judged.

For this calculation, only judgements made by humans are counted.
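The calculation described above can be sketched as follows. This is only an illustration: the record layout (judge_is_human, player, judged_human), the function name humanness_percent, and the sample records are assumptions made for the sketch, not the competition's actual data format or code.

```python
from collections import defaultdict


def humanness_percent(judgements):
    """Return {player: humanness %} from judgement records.

    Each record is (judge_is_human, player, judged_human).
    This layout is an assumption made for the sketch.
    """
    judged_human_count = defaultdict(int)
    judged_total = defaultdict(int)
    for judge_is_human, player, judged_human in judgements:
        if not judge_is_human:
            continue  # judgements made by bots are not counted
        judged_total[player] += 1
        if judged_human:
            judged_human_count[player] += 1
    return {
        player: 100.0 * judged_human_count[player] / judged_total[player]
        for player in judged_total
    }


# Made-up records for illustration, not competition data.
example = [
    (True, "bot_a", True),
    (True, "bot_a", False),
    (True, "bot_a", True),
    (False, "bot_a", False),  # made by a bot judge, so ignored
    (True, "human_b", True),
    (True, "human_b", False),
]
ratings = humanness_percent(example)
```

A player who was judged only by bots does not appear in the result at all, which avoids dividing by zero.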

Even though we call this “humanness”, it really only measures how humanlike a player seems to the judges.

This explains why bots can receive a higher rating than the real humans: those bots are good at tricking the judges.

Each player was judged around 25 times.

This makes it unlikely that the high humanness ratings are a fluke.

For example, if a bot's true chance of being judged human were only 35%, there would be only about a 5% chance of it receiving a rating above 50%.

If its true chance were only 30%, this drops to about 2%.
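These probabilities can be checked with a binomial tail calculation. The sketch below assumes exactly 25 judgements per player, each independent and with the same chance of coming out "human". The text only says "around 25", so the results are approximate, but they land in the same range as the figures quoted above. The function name chance_of_majority_human is made up for this example.

```python
from math import comb


def chance_of_majority_human(true_rate, n_judgements=25):
    """Probability that more than half of the judgements come out 'human'.

    Assumes independent judgements that each come out 'human' with
    probability true_rate (a simplification made for this sketch).
    """
    first_majority = n_judgements // 2 + 1  # smallest count strictly above 50%
    return sum(
        comb(n_judgements, k)
        * true_rate**k
        * (1 - true_rate) ** (n_judgements - k)
        for k in range(first_majority, n_judgements + 1)
    )


for rate in (0.35, 0.30):
    print(f"true rate {rate:.0%}: {chance_of_majority_human(rate):.1%}")
```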

As well as humans and competition bots, a few epic bots (the bots built into the game) briefly took part.

These bots don’t play by the same rules as the competition bots, but their ratings are reported here for interest.

Most human bots

bot name        humanness %
MirrorBot       52.2 %
UT^2            51.9 %
ICE-CIG2012     36.0 %
NeuroBot        26.1 %
GladiatorBot    21.7 %
AmisBot         16.0 %
average         34.2 %

Most human humans

player name          humanness %
Samaneh Rastegari    53.3 %
Craig Speelman       52.2 %
John Weise           30.8 %
Chris Holme          26.3 %
average              41.4 %

Most human epic bots

                humanness %
average         37.8 %

Judging results

Judging accuracy is calculated as the number of correct judgements made by a player, divided by the total number of judgements made by that player. Note that the bots also judged (but their judgements were not used in the calculation of humanness ratings).
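As a rough sketch of the accuracy calculation: a judgement counts as correct when the verdict matches what the judged player really was. The record layout (judge, judged_human, player_is_human), the function name, and the sample records are assumptions made for this illustration.

```python
from collections import defaultdict


def judging_accuracy_percent(judgements):
    """Return {judge: accuracy %} from judgement records.

    Each record is (judge, judged_human, player_is_human).
    This layout is an assumption made for the sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for judge, judged_human, player_is_human in judgements:
        total[judge] += 1
        if judged_human == player_is_human:
            correct[judge] += 1  # the verdict matched the player's true nature
    return {judge: 100.0 * correct[judge] / total[judge] for judge in total}


# Made-up records for illustration, not competition data.
example_judgements = [
    ("judge_x", True, True),    # human correctly judged human
    ("judge_x", True, False),   # bot mistaken for a human
    ("judge_x", False, False),  # bot correctly judged a bot
    ("judge_y", False, True),   # human mistaken for a bot
]
accuracy = judging_accuracy_percent(example_judgements)
```

Unlike the humanness rating, this figure is computed for every judge, bots included.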

Best bot judges

bot name        accuracy %
MirrorBot       60.0 %
NeuroBot        56.4 %
UT^2            54.6 %
GladiatorBot    50.5 %
ICE-CIG2012     45.8 %
AmisBot         44.4 %

Best human judges

human name           accuracy %
Chris Holme          60.9 %
John Weise           60.8 %
Craig Speelman       50.0 %
Samaneh Rastegari    47.8 %