Players and algorithms

Many of you may wonder how we can be sure that playing Genigma will produce scientific results. Over the last few days we have been working on exactly this question, using data from games played by 12 volunteers between 20 and 22 November. All game results were collected on the server for later analysis.

The volunteers' task was to rearrange the game blocks (each corresponding to a piece of DNA) to get the maximum score: the higher the score, the closer the solution is to the correct one. In this test, players provided 176 possible solutions across games of different levels: easy (8 blocks), medium (16 blocks) or difficult (35 blocks).

For the test, we used 3D genomic data extracted from a human non-tumour cell line called GM12878. Since we already know the sequence of this line, we were able to verify whether the solution provided by the players was the correct one or not. In particular, we used a fragment of chromosome 3, which ranges from the nucleotide 125,000,000 to 130,000,000, or as typically denoted in genomics, the fragment chr3: 125Mb-130Mb.

How was the check done?

Alessandra and Marco, from the scientific team, compared the players' solutions with those of two different bioinformatics algorithms. The first, which we will call the fast algorithm, provides an approximate solution to the game in almost no time. The second, which we will call the complete algorithm, analyses all the possible solutions and returns the exact one. Unfortunately, the complete algorithm needs extremely long calculation times for complex games: it can check a single solution in 0.4 s, but would need on the order of 10^34 days of uninterrupted calculation to solve a single 35-piece game! We therefore ruled it out for the harder levels in our test and used it only for the easy level (8 pieces).
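
To get a feel for why the complete algorithm becomes impractical so quickly, here is a rough back-of-the-envelope sketch in Python. It assumes that a "solution" is simply one ordering of the blocks, so that a game with n blocks has n! candidates, and that each candidate takes the 0.4 s quoted above to check. The function name and constants are ours, for illustration only; the real algorithm may count solutions differently.

```python
import math

SECONDS_PER_CHECK = 0.4   # from the text: time to check a single solution
SECONDS_PER_DAY = 86_400


def exhaustive_days(n_blocks: int, seconds_per_check: float = SECONDS_PER_CHECK) -> float:
    """Days needed to check every ordering of n_blocks, one after another.

    Assumption (not confirmed by the scientific team): the candidate
    solutions are exactly the n! permutations of the blocks.
    """
    orderings = math.factorial(n_blocks)
    return orderings * seconds_per_check / SECONDS_PER_DAY


if __name__ == "__main__":
    for level, n in [("easy", 8), ("medium", 16), ("difficult", 35)]:
        print(f"{level:9s} {n:2d} blocks: {exhaustive_days(n):.2e} days")
```

Under these assumptions an 8-block game takes well under a day to check exhaustively, a 16-block game already takes tens of millions of days, and a 35-block game lands on the order of 10^34 days, in line with the figure quoted above.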

From the analysis of these data, we have drawn two very interesting conclusions:

  1. The highest scores (solutions closer to the correct one) were reached when many people played the same game, that is, when they analysed the same DNA fragment. The competition between players, driven by knowing that someone has already obtained a higher score in the same game, makes it easier to reach the exact solution. In the test, these multiplayer games produced better solutions than those provided by the fast algorithm, regardless of the difficulty level.
  2. In the easy levels (8 blocks), the players were able to find the solution with the maximum possible score, as confirmed by the complete algorithm. When they did not, the solution they provided was still very close.

These results make us think that we are going in the right direction! Where do we go from here?

We are currently working on the automated bioinformatics part, so that data are normalised correctly and new games are ready for the next test. At the same time, we are improving the user experience to make Genigma ever more attractive to players.

In this test, we have refined the mechanism of the game and verified that it is possible to reach correct solutions. In the following phases, we will be able to provide players with DNA fragments from tumour cells (whose sequence we do not know in full detail) and be confident that the best solutions will be useful for science. The ultimate goal is to understand how the genome of some tumours is structured and to think about possible medical applications based on this new information.