# Introduction

TODO: Introduce game rules and real word applications (diagnostics). Related work.

## Notations

Let `N`

be the number of pegs in the code and `C`

be the number of possible peg colors. We will call a game with these particular parameters a `(N, C)`

-game. A *code* is a vector in `\mathbb{C} = \{0, \ldots, C - 1\}^N`

. The number of possible codes is `M = C^N`

. The game is played as follows:

- The codemaker chooses a secret code
`c \in \mathbb{C}`

. - The codebreaker tries to guess
`c`

in as few iterations as possible. At each iteration`t`

:

- The codebreaker produces a guess
`g_t \in \mathbb{C}`

. - The codemaker provides a feedback
`f(c, g_t) = \langle b(c,g_t), w(c, g_t) \rangle`

where`b`

is the number of pegs which are correct in both color and position and`w`

is the number of pegs which are correct in color but not in position.

- The game ends when the codebreaker guesses the secret code (gets feedback
`\langle N, 0 \rangle`

) or when a predefined maximum number of iterations`T`

is reached.

The notion of "correct in color but not in position" is a bit tricky because color repetitions are allowed. To avoid ambiguity, here is a possible feedback implementation:

```
int[] feedback(int[] code, int[] guess) {
int[] vCode = new int[C]; // count vector of colors in code
int[] vGuess = new int[C]; // count vector of colors in guess
int black = 0;
for (int i = 0; i < N; i++) {
if (code[i] == guess[i]) {
black++;
} else {
vCode[code[i]]++;
vGuess[guess[i]]++;
}
}
int white = 0;
for (int c = 0; c < C; c++) {
white += min(vCode[c], vGuess[c]);
}
int[] f = {black, white};
return f;
}
```

### Technical note

For performance reasons, we represent codes as integers between `0`

and `M - 1`

. Writing the code in base `C`

we obtain the vector of peg colors. For example, in the classical `(4, 6)`

-game, the number `51 = 0 \times 6^3 + 1 \times 6^2 + 2 \times 6^1 + 3 \times 6^0`

represents the code `0123`

.

The feedbacks are also encoded as integers between `0`

and `F - 1`

, where `F = \frac{(N + 1)(N + 2)}{2}`

. To reduce the size of arrays indexed by the possible feedbacks, we use the fact that `b + w \le N`

. Here is an example of encoding for `N = 3`

.

b\w | 0 | 1 | 2 | 3 |
---|---|---|---|---|

0 | 6 | 7 | 8 | 9 |

1 | 3 | 4 | 5 | - |

2 | 1 | 2 | - | - |

3 | 0 | - | - | - |

Using this encoding, feedback `0`

always means that the codebreaker guessed the code. According to the situation, some of the feedbacks above are impossible. For example, the codebreaker can never get a feedback `\langle N - 1, 1 \rangle`

. This encoding tries to find a good trade-off between implementation simplicity and memory economy.

The class `CodeToolkit`

makes these encodings more or less transparent. The main method is `int feedback(int code, int guess)`

. For performance reasons, feedbacks are precomputed and stored in a `M \times M`

symmetrical `feedbacks`

matrix. The class also contains some methods for encoding, decoding and displaying codes and feedbacks. The sketch `test_toolkit`

shows how to use this class.

# Exhaustive strategies

TODO. Partitions and guesses that optimize different score functions :

- Minimize the maximum partition size (Knuth)
- Minimize the expected partition size (Irving)
- Maximize the number of Partitions (Kooi)
- Maximize information gained (Bestavros & Belal)
- Use only consistent guesses or the whole set
`\mathbb{C}`

Tables with average and maximal number of guesses for different `(N, C)`

. To be used as reference for comparison with our results.

# Evolution

TODO: related work. Some evolutionary strategies in the literature. They evolve a set of guesses to be used by the breaker.

Our goal is to evolve a *breaker*. Deliberately we will use a very few knowledge about the game. Our breakers know very few things:

- The set of possible codes (and guesses)
`\mathbb{C}`

- A feedback
`\langle N, 0 \rangle`

means that they guessed the code - The fact that the feedback is deterministic. In particular, this means that it is useless to make the same guess more than once during a game, because this will not bring any new information.

Note that the breakers do not know the meaning of the feedbacks (except of the winning one) and hence they can play only a trial and error strategies.

## Breaker representation

We will represent a breaker as a rooted tree. Each node of the tree is a guess. The subtrees are labeled with the possible feedbacks of this guess. Here is an example of such a tree for the (3, 3)-game:

# References

TODO: add papers to the repo