Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.
No functional change
We now use per-thread dynamic contempt. This patch has the following
effects:
* for Threads=1: **non-functional**
* for Threads>1:
* with MultiPV=1: **no regression, little to no ELO gain**
* with MultiPV>1: **clear improvement over master**
First, I tried testing at standard MultiPV=1 play with [0,5] bounds.
This yielded 2 yellow and 1 red test:
5+0.05, Threads=5:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 82689 W: 16439 L: 16190 D: 50060
http://tests.stockfishchess.org/tests/view/5aa93a5a0ebc5902952892e6
5+0.05, Threads=8:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 27164 W: 4974 L: 4983 D: 17207
http://tests.stockfishchess.org/tests/view/5ab2639b0ebc5902a6fbefd5
5+0.5, Threads=16:
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 41396 W: 7127 L: 7082 D: 27187
http://tests.stockfishchess.org/tests/view/5ab124220ebc59029516cb62
Then, I tested with Skill Level=17 (implicitly MutliPV=4), showing
a clear improvement:
5+0.05, Threads=5:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 3498 W: 1316 L: 1135 D: 1047
http://tests.stockfishchess.org/tests/view/5ab4b6580ebc5902932aeca2
Next, I tested the patch with MultiPV=1 again, this time checking for
non-regression ([-3, 1]):
5+0.5, Threads=5:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 65575 W: 12786 L: 12745 D: 40044
http://tests.stockfishchess.org/tests/view/5ab4e8500ebc5902932aecb3
Finally, I ran some tests with fixed number of games, checking if
reverting dynamic contempt gains more elo with Skill Level=17 (i.e.
MultiPV) than applying the "prevScore" fix and this patch. These tests
showed, that this patch gains 15 ELO when playing with Skill Level=17:
5+0.05, Threads=3, "revert dynamic contempt" vs. "WITHOUT this patch":
ELO: -11.43 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 7085 L: 7743 D: 5172
http://tests.stockfishchess.org/tests/view/5ab636450ebc590295d88536
5+0.05, Threads=3, "revert dynamic contempt" vs. "WITH this patch":
ELO: -26.42 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 6661 L: 8179 D: 5160
http://tests.stockfishchess.org/tests/view/5ab62e680ebc590295d88524
---
***FAQ***
**Why should this be commited?**
I believe that the gain for multi-thread MultiPV search is a sufficient
justification for this otherwise neutral change. I also believe this
implementation of dynamic contempt is more logical, although this may
be just my opinion.
**Why is per-thread contempt better at MultiPV?**
A likely explanation for the gain in MultiPV mode is that during
search each thread independently switches between rootMoves and via
the shared contempt score skews each other's evaluation.
**Why were the tests done with Skill Level=17?**
This was originally suggested by @Hanamuke and the idea is that with
Skill Level Stockfish sometimes plays also moves it thinks are slightly
sub-optimal and thus the quality of all moves offered by the MultiPV
search is checked by the test.
**Why are the ELO differences so huge?**
This is most likely because of the nature of Skill Level mode --
since it slower and weaker than normal mode, bugs in evaluation have
much greater effect.
---
Closes https://github.com/official-stockfish/Stockfish/pull/1515.
No functional change -- in single thread mode.
To more clearly distinguish them from "const" local variables, this patch
defines compile-time local constants as constexpr. This is consistent with
the definition of PvNode as constexpr in search() and qsearch(). It also
makes the code more robust, since the compiler will now check that those
constants are indeed compile-time constants.
We can go even one step further and define all the evaluation and search
compile-time constants as constexpr.
In generate_castling() I replaced "K" with "step", since K was incorrectly
capitalised (in the Chess960 case).
In timeman.cpp I had to make the non-local constants MaxRatio and StealRatio
constepxr, since otherwise gcc would complain when calculating TMaxRatio and
TStealRatio. (Strangely, I did not have to make Is64Bit constexpr even though
it is used in ucioption.cpp in the calculation of constexpr MaxHashMB.)
I have renamed PieceCount to pieceCount in material.h, since the values of
the array are not compile-time constants.
Some compile-time constants in tbprobe.cpp were overlooked. Sides and MaxFile
are not compile-time constants, so were renamed to sides and maxFile.
Non-functional change.
Make contempt dependent on the current score of the root position.
The idea is that we now use a linear formula like the following to decide
on the contempt to use during a search :
contempt = x + y * eval
where x is the base contempt set by the user in the "Contempt" UCI option,
and y * eval is the dynamic part which adapts itself to the estimation of
the evaluation of the root position returned by the search. In this patch,
we use x = 18 centipawns by default, and the y * eval correction can go
from -20 centipawns if the root eval is less than -2.0 pawns, up to +20
centipawns when the root eval is more than 2.0 pawns.
To summarize, the new contempt goes from -0.02 to 0.38 pawns, depending if
Stockfish is losing or winning, with an average value of 0.18 pawns by default.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 110052 W: 24614 L: 23938 D: 61500
http://tests.stockfishchess.org/tests/view/5a72e6020ebc590f2c86ea20
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 16470 W: 2896 L: 2705 D: 10869
http://tests.stockfishchess.org/tests/view/5a76c5b90ebc5902971a9830
A second match at LTC was organised against the current master:
ELO: 1.45 +-2.9 (95%) LOS: 84.0%
Total: 19369 W: 3350 L: 3269 D: 12750
http://tests.stockfishchess.org/tests/view/5a7acf980ebc5902971a9a2e
Finally, we checked that there is no apparent problem with multithreading,
despite the fact that some threads might have a slightly different contempt
level that the main thread.
Match of this version against master, both using 5 threads, time control 30+0.3:
ELO: 2.18 +-3.2 (95%) LOS: 90.8%
Total: 14840 W: 2502 L: 2409 D: 9929
http://tests.stockfishchess.org/tests/view/5a7bf3e80ebc5902971a9aa2
Include suggestions from Marco Costalba, Aram Tumanian, Ronald de Man, etc.
Bench: 5207156
* A better contempt implementation for Stockfish
The round 2 of TCEC season 10 demonstrated the benefit of having a nice contempt implementation: it gives the strongest programs in the tournament the ability to slow down the game when they feel the position is slightly worse, prefering to stay in a complicated (even if slightly risky) middle game rather than simplifying by force into a drawn endgame.
The current contempt implementation of Stockfish is inadequate, and this patch is an attempt to provide a better one.
Passed STC non-regression test against master:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 83360 W: 15089 L: 15075 D: 53196
http://tests.stockfishchess.org/tests/view/5a1bf2de0ebc590ccbb8b370
This contempt implementation is showing promising results in certains situations. For instance, it obtained a nice +30 Elo gain when playing with contempt=40 against Stockfish 7, compared to current master:
• master against SF 7 (20000 games at LTC): +121.2 Elo
• this patch with contempt=40 (20000 games at LTC): +154.11 Elo
This was the result of real cooperative work from the Stockfish team, with key ideas coming from Stefan Geschwentner (locutus2) and Chris Cain (ceebo) while most of the community helped with feedback and computer time.
In this commit the bench is unchanged by default, but you can test at home with the new contempt in the UCI options. The style of play will change a lot when using contempt different of zero (I repeat: not done in this version by default, however)!
The Stockfish team is still deliberating over the best default contempt value in self-play and the best contempt modeling strategy, to help users choosing a contempt value when playing against much weaker programs. These informations will be given in future commits when available :-)
Bench: 5051254
* Remove the prefetch
No functional change.
This patch puts the evaluation helper functions inside EvalInfo struct, which simplifies a bit their signature and (most importantly, IMHO) makes their C++ code much cleaner and simpler to read (by removing the "ei." qualifiers all around in evaluate.cpp).
Also rename the EvalInfo struct into Evaluation class to get a natural invocation v = Evaluation(p).value() to evaluation position p.
The downside is an increase of 20 lines in evaluate.cpp (for the prototypes of the helper functions). The upsides are better readability and a speed-up of 0.6% (by generating all the helpers for the NO_TRACE case together, which helps the instruction cache).
No functional change
Closes#1135
Rescales the king danger variables in evaluate_king() to
suppress the KingDanger[] array. This avoids the cost of
the memory accesses to the array and simplifies the non-linear
transformation used.
Full credits to "hxim" for the seminal idea and implementation,
see pull request #786.
https://github.com/official-stockfish/Stockfish/pull/786
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 9649 W: 1829 L: 1689 D: 6131
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 53494 W: 7254 L: 7178 D: 39062
Bench: 6116200
Collect and give a second try to some almost passed tuning attempts and
one-line tweaks from the last month.
Passed STC
LLR: 3.07 (-2.94,2.94) [0.00,4.00]
Total: 15124 W: 2974 L: 2756 D: 9394
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21577 W: 3507 L: 3289 D: 14781
Bench: 8855226
Resolves#464
Apart from usual renaiming, take advantage of
C++11 function template default parmeter to
get rid of Eval trampoline functions.
Some triviality fixes while there.
No functional change.
1/ eval margin and gains removed:
16bit are now free on TT entries, due to the removal of eval margin. may be useful
in the future :) gains removed: use instead by Value(128). search() and qsearch()
are now consistent in this regard.
2/ futility_margin()
linear formula instead of complex (log(depth), movecount) formula.
3/ unify pre & post futility pruning
pre futility pruning used depth < 7 plies, while post futility pruning used
depth < 4 plies. Now it's always depth < 7.
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench 7243575
1/ eval margin and gains removed:
- gains removed by Value(128): search() and qsearch() now behave consistently!
2/ futility_margin()
- testing showed that there is no added value in this weird (log(depth), movecount)
formula, and a much simpler linear formula is just as good. In fact, it is most
likely better, as it is not yet optimally tuned.
- the new simplified formula also means we get rid of FutilityMargins[], its
initialization code, and more importantly ss->futilityMoveCount, and the hacky
code that updates it throughout the search().
- the current formula gives negative futility margins, and there is a hidden interaction
between the move coutn pruning formula and the futility margin one: what happens is
that MCP is supposed to be triggered before we use the non-sensical negative futility
margins.
3/ unify pre & post futility pruning
- pre futility pruning (what SF calls value based pruning) used depth < 7 plies,
while post futility pruning (what SF calls static null move pruning) used depth < 4 plies.
- also the condition depth < 7 in pre futility pruning was not obvious, and it seemd
to be depth < 16 (futility_margin() returns an infinite value when depth >= 7).
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench: 10206576
And #ifdef instead of #if defined
This is more standard form (see for example iostream file).
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Allow to use EvalInfo struct, populated by
evaluation(), in search.
In particular we allocate Eval::Info on the stack
and pass a pointer to this to evaluate().
Also add to Search::Stack a pointer to Eval::Info,
this allows to reference eval info of previous/next
nodes.
WARNING: Eval::Info is NOT initialized and is populated
by evaluate(), only if the latter is called, and this
does not happen in all the code paths, so care should be
taken when accessing this struct.
No functional change.
And return on using TT as backing store for position
evaluations.
Tests (even on single thread) show eval cache was a regression.
In multi thread result should be even worst because eval cache
is a per-thread struct, while TT is shared.
After 4957 games at 15"+0.05 (single thread)
eval cache vs master 969 - 1093 - 2895 -9 ELO
So previous reported result of +18 ELO was probably due to an
issue in the testing framework (a bug in cutechess-cli) that
has been fixed in the meanwhile.
bench: 5386711
With this patch series we want to introduce a per-thread
evaluation cache to store node evaluation and do not
rely anymore on the TT table for this.
This patch just introduces the infrastructure.
No functional change.
First disable Contempt Factor during analysis, then
calculate the modified draw score from the point of
view of the player, so from the point of view of
RootPosition color.
Thanks to Ryan Taker for suggesting the fixes.
No functional change.
This is very crude and very basic: simply in case
of a draw for repetition or 50 moves rule return
a negative score instead of zero according to the
contempt factor (in centipawns). If contempt is
positive engine will try to avoid draws (to use
with weaker opponents), if negative engine will
try to draw. If zero (default) there are no changes.
No functional change.
So to be done only once at startup and in the (unlikely)
cases that a relevant UCI parameter is changed, instead
of doing it at the beginning of each search.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This change allows to remove some quite a bit of code
and seems the natural thing to do.
Introduced file thread.cpp to move away from search.cpp a lot
of threads related stuff.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This patch is based on Justin Blanchard's original
work and allows to breakdown evaluation in its sub terms and
show to the user.
Tracing code has zero speed impact when not used.
Note that tracing code is not thread-safe, but this
should not be a problem given the typical usage scenario.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is the "selective search depth in plies" and we set
equal to PV line length.
Tested that works under FritzGUI.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Currently are used by evaluation itself and the
whole EvalInfo will be removed from global visibility
by next patch, so no reason to use them.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It will be more clear when we will go to add stuff
apart from king danger itself.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Get it from the position instead.
A good semplification of function calling and a speedup too.
No functional change also with faked split.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>