In the so-called "hybrid" method of evaluation of current master, we use the
classical eval (because of its speed) instead of the NNUE eval when the classical
material balance approximation hints that the position is "winning enough" to
rely on the classical eval.
This trade-off idea between speed and accuracy works well in general, but in
some fortress positions the classical eval is just bad. So in shuffling branches
of the search tree, we (slowly) increase the threshold so that eventually we
don't trust classical anymore and switch to NNUE evaluation.
This patch increases that threshold faster, so that we switch to NNUE more quickly
in shuffling branches. The idea is to encourage Stockfish to spend less time in fortress
lines in the search tree, and more time searching the critical lines.
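As a rough illustration of the mechanism described above, the sketch below lets the
classical/NNUE switch depend on the 50-move counter, so that the threshold grows while
a branch shuffles. The function name, the base value 750 and the exact shape of the
formula are assumptions made for this example, not a quote of evaluate.cpp.

```cpp
#include <cstdlib>

// Assumed base threshold, for illustration only.
constexpr int BaseThreshold = 750;

// Returns true when the cheap classical eval would be trusted.
// psqBalance      : material balance approximation (sign = side ahead)
// nonPawnMaterial : total non-pawn material on the board
// rule50Count     : plies since the last capture or pawn move (shuffling)
inline bool useClassicalEval(int psqBalance, int nonPawnMaterial, int rule50Count) {
    const int threshold = BaseThreshold + nonPawnMaterial / 64;
    // The right-hand side grows with rule50Count, so the longer a branch
    // shuffles, the larger the advantage must be to stay on classical eval.
    return std::abs(psqBalance) * 5 > threshold * (5 + rule50Count);
}
```

With these assumed numbers, a balance of 1000 selects the classical eval right after a
capture, but falls back to NNUE once the branch has shuffled for ten plies.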
passed STC:
LLR: 2.96 (-2.94,2.94) <-0.50,2.50>
Total: 47872 W: 3908 L: 3720 D: 40244
Ptnml(0-2): 122, 3053, 17419, 3199, 143
https://tests.stockfishchess.org/tests/view/60cef34b457376eb8bcab79d
passed LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 73616 W: 2326 L: 2143 D: 69147
Ptnml(0-2): 21, 1940, 32705, 2119, 23
https://tests.stockfishchess.org/tests/view/60cf6d842114332881e73528
Retested at LTC against latest master:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 18264 W: 642 L: 532 D: 17090
Ptnml(0-2): 6, 479, 8055, 583, 9
https://tests.stockfishchess.org/tests/view/60d18cd540925195e7a6c351
closes https://github.com/official-stockfish/Stockfish/pull/3578
Bench: 5139233
This patch removes the UCI option for setting Contempt in classical evaluation.
It is exactly equivalent to using Contempt=0 for the UCI contempt value and keeping
the dynamic part in the algo (renaming this dynamic part `trend` to better describe
what it does). We have tried quite hard to implement a working Contempt feature for
NNUE but nothing really worked, so it is probably time to give up.
Interested chess fans wishing to keep playing with the UCI option for Contempt and
use it with the classical eval are urged to download the version tagged "SF_Classical"
of Stockfish (dated 31 July 2020), as it was the last version where our search
algorithm was tuned for the classical eval and is probably our strongest classical
player ever: https://github.com/official-stockfish/Stockfish/tags
Passed STC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 72904 W: 6228 L: 6175 D: 60501
Ptnml(0-2): 221, 5006, 25971, 5007, 247
https://tests.stockfishchess.org/tests/view/60c98bf9457376eb8bcab18d
Passed LTC:
LLR: 2.93 (-2.94,2.94) <-2.50,0.50>
Total: 45168 W: 1601 L: 1547 D: 42020
Ptnml(0-2): 38, 1331, 19786, 1397, 32
https://tests.stockfishchess.org/tests/view/60c9c7fa457376eb8bcab1bb
closes https://github.com/official-stockfish/Stockfish/pull/3575
Bench: 4947716
This patch increases the weight of pawns and pieces from 28 to 32
in the scaling formula we apply to the output of the NNUE pure eval.
Increasing this gradient for pawns and pieces means that Stockfish
will try a little harder to keep material when she has the advantage,
and try a little bit harder to escape into an endgame when she is
under pressure.
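To make the effect of the weight concrete, here is a minimal sketch of a material-based
scaling factor applied to the raw NNUE output. Only the weights 28 and 32 come from the
text above; the base constant 903, the divisor 1024 and the function names are
assumptions for this example.

```cpp
// Scaling factor in 1/1024 units. It grows with the material left on the board.
constexpr int scaleFactor(int pawnCount, int nonPawnMaterial, int weight) {
    return 903 + weight * pawnCount + weight * nonPawnMaterial / 1024;
}

// Raw NNUE output scaled by the material-dependent factor.
constexpr int scaledEval(int rawNnue, int pawnCount, int nonPawnMaterial, int weight) {
    return rawNnue * scaleFactor(pawnCount, nonPawnMaterial, weight) / 1024;
}
```

With 16 pawns on the board, moving the weight from 28 to 32 raises the factor by 64,
so the same raw eval is worth more while material stays on and less after trades.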
STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 53168 W: 4371 L: 4177 D: 44620
Ptnml(0-2): 160, 3389, 19283, 3601, 151
https://tests.stockfishchess.org/tests/view/60cefd1d457376eb8bcab7ab
LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 10888 W: 386 L: 288 D: 10214
Ptnml(0-2): 3, 260, 4821, 356, 4
https://tests.stockfishchess.org/tests/view/60cf709d2114332881e7352b
closes https://github.com/official-stockfish/Stockfish/pull/3571
Bench: 4965430
trained with the Python command
c:\nnue>python train.py i:/bin/all.binpack i:/bin/all.binpack --gpus 1 --threads 4 --num-workers 30 --batch-size 16384 --progress_bar_refresh_rate 300 --smart-fen-skipping --random-fen-skipping 3 --features=HalfKAv2^ --lambda=1.0 --max_epochs=440 --seed %random%%random% --default_root_dir exp/run_10 --resume-from-model ./pt/nn-3b20abec10c1.pt
all.binpack equaled 4 parts Wrong_NNUE_2.binpack https://drive.google.com/file/d/1seGNOqcVdvK_vPNq98j-zV3XPE5zWAeq/view?usp=sharing plus two parts of Training_Data.binpack https://drive.google.com/file/d/1RFkQES3DpsiJqsOtUshENtzPfFgUmEff/view?usp=sharing
Each set was concatenated together - making one large Wrong_NNUE_2 binpack and one large Training_Data binpack, so that they were approximately equal in size. They were then interleaved together. The idea was to give Wrong_NNUE_2.binpack closer to equal weighting with Training_Data.binpack.
Net nn-3b20abec10c1.nnue was chosen as the --resume-from-model with the idea that through learning, the manually hex edited values will be learned and will not need to be manually adjusted going forward. They would also be fine tuned by the learning process.
passed STC:
https://tests.stockfishchess.org/tests/view/60cdf91e457376eb8bcab66f
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 18256 W: 1639 L: 1479 D: 15138
Ptnml(0-2): 59, 1179, 6505, 1313, 72
passed LTC:
https://tests.stockfishchess.org/tests/view/60ce2166457376eb8bcab6e1
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 18792 W: 654 L: 542 D: 17596
Ptnml(0-2): 9, 490, 8291, 592, 14
closes https://github.com/official-stockfish/Stockfish/pull/3570
Bench: 5020972
The Cygwin environment has two g++ compilers, each with a different problem
for compiling Stockfish at the moment:
(a) g++.exe : full posix build compiler, linked to cygwin dll.
=> This one has a problem embedding the net.
(b) x86_64-w64-mingw32-g++.exe : native Windows build compiler.
=> This one manages to embed the net, but has a problem related to libgcov
when we use the profile-build target of Stockfish.
This patch solves the problem for compiler (b), so that our recommended command line
if you want to build an optimized version of Stockfish on Cygwin becomes something
like the following (you can change the ARCH value to whatever you want, but note
the COMP and CXX variables pointing at the right compiler):
```
make -j profile-build ARCH=x86-64-modern COMP=mingw CXX=x86_64-w64-mingw32-c++.exe
```
closes https://github.com/official-stockfish/Stockfish/pull/3569
No functional change
Move to GitHub Actions to replace Travis CI.
First version, testing on linux using gcc and clang.
gcc build with sanitizers and valgrind.
No functional change
Optimization of vondele's nn-33c9d39e5eb6.nnue using SPSA
https://tests.stockfishchess.org/tests/view/60ca68be457376eb8bcab28b
Setting: ck values are default based on how large the parameters are
The new values for this net are the raw values at the end of the tuning (80k games)
The significant changes are in buckets 1 and 2 (5-12 pieces), so the main difference compared to nn-33c9 is in endgame play. There is also a change in bucket 7 (29-32 pieces), but it is less substantial than the changes in buckets 1 and 2. If we interpret the changes based on an experiment a few months ago, this new net plays more optimistically during endgames and less optimistically during openings.
STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 49504 W: 4246 L: 4053 D: 41205
Ptnml(0-2): 140, 3282, 17749, 3407, 174
https://tests.stockfishchess.org/tests/view/60cbd752457376eb8bcab478
LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.50>
Total: 88720 W: 4926 L: 4651 D: 79143
Ptnml(0-2): 105, 4048, 35793, 4295, 119
https://tests.stockfishchess.org/tests/view/60cc7828457376eb8bcab4fa
closes https://github.com/official-stockfish/Stockfish/pull/3566
Bench: 4758885
This net was created by @pleomati, who manually edited with a hex editor
10 values randomly chosen in the LCSFNet10 net (nn-6ad41a9207d0.nnue) to
create this one. The LCSFNet10 net was trained by Joost VandeVondele from
a dataset combining Stockfish games and Leela games (16x10^9 positions from
SF self-play at depth 9, and 6.3x10^9 positions from Leela games, so overall
72% of Stockfish positions and 28% of Leela positions).
passed STC 10+0.1:
LLR: 2.94 (-2.94,2.94) <-0.50,2.50>
Total: 50888 W: 5881 L: 5654 D: 39353
Ptnml(0-2): 281, 4290, 16085, 4497, 291
https://tests.stockfishchess.org/tests/view/60cbfa68457376eb8bcab49a
passed LTC 60+0.6:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 25480 W: 1498 L: 1338 D: 22644
Ptnml(0-2): 36, 1155, 10193, 1325, 31
https://tests.stockfishchess.org/tests/view/60cc4af8457376eb8bcab4d4
closes https://github.com/official-stockfish/Stockfish/pull/3564
Bench: 4904930
This reverts commit "Fix for Cygwin's environment build-profile", as it was
giving errors for "make clean" on some Windows environments. See comments in
68bf362ea2
Possibly somebody can propose a solution that would fix Cygwin builds without
breaking other systems, stay tuned! :-)
No functional change
The Cygwin environment has two g++ compilers, each with a different problem
for compiling Stockfish at the moment:
(a) g++.exe : full posix build compiler, linked to cygwin dll.
=> This one has a problem embedding the net.
(b) x86_64-w64-mingw32-g++.exe : native Windows build compiler.
=> This one manages to embed the net, but has a problem related to libgcov
when we use the profile-build target of Stockfish.
This patch solves the problem for compiler (b), so that our recommended command line
if you want to build an optimized version of Stockfish on Cygwin becomes something
like the following (you can change the ARCH value to whatever you want, but note
the COMP and CXX variables pointing at the right compiler):
```
make -j profile-build ARCH=x86-64-modern COMP=mingw CXX=x86_64-w64-mingw32-c++.exe
```
closes https://github.com/official-stockfish/Stockfish/pull/3463
No functional change
of a root move leading to a 3-fold repetition.
With this small fix a draw ranking and thus a draw score is being applied.
This works for ranking by both DTZ and WDL tables.
Fixes https://github.com/official-stockfish/Stockfish/issues/3542
(No functional change without TBs.)
Bench: 4877339
This net is the result of training on data used by the Leela project. More precisely,
we shuffled T60 and T74 data kindly provided by borg (for different Tnn, the data is
a result of Leela selfplay with differently sized Leela nets).
The data is available at vondele's google drive:
https://drive.google.com/drive/folders/1mftuzYdl9o6tBaceR3d_VBQIrgKJsFpl.
The Leela data comes in small chunks of .binpack files. To shuffle them, we simply
used a small python script to randomly rename the files, and then concatenated them
using `cat`. As validation data we picked a file of T60 data. We will further investigate
T74 data.
The training for the NNUE architecture used 200 epochs with the Python trainer from
the Stockfish project. Unlike the previous run we tried with this data, this run does
not have adjusted scaling — not because we didn't want to, but because we forgot.
However, this training randomly skips 40% more positions than the previous run. The loss
was very spiky and decreased more slowly than usual.
Training loss: https://github.com/official-stockfish/images/blob/main/training-loss-8e47cf062333.png
Validation loss: https://github.com/official-stockfish/images/blob/main/validation-loss-8e47cf062333.png
This is the exact training command:
python train.py --smart-fen-skipping --random-fen-skipping 14 --batch-size 16384 --threads 4 --num-workers 4 --gpus 1 trainingdata\training_data.binpack validationdata\val.binpack
---
10k STC result:
ELO: 3.61 +-3.3 (95%) LOS: 98.4%
Total: 10000 W: 1241 L: 1137 D: 7622
Ptnml(0-2): 68, 841, 3086, 929, 76
https://tests.stockfishchess.org/tests/view/60c67e50457376eb8bcaae70
10k LTC result:
ELO: 2.71 +-2.4 (95%) LOS: 98.8%
Total: 10000 W: 659 L: 581 D: 8760
Ptnml(0-2): 22, 485, 3900, 579, 14
https://tests.stockfishchess.org/tests/view/60c69deb457376eb8bcaae98
Passed LTC:
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 9648 W: 685 L: 545 D: 8418
Ptnml(0-2): 22, 448, 3740, 596, 18
https://tests.stockfishchess.org/tests/view/60c6d41c457376eb8bcaaecf
---
closes https://github.com/official-stockfish/Stockfish/pull/3550
Bench: 4877339
Compute optimal register count for feature transformer accumulation dynamically.
This also introduces a change where AVX512 would only use 8 registers instead of 16
(now possible due to a 2x increase in feature transformer size).
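A minimal sketch of how such a register count could be computed at compile time: take
the number of SIMD registers needed to hold the whole accumulator, and if that exceeds
the budget, fall back to the largest divisor that fits, so the accumulator splits into
equal tiles. The function name and the plain-int parameters are assumptions for this
example.

```cpp
// registerBytes : size of one SIMD register in bytes
// laneBytes     : size of one accumulator element in bytes
// numLanes      : number of accumulator elements
// maxRegisters  : how many registers we are willing to use
constexpr int bestRegisterCount(int registerBytes, int laneBytes,
                                int numLanes, int maxRegisters) {
    // Registers needed to hold the whole accumulator at once.
    const int ideal = (numLanes * laneBytes) / registerBytes;
    if (ideal <= maxRegisters)
        return ideal;
    // Otherwise pick the largest divisor of the ideal count within the budget.
    for (int divisor = maxRegisters; divisor > 1; --divisor)
        if (ideal % divisor == 0)
            return divisor;
    return 1;
}

// Example: 512 16-bit lanes in 32-byte registers, with a budget of 8 registers.
static_assert(bestRegisterCount(32, 2, 512, 8) == 8, "32 ideal registers tile into 8");
```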
closes https://github.com/official-stockfish/Stockfish/pull/3543
No functional change
This patch restricts LMR extensions (of non-transposition table moves) from being
used when the transposition table move was extended by two plies via singular
extension. This may serve to limit search explosions in certain positions.
This makes a lot of sense because the precondition for the tt-move to have been
singularly extended by two plies is that the result of the alternate search (with
the tt-move excluded) has been a hard fail low: it is natural to later search less
for non tt-moves in this situation.
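The restriction can be sketched as a clamp on the late-move-reduction search depth: a
negative reduction may normally extend the move by one ply, but not once the tt-move
has been doubly extended. The function name, the parameter names and the conditions
`reduction < -1` and `moveCount <= 5` are assumptions for this example.

```cpp
#include <algorithm>

// newDepth             : depth the move would get without any reduction (>= 1)
// reduction            : LMR reduction; a negative value asks for an extension
// moveCount            : index of the move in the move ordering
// ttMoveDoublyExtended : true if the tt-move got a two-ply singular extension
inline int lmrSearchDepth(int newDepth, int reduction, int moveCount,
                          bool ttMoveDoublyExtended) {
    const bool allowExtension =
        reduction < -1 && moveCount <= 5 && !ttMoveDoublyExtended;
    // Upper bound is newDepth + 1 only when the extension is still allowed.
    return std::clamp(newDepth - reduction, 1, newDepth + (allowExtension ? 1 : 0));
}
```

For instance, with newDepth 10 and reduction -2 the move is searched at depth 11, but
only at depth 10 if the tt-move was already doubly extended.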
The current state of depth/extensions/reductions management is getting quite tricky
in our search algo, see https://github.com/official-stockfish/Stockfish/pull/3546#issuecomment-860174549
for some discussion. Suggestions welcome!
Passed STC
https://tests.stockfishchess.org/tests/view/60c3f293457376eb8bcaac8d
LLR: 2.95 (-2.94,2.94) <-0.50,2.50>
Total: 117984 W: 9698 L: 9430 D: 98856
Ptnml(0-2): 315, 7708, 42703, 7926, 340
passed LTC
https://tests.stockfishchess.org/tests/view/60c46ea5457376eb8bcaacc7
LLR: 2.97 (-2.94,2.94) <0.50,3.50>
Total: 11280 W: 401 L: 302 D: 10577
Ptnml(0-2): 2, 271, 4998, 364, 5
closes https://github.com/official-stockfish/Stockfish/pull/3546
Bench: 4709974
Load feature transformer weights in bulk on little-endian machines.
This is in particular useful to test new nets with c-chess-cli,
see https://github.com/lucasart/c-chess-cli/issues/44
```
$ time ./stockfish.exe uci
Before : 0m0.914s
After : 0m0.483s
```
No functional change
Cleaner vector code structure in feature transformer. This patch just
regroups the parts of the inner loop for each SIMD instruction set.
Tested for non-regression:
LLR: 2.96 (-2.94,2.94) <-2.50,0.50>
Total: 115760 W: 9835 L: 9831 D: 96094
Ptnml(0-2): 326, 7776, 41715, 7694, 369
https://tests.stockfishchess.org/tests/view/60b96b39457376eb8bcaa26e
It would be nice if a future patch could use some of the macros at
the top of the file to unify the code between the distinct SIMD
instruction sets (of course, unifying the Relu will be the challenge).
closes https://github.com/official-stockfish/Stockfish/pull/3506
No functional change
This simplification patch implements two changes:
1. it simplifies away the so-called "lazy" path in the NNUE evaluation internals,
where we trusted the psqt head alone to avoid the costly "positional" head in
some cases;
2. it slightly raises NNUEThreshold1 in evaluate.cpp (from 682 to 800),
which increases the limit at which we switch from NNUE eval to Classical eval.
Both effects increase the number of positional evaluations done by our new net
architecture, but the results of our tests below seem to indicate that the loss
of speed will be compensated by the gain of eval quality.
STC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 26280 W: 2244 L: 2137 D: 21899
Ptnml(0-2): 72, 1755, 9405, 1810, 98
https://tests.stockfishchess.org/tests/view/60ae73f112066fd299795a51
LTC:
LLR: 2.95 (-2.94,2.94) <-2.50,0.50>
Total: 20592 W: 750 L: 677 D: 19165
Ptnml(0-2): 9, 614, 8980, 681, 12
https://tests.stockfishchess.org/tests/view/60ae88e812066fd299795a82
closes https://github.com/official-stockfish/Stockfish/pull/3503
Bench: 3817907
Definition of the lazy threshold moved to evaluate.cpp where all others are.
Lazy threshold only used for real searches, not used for the "eval" call.
This preserves the purity of NNUE evaluation, which is useful to verify
consistency between the engine and the NNUE trainer.
closes https://github.com/official-stockfish/Stockfish/pull/3499
No functional change
Our new nets output two values for the side to move in the last layer.
We can interpret the first value as a material evaluation of the
position, and the second one as the dynamic, positional value of the
location of pieces.
This patch changes the balance for the (materialist, positional) parts
of the score from (128, 128) to (121, 135) when the piece material is
equal between the two players, but keeps the standard (128, 128) balance
when one player is at least an exchange up.
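The rebalancing can be sketched as a weighted sum with weights in 1/128 units. The
weights (128, 128) and (121, 135) come from the description above; the function name
and the boolean flag standing in for the material test are assumptions for this
example.

```cpp
// materialist      : first output of the last layer (material part)
// positional       : second output of the last layer (positional part)
// materialBalanced : true when piece material is equal between the two players
constexpr int blendOutputs(int materialist, int positional, bool materialBalanced) {
    const int shift = materialBalanced ? 7 : 0;  // 128 -/+ 7 gives (121, 135)
    const int a = 128 - shift;                   // weight of the material part
    const int b = 128 + shift;                   // weight of the positional part
    return (a * materialist + b * positional) / 128;
}
```

With materialist = 100 and positional = 200, the balanced case gives 305 instead of the
standard 300, so positional features count slightly more when material is level.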
Passed STC:
LLR: 2.93 (-2.94,2.94) <-0.50,2.50>
Total: 15936 W: 1421 L: 1266 D: 13249
Ptnml(0-2): 37, 1037, 5694, 1134, 66
https://tests.stockfishchess.org/tests/view/60a82df9ce8ea25a3ef0408f
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.50>
Total: 13904 W: 516 L: 410 D: 12978
Ptnml(0-2): 4, 374, 6088, 484, 2
https://tests.stockfishchess.org/tests/view/60a8bbf9ce8ea25a3ef04101
closes https://github.com/official-stockfish/Stockfish/pull/3492
Bench: 3856635