since the introduction of NNUE (first released with Stockfish 12), we
have maintained the classical evaluation as part of SF in frozen form.
The idea that this code could lead to further inputs to the NN or
search did not materialize. Now, after five releases, this PR removes
the classical evaluation from SF. Even though this evaluation is
probably the best of its class, it has become unimportant for the
engine's strength, and there is little need to maintain this
code (roughly 25% of SF) going forward, or to expend resources on
trying to improve its integration in the NNUE eval.
Indeed, it had still a very limited use in the current SF, namely
for the evaluation of positions that are nearly decided based on
material difference, where the speed of the classical evaluation
outweights its inaccuracies. This impact on strength is small,
roughly 2Elo, and probably decreasing in importance as the TC grows.
Potentially, removal of this code could lead to the development of
techniques to have faster, but less accurate NN evaluation,
for certain positions.
STC
https://tests.stockfishchess.org/tests/view/64a320173ee09aa549c52157
Elo: -2.35 ± 1.1 (95%) LOS: 0.0%
Total: 100000 W: 24916 L: 25592 D: 49492
Ptnml(0-2): 287, 12123, 25841, 11477, 272
nElo: -4.62 ± 2.2 (95%) PairsRatio: 0.95
LTC
https://tests.stockfishchess.org/tests/view/64a320293ee09aa549c5215b
Elo: -1.74 ± 1.0 (95%) LOS: 0.0%
Total: 100000 W: 25010 L: 25512 D: 49478
Ptnml(0-2): 44, 11069, 28270, 10579, 38
nElo: -3.72 ± 2.2 (95%) PairsRatio: 0.96
VLTC SMP
https://tests.stockfishchess.org/tests/view/64a3207c3ee09aa549c52168
Elo: -1.70 ± 0.9 (95%) LOS: 0.0%
Total: 100000 W: 25673 L: 26162 D: 48165
Ptnml(0-2): 8, 9455, 31569, 8954, 14
nElo: -3.95 ± 2.2 (95%) PairsRatio: 0.95
closes https://github.com/official-stockfish/Stockfish/pull/4674
Bench: 1444646
- loop through the commits starting from the latest one
- read the bench value from the last match, if any, of the template
in the commit body text
closes https://github.com/official-stockfish/Stockfish/pull/4627
No functional change
Current logic can apply Null move pruning
on a dead-lost position returning an unproven loss
(i.e. in TB loss score or mated in losing score) on nonPv nodes.
on a default bench, this can be observed by adding this debugging line:
```
if (nullValue >= beta)
{
// Do not return unproven mate or TB scores
nullValue = std::min(nullValue, VALUE_TB_WIN_IN_MAX_PLY-1);
dbg_hit_on(nullValue <= VALUE_TB_LOSS_IN_MAX_PLY); // Hit #0: Total 73983 Hits 1 Hit Rate (%) 0.00135166
if (thisThread->nmpMinPly || depth < 14)
return nullValue;
```
This fixes this very rare issue (happens at ~0.00135166% of the time) by
eliminating the need to try Null Move Pruning with dead-lost positions
and leaving it to be determined by a normal searching flow.
The previous try to fix was not as safe enough because it was capping
the returned value to (out of TB range) thus reviving the dead-lost position
based on an artificial clamp (i.e. the in TB score/mate score can be lost on that nonPv node):
https://tests.stockfishchess.org/tests/view/649756d5dc7002ce609cd794
Final fix:
Passed STC:
https://tests.stockfishchess.org/tests/view/649a5446dc7002ce609d1049
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 577280 W: 153613 L: 153965 D: 269702
Ptnml(0-2): 1320, 60594, 165190, 60190, 1346
Passed LTC:
https://tests.stockfishchess.org/tests/view/649cd048dc7002ce609d4801
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 246432 W: 66769 L: 66778 D: 112885
Ptnml(0-2): 83, 22105, 78847, 22100, 81
closes https://github.com/official-stockfish/Stockfish/pull/4649
Bench: 2425978
faster permutation of master net weights
Activation data taken from https://drive.google.com/drive/folders/1Ec9YuuRx4N03GPnVPoQOW70eucOKngQe?usp=sharing
Permutation found using 836387a0e5/ftperm.py
See also https://github.com/glinscott/nnue-pytorch/pull/254
The algorithm greedily selects 2- and 3-cycles that can be permuted to increase the number of runs of zeroes. The percent of zero runs from the master net increased from 68.46 to 70.11 from 2-cycles and only increased to 70.32 when considering 3-cycles. Interestingly, allowing both halves of L1 to intermix when creating zero runs can give another 0.5% zero-run density increase with this method.
Measured speedup:
```
CPU: 16 x AMD Ryzen 9 3950X 16-Core Processor
Result of 50 runs
base (./stockfish.master ) = 1561556 +/- 5439
test (./stockfish.patch ) = 1575788 +/- 5427
diff = +14231 +/- 2636
speedup = +0.0091
P(speedup > 0) = 1.0000
```
closes https://github.com/official-stockfish/Stockfish/pull/4640
No functional change
using github actions, create a prerelease for the latest commit to master.
As such a development version will be available on github, in addition to the latest release.
closes https://github.com/official-stockfish/Stockfish/pull/4622
No functional change
Implemented LEB128 (de)compression for the feature transformer.
Reduces embedded network size from 70 MiB to 39 Mib.
The new nn-78bacfcee510.nnue corresponds to the master net compressed.
closes https://github.com/official-stockfish/Stockfish/pull/4617
No functional change
Use block sparse input for the first fully connected layer on architectures with at least SSSE3.
Depending on the CPU architecture, this yields a speedup of up to 10%, e.g.
```
Result of 100 runs of 'bench 16 1 13 default depth NNUE'
base (...ockfish-base) = 959345 +/- 7477
test (...ckfish-patch) = 1054340 +/- 9640
diff = +94995 +/- 3999
speedup = +0.0990
P(speedup > 0) = 1.0000
CPU: 8 x AMD Ryzen 7 5700U with Radeon Graphics
Hyperthreading: on
```
Passed STC:
https://tests.stockfishchess.org/tests/view/6485aa0965ffe077ca12409c
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 8864 W: 2479 L: 2223 D: 4162
Ptnml(0-2): 13, 829, 2504, 1061, 25
This commit includes a net with reordered weights, to increase the likelihood of block sparse inputs,
but otherwise equivalent to the previous master net (nn-ea57bea57e32.nnue).
Activation data collected with https://github.com/AndrovT/Stockfish/tree/log-activations, running bench 16 1 13 varied_1000.epd depth NNUE on this data. Net parameters permuted with https://gist.github.com/AndrovT/9e3fbaebb7082734dc84d27e02094cb3.
closes https://github.com/official-stockfish/Stockfish/pull/4612
No functional change
Created by retraining an earlier epoch (ep659) of the experiment that led to the first SFNNv6 net:
- First retrained on the nn-0dd1cebea573 dataset
- Then retrained with skip 20 on a smaller dataset containing unfiltered Leela data
- And then retrained again with skip 27 on the nn-0dd1cebea573 dataset
The equivalent 7-step training sequence from scratch that led here was:
1. max-epoch 400, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
ep379 chosen for retraining in step2
2. max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
ep679 chosen for retraining in step3
3. max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
ep799 chosen for retraining in step4
4. max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
ep759 became nn-8d69132723e2.nnue (first SFNNv6 net)
ep659 chosen for retraining in step5
5. max-epoch 800, end-lambda 0.7, skip 28, nn-0dd1cebea573 dataset
ep759 chosen for retraining in step6
6. max-epoch 800, end-lambda 0.7, skip 20, leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
ep639 chosen for retraining in step7
7. max-epoch 800, end-lambda 0.7, skip 27, nn-0dd1cebea573 dataset
ep619 became nn-ea57bea57e32.nnue
For the last retraining (step7):
python3 easy_train.py
--experiment-name L1-1536-Re6-masterShuffled-ep639-sk27-Re5-leela-dfrc-v2-T77toT80small-Re4-masterShuffled-ep659-Re3-sameAs-Re2-leela96-dfrc99-16t-v2-T60novdecT80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-Re1-LeelaFarseer-new-T77T79 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--early-fen-skipping 27 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 800 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-Re5-leela-dfrc-v2-T77toT80small-epoch639.nnue \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--seed $RANDOM \
--gpus "0,"
For preparing the step6 leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack dataset:
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-16tb7p-eval-filt-v2.binpack \
test77-dec2021-16tb7p.no-db.min-mar2023.binpack \
test78-janfeb2022-16tb7p.no-db.min-mar2023.binpack \
test79-apr2022-16tb7p-filter-v6-dd.binpack \
test80-apr2022-16tb7p.no-db.min-mar2023.binpack \
/data/leela-dfrc-v2-T77decT78janfebT79aprT80apr.binpack
The unfiltered Leela data used for the step6 dataset can be found at:
https://robotmoon.com/nnue-training-data
Local elo at 25k nodes per move:
nn-epoch619.nnue : 2.3 +/- 1.9
Passed STC:
https://tests.stockfishchess.org/tests/view/6480d43c6e6ce8d9fc6d7cc8
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 40992 W: 11017 L: 10706 D: 19269
Ptnml(0-2): 113, 4400, 11170, 4689, 124
Passed LTC:
https://tests.stockfishchess.org/tests/view/648119ac6e6ce8d9fc6d8208
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 129174 W: 35059 L: 34579 D: 59536
Ptnml(0-2): 66, 12548, 38868, 13050, 55
closes https://github.com/official-stockfish/Stockfish/pull/4611
bench: 2370027
Created by retraining an earlier epoch of the experiment leading to the first SFNNv6 net
on a more-randomized version of the nn-e1fb1ade4432.nnue dataset mixed with unfiltered
T80 apr2023 data. Trained using early-fen-skipping 28 and max-epoch 960.
The trainer settings and epochs used in the 5-step training sequence leading here were:
1. train from scratch for 400 epochs, lambda 1.0, constant LR 9.75e-4, T79T77-filter-v6-dd.min.binpack
2. retrain ep379, max-epoch 800, end-lambda 0.75, T60T70wIsRightFarseerT60T74T75T76.binpack
3. retrain ep679, max-epoch 800, end-lambda 0.75, skip 28, nn-e1fb1ade4432 dataset
4. retrain ep799, max-epoch 800, end-lambda 0.7, skip 28, nn-e1fb1ade4432 dataset
5. retrain ep439, max-epoch 960, end-lambda 0.7, skip 28, shuffled nn-e1fb1ade4432 + T80 apr2023
This net was epoch 559 of the final (step 5) retraining:
```bash
python3 easy_train.py \
--experiment-name L1-1536-Re4-leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr-shuffled-sk28 \
--training-dataset /data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack \
--nnue-pytorch-branch linrock/nnue-pytorch/misc-fixes-L1-1536 \
--early-fen-skipping 28 \
--start-lambda 1.0 \
--end-lambda 0.7 \
--max_epoch 960 \
--start-from-engine-test-net False \
--start-from-model /data/L1-1536-Re3-nn-epoch439.nnue \
--engine-test-branch linrock/Stockfish/L1-1536 \
--lr 4.375e-4 \
--gamma 0.995 \
--tui False \
--seed $RANDOM \
--gpus "0,"
```
During data preparation, most binpacks were unminimized by removing positions with
score 32002 (`VALUE_NONE`). This makes the tradeoff of increasing dataset filesize
on disk to increase the randomness of positions in interleaved datasets.
The code used for unminimizing is at:
https://github.com/linrock/Stockfish/tree/tools-unminify
For preparing the dataset used in this experiment:
```bash
python3 interleave_binpacks.py \
leela96-filt-v2.binpack \
dfrc99-16tb7p-eval-filt-v2.binpack \
filt-v6-dd-min/test60-novdec2021-12tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-aug2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-sep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test80-jul2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-oct2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test80-nov2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd-min/test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd-min/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test79-apr2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test79-may2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd-min/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.unmin.binpack \
filt-v6-dd/test78-juntosep2022-16tb7p-filter-v6-dd.binpack \
filt-v6-dd/test77-dec2021-16tb7p-filter-v6-dd.binpack \
test80-apr2023-2tb7p.binpack \
/data/leela96-dfrc99-T60novdec-v2-T80juntonovjanfebT79aprmayT78jantosepT77dec-v6dd-T80apr.binpack
```
T80 apr2023 data was converted using lc0-rescorer with ~2tb of tablebases and can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move vs. nn-e1fb1ade4432.nnue (L1 size 1024):
nn-epoch559.nnue : 25.7 +/- 1.6
Passed STC:
https://tests.stockfishchess.org/tests/view/647cd3b87cf638f0f53f9cbb
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 59200 W: 16000 L: 15660 D: 27540
Ptnml(0-2): 159, 6488, 15996, 6768, 189
Passed LTC:
https://tests.stockfishchess.org/tests/view/647d58de726f6b400e4085d8
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 58800 W: 16002 L: 15657 D: 27141
Ptnml(0-2): 44, 5607, 17748, 5962, 39
closes https://github.com/official-stockfish/Stockfish/pull/4606
bench 2141197