Created by first retraining the spsa-tuned main net `nn-ae6a388e4a1a.nnue` with:
- using v6-dd data without bestmove captures removed
- addition of T80 mar2024 data
- increasing loss by 20% when Q is too high
- torch.compile changes for marginal training speed gains
And then SPSA tuning weights of epoch 899 following methods described in:
https://github.com/official-stockfish/Stockfish/pull/5149
This net was reached at 92k out of 120k steps in this 70+0.7 th 7 SPSA tuning run:
https://tests.stockfishchess.org/tests/view/66413b7df9f4e8fc783c9bbb
Thanks to @Viren6 for suggesting usage of:
- c value 4 for the weights
- c value 128 for the biases
Scripts that automate applying fishtest spsa params and exporting the tuned .nnue are in:
https://github.com/linrock/nnue-tools/tree/master/spsa
Before spsa tuning, epoch 899 was nn-f85738aefa84.nnue
https://tests.stockfishchess.org/tests/view/663e5c893a2f9702074bc167
After initially training with max-epoch 800, training was resumed with max-epoch 1000.
```
experiment-name: 3072--S11--more-data-v6-dd-t80-mar2024--see-ge0-20p-more-loss-high-q-sk28-l8
nnue-pytorch-branch: linrock/nnue-pytorch/3072-r21-skip-more-wdl-see-ge0-20p-more-loss-high-q-torch-compile-more
start-from-engine-test-net: False
start-from-model: /data/config/apr2024-3072/nn-ae6a388e4a1a.nnue
early-fen-skipping: 28
training-dataset:
/data/S11-mar2024/:
- leela96.v2.min.binpack
- test60-2021-11-12-novdec-12tb7p.v6-dd.min.binpack
- test78-2022-01-to-05-jantomay-16tb7p.v6-dd.min.binpack
- test80-2022-06-jun-16tb7p.v6-dd.min.binpack
- test80-2022-08-aug-16tb7p.v6-dd.min.binpack
- test80-2022-09-sep-16tb7p.v6-dd.min.binpack
- test80-2023-01-jan-16tb7p.v6-sk20.min.binpack
- test80-2023-02-feb-16tb7p.v6-sk20.min.binpack
- test80-2023-03-mar-2tb7p.v6-sk16.min.binpack
- test80-2023-04-apr-2tb7p.v6-sk16.min.binpack
- test80-2023-05-may-2tb7p.v6.min.binpack
# https://github.com/official-stockfish/Stockfish/pull/4782
- test80-2023-06-jun-2tb7p.binpack
- test80-2023-07-jul-2tb7p.binpack
# https://github.com/official-stockfish/Stockfish/pull/4972
- test80-2023-08-aug-2tb7p.v6.min.binpack
- test80-2023-09-sep-2tb7p.binpack
- test80-2023-10-oct-2tb7p.binpack
# S9 new data: https://github.com/official-stockfish/Stockfish/pull/5056
- test80-2023-11-nov-2tb7p.binpack
- test80-2023-12-dec-2tb7p.binpack
# S10 new data: https://github.com/official-stockfish/Stockfish/pull/5149
- test80-2024-01-jan-2tb7p.binpack
- test80-2024-02-feb-2tb7p.binpack
# S11 new data
- test80-2024-03-mar-2tb7p.binpack
/data/filt-v6-dd/:
- test77-dec2021-16tb7p-filter-v6-dd.binpack
- test78-juntosep2022-16tb7p-filter-v6-dd.binpack
- test79-apr2022-16tb7p-filter-v6-dd.binpack
- test79-may2022-16tb7p-filter-v6-dd.binpack
- test80-jul2022-16tb7p-filter-v6-dd.binpack
- test80-oct2022-16tb7p-filter-v6-dd.binpack
- test80-nov2022-16tb7p-filter-v6-dd.binpack
num-epochs: 1000
lr: 4.375e-4
gamma: 0.995
start-lambda: 0.8
end-lambda: 0.7
```
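As a rough illustration of how the `lr`, `gamma`, `start-lambda` and `end-lambda` values above might evolve over training, the sketch below assumes the learning rate is multiplied by `gamma` once per epoch and lambda is interpolated linearly from its start to its end value. Both assumptions are for illustration only; the actual schedule is whatever the nnue-pytorch branch named above implements, and the function names here are hypothetical.

```python
def lr_at_epoch(epoch, lr=4.375e-4, gamma=0.995):
    """Learning rate after `epoch` decay steps (assumed: one step per epoch)."""
    return lr * gamma ** epoch


def lambda_at_epoch(epoch, num_epochs=1000, start=0.8, end=0.7):
    """Lambda at `epoch` (assumed: linear interpolation over the whole run)."""
    frac = epoch / (num_epochs - 1)
    return start + (end - start) * frac


if __name__ == "__main__":
    # Print a few sample points, including epoch 899 used for this net.
    for epoch in (0, 399, 799, 899, 999):
        print(epoch, lr_at_epoch(epoch), lambda_at_epoch(epoch))
```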
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move:
nn-epoch899.nnue : 4.6 +/- 1.4
Passed STC:
https://tests.stockfishchess.org/tests/view/6645454893ce6da3e93b31ae
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 95232 W: 24598 L: 24194 D: 46440
Ptnml(0-2): 294, 11215, 24180, 11647, 280
Passed LTC:
https://tests.stockfishchess.org/tests/view/6645522d93ce6da3e93b31df
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 320544 W: 81432 L: 80524 D: 158588
Ptnml(0-2): 164, 35659, 87696, 36611, 142
closes https://github.com/official-stockfish/Stockfish/pull/5254
bench 1995552
Basically the same idea as for continuation/main history, but with
some tweaks.
1) it has a * 2 multiplier for the bonus instead of a full/half bonus - for
whatever reason this seems to work better;
2) attempts with this type of big bonus scaled somewhat poorly (or
were unlucky at longer time controls). Measurements showed that the
average value of pawn history in LMR increased by a substantial amount
after adding these bonuses (for multiplier 1.5 it increased by
something like 400, with a cap of 8192), so attempts were made to make
the default pawn history negative to compensate - and the version with
multiplier 2 and initial fill value -900 passed.
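The interaction between the bonus multiplier, the cap and the negative fill value can be sketched as follows. The 8192 cap, the * 2 multiplier and the -900 fill come from the text above; the "gravity" style update rule, the table layout and all function names are assumptions made for illustration, not the engine's actual code.

```python
PAWN_HISTORY_CAP = 8192    # cap quoted in the text
PAWN_HISTORY_FILL = -900   # initial fill value quoted in the text
BONUS_MULTIPLIER = 2       # multiplier quoted in the text


def new_pawn_history(size):
    """Hypothetical flat table, pre-filled with the negative default."""
    return [PAWN_HISTORY_FILL] * size


def update_entry(entry, bonus, cap=PAWN_HISTORY_CAP):
    """Assumed update rule: add the bonus, then pull the entry back
    towards zero in proportion to |bonus|, so it stays within +/- cap."""
    bonus = max(-cap, min(cap, bonus))
    # int() truncates towards zero, like integer division in C++.
    return entry + bonus - int(entry * abs(bonus) / cap)


def apply_bonus(table, index, base_bonus):
    """Apply the doubled bonus to one table entry."""
    table[index] = update_entry(table[index], BONUS_MULTIPLIER * base_bonus)


if __name__ == "__main__":
    table = new_pawn_history(4)
    apply_bonus(table, 0, 100)
    print(table)
```

Because larger bonuses raise the average entry, starting the table below zero is one way to keep that average close to where it was before the change.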
Passed STC:
https://tests.stockfishchess.org/tests/view/66424815f9f4e8fc783cba59
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 115008 W: 30001 L: 29564 D: 55443
Ptnml(0-2): 432, 13629, 28903, 14150, 390
Passed LTC:
https://tests.stockfishchess.org/tests/view/6642f5437134c82f3f7a3ffa
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 56448 W: 14432 L: 14067 D: 27949
Ptnml(0-2): 36, 6268, 15254, 6627, 39
Bench: 1857237
Stockfish appears to take too much time on the first move of a game and
then not enough on moves 2,3,4... Probably caused by most of the factors
that increase time usually applying on the first move.
Attempts to give more time to the subsequent moves have not worked so
far, but this change to simply reduce first move time by 5% worked.
STC 10+0.1 :
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 78496 W: 20516 L: 20135 D: 37845
Ptnml(0-2): 340, 8859, 20456, 9266, 327
https://tests.stockfishchess.org/tests/view/663d47bf507ebe1c0e9200ba
LTC 60+0.6 :
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 94872 W: 24179 L: 23751 D: 46942
Ptnml(0-2): 61, 9743, 27405, 10161, 66
https://tests.stockfishchess.org/tests/view/663e779cbb28828150dd9089
closes https://github.com/official-stockfish/Stockfish/pull/5235
Bench: 1876282
1. The current time management system utilizes limits.inc and
limits.time, which can represent either milliseconds or node count,
depending on whether the nodestime option is active. There have been
several modifications which brought Elo gain for typical uses (i.e.
real-time matches); however, some of these changes overlooked this
distinction. This patch adjusts constants and multiplication/division to
more accurately simulate real TC conditions when nodestime is used.
2. The advance_nodes_time function has a bug that can extend the time
limit when availableNodes reaches exactly zero. This patch fixes the bug
by initializing the variable to -1 and making sure it does not go below
zero.
3. The elapsed_time function is newly introduced to print PV in the UCI
output based on real time. This makes PV output more consistent with the
behavior of trivial use cases.
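The fix in item 2 can be illustrated with a small sketch. It assumes that, before the patch, a value of zero doubled as the "not yet initialized" marker, so a budget that was spent down to exactly zero would be refilled. The class and method names are hypothetical; only `availableNodes`, the -1 initial value and the clamp at zero come from the text.

```python
class NodesTimeBudget:
    """Hypothetical node budget used when nodestime is active."""

    UNSET = -1  # sentinel: budget not initialized yet

    def __init__(self):
        self.available_nodes = self.UNSET

    def init(self, nodes):
        # Only fill the budget once; a spent budget (0) is NOT refilled.
        if self.available_nodes == self.UNSET:
            self.available_nodes = nodes

    def advance(self, nodes_searched):
        # Clamp at zero so the value can never collide with the sentinel.
        assert self.available_nodes != self.UNSET, "budget not initialized"
        self.available_nodes = max(0, self.available_nodes - nodes_searched)


if __name__ == "__main__":
    budget = NodesTimeBudget()
    budget.init(1000)
    budget.advance(1000)   # budget is now exactly zero
    budget.init(5000)      # must not extend the limit
    print(budget.available_nodes)
```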
closes https://github.com/official-stockfish/Stockfish/pull/5186
No functional changes
If there is an upper bound stored in the transposition table, but we still have a ttMove, the upper bound indicates that the last time the ttMove was tried, it failed low. This fail low indicates that the ttMove may not be good, so this patch introduces a depth reduction of one for cutnodes with such ttMoves.
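A minimal sketch of the condition described above, with hypothetical names throughout. It only shows the shape of the check (cutnode, a ttMove present, bound type is upper); any additional guards on depth that the real patch may use are left out.

```python
from enum import Enum


class Bound(Enum):
    NONE = 0
    UPPER = 1
    LOWER = 2
    EXACT = 3


def adjusted_depth(depth, cut_node, tt_move, tt_bound):
    """Reduce depth by one at cutnodes whose ttMove last failed low."""
    if cut_node and tt_move is not None and tt_bound is Bound.UPPER:
        return depth - 1
    return depth


if __name__ == "__main__":
    print(adjusted_depth(10, True, "e2e4", Bound.UPPER))   # reduced
    print(adjusted_depth(10, False, "e2e4", Bound.UPPER))  # unchanged
```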
Passed STC:
https://tests.stockfishchess.org/tests/view/663be4d1ca93dad645f7f45f
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 139424 W: 35900 L: 35433 D: 68091
Ptnml(0-2): 425, 16357, 35743, 16700, 487
Passed LTC:
https://tests.stockfishchess.org/tests/view/663bec95ca93dad645f7f5c8
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 129690 W: 32902 L: 32390 D: 64398
Ptnml(0-2): 63, 14304, 35610, 14794, 74
closes https://github.com/official-stockfish/Stockfish/pull/5227
bench 2257437
Make the qsearch formula more in line with what we use in search. The current formula is more or less the one we used years ago for search, but search has since been reworked. This patch remakes the qsearch formula to be almost exactly the same as the one we use in search: the sum of conthist 0, conthist 1 and pawn structure history.
Passed STC:
https://tests.stockfishchess.org/tests/view/6639c8421343f0cb16716206
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 84992 W: 22414 L: 22019 D: 40559
Ptnml(0-2): 358, 9992, 21440, 10309, 397
Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 119136 W: 30407 L: 29916 D: 58813
Ptnml(0-2): 46, 13192, 32622, 13641, 67
closes https://github.com/official-stockfish/Stockfish/pull/5224
Bench: 2138659
The idea came to me by checking for trends from the megafauzi tunes, since the values of the divisor for this specific formula were as follows:
stc: 15990
mtc: 16117
ltc: 14805
vltc: 12719
new vltc passed by Muzhen: 12076
This shows a clear trend related to time control: the higher it is, the lower the optimum value for the divisor seems to be.
So I tried a simple formula, using educated guesses based on some calculations. Tests show it works pretty well, and it can still be further tuned at VLTC in the future to scale even better.
Passed STC:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 431360 W: 110791 L: 109898 D: 210671
Ptnml(0-2): 1182, 50846, 110698, 51805, 1149
https://tests.stockfishchess.org/tests/view/663770409819650825aa269f
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 114114 W: 29109 L: 28625 D: 56380
Ptnml(0-2): 105, 12628, 31101, 13124, 99
https://tests.stockfishchess.org/tests/view/66378c099819650825aa73f6
closes https://github.com/official-stockfish/Stockfish/pull/5223
bench: 2273551
This adds the functions `update_refutations` and `update_quiet_histories` to better distinguish the two. `update_quiet_stats` now just calls both of these functions.
The functional side of this patch is two-fold:
1. Stop refutations being updated when we carry out multicut
2. Update pawn history every time we update other quiet histories
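The resulting call structure can be sketched as below. The three function names come from the text; the dictionary-based tables, the `on_multicut` helper and the exact set of histories are assumptions for illustration only, including the guess that the multicut path still updates quiet histories.

```python
def new_stats():
    """Hypothetical container for the tables touched by the update functions."""
    return {
        "refutations": [],
        "main_history": {},
        "continuation_history": {},
        "pawn_history": {},
    }


def update_refutations(stats, move):
    # Record the move as a refutation (e.g. killer / counter move).
    stats["refutations"].append(move)


def update_quiet_histories(stats, move, bonus):
    # Pawn history is updated together with the other quiet histories.
    for name in ("main_history", "continuation_history", "pawn_history"):
        stats[name][move] = stats[name].get(move, 0) + bonus


def update_quiet_stats(stats, move, bonus):
    # Now just a thin wrapper around the two functions above.
    update_refutations(stats, move)
    update_quiet_histories(stats, move, bonus)


def on_multicut(stats, move, bonus):
    # Assumed multicut path: histories are updated, refutations are not.
    update_quiet_histories(stats, move, bonus)
```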
Yellow STC:
LLR: -2.95 (-2.94,2.94) <0.00,2.00>
Total: 238976 W: 61506 L: 61415 D: 116055
Ptnml(0-2): 846, 28628, 60456, 28705, 853
https://tests.stockfishchess.org/tests/view/66321b5ed01fb9ac9bcdca83
However, it passed in <-1.75, 0.25> bounds:
$ python3 sprt.py --wins 61506 --losses 61415 --draws 116055 --elo0 -1.75 --elo1 0.25
ELO: 0.132 +- 0.998 [-0.865, 1.13]
LLR: 4.15 [-1.75, 0.25] (-2.94, 2.94)
H1 Accepted
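For readers who want to sanity-check the Elo figure above, here is a minimal sketch that estimates logistic Elo and an approximate 95% interval from the win/loss/draw counts alone. It is not the `sprt.py` script and does not compute the LLR; the function name and the normal-approximation interval are assumptions. With the counts from this test it should land close to the reported estimate of about 0.13 Elo with an interval of roughly plus or minus 1.

```python
import math


def elo_from_wld(wins, losses, draws, z=1.96):
    """Return (elo, lower, upper) from trinomial game results.

    Uses the logistic Elo model and a normal approximation for the
    uncertainty of the mean score per game.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score around its mean.
    var = (wins * (1 - score) ** 2
           + losses * (0 - score) ** 2
           + draws * (0.5 - score) ** 2) / n
    stderr = math.sqrt(var / n)

    def to_elo(s):
        return -400 * math.log10(1 / s - 1)

    return to_elo(score), to_elo(score - z * stderr), to_elo(score + z * stderr)


if __name__ == "__main__":
    elo, lower, upper = elo_from_wld(61506, 61415, 116055)
    print(f"ELO: {elo:.3f} [{lower:.3f}, {upper:.3f}]")
```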
Passed LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 399126 W: 100730 L: 100896 D: 197500
Ptnml(0-2): 116, 44328, 110843, 44158, 118
https://tests.stockfishchess.org/tests/view/66357b0473559a8aa857ba6f
closes #5215
Bench 2370967
Saves a (currently) 800 KB allocation and deallocation when running
`eval`. This is not particularly significant and has zero impact on
play, but the allocation is not necessary either.
closes https://github.com/official-stockfish/Stockfish/pull/5201
No functional change
Adds the size in memory as well as the layer sizes, as in:
info string NNUE evaluation using nn-ae6a388e4a1a.nnue (132MiB, (22528, 3072, 15, 32, 1))
info string NNUE evaluation using nn-baff1ede1f90.nnue (6MiB, (22528, 128, 15, 32, 1))
For example, the size in MiB is useful to keep the fishtest memory sizes up-to-date,
and the L1-L3 sizes give a useful hint about the architecture used.
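A small sketch of how such a line could be assembled. The output format is copied from the examples above; the function name, the byte count passed in and the choice of integer division for the MiB value are assumptions for illustration.

```python
def format_net_info(name, size_bytes, layers):
    """Build an info string with the net name, size in MiB and layer sizes."""
    size_mib = size_bytes // (1024 * 1024)  # assumed rounding: floor
    layer_text = ", ".join(str(n) for n in layers)
    return (f"info string NNUE evaluation using {name} "
            f"({size_mib}MiB, ({layer_text}))")


if __name__ == "__main__":
    # The byte count here is a made-up value that yields 132 MiB.
    print(format_net_info("nn-ae6a388e4a1a.nnue",
                          132 * 1024 * 1024,
                          (22528, 3072, 15, 32, 1)))
```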
closes https://github.com/official-stockfish/Stockfish/pull/5193
No functional change