Improves throughput by summing two intermediate dot products using 16-bit addition before upconverting to 32-bit.
Potential saturation is detected and the code-path is avoided in this case.
The saturation can't happen with the current nets,
but nets can be constructed that trigger this check.
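
A minimal scalar sketch of the idea described above, not the actual SIMD code. It assumes inputs are clipped to [0, 127] and that products are formed in pairs, as a multiply-add instruction would; the names can_sum_in_16_bits and dot_product are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Assumption: inputs are clipped activations in [0, 127].
constexpr int kMaxInput = 127;

// Checked once per weight set: true if, for every group of four weights,
// the sum of two pair products is guaranteed to fit in a signed 16-bit value.
bool can_sum_in_16_bits(const std::vector<std::int8_t>& w) {
    assert(w.size() % 4 == 0);
    for (std::size_t i = 0; i < w.size(); i += 4) {
        int absSum = 0;
        for (std::size_t j = 0; j < 4; ++j)
            absSum += std::abs(static_cast<int>(w[i + j]));
        if (kMaxInput * absSum > INT16_MAX)
            return false;
    }
    return true;
}

std::int32_t dot_product(const std::vector<std::uint8_t>& in,
                         const std::vector<std::int8_t>& w) {
    assert(in.size() == w.size() && in.size() % 4 == 0);
    const bool fast = can_sum_in_16_bits(w);
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < in.size(); i += 4) {
        const int p0 = in[i] * w[i] + in[i + 1] * w[i + 1];
        const int p1 = in[i + 2] * w[i + 2] + in[i + 3] * w[i + 3];
        if (fast) {
            // One 16-bit addition, then a single widening to 32 bits.
            const std::int16_t s16 = static_cast<std::int16_t>(p0 + p1);
            sum += s16;
        } else {
            // Fallback: widen each intermediate product separately.
            sum += p0;
            sum += p1;
        }
    }
    return sum;
}
```

With four weights of 127 the check fails and the fallback path is taken; with small weights both paths give the same result.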
STC https://tests.stockfishchess.org/tests/view/5fd40a861ac1691201888479
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 25544 W: 2451 L: 2296 D: 20797
Ptnml(0-2): 92, 1761, 8925, 1888, 106
about 5% speedup
closes https://github.com/official-stockfish/Stockfish/pull/3261
No functional change
Imbalance tables tweaked to contain MiddleGame and Endgame values, instead of a single value.
The idea started from Fisherman, who asked for my help tuning the values back in June/July,
so I tuned the values back then, and we were able to accomplish good results,
but not enough to pass both STC and LTC tests.
So after the recent changes, I decided to give it another shot, and I am glad that it was a successful attempt.
Special thanks also go to mstembera, who pointed out a simple way to make the patch perform a little better.
Passed STC:
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 115976 W: 23124 L: 22695 D: 70157
Ptnml(0-2): 2074, 13652, 26285, 13725, 2252
https://tests.stockfishchess.org/tests/view/5fc92d2d42a050a89f02ccc8
Passed LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 156304 W: 20617 L: 20024 D: 115663
Ptnml(0-2): 1138, 14647, 46084, 15050, 1233
https://tests.stockfishchess.org/tests/view/5fc9fee142a050a89f02cd3e
closes https://github.com/official-stockfish/Stockfish/pull/3255
Bench: 4278746
This appears to be slightly faster than using a comparison against zero
to compute the high bits, on both old (like Pentium III) and new (like
Zen 2) hardware.
closes https://github.com/official-stockfish/Stockfish/pull/3254
No functional change.
The idea of this patch can be described as follows: we update static
history stats based on a comparison of the static evaluations of the
position before and after the move. If the move increases the static
evaluation, it is assigned a positive bonus; if it decreases the static
evaluation, it is assigned a negative bonus. These stats are used in movepicker
to sort quiet moves.
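
A rough sketch of this mechanism, under stated assumptions: both evaluations are taken from the mover's point of view, the depth scaling and the clamp to [-1000, 1000] are illustrative values, and StaticHistory, static_eval_bonus and sort_quiets are hypothetical names rather than the engine's own.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Move { int from; int to; };

// Bonus is positive when the move improved the static evaluation,
// negative when it made it worse. Scale and clamp are illustrative.
int static_eval_bonus(int evalBefore, int evalAfter, int depth) {
    assert(depth >= 0);
    const int raw = 4 * depth * (evalAfter - evalBefore);
    return std::max(-1000, std::min(1000, raw));
}

class StaticHistory {
public:
    void update(Move m, int bonus) { table_[m.from][m.to] += bonus; }
    int score(Move m) const { return table_[m.from][m.to]; }

    // Order quiet moves so that those with the best stats come first.
    void sort_quiets(std::vector<Move>& moves) const {
        std::stable_sort(moves.begin(), moves.end(),
                         [this](Move a, Move b) { return score(a) > score(b); });
    }

private:
    int table_[64][64] = {};
};
```

A move that lowered the evaluation ends up with a negative entry and is tried after one that raised it.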
passed STC
https://tests.stockfishchess.org/tests/view/5fca4c0842a050a89f02cd66
LLR: 3.00 (-2.94,2.94) {-0.25,1.25}
Total: 78152 W: 7409 L: 7171 D: 63572
Ptnml(0-2): 303, 5695, 26873, 5871, 334
passed LTC
https://tests.stockfishchess.org/tests/view/5fca6be442a050a89f02cd75
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 40240 W: 1602 L: 1441 D: 37197
Ptnml(0-2): 19, 1306, 17305, 1475, 15
closes https://github.com/official-stockfish/Stockfish/pull/3253
bench 3845156
Include a scaling change as suggested by Dietrich Kappe,
who trained the net for Komodo. According to him,
some nets may require different scaling in order to reach their full strength.
STC:
LLR: 2.93 (-2.94,2.94) {-0.25,1.25}
Total: 99856 W: 9669 L: 9401 D: 80786
Ptnml(0-2): 374, 7468, 34037, 7614, 435
https://tests.stockfishchess.org/tests/view/5fc2697642a050a89f02c8ec
LTC:
LLR: 2.96 (-2.94,2.94) {0.25,1.25}
Total: 29840 W: 1220 L: 1081 D: 27539
Ptnml(0-2): 10, 969, 12827, 1100, 14
https://tests.stockfishchess.org/tests/view/5fc2ea5142a050a89f02c957
Bench: 3561701
This patch removes the incrementally updated piece lists from the Position object.
This has been tried before but always failed. My reasons for trying again are:
* 32-bit systems (including phones) are now much less important than they were some years ago (and are absent from fishtest);
* NNUE may have made SF less finely tuned to the order in which moves were generated.
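
Without piece lists, piece squares can be enumerated directly from a bitboard. The following is a portable sketch of that pattern; pop_lsb is written as a plain loop here rather than with a compiler intrinsic, and squares_of is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bitboard = std::uint64_t;

// Returns the index of the least significant set bit and clears it.
int pop_lsb(Bitboard& b) {
    assert(b != 0);
    int sq = 0;
    while (!((b >> sq) & 1))
        ++sq;
    b &= b - 1;  // clear the lowest set bit
    return sq;
}

// Enumerates the squares of all pieces in the bitboard, lowest square first.
std::vector<int> squares_of(Bitboard pieces) {
    std::vector<int> out;
    while (pieces)
        out.push_back(pop_lsb(pieces));
    return out;
}
```

Note that squares always come out in ascending order, whereas an incrementally updated list yields an order that depends on the moves played; this is why the order of generated moves may change.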
STC:
LLR: 2.94 (-2.94,2.94) {-1.25,0.25}
Total: 55272 W: 5260 L: 5216 D: 44796
Ptnml(0-2): 208, 4147, 18898, 4159, 224
https://tests.stockfishchess.org/tests/view/5fc2986a42a050a89f02c926
LTC:
LLR: 2.96 (-2.94,2.94) {-0.75,0.25}
Total: 16600 W: 673 L: 608 D: 15319
Ptnml(0-2): 14, 533, 7138, 604, 11
https://tests.stockfishchess.org/tests/view/5fc2f98342a050a89f02c95c
closes https://github.com/official-stockfish/Stockfish/pull/3247
Bench: 3940967
+-----------------+
| . . . . . . . . | All files are closed. Some files are
| . . . . . o o . | more valuable for rooks, because
| . . . . o . . o | they might open in the future.
| . . . o x . . x |
| o . o x . x x . |
| x o x . . . . . | x our pawns
| . x . . . . . . | o their pawns
| . . . . . . . . | ^ rooks are scored higher on these files
+-----------------+
            ^ ^
Files containing none of our own pawns are open or half-open (otherwise
they are closed). Rooks on (half-)open files receive a bonus for the
future potential to act along all ranks.
This commit refines the (relative) penalty of rooks on closed files.
Files that contain one of our blocked pawns are considered less likely
to open in the future; rooks on these files are now penalized more strongly.
This bonus does not generally correlate with mobility. If the condition
is sufficiently refined in the future, it may be beneficial to adjust or
override mobility scores in some cases.
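
The file classification described above can be sketched with bitboards as follows. This is an illustration from White's point of view (square 0 = a1, square 63 = h8) and assumes that "blocked" means an enemy pawn directly in front, as in the diagram; RookFile, bit and classify_file are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;

constexpr Bitboard FileABB = 0x0101010101010101ULL;

// Bitboard with a single square set, e.g. bit('e', 5).
Bitboard bit(char file, int rank) {
    assert(file >= 'a' && file <= 'h' && rank >= 1 && rank <= 8);
    return 1ULL << ((rank - 1) * 8 + (file - 'a'));
}

enum RookFile { OPEN_OR_HALF_OPEN, CLOSED, CLOSED_BLOCKED };

// file is 0 (a-file) to 7 (h-file).
RookFile classify_file(int file, Bitboard ourPawns, Bitboard theirPawns) {
    const Bitboard onFile = ourPawns & (FileABB << file);
    if (!onFile)
        return OPEN_OR_HALF_OPEN;  // none of our own pawns on this file
    // Shifting their pawns down one rank marks the squares they block.
    const Bitboard blocked = onFile & (theirPawns >> 8);
    return blocked ? CLOSED_BLOCKED : CLOSED;
}
```

In the diagram, the e-file pawn is blocked while the f-file and g-file pawns are not, so a rook on the f-file or g-file would receive the smaller penalty.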
LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 494384 W: 71565 L: 70231 D: 352588
Ptnml(0-2): 3907, 48050, 142118, 49036, 4081
https://tests.stockfishchess.org/tests/view/5fb9312e67cbf42301d6afb9
LTC (non-regression w/ book noob_3moves.epd)
LLR: 2.95 (-2.94,2.94) {-0.75,0.25}
Total: 208520 W: 27044 L: 26937 D: 154539
Ptnml(0-2): 1557, 19850, 61391, 19853, 1609
https://tests.stockfishchess.org/tests/view/5fc01ced67cbf42301d6b3df
STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 98392 W: 20269 L: 19868 D: 58255
Ptnml(0-2): 1804, 11297, 22589, 11706, 1800
https://tests.stockfishchess.org/tests/view/5fb7f88a67cbf42301d6af10
closes https://github.com/official-stockfish/Stockfish/pull/3242
Bench: 3682630
in affine transform for AVX512/AVX2/SSSE3
The idea is to initialize sum with the first element instead of zero.
This saves one add_epi32 and one setzero SIMD instruction for each output dimension.
sum = 0; for i = 1 to n sum += a[i] ->
sum = a[1]; for i = 2 to n sum += a[i]
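
The same transformation in scalar C++, as a sketch; the function names are hypothetical, and the real code operates on SIMD registers rather than ints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: start from zero, then add every element (n additions).
int sum_from_zero(const std::vector<int>& a) {
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += a[i];
    return sum;
}

// After: start from the first element (n - 1 additions, no zeroing).
int sum_from_first(const std::vector<int>& a) {
    assert(!a.empty());
    int sum = a[0];
    for (std::size_t i = 1; i < a.size(); ++i)
        sum += a[i];
    return sum;
}
```

Both functions return the same value for any non-empty input; the second variant requires at least one element.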
STC:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 69048 W: 7024 L: 6799 D: 55225
Ptnml(0-2): 260, 5175, 23458, 5342, 289
https://tests.stockfishchess.org/tests/view/5faf2cf467cbf42301d6aa06
closes https://github.com/official-stockfish/Stockfish/pull/3227
No functional change.
This is a follow-up of the recent qsearch pruning patch in
a260c9a8a2
We now use the same guard condition (testing that we already have a defense with
a score better than a TB loss) for all pruning heuristics in qsearch().
This allows some pruning when in check, but in a controlled way to ensure that
no wrong mate scores appear.
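
A sketch of the guard condition, assuming the usual value layout (the constants below are assumptions made for illustration) and a hypothetical helper may_prune that wraps any individual pruning heuristic.

```cpp
#include <cassert>

// Assumed value layout, for illustration only.
constexpr int VALUE_MATE = 32000;
constexpr int VALUE_INFINITE = 32001;
constexpr int MAX_PLY = 246;
constexpr int VALUE_TB_WIN_IN_MAX_PLY = VALUE_MATE - 2 * MAX_PLY;
constexpr int VALUE_TB_LOSS_IN_MAX_PLY = -VALUE_TB_WIN_IN_MAX_PLY;

// A heuristic may only prune once some move already scores better than a
// TB loss. When in check, bestValue starts at -VALUE_INFINITE, so nothing
// is pruned until at least one such defense has been found.
bool may_prune(int bestValue, bool heuristicWantsToPrune) {
    assert(bestValue >= -VALUE_INFINITE && bestValue <= VALUE_INFINITE);
    return bestValue > VALUE_TB_LOSS_IN_MAX_PLY && heuristicWantsToPrune;
}
```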
Tested with Elo-gaining bounds:
STC:
LLR: 2.97 (-2.94,2.94) {-0.25,1.25}
Total: 22632 W: 2433 L: 2264 D: 17935
Ptnml(0-2): 98, 1744, 7487, 1865, 122
https://tests.stockfishchess.org/tests/view/5fa59405936c54e11ec99515
LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 105432 W: 4965 L: 4648 D: 95819
Ptnml(0-2): 85, 4110, 44011, 4423, 87
https://tests.stockfishchess.org/tests/view/5fa5b609936c54e11ec9952a
closes https://github.com/official-stockfish/Stockfish/pull/3221
Bench: 3578092
For the feature transformer the code is analogous to AVX2 since there was room for easy adaptation of wider simd registers.
For the smaller affine transforms that have 32 byte stride we keep 2 columns in one zmm register. We also unroll more aggressively so that in the end we have to do 16 parallel horizontal additions on ymm slices each consisting of 4 32-bit integers. The slices are embedded in 8 zmm registers.
These changes provide about 1.5% speedup for AVX-512 builds.
Closes https://github.com/official-stockfish/Stockfish/pull/3218
No functional change.
Using no searching time in case of a single legal move is not beneficial from
a strength point of view, and this special case can be easily removed:
STC:
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 22472 W: 2458 L: 2357 D: 17657
Ptnml(0-2): 106, 1733, 7453, 1842, 102
https://tests.stockfishchess.org/tests/view/5f926cbc81eda81bd78cb6df
LTC:
LLR: 2.94 (-2.94,2.94) {-0.75,0.25}
Total: 37880 W: 1736 L: 1682 D: 34462
Ptnml(0-2): 22, 1392, 16057, 1448, 21
https://tests.stockfishchess.org/tests/view/5f92a26081eda81bd78cb6fe
The advantage of using the normal time management for a single legal move is that scores
reported for that move are reasonable; not searching leads to artifacts during games
(see e.g. https://tcec-chess.com/#div=sf&game=96&season=19)
The disadvantage of using normal time management for a single legal move is that thinking
times can be unnaturally long, making it 'painful to watch' in online tournaments.
This patch uses normal time management, but caps the used time to 500ms.
This should lead to reasonable scores, and be hardly perceptible.
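
The cap can be sketched as below. TimePoint and effective_time are hypothetical names; only the 500ms limit comes from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

using TimePoint = std::int64_t;  // milliseconds

// Normal time management decides plannedMs; with a single legal root
// move the result is additionally capped at 500ms.
TimePoint effective_time(TimePoint plannedMs, std::size_t legalRootMoves) {
    assert(plannedMs >= 0);
    if (legalRootMoves == 1)
        return std::min<TimePoint>(500, plannedMs);
    return plannedMs;
}
```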
closes https://github.com/official-stockfish/Stockfish/pull/3195
closes https://github.com/official-stockfish/Stockfish/pull/3183
variant of a patch suggested by SFisGOD
No functional change.
Only do countermove based pruning in qsearch if we already have a move with a better score than a TB loss.
This patch fixes a bug (introduced in 843a961) that incorrectly prunes moves if in check,
and adds an assert to make sure no wrong mate scores are given in the future.
It replaces a no-op moveCount check with a check for bestValue.
Initially discussed in #3171 and later in #3199, #3198 and #3210.
This PR effectively closes #3171
It also likely fixes #3196 where this causes user-visible incorrect TB scores,
which probably result from these incorrect mate scores.
Passed STC and LTC non-regression tests.
https://tests.stockfishchess.org/tests/view/5f9ef8dabca9bf35bae7f648
LLR: 2.93 (-2.94,2.94) {-1.25,0.25}
Total: 21672 W: 2339 L: 2230 D: 17103
Ptnml(0-2): 126, 1689, 7083, 1826, 112
https://tests.stockfishchess.org/tests/view/5f9f0caebca9bf35bae7f666
LLR: 2.97 (-2.94,2.94) {-0.75,0.25}
Total: 33152 W: 1551 L: 1485 D: 30116
Ptnml(0-2): 27, 1308, 13832, 1390, 19
closes https://github.com/official-stockfish/Stockfish/pull/3214
Bench: 3625915
A non-functional speedup. Unroll the loops going over
the output dimensions in the affine transform layers by
a factor of 4 and perform 4 horizontal additions at a time.
Instead of doing naive horizontal additions on each vector
separately use hadd and shuffling between vectors to reduce
the number of instructions by using all lanes for all stages
of the horizontal adds.
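
A portable scalar model of the lane pattern, for illustration only. Here hadd mimics what a 128-bit horizontal add such as _mm_hadd_epi32 computes on four 32-bit lanes; Vec4 and hadd_x4 are hypothetical names.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<int, 4>;

// Scalar model of a horizontal add: pairwise sums of a, then of b.
Vec4 hadd(const Vec4& a, const Vec4& b) {
    Vec4 r{};
    r[0] = a[0] + a[1];
    r[1] = a[2] + a[3];
    r[2] = b[0] + b[1];
    r[3] = b[2] + b[3];
    return r;
}

// Reduces four vectors at once: lane i of the result is the sum of all
// lanes of the i-th input. Every lane does useful work at every stage.
Vec4 hadd_x4(const Vec4& s0, const Vec4& s1, const Vec4& s2, const Vec4& s3) {
    return hadd(hadd(s0, s1), hadd(s2, s3));
}
```

Three hadd operations yield four sums here, whereas reducing each vector separately would take two per vector, eight in total.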
passed STC of the initial version:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 17808 W: 1914 L: 1756 D: 14138
Ptnml(0-2): 76, 1330, 5948, 1460, 90
https://tests.stockfishchess.org/tests/view/5f9d516f6a2c112b60691da3
passed STC of the final version after cleanup:
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 16296 W: 1750 L: 1595 D: 12951
Ptnml(0-2): 72, 1192, 5479, 1319, 86
https://tests.stockfishchess.org/tests/view/5f9df5776a2c112b60691de3
closes https://github.com/official-stockfish/Stockfish/pull/3203
No functional change
This patch was inspired by c065abd which updates the accumulator,
if possible, based on the accumulator of two plies back if
the accumulator of the preceding ply is not available.
With this patch we look back even further in the position history
in an attempt to reduce the number of complete recomputations.
When we find a usable accumulator for the position N plies back,
we also update the accumulator of the position N-1 plies back
because that accumulator is most likely to be helpful later
when evaluating positions in sibling branches.
By not updating all intermediate accumulators immediately,
we avoid doing too much work that is not certain to be useful.
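
The look-back strategy can be modelled roughly as follows. This sketch only tracks which accumulators are available; Ply, needsRefresh, find_base and update_current are hypothetical names, and needsRefresh stands in for any move that cannot be applied incrementally.

```cpp
#include <cassert>
#include <vector>

struct Ply {
    bool computed;      // accumulator for this position is available
    bool needsRefresh;  // the move leading here forces a full recomputation
};

// Index of the nearest position with a usable accumulator, or -1.
int find_base(const std::vector<Ply>& line) {
    assert(!line.empty());
    for (int i = static_cast<int>(line.size()) - 1; i >= 0; --i) {
        if (line[i].computed)
            return i;
        if (line[i].needsRefresh)
            return -1;  // cannot update incrementally across this move
    }
    return -1;
}

// Makes the accumulator of the last position available.
// Returns true if this was possible without a full recomputation.
bool update_current(std::vector<Ply>& line) {
    const int cur = static_cast<int>(line.size()) - 1;
    const int base = find_base(line);
    if (base < 0) {
        line[cur].computed = true;  // complete recomputation
        return false;
    }
    if (base + 1 < cur)
        line[base + 1].computed = true;  // position N-1 plies back
    line[cur].computed = true;  // other intermediate plies stay untouched
    return true;
}
```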
Overall, roughly 2-3% speedup.
This patch makes the code more specific to the net architecture:
changing the input features of the net will require additional changes
to the incremental update code, as discussed in PRs #3193 and #3191.
Passed STC:
https://tests.stockfishchess.org/tests/view/5f9056712c92c7fe3a8c60d0
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 10040 W: 1116 L: 968 D: 7956
Ptnml(0-2): 42, 722, 3365, 828, 63
closes https://github.com/official-stockfish/Stockfish/pull/3193
No functional change.
The idea of this patch is as follows: when we have consecutive fail highs and reach late enough moves at the root node, the probability that the remaining quiet moves produce an even bigger value than the moves that produced the previous cutoff should be quite small, so we can reduce them more. (Those earlier moves should be high in the move ordering, yet they now fail to produce a beta cutoff, since we have actually reached a high move count.)
passed STC
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 53392 W: 5681 L: 5474 D: 42237
Ptnml(0-2): 214, 4104, 17894, 4229, 255
https://tests.stockfishchess.org/tests/view/5f88501adcdad978fe8c527e
passed LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.25}
Total: 59136 W: 2773 L: 2564 D: 53799
Ptnml(0-2): 30, 2117, 25078, 2300, 43
https://tests.stockfishchess.org/tests/view/5f884dbfdcdad978fe8c527a
closes https://github.com/official-stockfish/Stockfish/pull/3184
Bench: 4066972
We now include the total pawn count in the scaling factor for the output
of the NNUE evaluation network. This should have the effect of trying to
keep more pawns when SF has the advantage, but exchange them when she
is defending.
Thanks to Alexander Pagel (Lolligerhans) for the idea of using the
value of pawns to ease the comparison with the rest of the material
estimation.
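
A sketch of such a scaling, with loudly assumed numbers: the pawn value of 126 and the constants 679, 32 and 1024 are illustrative placeholders, and scaled_nnue is a hypothetical name. The point is only that pawns enter the material term weighted by their value.

```cpp
#include <cassert>

// Illustrative placeholder, not taken from the description above.
constexpr int PawnValueMg = 126;

// Scales the raw network output by a factor that grows with material,
// where pawns are counted at their (assumed) middlegame value.
int scaled_nnue(int rawNnue, int nonPawnMaterial, int pawnCount) {
    assert(nonPawnMaterial >= 0 && pawnCount >= 0 && pawnCount <= 16);
    const int material = nonPawnMaterial + PawnValueMg * pawnCount;
    return rawNnue * (679 + material / 32) / 1024;
}
```

With a positive raw score, more pawns give a larger scaled score; with a negative raw score, more pawns make it more negative, which is what favors exchanging pawns when defending.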
Passed STC:
LLR: 2.93 (-2.94,2.94) {-0.25,1.25}
Total: 15072 W: 1700 L: 1539 D: 11833
Ptnml(0-2): 65, 1202, 4845, 1355, 69
https://tests.stockfishchess.org/tests/view/5f7235a63b22d6afa50699b3
Passed LTC:
LLR: 2.93 (-2.94,2.94) {0.25,1.25}
Total: 25880 W: 1270 L: 1124 D: 23486
Ptnml(0-2): 23, 980, 10788, 1126, 23
https://tests.stockfishchess.org/tests/view/5f723b483b22d6afa5069a99
closes https://github.com/official-stockfish/Stockfish/pull/3164
Bench: 3776081
The idea is that division by a power of 2 is slightly faster than division by other numbers, so the parameters are adjusted so that the depth reduction in null move pruning divides by 256 instead of 213.
Other than that, this patch is almost non-functional: differences only start to appear at depth 133.
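
The depth at which the two formulas diverge can be checked with a small sketch. The coefficients 817, 71, 982 and 85 are recalled from the engine source of that period and should be treated as assumptions; only the divisors 213 and 256 are stated above. The function names are hypothetical, and only the depth-dependent term of the reduction is modelled.

```cpp
#include <cassert>

// Depth-dependent part of the null move reduction, before the change.
int reduction_div213(int depth) { return (817 + 71 * depth) / 213; }

// The same term after rescaling so that the divisor is a power of 2.
int reduction_div256(int depth) { return (982 + 85 * depth) / 256; }

// First depth in [0, maxDepth] where the two variants differ, or -1.
int first_difference(int maxDepth) {
    assert(maxDepth >= 0);
    for (int d = 0; d <= maxDepth; ++d)
        if (reduction_div213(d) != reduction_div256(d))
            return d;
    return -1;
}
```

With these assumed coefficients, first_difference(245) returns 133, in line with the depth mentioned above.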
passed STC
https://tests.stockfishchess.org/tests/view/5f70dd943b22d6afa50693c5
LLR: 2.95 (-2.94,2.94) {-0.25,1.25}
Total: 57048 W: 6616 L: 6392 D: 44040
Ptnml(0-2): 304, 4583, 18531, 4797, 309
passed LTC
https://tests.stockfishchess.org/tests/view/5f7180db3b22d6afa506941f
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 45960 W: 2419 L: 2229 D: 41312
Ptnml(0-2): 43, 1779, 19137, 1987, 34
closes https://github.com/official-stockfish/Stockfish/pull/3159
bench 3789924