To avoid output that depends on timing, output currmove and similar only from depth > 30
onward. Current choice of 3s makes the output of the same search depending on
the system load, and doesn't always start at move 1. Depth 30 is nowadays
reached in a few seconds on most systems.
closes https://github.com/official-stockfish/Stockfish/pull/5436
No functional change
This patch introduces history updates to probcut. Standard depth - 3 bonus and
maluses are given to the capture that caused fail high and previously searched
captures, respectively. Similar to #5243, a negative history fill is applied to
compensate for an increase in capture history average, thus improving the
scaling of this patch.
Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 84832 W: 21941 L: 21556 D: 41335
Ptnml(0-2): 226, 9927, 21688, 10386, 189
https://tests.stockfishchess.org/tests/view/6682fab9389b9ee542b1d029
Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 104298 W: 26469 L: 26011 D: 51818
Ptnml(0-2): 43, 11458, 28677, 11940, 31
https://tests.stockfishchess.org/tests/view/6682ff06389b9ee542b1d0a0
closes https://github.com/official-stockfish/Stockfish/pull/5428
bench 1281351
try to avoid missing good moves for opponent or engine, by updating bestMove
also when value == bestValue (i.e. value == alpha) under certain conditions.
In particular require this is at higher depth in the tree, leaving the logic
near the root unchanged, and only apply randomly. Avoid doing this near mate
scores, leaving mate PVs intact.
Passed SMP STC 6+0.06 th7 :
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 42040 W: 10930 L: 10624 D: 20486
Ptnml(0-2): 28, 4682, 11289, 4998, 23
https://tests.stockfishchess.org/tests/view/66608b00c340c8eed7757d1d
Passed SMP LTC 24+0.24 th7 :
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 73692 W: 18978 L: 18600 D: 36114
Ptnml(0-2): 9, 7421, 21614, 7787, 15
https://tests.stockfishchess.org/tests/view/666095e8c340c8eed7757d49
closes https://github.com/official-stockfish/Stockfish/pull/5367
Bench 1205168
In case a stop is received during multithreaded searches, the PV of the best
thread might be printed without the correct upperbound/lowerbound indicators.
This was due to the pvIdx variable being incremented after receiving the stop.
passed STC:
https://tests.stockfishchess.org/tests/view/666985da602682471b064d08
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 196576 W: 51039 L: 50996 D: 94541
Ptnml(0-2): 760, 22545, 51603, 22652, 728
closes https://github.com/official-stockfish/Stockfish/pull/5391
Bench: 1160467
currently extensions can cause depth to exceed MAX_PLY.
This triggers the assert near line 542 in search when running a binary compiled with `debug=yes` on a testcase like:
```
position fen 7K/P1p1p1p1/2P1P1Pk/6pP/3p2P1/1P6/3P4/8 w - - 0 1
go nodes 1000000
```
passed STC
https://tests.stockfishchess.org/tests/view/6668a56a602682471b064c8d
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 143936 W: 37338 L: 37238 D: 69360
Ptnml(0-2): 514, 16335, 38149, 16477, 493
closes https://github.com/official-stockfish/Stockfish/pull/5383
Bench: 1160467
This reverts commit 783dfc2eb2.
could lead to a division by zero for:
ttValue = (ttValue * tte->depth() + beta) / (tte->depth() + 1)
as other threads can overwrite the tte with a QS depth of -1.
closes https://github.com/official-stockfish/Stockfish/pull/5338
Bench: 1280020
This is reintroduction of the recently simplified logic - if positive tt cutoff
occurs return not a tt value but smth between it and beta. Difference is that
instead of static linear combination there we use basically the same formula as
we do in the main search - with the only difference being using tt depth
instead of depth, which makes a lot of sense.
Passed STC:
https://tests.stockfishchess.org/tests/view/665b3a34f4a1fd0c208ea870
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 54944 W: 14239 L: 13896 D: 26809
Ptnml(0-2): 151, 6407, 14008, 6760, 146
Passed LTC:
https://tests.stockfishchess.org/tests/view/665b520011645bd3d3fac341
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 90540 W: 23070 L: 22640 D: 44830
Ptnml(0-2): 39, 9903, 24965, 10315, 48
closes https://github.com/official-stockfish/Stockfish/pull/5336
bench 1381237
The idea is, that if we have the information that the singular search failed low and therefore produced an upperbound score, we can use the score from singularsearch as approximate upperbound as to what bestValue our non ttMoves will produce. If this value is well below alpha, we assume that all non-ttMoves will score below alpha and therfore can skip more moves.
This patch also sets up variables for future patches wanting to use teh singular search result outside of singular extensions, in singularBound and singularValue, meaning further patches using this search result to affect various pruning techniques can be tried.
Passed STC:
https://tests.stockfishchess.org/tests/view/6658d13e6b0e318cefa90120
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 85632 W: 22112 L: 21725 D: 41795
Ptnml(0-2): 243, 10010, 21947, 10349, 267
Passed LTC:
https://tests.stockfishchess.org/tests/view/6658dd356b0e318cefa9016a
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 243978 W: 62014 L: 61272 D: 120692
Ptnml(0-2): 128, 26598, 67791, 27348, 124
closes https://github.com/official-stockfish/Stockfish/pull/5325
bench 1397172
As stockfish nets and search evolve, the existing time control appears
to give too little time at STC, roughly correct at LTC, and too little
at VLTC+.
This change adds an adjustment to the optExtra calculation. This
adjustment is easy to retune and refine, so it should be easier to keep
up-to-date than the more complex calculations used for optConstant and
optScale.
Passed STC 10+0.1:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 169568 W: 43803 L: 43295 D: 82470
Ptnml(0-2): 485, 19679, 44055, 19973, 592
https://tests.stockfishchess.org/tests/view/66531865a86388d5e27da9fa
Yellow LTC 60+0.6:
LLR: -2.94 (-2.94,2.94) <0.50,2.50>
Total: 209970 W: 53087 L: 52914 D: 103969
Ptnml(0-2): 91, 19652, 65314, 19849, 79
https://tests.stockfishchess.org/tests/view/6653e38ba86388d5e27daaa0
Passed VLTC 180+1.8 :
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 85618 W: 21735 L: 21342 D: 42541
Ptnml(0-2): 15, 8267, 25848, 8668, 11
https://tests.stockfishchess.org/tests/view/6655131da86388d5e27db95f
closes https://github.com/official-stockfish/Stockfish/pull/5297
Bench: 1212167
Allow for NUMA memory replication for NNUE weights. Bind threads to ensure execution on a specific NUMA node.
This patch introduces NUMA memory replication, currently only utilized for the NNUE weights. Along with it comes all machinery required to identify NUMA nodes and bind threads to specific processors/nodes. It also comes with small changes to Thread and ThreadPool to allow easier execution of custom functions on the designated thread. Old thread binding (WinProcGroup) machinery is removed because it's incompatible with this patch. Small changes to unrelated parts of the code were made to ensure correctness, like some classes being made unmovable, raw pointers replaced with unique_ptr. etc.
Windows 7 and Windows 10 is partially supported. Windows 11 is fully supported. Linux is fully supported, with explicit exclusion of Android. No additional dependencies.
-----------------
A new UCI option `NumaPolicy` is introduced. It can take the following values:
```
system - gathers NUMA node information from the system (lscpu or windows api), for each threads binds it to a single NUMA node
none - assumes there is 1 NUMA node, never binds threads
auto - this is the default value, depends on the number of set threads and NUMA nodes, will only enable binding on multinode systems and when the number of threads reaches a threshold (dependent on node size and count)
[[custom]] -
// ':'-separated numa nodes
// ','-separated cpu indices
// supports "first-last" range syntax for cpu indices,
for example '0-15,32-47:16-31,48-63'
```
Setting `NumaPolicy` forces recreation of the threads in the ThreadPool, which in turn forces the recreation of the TT.
The threads are distributed among NUMA nodes in a round-robin fashion based on fill percentage (i.e. it will strive to fill all NUMA nodes evenly). Threads are bound to NUMA nodes, not specific processors, because that's our only requirement and the OS can schedule them better.
Special care is made that maximum memory usage on systems that do not require memory replication stays as previously, that is, unnecessary copies are avoided.
On linux the process' processor affinity is respected. This means that if you for example use taskset to restrict Stockfish to a single NUMA node then the `system` and `auto` settings will only see a single NUMA node (more precisely, the processors included in the current affinity mask) and act accordingly.
-----------------
We can't ensure that a memory allocation takes place on a given NUMA node without using libnuma on linux, or using appropriate custom allocators on windows (https://learn.microsoft.com/en-us/windows/win32/memory/allocating-memory-from-a-numa-node), so to avoid complications the current implementation relies on first-touch policy. Due to this we also rely on the memory allocator to give us a new chunk of untouched memory from the system. This appears to work reliably on linux, but results may vary.
MacOS is not supported, because AFAIK it's not affected, and implementation would be problematic anyway.
Windows is supported since Windows 7 (https://learn.microsoft.com/en-us/windows/win32/api/processtopologyapi/nf-processtopologyapi-setthreadgroupaffinity). Until Windows 11/Server 2022 NUMA nodes are split such that they cannot span processor groups. This is because before Windows 11/Server 2022 it's not possible to set thread affinity spanning processor groups. The splitting is done manually in some cases (required after Windows 10 Build 20348). Since Windows 11/Server 2022 we can set affinites spanning processor group so this splitting is not done, so the behaviour is pretty much like on linux.
Linux is supported, **without** libnuma requirement. `lscpu` is expected.
-----------------
Passed 60+1 @ 256t 16000MB hash: https://tests.stockfishchess.org/tests/view/6654e443a86388d5e27db0d8
```
LLR: 2.95 (-2.94,2.94) <0.00,10.00>
Total: 278 W: 110 L: 29 D: 139
Ptnml(0-2): 0, 1, 56, 82, 0
```
Passed SMP STC: https://tests.stockfishchess.org/tests/view/6654fc74a86388d5e27db1cd
```
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 67152 W: 17354 L: 17177 D: 32621
Ptnml(0-2): 64, 7428, 18408, 7619, 57
```
Passed STC: https://tests.stockfishchess.org/tests/view/6654fb27a86388d5e27db15c
```
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 131648 W: 34155 L: 34045 D: 63448
Ptnml(0-2): 426, 13878, 37096, 14008, 416
```
fixes#5253
closes https://github.com/official-stockfish/Stockfish/pull/5285
No functional change