Currently (after #5407), SF has the property that any PV line with a decisive
TB score contains the corresponding TB position, with a score that correctly
identifies the depth at which TB are entered. The PV line that follows might
not preserve the game outcome, but can easily be verified and extended based on
TB information. This patch provides this functionality, simply extending the PV
lines on output, this doesn't affect search.
Indeed, if DTZ tables are available, search based PV lines that correspond to
decisive TB scores are verified to preserve game outcome, truncating the line
as needed. Subsequently, such PV lines are extended with a game outcome
preserving line until mate, as a possible continuation. These lines are not
optimal mating lines, but are similar to what a user could produce on a website
like https://syzygy-tables.info/ clicking always the top ranked move, i.e.
minimizing or maximizing DTZ (with a simple tie-breaker for moves that have
identical DTZ), and are thus an just an illustration of how to game can be won.
A similar approach is already in established in
https://github.com/joergoster/Stockfish/tree/matefish2
This also contributes to addressing #5175 where SF can give an incorrect TB
win/loss for positions in TB with a movecounter that doesn't reflect optimal
play. While the full solution requires either TB generated differently, or a
search when ranking rootmoves, current SF will eventually find a draw in these
cases, in practice quite quickly, e.g.
`1kq5/q2r4/5K2/8/8/8/8/7Q w - - 96 1`
`8/8/6k1/3B4/3K4/4N3/8/8 w - - 54 106`
Gives the same results as master on an extended set of test positions from
9173d29c41
with the exception of the above mentioned fen where this commit improves.
With https://github.com/vondele/matetrack using 6men TB, all generated PVs verify:
```
Using ../Stockfish/src/stockfish.syzygyExtend on matetrack.epd with --nodes 1000000 --syzygyPath /chess/syzygy/3-4-5-6/WDL:/chess/syzygy/3-4-5-6/DTZ
Engine ID: Stockfish dev-20240704-ff227954
Total FENs: 6555
Found mates: 3299
Best mates: 2582
Found TB wins: 568
```
As repeated DTZ probing could be slow a procedure (100ms+ on HDD, a few ms on
SSD), the extension is only done as long as the time taken is less than half
the `Move Overhead` parameter. For tournaments where these lines might be of
interest to the user, a suitable `Move Overhead` might be needed (e.g. TCEC has
1000ms already).
closes https://github.com/official-stockfish/Stockfish/pull/5414
No functional change
As stockfish nets and search evolve, the existing time control appears
to give too little time at STC, roughly correct at LTC, and too little
at VLTC+.
This change adds an adjustment to the optExtra calculation. This
adjustment is easy to retune and refine, so it should be easier to keep
up-to-date than the more complex calculations used for optConstant and
optScale.
Passed STC 10+0.1:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 169568 W: 43803 L: 43295 D: 82470
Ptnml(0-2): 485, 19679, 44055, 19973, 592
https://tests.stockfishchess.org/tests/view/66531865a86388d5e27da9fa
Yellow LTC 60+0.6:
LLR: -2.94 (-2.94,2.94) <0.50,2.50>
Total: 209970 W: 53087 L: 52914 D: 103969
Ptnml(0-2): 91, 19652, 65314, 19849, 79
https://tests.stockfishchess.org/tests/view/6653e38ba86388d5e27daaa0
Passed VLTC 180+1.8 :
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 85618 W: 21735 L: 21342 D: 42541
Ptnml(0-2): 15, 8267, 25848, 8668, 11
https://tests.stockfishchess.org/tests/view/6655131da86388d5e27db95f
closes https://github.com/official-stockfish/Stockfish/pull/5297
Bench: 1212167
Allow for NUMA memory replication for NNUE weights. Bind threads to ensure execution on a specific NUMA node.
This patch introduces NUMA memory replication, currently only utilized for the NNUE weights. Along with it comes all machinery required to identify NUMA nodes and bind threads to specific processors/nodes. It also comes with small changes to Thread and ThreadPool to allow easier execution of custom functions on the designated thread. Old thread binding (WinProcGroup) machinery is removed because it's incompatible with this patch. Small changes to unrelated parts of the code were made to ensure correctness, like some classes being made unmovable, raw pointers replaced with unique_ptr. etc.
Windows 7 and Windows 10 is partially supported. Windows 11 is fully supported. Linux is fully supported, with explicit exclusion of Android. No additional dependencies.
-----------------
A new UCI option `NumaPolicy` is introduced. It can take the following values:
```
system - gathers NUMA node information from the system (lscpu or windows api), for each threads binds it to a single NUMA node
none - assumes there is 1 NUMA node, never binds threads
auto - this is the default value, depends on the number of set threads and NUMA nodes, will only enable binding on multinode systems and when the number of threads reaches a threshold (dependent on node size and count)
[[custom]] -
// ':'-separated numa nodes
// ','-separated cpu indices
// supports "first-last" range syntax for cpu indices,
for example '0-15,32-47:16-31,48-63'
```
Setting `NumaPolicy` forces recreation of the threads in the ThreadPool, which in turn forces the recreation of the TT.
The threads are distributed among NUMA nodes in a round-robin fashion based on fill percentage (i.e. it will strive to fill all NUMA nodes evenly). Threads are bound to NUMA nodes, not specific processors, because that's our only requirement and the OS can schedule them better.
Special care is made that maximum memory usage on systems that do not require memory replication stays as previously, that is, unnecessary copies are avoided.
On linux the process' processor affinity is respected. This means that if you for example use taskset to restrict Stockfish to a single NUMA node then the `system` and `auto` settings will only see a single NUMA node (more precisely, the processors included in the current affinity mask) and act accordingly.
-----------------
We can't ensure that a memory allocation takes place on a given NUMA node without using libnuma on linux, or using appropriate custom allocators on windows (https://learn.microsoft.com/en-us/windows/win32/memory/allocating-memory-from-a-numa-node), so to avoid complications the current implementation relies on first-touch policy. Due to this we also rely on the memory allocator to give us a new chunk of untouched memory from the system. This appears to work reliably on linux, but results may vary.
MacOS is not supported, because AFAIK it's not affected, and implementation would be problematic anyway.
Windows is supported since Windows 7 (https://learn.microsoft.com/en-us/windows/win32/api/processtopologyapi/nf-processtopologyapi-setthreadgroupaffinity). Until Windows 11/Server 2022 NUMA nodes are split such that they cannot span processor groups. This is because before Windows 11/Server 2022 it's not possible to set thread affinity spanning processor groups. The splitting is done manually in some cases (required after Windows 10 Build 20348). Since Windows 11/Server 2022 we can set affinites spanning processor group so this splitting is not done, so the behaviour is pretty much like on linux.
Linux is supported, **without** libnuma requirement. `lscpu` is expected.
-----------------
Passed 60+1 @ 256t 16000MB hash: https://tests.stockfishchess.org/tests/view/6654e443a86388d5e27db0d8
```
LLR: 2.95 (-2.94,2.94) <0.00,10.00>
Total: 278 W: 110 L: 29 D: 139
Ptnml(0-2): 0, 1, 56, 82, 0
```
Passed SMP STC: https://tests.stockfishchess.org/tests/view/6654fc74a86388d5e27db1cd
```
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 67152 W: 17354 L: 17177 D: 32621
Ptnml(0-2): 64, 7428, 18408, 7619, 57
```
Passed STC: https://tests.stockfishchess.org/tests/view/6654fb27a86388d5e27db15c
```
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 131648 W: 34155 L: 34045 D: 63448
Ptnml(0-2): 426, 13878, 37096, 14008, 416
```
fixes#5253
closes https://github.com/official-stockfish/Stockfish/pull/5285
No functional change
Stockfish appears to take too much time on the first move of a game and
then not enough on moves 2,3,4... Probably caused by most of the factors
that increase time usually applying on the first move.
Attempts to give more time to the subsequent moves have not worked so
far, but this change to simply reduce first move time by 5% worked.
STC 10+0.1 :
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 78496 W: 20516 L: 20135 D: 37845
Ptnml(0-2): 340, 8859, 20456, 9266, 327
https://tests.stockfishchess.org/tests/view/663d47bf507ebe1c0e9200ba
LTC 60+0.6 :
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 94872 W: 24179 L: 23751 D: 46942
Ptnml(0-2): 61, 9743, 27405, 10161, 66
https://tests.stockfishchess.org/tests/view/663e779cbb28828150dd9089
closes https://github.com/official-stockfish/Stockfish/pull/5235
Bench: 1876282
1. The current time management system utilizes limits.inc and
limits.time, which can represent either milliseconds or node count,
depending on whether the nodestime option is active. There have been
several modifications which brought Elo gain for typical uses (i.e.
real-time matches), however some of these changes overlooked such
distinction. This patch adjusts constants and multiplication/division to
more accurately simulate real TC conditions when nodestime is used.
2. The advance_nodes_time function has a bug that can extend the time
limit when availableNodes reaches exact zero. This patch fixes the bug
by initializing the variable to -1 and make sure it does not go below
zero.
3. elapsed_time function is newly introduced to print PV in the UCI
output based on real time. This makes PV output more consistent with the
behavior of trivial use cases.
closes https://github.com/official-stockfish/Stockfish/pull/5186
No functional changes
For each thread persist an accumulator cache for the network, where each
cache contains multiple entries for each of the possible king squares.
When the accumulator needs to be refreshed, the cached entry is used to more
efficiently update the accumulator, instead of rebuilding it from scratch.
This idea, was first described by Luecx (author of Koivisto) and
is commonly referred to as "Finny Tables".
When the accumulator needs to be refreshed, instead of filling it with
biases and adding every piece from scratch, we...
1. Take the `AccumulatorRefreshEntry` associated with the new king bucket
2. Calculate the features to activate and deactivate (from differences
between bitboards in the entry and bitboards of the actual position)
3. Apply the updates on the refresh entry
4. Copy the content of the refresh entry accumulator to the accumulator
we were refreshing
5. Copy the bitboards from the position to the refresh entry, to match
the newly updated accumulator
Results at STC:
https://tests.stockfishchess.org/tests/view/662301573fe04ce4cefc1386
(first version)
https://tests.stockfishchess.org/tests/view/6627fa063fe04ce4cefc6560
(final)
Non-Regression between first and final:
https://tests.stockfishchess.org/tests/view/662801e33fe04ce4cefc660a
STC SMP:
https://tests.stockfishchess.org/tests/view/662808133fe04ce4cefc667c
closes https://github.com/official-stockfish/Stockfish/pull/5183
No functional change
Also fixes searchmoves.
Drop the need of a Position object in uci.cpp.
A side note, it is still required for the static functions,
but these should be moved to a different namespace/class
later on, since sf kinda relies on them.
closes https://github.com/official-stockfish/Stockfish/pull/5169
No functional change
Part 2 of the Split UCI into UCIEngine and Engine refactor.
This creates function callbacks for search to use when an update should occur.
The benching in uci.cpp for example does this to extract the total nodes
searched.
No functional change
- fix naming convention for `workingDirectory`
- use type alias for `EvalFiles` everywhere
- move `ponderMode` into `LimitsType`
- move limits parsing into standalone static function
closes https://github.com/official-stockfish/Stockfish/pull/5098
No functional change
This introduces a form of node counting which can
be used to further tweak the usage of our search
time.
The current approach stops the search when almost
all nodes are searched on a single move.
The idea originally came from Koivisto, but the
implemention is a bit different, Koivisto scales
the optimal time by the nodes effort and then
determines if the search should be stopped.
We just scale down the `totalTime` and stop the
search if we exceed it and the effort is large
enough.
Passed STC:
https://tests.stockfishchess.org/tests/view/65c8e0661d8e83c78bfcd5ec
LLR: 2.97 (-2.94,2.94) <0.00,2.00>
Total: 88672 W: 22907 L: 22512 D: 43253
Ptnml(0-2): 310, 10163, 23041, 10466, 356
Passed LTC:
https://tests.stockfishchess.org/tests/view/65ca632b1d8e83c78bfcf554
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 170856 W: 42910 L: 42320 D: 85626
Ptnml(0-2): 104, 18337, 47960, 18919, 108
closes https://github.com/official-stockfish/Stockfish/pull/5053
Bench: 1198939
This addresses the issue where Stockfish may output non-proven checkmate
scores if the search is prematurely halted, either due to a time control
or node limit, before it explores other possibilities where the
checkmate score could have been delayed or refuted.
The fix also replaces staving off from proven mated scores in a
multithread environment making use of the threads instead of a negative
effect with multithreads (1t was better in proving mated in scores than
more threads).
Issue reported on mate tracker repo by and this PR is co-authored with
@robertnurnberg Special thanks to @AndyGrant for outlining that a fix is
eventually possible.
Passed Adj off SMP STC:
https://tests.stockfishchess.org/tests/view/65a125d779aa8af82b96c3eb
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 303256 W: 75823 L: 75892 D: 151541
Ptnml(0-2): 406, 35269, 80395, 35104, 454
Passed Adj off SMP LTC:
https://tests.stockfishchess.org/tests/view/65a37add79aa8af82b96f0f7
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 56056 W: 13951 L: 13770 D: 28335
Ptnml(0-2): 11, 5910, 16002, 6097, 8
Passed all tests in matetrack without any better mate for opponent found in 1t and multithreads.
Fixed bugs in https://github.com/official-stockfish/Stockfish/pull/4976
closes https://github.com/official-stockfish/Stockfish/pull/4990
Bench: 1308279
Co-Authored-By: Robert Nürnberg <28635489+robertnurnberg@users.noreply.github.com>
Also remove dead code, `rootSimpleEval` is no longer used since the introduction of dual net.
`iterBestValue` is also no longer used in evaluate and can be reduced to a local variable.
closes https://github.com/official-stockfish/Stockfish/pull/4979
No functional change
This aims to remove some of the annoying global structure which Stockfish has.
Overall there is no major elo regression to be expected.
Non regression SMP STC (paused, early version):
https://tests.stockfishchess.org/tests/view/65983d7979aa8af82b9608f1
LLR: 0.23 (-2.94,2.94) <-1.75,0.25>
Total: 76232 W: 19035 L: 19096 D: 38101
Ptnml(0-2): 92, 8735, 20515, 8690, 84
Non regression STC (early version):
https://tests.stockfishchess.org/tests/view/6595b3a479aa8af82b95da7f
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 185344 W: 47027 L: 46972 D: 91345
Ptnml(0-2): 571, 21285, 48943, 21264, 609
Non regression SMP STC:
https://tests.stockfishchess.org/tests/view/65a0715c79aa8af82b96b7e4
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 142936 W: 35761 L: 35662 D: 71513
Ptnml(0-2): 209, 16400, 38135, 16531, 193
These global structures/variables add hidden dependencies and allow data
to be mutable from where it shouldn't it be (i.e. options). They also
prevent Stockfish from internal selfplay, which would be a nice thing to
be able to do, i.e. instantiate two Stockfish instances and let them
play against each other. It will also allow us to make Stockfish a
library, which can be easier used on other platforms.
For consistency with the old search code, `thisThread` has been kept,
even though it is not strictly necessary anymore. This the first major
refactor of this kind (in recent time), and future changes are required,
to achieve the previously described goals. This includes cleaning up the
dependencies, transforming the network to be self contained and coming
up with a plan to deal with proper tablebase memory management (see
comments for more information on this).
The removal of these global structures has been discussed in parts with
Vondele and Sopel.
closes https://github.com/official-stockfish/Stockfish/pull/4968
No functional change
- remove the blank line between the declaration of the function and it's
comment, leads to better IDE support when hovering over a function to see it's
description
- remove the unnecessary duplication of the function name in the functions
description
- slightly refactored code for lsb, msb in bitboard.h There are still a few
things we can be improved later on, move the description of a function where
it was declared (instead of implemented) and add descriptions to functions
which are behind macros ifdefs
closes https://github.com/official-stockfish/Stockfish/pull/4840
No functional change
This introduces clang-format to enforce a consistent code style for Stockfish.
Having a documented and consistent style across the code will make contributing easier
for new developers, and will make larger changes to the codebase easier to make.
To facilitate formatting, this PR includes a Makefile target (`make format`) to format the code,
this requires clang-format (version 17 currently) to be installed locally.
Installing clang-format is straightforward on most OS and distros
(e.g. with https://apt.llvm.org/, brew install clang-format, etc), as this is part of quite commonly
used suite of tools and compilers (llvm / clang).
Additionally, a CI action is present that will verify if the code requires formatting,
and comment on the PR as needed. Initially, correct formatting is not required, it will be
done by maintainers as part of the merge or in later commits, but obviously this is encouraged.
fixes https://github.com/official-stockfish/Stockfish/issues/3608
closes https://github.com/official-stockfish/Stockfish/pull/4790
Co-Authored-By: Joost VandeVondele <Joost.VandeVondele@gmail.com>
fixes the lowerbound/upperbound output by avoiding
scores outside the alpha,beta bracket. Since SF search
uses fail-soft we can't simply take the returned value
as score.
closes https://github.com/official-stockfish/Stockfish/pull/4259
No functional change
fixes the lowerbound/upperbound output by taking the alpha,beta bracket
into account also if a bestThread is selected that is different from the master thread.
Instead of keeping track which bounds where used in the specific search,
in this version we simply store the quality (exact, upperbound,
lowerbound) of the score along with the actual score as information on
rootMove.
closes https://github.com/official-stockfish/Stockfish/pull/4239
No functional change
Maintain for each root move an exponential average of the search value with a weight ratio of 2:1 (new value vs old values). Then the average score is used as the center of the initial aspiration window instead of the previous score.
Stats indicate (see PR) that the deviation for previous score is in general greater than using average score, so later seems a better estimation of the next search value. This is probably the reason this patch succeded besides smoothing the sometimes wild swings in search score. An additional observation is that at higher depth previous score is above but average score below zero. So for average score more/less fail/low highs should be occur than previous score.
STC:
LLR: 2.97 (-2.94,2.94) <0.00,2.50>
Total: 59792 W: 15106 L: 14792 D: 29894
Ptnml(0-2): 144, 6718, 15869, 7010, 155
https://tests.stockfishchess.org/tests/view/61841612d7a085ad008eef06
LTC:
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 46448 W: 11835 L: 11537 D: 23076
Ptnml(0-2): 21, 4756, 13374, 5050, 23
https://tests.stockfishchess.org/tests/view/618463abd7a085ad008eef3e
closes https://github.com/official-stockfish/Stockfish/pull/3776
Bench: 6719976
This patch detects some search explosions (due to double extensions in
search.cpp) which can happen in some pathological positions, and takes
measures to ensure progress in search even for these pathological situations.
While a small number of double extensions can be useful during search
(for example to resolve a tactical sequence), a sustained regime of
double extensions leads to search explosion and a non-finishing search.
See the discussion in https://github.com/official-stockfish/Stockfish/pull/3544
and the issue https://github.com/official-stockfish/Stockfish/issues/3532 .
The implemented algorithm is the following:
a) at each node during search, store the current depth in the stack.
Double extensions are by definition levels of the stack where the
depth at ply N is strictly higher than depth at ply N-1.
b) during search, calculate for each thread a running average of the
number of double extensions in the last 4096 visited nodes.
c) if one thread has more than 2% of double extensions for a sustained
period of time (6 millions consecutive nodes, or about 4 seconds on
my iMac), we decide that this thread is in an explosion state and
we calm down this thread by preventing it to do any double extension
for the next 6 millions nodes.
To calculate the running averages, we also introduced a auxiliary class
generalizing the computations of ttHitAverage variable we already had in
code. The implementation uses an exponential moving average of period 4096
and resolution 1/1024, and all computations are done with integers for
efficiency.
-----------
Example where the patch solves a search explosion:
```
./stockfish
ucinewgame
position fen 8/Pk6/8/1p6/8/P1K5/8/6B1 w - - 37 130
go infinite
```
This algorithm does not affect search in normal, non-pathological positions.
We verified, for instance, that the usual bench is unchanged up to depth 20
at least, and that the node numbers are unchanged for a search of the starting
position at depth 32.
-------------
See https://github.com/official-stockfish/Stockfish/pull/3714
Bench: 5575265
We introduce a metric for each internal node in search, called DistanceFromPV.
This distance indicated how far the current node is from the principal variation.
We then use this distance to search the nodes which are close to the PV a little
deeper (up to 4 plies deeper than the PV): this improves the quality of the search
at these nodes and bring better updates for the PV during search.
STC:
LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 54936 W: 5047 L: 4850 D: 45039
Ptnml(0-2): 183, 3907, 19075, 4136, 167
https://tests.stockfishchess.org/tests/view/6037b88e7f517a561bc4a392
LTC:
LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 49608 W: 1880 L: 1703 D: 46025
Ptnml(0-2): 22, 1514, 21555, 1691, 22
https://tests.stockfishchess.org/tests/view/6038271b7f517a561bc4a3cb
Closes https://github.com/official-stockfish/Stockfish/pull/3369
Bench: 5037279