Stockfish currently relies on the "filter_root_moves" function also
having the side effect of clamping Cardinality against MaxCardinality
(the actual piece count in the tablebases). So if we skip this function,
we will end up probing in the search even without tablebases installed.
We cannot bail out of this function before this check is done, so move
the MultiPV hack a few lines below.
If the PV leads to a draw (3-fold / 50-moves) position
and we're ahead of time, think a little longer, possibly
finding a better way.
As this is most likely effective at higher draw rates,
tried speculative LTC after a yellow STC:
STC:
http://tests.stockfishchess.org/tests/view/59eb173a0ebc590ccbb8975d
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 56095 W: 10013 L: 9902 D: 36180
elo = 0.688 +- 1.711 LOS: 78.425%
LTC:
http://tests.stockfishchess.org/tests/view/59eba1670ebc590ccbb897b4
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 59579 W: 7577 L: 7273 D: 44729
elo = 1.773 +- 1.391 LOS: 99.381%
bench: 5234652
A band-aid patch to workaround current TB code
limitations with multi PV.
Hopefully this will be removed after committing the
big update of TB impementation, now under discussion.
No functional change.
If the search is quit before skill.pick_best is called,
skill.best_move might be MOVE_NONE.
Ensure skill.best is always assigned anyhow.
Also retire the tricky best_move() and let the underlying
semantic to be clear and explicit.
No functional change.
The first change (ss->statScore >= 0) does nothing.
The second change ((ss-1)->statScore >= 0 ) has a massive change.
(ss-1)->statScore is not set until (ss-1) begins to apply LMR to moves.
So we now increase the reduction for bad quiets when our opponent is
running through the first captures and the hash move.
STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 57762 W: 10533 L: 10181 D: 37048
LTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 19973 W: 2662 L: 2480 D: 14831
Bench: 5037819
This patch lets ss->ply be equal to 0 at the root of the search.
Currently, the root has ss->ply == 1, which is less intuitive:
- Setting the rootNode bool has to check (ss-1)->ply == 0.
- All mate values are off by one: the code seems to assume that mated-in-0
is -VALUE_MATE, mate-1-in-ply is VALUE_MATE-1, mated-in-2-ply is VALUE_MATE+2, etc.
But the mate_in() and mated_in() functions are called with ss->ply, which is 1 in
at the root.
- The is_draw() function currently needs to explain why it has "ply - 1 > i" instead
of simply "ply > i".
- The ss->ply >= MAX_PLY tests in search() and qsearch() already assume that
ss->ply == 0 at the root. If we start at ss->ply == 1, it would make more sense to
go up to and including ss->ply == MAX_PLY, so stop at ss->ply > MAX_PLY. See also
the asserts testing for 0 <= ss->ply && ss->ply < MAX_PLY.
The reason for ss->ply == 1 at the root is the line "ss->ply = (ss-1)->ply + 1" at
the start for search() and qsearch(). By replacing this with "(ss+1)->ply = ss->ply + 1"
we keep ss->ply == 0 at the root. Note that search() already clears killers in (ss+2),
so there is no danger in accessing ss+1.
I have NOT changed pv[MAX_PLY + 1] to pv[MAX_PLY + 2] in search() and qsearch().
It seems to me that MAX_PLY + 1 is exactly right:
- MAX_PLY entries for ss->ply running from 0 to MAX_PLY-1, and 1 entry for the
final MOVE_NONE.
I have verified that mate scores are reported correctly. (They were already reported
correctly due to the extra ply being rounded down when converting to moves.)
The value of seldepth output to the user should probably not change, so I add 1 to it.
(Humans count from 1, computers from 0.)
A small optimisation I did not include: instead of setting ss->ply in every invocation
of search() and qsearch(), it could be set once for all plies at the start of
Thread::search(). This saves a couple of instructions per node.
No functional change (unless the search searches a branch MAX_PLY deep), so bench
does not change.
After increasing the number of threads, the histories were not cleared,
resulting in uninitialized memory usage.
This patch fixes this by clearing threads histories in Thread c'tor as
is the idomatic way.
This fixes issue 1227
No functional change.
If any thread found a 'mate in x' stop the search. Previously only
mainThread would do so. Requires the bestThread selection to be
adjusted to always prefer mate scores, even if the search depth is less.
I've tried to collect some data for this patch. On 30 cores, mate finding
seems 5-30% faster on average. It is not so easy to get numbers for this,
as the time to find a mate fluctuates significantly with multi-threaded runs,
so it is an average over 100 searches for the same position. Furthermore,
hash size and position make a difference as well.
Bench: 5965302
Avoid constructing, passing as a parameter and binding a useless empty tuple of pointers in the qsearch move picker constructor.
Also reformat the scoring function in movepicker.cpp and do some cleaning in evaluate.cpp while there.
No functional change.
Rewrite perft to be placed naturally inside new
bench code. In particular we don't have special
custom code to run perft anymore but perft is
just a new parameter of 'go' command.
So user API is now changed, old style command:
$perft 5
becomes
$go perft 4
No functional change.
The only call site of Time.maximum() corrected by 10.
Do this directly in remaining().
Ponder increased Time.optimum by 25% in init(). Idem.
Delete unused includes.
No functional change.
Some warnings after a run of:
$ clang-tidy-3.8 -checks='modernize-*' *.cpp syzygy/*.cpp -header-filter=.* -- -std=c++11
I have not fixed all suggestions, for instance I still prefer
to declare the type instead of a spread use of 'auto'. I also
perfer good old 'typedef' to the new 'using' form.
I have not fixed some warnings in the last functions of
syzygy code because those are still the original functions
and need to be completely rewritten anyhow.
Thanks to erbsenzaehler for the original idea.
No functional change.
Simplify out low level sync stuff (mutex
and friends) and avoid to use them directly
in many functions.
Also some renaming and better comment while
there.
No functional change.
In this rare case (e.g. go infinite on a stalemate),
just spin till ponderhit/stop comes.
The Thread::wait() is a renmant of the old YBWC
code, today with lazy SMP, threads don't need to
wait when outside of their idle loop.
No functional change.
But this time correctly set Threads.ponder
We avoid using 'limits' for passing pondering
flag because we don't want to have 2 ponder
variables in search scope: Search::Limits.ponder
and Threads.ponder. This would be confusing also
because limits.ponder is set at the beginning of
the search and never changes, instead Threads.ponder
can change value asynchronously during search.
No functional change.
This reverts commit 5410424e3d.
After the commit pondering is broken, so revert for now. I will
resubmit with a proper fix.
The issue is mine, Joost original code is correct.
No functional change.
Limits::ponder was used as a signal between uci and search threads,
but is not an atomic variable, leading to the following race as
flagged by a sanitized binary.
Expect input:
```
spawn ./stockfish
send "uci\n"
expect "uciok"
send "setoption name Ponder value true\n"
send "go wtime 4000 btime 4000\n"
expect "bestmove"
send "position startpos e2e4 d7d5\n"
send "go wtime 4000 btime 4000 ponder\n"
sleep 0.01
send "ponderhit\n"
expect "bestmove"
send "quit\n"
expect eof
```
Race:
```
WARNING: ThreadSanitizer: data race (pid=7191)
Read of size 4 at 0x0000005c2260 by thread T1:
Previous write of size 4 at 0x0000005c2260 by main thread:
Location is global 'Search::Limits' of size 88 at 0x0000005c2220 (stockfish+0x0000005c2260)
```
The reason of teh race is that ponder is not just set in UCI go()
assignment but also is signaled by an async ponderhit in uci.cpp:
else if (token == "ponderhit")
Search::Limits.ponder = 0; // Switch to normal search
The fix is to add an atomic bool to the threads structure to
signal the ponder status, letting Search::Limits to reflect just
what was passed to 'go'.
No functional change.
Better split code that should be run at
startup from code run at ucinewgame. Also
fix several races when 'bench', 'perft' and
'ucinewgame' are sent just after 'bestomve'
from the engine threads are still running.
Also use a specific UI thread instead of
main thread when setting up the Position
object used by UI uci loop. This fixes a
race when sending 'eval' command while searching.
We accept a race on 'setoption' to allow the
GUI to change an option while engine is searching
withouth stalling the pipe. Note that changing an
option while searchingg is anyhow not mandated by
UCI protocol.
No functional change.
as a lower level routine, movepicker should not depend on the
search stack or the thread class, removing a circular dependency.
Instead of copying the search stack into the movepicker object,
as well as accessing the thread class for one of the histories,
pass the required fields explicitly to the constructor (removing
the need for thread.h and implicitly search.h in movepick.cpp).
The signature is thus longer, but more explicit:
Also some renaming of histories structures while there.
passed STC [-3,1], suggesting a small elo impact:
LLR: 3.13 (-2.94,2.94) [-3.00,1.00]
Total: 381053 W: 68071 L: 68551 D: 244431
elo = -0.438 +- 0.660 LOS: 9.7%
No functional change.
Instead of having Signals in the search namespace,
make the stop variables part of the Threads structure.
This moves more of the shared (atomic) variables towards
the thread-related structures, making their role more clear.
No functional change
Closes#1149
It is not needed becuase the only case is a real special
one (bench on depth with many threads) and can be easily
rewritten to avoid sharing rootDepth.
Verified with ThreadSanitizer.
No functional change.
Closes#1159
The main change of the patch is that now time check
is done only by main thread. In the past, before lazy
SMP, we needed all the threds to check for available
time because main thread could have been blocked on
a split point, now this is no more the case and main
thread can do the job alone, greatly simplifying the logic.
Verified for regression testing on STC with 7 threads:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 11895 W: 1741 L: 1608 D: 8546
No functional change.
Closes#1152
The idea is that chances are the tt-move is best and will be difficult to raise alpha when playing a quiet move.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 7582 W: 1415 L: 1259 D: 4908
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 59553 W: 7885 L: 7573 D: 44095
Bench: 5725676
Closes#1147
the nodes, tbHits, rootDepth and lastInfoTime variables are read by multiple threads, but not declared atomic, leading to data races as found by -fsanitize=thread. This patch fixes this issue. It is based on top of the CI-threading branch (PR #1129), and should fix the corresponding CI error messages.
The patch passed an STC check for no regression:
http://tests.stockfishchess.org/tests/view/5925d5590ebc59035df34b9f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 169597 W: 29938 L: 30066 D: 109593
Whereas rootDepth and lastInfoTime are not performance critical, nodes and tbHits are. Indeed, an earlier version using relaxed atomic updates on the latter two variables failed STC testing (http://tests.stockfishchess.org/tests/view/592001700ebc59035df34924), which can be shown to be due to x86-32 (http://tests.stockfishchess.org/tests/view/592330ac0ebc59035df34a89). Indeed, the latter have no instruction to atomically update a 64bit variable. The proposed solution thus uses a variable in Position that is accessed only by one thread, which is copied every few thousand nodes to the shared variable in Thread.
No functional change.
Closes#1130Closes#1129
The change passed an STC regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59350 W: 10793 L: 10738 D: 37819
I verified that there was no change in performance on my machine, but of course YMMV:
Results for 40 tests for each version:
Base Test Diff
Mean 2014338 2016121 -1783
StDev 62655 63441 3860
p-value: 0.678
speedup: 0.001
No functional change.
Closes#1137
TT.new_search() was being called by mainThread in Thread::search(). However, mainThread is the last to start searching, and helper threads could reach a measured rootDepth 10 (on 64 cores) before mainThread increments the TT generation. Fixed by moving the call to MaintThread::search() before helper threads start searching.
No functional change.
Closes#1134
Rearrange and rename all history heuristic code. Naming
is now based on chessprogramming.wikispaces.com conventions
and the relations among the various heuristics are now more
clear and consistent.
No functional change.
Fixes a bug in Search::clear, where the filling of CounterMoveStats&, overwrote (currently presumably unused) memory because sizeof(cm) returns the size in bytes, whereas elements was needed.
No functional change
Closes#1119
execute an implied ucinewgame upon entering the UCI::loop,
to make sure that searches starting with and without an (optional) ucinewgame
command yield the same search.
This is needed now that seach::clear() initializes tables to non-zero default values.
No functional change
Closes#1101Closes#1104
In general, this patch handles the cases where we don't have a valid score for each PV line in a multiPV search. This can happen if the search has been stopped in an unfortunate moment while still in the aspiration loop. The patch consists of two parts.
Part 1: The new PVIdx was already part of the k-best pv's in the last iteration, and we therefore have a valid pv and score to output from the last iteration. This is taken care of with:
bool updated = (i <= PVIdx && rootMoves[i].score != -VALUE_INFINITE);
Case 2: The new PVIdx was NOT part of the k-best pv's in the last iteration, and we have no valid pv and score to output. Not from the current nor from the previous iteration. To avoid this, we are now also considering the previous score when sorting, so that the PV lines with no actual but with a valid previous score are pushed up again, and the previous score can be displayed.
bool operator<(const RootMove& m) const {
return m.score != score ? m.score < score : m.previousScore < previousScore; } // Descending sort
I also added an assertion in UCI::value() to possibly catch similar issues earlier.
No functional change.
Closes#502Closes#1074