After increasing the number of threads, the histories were not cleared,
resulting in uninitialized memory usage.
This patch fixes this by clearing threads histories in Thread c'tor as
is the idomatic way.
This fixes issue 1227
No functional change.
credit goes to @mstembera for suggesting this approach.
SEE now deals with castling, promotion and en passant in a similar way.
passed STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32902 W: 6079 L: 5979 D: 20844
passed LTC
LLR: 3.92 (-2.94,2.94) [-3.00,1.00]
Total: 110698 W: 14198 L: 14145 D: 82355
Bench: 5713905
And set x86 and x64 platforms for real.
Currently this is broken and the same binary is compiled for all platforms.
This is becuase we use a custom build step. OTH the default
build step seems not compatible with cmake generated *sln file.
No functional change.
If any thread found a 'mate in x' stop the search. Previously only
mainThread would do so. Requires the bestThread selection to be
adjusted to always prefer mate scores, even if the search depth is less.
I've tried to collect some data for this patch. On 30 cores, mate finding
seems 5-30% faster on average. It is not so easy to get numbers for this,
as the time to find a mate fluctuates significantly with multi-threaded runs,
so it is an average over 100 searches for the same position. Furthermore,
hash size and position make a difference as well.
Bench: 5965302
Avoid constructing, passing as a parameter and binding a useless empty tuple of pointers in the qsearch move picker constructor.
Also reformat the scoring function in movepicker.cpp and do some cleaning in evaluate.cpp while there.
No functional change.
What this patch does is:
* increase safety margin from 40ms to 60ms. It's worth noting that the previous
code not only used 60ms incompressible safety margin, but also an additional
buffer of 30ms for each "move to go".
* remove a whart, integrating the extra 10ms in Move Overhead value instead.
Additionally, this ensures that optimumtime doesn't become bigger than maximum
time after maximum time has been artificially discounted by 10ms. So it keeps
the code more logical.
Tested at 3 different time controls:
Standard 10+0.1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 58008 W: 10674 L: 10617 D: 36717
Sudden death 16+0
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59664 W: 10945 L: 10891 D: 37828
Tournament 40/10
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16371 W: 3092 L: 2963 D: 10316
bench: 5479946
Add tests for:
- Positions with move list
- Chess960 positions
Now bench covers almost all cases, only few endgames
are still out of reach (verified with lcov)
It is a non functionality patch, but bench
changed because we added new test positions.
bench: 5479946
Rewrite perft to be placed naturally inside new
bench code. In particular we don't have special
custom code to run perft anymore but perft is
just a new parameter of 'go' command.
So user API is now changed, old style command:
$perft 5
becomes
$go perft 4
No functional change.
First step in improving bench to handle
arbitrary UCI commands so to test many
more code paths.
This first patch just set the new code
structure.
No functional change.
In particular clarify that 'sd'
parameter is used only in !movesToGo
case.
Verified with Ivan's check tool it is
equivalent to original code.
No functional change.
The only call site of Time.maximum() corrected by 10.
Do this directly in remaining().
Ponder increased Time.optimum by 25% in init(). Idem.
Delete unused includes.
No functional change.
Avoid a couple of redundant rebuilds and compile
with 2 threads since travis gives 2vCPUs.
Also enable -O1 optimization for valgrind and
sanitizers, it should be safe withouth false
positives and it gives a very sensible speed
up, especially with valgrind.
The spee dup allow us to increase testing to
depth 10, useful for thread sanitizer.
No functional change.
Current update formula ensures that the
possible value range is [-32 * D, 32 * D].
So we never overflow if abs(32 * D) < INT16_MAX
Thanks to Joost and mstembera to clarify this.
No functional change.
The trick is to create an ambiguity for the
compiler in case an unwanted conversion to
Move is attempted like in:
ExtMove m1{Move(17),4}, m2{Move(4),17};
std::cout << (m1 < m2) << std::endl; // 1
std::cout << (m1 > m2) << std::endl; // 1(!)
This fixes issue #1204
No functional change.
Reduces memory footprint by ~1.2MB (per thread).
Strong pressure: small but mesurable gain
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 258430 W: 46977 L: 45943 D: 165510
Low pressure: no regression
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 73542 W: 13058 L: 13026 D: 47458
Strong pressure + LTC: elo gain confirmed
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 31489 W: 4532 L: 4295 D: 22662
Tested for crashing on overflow and after 70K
games at STC we have only 4 time losses,
possible candidate for an overflow.
No functional change.
We use Position::set() to set root position across
threads. But there are some StateInfo fields (previous,
pliesFromNull, capturedPiece) that cannot be deduced
from a fen string, so set() clears them and to not lose
the info we need to backup and later restore setupStates->back().
Note that setupStates is shared by threads but is accessed
in read-only mode.
This fixes regression introduced by df6cb446ea
Tested with 3 threads at STC:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 14436 W: 2304 L: 2196 D: 9936
Bench: 5608839
Some warnings after a run of:
$ clang-tidy-3.8 -checks='modernize-*' *.cpp syzygy/*.cpp -header-filter=.* -- -std=c++11
I have not fixed all suggestions, for instance I still prefer
to declare the type instead of a spread use of 'auto'. I also
perfer good old 'typedef' to the new 'using' form.
I have not fixed some warnings in the last functions of
syzygy code because those are still the original functions
and need to be completely rewritten anyhow.
Thanks to erbsenzaehler for the original idea.
No functional change.
Simplify out low level sync stuff (mutex
and friends) and avoid to use them directly
in many functions.
Also some renaming and better comment while
there.
No functional change.
The case of three or more bishops against a long king must look at all of the
bishops and not just the first two in the piece lists. This patch makes sure
that the position is treated as a win when there are bishops on opposite
colors. This functional change is very small because bench remains the same.
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 24249 W: 4349 L: 4275 D: 15625
http://tests.stockfishchess.org/tests/view/598186530ebc5916ff64a218
Bench: 5608839
In this rare case (e.g. go infinite on a stalemate),
just spin till ponderhit/stop comes.
The Thread::wait() is a renmant of the old YBWC
code, today with lazy SMP, threads don't need to
wait when outside of their idle loop.
No functional change.
But this time correctly set Threads.ponder
We avoid using 'limits' for passing pondering
flag because we don't want to have 2 ponder
variables in search scope: Search::Limits.ponder
and Threads.ponder. This would be confusing also
because limits.ponder is set at the beginning of
the search and never changes, instead Threads.ponder
can change value asynchronously during search.
No functional change.
This reverts commit 5410424e3d.
After the commit pondering is broken, so revert for now. I will
resubmit with a proper fix.
The issue is mine, Joost original code is correct.
No functional change.
Limits::ponder was used as a signal between uci and search threads,
but is not an atomic variable, leading to the following race as
flagged by a sanitized binary.
Expect input:
```
spawn ./stockfish
send "uci\n"
expect "uciok"
send "setoption name Ponder value true\n"
send "go wtime 4000 btime 4000\n"
expect "bestmove"
send "position startpos e2e4 d7d5\n"
send "go wtime 4000 btime 4000 ponder\n"
sleep 0.01
send "ponderhit\n"
expect "bestmove"
send "quit\n"
expect eof
```
Race:
```
WARNING: ThreadSanitizer: data race (pid=7191)
Read of size 4 at 0x0000005c2260 by thread T1:
Previous write of size 4 at 0x0000005c2260 by main thread:
Location is global 'Search::Limits' of size 88 at 0x0000005c2220 (stockfish+0x0000005c2260)
```
The reason of teh race is that ponder is not just set in UCI go()
assignment but also is signaled by an async ponderhit in uci.cpp:
else if (token == "ponderhit")
Search::Limits.ponder = 0; // Switch to normal search
The fix is to add an atomic bool to the threads structure to
signal the ponder status, letting Search::Limits to reflect just
what was passed to 'go'.
No functional change.
Better split code that should be run at
startup from code run at ucinewgame. Also
fix several races when 'bench', 'perft' and
'ucinewgame' are sent just after 'bestomve'
from the engine threads are still running.
Also use a specific UI thread instead of
main thread when setting up the Position
object used by UI uci loop. This fixes a
race when sending 'eval' command while searching.
We accept a race on 'setoption' to allow the
GUI to change an option while engine is searching
withouth stalling the pipe. Note that changing an
option while searchingg is anyhow not mandated by
UCI protocol.
No functional change.
as a lower level routine, movepicker should not depend on the
search stack or the thread class, removing a circular dependency.
Instead of copying the search stack into the movepicker object,
as well as accessing the thread class for one of the histories,
pass the required fields explicitly to the constructor (removing
the need for thread.h and implicitly search.h in movepick.cpp).
The signature is thus longer, but more explicit:
Also some renaming of histories structures while there.
passed STC [-3,1], suggesting a small elo impact:
LLR: 3.13 (-2.94,2.94) [-3.00,1.00]
Total: 381053 W: 68071 L: 68551 D: 244431
elo = -0.438 +- 0.660 LOS: 9.7%
No functional change.
A pawn (according to all the searched positions of a bench run) is not supported 85% of the time,
(in current master it is either isolated, backward or "unsupported").
So it made sense to try moving the S(17, 8) "unsupported" penalty value into the base pawn value hoping for a more representative pawn value, and accordingly
a) adjust backward and isolated so that they stay more or less the same as master
b) increase the mg connected bonus in the supported case by S(17, 0) and let the Connected formula find a suitable eg value according to rank.
Tested as a simplification SPRT(-3, 1)
Passed STC
http://tests.stockfishchess.org/tests/view/5970dbd30ebc5916ff649dd6
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 19613 W: 3663 L: 3540 D: 12410
Passed LTC
http://tests.stockfishchess.org/tests/view/597137780ebc5916ff649de3
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 24721 W: 3306 L: 3191 D: 18224
Bench: 5581946
Closes#1179
in the last month a couple of timeouts have been seen in travis valgrind testing, leading to undesired false positives. The precise cause of this is unclear: a normal valgrind instrumented run is about 6min, the timeout is 10min. Either there are rare hangs (not reproduced locally), or maybe the actual runtime fluctuates on the travis infrastructure (which uses VMs on AWS as far as I know). This patch leads to roughly a 2x speedup of the instrumented testing by reducing the depth from 10 to 9. If timeouts persist, it needs further analysis.
No functional change.
Closes#1171
For some reason, although game phase is used
only in material, it is computed in Position.
Move computation to material, where it belongs,
and remove the useless call chain.
No functional change.
Instead of having Signals in the search namespace,
make the stop variables part of the Threads structure.
This moves more of the shared (atomic) variables towards
the thread-related structures, making their role more clear.
No functional change
Closes#1149
Optimization options for official stockfish should be
consistent, easy, future proof and simple.
We don't want to optimize for any specific version of gcc
No functional change
Closes#1165
Addition of correction values in case of Imbalance of queens,
depending on the number of light pieces on the side without a queen.
Passed patch:
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 29036 W: 5379 L: 5130 D: 18527
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13680 W: 1836 L: 1674 D: 10170
Bench: 6258930
Closes#1155
It is not needed becuase the only case is a real special
one (bench on depth with many threads) and can be easily
rewritten to avoid sharing rootDepth.
Verified with ThreadSanitizer.
No functional change.
Closes#1159
Only one remains (also in tbprobe.cpp), but is bougus.
As a side note, tbprobe.cpp is almost clean, only the last 3
functions probe_wdl(), root_probe() and root_probe_wdl()
are still the original ones and are quite hacky.
Somewhere in the future we will reformat also the last 3
ones. The reason why has not been done before it is because
these functions are really wrong by design and should be
rewritten entirely, not only reformatted.
No functional change.
Closes#1160