the nodes, tbHits, rootDepth and lastInfoTime variables are read by multiple threads, but not declared atomic, leading to data races as found by -fsanitize=thread. This patch fixes this issue. It is based on top of the CI-threading branch (PR #1129), and should fix the corresponding CI error messages.
The patch passed an STC check for no regression:
http://tests.stockfishchess.org/tests/view/5925d5590ebc59035df34b9f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 169597 W: 29938 L: 30066 D: 109593
Whereas rootDepth and lastInfoTime are not performance critical, nodes and tbHits are. Indeed, an earlier version using relaxed atomic updates on the latter two variables failed STC testing (http://tests.stockfishchess.org/tests/view/592001700ebc59035df34924), which can be shown to be due to x86-32 (http://tests.stockfishchess.org/tests/view/592330ac0ebc59035df34a89). Indeed, the latter have no instruction to atomically update a 64bit variable. The proposed solution thus uses a variable in Position that is accessed only by one thread, which is copied every few thousand nodes to the shared variable in Thread.
No functional change.
Closes#1130Closes#1129
The change passed an STC regression:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 59350 W: 10793 L: 10738 D: 37819
I verified that there was no change in performance on my machine, but of course YMMV:
Results for 40 tests for each version:
Base Test Diff
Mean 2014338 2016121 -1783
StDev 62655 63441 3860
p-value: 0.678
speedup: 0.001
No functional change.
Closes#1137
TT.new_search() was being called by mainThread in Thread::search(). However, mainThread is the last to start searching, and helper threads could reach a measured rootDepth 10 (on 64 cores) before mainThread increments the TT generation. Fixed by moving the call to MaintThread::search() before helper threads start searching.
No functional change.
Closes#1134
Gather all magic relevant data into a struct.
This changes memory layout putting everything necessary for processing a single square
in the same memory location thus speeding up access.
Original patch by @snicolet
No functional change.
Closes#1127Closes#1128
In the current version a search stops when the current position is the same as
any position earlier in the search stack,
including the root position but excluding positions before the root.
The new version makes an exception for repeating the root position.
This gives correct scores for moves in the MultiPV > 1 mode.
Fixes#948 (see it for the detailed description of the bug).
LTC: http://tests.stockfishchess.org/tests/view/587910bc0ebc5915193f754b
ELO: 0.38 +-1.7 (95%) LOS: 66.8%
Total: 40000 W: 5166 L: 5122 D: 29712
STC: http://tests.stockfishchess.org/tests/view/5922e6230ebc59035df34a50
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 94622 W: 17059 L: 17064 D: 60499
LTC: http://tests.stockfishchess.org/tests/view/59273a000ebc59035df34c03
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 61259 W: 7965 L: 7897 D: 45397
Bench: 6599721
Closes#1126
Rearrange and rename all history heuristic code. Naming
is now based on chessprogramming.wikispaces.com conventions
and the relations among the various heuristics are now more
clear and consistent.
No functional change.
The previous patch has added a fraction of the king danger score to the
endgame score of the tapered eval, so it seems natural to perform the
king danger computation later in the endgame.
With this patch we extend the limit of such check analysis down to the
material of Rook+Knight, when we have at least two pieces attacking the
opponent king zone.
Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 7446 W: 1409 L: 1253 D: 4784
http://tests.stockfishchess.org/tests/view/591c097c0ebc59035df3477c
and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 14234 W: 1946 L: 1781 D: 10507
http://tests.stockfishchess.org/tests/view/591c24f10ebc59035df3478c
Bench: 5975183
Closes#1121
Fixes a bug in Search::clear, where the filling of CounterMoveStats&, overwrote (currently presumably unused) memory because sizeof(cm) returns the size in bytes, whereas elements was needed.
No functional change
Closes#1119
In current master the size of the king ring varies abruptly from eight
squares when the king is in g8, to 12 squares when it is in g7. Because
the king ring is used for estimating attack strength, this may lead to
an overestimation of king danger in some positions. This patch limits
the king ring to eight squares in all cases.
Inspired by the following forum thread:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/xrUCQ7b0ObE
Passed STC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 9244 W: 1777 L: 1611 D: 5856
and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 87121 W: 11765 L: 11358 D: 63998
Bench: 6121121
Closes#1115
execute an implied ucinewgame upon entering the UCI::loop,
to make sure that searches starting with and without an (optional) ucinewgame
command yield the same search.
This is needed now that seach::clear() initializes tables to non-zero default values.
No functional change
Closes#1101Closes#1104
Replacing the old Protector table with a simple linear formula which takes into account a different slope for each different piece type.
The idea of this simplification of Protector is originated by Alain (Rocky)
STC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 70382 W: 12859 L: 12823 D: 44700
LTC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 61554 W: 8098 L: 8031 D: 45425
Bench: 6107863
Closes#1099
In general, this patch handles the cases where we don't have a valid score for each PV line in a multiPV search. This can happen if the search has been stopped in an unfortunate moment while still in the aspiration loop. The patch consists of two parts.
Part 1: The new PVIdx was already part of the k-best pv's in the last iteration, and we therefore have a valid pv and score to output from the last iteration. This is taken care of with:
bool updated = (i <= PVIdx && rootMoves[i].score != -VALUE_INFINITE);
Case 2: The new PVIdx was NOT part of the k-best pv's in the last iteration, and we have no valid pv and score to output. Not from the current nor from the previous iteration. To avoid this, we are now also considering the previous score when sorting, so that the PV lines with no actual but with a valid previous score are pushed up again, and the previous score can be displayed.
bool operator<(const RootMove& m) const {
return m.score != score ? m.score < score : m.previousScore < previousScore; } // Descending sort
I also added an assertion in UCI::value() to possibly catch similar issues earlier.
No functional change.
Closes#502Closes#1074
Testing the release candidate revealed only one minor issue, namely a new warning -Wimplicit-fallthrough (part of -Wextra) triggers in the movepicker. This can be silenced by adding a comment, and once we move to c++17 by adding a standard annotation [[fallthrough]];.
No functional change.
Closes#1090
StepAttacks[] is misdesigned, the color dependance is specific
to pawns, and trying to generalise to king and knights, proves
neither useful nor convinient in practice.
So this patch reformats the code with the following changes:
- Use PieceType instead of Piece in attacks_() functions
- Use PseudoAttacks for KING and KNIGHT
- Rename StepAttacks[] into PawnAttacks[]
Original patch and idea from Alain Savard.
No functional change.
Closes#1086
ss->killers can change while the movepicker is active.
The reason ss->killers changes is related to the singular
extension search in the moves loop that calls search<>
recursively with ss instead of ss+1,
effectively using the same stack entry for caller and callee.
By making a copy of the killers,
the movepicker does the right thing nevertheless.
Tested as a bug fix
STC:
http://tests.stockfishchess.org/tests/view/58ff130f0ebc59035df33f37
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 70845 W: 12752 L: 12716 D: 45377
LTC:
http://tests.stockfishchess.org/tests/view/58ff48000ebc59035df33f3d
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28368 W: 3730 L: 3619 D: 21019
Bench: 6465887
Closes#1085
history related scores are not related to evaluation based scores.
For example, can easily exceed the range -VALUE_INFINITE,VALUE_INFINITE.
As such the current type is confusing, and a plain int is a better match.
tested for no regression:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43693 W: 7909 L: 7827 D: 27957
No functional change.
Closes#1070
the order of elements returned by std::partition is implementation defined (since not stable) and could depend on the version of libstdc++ linked.
As std::stable_partition was tested to be too slow (http://tests.stockfishchess.org/tests/view/585cdfd00ebc5903140c6082).
Instead combine partition with our custom implementation of insert_sort, which fixes this issue.
Implementation based on a patch by mstembera (http://tests.stockfishchess.org/tests/view/58d4d3460ebc59035df3315c), which suggests some benefit by itself.
Higher depth moves are all sorted (INT_MIN version), as in current master.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33116 W: 6161 L: 6061 D: 20894
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 88703 W: 11572 L: 11540 D: 65591
Bench: 6256522
Closes#1058Closes#1065
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 58462 W: 10615 L: 10558 D: 37289
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 65061 W: 8539 L: 8477 D: 48045
It is worth noting that an attempt to only increase the bonus passed STC but failed LTC, and
an attempt to remove the cap without increasing the bonus is still running at STC, but will probably fail after more than 100k.
Bench: 6188591
Closes#1063
By adding pos.non_pawn_material(pos.side_to_move()) as a precondition in step 13,
which is already in use in Futility Pruning (child node) and Null Move Pruning for similar reasons.
Pawn endgames, especially those with only 1 or 2 pawns, are simply heavily influenced by zugzwang situations.
Since we are using a bitbase for KPK endgames, I see no reason to accept buggy evals as shown in #760
Patch looks neutral at STC
LLR: 2.32 (-2.94,2.94) [-3.00,1.00]
Total: 79580 W: 10789 L: 10780 D: 58011
and LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 27071 W: 3502 L: 3390 D: 20179
Bench: 6259071
Closes#1051Closes#760
a single Xeon Phi can present itself as a single numa node with up to 288 threads (4 threads per hardware core).
Tested to work as expected with a Xeon Phi CPU 7230 up to 256 threads.
No functional change
Closes#1045