This patch corrects both MultiPV behaviour and "go searchmoves" behaviour
for tablebases.
We change the logic of table base probing at root positions from filtering
to ranking. The ranking code is much more straightforward than the current
filtering code (this is a simplification), and also more versatile.
If the root is a TB position, each root move is probed and assigned a TB score
and a TB rank. The TB score is the Value to be displayed to the user for that
move (unless the search finds a mate score), while the TB rank determines which
moves should appear higher in a multi-pv search. In game play, the engine will
always pick a move with the highest rank.
Ranks run from -1000 to +1000:
901 to 1000 : TB win
900 : normally a TB win, in rare cases this could be a draw
1 to 899 : cursed TB wins
0 : draw
-1 to -899 : blessed TB losses
-900 : normally a TB loss, in rare cases this could be a draw
-901 to -1000 : TB loss
Normally all winning moves get rank 1000 (to let the search pick the best
among them). The exception is if there has been a first repetition. In that
case, moves are ranked strictly by DTZ so that the engine will play a move
that lowers DTZ (and therefore cannot repeat the position a second time).
Losing moves get rank -1000 unless they have relatively high DTZ, meaning
they have some drawing chances. Those get ranks towards -901 (when they
cross -900 the draw is certain).
Closes https://github.com/official-stockfish/Stockfish/pull/1467
No functional change (without tablebases).
Implements renaming suggestions by Marco Costalba, Günther Demetz,
Gontran Lemaire, Ronald de Man, Stéphane Nicolet, Alain Savard,
Joost VandeVondele, Jerry Donald Watson, Mike Whiteley, xoto10,
and I hope that I haven't forgotten anybody.
Perpetual renaming thread for suggestions:
https://github.com/official-stockfish/Stockfish/issues/1426
No functional change.
To compute dicovered check or pinned pieces we use some bitwise
operators that are not really needed because already accounted for
at the caller site.
For instance in evaluation we compute:
pos.pinned_pieces(Us) & s
Where pinned_pieces() is:
st->blockersForKing[c] & pieces(c)
So in this case the & operator with pieces(c) is useless,
given the outer '& s'.
There are many places where we can use the naked blockersForKing[]
instead of the full pinned_pieces() or discovered_check_candidates().
This path is simpler than original and gives around 1% speed up for me.
Also tested for speed by mstembera and snicolet (neutral in both cases).
No functional change.
Where variable names are explicitly incorrect, I feel morally obligated to at least
suggest an alternative. There are many, but these two are especially egregious.
No functional change.
For some reason, although game phase is used
only in material, it is computed in Position.
Move computation to material, where it belongs,
and remove the useless call chain.
No functional change.
The main change of the patch is that now time check
is done only by main thread. In the past, before lazy
SMP, we needed all the threds to check for available
time because main thread could have been blocked on
a split point, now this is no more the case and main
thread can do the job alone, greatly simplifying the logic.
Verified for regression testing on STC with 7 threads:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 11895 W: 1741 L: 1608 D: 8546
No functional change.
Closes#1152
Now we don't need anymore the tricky pointer to
show the failed test. Added some few tests too.
Also small rename in see_ge() while there.
No functional change
Closes#1151
the nodes, tbHits, rootDepth and lastInfoTime variables are read by multiple threads, but not declared atomic, leading to data races as found by -fsanitize=thread. This patch fixes this issue. It is based on top of the CI-threading branch (PR #1129), and should fix the corresponding CI error messages.
The patch passed an STC check for no regression:
http://tests.stockfishchess.org/tests/view/5925d5590ebc59035df34b9f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 169597 W: 29938 L: 30066 D: 109593
Whereas rootDepth and lastInfoTime are not performance critical, nodes and tbHits are. Indeed, an earlier version using relaxed atomic updates on the latter two variables failed STC testing (http://tests.stockfishchess.org/tests/view/592001700ebc59035df34924), which can be shown to be due to x86-32 (http://tests.stockfishchess.org/tests/view/592330ac0ebc59035df34a89). Indeed, the latter have no instruction to atomically update a 64bit variable. The proposed solution thus uses a variable in Position that is accessed only by one thread, which is copied every few thousand nodes to the shared variable in Thread.
No functional change.
Closes#1130Closes#1129
StepAttacks[] is misdesigned, the color dependance is specific
to pawns, and trying to generalise to king and knights, proves
neither useful nor convinient in practice.
So this patch reformats the code with the following changes:
- Use PieceType instead of Piece in attacks_() functions
- Use PseudoAttacks for KING and KNIGHT
- Rename StepAttacks[] into PawnAttacks[]
Original patch and idea from Alain Savard.
No functional change.
Closes#1086
Implement a threefold repetition detection. Below are the examples of
problems fixed by this change.
Loosing move in a drawn position.
position fen 8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1 moves a1a2 a7a8 a2a1
The old code suggested a loosing move "bestmove a8a7", the new code suggests "bestmove a8b7" leading to a draw.
Incorrect evaluation (happened in a real game in TCEC Season 9).
position fen 4rbkr/1q3pp1/b3pn2/7p/1pN5/1P1BBP1P/P1R2QP1/3R2K1 w - - 5 31 moves e3d4 h8h6 d4e3
The old code evaluated it as "cp 0", the new code evaluation is around "cp -50" which is adequate.
Brings 0.5-1 ELO gain. Passes [-3.00,1.00].
STC: http://tests.stockfishchess.org/tests/view/584ece040ebc5903140c5aea
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 47744 W: 8537 L: 8461 D: 30746
LTC: http://tests.stockfishchess.org/tests/view/584f134d0ebc5903140c5b37
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 36775 W: 4739 L: 4639 D: 27397
Patch has been rewritten into current form for simplification and
logic slightly changed so that return a draw score if the position
repeats once earlier but after or at the root, or repeats twice
strictly before the root. In its original form, repetition at root
was not returned as an immediate draw.
After retestimng testing both version with SPRT[-3, 1], both passed
succesfully, but this version was chosen becuase more natural. There is
an argument about MultiPV in which an extended draw at root may be sensible.
See discussion here:
https://github.com/official-stockfish/Stockfish/pull/925
For documentation, current version passed both at STC and LTC:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 51562 W: 9314 L: 9245 D: 33003
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 115663 W: 14904 L: 14906 D: 85853
bench: 5468995
In 10 of 12 calls total to Position::do_move()the givesCheck argument is
simply gives_check(m). So it's reasonable to make an overload without
this parameter, which wraps the existing version.
No functional change.
Rewrite the code in SF style, simplify and
document it.
Code is now much clear and bug free (no mem-leaks and
other small issues) and is also smaller (more than
600 lines of code removed).
All the code has been rewritten but root_probe() and
root_probe_wdl() that are completely misplaced and should
be retired altogheter. For now just leave them in the
original version.
Code is fully and deeply tested for equivalency both in
functionality and in speed with hundreds of games and
test positions and is guaranteed to be 100% equivalent
to the original.
Tested with tb_dbg branch for functional equivalency on
more than 12M positions.
stockfish.exe bench 128 1 16 syzygy.epd
Position: 2016/2016
Total 12121156 Hits 0 hit rate (%) 0
Total time (ms) : 4417851
Nodes searched : 1100151204
Nodes/second : 249024
Tested with 5,000 games match against master, 1 Thread,
128 MB Hash each, tc 40+0.4, which is almost equivalent
to LTC in Fishtest on this machine. 3-, 4- and 5-men syzygy
bases on SSD, 12-moves opening book to emphasize mid- and endgame.
Score of SF-SyzygyC++ vs SF-Master: 633 - 617 - 3750 [0.502] 5000
ELO difference: 1
No functional change.
Use a per-thread counter to reduce contention
with many cores and endgame positions.
Measured around 1% speed-up on a 12 core and 8%
on 28 cores with 6-men, searching on:
7R/1p3k2/2p2P2/3nR1P1/8/3b1P2/7K/r7 b - - 3 38
Also retire the unused set_nodes_searched() and fix
a couple of return types and naming conventions.
No functional change.
Stephane's patch removes the only usage of Position::see, where the
returned value isn't immediately compared with a value. So I replaced
this function by its optimised and more specific version see_ge. This
function also supersedes the function Position::see_sign.
bool Position::see_ge(Move m, Value v) const;
This function tests if the SEE of a move is greater or equal than a
given value. We use forward iteration on captures instread of backward
one, therefore we don't need the swapList array. Also we stop as soon
as we have enough information to obtain the result, avoiding unnecessary
calls to the min_attacker function.
Speed tests (Windows 7), 20 runs for each engine:
Test engine: mean 866648, st. dev. 5964
Base engine: mean 846751, st. dev. 22846
Speedup: 1.023
Speed test by Stephane Nicolet
Fishtest STC test:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26040 W: 4675 L: 4442 D: 16923
http://tests.stockfishchess.org/tests/view/57f648990ebc59038170fa03
No functional change.
Don't allow pinned pieces to attack the exchange-square as long all
pinners (this includes also potential ones) are on their original
square.
As soon a pinner moves to the exchange-square or get captured on it, we
fall back to standard SEE behaviour.
This correctly handles the majority of cases with absolute pins.
bench: 6883133
This greately simplifies usage because hides to the
search the implementation specific CheckInfo.
This is based on the work done by Marco in pull request #716,
implementing on top of it the ideas in the discussion: caching
the calls to slider_blockers() in the CheckInfo structure,
and simplifying the slider_blockers() function by removing its
first parameter.
Compared to master, bench is identical but the number of calls
to slider_blockers() during bench goes down from 22461515 to 18853422,
hopefully being a little bit faster overall.
archlinux, gcc-6
make profile-build ARCH=x86-64-bmi2
50 runs each
bench:
base = 2356320 +/- 981
test = 2403811 +/- 981
diff = 47490 +/- 1828
speedup = 0.0202
P(speedup > 0) = 1.0000
perft 6:
base = 175498484 +/- 429925
test = 183997959 +/- 429925
diff = 8499474 +/- 469401
speedup = 0.0484
P(speedup > 0) = 1.0000
perft 7 (but only 10 runs):
base = 185403228 +/- 468705
test = 188777591 +/- 468705
diff = 3374363 +/- 476687
speedup = 0.0182
P(speedup > 0) = 1.0000
$ ./pyshbench ../Stockfish/master ../Stockfish/test 20
run base test diff
...
base = 2501728 +/- 182034
test = 2532997 +/- 182034
diff = 31268 +/- 5116
speedup = 0.0125
P(speedup > 0) = 1.0000
No functional change.
In this position we should have draw for repetition:
position fen rnbqkbnr/2pppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1 moves g1f3 g8f6 f3g1
go infinite
But latest patch broke it.
Actually we had two(!) very subtle bugs, the first is that Position::set()
clears the passed state and in particular 'previous' member, so
that on passing setupStates, 'previous' pointer was reset.
Second bug is even more subtle: SetupStates was based on std::vector
as container, but when vector grows, std::vector copies all its contents
to a new location invalidating all references to its entries. Because
all StateInfo records are linked by 'previous' pointer, this made pointers
go stale upon adding more element to setupStates. So revert to use a
std::deque that ensures references are preserved when pushing back new
elements.
No functional change.
And passed in do_move(), this ensures maximum efficiency and
speed and at the same time unlimited move numbers.
The draw back is that to handle Position init we need to
reserve a StateInfo inside Position itself and use at
init time and when copying from another Position.
After lazy SMP we don't need anymore this gimmick and we can
get rid of this special case and always pass an external
StateInfo to Position object.
Also rewritten and simplified Position constructors.
Verified it does not regress with a 3 threads SMP test:
ELO: -0.00 +-12.7 (95%) LOS: 50.0%
Total: 1000 W: 173 L: 173 D: 654
No functional change.
Instead of creating a running std::thread and
returning, wait in Thread c'tor that the native
thread of execution goes to sleep in idle_loop().
In this way we can simplify how search is started,
because when main thread is idle we are sure also
all other threads will be idle, in any case, even
at thread creation and startup.
After lazy smp went in, we can simpify and rewrite
a lot of logic that is now no more needed. This is
hopefully the final big cleanup.
Tested for no regression at 5+0.1 with 3 threads:
LLR: 2.95 (-2.94,2.94) [-5.00,0.00]
Total: 17411 W: 3248 L: 3198 D: 10965
No functional change.
Use Position::square and Position::squares instead.
This allow us to remove king_square(), simplify
endgames and to have more naming uniformity.
Moreover, this is a prerequisite step in case
in the future we decide to retire piece lists
altoghter and use pop_lsb() to loop across
pieces and serialize the moves. In this way
we just need to change definition of Position::square
to something like:
template<PieceType Pt> inline
Square Position::square(Color c) const {
return lsb(byColorBB[c]);
}
No functional change.
Easier for tuning psq tables:
TUNE(myParameters, PSQT::init);
Also move PSQT code in a new *.cpp file, and retire the
old and hacky psqtab.h that required to be included only
once to work correctly, this is not idiomatic for a header
file.
Give wide visibility to psq tables (previously visible only
in position.cpp), this will easy the use of psq tables outside
Position, for instance in move ordering.
Finally trivial code style fixes of the latest patches.
Original patch of Lucas Braesch.
No functional change.
This micro-optimization only complicates the code and provides no benefit.
Removing it is even a speedup on my machine (i7-3770k, linux, gcc 4.9.1):
stat test master diff
mean 2,403,118 2,390,904 12,214
stdev 12,043 10,620 3,677
speedup 0.51%
P(speedup>0) 100.0%
No functional change.
Also remove useless StateCopySize64 optimization:
compiler uses SSE movups instruction anyhow and
does not need this trick (verified with fishbench).
No functional change.
Import C++11 branch from:
https://github.com/mcostalba/Stockfish/tree/c++11
The version imported is teh last one as of today:
6670e93e50
Branch is fully equivalent with master but syzygy
tablebases that are missing (but will be added with
next commit).
bench: 8080602