Also the penalty/bonus function is misleading, we
should simply change it to stat_bonus(depth) for
bonus and -stat_bonus(depth+ ONE_PLY) for extra
penalty.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 11656 W: 2183 L: 2008 D: 7465
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 11152 W: 1531 L: 1377 D: 8244
Bench: 6101931
Previously, we had duplicated History:
- one with (piece,to) called History
- one with (from,to) called FromTo
Now that we have only one, rename it to History, which is the generally accepted
name in the chess programming litterature for this technique.
Also correct some comments that had not been updated since the introduction of CMH.
No functional change.
EasyMove is cleared after every iteration of the
search if the 3rd move in the PV of the main thread
changes from the previous iteration. Therefore,
clearing EasyMove during a search iteration may be
excessive. The tests show that this is indeed unnecessary.
In the new version the EasyMove variable is used only in
the Thread::search function.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 47719 W: 8438 L: 8362 D: 30919
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 122841 W: 15448 L: 15457 D: 91936
bench: 5468995
Implement a threefold repetition detection. Below are the examples of
problems fixed by this change.
Loosing move in a drawn position.
position fen 8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1 moves a1a2 a7a8 a2a1
The old code suggested a loosing move "bestmove a8a7", the new code suggests "bestmove a8b7" leading to a draw.
Incorrect evaluation (happened in a real game in TCEC Season 9).
position fen 4rbkr/1q3pp1/b3pn2/7p/1pN5/1P1BBP1P/P1R2QP1/3R2K1 w - - 5 31 moves e3d4 h8h6 d4e3
The old code evaluated it as "cp 0", the new code evaluation is around "cp -50" which is adequate.
Brings 0.5-1 ELO gain. Passes [-3.00,1.00].
STC: http://tests.stockfishchess.org/tests/view/584ece040ebc5903140c5aea
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 47744 W: 8537 L: 8461 D: 30746
LTC: http://tests.stockfishchess.org/tests/view/584f134d0ebc5903140c5b37
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 36775 W: 4739 L: 4639 D: 27397
Patch has been rewritten into current form for simplification and
logic slightly changed so that return a draw score if the position
repeats once earlier but after or at the root, or repeats twice
strictly before the root. In its original form, repetition at root
was not returned as an immediate draw.
After retestimng testing both version with SPRT[-3, 1], both passed
succesfully, but this version was chosen becuase more natural. There is
an argument about MultiPV in which an extended draw at root may be sensible.
See discussion here:
https://github.com/official-stockfish/Stockfish/pull/925
For documentation, current version passed both at STC and LTC:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 51562 W: 9314 L: 9245 D: 33003
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 115663 W: 14904 L: 14906 D: 85853
bench: 5468995
Fixes the only exception, in razoring.
The code already does assert(PvNode || (alpha == beta - 1)), and it can be verified by studying the program flow that this is indeed the case, also for the modified line.
No functional change.
make skipEarlyPruning a search argument instead of managing this by hand.
Verified for no regression at STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 96754 W: 17089 L: 17095 D: 62570
No functional change.
* Refactor bonus and penalty calculation
Compute common terms in a helper function.
No functional change.
* Further refactoring
Remove some parenthesis that are now useless.
Define prevSq once, use repeatedly.
No functional change.
bench: 5884767 (bench of previous patch is wrong)
Makes the actual number of nodes searched match closely
the number of nodes requested, by increasing the frequency
of checking the number of nodes searched at low node count.
All other searches retain the default checking frequency of
once per 4096 nodes, and are thus unaffected.
Passed STC as non-regression
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 26643 W: 4766 L: 4655 D: 17222
No functional change.
In 10 of 12 calls total to Position::do_move()the givesCheck argument is
simply gives_check(m). So it's reasonable to make an overload without
this parameter, which wraps the existing version.
No functional change.
Non functional change, tests under sanitizers OK.
Rationales for change
- Offset in code is in range -4 ... 2
- There was an error by (pathological) corner case MAX_PLY=0
No functional change.
Rewrite the code in SF style, simplify and
document it.
Code is now much clear and bug free (no mem-leaks and
other small issues) and is also smaller (more than
600 lines of code removed).
All the code has been rewritten but root_probe() and
root_probe_wdl() that are completely misplaced and should
be retired altogheter. For now just leave them in the
original version.
Code is fully and deeply tested for equivalency both in
functionality and in speed with hundreds of games and
test positions and is guaranteed to be 100% equivalent
to the original.
Tested with tb_dbg branch for functional equivalency on
more than 12M positions.
stockfish.exe bench 128 1 16 syzygy.epd
Position: 2016/2016
Total 12121156 Hits 0 hit rate (%) 0
Total time (ms) : 4417851
Nodes searched : 1100151204
Nodes/second : 249024
Tested with 5,000 games match against master, 1 Thread,
128 MB Hash each, tc 40+0.4, which is almost equivalent
to LTC in Fishtest on this machine. 3-, 4- and 5-men syzygy
bases on SSD, 12-moves opening book to emphasize mid- and endgame.
Score of SF-SyzygyC++ vs SF-Master: 633 - 617 - 3750 [0.502] 5000
ELO difference: 1
No functional change.
Instead of outputting "info nodes ... time ..." when the last
iteration is interrupted, simply call UCI::pv() to output the PV.
I thought about calling UCI:pv() with bounds -VALUE_INFINITE, VALUE_INFINITE
to avoid "lowerbound" or "upperbound" appearing in it, but I'm not sure that
would be any better.
This patch fixes rare inconsistencies between the first move of
the last PV output and the bestmove played. It also makes sure
that all the latest statistics are sent to the GUI (not only nodes
and time but also nps, tbhits, hashfull).
No functional change.
Use a per-thread counter to reduce contention
with many cores and endgame positions.
Measured around 1% speed-up on a 12 core and 8%
on 28 cores with 6-men, searching on:
7R/1p3k2/2p2P2/3nR1P1/8/3b1P2/7K/r7 b - - 3 38
Also retire the unused set_nodes_searched() and fix
a couple of return types and naming conventions.
No functional change.
We reference (ss-1)->currentMove, i.e. we peek
current move of the parent node, so currentMove
should be valid in the main move loop, when we
search() the subtree, but outside of main loop
it is useless.
No functional change.
Also a speedup since we don't need to recalculate SEE
for extensions...as it already determined to be positive.
Results for 12 tests for each version:
Base Test Diff
Mean 2132395 2191002 -58607
StDev 128058 85917 134239
p-value: 0.669
speedup: 0.027
Non functional change.
Stephane's patch removes the only usage of Position::see, where the
returned value isn't immediately compared with a value. So I replaced
this function by its optimised and more specific version see_ge. This
function also supersedes the function Position::see_sign.
bool Position::see_ge(Move m, Value v) const;
This function tests if the SEE of a move is greater or equal than a
given value. We use forward iteration on captures instread of backward
one, therefore we don't need the swapList array. Also we stop as soon
as we have enough information to obtain the result, avoiding unnecessary
calls to the min_attacker function.
Speed tests (Windows 7), 20 runs for each engine:
Test engine: mean 866648, st. dev. 5964
Base engine: mean 846751, st. dev. 22846
Speedup: 1.023
Speed test by Stephane Nicolet
Fishtest STC test:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26040 W: 4675 L: 4442 D: 16923
http://tests.stockfishchess.org/tests/view/57f648990ebc59038170fa03
No functional change.
This is a bit tricky because we don't want
to prune the only legal evasions, even if
with negative SEE. So add an assert to avoid
this subtle bug to slip in later.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 14140 W: 2625 L: 2421 D: 9094
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 11558 W: 1555 L: 1379 D: 8624
bench: 5256717
Rename shift_bb() to shift(), and DELTA_S to SOUTH, etc.
to improve code readability, especially in evaluate.cpp
when they are used together:
old b = shift_bb<DELTA_S>(pos.pieces(PAWN))
new b = shift<SOUTH>(pos.pieces(PAWN))
While there fix some small code style issues.
No functional change.
Drop useless condition
abs(ttValue) < VALUE_KNOWN_WIN
And extend singular extension search to cases when ttValue
stores a mate score. This improves mate finding and does
not introduce any regression.
Yery tested this patch against current master on the 6500+
Chest mate suite with 200K fixed nodes:
shortest mates found: master: 1206 patch:1205
any mate found: master: 1903 patch: 2003
with 1 sec time:
shortest mates found: master: 2667 patch: 2628
any mate found: master: 3585 patch: 3646
Verified for no regression:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25655 W: 4578 L: 4465 D: 16612
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 66247 W: 8618 L: 8557 D: 49072
bench: 6335042
Both Tablebases::filter_root_moves() and
extract_ponder_from_tt(9 were unable to handle
a mate/stalemate position.
Spotted and reported by Dann Corbit.
Added some mate/stalemate positions to bench so
to early catch this regression in the future.
No functional change.
Rescales the king danger variables in evaluate_king() to
suppress the KingDanger[] array. This avoids the cost of
the memory accesses to the array and simplifies the non-linear
transformation used.
Full credits to "hxim" for the seminal idea and implementation,
see pull request #786.
https://github.com/official-stockfish/Stockfish/pull/786
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 9649 W: 1829 L: 1689 D: 6131
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 53494 W: 7254 L: 7178 D: 39062
Bench: 6116200
Drops a scalability bottleneck due to memory contention
of a single shared table across threads. The effect starts
to be sensible with a high number of threads. Specifically
we have a small regression with 7 threads both at 60 and
180 seconds TC:
10000 @ 60+0.6 th 7
ELO: -2.46 +-3.2 (95%) LOS: 6.5%
Total: 9896 W: 1037 L: 1107 D: 7752
5000 @ 180+0.6 th 7
ELO: -1.95 +-4.1 (95%) LOS: 17.7%
Total: 5000 W: 444 L: 472 D: 4084
We have a regression because counterMoveHistory table is
quite big and it takes time for a single thread to fill it.
Sharing the table yields to a higher fill rate and better
quality of moves and up to 7 threads the benefits of sharing
more then compensate the loss in speed due to contention.
Interestingly even with a 3X longer TC, so with more time
for the single thread to catch up, the improvment is quite
limited and below noise level. It seems we really need much
longer TC to saturate the table.
When we move to high threads number it's another story:
5000 @ 60+0.6 th 22
ELO: 3.49 +-4.3 (95%) LOS: 94.6%
Total: 4880 W: 490 L: 441 D: 3949
2000 @ 60+0.6 th 32
ELO: 8.34 +-6.9 (95%) LOS: 99.1%
Total: 2000 W: 229 L: 181 D: 1590
As expected the speed-up more than compensates the filling
rate, and we expect that with tournament TC, where single
thread is able to saturate the table, the difference will
be even stronger. For instance for TCEC 9 super-final time
control will be 180 minutes + 15 seconds and this scalability
improvement seems definitely the way to go.
So, summarizing:
GOOD:
Measured big improvement in high core scenario
Suitable for TCEC 9 superfinal (big hardware, very long TC)
Consistent and natural patch that extends to counterMoveHistory
what we already do for remaining history tables, that are all per-thread
Non functional change for the common case of a single core
Very simple (just 6 lines modified, no added ones)
BAD:
Small regression (within 2-3 ELO) with few threads and short TC
bench: 5341477