Official release version of Stockfish 10.
This is also the 10th anniversary version of the Stockfish project, which
started exactly ten years ago! I wish to extend a huge thank you to
all contributors and authors in our amazing community :-)
Bench: 3939338
The patch was tested for correctness by running bench with and
without the change against current master, and the tablebase hit
numbers were found to be identical in both cases. See the pull
request comments for details:
https://github.com/official-stockfish/Stockfish/pull/1826
No functional change.
On November 16th, before the removal of the depth condition, I tried
revising castling extensions to only handle castling moves, rather than
moves that change castling rights generally. It appeared to be a slight
Elo gain at STC but insufficient to pass [0, 4] (+0.5 Elo), but what I
overlooked was that it made pos.can_castle(us) irrelevant and should
have been a simplification. Recent discussion with @Chess13234 and
Michael Chaly (@Vizvezdenec) inspired me to take a second look, and
the simplification continues to pass when rebased on the current master.
This replaces two conditions with one, because type_of(move) == CASTLING
implies pos.can_castle(Us), allowing us to remove the latter condition.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 110948 W: 24209 L: 24263 D: 62476
http://tests.stockfishchess.org/tests/view/5bf8f65c0ebc5902bced3a63
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 88283 W: 14681 L: 14668 D: 58934
http://tests.stockfishchess.org/tests/view/5bf994a60ebc5902bced4349
Bench: 3939338
When running on a cloud VM (n1-highcpu-96) with several NVMe SSDs and
some non-SSDs for tablebases, I noticed that the average SSD request size was
more than 256 kB. This doesn't make a lot of sense for Syzygy tablebases,
which have a block size of 32 bytes and very low locality.
Seemingly, the tablebase access patterns during probing make the OS,
at least Linux, think that readahead is advantageous; normally, it
gives up doing readahead if there are too many misses, but it doesn't,
perhaps due to the fairly high overall hit rates. (It seems the kernel cannot
distinguish between reading a block that was paged in because the userspace
wanted it explicitly, and one that was read as part of readahead.)
Setting MADV_RANDOM effectively turns off readahead, which causes
the request size to drop to 4 kB. In the aforemented cloud VM test,
this roughly tripled the amount of I/O requests that were able to go
through, while reducing the total traffic from 2.8 GB/sec to 56 MB/sec
(moving the bottleneck to the non-SSDs; it seems the SSDs could have
sustained many more requests).
Closes https://github.com/official-stockfish/Stockfish/pull/1829
No functional change.
Don't do an extra TT update in case of a fail-high,
but simply break off the moves loop and let the TT update
at the end of qsearch do this job.
Same workflow/logic as in our main search function now.
Tested for no regression to be on the safe side.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30237 W: 6665 L: 6560 D: 17012
http://tests.stockfishchess.org/tests/view/5bf928e80ebc5902bced3f3a
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51067 W: 8625 L: 8553 D: 33889
http://tests.stockfishchess.org/tests/view/5bf937180ebc5902bced3fdc
No functional change.
Tropism in kingdanger was simplified away in this pull request #1821.
This patch reintroduces tropism in kingdanger with using quadratic scaling.
Passed STC http://tests.stockfishchess.org/tests/view/5bf7c1b10ebc5902bced1f8f
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 52803 W: 11835 L: 11442 D: 29526
Passed LTC http://tests.stockfishchess.org/tests/view/5bf816e90ebc5902bced24f1
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 17204 W: 2988 L: 2795 D: 11421
How do we continue from there?
I've recently tried to introduce tropism difference term in kingdanger which
passed STC 6 times but failed LTC all the time. Maybe using quadratic scaling
for it will also be helpful.
Bench 4041387
A recent LTC tuning session by @candirufish showed this term decreasing significantly. It appears that it can be removed altogether without significant Elo loss.
I also thank @GuardianRM, whose attempt to remove tropism from king danger inspired this one.
After this PR is merged, my next step will be to attempt to tune the coefficients of this new, simplified kingDanger calculation.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12518 W: 2795 L: 2656 D: 7067
http://tests.stockfishchess.org/tests/view/5befadda0ebc595e0ae3a289
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 164771 W: 26463 L: 26566 D: 111742
http://tests.stockfishchess.org/tests/view/5befcca70ebc595e0ae3a343
LTC 2, rebased on Stockfish 10 beta:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 75226 W: 12563 L: 12529 D: 50134
http://tests.stockfishchess.org/tests/view/5bf2e8910ebc5902bcecb919
Bench: 3412071
Because of aggressive time management and optimistic assumptions
about move overhead, it's still very easy to get Stockfish to forfeit
on time when we hit an endgame and have Syzygy EGTB on a spinning
drive. The latency from serving a few thousand EGTB probes (~10ms each),
of which there can currently be up to 4000 outstanding before a time
check, will easily overwhelm the default Move Overhead of 30ms.
This problem was first raised by Gian-Carlo Pascutto and some solutions
and improvements were discussed in the following pull requests:
https://github.com/official-stockfish/Stockfish/pull/1471https://github.com/official-stockfish/Stockfish/pull/1623https://github.com/official-stockfish/Stockfish/pull/1783
This patch is a minimal change proposed by Marco Costalba to lower
the impact of the bug. We now force a check of the clock right after
each tablebase read.
No functional change.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 51883 W: 11297 L: 10915 D: 29671
http://tests.stockfishchess.org/tests/view/5bf1e2ee0ebc595e0ae3cacd
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 15859 W: 2752 L: 2565 D: 10542
http://tests.stockfishchess.org/tests/view/5bf337980ebc5902bcecbf62
Notes:
(1) The bonus value has not been carefully tested, so it may be possible
to find slightly better values.
(2) Plan is to now try adding similar restriction for pawns. I wanted to
include that as part of this pull request, but I was advised to do it as
two separate pull requests. STC is currently running here, but may not add
enough value to pass green.
Bench: 3679086
Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.
No functional change
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 84697 W: 18173 L: 18009 D: 48515
http://tests.stockfishchess.org/tests/view/5bea366f0ebc595e0ae34793
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 157625 W: 25533 L: 24893 D: 107199
http://tests.stockfishchess.org/tests/view/5be8b69e0ebc595e0ae33024
Personally, I feel like SF has been tuned to death recently and that we
need to step away from existing-parameter tunes for a bit and focus more
on new ideas. I don't really think there's much more ELO in these tunes
(for now). For me at least, this was the last existing-parameter tune I'll
be running for quite a while. Cheers!
Bench: 3572567
It does not appear to be not necessary or advantageous to
conditionally initialize kingRing[Us] or kingAttackersCount[Them],
so the 'else' can be removed.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22873 W: 4923 L: 4804 D: 13146
http://tests.stockfishchess.org/tests/view/5be9a8270ebc595e0ae33c7e
No functional change
To top the rating lists and get more interesting middle play, it
is a good habit to set the default contempt to the highest value
that does not regress against contempt=0. We recently decreased
PawnValueEg it is logical that to raise a little bit the default
higher contempt because of the following internal dependency in
line 334 of search.cpp :
````
int ct = int(Options["Contempt"]) * PawnValueEg / 100; // From centipawns
````
STC: contempt=24 passed non-regression vs contempt=0
http://tests.stockfishchess.org/tests/view/5bd6d7f80ebc595e0ae21e14
LTC: contempt=24 passed non-regression LTC vs contempt=0
http://tests.stockfishchess.org/tests/view/5bd6e0980ebc595e0ae21f07
On 2018-11-01, we also tested the effects of contempt=21 and contempt=24
against Stockfish 9, and the net result was neutral:
Contempt 21
ELO: 51.68 +-1.9 (95%) LOS: 100.0%
Total: 40000 W: 9487 L: 3581 D: 26932
http://tests.stockfishchess.org/tests/view/5bdb1a140ebc595e0ae2620a
Contempt 24
ELO: 52.21 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 9759 L: 3793 D: 26448
http://tests.stockfishchess.org/tests/view/5bdb1b680ebc595e0ae2620d
Bench: 3459874
This patch will make possible to free mapped TB files with "ucinewgame" command.
We wrote this patch specifically to address a problem that arose while
running Stockfish with 7-piece tablebases as a kibitzer at TCEC for
extended periods of time across multiple games. It was noted that after
some time, the NPS of the kibitzing Stockfish (which is usually 3x faster
than the Stockfish actually competing) would drop precipitously, eventually
falling to preposterously low numbers until restarted.
Their eval bot basically inputs FEN, go infinite, stop and loop, it probably
didn't do ucinewgame either. As time goes it gradually slowed down and OS
starts to use swap, this is not reasonable since the engine only uses 16GB
hash and the machine has 1TB physical RAM and does nothing else.
Author : noobpwnftw
Closes https://github.com/official-stockfish/Stockfish/pull/1790
No functional change.
We don't need to pass the king square as an explicit parameter to the functions
king_safety() and do_king_safety() since we already pass in the position.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 69686 W: 14894 L: 14866 D: 39926
http://tests.stockfishchess.org/tests/view/5be84ac20ebc595e0ae3283c
No functional change.
We calculate tropism as a sum of two factors. The first is the number of squares in our kingFlank and Camp that are attacked by the enemy; the second is number of these squares that are attacked twice. Prior to this commit, we excluded squares we defended with pawns from this second value, but this appears unnecessary. (Doubly-attacked squares near our king are still dangerous.) The removal of this exclusion is a possible small Elo gain at STC (estimated +1.59) and almost exactly neutral at LTC (estimated +0.04).
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20942 W: 4550 L: 4427 D: 11965
http://tests.stockfishchess.org/tests/view/5be4e0ae0ebc595e0ae308a0
LTC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 56941 W: 9172 L: 9108 D: 38661
http://tests.stockfishchess.org/tests/view/5be4ec340ebc595e0ae30938
Bench: 3813986
The recently committed Fail-High patch (081af90805)
had a number of changes beyond adjusting the depth of search on fail high, with
some undesirable side effects.
1) Decreasing depth on PV output, confusing GUIs and players alike as described in
issue #1787. The depth printed is anyway a convention, let's consider adjustedDepth
an implementation detail, and continue to print rootDepth. Depth, nodes, time and
move quality all increase as we compute more. (fixing this output has no effect on
play).
2) Fixes go depth output (now based on rootDepth again, no effect on play), also
reported in issue #1787
3) The depth lastBestDepth is used to compute how long a move is stable, a new move
found during fail-high is incorrectly considered stable if based on adjustedDepth
instead of rootDepth (this changes time management). Reverting this passed STC
and LTC:
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 82982 W: 17810 L: 17808 D: 47364
http://tests.stockfishchess.org/tests/view/5bd391a80ebc595e0ae1e993
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 109083 W: 17602 L: 17619 D: 73862
http://tests.stockfishchess.org/tests/view/5bd40c820ebc595e0ae1f1fb
4) In the thread voting scheme, the rank of the fail-high thread is now artificially
low, incorrectly since the quality of the move is much better than what adjustedDepth
suggests (e.g. if it takes 10 iterations to find VALUE_KNOWN_WIN, it has very low
depth). Further evidence comes from a test that showed that the move of highest
depth is not better than that of the last PV (which is potentially of much lower
adjustedDepth).
I.e. this test http://tests.stockfishchess.org/tests/view/5bd37a120ebc595e0ae1e7c3
failed SPRT[0, 5]:
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10609 W: 2266 L: 2345 D: 5998
In a running 5+0.05 th 8 test (more than 10000 games) a positive Elo estimate is
shown (strong enough for a [-3,1], possibly not [0,4]):
http://tests.stockfishchess.org/tests/view/5bd421be0ebc595e0ae1f315
LLR: -0.13 (-2.94,2.94) [0.00,4.00]
Total: 13644 W: 2573 L: 2532 D: 8539
Elo 1.04 [-2.52,4.61] / LOS 71%
Thus, restore old behavior as a bugfix, keeping the core of the fail-high patch
idea as resolving scheme. This is non-functional for bench, but changes searches
via time management and in the threaded case.
Bench: 3556672
This helps resolving consecutive FH's during aspiration more efficiently
STC:
http://tests.stockfishchess.org/tests/view/5bc857920ebc592439f85765
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 4992 W: 1134 L: 980 D: 2878 Elo +10.72
LTC:
http://tests.stockfishchess.org/tests/view/5bc868050ebc592439f857ef
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 8123 W: 1363 L: 1210 D: 5550 Elo +6.54
No-Regression test with 8 threads, tc=15+0.15:
http://tests.stockfishchess.org/tests/view/5bc874ca0ebc592439f85938
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 24740 W: 3977 L: 3863 D: 16900 Elo +1.60
This was a cooperation between me and Michael Stembera:
-me recognizing SF having problems with resolving FH's efficiently at
high depths, thus starting some tests based on consecutive FH's.
-mstembera picking up the idea with first success at STC & LTC (so full
credits to him!)
-me suggesting how to resolve the issues pinpointed by S.G on PR #1768
and finally restricting the logic to the main thread so that it don't
regresses at multi-thread.
bench: 3314347
Enable numa machinery only for STRICTLY MORE than 8 threads. Reason for this
change is that nowadays SMP tests are always done with 8 threads. That is a
problem for multi-socket Windows machines running on fishtest.
No functional change
The patch adds a small random component (+-1) to VALUE_DRAW for the evaluation
of draw positions (mostly 3folds). This random component is not static, but
potentially different for each visit of the node (hence derived from the node
counter). The effect is that in positions with many 3fold draw lines, different
lines are followed at each iteration. This keeps the search much more dynamic,
as opposed to being locked to one particular 3fold.
An example of a position where master suffers from 3fold-blindness and this patch
solves quickly is the famous TCEC game 53:
FEN: 3r2k1/pr6/1p3q1p/5R2/3P3p/8/5RP1/3Q2K1 b - - 0 51
master doesn't see that this is a lost position (draw eval up to depth 50) as
Qf6-e6 d4-d5 (found by patch at depth 23) leads to a loss.
The 3fold-blindness is more important at longer TC, the patch was yellow STC and
LTC, but passed VLTC:
STC
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 46328 W: 10048 L: 9953 D: 26327
http://tests.stockfishchess.org/tests/view/5b9c0ca20ebc592cf275f7c7
LTC
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 54663 W: 8938 L: 8846 D: 36879
http://tests.stockfishchess.org/tests/view/5b9ca1610ebc592cf27601d3
VLTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 31789 W: 4512 L: 4284 D: 22993
http://tests.stockfishchess.org/tests/view/5b9d1a670ebc592cf276076d
Credit to @crossbr for pointing to this problem repeatedly, and giving the hint
that many draw lines are typical in those situations.
Bench: 4756639
Currently we update (track up) the pv even in the fail high case.
However most times in such cases the pv in the ply below remains unset
because there we have value == alpha and so finally we see truncated
pv's (=just one move) in fail high cases.
Of course tracking down these pv's (+sending them to the gui) comes at a
certian cost, but no-regression tests passed:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16300 W: 3556 L: 3424 D: 9320
http://tests.stockfishchess.org/tests/view/5b9b73500ebc592cf275ea92
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 202411 W: 32734 L: 32897 D: 136780
http://tests.stockfishchess.org/tests/view/5b9baed10ebc592cf275ef6d
N.B.: Digging also into qsearch was tried in another version but seemed
not to pass the tests. This means that we don't always will get a pv
until the very tips.
No functional change
This PR is a combination of two unrelated [0, 4] patches that appeared promising
but not quite strong enough to pass on their own. The combination initially failed
STC with a positive score after a long run, and the subsequent speculative LTC test
passed.
* tweak_threatOnQueen4 :
Increase the middlegame components of ThreatByMinor[QUEEN]
and ThreatByRook[QUEEN] by 15 each. Bryan's (@crossbr) analysis of CCC Bonus Game 10
inspired several tests on penalizing a queen with limited safe mobility. While
attempting to implement this idea, I noticed that when I did not include the queen's
current square in the calculations, the Elo gains seemed to vanish--and only then did
I have the idea to revisit ThreatByMinor[QUEEN] and ThreatByRook[QUEEN], adding a
corresponding value to each. Without Bryan's work, this test would never have been
submitted. I would also like to recognize the efforts and contributions of @SFisGOD,
who also vigorously worked on this idea.
* Use pure static eval for null move pruning :
This idea was directly re-purposed from a promising test by Jerry Donald Watson
(@jerrydonaldwatson) in August. It was also independently developed and tested by
Stefan Geschwentner (@locutus2) previously.
Thank you all!
STC (failed yellow):
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 83913 W: 17986 L: 17825 D: 48102
http://tests.stockfishchess.org/tests/view/5bbc59300ebc592439f76aa5
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 137198 W: 22351 L: 21772 D: 93075
http://tests.stockfishchess.org/tests/view/5bbce35f0ebc592439f77639
Bench: 4312846
Note by snicolet: I use this non-functional change patch
as a pretext to correct the wrong bench number I introduced
in the message of the previous commit.
Bench: 4059356
Storing unconditionally the current generation and bound is equivalent to master.
Part of the condition was added as a speed optimization in #429.
Here the branch is fully eliminated.
passed STC single-threaded:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 73515 W: 16378 L: 16359 D: 40778
http://tests.stockfishchess.org/tests/view/5b2fc38c0ebc5902b2e57fd5
passed STC multi-threaded:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 63725 W: 12916 L: 12874 D: 37935
http://tests.stockfishchess.org/tests/view/5b307b8f0ebc5902b2e5895f
The multithreaded test was run after a plausible suggestion by @mstembera that the effect of this could be larger with many cores. The result seems to indicate this doesn't really matter on the 8core architecture abundantly available on fishtest.
No functional change