Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.
No functional change
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 84697 W: 18173 L: 18009 D: 48515
http://tests.stockfishchess.org/tests/view/5bea366f0ebc595e0ae34793
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 157625 W: 25533 L: 24893 D: 107199
http://tests.stockfishchess.org/tests/view/5be8b69e0ebc595e0ae33024
Personally, I feel like SF has been tuned to death recently and that we
need to step away from existing-parameter tunes for a bit and focus more
on new ideas. I don't really think there's much more ELO in these tunes
(for now). For me at least, this was the last existing-parameter tune I'll
be running for quite a while. Cheers!
Bench: 3572567
It does not appear to be not necessary or advantageous to
conditionally initialize kingRing[Us] or kingAttackersCount[Them],
so the 'else' can be removed.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22873 W: 4923 L: 4804 D: 13146
http://tests.stockfishchess.org/tests/view/5be9a8270ebc595e0ae33c7e
No functional change
To top the rating lists and get more interesting middle play, it
is a good habit to set the default contempt to the highest value
that does not regress against contempt=0. We recently decreased
PawnValueEg it is logical that to raise a little bit the default
higher contempt because of the following internal dependency in
line 334 of search.cpp :
````
int ct = int(Options["Contempt"]) * PawnValueEg / 100; // From centipawns
````
STC: contempt=24 passed non-regression vs contempt=0
http://tests.stockfishchess.org/tests/view/5bd6d7f80ebc595e0ae21e14
LTC: contempt=24 passed non-regression LTC vs contempt=0
http://tests.stockfishchess.org/tests/view/5bd6e0980ebc595e0ae21f07
On 2018-11-01, we also tested the effects of contempt=21 and contempt=24
against Stockfish 9, and the net result was neutral:
Contempt 21
ELO: 51.68 +-1.9 (95%) LOS: 100.0%
Total: 40000 W: 9487 L: 3581 D: 26932
http://tests.stockfishchess.org/tests/view/5bdb1a140ebc595e0ae2620a
Contempt 24
ELO: 52.21 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 9759 L: 3793 D: 26448
http://tests.stockfishchess.org/tests/view/5bdb1b680ebc595e0ae2620d
Bench: 3459874
This patch will make possible to free mapped TB files with "ucinewgame" command.
We wrote this patch specifically to address a problem that arose while
running Stockfish with 7-piece tablebases as a kibitzer at TCEC for
extended periods of time across multiple games. It was noted that after
some time, the NPS of the kibitzing Stockfish (which is usually 3x faster
than the Stockfish actually competing) would drop precipitously, eventually
falling to preposterously low numbers until restarted.
Their eval bot basically inputs FEN, go infinite, stop and loop, it probably
didn't do ucinewgame either. As time goes it gradually slowed down and OS
starts to use swap, this is not reasonable since the engine only uses 16GB
hash and the machine has 1TB physical RAM and does nothing else.
Author : noobpwnftw
Closes https://github.com/official-stockfish/Stockfish/pull/1790
No functional change.
We don't need to pass the king square as an explicit parameter to the functions
king_safety() and do_king_safety() since we already pass in the position.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 69686 W: 14894 L: 14866 D: 39926
http://tests.stockfishchess.org/tests/view/5be84ac20ebc595e0ae3283c
No functional change.
We calculate tropism as a sum of two factors. The first is the number of squares in our kingFlank and Camp that are attacked by the enemy; the second is number of these squares that are attacked twice. Prior to this commit, we excluded squares we defended with pawns from this second value, but this appears unnecessary. (Doubly-attacked squares near our king are still dangerous.) The removal of this exclusion is a possible small Elo gain at STC (estimated +1.59) and almost exactly neutral at LTC (estimated +0.04).
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20942 W: 4550 L: 4427 D: 11965
http://tests.stockfishchess.org/tests/view/5be4e0ae0ebc595e0ae308a0
LTC:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 56941 W: 9172 L: 9108 D: 38661
http://tests.stockfishchess.org/tests/view/5be4ec340ebc595e0ae30938
Bench: 3813986
The recently committed Fail-High patch (081af90805)
had a number of changes beyond adjusting the depth of search on fail high, with
some undesirable side effects.
1) Decreasing depth on PV output, confusing GUIs and players alike as described in
issue #1787. The depth printed is anyway a convention, let's consider adjustedDepth
an implementation detail, and continue to print rootDepth. Depth, nodes, time and
move quality all increase as we compute more. (fixing this output has no effect on
play).
2) Fixes go depth output (now based on rootDepth again, no effect on play), also
reported in issue #1787
3) The depth lastBestDepth is used to compute how long a move is stable, a new move
found during fail-high is incorrectly considered stable if based on adjustedDepth
instead of rootDepth (this changes time management). Reverting this passed STC
and LTC:
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 82982 W: 17810 L: 17808 D: 47364
http://tests.stockfishchess.org/tests/view/5bd391a80ebc595e0ae1e993
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 109083 W: 17602 L: 17619 D: 73862
http://tests.stockfishchess.org/tests/view/5bd40c820ebc595e0ae1f1fb
4) In the thread voting scheme, the rank of the fail-high thread is now artificially
low, incorrectly since the quality of the move is much better than what adjustedDepth
suggests (e.g. if it takes 10 iterations to find VALUE_KNOWN_WIN, it has very low
depth). Further evidence comes from a test that showed that the move of highest
depth is not better than that of the last PV (which is potentially of much lower
adjustedDepth).
I.e. this test http://tests.stockfishchess.org/tests/view/5bd37a120ebc595e0ae1e7c3
failed SPRT[0, 5]:
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 10609 W: 2266 L: 2345 D: 5998
In a running 5+0.05 th 8 test (more than 10000 games) a positive Elo estimate is
shown (strong enough for a [-3,1], possibly not [0,4]):
http://tests.stockfishchess.org/tests/view/5bd421be0ebc595e0ae1f315
LLR: -0.13 (-2.94,2.94) [0.00,4.00]
Total: 13644 W: 2573 L: 2532 D: 8539
Elo 1.04 [-2.52,4.61] / LOS 71%
Thus, restore old behavior as a bugfix, keeping the core of the fail-high patch
idea as resolving scheme. This is non-functional for bench, but changes searches
via time management and in the threaded case.
Bench: 3556672
This helps resolving consecutive FH's during aspiration more efficiently
STC:
http://tests.stockfishchess.org/tests/view/5bc857920ebc592439f85765
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 4992 W: 1134 L: 980 D: 2878 Elo +10.72
LTC:
http://tests.stockfishchess.org/tests/view/5bc868050ebc592439f857ef
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 8123 W: 1363 L: 1210 D: 5550 Elo +6.54
No-Regression test with 8 threads, tc=15+0.15:
http://tests.stockfishchess.org/tests/view/5bc874ca0ebc592439f85938
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 24740 W: 3977 L: 3863 D: 16900 Elo +1.60
This was a cooperation between me and Michael Stembera:
-me recognizing SF having problems with resolving FH's efficiently at
high depths, thus starting some tests based on consecutive FH's.
-mstembera picking up the idea with first success at STC & LTC (so full
credits to him!)
-me suggesting how to resolve the issues pinpointed by S.G on PR #1768
and finally restricting the logic to the main thread so that it don't
regresses at multi-thread.
bench: 3314347
Enable numa machinery only for STRICTLY MORE than 8 threads. Reason for this
change is that nowadays SMP tests are always done with 8 threads. That is a
problem for multi-socket Windows machines running on fishtest.
No functional change
The patch adds a small random component (+-1) to VALUE_DRAW for the evaluation
of draw positions (mostly 3folds). This random component is not static, but
potentially different for each visit of the node (hence derived from the node
counter). The effect is that in positions with many 3fold draw lines, different
lines are followed at each iteration. This keeps the search much more dynamic,
as opposed to being locked to one particular 3fold.
An example of a position where master suffers from 3fold-blindness and this patch
solves quickly is the famous TCEC game 53:
FEN: 3r2k1/pr6/1p3q1p/5R2/3P3p/8/5RP1/3Q2K1 b - - 0 51
master doesn't see that this is a lost position (draw eval up to depth 50) as
Qf6-e6 d4-d5 (found by patch at depth 23) leads to a loss.
The 3fold-blindness is more important at longer TC, the patch was yellow STC and
LTC, but passed VLTC:
STC
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 46328 W: 10048 L: 9953 D: 26327
http://tests.stockfishchess.org/tests/view/5b9c0ca20ebc592cf275f7c7
LTC
LLR: -2.95 (-2.94,2.94) [0.00,5.00]
Total: 54663 W: 8938 L: 8846 D: 36879
http://tests.stockfishchess.org/tests/view/5b9ca1610ebc592cf27601d3
VLTC
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 31789 W: 4512 L: 4284 D: 22993
http://tests.stockfishchess.org/tests/view/5b9d1a670ebc592cf276076d
Credit to @crossbr for pointing to this problem repeatedly, and giving the hint
that many draw lines are typical in those situations.
Bench: 4756639
Currently we update (track up) the pv even in the fail high case.
However most times in such cases the pv in the ply below remains unset
because there we have value == alpha and so finally we see truncated
pv's (=just one move) in fail high cases.
Of course tracking down these pv's (+sending them to the gui) comes at a
certian cost, but no-regression tests passed:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16300 W: 3556 L: 3424 D: 9320
http://tests.stockfishchess.org/tests/view/5b9b73500ebc592cf275ea92
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 202411 W: 32734 L: 32897 D: 136780
http://tests.stockfishchess.org/tests/view/5b9baed10ebc592cf275ef6d
N.B.: Digging also into qsearch was tried in another version but seemed
not to pass the tests. This means that we don't always will get a pv
until the very tips.
No functional change
This PR is a combination of two unrelated [0, 4] patches that appeared promising
but not quite strong enough to pass on their own. The combination initially failed
STC with a positive score after a long run, and the subsequent speculative LTC test
passed.
* tweak_threatOnQueen4 :
Increase the middlegame components of ThreatByMinor[QUEEN]
and ThreatByRook[QUEEN] by 15 each. Bryan's (@crossbr) analysis of CCC Bonus Game 10
inspired several tests on penalizing a queen with limited safe mobility. While
attempting to implement this idea, I noticed that when I did not include the queen's
current square in the calculations, the Elo gains seemed to vanish--and only then did
I have the idea to revisit ThreatByMinor[QUEEN] and ThreatByRook[QUEEN], adding a
corresponding value to each. Without Bryan's work, this test would never have been
submitted. I would also like to recognize the efforts and contributions of @SFisGOD,
who also vigorously worked on this idea.
* Use pure static eval for null move pruning :
This idea was directly re-purposed from a promising test by Jerry Donald Watson
(@jerrydonaldwatson) in August. It was also independently developed and tested by
Stefan Geschwentner (@locutus2) previously.
Thank you all!
STC (failed yellow):
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 83913 W: 17986 L: 17825 D: 48102
http://tests.stockfishchess.org/tests/view/5bbc59300ebc592439f76aa5
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 137198 W: 22351 L: 21772 D: 93075
http://tests.stockfishchess.org/tests/view/5bbce35f0ebc592439f77639
Bench: 4312846
Note by snicolet: I use this non-functional change patch
as a pretext to correct the wrong bench number I introduced
in the message of the previous commit.
Bench: 4059356
Storing unconditionally the current generation and bound is equivalent to master.
Part of the condition was added as a speed optimization in #429.
Here the branch is fully eliminated.
passed STC single-threaded:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 73515 W: 16378 L: 16359 D: 40778
http://tests.stockfishchess.org/tests/view/5b2fc38c0ebc5902b2e57fd5
passed STC multi-threaded:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 63725 W: 12916 L: 12874 D: 37935
http://tests.stockfishchess.org/tests/view/5b307b8f0ebc5902b2e5895f
The multithreaded test was run after a plausible suggestion by @mstembera that the effect of this could be larger with many cores. The result seems to indicate this doesn't really matter on the 8core architecture abundantly available on fishtest.
No functional change
a) Reduce PSQT values along the long diagonals on non-central squares
and increase the LongDiagonal bonus accordingly. The effect is to penalise
bishops on the long diagonal which can not "see" the 2 central squares.
The "good" bishops still have more or less the same bonus as current master.
b) For a bishop on a central square, because of the "| s" term in the code,
the LongDiagonalBonus was always given. So while being there, remove the "| s"
and compensate the central Bishop PSQT accordingly.
Passed STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 44498 W: 9658 L: 9323 D: 25517
http://tests.stockfishchess.org/tests/view/5b8992770ebc592cf2748942
Passed LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 63092 W: 10324 L: 9975 D: 42793
http://tests.stockfishchess.org/tests/view/5b89a17a0ebc592cf2748b59
Closes https://github.com/official-stockfish/Stockfish/pull/1760
bench: 4693901
It looks like PawnsOnBothFlanks can be removed from initiative().
A barrage of tests seem to confirm that the adjustment to -110
does not gain elo to offset any potential loss by removing
PawnsOnBothFlanks.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22014 W: 4760 L: 4639 D: 12615
http://tests.stockfishchess.org/tests/view/5b7f50cc0ebc5902bdbb3a3e
LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40561 W: 6667 L: 6577 D: 27317
http://tests.stockfishchess.org/tests/view/5b801f9f0ebc5902bdbb4467
The barrage of 0,4 tests on the -136 value are in my ps_tunetests branch.
http://tests.stockfishchess.org/tests/user/protonspring
Closes https://github.com/official-stockfish/Stockfish/pull/1751
Bench: 4413173
-------------
How to continue from there?
The fact that endgames with all the pawns on only one flank are
drawish is a well-known chess idea, so it seems quite strange that
this can be removed so easily without losing Elo.
In the past there had been attempts to improve on PawnsOnBothFlanks
with similar concepts (for instance using the pawn span value), but
the tests were at best neutral. Maybe Stockfish is now mature enough
that these refined ideas would work to replace PawnsOnBothFlanks?
There is no need to make this as large as 65536 just for the sake of the
single 7-man tablebase that happens to have the key 0xf9247fff. Idea for the
fix by Ronald de Man, who suggested simply to allow more buckets past the end.
We also implement Robin Hood hashing for the hash table, which takes the worst
-case search for full 7-man tablebases down from 68 to 11 probes (Also takes
the average probe length from 2.06 to 2.05). For a table with 8K entries, the
corresponding numbers would be worst-case from 9 to 4, with average from 1.30
to 1.29.
https://github.com/official-stockfish/Stockfish/pull/1747
No functional change
This commit tries to make the new pure static eval code more readable by
splitting up the nested assignments into separate lines and making a few
more cosmetic tweaks.
No functional change.
This is a non-functional change. By pre-incrementing minKingPawnDistance
instead of post-incrementing, we can remove this -1.
This also makes DistanceRing more consistent with the rest of stockfish
since it now holds an actual "distance" instead of a less natural distance-1.
In current master, PseudoAttacks[KING][ksq] == DistanceRingBB[ksq][0]
With this patch, it will be PseudoAttacks[KING][ksq] == DistanceRingBB[ksq][1]
ie squares at distance 1 from the king. This is more natural use of distance.
The current array size DistanceRingBB[SQUARE_NB][8] is still OK with the new
definition, because maximum distance between two squares on a chess board is
seven (for example Kh1 and a8).
No functional change.