When we reach a position with only two opposite colored bishops and
one pawn on the board, current master would give it a scale factor
of 9/64=0.14 in about one position out of 7200, and a scale factor
of 0.0 in the 7199 others. The patch gives a scale factor of 0.0 in
100% of the cases.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 55845 W: 11467 L: 11410 D: 32968
http://tests.stockfishchess.org/tests/view/5abc585f0ebc5902926cf15e
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11915 W: 1852 L: 1719 D: 8344
http://tests.stockfishchess.org/tests/view/5abc7f750ebc5902926cf18c
We also have exhaustive coverage analysis of this patch effect by
Alain Savard, comparing the perfect evaluation given by the Syzygy
tablebase with the heuristic play after this patch for the set of
all legal positions of the KBPKP endgame with opposite bishops, in
the comments thread for this pull request:
https://github.com/official-stockfish/Stockfish/pull/1520
Alain's conclusion:
> According to this definition and the data, I consider this PR is
> identical to master to "solve for draw" and slightly better than
> master to solve earlier for "wins".
Note: this patch is a side effect of an ongoing effort to improve
the evaluation of positions involving a pair of opposite bishops.
See the GitHub diff of this LTC test which almost passed at sprt[0..5]
for a discussion:
http://tests.stockfishchess.org/tests/view/5ab9030b0ebc5902932cbf93
No functional change (at small bench depths)
Include some not fully supported levers in the (candidate) passed pawns
bitboard, if otherwise unblocked. Maybe levers are usually very short
lived, and some inaccuracy in the lever balance for the definition of
candidate passed pawns just triggers a deeper search.
Here is a example of a case where the patch has an effect on the definition
of candidate passers: White c5/e5 pawns, against Black d6 pawn. Let's say
we want to test if e5 is a candidate passer. The previous master looks
only at files d, e and f (which is already very good) and reject e5 as
a candidate. However, the lever d6 is challenged by 2 pawns, so it should
not fully count. Indirectly, this patch will view such case (and a few more)
to be scored as candidates.
STC
http://tests.stockfishchess.org/tests/view/5abcd55d0ebc5902926cf1e1
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 16492 W: 3419 L: 3198 D: 9875
LTC
http://tests.stockfishchess.org/tests/view/5abce1360ebc5902926cf1e6
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21156 W: 3201 L: 2990 D: 14965
This was inspired by this test of Jerry Donald Watson, except the case of
zero supporting pawns against two levers is excluded, and it seems that
not excluding that case is bad, while excluding is it beneficial. See the
following tests on fishtest:
https://github.com/official-stockfish/Stockfish/pull/1519http://tests.stockfishchess.org/tests/view/5abccd850ebc5902926cf1ddhttp://tests.stockfishchess.org/tests/view/5abcdd490ebc5902926cf1e4
Closes https://github.com/official-stockfish/Stockfish/pull/1521
Bench: 5568461
----
Comments by Jerry Donald Watson:
> My thinking as to why this works:
>
> The evaluation is either called in an interior node or in the qsearch.
> The calls at the end of the qsearch are the more important as they
> ultimately determine the scoring of each move, whereas the internal
> values are mainly used for pruning decisions with a margin. Some strong
> engines don't even call the eval at all nodes. Now the whole point of
> the qsearch is to find quiet positions where captures do not change the
> evaluation of the position with regards to the search bounds - i.e. if
> there were good captures they would be tried.* So when a candidate lever
> appears in the evaluation at the end of the qsearch, the qsearch has
> guaranteed that it cannot just be captured, or if it can, this does not
> take the score past the search bounds. Practically this may mean that
> the side with the candidate lever has the turn, or perhaps the stopping
> lever pawn is pinned, or that side is forced for other reasons to make
> some other move (e.g. d6 can only take one of the pawns in the example
> above).
>
> Hence granting the full score for only one lever defender makes some
> sense, at least, to me.
>
> IMO this is also why huge bonuses for possible captures in the evaluation
> (e.g. threat on queen and our turn), etc. don't tend to work. Such things
> are best left to the search to figure out.
We now use per-thread dynamic contempt. This patch has the following
effects:
* for Threads=1: **non-functional**
* for Threads>1:
* with MultiPV=1: **no regression, little to no ELO gain**
* with MultiPV>1: **clear improvement over master**
First, I tried testing at standard MultiPV=1 play with [0,5] bounds.
This yielded 2 yellow and 1 red test:
5+0.05, Threads=5:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 82689 W: 16439 L: 16190 D: 50060
http://tests.stockfishchess.org/tests/view/5aa93a5a0ebc5902952892e6
5+0.05, Threads=8:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 27164 W: 4974 L: 4983 D: 17207
http://tests.stockfishchess.org/tests/view/5ab2639b0ebc5902a6fbefd5
5+0.5, Threads=16:
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 41396 W: 7127 L: 7082 D: 27187
http://tests.stockfishchess.org/tests/view/5ab124220ebc59029516cb62
Then, I tested with Skill Level=17 (implicitly MutliPV=4), showing
a clear improvement:
5+0.05, Threads=5:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 3498 W: 1316 L: 1135 D: 1047
http://tests.stockfishchess.org/tests/view/5ab4b6580ebc5902932aeca2
Next, I tested the patch with MultiPV=1 again, this time checking for
non-regression ([-3, 1]):
5+0.5, Threads=5:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 65575 W: 12786 L: 12745 D: 40044
http://tests.stockfishchess.org/tests/view/5ab4e8500ebc5902932aecb3
Finally, I ran some tests with fixed number of games, checking if
reverting dynamic contempt gains more elo with Skill Level=17 (i.e.
MultiPV) than applying the "prevScore" fix and this patch. These tests
showed, that this patch gains 15 ELO when playing with Skill Level=17:
5+0.05, Threads=3, "revert dynamic contempt" vs. "WITHOUT this patch":
ELO: -11.43 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 7085 L: 7743 D: 5172
http://tests.stockfishchess.org/tests/view/5ab636450ebc590295d88536
5+0.05, Threads=3, "revert dynamic contempt" vs. "WITH this patch":
ELO: -26.42 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 6661 L: 8179 D: 5160
http://tests.stockfishchess.org/tests/view/5ab62e680ebc590295d88524
---
***FAQ***
**Why should this be commited?**
I believe that the gain for multi-thread MultiPV search is a sufficient
justification for this otherwise neutral change. I also believe this
implementation of dynamic contempt is more logical, although this may
be just my opinion.
**Why is per-thread contempt better at MultiPV?**
A likely explanation for the gain in MultiPV mode is that during
search each thread independently switches between rootMoves and via
the shared contempt score skews each other's evaluation.
**Why were the tests done with Skill Level=17?**
This was originally suggested by @Hanamuke and the idea is that with
Skill Level Stockfish sometimes plays also moves it thinks are slightly
sub-optimal and thus the quality of all moves offered by the MultiPV
search is checked by the test.
**Why are the ELO differences so huge?**
This is most likely because of the nature of Skill Level mode --
since it slower and weaker than normal mode, bugs in evaluation have
much greater effect.
---
Closes https://github.com/official-stockfish/Stockfish/pull/1515.
No functional change -- in single thread mode.
Extends valgrind/sanitizer testing to cover syzygy code.
The script downloads 4 man syzygy as needed. The time needed for the
additional testing is small (in fact hard to see a difference compared
to the large fluctuations in testing time in travis).
Possible follow-ups:
* include more TB sensitive positions in bench.
* include the test script of recent commit "Refactor tbprobe.cpp".
* verify unchanged bench with TB (with a long run).
* make the TB part of the continuation integration tests optional.
Closes https://github.com/official-stockfish/Stockfish/pull/1518
and https://github.com/official-stockfish/Stockfish/pull/1490
No functional change.
Adjust criterion for applying extra reduction if not improving.
We now add an extra ply of reduction if r > 1.0, instead of the
previous condition Reductions[NonPV][imp][d][mc] >= 2.
Why does this work? Previously, reductions when not improving had
a discontinuity as the depth and/or move count increases due to the
Reductions[NonPV][imp][d][mc] >= 2 condition. Hence, values of r
such that 0.5 < r < 1.5 would be mapped to a reduction of 1, while
1.5 < r < 2.5 would be mapped to a reduction of 3. This patch allows
values of r satisfying 1.0 < r < 1.5 to be mapped to a reduction of 2,
making the reduction formula more continuous.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35908 W: 7382 L: 7087 D: 21439
http://tests.stockfishchess.org/tests/view/5aba723a0ebc5902a4743e8f
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 23087 W: 3584 L: 3378 D: 16125
http://tests.stockfishchess.org/tests/view/5aba89070ebc5902a4743ea9
Ideas for future work:
- We could look at retuning the LMR formula.
- We could look at adjusting the reductions in PV nodes if not improving.
Bench: 5326261
Use rootMoves[PVIdx].previousScore instead of bestValue for
dynamic contempt. This is equivalent for MultiPV=1 (bench remained the
same, even for higher depths), but more correct for MultiPV.
STC (MultiPV=3):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 2657 W: 1079 L: 898 D: 680
http://tests.stockfishchess.org/tests/view/5aaa47cb0ebc590297330403
LTC (MultiPV=3):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 2390 W: 874 L: 706 D: 810
http://tests.stockfishchess.org/tests/view/5aaa593a0ebc59029733040b
VLTC 240+2.4 (MultiPV=3):
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 2399 W: 861 L: 694 D: 844
http://tests.stockfishchess.org/tests/view/5aaf983e0ebc5902a182131f
LTC (MultiPV=4, Skill Level=17):
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 747 W: 333 L: 175 D: 239
http://tests.stockfishchess.org/tests/view/5aabccee0ebc5902997ff006
Note: although the ELO differences seem huge, they are inflated by the
nature of Skill Level / MultiPV search, so I don't think they can be
reasonably compared with classic ELO strength.
See https://github.com/official-stockfish/Stockfish/pull/1491 for some
verifications searches with MultiPV = 10 at depths 12 and 24 from the
starting position and the position after 1.e4, comparing the outputs
of the full PV by the old master and by this patch.
No functional change for MultiPV=1
This involves:
* replacing the union hacks with simply reusing the EntryPiece arrays
for the no-pawns case
* merging the PairsData structure with the EntryPiece/-Pawn structs
(with credit to Marco: @mcostalba)
* simplifying some HashTable functions
* thanks to previous changes, removing the ugly memsets
* simplifying the template logic for WDL/DTZ distinction
(now we distinguish based on an enum type, not the entry classes)
* removing the unneeded Atomic wrapper
-----------------------------
For reference, here is a manual way to check that patches concerning
table bases code are non-functional changes:
0) Download the Syzygy table bases (up to 6 men).
1) Make sure you have branches master and the pull request pointing to
the right commits.
2) Download the bench calculation scripts from the following URL:
https://gist.github.com/WOnder93/b5fcf9c989b4a1715684d5c82367cdbe
and copy into src inside your Stockfish repo.
3) Make the scripts executable (chmod +x *.sh).
4) Run the following command to use TBs located at <path>:
export SYZYGY_PATH='<path>'
5) After that, run this (it will take a long time, this is a deep bench):
BENCH_ARGS='128 1 22' ./check_benches.sh master tbprobe_cleanup 2>/dev/null`
==> You should see two equal numbers printed.
(Of course, now we have to trust that the script itself is correct :)
-----------------------------
Closes https://github.com/official-stockfish/Stockfish/pull/1477
No functional change.
This is a patch to fix issue #1498, switching the time management variables
to 64 bits to avoid overflow of time variables after 25 days.
There was a bug in Stockfish 9 causing the output to be wrong after
2^31 milliseconds search. Here is a long run from the starting position:
info depth 64 seldepth 87 multipv 1 score cp 23 nodes 13928920239402
nps 0 tbhits 0 time -504995523 pv g1f3 d7d5 d2d4 g8f6 c2c4 d5c4 e2e3 e7e6 f1c4
c7c5 e1g1 b8c6 d4c5 d8d1 f1d1 f8c5 c4e2 e8g8 a2a3 c5e7 b2b4 f8d8 b1d2 b7b6 c1b2
c8b7 a1c1 a8c8 c1c2 c6e5 d1c1 c8c2 c1c2 e5f3 d2f3 a7a5 b4b5 e7c5 f3d4 d8c8 d4b3
c5d6 c2c8 b7c8 b3d2 c8b7 d2c4 d6c5 e2f3 b7d5 f3d5 e6d5 c4e5 a5a4 e5d3 f6e4 d3c5
e4c5 b2d4 c5e4 d4b6 e4d6 g2g4 d6b5 b6c5 b5c7 g1g2 c7e6 c5d6 g7g6
We check at compile time that the TimePoint type is exactly 64 bits long for
the compiler (TimePoint is our alias in Stockfish for std::chrono::milliseconds
-- it is a signed integer type of at least 45 bits according to the C++ standard,
but will most probably be implemented as a 64 bits signed integer on modern
compilers), and we use this TimePoint type consistently across the code.
Bug report by user "fischerandom" on the TCEC chat (thanks), and the
patch includes code and suggestions by user "WOnder93" and Ronald de Man.
Fixes issue: https://github.com/official-stockfish/Stockfish/issues/1498
Closes pull request: https://github.com/official-stockfish/Stockfish/pull/1510
No functional change.
Make kingRing always eight squares, extending the bitboard to the
F file if the king is on the H file, and to the C file if the king
is on the A file. This may deal with cases where Stockfish (like
many other engines) would shift the king around on the back rank
like g1h1, not because there is some imminent threat, but because
it makes king safety look a little better just because the king ring
had a smaller area.
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 34000 W: 7167 L: 6877 D: 19956
http://tests.stockfishchess.org/tests/view/5ab8216d0ebc5902932cbe64
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22574 W: 3576 L: 3370 D: 15628
http://tests.stockfishchess.org/tests/view/5ab84e6a0ebc5902932cbe72
How to continue from there?
This patch probably makes it easier to tune the king safety evaluation,
because the new regularity of the king ring size will make the king
safety function more continuous.
Closes https://github.com/official-stockfish/Stockfish/pull/1512
Bench: 5934103
Rewrite the MovePicker class using lambda expressions for move filtering.
Includes code style changes by @mcostalba.
Verified for speed, passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43191 W: 9391 L: 9312 D: 24488
http://tests.stockfishchess.org/tests/view/5a99b9df0ebc590297cc8f04
This rewrite of MovePicker.cpp seems to trigger less random crashes on Ryzen
machines than the version in previous master (reported by Bojun Guo).
Closes https://github.com/official-stockfish/Stockfish/pull/1454
No functional change.
To more clearly distinguish them from "const" local variables, this patch
defines compile-time local constants as constexpr. This is consistent with
the definition of PvNode as constexpr in search() and qsearch(). It also
makes the code more robust, since the compiler will now check that those
constants are indeed compile-time constants.
We can go even one step further and define all the evaluation and search
compile-time constants as constexpr.
In generate_castling() I replaced "K" with "step", since K was incorrectly
capitalised (in the Chess960 case).
In timeman.cpp I had to make the non-local constants MaxRatio and StealRatio
constepxr, since otherwise gcc would complain when calculating TMaxRatio and
TStealRatio. (Strangely, I did not have to make Is64Bit constexpr even though
it is used in ucioption.cpp in the calculation of constexpr MaxHashMB.)
I have renamed PieceCount to pieceCount in material.h, since the values of
the array are not compile-time constants.
Some compile-time constants in tbprobe.cpp were overlooked. Sides and MaxFile
are not compile-time constants, so were renamed to sides and maxFile.
Non-functional change.
Implements renaming suggestions by Marco Costalba, Günther Demetz,
Gontran Lemaire, Ronald de Man, Stéphane Nicolet, Alain Savard,
Joost VandeVondele, Jerry Donald Watson, Mike Whiteley, xoto10,
and I hope that I haven't forgotten anybody.
Perpetual renaming thread for suggestions:
https://github.com/official-stockfish/Stockfish/issues/1426
No functional change.
If search depth is less than ONE_PLY call qsearch(), no need to check the
depth condition at various call sites of search().
Passed STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 14568 W: 3011 L: 2877 D: 8680
http://tests.stockfishchess.org/tests/view/5aa846190ebc59029781015b
Also helps gcc to find some optimizations (smaller binary, some speedup).
Thanks to Aram and Stefan for identifying an oversight in an early version.
Closes https://github.com/official-stockfish/Stockfish/pull/1487
No functional change.
The NO_BSF does not cover any real life use-case today. The only compilers that
can compile SF today, with the current Makefile and no source code changes, are
either GCC compatible (define __GNUC__) or MSVC compatible (define _MSC_VER). So
they all support LSB/MSB intrinsics.
This patch simplifies away the software fall-backs of LSB/MSB that were still
in Stockfish code, but unused in any of the officially supported compilers.
Note the (legacy) MSVC/WIN32 case, where we use a 32-bit BSF/BSR solution, as
64-bit intrinsics aren't available there.
Discussed in: https://github.com/official-stockfish/Stockfish/pull/1447
and: https://github.com/official-stockfish/Stockfish/pull/1479
No functional change.
Adjust ProbCut rBeta by whether the score is improving, and also
set improving to false when in check. More precisely, this patch
has two parts:
1) the increased beta threshold for ProbCut is now adjusted based
on whether the score is improving
2) when in check, improving is always set to false.
Co-authored by Joost VandeVondele (@vondele) and Bill Henry (@VoyagerOne).
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 13480 W: 2840 L: 2648 D: 7992
http://tests.stockfishchess.org/tests/view/5aa693fe0ebc59029781004c
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 25895 W: 4099 L: 3880 D: 17916
http://tests.stockfishchess.org/tests/view/5aa6ac940ebc59029781006e
In terms of opportunities for future work opened up by this patch,
the ProbCut rBeta formula could probably be tuned to gain more Elo.
Closes https://github.com/official-stockfish/Stockfish/pull/1485
Bench: 5328254
King and pawn endgames are typically decisive, and a small
advantage is often sufficient to win. Therefore we now take
this into account when computing the initiative adjustment.
This idea came from a series of patches by Gian-Carlo Pascutto.
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 48770 W: 10203 L: 9845 D: 28722
http://tests.stockfishchess.org/tests/view/5aa58cce0ebc59029780ff8d
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22252 W: 3572 L: 3366 D: 15314
http://tests.stockfishchess.org/tests/view/5aa5b27c0ebc59029780ffad
Ideas for future developement:
- There have been a number of changes to the initiative
calculation lately. Perhaps the coefficients could be
tuned again.
- It may be possible to add special knowledge for other
endgames in the initiative calculation.
Closes https://github.com/official-stockfish/Stockfish/pull/1481
Bench: 5750110
"Loose pieces drop, in blitz keep everything protected"
Adding a small S(2,2) bonus for knights, bishops, rooks, and
queens that are "connected" to each other (in the sense that
they are under attack by our own pieces) apparently is a good
thing. It probably helps the pieces work together a bit better.
STC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 12317 W: 2655 L: 2467 D: 7195
http://tests.stockfishchess.org/tests/view/5aa2d86b0ebc590297cb6474
LTC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35725 W: 5516 L: 5263 D: 24946
http://tests.stockfishchess.org/tests/view/5aa2fc6f0ebc590297cb64a8
How to continue from there (by Stefan Geschwentner)?
• First we should identify all other eval terms which have an overlap
with new connectivity bonus (like the outpost bonus). A simple way
would be subtract the connectivity bonus from them and look if this
better, or use a SPSA session for these terms.
• Tuning Connectivity himself with SPSA seems not so promising because
of the small range which is useful. Here manual testing changes of
Connectivity like +-1 seems better.
• The eg value is more important because in endgame the position gets
more open and so attacks on pieces are easier. Another important point
is that when defending/fortress-like positions each defending piece
needs a protection, otherwise attacks on them can break defense.
Closes https://github.com/official-stockfish/Stockfish/pull/1474
Bench: 5318575
Avoid duplicated code after recent commit "Use evaluation trend
to adjust futility margin". We initialize the improving variable
to true in the check case, which allows to avoid redundant code
in the general case.
Tested for speed by snicolet, patch seems about 0.4% faster.
No functional change.
Note: initializing the improving variable to false in the check
case was tested as a functional change, ending yellow in both STC
and LTC. This change is not included in the commit, but it is an
interesting result that could become part of a future patch about
improving or LMR. Reference of the LTC yellow test:
http://tests.stockfishchess.org/tests/view/5aa131560ebc590297cb636e
Allow a potential slider threat from a square currently occupied
by a harmless attacker, just as the recent "knight on queen" patch.
Also from not completely safe squares, use the mobilityArea instead
of excluding all pawns for both SlidersOnQueen and KnightOnQueen
We now compute the potential sliders threat on queen only if opponent
has one queen.
Run as SPRT [0,4] since it is some kind of simplification but maybe
not clearly one.
STC:
http://tests.stockfishchess.org/tests/view/5aa1ddf10ebc590297cb63d8
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 22997 W: 4817 L: 4570 D: 13610
LTC:
http://tests.stockfishchess.org/tests/view/5aa1fe6b0ebc590297cb63e5
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 11926 W: 1891 L: 1705 D: 8330
After this patch is committed, we may try to:
• re-introduce some "threat by queen" bonus to make Stockfish's queen
more aggressive (attacking aspect)
• introduce a concept of "queen overload" to force the opponent queen
into passivity and protecting duties (defensive aspect)
• more generally, re-tune the queen mobility array since patches in the
last three months have affected a lot the location/activity of queens.
Closes https://github.com/official-stockfish/Stockfish/pull/1473
bench: 5788691
We give a S(21,11) bonus for knight threats on the next moves
against enemy queen. The threats are from squares which are
"not strongly protected" and which may be empty, contain enemy
pieces or even one of our piece at the moment (N,B,Q,R) -- hence
be two-steps threats in the later case because we will have to
move our piece and *then* attack the enemy queen with the knight.
STC: http://tests.stockfishchess.org/tests/view/5a9e442e0ebc590297cb6162
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 35129 W: 7346 L: 7052 D: 20731
LTC: http://tests.stockfishchess.org/tests/view/5a9e6e620ebc590297cb617f
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 42442 W: 6695 L: 6414 D: 29333
How to continue from there?
• Trying to refine the threat condition ("not strongly protected")
• Trying the two-steps idea for bishops or rooks threats against queen
Bench: 6051247
Similar removal of superposition code trick as in the
"Simplify tropism computation" patch. This simplification
of the space() function will allow us to specify space
masks which can reach into enemy territory.
passed STC:
LLR: 3.38 (-2.94,2.94) [-3.00,1.00]
Total: 184630 W: 40581 L: 40758 D: 103291
http://tests.stockfishchess.org/tests/view/5a8433360ebc590297cc80c5
passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 231799 W: 37647 L: 37858 D: 156294
http://tests.stockfishchess.org/tests/view/5a96a34a0ebc590297cc8cfd
No functional change.
Add a logarithmic term in the optimism computation, increase
the maximal optimism and lower the contempt offset.
This increases the dynamics of the optimism aspects, giving
a boost for balanced positions without skewing too much on
unbalanced positions (but this version will enter panic mode
faster than previous master when behind, trying to draw faster
when slightly behind). This helps, since optimism is in general
a good thing, for instance at LTC, but too high optimism
rapidly contaminates play.
passed STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 159343 W: 34489 L: 33588 D: 91266
http://tests.stockfishchess.org/tests/view/5a8db9340ebc590297cc85b6
passed LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 47491 W: 7825 L: 7517 D: 32149
http://tests.stockfishchess.org/tests/view/5a9456a80ebc590297cc8a89
It must be mentioned that a version of the PR with contempt 0
did not pass STC [0,5]. The version in the patch, which uses
default contempt 12, was found to be as strong as current master
on different matches against SF7 and SF8, both at STC and LTC.
One drawback maybe is that it raises the draw rate in self-play
from 56% to 59%, giving a little bit less sensitivity for SF
developpers to find evaluation improvements by selfplay tests
in fishtest.
Possible further work:
• tune the values accurately, while keeping in mind the drawrate issue
• check whether it is possible to remove linear and offset term
• try to simplify the S-shape curve
Bench: 5934644
Use a recursive std::array with variadic template
parameters to get rid of the last redundacy.
The first template T parameter is the base type of
the array, the W parameter is the weight applied to
the bonuses when we update values with the << operator,
the D parameter limits the range of updates (range is
[-W * D, W * D]), and the last parameters (Size and
Sizes) encode the dimensions of the array.
This allows greater flexibility because we can now tweak
the range [-W * D, W * D] for each table.
Patch removes more lines than what adds and streamlines
the Stats soup in movepick.h
Closes PR#1422 and PR#1421
No functional change.
The first depth 2 margin triggers the verification quiescence search.
This qsearch() result has to be better then the second lower margin,
so we only skip the razoring when the qsearch gives a significant
improvement.
Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 32133 W: 7395 L: 7101 D: 17637
http://tests.stockfishchess.org/tests/view/5a93198b0ebc590297cc8942
Passed LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 17382 W: 3002 L: 2809 D: 11571
http://tests.stockfishchess.org/tests/view/5a93b18c0ebc590297cc89c2
This Elo-gaining version was further simplified following a suggestion
of Marco Costalba:
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15553 W: 3505 L: 3371 D: 8677
http://tests.stockfishchess.org/tests/view/5a964be90ebc590297cc8cc4
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13253 W: 2270 L: 2137 D: 8846
http://tests.stockfishchess.org/tests/view/5a9658880ebc590297cc8cca
How to continue after this patch?
Reformating the razoring code (step 7 in search()) to unify the
depth 1 and depth 2 treatements seems quite possible, this could
possibly lead to more simplifications.
Bench: 5765806
In pawn structures like white pawns f6,h6 against black pawns f7,g6,h7
the attack on the king is blocked by the own pawns. So decrease the
penalty for king safety.
See diagram and discussion in
https://github.com/official-stockfish/Stockfish/pull/1434
A sample position that this patch wants to avoid is the following
1rr2bk1/3q1p1p/2n1bPpP/pp1pP3/2pP4/P1P1B3/1PBQN1P1/1K3R1R w - - 0 1
White pawn storm on the king side was a disaster, it locked the king
side completely. Therefore, all the king tropism bonus that white have
on the king side are useless, and kingadjacent attacks too. Master
gives White a static +4.5 advantage, but White cannot win that game.
The patch is lowering this evaluation artefact.
STC:
LLR: 2.94 (-2.94,2.94) [0.00,5.00]
Total: 16467 W: 3750 L: 3537 D: 9180
http://tests.stockfishchess.org/tests/view/5a92102d0ebc590297cc87d0
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 64242 W: 11130 L: 10745 D: 42367
http://tests.stockfishchess.org/tests/view/5a923dc80ebc590297cc8806
This version includes reformatting and speed optimization by Alain Savard.
Bench: 5643527
Using a SPSA tuning session to optimize the time management
parameters.
With SPSA tuning it is not always possible to say where improvements
came from. Maybe some variables changed randomly or because result
was not sensitive enough to them. So my explanation of changes will
not be necessarily correct, but here it is.
• When decrease of thinking time was added by Joost a few months ago
if best move has not changed for several plies, one more competing
indicator was introduced for the same purpose along with increase
in score and absence of fail low at root. It seems that tuning put
relatively more importance on that new indicator what allowed to save
time.
• Some of this saved time is distributed proportionally between all
moves and some more time were given to moves when score dropped a lot
or best move changed.
• It looks also that SPSA redistributed more time from the beginning to
later stages of game via other changes in variables - maybe because
contempt made game to last longer or for whatever reason.
All of this is just small tweaks here and there (a few percentages changes).
STC (10+0.1):
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 18970 W: 4268 L: 4029 D: 10673
http://tests.stockfishchess.org/tests/view/5a9291a40ebc590297cc8881
LTC (60+0.6):
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 72027 W: 12263 L: 11878 D: 47886
http://tests.stockfishchess.org/tests/view/5a92d7510ebc590297cc88ef
Additional non-regression tests at other time controls
Sudden death 60s:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 14444 W: 2715 L: 2608 D: 9121
http://tests.stockfishchess.org/tests/view/5a9445850ebc590297cc8a65
40 moves repeating at LTC:
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 10309 W: 1880 L: 1759 D: 6670
http://tests.stockfishchess.org/tests/view/5a9566ec0ebc590297cc8be1
This is a functional patch only for time management, but the bench
does not reflect this because it uses fixed depth search, so the number
of nodes does not change during bench.
No functional change.