- Cleanups by Alain
- Group king attacks and king defenses
- Signature of futility_move_count()
- Use is_discovery_check_on_king()
- Simplify backward definition
- Use static asserts in move generator
- Factor a statement in move generator
No functional change
This patch changes the base aspiration window size depending on the absolute
value of the previous iteration score, increasing it away from zero. This
stems from the observation that the further away from zero, the more likely
the evaluation is to change significantly with more depth. Conversely, a
tighter aspiration window is more efficient when close to zero.
A beneficial side-effect is that analysis of won positions without a quick
mate is less prone to waste nodes in repeated fail-high that change the eval
by tiny steps.
STC:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 60102 W: 13327 L: 12868 D: 33907
http://tests.stockfishchess.org/tests/view/5d9a70d40ebc5902b6cf39ba
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 155553 W: 25745 L: 25141 D: 104667
http://tests.stockfishchess.org/tests/view/5d9a7ca30ebc5902b6cf4028
Future work : the values used in this patch were only a reasonable guess.
Further testing should unveil more optimal values. However, the aspiration
window is rather tight with a minimum of 21 internal units, so discrete
integers put a practical limitation to such tweaking.
More exotic experiments around the aspiration window parameters could also
be tried, but efficient conditions to adjust the base aspiration window size
or allow it to not be centered on the current evaluation are not obvious.
The aspiration window increases after a fail-high or a fail-low is another
avenue to explore for potential enhancements.
Bench: 4043748
It is always used as a bool, so let's make it a bool straight away.
We can always redefine it as a Piece in a later patch if we want
to use the piece type or the piece color.
No functional change.
Simplification that eliminates ONE_PLY, based on a suggestion in the forum that
support for fractional plies has never been used, and @mcostalba's openness to
the idea of eliminating it. We lose a little bit of type safety by making Depth
an integer, but in return we simplify the code in search.cpp quite significantly.
No functional change
------------------------------------------
The argument favoring eliminating ONE_PLY:
* The term “ONE_PLY” comes up in a lot of forum posts (474 to date)
https://groups.google.com/forum/?fromgroups=#!searchin/fishcooking/ONE_PLY%7Csort:relevance
* There is occasionally a commit that breaks invariance of the code
with respect to ONE_PLY
https://groups.google.com/forum/?fromgroups=#!searchin/fishcooking/ONE_PLY%7Csort:date/fishcooking/ZIPdYj6k0fk/KdNGcPWeBgAJ
* To prevent such commits, there is a Travis CI hack that doubles ONE_PLY
and rechecks bench
* Sustaining ONE_PLY has, alas, not resulted in any improvements to the
engine, despite many individuals testing many experiments over 5 years.
The strongest argument in favor of preserving ONE_PLY comes from @locutus:
“If we use par example ONE_PLY=256 the parameter space is increases by the
factor 256. So it seems very unlikely that the optimal setting is in the
subspace of ONE_PLY=1.”
There is a strong theoretical impediment to fractional depth systems: the
transposition table uses depth to determine when a stored result is good
enough to supply an answer for a current search. If you have fractional
depths, then different pathways to the position can be at fractionally
different depths.
In the end, there are three separate times when a proposal to remove ONE_PLY
was defeated by the suggestion to “give it a few more months.” So… it seems
like time to remove this distraction from the community.
See the pull request here:
https://github.com/official-stockfish/Stockfish/pull/2289
In lazySMP it makes sense to prune a little more, as multiple threads
search wider. We thus increase the prefactor of the reductions slowly
as a function of the threads. The prefactor of the log(threads) term
is a parameter, this pull request uses 1/2 after testing.
passed STC @ 8threads:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 118125 W: 23151 L: 22462 D: 72512
http://tests.stockfishchess.org/tests/view/5d8bbf4d0ebc59509180f217
passed LTC @ 8threads:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 67546 W: 10630 L: 10279 D: 46637
http://tests.stockfishchess.org/tests/view/5d8c463b0ebc5950918167e8
passed ~LTC @ 14threads:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 74271 W: 12421 L: 12040 D: 49810
http://tests.stockfishchess.org/tests/view/5d8db1f50ebc590f3beb24ef
Note:
A larger prefactor (1) passed similar tests at STC and LTC (8 threads),
while a very large one (2) passed STC quickly but failed LTC (8 threads).
For the single-threaded case there is no functional change.
Closes https://github.com/official-stockfish/Stockfish/pull/2337
Bench: 4088701
Fixup: remove redundant code.
A curious feature of Stockfish's current extension code is its repeated
use of "else if." In most cases, this makes no functional difference,
because no more than one extension is applied; once one extension has
been applied, the remaining ones can be safely ignored.
However, if most singular extension search conditions are true, except
"value < singularBeta", no non-singular extensions (e.g., castling) can
be performed!
Three tests were submitted, for three of Stockfish's four non-singular
extensions. I excluded the shuffle extension, because historically there
have been concerns about the fragility of its conditions, and I did not
want to risk causing any serious search problems.
- Modifying the passed pawn extension appeared roughly neutral at STC. At
best, it appeared to be an improvement of less than 1 Elo.
- Modifying check extension performed very poorly at STC
- Modifying castling extension (this patch) produced a long "yellow" run
at STC (insufficient to pass, but positive score) and a strong LTC.
In simple terms, prior to this patch castling extension was occasionally
not applied during search--on castling moves. The effect of this patch is
to perform castling extension on more castling moves. It does so without
adding any code complexity, simply by replacing an "else if" with "if" and
reordering some existing code.
STC:
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 108114 W: 23877 L: 23615 D: 60622
http://tests.stockfishchess.org/tests/view/5d8d86bd0ebc590f3beb0c88
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 20862 W: 3517 L: 3298 D: 14047
http://tests.stockfishchess.org/tests/view/5d8d99cd0ebc590f3beb1899
Bench: 3728191
--------
Where do we go from here?
- It seems strange to me that check extension performed so poorly -- clearly
some of the singular extension conditions are also very important for check
extension. I am not an expert in search, and I do not have any intuition
about which of the eight conditions is/are the culprit. I will try a
succession of eight STC tests to identify the relevant conditions, then try
to replicate this PR for check extension.
- Recent tests interacting with the castle extension may deserve retesting.
I will shortly resubmit a few of my recent castling extension tweaks, rebased
on this PR/commit.
My deepest thanks to @noobpwnftw for the extraordinary CPU donation, and to
all our other fishtest volunteers, who made it possible for a speculative LTC
to pass in 70 minutes!
Closes https://github.com/official-stockfish/Stockfish/pull/2331
Maintain best move counter at the root and allow there only moves which has a counter
of zero for Late Move Reduction. For compensation only the first three moves are excluded
from Late Move Reduction per default instead the first four moves.
What we can further do:
- here we use a simple counting scheme but perhaps some aging to fade out early iterations
could be helpful
- use the best move counter also at inner nodes for LMR and/or pruning
STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 17414 W: 3984 L: 3733 D: 9697
http://tests.stockfishchess.org/tests/view/5d6234bb0ebc5939d09f2aa2
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 38058 W: 6448 L: 6166 D: 25444
http://tests.stockfishchess.org/tests/view/5d62681a0ebc5939d09f2f27
Closes https://github.com/official-stockfish/Stockfish/pull/2282
Bench: 3568210
@Vizvezdenec array suggested that alternate values may be better than current
master (see pull request #2270 ). I tuned some linear equations to more closely
represent his values and it passed. These futility values seem quite sensitive,
so perhaps additional Elo improvements can be found here.
STC
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 12257 W: 2820 L: 2595 D: 6842
http://tests.stockfishchess.org/tests/view/5d5b2f360ebc5925cf1111ac
LTC
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 20273 W: 3497 L: 3264 D: 13512
http://tests.stockfishchess.org/tests/view/5d5c0d250ebc5925cf111ac3
Closes https://github.com/official-stockfish/Stockfish/pull/2272
------------------------------------------
How to continue from there ?
a) we can try a simpler version for the futility margin, this would
be a simplification :
margin = 188 * (depth - improving)
b) on the other direction, we can try a complexification by trying
again to gain Elo with an complete array of futility values.
------------------------------------------
Bench: 4330402
Non functional simplification when we find the passed pawns in pawn.cpp
and some code clean up. It also better follows the pattern "flag the pawn"
and "score the pawn".
-------------------------
The idea behind the third condition for candidate passed pawn is a little
bit difficult to visualize. Just for the record, the idea is the following:
Consider White e5 d4 against black e6. d4 can (in some endgames) push
to d5 and lever e6. Thanks to this sacrifice, or after d5xe6, we consider
e5 as "passed".
However:
- if White e5/d4 against black e6/c6: d4 cannot safely push to d5 since d5 is double attacked;
- if White e5/d4 against black e6/d5: d4 cannot safely push to d5 since it is occupied.
This is exactly what the following expression does:
```
&& (shift<Up>(support) & ~(theirPawns | dblAttackThem)))
```
--------------------------
http://tests.stockfishchess.org/tests/view/5d3325bb0ebc5925cf0e6e91
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 124666 W: 27586 L: 27669 D: 69411
Closes https://github.com/official-stockfish/Stockfish/pull/2255
No functional change
This exploits the recent fractional Skill Level, and is a result from some discussion in #2221 and the older #758.
Basically, if UCI_LimitStrength is set, it will internally convert UCI_Elo to a matching fractional Skill Level.
The Elo estimate is based on games at TC 60+0.6, Hash 64Mb, 8moves_v3.pgn, rated with Ordo, anchored to goldfish1.13 (CCRL 40/4 ~2000).
Note that this is mostly about internal consistency, the anchoring to CCRL is a bit weak, e.g. within this tournament,
goldfish and sungorus only have a 200Elo difference, their rating difference on CCRL is 300Elo.
I propose that we continue to expose 'Skill Level' as an UCI option, for backwards compatibility.
The result of a tournament under those conditions are given by the following table, where the player name reflects the UCI_Elo.
# PLAYER : RATING ERROR POINTS PLAYED (%) CFS(%)
1 Elo2837 : 2792.2 50.8 536.5 711 75 100
2 Elo2745 : 2739.0 49.0 487.5 711 69 100
3 Elo2654 : 2666.4 49.2 418.0 711 59 100
4 Elo2562 : 2604.5 38.5 894.5 1383 65 100
5 Elo2471 : 2515.2 38.1 651.5 924 71 100
6 Elo2380 : 2365.9 35.4 478.5 924 52 100
7 Elo2289 : 2290.0 28.0 864.0 1596 54 100
8 sungorus1.4 : 2204.9 27.8 680.5 1596 43 60
9 Elo2197 : 2201.1 30.1 523.5 924 57 100
10 Elo2106 : 2103.8 24.5 730.5 1428 51 100
11 Elo2014 : 2030.5 30.3 377.5 756 50 98
12 goldfish1.13 : 2000.0 ---- 511.0 1428 36 100
13 Elo1923 : 1928.5 30.9 641.5 1260 51 100
14 Elo1831 : 1829.0 42.1 370.5 756 49 100
15 Elo1740 : 1738.3 42.9 277.5 756 37 100
16 Elo1649 : 1625.0 42.1 525.5 1260 42 100
17 Elo1558 : 1521.5 49.9 298.0 756 39 100
18 Elo1467 : 1471.3 51.3 246.5 756 33 100
19 Elo1375 : 1407.1 51.9 183.0 756 24 ---
It can be observed that all set Elos correspond within the error bars with the observed Ordo rating.
No functional change
This is a functional simplification that removes the std::pow from reduction. The resulting reduction values are within 1% of master.
This is a simplification because i believe an fp addition and multiplication is much faster than a call to std::pow() which is historically slow and performance varies widely on different architectures.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23471 W: 5245 L: 5127 D: 13099
http://tests.stockfishchess.org/tests/view/5d27ac1b0ebc5925cf0d476b
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51533 W: 8736 L: 8665 D: 34132
http://tests.stockfishchess.org/tests/view/5d27b74e0ebc5925cf0d493c
Bench 3765158
The current skill levels (1-20) allow for adjusting playing strengths, but
do so in big steps (e.g. level 10 vs level 11 is a ~143 Elo jump at STC).
Since the 'Skill Level' input can already be a floating point number, this
patch uses the fractional part of the input to provide the user with
fine control, allowing for varying the playing strength essentially
continuously.
The implementation internally still uses integer skill levels (needed since they pick Depths),
but non-deterministically rounds up or down the used skill level such that the average integer
skill corresponds to the input floating point one. As expected, intermediate
(fractional) skill levels yield intermediate playing strenghts.
Tested at STC, playing level 10 against levels between 10 and 11 for 10000 games
level 10.25 ELO: 24.26 +-6.2
level 10.5 ELO: 67.51 +-6.3
level 10.75 ELO: 98.52 +-6.4
level 11 ELO: 143.65 +-6.7
http://tests.stockfishchess.org/tests/view/5cd9c6b40ebc5925cf056791http://tests.stockfishchess.org/tests/view/5cd9d22b0ebc5925cf056989http://tests.stockfishchess.org/tests/view/5cd9cf610ebc5925cf056906http://tests.stockfishchess.org/tests/view/5cd9d2490ebc5925cf05698e
No functional change.
Use comparison of eval with beta to predict potential cutNodes. This
allows multi-cut pruning to also prune possibly mislabeled Pv and NonPv
nodes.
STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 54305 W: 12302 L: 11867 D: 30136
http://tests.stockfishchess.org/tests/view/5d048ba50ebc5925cf0a15e8
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 189512 W: 32620 L: 31904 D: 124988
http://tests.stockfishchess.org/tests/view/5d04bf740ebc5925cf0a17f0
Normally I would think such changes are risky, specially for PvNodes,
but after trying a few other versions, it seems this version is more
sound than I initially thought.
Aside from this, a small funtional change is made to return
singularBeta instead of beta to be more consistent with the fail-soft
logic used in other parts of search.
=============================
How to continue from there ?
We could try to audit other parts of the search where the "cutNode"
variable is used, and try to use dynamic info based on heuristic
eval rather than on this variable, to check if the idea behind this
patch could also be applied successfuly.
Bench: 3503788
Fixes issues #2126 and #2189 where no progress in rootDepth is made for particular fens:
8/8/3P3k/8/1p6/8/1P6/1K3n2 b - - 0 1
8/1r1rp1k1/1b1pPp2/2pP1Pp1/1pP3Pp/pP5P/P5K1/8 w - - 79 46
the cause are the shuffle extensions. Upon closer analysis, it appears that in these cases a shuffle extension is made for every node searched, and progess can not be made. This patch implements a fix, namely to limit the number of extensions relative to the number of nodes searched. The ratio employed is 1/4, which fixes the issues seen so far, but it is a heuristic, and I expect that certain positions might require an even smaller fraction.
The patch was tested as a bug fix and passed:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 56601 W: 12633 L: 12581 D: 31387
http://tests.stockfishchess.org/tests/view/5d02b37a0ebc5925cf09f6da
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 52042 W: 8907 L: 8837 D: 34298
http://tests.stockfishchess.org/tests/view/5d0319420ebc5925cf09fe57
Furthermore, to confirm that the shuffle extension in this form indeed still brings Elo, one more test at VLTC was performed:
VLTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 142022 W: 20963 L: 20435 D: 100624
http://tests.stockfishchess.org/tests/view/5d03630d0ebc5925cf0a011a
Bench: 3961247
Since root_probe() and root_probe_wdl() do not reset all tbRank values if they fail,
it is necessary to do this in rank_root_move(). This fixes issue #2196.
Alternatively, the loop could be moved into both root_probe() and root_probe_wdl().
No functional change
This is a non-functional simplification that combines vote counting and thread selecting in the same loop.
It is possible that the best thread would be updated more frequently than master, but I'm not sure it matters here. Perhaps "mostVotes" is a better name than "bestVote?"
STC (stopped early).
LLR: 0.70 (-2.94,2.94) [-3.00,1.00]
Total: 10714 W: 2329 L: 2311 D: 6074
http://tests.stockfishchess.org/tests/view/5ccc71470ebc5925cf03d244
No functional change.
Treat all threads the same as main thread and increment
failedHighCnt on fail highs. This makes the search try
again at lower depth.
@vondele suggested also changing the reset of failedHighCnt
when there is a fail low. Tests including this passed so the
branch has been updated to include both changes. failedHighCnt
is now handled exactly the same in helper threads and the main
thread. Thanks vondele :-)
STC @ 5+0.05 th 4 :
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 7769 W: 1704 L: 1557 D: 4508
http://tests.stockfishchess.org/tests/view/5c9f19520ebc5925cfffd2a1
LTC @ 20+0.2 th 8 :
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 37888 W: 5983 L: 5889 D: 26016
http://tests.stockfishchess.org/tests/view/5c9f57d10ebc5925cfffd696
Bench 3824325