This exploits the recently added fractional Skill Level, and results from discussion in #2221 and the older #758.
Basically, if UCI_LimitStrength is set, the engine internally converts UCI_Elo to a matching fractional Skill Level.
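A minimal sketch of such a conversion, assuming a power-law fit from Elo to skill level. The function name elo_to_skill and the three fit constants are assumptions for illustration; they are not stated in this message:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: map a UCI_Elo value to a fractional Skill Level in [0, 20].
// The fit constants below are assumptions for illustration only.
double elo_to_skill(double uciElo) {
    const double base  = 1346.6;  // assumed Elo corresponding to skill level 0
    const double scale = 143.4;   // assumed Elo scale of the fit
    const double expo  = 0.806;   // assumed curvature of the fit

    if (uciElo <= base)           // avoid pow() of a negative base
        return 0.0;

    double level = std::pow((uciElo - base) / scale, 1.0 / expo);
    return std::min(20.0, std::max(0.0, level));  // clamp to the valid skill range
}
```

The clamp keeps out-of-range UCI_Elo values on the ends of the Skill Level scale instead of producing an invalid level.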
The Elo estimate is based on games at TC 60+0.6, Hash 64MB, 8moves_v3.pgn, rated with Ordo, anchored to goldfish1.13 (CCRL 40/4 ~2000).
Note that this is mostly about internal consistency; the anchoring to CCRL is a bit weak. For example, within this tournament
goldfish and sungorus differ by only about 200 Elo, whereas their rating difference on CCRL is about 300 Elo.
I propose that we continue to expose 'Skill Level' as a UCI option, for backwards compatibility.
The results of a tournament under those conditions are given in the following table, where the player name reflects the UCI_Elo.
   # PLAYER       :  RATING  ERROR  POINTS  PLAYED   (%)  CFS(%)
   1 Elo2837      :  2792.2   50.8   536.5     711    75     100
   2 Elo2745      :  2739.0   49.0   487.5     711    69     100
   3 Elo2654      :  2666.4   49.2   418.0     711    59     100
   4 Elo2562      :  2604.5   38.5   894.5    1383    65     100
   5 Elo2471      :  2515.2   38.1   651.5     924    71     100
   6 Elo2380      :  2365.9   35.4   478.5     924    52     100
   7 Elo2289      :  2290.0   28.0   864.0    1596    54     100
   8 sungorus1.4  :  2204.9   27.8   680.5    1596    43      60
   9 Elo2197      :  2201.1   30.1   523.5     924    57     100
  10 Elo2106      :  2103.8   24.5   730.5    1428    51     100
  11 Elo2014      :  2030.5   30.3   377.5     756    50      98
  12 goldfish1.13 :  2000.0   ----   511.0    1428    36     100
  13 Elo1923      :  1928.5   30.9   641.5    1260    51     100
  14 Elo1831      :  1829.0   42.1   370.5     756    49     100
  15 Elo1740      :  1738.3   42.9   277.5     756    37     100
  16 Elo1649      :  1625.0   42.1   525.5    1260    42     100
  17 Elo1558      :  1521.5   49.9   298.0     756    39     100
  18 Elo1467      :  1471.3   51.3   246.5     756    33     100
  19 Elo1375      :  1407.1   51.9   183.0     756    24     ---
It can be observed that the set Elos agree with the observed Ordo ratings within, or very nearly within, the error bars (Elo2562 and Elo2471 lie a few Elo outside).
No functional change
This is a functional simplification that removes the std::pow call from the reduction formula. The resulting reduction values are within 1% of master.
It is a simplification because I believe a floating-point addition and multiplication are much faster than a call to std::pow(), which has historically been slow and whose performance varies widely across architectures.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23471 W: 5245 L: 5127 D: 13099
http://tests.stockfishchess.org/tests/view/5d27ac1b0ebc5925cf0d476b
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 51533 W: 8736 L: 8665 D: 34132
http://tests.stockfishchess.org/tests/view/5d27b74e0ebc5925cf0d493c
Bench 3765158
This is another functional simplification to Stockfish passed pawn evaluation.
Stockfish evaluates some pawns which are not yet passed as "candidate" passed pawns, which are given half the bonus of fully passed ones. Prior to this commit, Stockfish considered a passed pawn to be a "candidate" if (a) it would not be a passed pawn if moved one square forward (the blocking square), or (b) there were other pawns (of either color) in front of it on the file. This latter condition used a fairly complicated method, forward_file_bb; here, rather than inspect the entire forward file, we simply re-use the blocking square. As a result, some pawns previously considered "candidates", but which are able to push forward, no longer have their bonus halved.
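The difference between inspecting the whole forward file and inspecting only the blocking square can be sketched as follows. This is a simplified illustration for White pawns only; the helper names forward_file, blocked_old and blocked_new are hypothetical and are not the engine's code:

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

constexpr Bitboard FileABB = 0x0101010101010101ULL;

// Squares are numbered a1 = 0 ... h8 = 63.
constexpr Bitboard square_bb(int sq) { return Bitboard(1) << sq; }

// All squares strictly in front of sq on the same file, from White's point of view.
inline Bitboard forward_file(int sq) {
    int rank = sq / 8;
    Bitboard file  = FileABB << (sq % 8);
    Bitboard above = rank == 7 ? 0 : ~Bitboard(0) << (8 * (rank + 1));
    return file & above;
}

// Old-style test: any pawn anywhere in front of the pawn on its file.
inline bool blocked_old(Bitboard pawns, int sq) {
    return (pawns & forward_file(sq)) != 0;
}

// New-style test: only the blocking square (one step forward) is inspected.
// A pawn never stands on the last rank, so sq + 8 stays on the board.
inline bool blocked_new(Bitboard pawns, int sq) {
    return (pawns & square_bb(sq + 8)) != 0;
}
```

With a pawn on e4 (square 28) and another pawn on e7 (square 52), the old test reports a blocked file while the new test does not, since e5 is free. This is the case in which the bonus is no longer halved.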
Simplification tests passed quickly at both STC and LTC. The results from both tests imply that this simplification is most likely also a small Elo gain, with an LTC likelihood of superiority of 87 percent.
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 12908 W: 2909 L: 2770 D: 7229
http://tests.stockfishchess.org/tests/view/5d2a1c880ebc5925cf0d9006
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20723 W: 3591 L: 3470 D: 13662
http://tests.stockfishchess.org/tests/view/5d2a21fd0ebc5925cf0d9118
Bench: 3377831
Current master code made sense when we had 2 types of bonuses for a protected path to queen. But it was simplified so that we have only one bonus now, and the code was never cleaned up.
This non-functional simplification removes the useless defendedSquares bitboard and one bitboard assignment: defendedSquares &= attackedBy[Us][ALL_PIECES] followed by defendedSquares & blockSq becomes just attackedBy[Us][ALL_PIECES] & blockSq. We also never assign defendedSquares = squaresToQueen, because we don't need it.
So this should be a small non-functional speedup.
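The equivalence of the two forms can be sketched with plain 64-bit integers. The function names are hypothetical, and the sketch assumes that blockSq is one of the squares in squaresToQueen (it is the first square on the pawn's path):

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

// Before: build an intermediate bitboard, then test the blocking square.
inline bool block_defended_old(Bitboard squaresToQueen, Bitboard attackedByUs, Bitboard blockSq) {
    Bitboard defendedSquares = squaresToQueen;
    defendedSquares &= attackedByUs;
    return (defendedSquares & blockSq) != 0;
}

// After: test the blocking square directly against our attacks.
inline bool block_defended_new(Bitboard attackedByUs, Bitboard blockSq) {
    return (attackedByUs & blockSq) != 0;
}
```

Because blockSq is a subset of squaresToQueen, masking with squaresToQueen first cannot change the result of the final test.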
Passed simplification SPRT.
http://tests.stockfishchess.org/tests/view/5d2966ef0ebc5925cf0d7659
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23319 W: 5152 L: 5034 D: 13133
bench 3361902
In Stockfish, both the middlegame and endgame bonus for a passed pawn are calculated as a product of two factors. The first is k, chosen based on the presence of defended and unsafe squares. The second is w, a quadratic function of the pawn's rank. Both are only applied if the pawn's relative rank is at least RANK_4.
It does not appear that the complexity of a quadratic function is necessary for w. Here, we replace it with a simpler linear one, which performs equally at both STC and LTC.
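To illustrate how close a linear weight can stay to a quadratic one over the few ranks involved, here is a sketch comparing two candidate forms. Both formulas and their coefficients are assumptions for illustration; this message does not state the exact expressions:

```cpp
// r is the 0-based relative rank of the pawn (RANK_4 == 3 ... RANK_7 == 6).

// Assumed quadratic form of the rank weight.
constexpr int w_quadratic(int r) { return (r - 2) * (r - 2) + 2; }

// Assumed linear replacement.
constexpr int w_linear(int r) { return 5 * r - 13; }
```

Under these assumed forms, the two weights differ by exactly one unit on each rank from RANK_4 to RANK_7 (3, 6, 11, 18 versus 2, 7, 12, 17), which is consistent with the two performing equally.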
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 46814 W: 10386 L: 10314 D: 26114
http://tests.stockfishchess.org/tests/view/5d29686e0ebc5925cf0d76a1
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 82372 W: 13845 L: 13823 D: 54704
http://tests.stockfishchess.org/tests/view/5d2980650ebc5925cf0d7bfd
Bench: 3328507
We recently added a bonus for double pawn attacks on unsupported enemy pawns,
on June 27. However, it is possible that the unsupported pawn may become a passer
by simply pushing forward out of the double attack. By rewarding double attacks,
we may inadvertently reward the creation of enemy passers, by encouraging both of
our would-be stoppers to attack the enemy pawn even if there is no opposing
friendly pawn on the same file.
Here, we revise this term to exclude passed pawns. In order to simplify the code
with this change included, we non-functionally rewrite Attacked2Unsupported to
be a penalty for enemy attacks on friendly pawns, rather than a bonus for our
attacks on enemy pawns. This allows us to exclude passed pawns with a simple
& ~e->passedPawns[Us], while passedPawns[Them] is not yet defined in this part
of the code.
This dramatically reduces the proportion of positions in which Attacked2Unsupported
is applied, to about a third of the original. To compensate, maintaining the same
average effect across our bench positions, we nearly triple Attacked2Unsupported
from S(0, 20) to S(0, 56). Although this pawn formation is rare, it is worth more
than half a pawn in the endgame!
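The revised term can be sketched as a count of our pawns that are attacked twice by enemy pawns, are not supported by one of our own pawns, and are not passed. The sketch below fixes Us = White, takes the passed-pawn bitboard as an input, and uses hypothetical helper names; it is not the engine's code:

```cpp
#include <bitset>
#include <cstdint>

using Bitboard = std::uint64_t;

constexpr Bitboard FileABB = 0x0101010101010101ULL;
constexpr Bitboard FileHBB = FileABB << 7;

// Squares are numbered a1 = 0 ... h8 = 63.
constexpr Bitboard sq_bb(int sq) { return Bitboard(1) << sq; }

// Squares attacked by White pawns (they capture towards higher ranks).
constexpr Bitboard white_pawn_attacks(Bitboard p) {
    return ((p & ~FileABB) << 7) | ((p & ~FileHBB) << 9);
}

// Squares attacked by two Black pawns at once (they capture towards lower ranks).
constexpr Bitboard black_pawn_double_attacks(Bitboard p) {
    return ((p & ~FileABB) >> 9) & ((p & ~FileHBB) >> 7);
}

// Number of White pawns to which the penalty would apply.
inline int attacked2_unsupported(Bitboard ourPawns, Bitboard theirPawns, Bitboard ourPassed) {
    Bitboard hit = ourPawns
                 & black_pawn_double_attacks(theirPawns)
                 & ~white_pawn_attacks(ourPawns)   // exclude supported pawns
                 & ~ourPassed;                     // exclude passed pawns
    return static_cast<int>(std::bitset<64>(hit).count());
}
```

For example, a White pawn on e4 (square 28) facing Black pawns on d5 and f5 (squares 35 and 37) is counted once, but not if it is marked as passed or if a White pawn on d3 (square 19) supports it.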
STC: (stopped automatically by fishtest after 250,000 games)
LLR: -0.87 (-2.94,2.94) [0.50,4.50]
Total: 250000 W: 56585 L: 55383 D: 138032
http://tests.stockfishchess.org/tests/view/5d25795e0ebc5925cf0cfb51
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 81038 W: 13965 L: 13558 D: 53515
http://tests.stockfishchess.org/tests/view/5d25f3920ebc5925cf0d10dd
Closes https://github.com/official-stockfish/Stockfish/pull/2233
Bench: 3765158
PowerPC has had popcount instructions for a long time, at least as far
back as POWER5 (released 2004). Enable them via a gcc builtin.
Using a gcc builtin has the added bonus that if compiled for a processor
that lacks a hardware instruction, gcc will include a software popcount
implementation that does not use the instruction. It might be slower
than the table lookups (or it might be faster) but it will certainly work.
So this isn't going to break anything.
On my POWER8 VM, this leads to a ~4.27% speedup.
For prefetch, the gcc builtin generates a 'dcbt' instruction, which is
supported at least as far back as the G5 (2002) and POWER4 (2001).
This leads to a ~5% speedup on my POWER8 VM.
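A minimal sketch of wrapping the two gcc builtins mentioned above. The wrapper names popcount64 and prefetch_hint are hypothetical; __builtin_popcountll and __builtin_prefetch are the GCC (and Clang) builtins:

```cpp
#include <cstdint>

// Uses a hardware popcount instruction where the target provides one;
// otherwise the compiler supplies a software implementation.
inline int popcount64(std::uint64_t b) {
    return __builtin_popcountll(b);
}

// A cache hint only: it never changes program results.
inline void prefetch_hint(const void* addr) {
    __builtin_prefetch(addr);
}
```

Since both builtins are resolved by the compiler for the selected target, the same source builds on processors with and without the corresponding instructions.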
No functional change
The current skill levels (1-20) allow for adjusting playing strengths, but
do so in big steps (e.g. level 10 vs level 11 is a ~143 Elo jump at STC).
Since the 'Skill Level' input can already be a floating point number, this
patch uses the fractional part of the input to provide the user with
fine control, allowing for varying the playing strength essentially
continuously.
The implementation internally still uses integer skill levels (needed since they pick Depths),
but non-deterministically rounds up or down the used skill level such that the average integer
skill corresponds to the input floating point one. As expected, intermediate
(fractional) skill levels yield intermediate playing strengths.
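The non-deterministic rounding can be sketched as follows. The function name, the use of std::mt19937 and the 1/1024 resolution are assumptions for illustration, not the engine's actual random number generator:

```cpp
#include <random>

// Hypothetical helper: round a non-negative fractional skill level up or down
// at random, so that the expected integer level equals the fractional input.
inline int pick_int_level(double floatLevel, std::mt19937& rng) {
    int    base = static_cast<int>(floatLevel);  // integer part
    double frac = floatLevel - base;             // fractional part in [0, 1)

    // Round up with probability frac, at a resolution of 1/1024.
    bool roundUp = frac * 1024 > static_cast<double>(rng() % 1024);
    return base + (roundUp ? 1 : 0);
}
```

For an input of 10.25, about a quarter of the calls return 11 and the rest return 10, so the average level over many searches is close to 10.25.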
Tested at STC, playing level 10 against levels between 10 and 11 for 10000 games
level 10.25 ELO: 24.26 +-6.2
level 10.5 ELO: 67.51 +-6.3
level 10.75 ELO: 98.52 +-6.4
level 11 ELO: 143.65 +-6.7
http://tests.stockfishchess.org/tests/view/5cd9c6b40ebc5925cf056791
http://tests.stockfishchess.org/tests/view/5cd9d22b0ebc5925cf056989
http://tests.stockfishchess.org/tests/view/5cd9cf610ebc5925cf056906
http://tests.stockfishchess.org/tests/view/5cd9d2490ebc5925cf05698e
No functional change.
Initialization of larger hash sizes can take some time.
Don't include this time in the bench by resetting the timer after Search::clear().
Also move 'ucinewgame' command down in the list, so that it is processed
after the configuration of Threads and Hash size.
No functional change.
This is a functional change that rewards double attacks on unsupported pawns.
STC (non-functional difference)
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 83276 W: 18981 L: 18398 D: 45897
http://tests.stockfishchess.org/tests/view/5d0970500ebc5925cf0a77d4
LTC (incomplete looping version)
LLR: 0.50 (-2.94,2.94) [0.00,3.50]
Total: 82999 W: 14244 L: 13978 D: 54777
http://tests.stockfishchess.org/tests/view/5d0a8d480ebc5925cf0a8d58
LTC (completed non-looping version).
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 223381 W: 38323 L: 37512 D: 147546
http://tests.stockfishchess.org/tests/view/5d0e80510ebc5925cf0ad320
Closes https://github.com/official-stockfish/Stockfish/pull/2205
Bench 3633546
----------------------------------
Comments by Alain SAVARD:
interesting result ! I would have expected that search would resolve such positions
correctly on the very next move. This is not a very common pattern, and when it happens,
it will quickly disappear. So I'm quite surprised that it passed LTC.
I would be even more surprised if this would resist a simplification.
Anyway, let's try to imagine a few cases.
a) If you have White d5 f5 against Black e6, and White to move
last move by Black was probably a capture on e6 and White is about to recapture on e6
b) If you have White d5 f5 against e6, and Black to move
last move by White was possibly a capture on d5 or f5
or the pawn on e6 was pinned or could not move for some reason.
and white wants to blast open the position and just pushed d4-d5 or f4-f5
Some possible follow-ups
a) Motif is so rare that the popcount() can be safely replaced with a bool()
But this would not pass a SPRT[0,4],
So try a simplification with bool() and also without the & ~theirAttacks
b) If it works, we probably can simply have this in the loop
if (lever) score += S(0, 20);
c) remove all this and tweak something in search for pawn captures (priority, SEE, extension,..)
-removes wideUnsafeSquares bitboard
-removes a couple of bitboard operations
-removes one if operator
-updates comments so they actually represent what this part of code is doing now.
passed non-regression STC
http://tests.stockfishchess.org/tests/view/5d0c1ae50ebc5925cf0aa8db
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 16892 W: 3865 L: 3733 D: 9294
No functional change
Use comparison of eval with beta to predict potential cutNodes. This
allows multi-cut pruning to also prune possibly mislabeled Pv and NonPv
nodes.
STC:
LLR: 2.95 (-2.94,2.94) [0.50,4.50]
Total: 54305 W: 12302 L: 11867 D: 30136
http://tests.stockfishchess.org/tests/view/5d048ba50ebc5925cf0a15e8
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 189512 W: 32620 L: 31904 D: 124988
http://tests.stockfishchess.org/tests/view/5d04bf740ebc5925cf0a17f0
Normally I would think such changes are risky, especially for PvNodes,
but after trying a few other versions, it seems this version is more
sound than I initially thought.
Aside from this, a small functional change is made to return
singularBeta instead of beta to be more consistent with the fail-soft
logic used in other parts of search.
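A hedged sketch of the resulting multi-cut decision. The function name and the exact shape of the condition are assumptions; in particular, the requirement singularBeta >= beta is assumed here and is not stated in this message:

```cpp
// Hypothetical sketch: decide whether multi-cut pruning applies.
// On success, 'value' receives the fail-soft return value.
inline bool multi_cut(int eval, int beta, int singularBeta, int& value) {
    // eval >= beta predicts a cut node from dynamic information rather than
    // relying on the node's cutNode label (assumed form of the condition).
    if (eval >= beta && singularBeta >= beta) {
        value = singularBeta;  // fail-soft: return singularBeta instead of beta
        return true;
    }
    return false;
}
```

Because the test uses eval rather than the node type, it can also fire at nodes labelled Pv or NonPv when the evaluation suggests they will fail high.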
=============================
How to continue from there?
We could try to audit other parts of the search where the "cutNode"
variable is used, and try to use dynamic info based on heuristic
eval rather than on this variable, to check if the idea behind this
patch could also be applied successfully.
Bench: 3503788
Fixes issues #2126 and #2189, where no progress in rootDepth is made for particular FENs:
8/8/3P3k/8/1p6/8/1P6/1K3n2 b - - 0 1
8/1r1rp1k1/1b1pPp2/2pP1Pp1/1pP3Pp/pP5P/P5K1/8 w - - 79 46
The cause is the shuffle extensions. Upon closer analysis, it appears that in these cases a shuffle extension is made for every node searched, so no progress can be made. This patch implements a fix, namely to limit the number of extensions relative to the number of nodes searched. The ratio employed is 1/4, which fixes the issues seen so far, but it is a heuristic, and I expect that certain positions might require an even smaller fraction.
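The limit can be sketched as a simple guard on two counters. The struct, its field names and the function name are hypothetical; only the 1/4 ratio comes from the description above:

```cpp
#include <cstdint>

// Hypothetical per-thread counters.
struct ThreadCounters {
    std::uint64_t nodes       = 0;  // nodes searched so far
    std::uint64_t shuffleExts = 0;  // shuffle extensions granted so far
};

// Allow another shuffle extension only while the extensions granted so far
// stay below one quarter of the nodes searched.
inline bool may_shuffle_extend(const ThreadCounters& t) {
    return t.shuffleExts < t.nodes / 4;
}
```

With such a guard, a position in which every node qualifies for a shuffle extension can extend at only about one node in four, so the remaining nodes allow rootDepth to advance.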
The patch was tested as a bug fix and passed:
STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 56601 W: 12633 L: 12581 D: 31387
http://tests.stockfishchess.org/tests/view/5d02b37a0ebc5925cf09f6da
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 52042 W: 8907 L: 8837 D: 34298
http://tests.stockfishchess.org/tests/view/5d0319420ebc5925cf09fe57
Furthermore, to confirm that the shuffle extension in this form indeed still brings Elo, one more test at VLTC was performed:
VLTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 142022 W: 20963 L: 20435 D: 100624
http://tests.stockfishchess.org/tests/view/5d03630d0ebc5925cf0a011a
Bench: 3961247
Since root_probe() and root_probe_wdl() do not reset all tbRank values if they fail,
it is necessary to do this in rank_root_move(). This fixes issue #2196.
Alternatively, the loop could be moved into both root_probe() and root_probe_wdl().
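The reset can be sketched as follows. The RootMove struct is reduced to the one field of interest, and failing_probe and rank_root_moves_sketch are hypothetical stand-ins, not the tablebase code itself:

```cpp
#include <vector>

struct RootMove {
    int tbRank = 0;
};

// Hypothetical stand-in for a probe that writes some tbRank values
// and then fails, leaving stale data behind.
inline bool failing_probe(std::vector<RootMove>& moves) {
    if (!moves.empty())
        moves[0].tbRank = 900;
    return false;
}

// Sketch of the fix: if probing failed, clear every tbRank in the caller.
inline bool rank_root_moves_sketch(std::vector<RootMove>& moves,
                                   bool (*probe)(std::vector<RootMove>&)) {
    bool rootInTB = probe(moves);
    if (!rootInTB)
        for (auto& m : moves)
            m.tbRank = 0;
    return rootInTB;
}
```

Doing the reset in the caller covers both probe functions with a single loop, which is the variant chosen here over duplicating the loop inside each probe.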
No functional change
This is a non-functional simplification.
backmost_sq and frontmost_sq are redundant. It seems quite clear to always use frontmost_sq and use the correct color.
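The redundancy can be seen in a small sketch: the backmost square for one color is the frontmost square for the other. The loop-based lsb and msb below are simplified stand-ins for the engine's bit-scan routines:

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

enum Color { WHITE, BLACK };
constexpr Color operator~(Color c) { return Color(c ^ BLACK); }

// Index of the least / most significant set bit; b must be non-zero.
inline int lsb(Bitboard b) { int s = 0; while (!(b & 1)) { b >>= 1; ++s; } return s; }
inline int msb(Bitboard b) { int s = 0; while (b >>= 1) ++s; return s; }

// Most advanced square from the given color's point of view.
inline int frontmost_sq(Color c, Bitboard b) { return c == WHITE ? msb(b) : lsb(b); }

// Least advanced square; identical to frontmost_sq(~c, b).
inline int backmost_sq(Color c, Bitboard b) { return c == WHITE ? lsb(b) : msb(b); }
```

Every call to backmost_sq(c, b) can therefore be written as frontmost_sq(~c, b), which is why one of the two functions can be dropped.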
Non functional change.
Increase the size of the pawns table by a factor of 8. This significantly decreases the number of recalculations of pawn structure information (at least at LTC).
I have done measurements for different depths and pawn cache sizes.
First, the number of pawn entry calculations is given (in parentheses: the frequency with which a call to probe triggers a pawn entry calculation). The delta% rows give the percentage reduction in pawn entry calculations compared to master.
VSTC: bench 1 1 12
STC: bench 8 1 16
LTC: bench 64 1 20
VLTC: bench 512 1 24
               VSTC         STC          LTC          VLTC
master         82218(6%)    548935(6%)   2415422(7%)  9548071(7%)
pawncache*2    79859(6%)    492943(5%)   2084794(6%)  8275206(6%)
pawncache*4    78551(6%)    458758(5%)   1827770(5%)  7112531(5%)
pawncache*8    77963(6%)    439421(4%)   1649169(5%)  6128652(4%)
delta%(p2-m)   -2.9%        -10.2%       -13.7%       -13.3%
delta%(p4-m)   -4.5%        -16.4%       -24.3%       -25.5%
delta%(p8-m)   -5.2%        -20.0%       -31.7%       -35.8%
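The delta% rows follow from the counts by a relative difference against master. A small sketch (the function name is hypothetical):

```cpp
// Percentage change in pawn entry calculations relative to master.
// Negative values mean fewer calculations than master.
inline double delta_percent(long long master, long long patched) {
    return 100.0 * static_cast<double>(patched - master) / static_cast<double>(master);
}
```

For example, the LTC column for pawncache*8 gives 100 * (1649169 - 2415422) / 2415422, which rounds to -31.7%.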
STC: (non-regression test because at STC the effect is smaller than at LTC)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22767 W: 5160 L: 5040 D: 12567
http://tests.stockfishchess.org/tests/view/5d00f6040ebc5925cf09c3e2
LTC:
LLR: 2.94 (-2.94,2.94) [0.00,4.00]
Total: 26340 W: 4524 L: 4286 D: 17530
http://tests.stockfishchess.org/tests/view/5d00a3810ebc5925cf09ba16
No functional change.
This is a functional simplification. This is NOT the exact version that was tested. After testing, an assignment was removed and a piece was changed for consistency.
Instead of rewarding ANY square past an opponent pawn as an "outpost," only use squares that are protected by our pawn. I believe this is more consistent with what the chess world calls an "outpost."
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 23540 W: 5387 L: 5269 D: 12884
http://tests.stockfishchess.org/tests/view/5cf51e6d0ebc5925cf08b823
LTC
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 53085 W: 9271 L: 9204 D: 34610
http://tests.stockfishchess.org/tests/view/5cf5279e0ebc5925cf08b992
bench 3424592
Stockfish evaluates passed pawns in part based on a variable k, which shapes the passed pawn bonus based on the number of squares between the current square and promotion square that are attacked by enemy pieces, and the number defended by friendly ones. Prior to this commit, we gave a large bonus when all squares between the pawn and the promotion square were defended, and if they were not, a somewhat smaller bonus if at least the pawn's next square was. However, this distinction does not appear to provide any Elo at STC or LTC.
Where do we go from here? Many promising Elo-gaining patches were attempted in the past few months to refine passed pawn calculation, by altering the definitions of unsafe and defended squares. Stockfish uses these definitions to choose the value of k, so those tests interact with this PR. Therefore, it may be worthwhile to retest previously promising but not-quite-passing tests in the vicinity of this patch.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 42344 W: 9455 L: 9374 D: 23515
http://tests.stockfishchess.org/tests/view/5cf83ede0ebc5925cf0904fb
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 69548 W: 11855 L: 11813 D: 45880
http://tests.stockfishchess.org/tests/view/5cf8698f0ebc5925cf0908c8
Bench: 3854907
This is a non-functional simplification. Since our file_bb handles either Files or Squares, using Square here removes some code. There is unlikely to be any performance difference, despite the test result.
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6081 W: 1444 L: 1291 D: 3346
http://tests.stockfishchess.org/tests/view/5ceb3e2e0ebc5925cf07ab03
Non functional change.