In some cases we want to go direcly to the moves loop
without checking for early return. The patch make this
logic more clear and consistent.
Tested for no regression, passed STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25282 W: 5136 L: 5022 D: 15124
and LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 72007 W: 12133 L: 12095 D: 47779
bench: 9316798
This patch replaces RKISS by a simpler and faster PRNG, xorshift64* proposed
by S. Vigna (2014). It is extremely simple, has a large enough period for
Stockfish's needs (2^64), requires no warming-up (allowing such code to be
removed), and offers slightly better randomness than MT19937.
Paper: http://xorshift.di.unimi.it/
Reference source code (public domain):
http://xorshift.di.unimi.it/xorshift64star.c
The patch also simplifies how init_magics() searches for magics:
- Old logic: seed the PRNG always with the same seed,
then use optimized bit rotations to tailor the RNG sequence per rank.
- New logic: seed the PRNG with an optimized seed per rank.
This has two advantages:
1. Less code and less computation to perform during magics search (not ROTL).
2. More choices for random sequence tuning. The old logic only let us choose
from 4096 bit rotation pairs. With the new one, we can look for the best seeds
among 2^64 values. Indeed, the set of seeds[][] provided in the patch reduces
the effort needed to find the magics:
64-bit SF:
Old logic -> 5,783,789 rand64() calls needed to find the magics
New logic -> 4,420,086 calls
32-bit SF:
Old logic -> 2,175,518 calls
New logic -> 1,895,955 calls
In the 64-bit case, init_magics() take 25 ms less to complete (Intel Core i5).
Finally, when playing with strength handicap, non-determinism is achieved
by setting the seed of the static RNG only once. Afterwards, there is no need
to skip output values.
The bench only changes because the Zobrist keys are now different (since they
are random numbers straight out of the PRNG).
The RNG seed has been carefully chosen so that the
resulting Zobrist keys are particularly well-behaved:
1. All triplets of XORed keys are unique, implying that it
would take at least 7 keys to find a 64-bit collision
(test suggested by ceebo)
2. All pairs of XORed keys are unique modulo 2^32
3. The cardinality of { (key1 ^ key2) >> 48 } is as close
as possible to the maximum (65536)
Point 2 aims at ensuring a good distribution among the bits
that determine an TT entry's cluster, likewise point 3
among the bits that form the TT entry's key16 inside a
cluster.
Details:
Bitset card(key1^key2)
------ ---------------
RKISS
key16 64894 = 99.020% of theoretical maximum
low18 180117 = 99.293%
low32 305362 = 99.997%
Xorshift64*, old seed
key16 64918 = 99.057%
low18 179994 = 99.225%
low32 305350 = 99.993%
Xorshift64*, new seed
key16 65027 = 99.223%
low18 181118 = 99.845%
low32 305371 = 100.000%
Bench: 9324905
Resolves#148
Currently Search::RootMoves is accessed and even
modified by TB probing functions in a hidden
and sneaky way.
This is bad practice and makes the code tricky.
Instead explicily pass the vector as function
argument so to clarify that the vector is modified
inside the functions.
No functional change.
Simplified also some logic while there.
TBLargest needs renaming too, but itis for
a future patch because touches also syzygy
directory stuff.
No functional change.
We really don't need to uglify in this way
our nice count() API with this ad-hoc hack.
So remove the hack and use the already
existing infrastructure.
No functional change.
Resolves#134
Updating the makefile so that the clean and gcc-profile-clean targets also
remove the profiling data files in the syzygy directory.
No functional change.
Resolves#136
Just like in Physics, the ratio of 2 things in the same unit, should be
without unit.
Example use case:
- Ratio of a Depth by a Depth (eg. ONE_PLY) should be an int.
- Ratio of a Value by a Value (eg. PawnValueEg) should be an int.
Remove a bunch of useless const while there.
No functional change.
Resolves#128
Adds support for Syzygy tablebases to Stockfish. See
the Readme for information on using the tablebases.
Tablebase support can be enabled/disabled at the Makefile
level as well, by setting syzygy=yes or syzygy=no.
Big/little endian are both supported.
No functional change (if Tablebases are not used).
Resolves#6
It is possible that we won't have a ponder move if our PV
is too short. In that case, just don't print a ponder move.
No functional change
Resolves#130
When evaluating double pawns we use always
lsb() to extract the frontmost square.
This breaks evaluation color symmetry as is
possible to verify with an instrumented evaluate()
Value evaluate(const Position& pos) {
Value v = do_evaluate<false>(pos);
Position p = pos;
p.flip();
assert(v == do_evaluate<false>(p));
return v;
}
Passed no regression test:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 21035 W: 4244 L: 4122 D: 12669
LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 39839 W: 6662 L: 6572 D: 26605
bench: 8255966
It is a non functional change, but because
we now skip copying of pv[] in SpNode, patch
has been tested for regression with 3 threads:
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 54668 W: 9873 L: 9809 D: 34986
No functional change.
This is a regression from 428962a
We have to cast to char here, otherwise the compiler
interprets it as an integer, and writes a number.
No functional change
Resolves#122
This is the first of a patch series to
rearrange and simplify accurate PV.
In this patch there is simple coding
style and reformatting stuff.
Verified with fishtest it does not crash
with MAX_PLY = 8
No functional change.
Previously token would keep its value from the previous line when an empty
line was input, leading to unexpected behaviour.
No functional change
Resolves#119
When comparing to a Depth it is more
consistent to use DEPTH_MAX instead
of a int.
This is a subtle difference because we use
ply and depth almost interchangably in SF,
but they are different. FOr counting plies
makes ense to continue using ints, while
for Depth we have our specific enum.
This cleanly fixes a new Clang 3.5 warning:
No functional change.
This gives SF accurate PVs, such that the evaluation of the leaf node in
the PV matches the score backed up to the root (99% of the time.
q-search will use the value stored in the hash table instead of the eval
value sometimes).
One drawback is that fail-high/low only get a minimal 2 move PV.
It doesn't add any additional overhead to the non-PV codepath except an
extra eight bytes to the SearchStack structure in multi-threaded
searches.
A core part of this is not pruning based on TT score in PV nodes. This
was measured as not being a regression at multiple TCs, except for one
exception, fast TC with huge hash, which is not realistic for longer
searches.
STC - 1 thread, 128 mb hash
ELO: 1.42 +-3.1 (95%) LOS: 81.9%
Total: 20000 W: 4078 L: 3996 D: 11926
STC - 3 thread, 128 mb hash
ELO: -3.60 +-2.9 (95%) LOS: 0.8%
Total: 20000 W: 3575 L: 3782 D: 12643
STC - 3 thread, 8 mb hash
ELO: 0.12 +-2.9 (95%) LOS: 53.3%
Total: 20000 W: 3654 L: 3647 D: 12699
LTC - 3 thread, 32mb hash
ELO: 2.29 +-2.0 (95%) LOS: 98.8%
Total: 35740 W: 5618 L: 5382 D: 24740
Bench: 6984058
Resolves#102
Daniel Jose reported that it was an elo gain in his engine:
http://www.talkchess.com/forum/viewtopic.php?t=54290
STC: Hash=16
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33067 W: 6670 L: 6571 D: 19826
LTC: Hash=64
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41181 W: 7008 L: 6920 D: 27253
And another one to verify no regression with hash pressure:
STC: Hash=4
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 25085 W: 5059 L: 4991 D: 15035
Verified that qsearch does not explode after this patch (recapture threshold).
Bench 7962287
Resolves#112
Real men jump/branch by hand...but
we prefer the humble way.
Moved also some uci info code where it
belongs, while there.
No functional change.
Resolves#110
16MB for 1" searches is more comensurate with the average use case.
And 16 is the default used by stockfish bench, so it makes sense to be
consistent, if only to have the same minimum memory requirement for using
SF and compiling it with PGO.
No functional change.
Resolves#109
Now we can directly replace it with
the definition resulting in simpler
and possibly faster code because
PvNode is evaluated at compile time.
No functional change.
* remove some erroneous comments, that were based on the ONE_PLY == 2.
* rename hd to d, because there's no more half-depth in SF.
* rescope variables into the for loops.
* reindent the for loops correctly.
* add a comment to explain the eval improving part (not so obvious to read
this code as array has 4 dimensions).
No functional change.
This should reduce search inconsistencies, and doesn't seem to have a measurable ELO Impact:
STC with Hash=16
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49264 W: 10076 L: 10007 D: 29181
LTC with Hash=64
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 82149 W: 14044 L: 14023 D: 54082
Plus an extra test, to make sure it doesn't regress with strong hash pressure:
STC with Hash=4
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 21498 W: 4327 L: 4246 D: 12925
Bench: 7302735
Resolves#100
Idea is to apply king safety later in the endgame. Previously, we didn't
apply KS in a RR vs. Q ending for example, which causes poor play.
Now we calculate king attacks when the attacking side has a queen or more.
STC with 8moves_v3
LLR: 3.06 (-2.94,2.94) [0.00,4.00]
Total: 38481 W: 6228 L: 5952 D: 26301
LTC with 2moves_v1
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 51053 W: 8670 L: 8353 D: 34030
Bench: 7514010
Resolves#98
UCI specification is not clear on the size of
integers that are exchanged in the protocol, so
instead of a simple int, assume 'nodes' is a
int64_t because we need a bigger size to store
this value in many real cases, especialy with
very long searches.
No functional change.
Resolves#75