To verify that the end position from the PV matches the
evaluation score after these changes
We need to temporarily disable the qsearch tt-value refinement,
as that will change the score returned.
Previously token would keep its value from the previous line when an empty
line was input, leading to unexpected behaviour.
No functional change
Resolves#119
When comparing to a Depth it is more
consistent to use DEPTH_MAX instead
of a int.
This is a subtle difference because we use
ply and depth almost interchangably in SF,
but they are different. FOr counting plies
makes ense to continue using ints, while
for Depth we have our specific enum.
This cleanly fixes a new Clang 3.5 warning:
No functional change.
This gives SF accurate PVs, such that the evaluation of the leaf node in
the PV matches the score backed up to the root (99% of the time.
q-search will use the value stored in the hash table instead of the eval
value sometimes).
One drawback is that fail-high/low only get a minimal 2 move PV.
It doesn't add any additional overhead to the non-PV codepath except an
extra eight bytes to the SearchStack structure in multi-threaded
searches.
A core part of this is not pruning based on TT score in PV nodes. This
was measured as not being a regression at multiple TCs, except for one
exception, fast TC with huge hash, which is not realistic for longer
searches.
STC - 1 thread, 128 mb hash
ELO: 1.42 +-3.1 (95%) LOS: 81.9%
Total: 20000 W: 4078 L: 3996 D: 11926
STC - 3 thread, 128 mb hash
ELO: -3.60 +-2.9 (95%) LOS: 0.8%
Total: 20000 W: 3575 L: 3782 D: 12643
STC - 3 thread, 8 mb hash
ELO: 0.12 +-2.9 (95%) LOS: 53.3%
Total: 20000 W: 3654 L: 3647 D: 12699
LTC - 3 thread, 32mb hash
ELO: 2.29 +-2.0 (95%) LOS: 98.8%
Total: 35740 W: 5618 L: 5382 D: 24740
Bench: 6984058
Resolves#102
Daniel Jose reported that it was an elo gain in his engine:
http://www.talkchess.com/forum/viewtopic.php?t=54290
STC: Hash=16
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33067 W: 6670 L: 6571 D: 19826
LTC: Hash=64
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41181 W: 7008 L: 6920 D: 27253
And another one to verify no regression with hash pressure:
STC: Hash=4
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 25085 W: 5059 L: 4991 D: 15035
Verified that qsearch does not explode after this patch (recapture threshold).
Bench 7962287
Resolves#112
Real men jump/branch by hand...but
we prefer the humble way.
Moved also some uci info code where it
belongs, while there.
No functional change.
Resolves#110
16MB for 1" searches is more comensurate with the average use case.
And 16 is the default used by stockfish bench, so it makes sense to be
consistent, if only to have the same minimum memory requirement for using
SF and compiling it with PGO.
No functional change.
Resolves#109
Now we can directly replace it with
the definition resulting in simpler
and possibly faster code because
PvNode is evaluated at compile time.
No functional change.
* remove some erroneous comments, that were based on the ONE_PLY == 2.
* rename hd to d, because there's no more half-depth in SF.
* rescope variables into the for loops.
* reindent the for loops correctly.
* add a comment to explain the eval improving part (not so obvious to read
this code as array has 4 dimensions).
No functional change.
This should reduce search inconsistencies, and doesn't seem to have a measurable ELO Impact:
STC with Hash=16
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 49264 W: 10076 L: 10007 D: 29181
LTC with Hash=64
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 82149 W: 14044 L: 14023 D: 54082
Plus an extra test, to make sure it doesn't regress with strong hash pressure:
STC with Hash=4
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 21498 W: 4327 L: 4246 D: 12925
Bench: 7302735
Resolves#100
Idea is to apply king safety later in the endgame. Previously, we didn't
apply KS in a RR vs. Q ending for example, which causes poor play.
Now we calculate king attacks when the attacking side has a queen or more.
STC with 8moves_v3
LLR: 3.06 (-2.94,2.94) [0.00,4.00]
Total: 38481 W: 6228 L: 5952 D: 26301
LTC with 2moves_v1
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 51053 W: 8670 L: 8353 D: 34030
Bench: 7514010
Resolves#98
UCI specification is not clear on the size of
integers that are exchanged in the protocol, so
instead of a simple int, assume 'nodes' is a
int64_t because we need a bigger size to store
this value in many real cases, especialy with
very long searches.
No functional change.
Resolves#75
Clang 3.5 issues warning on constructs like: abs(f1 - f2). The thing is that
f1 and f2 are enum types, and the range given (all positive) allows the
compiler to choose an unsigned type (efficiency being one reason to prefer
unsigned arithmetic). If f1 < f2 are unsigned, then f1 - f2 wraps around zero
and the abs() becomes a no-op. It's the reinterpretation of the unsigned
result (large value) as a signed int that happens to give the correct result,
thanks to 2's complement. This is all tricky and dangerous!
In the spirit of the standard, we assume nothing on the signedness of enums,
and simply calculate the rank and file distances as:
- rank_dist(r1, r2) = r1 < r2 ? r2 - r1 : r1 - r2
- file_dist(f1, f2) = f1 < f2 ? f2 - f1 : f1 - f2
this logic can in fact be applied to any enum we may use, so for better
generality and to avoid code duplication, we use a template function diff()
here.
No functional change.
Resolves#95
This area has become obscure and tricky over the course of incremental
changes that did not respect the original's consistency and clarity. Now,
it's not clear why we use MAX_PLY = 120, or why we use MAX_PLY+6, among
other things.
This patch does the following:
* ID loop: depth ranges from 1 to MAX_PLY-1, and due to TT constraint (depth
must fit into an int8_t), MAX_PLY should be 128.
* stack[]: plies now range from 0 to MAX_PLY-1, hence stack[MAX_PLY+4],
because of the extra 2+2 padding elements (for ss-2 and ss+2). Document this
better, while we're at it.
* Enforce 0 <= ply < MAX_PLY:
- stop condition is now ss->ply >= MAX_PLY and not ss->ply > MAX_PLY.
- assert(ss->ply < MAX_PLY), before using ss+1 and ss+2.
- as a result, we don't need the artificial MAX_PLY+6 range. Instead we
can use MAX_PLY+4 and it's clear why (for ss-2 and ss+2).
* fix: extract_pv_from_tt() and insert_pv_in_tt() had no reason to use
MAX_PLY_PLUS_6, because the array is indexed by plies, so the range of
available plies applies (0..MAX_PLY before, and now 0..MAX_PLY-1).
Tested with debug compile, using MAX_PLY=16, and running bench at depth 17,
using 1 and 7 threads. No assert() fired. Feel free to submit to more severe
crash-tests, if you can think of any.
No functional change.
SSE4.2 has nothing to do with POPCNT. We must dispell this myth, because
Stockfish is a reference and many will copy this mistake if they see it in Stockfish:
* SSE is an SIMD instruction set, relative to vectorization (using special 128-bit registers).
* POPCNT/LZCNT work on normal registers (eg. AL, AX, EAX, RAX).
The confusion comes from the fact that, in the Intel product line, it just
so happens that SSE4.2 and POPCNT/LZCNT came out at the same time. But this
is not true for AMD. For example, all AMD Pheniom II have SSE3 but no
POPCNT/LZCNT, and that is why the modern compile uses -msse3 -popcnt and not -msse4.2.
No functional change.
Resolves#86
Objects that are only accessible at file-scope should be put in the anonymous namespace.
This is what the C++ standard recommends, rather than using static, which is really C-style and results in static linkage.
Stockfish already does this throughout the code. So let's weed out the few exceptions,
because... they have no reason to be exceptional.
No functional change.
Resolves#84
It is more idiomatic, we didn't used it
in the past because Position::pretty(Move)
had a calling argument, but now we can.
As an added benefit, we avoid a lot of string
copies in the process because now we avoid
std::ostringstream ss.
No functional change.