Retire current asymmetric king evaluation
and use a much simpler symmetric one.
As a side effect retire the infamous
'Aggressiveness' and 'Cowardice' UCI
options.
Tested in no-regression mode,
passed both STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 33855 W: 5863 L: 5764 D: 22228
And LTC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40571 W: 5852 L: 5760 D: 28959
bench: 8321835
At the root we start counting plies from 1,
while the pv[] array starts from 0. So
the variable 'ply' we use in extract_pv_from_tt
to index pv[] is misnamed: it is
not the real ply, but ply - 1.
The fix is to keep the name 'ply' in extract_pv_from_tt,
but assign it the correct start value and
consequently change all the references to pv[].
In insert_pv_in_tt, instead, it is simpler to rename
the misnamed 'ply' to 'idx'.
The off-by-one bug was exposed when trying to use
'ply' for what it should have been, for instance in
this position:
position fen 8/6R1/8/3k4/8/8/8/2K5 w - - 0 1
at depth 24 the mate line is erroneously truncated because
value_from_tt() uses the wrong ply.
Spotted by Ronald de Man.
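As a rough sketch of the corrected indexing (simplified types; probe_tt_move()
is a hypothetical stand-in for the real TT lookup, not a Stockfish function):

    #include <vector>

    // Hypothetical stand-in for the real TT probe: return the hash move for
    // the position reached at the given ply, or 0 if none is stored.
    int probe_tt_move(int ply) { return ply <= 3 ? 100 + ply : 0; }

    // Sketch of the fixed extract_pv_from_tt(): 'ply' counts from 1, as at
    // the root, so the 0-based pv[] is filled at index ply - 1 (here via
    // push_back), while 'ply' itself is the true ply that value_from_tt()
    // needs.
    void extract_pv_from_tt(std::vector<int>& pv) {
        pv.clear();
        for (int ply = 1, move; (move = probe_tt_move(ply)) != 0; ++ply)
            pv.push_back(move);   // pv[ply - 1] = move
    }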
bench: 8732553
If the razoring conditions are satisfied and
depth is low, then drop directly into qsearch.
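A minimal sketch of the idea; the margin and depth threshold here are
illustrative, not the committed values:

    typedef int Value;
    typedef int Depth;
    const Depth ONE_PLY = 2;

    Value razor_margin(Depth d) { return 512 + 16 * d; }   // illustrative margin

    // True when the node satisfies the razoring conditions at a depth low
    // enough that we can return the qsearch value directly, instead of
    // first doing a reduced-depth verification search.
    bool can_drop_into_qsearch(Value alpha, Value eval, Depth depth) {
        return depth <= ONE_PLY && eval + razor_margin(depth) <= alpha;
    }

In search() this amounts to something like:
if (can_drop_into_qsearch(alpha, eval, depth)) return qsearch(alpha, beta);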
Passed both STC
LLR: 2.98 (-2.94,2.94) [-1.50,4.50]
Total: 12914 W: 2345 L: 2208 D: 8361
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 50600 W: 7548 L: 7230 D: 35822
bench: 8739659
When there are no legal moves after
a search, instead of returning immediately,
save bestValue in the TT as in the usual case.
There is really no reason to special-case
this one.
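Schematically, with simplified types and a hypothetical tt_store() standing
in for the real TT interface, the end of search() now always goes through the
same store:

    typedef int Value;
    const Value VALUE_DRAW = 0;

    void tt_store(Value v);   // hypothetical stand-in for the real TT store

    // Sketch: with no legal moves we no longer return early; we just set
    // bestValue to the mate/stalemate score and fall through to the same
    // TT store as the normal case.
    Value finish_node(int moveCount, bool inCheck, Value bestValue, Value matedScore) {
        if (moveCount == 0)
            bestValue = inCheck ? matedScore : VALUE_DRAW;

        tt_store(bestValue);
        return bestValue;
    }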
With this patch the following position is
fully fixed (again):
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
also in the SMP case.
bench: 8802105
After Joona's last patch there is no measurable
difference between the option set and unset.
Tested by Andreas Strangmüller with 16 threads
on his Dual Opteron 6376.
After 5000 games at 15+0.05 the result is:
1 Stockfish_14050822_T16_on : 3003 5000 (+849,=3396,-755), 50.9 %
2 Stockfish_14050822_T16_off : 2997 5000 (+755,=3396,-849), 49.1 %
bench: 880215
Instead of waiting to be allocated, a thread now
actively searches for another split point to join when
it finishes its own search. Also modify the split conditions.
This patch has been tested with 7 threads SMP and
passed both STC:
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 2885 W: 519 L: 410 D: 1956
And a reduced-LTC at 25+0.05
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 4401 W: 684 L: 566 D: 3151
It was then retested against regression in the 3-thread case
at the standard LTC of 60+0.05:
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 40809 W: 5446 L: 5406 D: 29957
bench: 8802105
On a final fixed-number-of-games test it failed
to prove better than the standard version.
STC 15+0.05
ELO: -0.86 +-1.7 (95%) LOS: 15.8%
Total: 57578 W: 10070 L: 10213 D: 37295
bench: 8802105
If we return from a split with a stale value,
because a stop or an upstream cutoff occurred,
then we exit the moves loop and save a stale value
in the TT before returning from search().
This patch, from Joona, fixes this.
bench: 8678654
We can never have bestValue == -VALUE_INFINITE at
the end of the move loop because, if no legal move exists,
we detect it with the previous condition on !moveCount,
and if a legal move exists we never prune it, due to
the futility pruning condition:
bestValue > VALUE_MATED_IN_MAX_PLY
So this code never executes, as I have also verified
directly.
Issue reported by Joona.
No functional change.
This is a much-discussed patch, with many
arguments for and against. The fact is
it passed both STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16305 W: 3001 L: 2855 D: 10449
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 34273 W: 5180 L: 4931 D: 24162
Although it is true that a correct test should
include foreign engines, we commit it anyhow so
people can test it out in the wild, under broader
conditions.
bench: 7384368
This is more consistent with what other engines are doing.
People often think that SF's scores are overblown. In the
end, it just boils down to the arbitrary way of rescaling them.
No functional change.
Tested directly at LTC because previous long
test series on this topic show it is TC dependent.
Tested in no-regression mode because it gets rid of
an ugly and ad-hoc rule.
Test at LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 67918 W: 10590 L: 10541 D: 46787
bench: 7926803
Thanks to std::bitset we can easily raise
the limit on active threads above 64.
Thanks to Lucas Braesch for pointing to the
correct solution of using std::bitset.
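For illustration, the kind of change involved (the MAX_THREADS value and the
member layout here are only a sketch):

    #include <bitset>

    const int MAX_THREADS = 128;             // no longer capped at 64

    // Sketch: an std::bitset replaces a 64-bit slaves mask, keeping the
    // same set/test/none style of use without the 64-thread limit.
    struct SplitPoint {
        std::bitset<MAX_THREADS> slavesMask;
    };

    void example(SplitPoint& sp, int threadIdx) {
        sp.slavesMask.set(threadIdx);        // mark this thread as helping here
        bool allLeft = sp.slavesMask.none(); // true once every helper has left
        (void)allLeft;
    }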
No functional change.
Split the delta value in the aspiration window so that
a smaller delta is used when the search depth is less
than 24. The idea is that the search is likely to
be more accurate at lower depths, so we can exclude
more possibilities: the delta is 25% smaller, to be exact.
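Roughly, the idea looks like this; the base value is illustrative, not the
tuned number:

    typedef int Value;

    // Aspiration half-window: below depth 24 use a delta that is 25%
    // smaller, on the assumption that shallower searches are more accurate
    // relative to the previous score.
    Value aspiration_delta(int depth) {
        const Value base = 16;                        // illustrative
        return depth < 24 ? base - base / 4 : base;   // 12 vs 16
    }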
Passed STC
LLR: 2.96 (-2.94, 2.94) [-1.50, 4.50]
Total: 20430 W: 3775 L: 3618 D: 13037
And LTC
LLR: 2.96 (-2.94, 2.94) [0.00, 6.00]
Total: 5032 W: 839 L: 715 D: 3478
Bench: 7451319
It was already obsoleted some time ago,
and currently there is no point in changing the eval
score according to whether we are playing a game or analyzing.
So retire the option.
No functional change.
Try to avoid repetition draws in the early midgame;
this should give an edge against weaker opponents
and reduce the draw rate.
Tested for regressions with SPRT[-3, 1] and
passed both short TC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 68498 W: 12928 L: 12891 D: 42679
And long TC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 40212 W: 6386 L: 6295 D: 27531
bench: 7990513
When running the following position:
8/kPp5/2P3p1/p1P1p1P1/2PpPp2/3p1p2/3P1P2/5K2 w - - 0 1
An assert is raised at depth 92:
assert(-VALUE_INFINITE <= alpha && alpha < beta && beta <= VALUE_INFINITE);
This is because it happens that beta = 29832,
so rbeta = 30032, which is > VALUE_INFINITE.
Bug spotted and analyzed by Uri, fix suggested by Joerg.
Other fixes were possible, but this one points
exactly at the source of the bug, so it is the best
from a code documentation point of view.
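One way to express a guard at exactly that spot (a sketch in the spirit of
the fix, not necessarily the committed line):

    #include <algorithm>

    typedef int Value;
    const Value VALUE_INFINITE = 30001;   // the value implied by the numbers above

    // Sketch: clamp rbeta where it is computed, so the window passed to the
    // ProbCut search can never exceed VALUE_INFINITE.
    Value probcut_rbeta(Value beta) {
        return std::min(beta + 200, VALUE_INFINITE);
    }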
bench: 8430785
Actually MultiCut is too different from our current scheme.
Note that ProbCut is not exactly what we do either, because
we try just a handful of captures instead of all moves;
nevertheless it seems more in line with what we do.
Suggested by Joona.
No functional change.
This makes more sense than returning a draw score. Tested
with a reduced MAX_PLY = 30 and passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 17434 W: 3345 L: 3194 D: 10895
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 2610 W: 488 L: 373 D: 1749
With the current limit of MAX_PLY = 100 the patch should not
introduce any measurable change; nevertheless it is the correct
approach.
Idea of returning eval is from Michel Van den Bergh.
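Schematically, with stand-in declarations (the in-check fallback to a draw
score here is an assumption of this sketch):

    typedef int Value;
    const int   MAX_PLY    = 100;
    const Value VALUE_DRAW = 0;

    Value evaluate();   // static evaluation, assumed defined elsewhere

    // Sketch of the early-out near the top of search(): once the MAX_PLY
    // limit is reached, return the static eval rather than an arbitrary
    // draw score (falling back to a draw only when in check, where no
    // reliable static eval exists).
    bool  hit_ply_limit(int ply)        { return ply >= MAX_PLY; }
    Value ply_limit_value(bool inCheck) { return inCheck ? VALUE_DRAW : evaluate(); }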
bench: 8430785
Although it does not change the ELO level, it seems
verification search is useful in many zugzwang positions,
as reported by many sources.
So revert this simplification.
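For reference, the shape of the reinstated verification step (a sketch with
stand-in declarations and an illustrative depth threshold, not the exact
committed code):

    typedef int Value;
    typedef int Depth;
    const Depth ONE_PLY = 2;

    Value search_non_pv(Value alpha, Value beta, Depth depth);  // assumed defined elsewhere

    // Sketch: after a null-move search fails high, trust it directly only
    // at low depth; at high depth re-search at reduced depth without the
    // null move and accept the cutoff only if that verification also fails
    // high. This is what protects zugzwang positions.
    bool null_cutoff_verified(Value beta, Depth depth, Depth R) {
        if (depth < 12 * ONE_PLY)                        // illustrative threshold
            return true;
        return search_non_pv(beta - 1, beta, depth - R) >= beta;
    }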
bench: 8430785
Tested with SPRT in simplification mode [-4.00, 0.00];
this ensures that the patch is (very probably) not
a regression.
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 27543 W: 4278 L: 4209 D: 19056
And long TC
LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 39483 W: 7325 L: 7305 D: 24853
bench: 8347121
Hopefully this patch makes the code more:
* Self-documenting: Null search is always a zero window search,
because it is testing for a fail high. It should never be done
on a full window! The current code only works because we don't
do it at PV nodes, and therefore (alpha, beta) = (beta-1, beta):
that's the kind of "clever" trick we should avoid.
* Idiot-proof: If we want to enable null search at PV nodes, all we
need to do now is comment out the !PvNode condition. It's that simple!
In theory, null search should not be done at PV nodes, because PV nodes
should never fail high. But in practice, they DO fail high, because of
aspiration windows, and search inconsistencies, for example. So it makes
sense to keep that flexibility in the code.
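Concretely, the null-move probe is now written as an explicit zero-window
search over beta (a sketch with a stand-in search declaration):

    typedef int Value;
    typedef int Depth;

    Value search_non_pv(Value alpha, Value beta, Depth depth);  // assumed defined elsewhere

    // Sketch: the null search only asks "can we fail high over beta?", so
    // it uses the zero window (beta-1, beta) explicitly, instead of relying
    // on alpha happening to equal beta-1 at non-PV nodes.
    Value null_search(Value beta, Depth depth, Depth R) {
        return -search_non_pv(-beta, -beta + 1, depth - R);
    }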
No functional change.
Use ralpha instead of rbeta
* rbeta is confusing people. It took THREE attempts to code razoring
at PV nodes correctly in a recent test, because of the rbeta trick.
Unnecessary tricks should be avoided.
* The more correct and self-documenting way of doing this, is to say
that we use a zero window around alpha-margin, not beta-margin.
The fact that, because we only do it at non-PV nodes, alpha happens to be
beta-1 and that the current stuff with rbeta works, may be correct,
but is confusing.
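In code terms, the probe becomes a zero window around ralpha = alpha -
razor_margin(depth) (a sketch with stand-in declarations):

    typedef int Value;
    typedef int Depth;

    Value qsearch_non_pv(Value alpha, Value beta);   // zero-window qsearch, assumed defined elsewhere
    Value razor_margin(Depth d);                     // assumed defined elsewhere

    // Sketch: probe with a zero window just above alpha - margin; if even
    // that cannot get above ralpha, fail low with the returned value.
    Value razor_probe(Value alpha, Depth depth, bool& failedLow) {
        Value ralpha = alpha - razor_margin(depth);
        Value v = qsearch_non_pv(ralpha, ralpha + 1);
        failedLow = (v <= ralpha);
        return v;
    }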
Remove the misleading and partially erroneous comment about returning
v + margin:
* comments should explain what the code does, not what it could have done.
* this comment is partially wrong in saying that v+margin is "logical",
and that it is "surprising" that is doesn't work.
From a theoretical perspective, at least 3 ways of doing this are equally
defendable:
1/ fail hard: return alpha: The most conservative. We bet that the search
will fail low, but we don't know by how much and don't want to take risks.
2/ aggressive fail soft: return v (what the current code does). This
corresponds to normal fail soft, with the added assumption that we don't
care about the reduction effect (see below point 3/)
3/ conservative fail soft: return v + margin. If the reduced search (qsearch)
gives us a score <= v, we bet that the non reduced search will give us a
score <= v + margin.
* Saying that 2/ is "logical" implies that 1/ and 3/ are not, which is
arguably wrong. Besides, experimental results tell us that 2/ beats 3/,
and that's not something we can argue against: experimental results are
the only trusted metric.
* Also, with the benefit of hindsight, I don't think the fact that 2/ is
better than 3/ is surprising at all. The point is that it is YOUR turn to
move, and you are assuming that by NOT playing (and letting the opponent
capture your hanging pieces in QS) you cannot generally GAIN razor_margin(depth).
No functional change.
Depth already depends on the actual value
of ONE_PLY; in particular it can be expressed as:
Depth = n * ONE_PLY
And because the formula is used to calculate R, which is
also dependent on the value of ONE_PLY and can be
expressed as:
R = x * ONE_PLY
we don't want to divide depth by a 'ply' value but
directly by an integer number.
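A toy illustration of the units involved (the numbers here are made up, not
the actual formula):

    const int ONE_PLY = 2;                       // illustrative value
    const int depth   = 20 * ONE_PLY;

    // Dividing depth by a plain integer keeps the result in the same
    // ONE_PLY-scaled units, so R still scales with ONE_PLY; dividing by
    // something like '4 * ONE_PLY' instead strips the units and yields a
    // bare ply count.
    const int R_scaled = depth / 4;              // 10 == 5 * ONE_PLY
    const int R_bare   = depth / (4 * ONE_PLY);  // 5: no longer in Depth units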
Spotted by sf-x
No functional change.
Instead of a fixed reduction of ONE_PLY, the
null move dynamic reduction based on value can
now grow larger when we are above beta by a value
much higher than PawnValueMg.
Note that now an eval returning VALUE_KNOWN_WIN
makes the null search drop into qsearch.
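The shape of the reduction is roughly the following; the constants are
illustrative rather than the committed values:

    typedef int Value;
    typedef int Depth;
    const Depth ONE_PLY     = 2;
    const Value PawnValueMg = 198;   // illustrative

    // Sketch: the reduction grows with depth and, now without a fixed cap,
    // with how far the static eval is above beta measured in pawns. A huge
    // eval such as VALUE_KNOWN_WIN makes R exceed depth, so the null search
    // drops into qsearch.
    Depth null_move_reduction(Depth depth, Value eval, Value beta) {
        return 3 * ONE_PLY + depth / 4 + (eval - beta) / PawnValueMg * ONE_PLY;
    }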
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 26141 W: 4871 L: 4699 D: 16571
And long TC:
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 33695 W: 5309 L: 5056 D: 23330
bench: 7356053
When we have a fail high on a quiet move, store it in
a Followupmoves table indexed by the previous move of
the same color (instead of the immediately previous move,
as in the countermoves case).
Then use this table for quiet move ordering in the same
way we are already doing with countermoves.
These followup moves will be tried just after the countermoves
and before the remaining quiet moves.
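A rough sketch of the table and its indexing; types are simplified and the
update rule mirrors the existing countermoves one:

    #include <utility>

    typedef int Move;
    typedef int Piece;
    typedef int Square;
    const int PIECE_NB  = 16;
    const int SQUARE_NB = 64;

    // Sketch of the followup-moves table: indexed by the (piece, to-square)
    // of our own previous move (two plies back), it remembers up to two
    // quiet moves that produced a fail high, just like the countermoves
    // table does for the opponent's immediately previous move.
    struct FollowupMoves {
        std::pair<Move, Move> table[PIECE_NB][SQUARE_NB] = {};

        void update(Piece pc, Square to, Move m) {
            if (table[pc][to].first != m) {
                table[pc][to].second = table[pc][to].first;
                table[pc][to].first  = m;
            }
        }
    };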
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10350 W: 1998 L: 1866 D: 6486
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 14066 W: 2303 L: 2137 D: 9626
bench: 7205153