Use ralpha instead of rbeta
* rbeta is confusing people. It took THREE attempts to code razoring
at PV nodes correctly in a recent test, because of the rbeta trick.
Unnecessary tricks should be avoided.
* The more correct and self-documenting way of doing this is to say
that we use a zero window around alpha-margin, not beta-margin (see
the sketch at the end of this message).
The fact that the current stuff with rbeta works may be correct,
because we only razor at non-PV nodes where alpha happens to be
beta-1, but it is confusing.
Remove the misleading and partially erroneous comment about returning
v + margin:
* comments should explain what the code does, not what it could have done.
* this comment is partially wrong in saying that v+margin is "logical",
and that it is "surprising" that it doesn't work.
From a theoretical perspective, at least 3 ways of doing this are equally
defensible:
1/ fail hard: return alpha. The most conservative: we bet that the search
will fail low, but we don't know by how much and don't want to take risks.
2/ aggressive fail soft: return v (what the current code does). This
corresponds to normal fail soft, with the added assumption that we don't
care about the reduction effect (see point 3/ below).
3/ conservative fail soft: return v + margin. If the reduced search (qsearch)
gives us a score <= v, we bet that the non-reduced search will give us a
score <= v + margin.
* Saying that 2/ is "logical" implies that 1/ and 3/ are not, which is
arguably wrong. Besides, experimental results tell us that 2/ beats 3/,
and that's not something we can argue against: experimental results are
the only trusted metric.
* Also, with the benefit of hindsight, I don't think the fact that 2/ is
better than 3/ is surprising at all. The point is that it is YOUR turn to
move, and you are assuming that by NOT playing (and letting the opponent
capture your hanging pieces in QS) you cannot generally GAIN razor_margin(depth).
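A minimal sketch of the self-documenting version (assumed names, only
illustrating the description above, not necessarily the exact patch):

  // Zero window around ralpha = alpha - margin, instead of rbeta - 1
  Value ralpha = alpha - razor_margin(depth);
  Value v = qsearch<NonPV, false>(pos, ss, ralpha, ralpha + 1, DEPTH_ZERO);
  if (v <= ralpha)
      return v; // aggressive fail soft, i.e. option 2/ above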
No functional change.
Depth is already dependent on the actual value
of ONE_PLY; in particular, it can be expressed as:
Depth = n * ONE_PLY
And because this formula is used to calculate R, which is
also dependent on the value of ONE_PLY and can be
expressed as:
R = x * ONE_PLY
we don't want to divide depth by a 'ply' value but
directly by an integer number.
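As a minimal sketch, assuming a reduction formula of this shape
(hypothetical constants), the divisor is a plain integer:

  // depth = n * ONE_PLY, so dividing by the plain integer 4 keeps R
  // a multiple of ONE_PLY; we never divide by a 'ply' value.
  Depth R = 3 * ONE_PLY + depth / 4;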
Spotted by sf-x
No functional change.
Instead of a fixed reduction of ONE_PLY, the null move
dynamic reduction based on value can now grow larger
when we are above beta by a value much higher than
PawnValueMg.
Note that an eval returning VALUE_KNOWN_WIN now makes
the null search drop into qsearch.
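A minimal sketch of the idea (assumed names and constants, not
necessarily the exact patch):

  // The reduction grows with depth and with how far eval is above
  // beta, capped at 3 extra plies. A huge eval like VALUE_KNOWN_WIN
  // hits the cap, and once depth - R < ONE_PLY the null search is
  // performed by qsearch.
  Depth R =  3 * ONE_PLY
           + depth / 4
           + std::min(int(eval - beta) / PawnValueMg, 3) * ONE_PLY;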
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 26141 W: 4871 L: 4699 D: 16571
And long TC:
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 33695 W: 5309 L: 5056 D: 23330
bench: 7356053
When we have a fail-high of a quiet move, store it in
a Followupmoves table indexed by the previous move of
the same color (instead of the immediate previous move,
as in the countermoves case).
Then use this table for quiet moves ordering in the same
way we are already doing with countermoves.
These followup moves will be tried just after countermoves
and before the remaining quiet moves.
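A minimal sketch of the update (assumed names): the key is our own
previous move, two plies back in the stack:

  // On a quiet fail-high, index the table by the piece and destination
  // square of the previous move of the same color, (ss-2)->currentMove.
  Square prevOwnSq = to_sq((ss-2)->currentMove);
  Followupmoves.update(pos.piece_on(prevOwnSq), prevOwnSq, move);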
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10350 W: 1998 L: 1866 D: 6486
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 14066 W: 2303 L: 2137 D: 9626
bench: 7205153
While editing Uri's original messy patch
I incorrectly simplified a logic
condition. Here is the correct original
version, as it was tested.
bench: 8502826
Remove the easy move code and add the condition to
play instantly if only one legal move is available.
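A minimal sketch of the new condition (assumed names):

  // With a single legal move there is nothing to think about:
  // answer instantly instead of searching.
  if (RootMoves.size() == 1)
      Signals.stop = true;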
Verified there is no regression at 60+0.05
ELO: 0.17 +-1.9 (95%) LOS: 57.0%
Total: 40000 W: 6397 L: 6377 D: 27226
bench: 8502826
If we are still on the first move, without a fail-low, and
the current iteration is taking too long to complete, then
stop the search.
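A minimal sketch of the stop condition (assumed names and a
hypothetical threshold):

  // Still on the first root move, no fail-low so far, and the current
  // iteration has already consumed most of the available time.
  if (    Signals.firstRootMove
      && !Signals.failedLowAtRoot
      &&  elapsed > TimeMgr.available_time() * 75 / 100)
      Signals.stop = true;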
Passed short TC:
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 26030 W: 4959 L: 4785 D: 16286
Long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 18019 W: 2936 L: 2752 D: 12331
And performed well at 40/30
ELO: 4.33 +-2.8 (95%) LOS: 99.9%
Total: 20000 W: 3480 L: 3231 D: 13289
bench: 8502826
A great simplification that shows no regression
and even seems to scale a bit.
Tested with fixed number of games:
Short TC
ELO: 0.60 +-2.1 (95%) LOS: 71.1%
Total: 39554 W: 7477 L: 7409 D: 24668
Long TC
ELO: 2.97 +-2.0 (95%) LOS: 99.8%
Total: 36424 W: 5894 L: 5583 D: 24947
bench: 8184352
Change the updating rule after a TT hit to match
the one used at the end of the search.
Small change in functionality, but we want to
have uniform rules in the code.
bench: 7767864
We already update killers there, so it is natural to extend
this to history and countermoves too.
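A minimal sketch (assumed names, hypothetical helper) of the extended
update on a TT cutoff:

  // On a TT hit that produces a cutoff with a quiet ttMove, update
  // killers, history and countermove, just as a normal beta cutoff
  // does at the end of search().
  if (ttValue >= beta && ttMove && !pos.capture_or_promotion(ttMove))
      update_stats(pos, ss, ttMove, depth);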
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 52690 W: 9955 L: 9712 D: 33023
And long TC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 5555 W: 935 L: 808 D: 3812
bench: 7876473
After a fail high in LMR, if the reduction is very high, do
a re-search at lower depth before the full-depth one.
Chances are that the re-search will fail low and the
full-depth one is skipped.
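A minimal sketch of the re-search ladder (assumed names and a
hypothetical reduction threshold):

  // The reduced LMR search failed high. If the reduction was large,
  // verify at an intermediate depth first; only if that also fails
  // high do we pay for the full-depth re-search.
  if (doFullDepthSearch && ss->reduction >= 4 * ONE_PLY)
  {
      Depth d = std::max(newDepth - 4 * ONE_PLY, ONE_PLY);
      value = -search<NonPV>(pos, ss+1, -(alpha+1), -alpha, d, true);
      doFullDepthSearch = (value > alpha);
  }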
Passed both short TC:
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 11363 W: 2204 L: 2069 D: 7090
And long TC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 7292 W: 1195 L: 1061 D: 5036
bench: 7869223
Since all ENPASSANT moves are now considered dangerous, this
change of order should give a slight speedup.
Also simplify the futilityValue formula.
No functional change.
Instead of requiring a passed pawn, we now just require the
pawn to be in the opponent's camp for the push to be
considered a dangerous move. Some renaming added to reflect
the change.
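A minimal sketch of the renamed check (hypothetical helper name):

  // A pawn push is dangerous once the pawn is in the opponent's camp,
  // i.e. beyond its own fourth rank; no passed-pawn test anymore.
  bool Position::advanced_pawn_push(Move m) const {
      return   type_of(moved_piece(m)) == PAWN
            && relative_rank(sideToMove, to_sq(m)) > RANK_4;
  }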
Passed both short TC test
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10358 W: 2033 L: 1900 D: 6425
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 21459 W: 3486 L: 3286 D: 14687
bench: 8322172
It seems to introduce a regression when tested
with 3 threads at 15+0.05:
ELO: -2.26 +-2.2 (95%) LOS: 2.4%
Total: 30000 W: 4813 L: 5008 D: 20179
bench: 8331357
Tested setting FakeSplit to true and running
./stockfish bench 128 2
There is a different signature with and without
the patch, so it affects functionality, but
only in the SMP case.
bench: 8331357
The SMP case is very tricky and raises an assert in stage_moves():
assert(stage == KILLERS_S1 || stage == QUIETS_1_S1 || stage == QUIETS_2_S1)
So rewrite the code to just return moves[] when we are sure
we are in the quiet moves stages.
Also rename stage_moves() to quiet_moves() to reflect that.
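A minimal sketch after the rewrite (assumed member names):

  // Callers guarantee we are already past the ttMove stage, so moves[]
  // holds the generated quiet moves and no stage assert is needed.
  const ExtMove* MovePicker::quiet_moves() const { return moves; }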
No functional change (but needs testing in SMP case)
Use MovePicker moves[] to access already tried
quiet moves. A bit of care must be taken
to avoid calling stage_moves() when we are still
at the ttMove stage, because the moves have yet to
be generated. Our staged move generation actually
makes this code a bit trickier than I'd
like, but removing an auxiliary redundant
array like quietsSearched[] is a good thing.
Idea by DiscoCheck
bench: 9355734
This seems to be a die-hard idea :-)
Passed both short TC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17485 W: 3307 L: 3156 D: 11022
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 38181 W: 6002 L: 5729 D: 26450
bench: 8659830
As Gary (who analyzed the bug) says:
SF does not print a PV when the original best move fails low,
we hit our time allowance, and stop the search. The output from
the SF search is below. It was failing low on Ne1 at depth 34.
Then we get bestmove Qd3, but no PV change.
info depth 34 seldepth 45 score cp 38 upperbound nodes 483484489 nps 15464575 time 31264 multipv 1 pv f3e1 h5h4 e1d3 h4g3 f2g3 a6f6 f1f6 e7f6 d1a4 f6e7 a1f1 d8f8 a4b3 b7b6 b3c2 f7f6 c2a4 h3g5 b2b3 g5f7 a4c6 f7d6 h1g2 f6f5 e4f5 d6f5
info depth 34 seldepth 47 score cp 30 upperbound nodes 2112334132 nps 17255517 time 122415 multipv 1 pv f3e1 h5h4 d1a4 a6f6 e1d3 d8f8 a4c2 h4g3 f2g3 f6f1 a1f1 h7g8 b2b3 f7f6 a3a4 b7b6
info nodes 18235667001 time 969824
bestmove e2d3 ponder c8d7
Looking at the code, if we hit Signals.stop, we return from id_loop
before printing any PV. It is possible for us to have resorted the
RootMove list though, which will change the move that is actually
played.
No functional change.
1/ eval margin and gains removed:
16 bits are now free in TT entries, due to the removal of eval margin; they may
be useful in the future :) gains removed: replaced by a fixed Value(128). search()
and qsearch() are now consistent in this regard.
2/ futility_margin()
a linear formula instead of the complex (log(depth), movecount) formula.
3/ unify pre & post futility pruning
pre futility pruning used depth < 7 plies, while post futility pruning used
depth < 4 plies. Now it's always depth < 7.
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench: 7243575
1/ eval margin and gains removed:
- gains removed: replaced by a fixed Value(128). search() and qsearch() now
behave consistently!
2/ futility_margin() (see the sketch after this list)
- testing showed that there is no added value in this weird (log(depth), movecount)
formula, and a much simpler linear formula is just as good. In fact, it is most
likely better, as it is not yet optimally tuned.
- the new simplified formula also means we get rid of FutilityMargins[], its
initialization code, and more importantly ss->futilityMoveCount, and the hacky
code that updates it throughout search().
- the current formula gives negative futility margins, and there is a hidden
interaction between the move count pruning formula and the futility margin one:
what happens is that MCP is supposed to be triggered before we use the
nonsensical negative futility margins.
3/ unify pre & post futility pruning
- pre futility pruning (what SF calls value based pruning) used depth < 7 plies,
while post futility pruning (what SF calls static null move pruning) used depth < 4 plies.
- also, the condition for pre futility pruning was not obvious: the code said
depth < 16, but it was effectively depth < 7, because futility_margin() returns
an infinite value when depth >= 7.
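A minimal sketch of the linear margin mentioned in point 2/ (the slope
is hypothetical):

  // Linear in depth and finite for every depth, so the same depth < 7
  // condition can now guard both pre and post futility pruning.
  Value futility_margin(Depth d) {
      return Value(100 * int(d));
  }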
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench: 10206576
In case we find a very good move after a
troubled start, we don't return immediately
anymore.
Tested directly at long TC where it passed:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 13910 W: 2397 L: 2228 D: 9285
bench: 7995098
And remove a complex (and broken) formula.
Indeed, the previous code was broken in the case of a TC with big
time increments, where available_time() was too similar
to the total time, yielding many time losses, for instance:
go wtime 2600 winc 2600
info nodes 4432770 time 2601 <-- time forfeit!
maximum search time = 2530 ms
available_time = 2300 ms
For a reference and further details see:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/dCPAvQDcm2E
Speed tested with bench, with the timer disabled altogether vs the timer
set at max resolution; this showed no speed regressions, both on a single
core and when using all physical cores.
No functional change.
The new formula matches the old formula until d = 45.
Test code:
#include <cmath>
#include <iostream>

int main() {
  // Prints every d in [1, 45] where the two formulas disagree;
  // no output means they match over the whole range.
  for (int d = 1; d <= 45; d++)
  {
      int a = int(std::log(double(d * d) / 2) / std::log(2.0) + 1.001);
      int b = int(2.9 * std::log(double(d)));
      if (a != b) std::cout << d << std::endl;
  }
  return 0;
}
bench: 8455956
Rationale:
- Speed of double and float is about the same (not on the hot path anyway)
- Double makes code prettier (no need to write 1.0f, just 1.0)
- The only practical advantage of float is that it uses less memory, but since
we never store large arrays of double, we don't care.
No functional change.
Unify extensions between PV and non-PV nodes
and remove all but check extensions.
This is a simplification, so it was tested at a fixed
number of games, where it proved not to regress.
About 45k games at 15+0.05
ELO: 1.23 +-2.0 (95%) LOS: 88.5%
Total: 45643 W: 9107 L: 8946 D: 27590
About 45k games at 60+0.05
ELO: 1.07 +-1.8 (95%) LOS: 87.8%
Total: 46786 W: 7728 L: 7584 D: 31474
bench: 3172206
Now that we use pre-increment on enums, it
makes sense, for code style uniformity, to
switch to pre-increment also for native types,
although there is no speed difference.
No functional change.