Instead of waiting to be allocated, a thread now actively
searches for another split point to join when it finishes
its search. Also modify the split conditions.
This patch has been tested with 7 threads SMP and
passed both STC:
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 2885 W: 519 L: 410 D: 1956
And a reduced-LTC at 25+0.05
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 4401 W: 684 L: 566 D: 3151
It was then retested for regressions in the 3-thread case
at the standard LTC of 60+0.05:
LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 40809 W: 5446 L: 5406 D: 29957
bench: 8802105
In a final test with a fixed number of games, it failed
to prove better than the standard version.
STC 15+0.05
ELO: -0.86 +-1.7 (95%) LOS: 15.8%
Total: 57578 W: 10070 L: 10213 D: 37295
bench: 8802105
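
As a rough cross-check of the ELO and LOS figures quoted with fixed-games results, here is a minimal sketch that assumes the usual logistic Elo model and a normal approximation for LOS. The helper names are hypothetical and are not part of the engine or of the testing framework.

```cpp
#include <cmath>

// Hypothetical helper: Elo difference implied by a match result,
// assuming the standard logistic Elo model.
double elo_from_wld(double wins, double losses, double draws) {
    double games = wins + losses + draws;
    double score = (wins + 0.5 * draws) / games;  // fraction of points scored
    return -400.0 * std::log10(1.0 / score - 1.0);
}

// Hypothetical helper: likelihood of superiority, assuming a normal
// approximation that looks at decisive games only.
double los_from_wl(double wins, double losses) {
    double z = (wins - losses) / std::sqrt(wins + losses);
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}
```

Under these assumptions, W: 10070 L: 10213 D: 37295 gives about -0.86 Elo and a LOS of about 15.8%, in line with the reported values.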
Unfortunately we have a slowdown that causes
a regression at STC in no-regression mode:
LLR: -2.96 (-2.94,2.94) [-3.00,1.00]
Total: 22454 W: 3836 L: 4029 D: 14589
bench: 8678654
The bug was found to be elsewhere. This version
is correct and is also able to detect as draws
positions like:
8/8/5b2/8/8/4k1p1/6P1/5K2 b - - 6 133
bench: 8678654
The position is also a win if the strong side has a
bishop and a knight (plus other material, otherwise
KBNK would be triggered instead of KXK).
This fixes a subtle bug where a search on position
k7/8/8/8/8/P7/PB6/K7 b - - 6 1
instead of returning a draw score, suddenly returns
a big score. This happens because at one point in
the search we reach this position:
8/Pk6/8/8/8/4B3/P7/K7 w - - 3 8
where white can promote. In case of rook promotion (and also in case of
queen promotion) the resulting position gets a huge static eval that is
above VALUE_KNOWN_WIN (from the point of view of white). So for rook
promotion it is
&& futilityBase > -VALUE_KNOWN_WIN
that prevents futility pruning in qsearch. (Removing this condition indeed
lets the problem occur). Raising the static eval for K+B+N+X v K to a value
higher than VALUE_KNOWN_WIN fixes this particular problem without having to
introduce an extra futility pruning condition in qsearch.
I just checked and it seems K+R v K, K+2B v K and even K+B+N v K already get
a huge static eval. Why not K+B+N+P v K?
I think this fix corrects an oversight. There is special code for KBNK, but
KBNXK is handled by KXK, so the test for sufficient material should also test
for B+N.
bench: 8678654
In the (rare) cases when the two conditions
are true, fully check again with a slow
but correct MoveList<LEGAL>(pos).size().
This is able to detect false positives like
this one:
8/8/8/Q7/5k1p/5P2/4KP2/8 b - - 0 17
where we have a possible simple pawn push that
is not stored in the attacks[] array. Because the
third condition triggers very rarely, even though
it is slow, it does not measurably alter the
average speed of the engine.
bench: 8678654
Currently a stalemate position is misevaluated
with a negative/positive score. This leads qsearch(),
which does not detect stalemates either, to return the
wrong score, and this causes some kinds of endgames
to be completely misevaluated.
With this patch the following position is fully fixed:
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
Also in SMP case.
Correct root cause analysis by Ronald de Man.
bench: 8678654
After reverting to Tord's original
endgame, a search on position
7k/6p1/6B1/5K1P/8/8/8/8 w - - 0 1
Reports, correctly, a draw score instead of
an advantage for white.
Issue reported by Uri Blass.
bench: 8678654
Remove the optimization for Intel: it is not
standard and can break at any time. Moreover,
our release build is not done with Intel C++
anymore, so we don't need to squeeze the extra
speed out of this compiler.
No functional change.
If we return from split with a stale value
because a stop or an upstream cutoff occurred,
then we exit the moves loop and save a stale value
in TT before returning from search().
This patch, from Joona, fixes this.
bench: 8678654
We can never have bestValue == -VALUE_INFINITE at
the end of move loop because if no legal move exists
we detect it with previous condition on !moveCount,
if a legal move exists we never prune it due to
futility pruning condition:
bestValue > VALUE_MATED_IN_MAX_PLY
So this code never executes, as I have also verified
directly.
Issue reported by Joona.
No functional change.
Intel compiler is very picky:
"error: this operation on an enumerated type requires an
applicable user-defined operator function"
Reported by Tony Gaor.
No functional change.
Put the division at the end to reduce
rounding errors. This alters the bench
due to different rounding errors, but
should not alter ELO in any way.
bench: 7615217
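
To illustrate why the position of an integer division matters, here is a minimal sketch. The function names and the formula are hypothetical and are not taken from the patch; they only show the general effect of truncating early versus late.

```cpp
// Hypothetical integer scaling helpers, for illustration only.

// Dividing first truncates before the multiplication, so the
// truncation error is multiplied by weight.
int scale_divide_first(int value, int weight, int divisor) {
    return value / divisor * weight;
}

// Dividing last truncates only once, on the final result.
int scale_divide_last(int value, int weight, int divisor) {
    return value * weight / divisor;
}
```

With value 7, weight 10 and divisor 4, the exact result is 17.5: dividing first yields 10, dividing last yields 17. The trade-off is that the intermediate product is larger, so this form assumes value * weight cannot overflow.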
This apparently silly tweak speeds up
the bench by almost 3%. It is not clear
why: when repeating with perft, the
speedup vanishes.
Suggested by Jonathan Calovski.
No functional change.
Believed to be a speed optimization as benched
on Windows with bench realtime affinity 0x1 deleting
highest and lowest runs:
Base      Test
1549259   1608202
1538115   1583934
1543168   1556938
1536365   1554179
1533026   1582010
Signature remains unchanged and gives anywhere from 1-2% nps
boost in analysis depending on number of cores used.
No functional change.
This is a much-discussed patch with many
arguments for and against. The fact is
it passed both STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16305 W: 3001 L: 2855 D: 10449
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 34273 W: 5180 L: 4931 D: 24162
Although it is true that a correct test should
include foreign engines, we commit it anyhow so
people can test it out in the wild, under broader
conditions.
bench: 7384368
Idea from Lyudmil Tsvetkov.
The value seems to be raised a bit abruptly, but as
Gary said, a blocked pawn on the sixth rank has been
instrumental in limiting king mobility in multiple
losses that I've seen from SF. A blocked pawn on the fifth
rank has a much less serious impact on king safety.
Passed both STC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 14551 W: 2750 L: 2607 D: 9194
and LTC
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 43595 W: 6917 L: 6618 D: 30060
And even a retest at 60" fixed games 40K
ELO: 1.79 +-1.9 (95%) LOS: 97.0%
Total: 39889 W: 6018 L: 5813 D: 28058
bench: 7154916
Adding BMI1 allows the compiler to use _blsr_u64
automatically (the advertised 0.3% speed gain).
I verified that the compiler does not use this
instruction with the -mbmi2 flag only. Also, all
processors supporting BMI2 also support BMI1.
No functional change
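
For context, here is a minimal sketch of the bit trick that BLSR implements. The helper names are hypothetical, and whether a given compiler actually emits BLSR for this code depends on the compiler and flags (for example -mbmi on GCC or Clang), so treat that part as an assumption to verify in the generated assembly.

```cpp
#include <cstdint>

// Hypothetical helper: clear the least significant set bit of b.
// With BMI1 enabled, a compiler may turn this into a single BLSR.
inline std::uint64_t clear_lsb(std::uint64_t b) {
    return b & (b - 1);
}

// Hypothetical helper: count set bits by clearing them one at a time,
// the same loop shape used when iterating over a bitboard.
inline int count_bits(std::uint64_t b) {
    int n = 0;
    for (; b; b = clear_lsb(b))
        ++n;
    return n;
}
```

For example, clear_lsb(0x2C) yields 0x28, since only the lowest set bit (0x04) is removed.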
Prefer
file_of(s) < file_of(ksq)
to the indirect
file_of(ksq) < FILE_E
to evaluate whether the semiopen side to check is the left side.
Also other small touches while there.
No functional change.
Right now the Makefile is cluttered with OS X equivalents
of all the x86 targets. We can get rid of all of them and
just check UNAME against "Darwin" for the few OS X-specific
things we need to do.
We also disable Clang LTO when using BMI2 instructions. For
some reason, LLVM cannot find the PEXT instruction when using
LTO. I don't know why, but disabling LTO for BMI2 fixes it.
No functional change.