Since all ENPASSANT moves are now considered dangerous, this
change of order should give a slight speedup.
Also simplify futilityValue formula.
No functional change.
We avoid using an ad-hoc table at the cost of a
relative_rank() call in advanced_pawn_push().
On my 32-bit system it is even slightly faster (on 64-bit
it may be different). These are the speeds in nps, alternating
old and new bench runs:
new
368890
368825
369972
old
367798
367635
368026
No functional change.
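As a rough check of the "slightly faster" claim, the six nps figures above can be averaged and compared. This is only a sketch: the helper name relative_speedup is made up for illustration, and three runs per side are too few to say much about noise.

```python
from statistics import mean

# nps figures copied from the bench runs listed above
new_nps = [368890, 368825, 369972]
old_nps = [367798, 367635, 368026]

def relative_speedup(new, old):
    """Fractional gain of the mean of `new` over the mean of `old`."""
    return mean(new) / mean(old) - 1.0

speedup = relative_speedup(new_nps, old_nps)
print(f"mean new: {mean(new_nps):.0f} nps, mean old: {mean(old_nps):.0f} nps")
print(f"speedup: {speedup:.2%}")
```

The gain comes out at roughly 0.4%, and every new run is faster than every old run.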
Instead of a passed pawn now we just require the pawn to
be in the opponent camp to be considered a dangerous
move. Added some renaming to reflect the change.
Passed both short TC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 10358 W: 2033 L: 1900 D: 6425
And long TC
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 21459 W: 3486 L: 3286 D: 14687
bench: 8322172
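The "pawn in the opponent camp" condition can be sketched as follows. This is an illustration, not the engine code: it assumes squares numbered 0..63 with a1 == 0, 0-based ranks, and that "opponent camp" means a relative rank above the 4th. The names relative_rank and advanced_pawn_push are borrowed from the messages above, but their bodies here are assumptions.

```python
WHITE, BLACK = 0, 1
RANK_4 = 3  # ranks are 0-based here: RANK_1 == 0 ... RANK_8 == 7

def rank_of(square):
    """Rank of a square, with squares numbered 0..63 and a1 == 0."""
    return square >> 3

def relative_rank(color, square):
    """Rank as seen from `color`'s side; XOR with 7 mirrors the board for Black."""
    return rank_of(square) ^ (color * 7)

def advanced_pawn_push(color, from_square):
    """True when the moving pawn already stands in the opponent's half."""
    return relative_rank(color, from_square) > RANK_4

E4, E5 = 28, 36
print(advanced_pawn_push(WHITE, E5), advanced_pawn_push(BLACK, E5))
```

No lookup table is needed: one shift, one XOR and one comparison decide the question for either side.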
To align with the same-named Position function and
avoid using std::cout directly.
Also remove some stale <iostream> includes while
there.
No functional change.
Add a Mac SSE4.2 target. Also change the Mac OS X minimum version to
10.6. Rationale: 97% of Macs run at least 10.6, version 10.9 is now
free, and using 10.6 as the minimum version gives a small 5% boost in
benchmark speed over versions using 10.0 as the minimum version.
Finally, enable Clang’s Link Time Optimization when compiling for the
Mac.
No functional change.
An old idea retested at SPRT(0, 3) with 60+0.05 TC:
LLR: 2.95 (-2.94,2.94) [0.00,3.00]
Total: 98872 W: 15549 L: 15123 D: 68200
This is a very small elo increase patch so it really
stresses the limits of fishtest.
bench: 8596156
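The (-2.94,2.94) interval printed next to the LLR matches Wald's classical SPRT stopping bounds for error rates alpha = beta = 0.05. The sketch below assumes those error rates, since the message does not state them. The function names are illustrative only.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's stopping bounds for the log-likelihood ratio (LLR)."""
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    return lower, upper

def sprt_decision(llr, alpha=0.05, beta=0.05):
    """Return 'H0', 'H1' or 'continue' for the current LLR."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "H1"
    if llr <= lower:
        return "H0"
    return "continue"

lower, upper = sprt_bounds()
print(f"bounds: ({lower:.2f}, {upper:.2f})")
print(sprt_decision(2.95))
```

With these bounds, an LLR of 2.95 crosses the upper threshold, so the test stops in favour of the patch.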
It seems to introduce a regression when tested
with 3 threads at 15+0.05:
ELO: -2.26 +-2.2 (95%) LOS: 2.4%
Total: 30000 W: 4813 L: 5008 D: 20179
bench: 8331357
Tested setting FakeSplit to true and running
./stockfish bench 128 2
There is a different signature with and without
the patch so it affects functionality but
only in SMP case.
bench: 8331357
SMP case is very tricky and raises an assert in stage_moves():
assert(stage == KILLERS_S1 || stage == QUIETS_1_S1 || stage == QUIETS_2_S1)
So rewrite the code to just return moves[] when we are sure
we are in quiet moves stages.
Also rename stage_moves to quiet_moves to reflect that.
No functional change (but needs testing in SMP case)
Use MovePicker moves[] to access already tried
quiet moves. A bit of care must be taken
to avoid calling stage_moves() when we are still
at ttMove stage, because moves are yet to be
generated. Actually our staged move generation
makes this code a bit more tricky than what I'd
like, but removing an auxiliary redundant
array like quietsSearched[] is a good thing.
Idea by DiscoCheck
bench: 9355734
Use the newly introduced LineBB[] to simplify this
super hot-path function.
Verified with perft we don't have any speed regression, although
the number of squares removed is less than before in case of
contact check.
Inspired by DiscoCheck implementation.
Perft numbers are the same, but we have a harmless functional
change due to the reordering of moves, because now some illegal moves
are no longer detected at generation time, but in the search.
bench: 8331357
This seems a die hard idea :-)
Passed both short TC
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 17485 W: 3307 L: 3156 D: 11022
And long TC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 38181 W: 6002 L: 5729 D: 26450
bench: 8659830
Actually, it is not used, as both arrays have the
same values. Some local tests in either direction
showed no improvement.
Also some minor corrections in the comments.
No functional change.
Previously some squares could be "incorrectly" awarded
to a pinned piece.
e.g. in 3k4/1q6/3b4/3Q4/8/5K2/B7/8 b - - 0 1 the black
bishop gets 4 squares too many and the white queen gets 6.
Passed both short TC.
LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 4871 W: 934 L: 817 D: 3120
And long TC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 38968 W: 6113 L: 5837 D: 27018
bench: 9282549
But compensate by reducing rook and queen
value by 53 (approximately 160 / 3)
Material imbalances are affected as follows:
Config   Red. Major    Rook    Queen    Total
QRR         +160      -2*53     -53       +1
QR          +160       -53      -53      +54
RR          +160      -2*53       0      +54
R              0       -53        0      -53
Q              0         0      -53      -53
so that the imbalance changes by at most 54 + 53 = 107 units.
This corresponds to approximately 3.5cp in the final evaluation.
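The totals in the table and the 107-unit bound can be re-derived mechanically. The sketch below simply encodes the table rows. The constant names are made up for illustration, and the flag saying whether the redundancy term applies is read off the table rather than derived from the engine code.

```python
REDUNDANT_MAJOR = 160   # redundancy term from the table above
PIECE_REDUCTION = 53    # reduction applied to each rook and queen

# name -> (rooks, queens, redundancy term applies), row by row from the table
configs = {
    "QRR": (2, 1, True),
    "QR":  (1, 1, True),
    "RR":  (2, 0, True),
    "R":   (1, 0, False),
    "Q":   (0, 1, False),
}

def total_change(rooks, queens, redundant):
    """Net imbalance change for one material configuration."""
    bonus = REDUNDANT_MAJOR if redundant else 0
    return bonus - PIECE_REDUCTION * (rooks + queens)

totals = {name: total_change(*cfg) for name, cfg in configs.items()}
spread = max(totals.values()) - min(totals.values())
print(totals, spread)
```

The spread between the largest and smallest total is the "at most 54 + 53 = 107 units" quoted above.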
Verified with a fixed number of games (40000) at both short
and long TC that it does not regress.
Short TC 15+0.05
ELO: 1.93 +-2.1 (95%) LOS: 96.6%
Total: 40000 W: 7520 L: 7298 D: 25182
Long TC 60+0.05
ELO: -0.33 +-1.9 (95%) LOS: 36.5%
Total: 39663 W: 6067 L: 6105 D: 27491
bench: 6703846
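The ELO and LOS figures above can be reproduced from the W/L/D counts with two common approximations: the logistic Elo model for the score, and a normal approximation for the likelihood of superiority. It is an assumption that the testing framework used exactly these formulas, but they match the reported numbers to the printed precision.

```python
import math

def elo_estimate(wins, losses, draws):
    """Elo difference implied by the score under the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))

def los(wins, losses):
    """Likelihood of superiority via a normal approximation (draws ignored)."""
    z = (wins - losses) / math.sqrt(wins + losses)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

short_tc = (7520, 7298, 25182)   # W, L, D at 15+0.05
long_tc = (6067, 6105, 27491)    # W, L, D at 60+0.05

for name, (w, l, d) in (("short", short_tc), ("long", long_tc)):
    print(f"{name}: ELO {elo_estimate(w, l, d):+.2f}  LOS {los(w, l):.1%}")
```

Draws cancel out of the LOS formula, which is why only wins and losses appear in it.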
As Gary (who analyzed the bug) says:
SF does not print a PV when the original best move fails low,
we hit our time allowance, and stop the search. The output from
the SF search is below. It was failing low on Ne1 at depth 34.
Then, we get bestmove Qd3, but no PV change.
info depth 34 seldepth 45 score cp 38 upperbound nodes 483484489 nps 15464575 time 31264 multipv 1 pv f3e1 h5h4 e1d3 h4g3 f2g3 a6f6 f1f6 e7f6 d1a4 f6e7 a1f1 d8f8 a4b3 b7b6 b3c2 f7f6 c2a4 h3g5 b2b3 g5f7 a4c6 f7d6 h1g2 f6f5 e4f5 d6f5
info depth 34 seldepth 45 score cp 38 upperbound nodes 483484489 nps 15464575 time 31264 multipv 1 pv f3e1 h5h4 e1d3 h4g3 f2g3 a6f6 f1f6 e7f6 d1a4 f6e7 a1f1 d8f8 a4b3 b7b6 b3c2 f7f6 c2a4 h3g5 b2b3 g5f7 a4c6 f7d6 h1g2 f6f5 e4f5 d6f5
info depth 34 seldepth 47 score cp 30 upperbound nodes 2112334132 nps 17255517 time 122415 multipv 1 pv f3e1 h5h4 d1a4 a6f6 e1d3 d8f8 a4c2 h4g3 f2g3 f6f1 a1f1 h7g8 b2b3 f7f6 a3a4 b7b6
info depth 34 seldepth 47 score cp 30 upperbound nodes 2112334132 nps 17255517 time 122415 multipv 1 pv f3e1 h5h4 d1a4 a6f6 e1d3 d8f8 a4c2 h4g3 f2g3 f6f1 a1f1 h7g8 b2b3 f7f6 a3a4 b7b6
info nodes 18235667001 time 969824
bestmove e2d3 ponder c8d7
Looking at the code, if we hit Signals.stop, we return from id_loop
before printing any PV. It is possible for us to have resorted the
RootMove list though, which will change the move that is actually
played.
No functional change.
1/ eval margin and gains removed:
16 bits are now free in TT entries due to the removal of eval margin; this may be useful
in the future :) Gains are removed and replaced by Value(128), so search() and qsearch()
are now consistent in this regard.
2/ futility_margin()
linear formula instead of complex (log(depth), movecount) formula.
3/ unify pre & post futility pruning
pre futility pruning used depth < 7 plies, while post futility pruning used
depth < 4 plies. Now it's always depth < 7.
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench 7243575
1/ eval margin and gains removed:
- gains replaced by Value(128): search() and qsearch() now behave consistently!
2/ futility_margin()
- testing showed that there is no added value in this weird (log(depth), movecount)
formula, and a much simpler linear formula is just as good. In fact, it is most
likely better, as it is not yet optimally tuned.
- the new simplified formula also means we get rid of FutilityMargins[], its
initialization code, and more importantly ss->futilityMoveCount, and the hacky
code that updates it throughout the search().
- the current formula gives negative futility margins, and there is a hidden interaction
between the move count pruning formula and the futility margin one: what happens is
that MCP is supposed to be triggered before we use the nonsensical negative futility
margins.
3/ unify pre & post futility pruning
- pre futility pruning (what SF calls value based pruning) used depth < 7 plies,
while post futility pruning (what SF calls static null move pruning) used depth < 4 plies.
- also the condition depth < 7 in pre futility pruning was not obvious, and it seemed
to be depth < 16 (futility_margin() returns an infinite value when depth >= 7).
Tested with fixed number of games both at short TC:
ELO: 0.82 +-2.1 (95%) LOS: 77.3%
Total: 40000 W: 7939 L: 7845 D: 24216
And long TC
ELO: 0.59 +-2.0 (95%) LOS: 71.9%
Total: 40000 W: 6876 L: 6808 D: 26316
bench: 10206576
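A linear margin with a single depth limit could look like the sketch below. The slope of 100 per ply is a hypothetical placeholder, not the value used in the patch. The two pruning conditions are a common textbook formulation and are assumed here rather than taken from the source. Only the shared "depth < 7" limit comes from the message above.

```python
FUTILITY_SLOPE = 100     # hypothetical slope per ply, NOT taken from the patch
FUTILITY_MAX_DEPTH = 7   # the unified "depth < 7" limit described above

def futility_margin(depth):
    """Linear margin: never negative and strictly growing with depth."""
    return FUTILITY_SLOPE * depth

def can_prune_node(depth, static_eval, beta):
    """Node-level check: eval is so far above beta that searching is skipped."""
    return depth < FUTILITY_MAX_DEPTH and static_eval - futility_margin(depth) >= beta

def can_skip_move(depth, static_eval, alpha):
    """Move-level check: a quiet move is unlikely to lift eval above alpha."""
    return depth < FUTILITY_MAX_DEPTH and static_eval + futility_margin(depth) <= alpha

print([futility_margin(d) for d in range(FUTILITY_MAX_DEPTH)])
print(can_prune_node(3, 500, 100), can_skip_move(2, -300, 0))
```

Because the margin is linear in depth, it cannot go negative, so the pruning no longer relies on move count pruning firing first.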
RedundantRook and RedundantQueen replaced by simple
variable RedundantMajor. Also the SameColor coefficient
for Queen<->Queen has been set by definition to 0.
The remaining 5 parameters:
LinearCoefficients[ROOK]
LinearCoefficients[QUEEN]
QuadraticCoefficientsSameColor[ROOK][ROOK]
QuadraticCoefficientsSameColor[QUEEN][ROOK]
RedundantMajor
are sufficient to equate the material imbalances for the
5 common material configurations of R, RR, Q, QR and QRR
to any desired values simultaneously.
With the chosen parameters there should be no functional
change unless one side has more than 2 rooks or more
than 1 queen. For example bench from the start position
using the commands:
./stockfish
go depth 16
produces identical output except for one extra node
in the last iteration.
bench: 8198094
Coefficients for Bishop<->BishopPair and Bishop<->Bishop
are also pretty much redundant. By altering the values
in LinearCoefficients[] these coefficients can be zeroed
without changing the imbalance calculations in any position
with fewer than 3 bishops for one side.
bench: 7995098
First coefficient in the SameColor array does an
equivalent job when folded into the LinearCoefficients
array.
All of the diagonal terms in the OppositeColor array
are redundant due to cancellation.
No functional change.
In case we find a very good move after a
troubled start, we don't return immediately
anymore.
Tested directly at long TC where it passed:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 13910 W: 2397 L: 2228 D: 9285
bench: 7995098
And remove a complex (and broken) formula.
Indeed, the previous code was broken in the case of a TC with big
time increments, where available_time() was too close
to the total time, leading to many time losses. For instance:
go wtime 2600 winc 2600
info nodes 4432770 time 2601 <-- time forfeit!
maximum search time = 2530 ms
available_time = 2300 ms
For a reference and further details see:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/dCPAvQDcm2E
Speed tests with bench, disabling the timer altogether vs. the timer set at
max resolution, showed no speed regressions, both on a single
core and when using all physical cores.
No functional change.