This greatly simplifies usage because it hides the
implementation-specific CheckInfo from the search.
This is based on the work done by Marco in pull request #716,
implementing on top of it the ideas in the discussion: caching
the calls to slider_blockers() in the CheckInfo structure,
and simplifying the slider_blockers() function by removing its
first parameter.
Compared to master, bench is identical but the number of calls
to slider_blockers() during bench goes down from 22461515 to 18853422,
hopefully being a little bit faster overall.
archlinux, gcc-6
make profile-build ARCH=x86-64-bmi2
50 runs each
bench:
base = 2356320 +/- 981
test = 2403811 +/- 981
diff = 47490 +/- 1828
speedup = 0.0202
P(speedup > 0) = 1.0000
perft 6:
base = 175498484 +/- 429925
test = 183997959 +/- 429925
diff = 8499474 +/- 469401
speedup = 0.0484
P(speedup > 0) = 1.0000
perft 7 (but only 10 runs):
base = 185403228 +/- 468705
test = 188777591 +/- 468705
diff = 3374363 +/- 476687
speedup = 0.0182
P(speedup > 0) = 1.0000
$ ./pyshbench ../Stockfish/master ../Stockfish/test 20
run base test diff
...
base = 2501728 +/- 182034
test = 2532997 +/- 182034
diff = 31268 +/- 5116
speedup = 0.0125
P(speedup > 0) = 1.0000
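
The summary figures above are consistent, up to rounding, with
diff = test - base and speedup = diff / base. A minimal sketch of that
arithmetic, assuming this is how the summary lines are derived (the
struct and function names are illustrative and not taken from pyshbench):

```cpp
#include <cassert>

// Hypothetical helper types for illustration; not part of pyshbench.
struct BenchSummary {
    double base;  // mean value measured for the baseline build
    double test;  // mean value measured for the patched build
};

// Absolute difference between the two means.
inline double diff(const BenchSummary& s) { return s.test - s.base; }

// Relative speedup of `test` over `base`.
inline double speedup(const BenchSummary& s) {
    assert(s.base > 0.0);
    return diff(s) / s.base;
}
```

For the first bench run, speedup({2356320, 2403811}) evaluates to about
0.0202, which matches the reported value.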
No functional change.
This non-functional patch is a deep rework that makes SF independent
of the actual value of ONE_PLY (currently set to 1). I have verified that SF is
now independent for ONE_PLY values 1, 2, 4, 8, 16, 32 and 256.
This patch gives consistency to search code and enables future work, opening
the door to safely tweaking the ONE_PLY value for any reason.
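
One way to picture what independence from ONE_PLY means: depths are kept
as multiples of ONE_PLY, constants are written as N * ONE_PLY, and a depth
is divided by ONE_PLY before it is used as a plain ply count. The sketch
below is illustrative only; the template parameter OnePly and the function
child_depth_in_plies() are hypothetical and not taken from the SF sources.

```cpp
#include <cassert>

// Illustrative only: OnePly stands in for ONE_PLY so that several values
// can be compared in one program. In the engine ONE_PLY is a single constant.
template <int OnePly>
int child_depth_in_plies(int depthPlies, int reductionPlies) {
    static_assert(OnePly > 0, "a ply must be a positive number of units");
    assert(depthPlies >= 0 && reductionPlies >= 0);
    const int depth     = depthPlies * OnePly;      // depth in internal units
    const int reduction = reductionPlies * OnePly;  // constants scale with OnePly
    const int newDepth  = depth - OnePly - reduction;
    return newDepth / OnePly;                       // back to whole plies
}
```

With this style, child_depth_in_plies<1>(10, 2), child_depth_in_plies<2>(10, 2)
and child_depth_in_plies<256>(10, 2) all give the same ply count.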
Verified for no speed regression at STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 95643 W: 17728 L: 17737 D: 60178
No functional change.
Rewritten so that the bonus/penalty we apply is explicit
in the search: hopefully this will lead to further
simplification/fixing of the current, rather messy
stats update code.
No functional change.
STC: (Yellow)
LLR: -2.96 (-2.94,2.94) [0.00,4.00]
Total: 69115 W: 12880 L: 12797 D: 43438
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 124163 W: 16923 L: 16442 D: 90798
Note: Based on past experiments / patches, history pruning
is quite TC sensitive. I believe the reason for this TC dependency
is that the CMH/FMH is a very large table that takes time to fill
up. In addition, having more time will increase the accuracy
of the stats' values.
Bench: 7351698
Checking for legality of a possible ponder move
must be done before we undo the first pv move,
of course. (spotted by mohammed li.)
This obviously only has any effect when playing in ponder mode.
No functional change.
Take advantage of the fact that VALUE_NONE = 32002 to remove
the condition.
Commented out rather than removed because it is tricky
to rely on the hidden value of VALUE_NONE, and the code
can break if we change VALUE_NONE in the future.
No functional change.
This code was added before the accurate pv patch, when
we retrieved PV directly from TT.
It's no longer required for correct (and long) PVs, and
it should be safe to remove.
Also, allowing helper threads to repeatedly overwrite the
TT doesn't seem to make sense (that was probably an unintended
side effect of Lazy SMP). Before Lazy SMP, only the main thread
ran the ID loop and inserted the PV into the TT.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74346 W: 13946 L: 13918 D: 46482
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 47265 W: 6531 L: 6447 D: 34287
bench: 8819179
Currently root moves are copied to all the threads
but are DTZ filtered only in the main thread at the
beginning of the search.
This patch moves the TB filtering before the
copy of root moves, fixing issue #679:
https://github.com/official-stockfish/Stockfish/issues/679
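
The ordering described above can be sketched as follows. This is a toy
model under stated assumptions: RootMove, tb_filter() and setup_threads()
are hypothetical stand-ins, not the engine's real types or functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not the engine's real types.
struct RootMove {
    int  move;      // opaque move identifier
    bool keptByTB;  // whether the tablebase filter would keep this move
};
using RootMoves = std::vector<RootMove>;

// Drop the moves the tablebase filter rejects.
inline void tb_filter(RootMoves& moves) {
    moves.erase(std::remove_if(moves.begin(), moves.end(),
                               [](const RootMove& m) { return !m.keptByTB; }),
                moves.end());
}

// Filter first, then give every thread a copy of the already filtered list.
inline std::vector<RootMoves> setup_threads(RootMoves moves, std::size_t threadCount) {
    assert(threadCount > 0);
    tb_filter(moves);
    return std::vector<RootMoves>(threadCount, moves);
}
```

Because the filter runs before the copy, every thread starts from the same
filtered list instead of only the main thread seeing it.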
No bench change.
There are two concepts in this patch:
1. Limit check extensions by using move count.
   The idea is to limit search explosion.
2. Always extend check if the first move gives check.
   The idea is to save expensive SEE calls, since the vast
   majority of first moves will have SEE value >= 0; also, the
   first move may still be strong even if the SEE is negative.
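
A minimal sketch of how the two concepts could combine into one decision.
The cutoff CheckExtMoveCountLimit, the function extend_check() and the exact
shape of the condition are assumptions made for illustration; the patch's
actual limit and condition are not given here. SEE is passed as a callable
so that the sketch can show when it is skipped.

```cpp
#include <cassert>
#include <functional>

// Hypothetical cutoff; the value used by the patch is not stated here.
constexpr int CheckExtMoveCountLimit = 8;

// Decide whether a checking move gets an extension.
// seeNonNegative is only invoked when the cheaper tests cannot decide.
inline bool extend_check(bool givesCheck, int moveCount,
                         const std::function<bool()>& seeNonNegative) {
    assert(moveCount >= 1);
    if (!givesCheck)
        return false;
    if (moveCount == 1)                       // first move: extend, no SEE call
        return true;
    if (moveCount >= CheckExtMoveCountLimit)  // late moves: never extend
        return false;
    return seeNonNegative();                  // otherwise pay for the SEE test
}
```

In this sketch the first move and the late moves never trigger a SEE
evaluation; only the moves in between do.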
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 16503 W: 3068 L: 2873 D: 10562
LTC:
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 37202 W: 5261 L: 5014 D: 26927
bench: 8543366
Currently, helper threads will only search up to the
specified depth limit. Now let them search until the
main thread has finished the specified depth.
On the other hand, we don't want to pick a thread with
a higher search depth.
This may be considered cheating. ;-)
No functional change.
And passed in do_move(), this ensures maximum efficiency and
speed and at the same time unlimited move numbers.
The drawback is that to handle Position init we need to
reserve a StateInfo inside Position itself and use it at
init time and when copying from another Position.
After lazy SMP we no longer need this gimmick, so we can
get rid of this special case and always pass an external
StateInfo to the Position object.
Also rewrote and simplified the Position constructors.
Verified it does not regress with a 3-thread SMP test:
ELO: -0.00 +-12.7 (95%) LOS: 50.0%
Total: 1000 W: 173 L: 173 D: 654
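
The ELO, error margin and LOS figures above can be reproduced from the
W/L/D counts with the usual logistic Elo model and a normal approximation.
This is a sketch under those assumptions; the tool that produced the line
above may compute them differently, and all names below are illustrative.

```cpp
#include <cassert>
#include <cmath>

struct MatchResult { double wins, losses, draws; };

// Average score per game: win = 1, draw = 0.5, loss = 0.
inline double score(const MatchResult& r) {
    const double n = r.wins + r.losses + r.draws;
    assert(n > 0.0);
    return (r.wins + 0.5 * r.draws) / n;
}

// Logistic Elo difference for an expected score s in (0, 1).
inline double elo_from_score(double s) {
    return -400.0 * std::log10(1.0 / s - 1.0);
}

// Likelihood of superiority, normal approximation on decisive games.
inline double los(const MatchResult& r) {
    return 0.5 * (1.0 + std::erf((r.wins - r.losses)
                                 / std::sqrt(2.0 * (r.wins + r.losses))));
}

// Half-width of the 95% interval on the Elo estimate.
inline double elo_margin95(const MatchResult& r) {
    const double n = r.wins + r.losses + r.draws;
    const double s = score(r);
    const double var = (r.wins * (1.0 - s) * (1.0 - s)
                        + r.losses * s * s
                        + r.draws * (0.5 - s) * (0.5 - s)) / n;
    const double half = 1.959964 * std::sqrt(var / n);
    return (elo_from_score(s + half) - elo_from_score(s - half)) / 2.0;
}
```

For W: 173 L: 173 D: 654 this gives an Elo difference of 0, a margin of
about 12.7 and an LOS of 50%.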
No functional change.
When starting a search in a mate or stalemate position, Stockfish does not
even bother to reinitialize and start worker threads. However, after the search
all threads are checked for the best move.
This can lead to bestmove and info being carried over from the last
search.
Example session:
setoption name threads value 7
go movetime 4000
position startpos moves f2f3 e7e5 g2g4 d8h4
go movetime 4000
Actual output is like (almost always):
[...]
bestmove e2e4
info depth 0 score mate 0
info depth 20 seldepth 29 multipv 1 score cp 28 [...] pv e2e4
bestmove e2e4
Expected output / output after fix:
[...]
bestmove e2e4 ponder e7e6
info depth 0 score mate 0
bestmove (none)
Resolves #623
PredictedDepth can be negative, causing the futility_margin to be negative.
It will be very difficult to tweak moveCount pruning and reduction formula, as they are tuned to prevent this behavior.
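
A small sketch of the problem and of one plausible guard, assuming the
remedy is to clamp the predicted depth at zero. The linear margin below is
a hypothetical stand-in for illustration; the engine's real formula and
the function names may differ.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical linear margin for illustration only.
inline int futility_margin(int depth) { return 200 * depth; }

// Clamp the predicted depth at zero so the margin can never go negative.
inline int predicted_depth(int newDepth, int reduction) {
    assert(reduction >= 0);
    return std::max(newDepth - reduction, 0);
}
```

Without the clamp, a reduction larger than the remaining depth (for example
newDepth = 2, reduction = 5) gives futility_margin(-3), a negative margin;
with it, the margin bottoms out at futility_margin(0).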
No functional change.
Resolves #587
There is no reason to compile 3 different copies of search(). PV nodes are on
the cold path, and PvNode is a template parameter, so there is no cost in
computing:
const bool RootNode = PvNode && (ss-1)->ply == 0;
And this simplifies code a tiny bit as well.
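
A stripped-down sketch of the idea, keeping only the node-type logic.
Stack here is a minimal stand-in with just a ply field, and is_root() is a
hypothetical wrapper used to expose the flag; neither is the engine's real
definition.

```cpp
#include <cassert>

// Minimal stand-in for the search stack entry; only `ply` is modelled.
struct Stack { int ply; };

// With PvNode as the only node-type template parameter, the root flag is
// derived at run time. When PvNode is false the expression short-circuits,
// so non-PV instantiations never read the stack for this test.
template <bool PvNode>
bool is_root(const Stack* ss) {
    assert(ss != nullptr);
    const bool RootNode = PvNode && (ss - 1)->ply == 0;
    return RootNode;
}
```

Only two instantiations are needed (PvNode true and false) instead of a
third one dedicated to the root.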
Speed impact is negligible on my machine (i7-3770k, linux 4.2, gcc 5.2):
         nps      +/-
test     2378605  3118
master   2383128  2793
diff     -4523    2746
Bench: 7751425
No functional change.
Resolves #568