Reduce the variation in Root Depth between different threads. This
prevents threads from searching at a depth much higher than Main Thread.
Performed well at STC 24 Threads:
ELO: 3.44 +-3.8 (95%) LOS: 96.1%
Total: 10000 W: 1627 L: 1528 D: 6845
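The ELO and LOS figures quoted with fixed-game results can be recomputed from the W/L/D counts. The sketch below assumes the standard logistic Elo model and the usual normal approximation for likelihood of superiority; the function names are made up for illustration and are not part of the testing framework.

```cpp
#include <cmath>

// Elo difference implied by a match score under the logistic model:
// elo = -400 * log10(1 / score - 1), with score = (W + D / 2) / N.
double elo_from_wld(double w, double l, double d) {
    double score = (w + 0.5 * d) / (w + l + d);
    return -400.0 * std::log10(1.0 / score - 1.0);
}

// Likelihood of superiority, normal approximation on decisive games only:
// LOS = Phi((W - L) / sqrt(W + L)).
double los_from_wl(double w, double l) {
    double z = (w - l) / std::sqrt(w + l);
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}
```

For the STC run above, elo_from_wld(1627, 1528, 6845) gives about 3.44 and los_from_wl(1627, 1528) about 0.961, in line with the reported values.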
And LTC 24 Threads
LLR: 1.43 (-2.94,2.94) [0.00,4.00]
Total: 3804 W: 500 L: 420 D: 2884
ELO : +7.31
p-value: 73.16%
Passed no regression at STC 3 Threads:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40457 W: 7148 L: 7060 D: 26249
And LTC 3 Threads:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 17704 W: 2489 L: 2364 D: 12851
Raising a pull request early as 24 Thread tests are very expensive and
this is clearly a positive gain at high thread counts and high time
controls. The change is a small parameter tweak with no additional
logic.
No functional change for single thread mode.
Resolves #481
Rely on well defined behaviour for message passing, instead of volatile. Three
versions have been tested, to make sure this wouldn't cause a slowdown on any
platform.
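As a rough illustration of the message-passing pattern, the sketch below publishes a payload through a std::atomic flag with release/acquire ordering (the v2 flavour). The names payload, ready and pass_message are invented for this example and are not taken from the patch.

```cpp
#include <atomic>
#include <thread>

int pass_message(int value) {
    int payload = 0;                 // plain data, written before the flag
    std::atomic<bool> ready{false};  // takes the place of a 'volatile bool' flag
    int received = -1;

    std::thread consumer([&] {
        // Acquire load: once 'true' is observed, the write to payload is visible.
        while (!ready.load(std::memory_order_acquire))
            std::this_thread::yield();
        received = payload;
    });

    payload = value;
    // Release store: publishes every write that precedes it.
    ready.store(true, std::memory_order_release);

    consumer.join();  // join() also synchronizes, so reading 'received' is safe
    return received;
}
```

With volatile alone the standard gives no such visibility guarantee; with the atomic flag, pass_message(42) is guaranteed to return 42.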
v1: Sequentially consistent atomics
No measurable regression, despite the extra memory barriers on x86. Even with 15
threads and extreme time pressure, both acting as a magnifying glass:
threads=15, tc=2+0.02
ELO: 2.59 +-3.4 (95%) LOS: 93.3%
Total: 18132 W: 4113 L: 3978 D: 10041
threads=7, tc=2+0.02
ELO: -1.64 +-3.6 (95%) LOS: 18.8%
Total: 16914 W: 4053 L: 4133 D: 8728
v2: Acquire/Release semantics
This version generates no extra barriers for x86 (on the hot path). As expected,
no regression either, under the same conditions:
threads=15, tc=2+0.02
ELO: 2.85 +-3.3 (95%) LOS: 95.4%
Total: 19661 W: 4640 L: 4479 D: 10542
threads=7, tc=2+0.02
ELO: 0.23 +-3.5 (95%) LOS: 55.1%
Total: 18108 W: 4326 L: 4314 D: 9468
As suggested by Joona, another test at LTC:
threads=15, tc=20+0.05
ELO: 0.64 +-2.6 (95%) LOS: 68.3%
Total: 20000 W: 3053 L: 3016 D: 13931
v3: Final version: SeqCst/Relaxed
threads=15, tc=10+0.1
ELO: 0.87 +-3.9 (95%) LOS: 67.1%
Total: 9541 W: 1478 L: 1454 D: 6609
Resolves #474
Use fewer parameters and less code to compute Threats.
Also includes a few spacing edits.
Run as a simplification.
Passed STC 10+0.1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 18879 W: 3725 L: 3600 D: 11554
Passed LTC 60+0.4
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74116 W: 11001 L: 10958 D: 52157
bench: 8004751
Fishtest is a key factor in SF's success.
Thanks to Fishtest we have not only greatly
improved ELO but, even more importantly, we
have enabled a kind of joint development and
testing that is the heart of an open
source project like SF.
Open sourcing is not just about open code, it is
about community development. For a chess engine
this has never been possible before, for lack of
a strong and strict testing environment that allows
many people to contribute in a safe and coordinated way.
Fishtest is a new way of developing chess engines,
something that has never existed before.
No functional change.
Collect and give a second try to some almost passed tuning attempts and
one-line tweaks from the last month.
Passed STC
LLR: 3.07 (-2.94,2.94) [0.00,4.00]
Total: 15124 W: 2974 L: 2756 D: 9394
And LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21577 W: 3507 L: 3289 D: 14781
Bench: 8855226
Resolves #464
Start all threads searching on the root position and
use only the shared TT as the synchronization scheme.
This scheme seems to scale better than YBWC for
a high number of threads.
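The idea can be illustrated with a toy problem: several threads all start the same recursive computation from the root and cooperate only through a shared lock-free table. This is a simplified analogy under stated assumptions, not the engine's search; paths, SharedTable and lazy_root_search are hypothetical names, and each table entry is a single atomic word with 0 meaning "empty".

```cpp
#include <array>
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

constexpr int N = 10;
using SharedTable = std::array<std::atomic<std::uint64_t>, (N + 1) * (N + 1)>;

// Number of monotone lattice paths to (r, c); always >= 1, so 0 can mean "empty".
std::uint64_t paths(SharedTable& tt, int r, int c) {
    if (r == 0 || c == 0) return 1;
    auto& entry = tt[r * (N + 1) + c];
    std::uint64_t cached = entry.load(std::memory_order_relaxed);
    if (cached != 0) return cached;             // already solved by some thread
    std::uint64_t v = paths(tt, r - 1, c) + paths(tt, r, c - 1);
    entry.store(v, std::memory_order_relaxed);  // racing writers store the same value
    return v;
}

// Every thread starts from the same root; the table is the only shared state.
std::uint64_t lazy_root_search(int threads) {
    SharedTable tt;
    for (auto& e : tt) e.store(0, std::memory_order_relaxed);

    std::vector<std::uint64_t> results(threads, 0);
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i)
        pool.emplace_back([&tt, &results, i] { results[i] = paths(tt, N, N); });
    for (auto& t : pool) t.join();

    for (std::uint64_t r : results)
        if (r != results[0]) return 0;          // would indicate a bug
    return results[0];
}
```

No thread waits for another here: work done by one thread simply shows up as table hits for the others.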
Verified for no regression at STC 3 threads
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40232 W: 6908 L: 7130 D: 26194
Verified for no regression at LTC 3 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 28186 W: 3908 L: 3798 D: 20480
Verified for no regression at STC 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 3607 W: 674 L: 526 D: 2407
Verified for no regression at LTC 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 4235 W: 671 L: 528 D: 3036
Tested with fixed games at LTC with 20 threads
ELO: 44.75 +-7.6 (95%) LOS: 100.0%
Total: 2069 W: 407 L: 142 D: 1520
Tested with fixed games at XLTC (120secs) with 20 threads
ELO: 28.01 +-6.7 (95%) LOS: 100.0%
Total: 2275 W: 349 L: 166 D: 1760
Original patch of mbootsector, with additional work
from Ivan Ivec (log formula), Joerg Oster (id loop
simplification) and Marco Costalba (assorted formatting
and rework).
Bench: 8116244
Apply bonus for the prior CMH that caused a fail low.
Balance Stats: CMH and History bonuses are updated differently.
This eliminates the "fudge" factor weight when scoring moves. It also
eliminates the discontinuity in the gravity history stat formula (i.e. stat
scores will no longer invert when depth exceeds 22).
STC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 21802 W: 4107 L: 3887 D: 13808
LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 46036 W: 7046 L: 6756 D: 32234
Bench: 7677367
Add Travis CI support to GitHub repo.
After every push to master, Travis will build
the sources directly from GitHub repo according
to .travis.yml and verify everything is ok.
No functional change.
Fix issues after a run of the PVS-Studio analyzer.
Mainly false positives, but the warnings are still
useful for pointing out code that is not very readable.
Noteworthy is the memset() one, where PVS prefers ss-2
instead of stack. This is because memset() could
be optimized away by the compiler when using 'stack',
due to stack being a local variable that is no longer used after
memset. This should normally not happen, but when
it does it leads to very subtle and difficult
to find bugs, so better safe than sorry.
No functional change.
When changing 'search' and 'splitPointsSize' we have to
use thread locks, not split point ones, because can_join()
is called under the former.
Verified successfully with 24-hour torture tests on a 20-core
machine by Louis Zulli: it does not hang.
Verified for no regressions at STC, 7 threads:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 52804 W: 8159 L: 8087 D: 36558
No functional change.
Only refresh TT entry when it's really necessary.
This should give a small speed boost for some machines.
And it's a risk-free change.
No functional change.
Resolves #429
Louis Zulli reported that Stockfish suffers from very occasional hangs on his 20-core machine.
Careful SMP debugging revealed that this was caused by "a ghost split point slave", where a thread
was marked as a split point slave but wasn't actually working on it.
The only logical explanation for this was double booking, where due to an SMP race the same thread
is booked for two different split points simultaneously.
Due to the very intermittent nature of the problem, we can't say exactly how this happens.
The current handling of thread-specific variables is risky though. Volatile variables are in some
cases changed without the spinlock being held. In this case the standard doesn't give us any kind of
guarantees about how the updated values are propagated to other threads.
We resolve the situation by enforcing very strict locking rules:
- Values for key thread variables (splitPointsSize, activeSplitPoint, searching)
can only be changed when the thread specific spinlock is held.
- Structural changes for splitPoints[] are only allowed when the thread specific spinlock is held.
- Thread booking decisions (per split point) can only be done when the thread specific spinlock is held.
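The booking rule can be sketched as follows. This is a minimal illustration, assuming a test-and-set spinlock and a stripped-down Worker type; Spinlock, Worker, try_book and count_successful_bookings are hypothetical names, not the engine's classes.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock built on std::atomic_flag.
class Spinlock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock()   { while (flag.test_and_set(std::memory_order_acquire)) {} }
    void unlock() { flag.clear(std::memory_order_release); }
};

struct Worker {
    Spinlock spinlock;
    bool searching = false;     // only read or written with spinlock held
    int activeSplitPoint = -1;  // only read or written with spinlock held

    // Check and book in one critical section, so two masters cannot
    // both see the worker as idle.
    bool try_book(int splitPoint) {
        spinlock.lock();
        bool ok = !searching;
        if (ok) {
            searching = true;
            activeSplitPoint = splitPoint;
        }
        spinlock.unlock();
        return ok;
    }
};

// Let several "masters" race to book the same worker.
int count_successful_bookings(int masters) {
    Worker w;
    std::atomic<int> booked{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < masters; ++i)
        pool.emplace_back([&w, &booked, i] { if (w.try_book(i)) ++booked; });
    for (auto& t : pool) t.join();
    return booked.load();
}
```

However many masters compete, exactly one booking succeeds, which is the property that rules out double booking.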
With these changes, hangs no longer occurred during 2 days of torture testing on Zulli's machine.
We probably pay a slight performance penalty in SMP mode due to more locking.
STC (7 threads):
ELO: -1.00 +-2.2 (95%) LOS: 18.4%
Total: 30000 W: 4538 L: 4624 D: 20838
However, stability is worth more than 1-2 ELO points in this case.
No functional change.
Resolves #422
v = value without ep capture being considered
v1 = value of the ep capture
The correct logic is:
if without the e.p. capture we are losing, and the value of the e.p. capture is either a draw, a win, or "loss, but the 50-move rule saves us", then we should use the value of the e.p. capture.
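A literal transcription of this rule might look like the sketch below. It is not the probing code itself: the WDL encoding and the name combine_ep are assumptions, and "losing" is read here as any value below a draw.

```cpp
// Assumed WDL encoding, in the style commonly used by tablebase probing code.
enum WDL { Loss = -2, BlessedLoss = -1, Draw = 0, CursedWin = 1, Win = 2 };

// v  : value without the e.p. capture being considered
// v1 : value of the e.p. capture
WDL combine_ep(WDL v, WDL v1) {
    bool losingWithoutEp = v < Draw;   // assumption: "losing" means below a draw
    bool epSaves = v1 >= BlessedLoss;  // draw, win, or loss saved by the 50-move rule
    return (losingWithoutEp && epSaves) ? v1 : v;
}
```

For example, combine_ep(Loss, Draw) yields Draw, while combine_ep(Loss, Loss) stays Loss.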
Credit and thanks to syzygy1 and lantonov!
No functional change (except with syzygy bases).
Resolves #415
Resolves #394
Instead of using hard coded Min and Max values for history,
always adjust the old value slightly downwards before adding a new value.
The adjustment acts like gravity that prevents the value escaping too
far from zero.
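A minimal sketch of such a gravity update, assuming an integer history value and a hypothetical scale constant Limit (the patch's actual constants are not shown here):

```cpp
#include <cstdlib>

constexpr int Limit = 1000;  // hypothetical scale, not the engine's constant

// Pull the old value toward zero in proportion to the size of the bonus,
// then add the bonus. As long as |bonus| <= Limit, the result stays
// within [-Limit, Limit] without any explicit clamping.
int update_history(int old, int bonus) {
    old -= old * std::abs(bonus) / Limit;
    return old + bonus;
}

// Apply the same bonus repeatedly and return where the value settles.
int settle(int start, int bonus, int steps) {
    int h = start;
    for (int i = 0; i < steps; ++i)
        h = update_history(h, bonus);
    return h;
}
```

Repeated positive bonuses push the value toward Limit and repeated negative ones toward -Limit, but it never escapes that range, so no hard-coded Min and Max are needed.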
Bench: 8020484
Resolves #407
Apart from the usual renaming, take advantage of
C++11 function template default parameters to
get rid of the Eval trampoline functions.
Some trivial fixes while there.
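For readers unfamiliar with the feature, here is a small sketch of how a default template argument removes the need for a trampoline. The evaluate function and its Trace parameter are illustrative stand-ins, not the engine's actual signatures.

```cpp
#include <string>

// Before C++11, function templates could not have default template
// arguments, so a non-template wrapper ("trampoline") was needed to call
// the <false> instantiation. With a default, callers can write evaluate(x).
template<bool Trace = false>
int evaluate(int material, std::string* log = nullptr) {
    int score = material * 2;  // stand-in for the real evaluation terms
    if (Trace && log)
        *log += "material term: " + std::to_string(score);
    return score;
}
```

A plain evaluate(21) picks the non-tracing instantiation, while evaluate<true>(21, &log) returns the same score and also fills the log.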
No functional change.
Align the behaviour with reductions. Initially castling moves had to be
treated differently, because the SEE did not handle them correctly. But now it
does.
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 83750 W: 15722 L: 15711 D: 52317
LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 97183 W: 15120 L: 15115 D: 66948
bench: 7759837
Resolves #403
PawnSafePush, with the value S(5,5), proved not to be "necessary",
possibly due to recent changes to MobilityArea and to the Connected bonus.
STC:
LLR: 3.22 (-2.94,2.94) [-3.00,1.00]
Total: 98528 W: 18757 L: 18759 D: 61012
LTC:
LLR: 5.30 (-2.94,2.94) [-3.00,1.00]
Total: 204194 W: 31698 L: 31734 D: 140762
Bench: 7620871
Resolves #396
Use Position::square and Position::squares instead.
This allows us to remove king_square(), simplify
endgames and have more uniform naming.
Moreover, this is a prerequisite step in case
we decide in the future to retire piece lists
altogether and use pop_lsb() to loop across
pieces and serialize the moves. In that case
we just need to change the definition of Position::square
to something like:
  template<PieceType Pt> inline
  Square Position::square(Color c) const {
    return lsb(byColorBB[c] & byTypeBB[Pt]);
  }
No functional change.
Based off of Pull request #383:
Include squares occupied by some pawns in the MobilityArea
a) not blocked
b) on rank 4 and above
c) or captures
Passed STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8157 W: 1644 L: 1516 D: 4997
And LTC
LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 26086 W: 4274 L: 4051 D: 17761
-----------
Then a simplification test, trying to remove b) and c), failed:
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6048 W: 1117 L: 1288 D: 3643
Another simplification test was run to remove just c):
Passed STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28073 W: 5364 L: 5255 D: 17454
And LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 34652 W: 5448 L: 5348 D: 23856
A parameter tweak test showed that changing b) to "on rank 3 and above"
does not work:
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 5233 W: 937 L: 1077 D: 3219
Finally, a small rewrite, and we have this version
Include squares occupied by some pawns in the MobilityArea which are
a) not blocked
b) on rank 4 and above
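In bitboard terms, the two conditions could be expressed roughly as below. This is a sketch under assumptions: it takes White's point of view, uses the a1 = bit 0 layout, treats "blocked" as "the square directly in front is occupied", and the function name is invented.

```cpp
#include <cstdint>

using Bitboard = std::uint64_t;

constexpr Bitboard Rank1BB  = 0xFFULL;
constexpr Bitboard LowRanks = Rank1BB | (Rank1BB << 8) | (Rank1BB << 16);  // ranks 1-3

// Own pawn squares that remain part of the mobility area (White's view):
// a) not blocked, b) on rank 4 and above.
Bitboard pawns_in_mobility_area(Bitboard ourPawns, Bitboard occupied) {
    Bitboard blocked = ourPawns & (occupied >> 8);  // front square is occupied
    return ourPawns & ~blocked & ~LowRanks;
}
```

With pawns on a2, d4 and e4 and an enemy piece on d5, only the e4 pawn's square is kept: a2 fails b) and d4 fails a).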
Bench: 8977899
Resolves #385