- Clean signature of functions in namespace NNUE
- Add comment for countermove based pruning
- Remove bestMoveCount variable
- Add const qualifier to kpp_board_index array
- Fix spaces in get_best_thread()
- Fix indention in capture LMR code in search.cpp
- Rename TtmemDeleter to LargePageDeleter
Closes https://github.com/official-stockfish/Stockfish/pull/3063
No functional change
covers the most important cases from the user perspective:
It embeds the default net in the binary, so a download of that binary will result
in a working engine with the default net. The engine will be functional in the default mode
without any additional user action.
It allows non-default nets to be used, which will be looked for in up to
three directories (working directory, location of the binary, and optionally a specific default directory).
This mechanism is also kept for those developers that use MSVC,
the one compiler that doesn't have an easy mechanism for embedding data.
It is possible to disable embedding, and instead specify a specific directory, e.g. linux distros might want to use
CXXFLAGS="-DNNUE_EMBEDDING_OFF -DDEFAULT_NNUE_DIRECTORY=/usr/share/games/stockfish/" make -j ARCH=x86-64 profile-build
passed STC non-regression:
https://tests.stockfishchess.org/tests/view/5f4a581c150f0aef5f8ae03a
LLR: 2.95 (-2.94,2.94) {-1.25,-0.25}
Total: 66928 W: 7202 L: 7147 D: 52579
Ptnml(0-2): 291, 5309, 22211, 5360, 293
closes https://github.com/official-stockfish/Stockfish/pull/3070
fixes https://github.com/official-stockfish/Stockfish/issues/3030
No functional change.
This patch ports the efficiently updatable neural network (NNUE) evaluation to Stockfish.
Both the NNUE and the classical evaluations are available, and can be used to
assign a value to a position that is later used in alpha-beta (PVS) search to find the
best move. The classical evaluation computes this value as a function of various chess
concepts, handcrafted by experts, tested and tuned using fishtest. The NNUE evaluation
computes this value with a neural network based on basic inputs. The network is optimized
and trained on the evalutions of millions of positions at moderate search depth.
The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward.
It can be evaluated efficiently on CPUs, and exploits the fact that only parts
of the neural network need to be updated after a typical chess move.
[The nodchip repository](https://github.com/nodchip/Stockfish) provides additional
tools to train and develop the NNUE networks.
This patch is the result of contributions of various authors, from various communities,
including: nodchip, ynasu87, yaneurao (initial port and NNUE authors), domschl, FireFather,
rqs, xXH4CKST3RXx, tttak, zz4032, joergoster, mstembera, nguyenpham, erbsenzaehler,
dorzechowski, and vondele.
This new evaluation needed various changes to fishtest and the corresponding infrastructure,
for which tomtor, ppigazzini, noobpwnftw, daylen, and vondele are gratefully acknowledged.
The first networks have been provided by gekkehenker and sergiovieri, with the latter
net (nn-97f742aaefcd.nnue) being the current default.
The evaluation function can be selected at run time with the `Use NNUE` (true/false) UCI option,
provided the `EvalFile` option points the the network file (depending on the GUI, with full path).
The performance of the NNUE evaluation relative to the classical evaluation depends somewhat on
the hardware, and is expected to improve quickly, but is currently on > 80 Elo on fishtest:
60000 @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f28fe6ea5abc164f05e4c4c
ELO: 92.77 +-2.1 (95%) LOS: 100.0%
Total: 60000 W: 24193 L: 8543 D: 27264
Ptnml(0-2): 609, 3850, 9708, 10948, 4885
40000 @ 20+0.2 th 8
https://tests.stockfishchess.org/tests/view/5f290229a5abc164f05e4c58
ELO: 89.47 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 12756 L: 2677 D: 24567
Ptnml(0-2): 74, 1583, 8550, 7776, 2017
At the same time, the impact on the classical evaluation remains minimal, causing no significant
regression:
sprt @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f2906a2a5abc164f05e4c5b
LLR: 2.94 (-2.94,2.94) {-6.00,-4.00}
Total: 34936 W: 6502 L: 6825 D: 21609
Ptnml(0-2): 571, 4082, 8434, 3861, 520
sprt @ 60+0.6 th 1
https://tests.stockfishchess.org/tests/view/5f2906cfa5abc164f05e4c5d
LLR: 2.93 (-2.94,2.94) {-6.00,-4.00}
Total: 10088 W: 1232 L: 1265 D: 7591
Ptnml(0-2): 49, 914, 3170, 843, 68
The needed networks can be found at https://tests.stockfishchess.org/nns
It is recommended to use the default one as indicated by the `EvalFile` UCI option.
Guidelines for testing new nets can be found at
https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#nnue-net-tests
Integration has been discussed in various issues:
https://github.com/official-stockfish/Stockfish/issues/2823https://github.com/official-stockfish/Stockfish/issues/2728
The integration branch will be closed after the merge:
https://github.com/official-stockfish/Stockfish/pull/2825https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip
closes https://github.com/official-stockfish/Stockfish/pull/2912
This will be an exciting time for computer chess, looking forward to seeing the evolution of
this approach.
Bench: 4746616
for users that set the needed privilige "Lock Pages in Memory"
large pages will be automatically enabled (see Readme.md).
This expert setting might improve speed, 5% - 30%, depending
on the hardware, the number of threads and hash size. More for
large hashes, large number of threads and NUMA. If the operating
system can not allocate large pages (easier after a reboot), default
allocation is used automatically. The engine log provides details.
closes https://github.com/official-stockfish/Stockfish/pull/2656
fixes https://github.com/official-stockfish/Stockfish/issues/2619
No functional change
The purpose of the code is to allow developers to easily and flexibly
setup SF for a tuning session. Mainly you have just to remove 'const'
qualifiers from the variables you want to tune and flag them for
tuning, so if you have:
int myKing = 10;
Score myBonus = S(5, 15);
Value myValue[][2] = { { V(100), V(20) }, { V(7), V(78) } };
and at the end of the update you may want to call
a post update function:
void my_post_update();
If instead of default Option's min-max values,
you prefer your custom ones, returned by:
std::pair<int, int> my_range(int value)
Or you jus want to set the range directly, you can
simply add below:
TUNE(SetRange(my_range), myKing, SetRange(-200, 200), myBonus, myValue, my_post_update);
And all the magic happens :-)
At startup all the parameters are printed in a
format suitable to be copy-pasted in fishtest.
In case the post update function is slow and you have many
parameters to tune, you can add:
UPDATE_ON_LAST();
And the values update, including post update function call, will
be done only once, after the engine receives the last UCI option.
The last option is the one defined and created as the last one, so
this assumes that the GUI sends the options in the same order in
which have been defined.
closes https://github.com/official-stockfish/Stockfish/pull/2654
No functional change.
In lazySMP it makes sense to prune a little more, as multiple threads
search wider. We thus increase the prefactor of the reductions slowly
as a function of the threads. The prefactor of the log(threads) term
is a parameter, this pull request uses 1/2 after testing.
passed STC @ 8threads:
LLR: 2.96 (-2.94,2.94) [0.50,4.50]
Total: 118125 W: 23151 L: 22462 D: 72512
http://tests.stockfishchess.org/tests/view/5d8bbf4d0ebc59509180f217
passed LTC @ 8threads:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 67546 W: 10630 L: 10279 D: 46637
http://tests.stockfishchess.org/tests/view/5d8c463b0ebc5950918167e8
passed ~LTC @ 14threads:
LLR: 2.95 (-2.94,2.94) [0.00,3.50]
Total: 74271 W: 12421 L: 12040 D: 49810
http://tests.stockfishchess.org/tests/view/5d8db1f50ebc590f3beb24ef
Note:
A larger prefactor (1) passed similar tests at STC and LTC (8 threads),
while a very large one (2) passed STC quickly but failed LTC (8 threads).
For the single-threaded case there is no functional change.
Closes https://github.com/official-stockfish/Stockfish/pull/2337
Bench: 4088701
Fixup: remove redundant code.
Similar to PSQT we only need one instance of the Endgames resource. The current per thread copies are identical and read only(after initialization) so from a design point of view it doesn't make sense to have them.
Tested for no slowdown.
http://tests.stockfishchess.org/tests/view/5c94377a0ebc5925cfff43ca
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17320 W: 3487 L: 3359 D: 10474
No functional change.
Simplification which removes the pawns connected array.
Instead of storing the values in an array, the values are
calculated real-time. This is about 1.6% faster on my machines.
Performance:
master ave nps: 159,248,672
patch ave nps: 161,905,592
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 20363 W: 4579 L: 4455 D: 11329
http://tests.stockfishchess.org/tests/view/5c9925ba0ebc5925cfff79a6
Non functional change.
Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.
No functional change
This patch will make possible to free mapped TB files with "ucinewgame" command.
We wrote this patch specifically to address a problem that arose while
running Stockfish with 7-piece tablebases as a kibitzer at TCEC for
extended periods of time across multiple games. It was noted that after
some time, the NPS of the kibitzing Stockfish (which is usually 3x faster
than the Stockfish actually competing) would drop precipitously, eventually
falling to preposterously low numbers until restarted.
Their eval bot basically inputs FEN, go infinite, stop and loop, it probably
didn't do ucinewgame either. As time goes it gradually slowed down and OS
starts to use swap, this is not reasonable since the engine only uses 16GB
hash and the machine has 1TB physical RAM and does nothing else.
Author : noobpwnftw
Closes https://github.com/official-stockfish/Stockfish/pull/1790
No functional change.
Makes sure the potential benefit of first touch does not depend on
the order of the UCI commands Threads and Hash, by reallocating the
hash if a Threads is issued. The cost is zeroing the TT once more
than needed. In case the prefered order (first Threads than Hash)
is employed, this amounts to zeroing the default sized TT (16Mb),
which is essentially instantaneous.
Follow up for https://github.com/official-stockfish/Stockfish/pull/1601
where additional data and discussion is available.
Closes https://github.com/official-stockfish/Stockfish/pull/1620
No functional change.
This patch adds some documentation and code cleanup to tablebase code.
It took me some time to understand the relation among the differrent
structs, although I have rewrote them fully in the past. So I wrote
some detailed documentation to avoid the same efforts for future readers.
Also noteworthy is the use a standard hash table implementation with a
more efficient 1D array instead of a 2D array. This reduces the average
lookup steps of 90% (from 343 to 38 in a bench 128 1 16 run) and reduces
also the table from 5K to 4K
entries.
I have tested on 5-men and no functional and no slowdown reported. It
should be verified on 6-men that the new hash does not overflow. It is
enough to run ./stockfish with 6-men available: if it does not assert at
startup it means everything is ok with 6-men too.
EDIT: verified for 6-men tablebase by Jörg Oster. Thanks!
No functional change.
The heuristic to avoid thread binding if less than 8 threads are requested resulted in the first 7 threads not being bound.
The branch was verified to yield a roughly 13% speedup by @CoffeeOne on the appropriate hardware and OS, and an earlier version of this patch tested well on his machine:
http://tests.stockfishchess.org/tests/view/5a3693480ebc590ccbb8be5a
ELO: 9.24 +-4.6 (95%) LOS: 100.0%
Total: 5000 W: 634 L: 501 D: 3865
To make sure all threads (including mainThread) are bound as soon as the total number exceeds 7, recreate all threads on a change of thread number.
To do this, unify Threads::init, Threads::exit and Threads::set are unified in a single Threads::set function that goes through the needed steps.
The code includes several suggestions from @joergoster.
Fixes issue #1312
No functional change
Simplify out low level sync stuff (mutex
and friends) and avoid to use them directly
in many functions.
Also some renaming and better comment while
there.
No functional change.
Better split code that should be run at
startup from code run at ucinewgame. Also
fix several races when 'bench', 'perft' and
'ucinewgame' are sent just after 'bestomve'
from the engine threads are still running.
Also use a specific UI thread instead of
main thread when setting up the Position
object used by UI uci loop. This fixes a
race when sending 'eval' command while searching.
We accept a race on 'setoption' to allow the
GUI to change an option while engine is searching
withouth stalling the pipe. Note that changing an
option while searchingg is anyhow not mandated by
UCI protocol.
No functional change.
execute an implied ucinewgame upon entering the UCI::loop,
to make sure that searches starting with and without an (optional) ucinewgame
command yield the same search.
This is needed now that seach::clear() initializes tables to non-zero default values.
No functional change
Closes#1101Closes#1104
Rescales the king danger variables in evaluate_king() to
suppress the KingDanger[] array. This avoids the cost of
the memory accesses to the array and simplifies the non-linear
transformation used.
Full credits to "hxim" for the seminal idea and implementation,
see pull request #786.
https://github.com/official-stockfish/Stockfish/pull/786
Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 9649 W: 1829 L: 1689 D: 6131
Passed LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 53494 W: 7254 L: 7178 D: 39062
Bench: 6116200
Fix issues after a run of PVS-STUDIO analyzer.
Mainly false positives but warnings are anyhow
useful to point out not very readable code.
Noteworthy is the memset() one, where PVS prefers ss-2
instead of stack. This is because memeset() could
be optimized away by the compiler when using 'stack',
due to stack being a local variable no more used after
memset. This should normally not happen, but when
it happens it leads to very sublte and difficult
to find bug, so better to be safe than sorry.
No functional change.
Easier for tuning psq tables:
TUNE(myParameters, PSQT::init);
Also move PSQT code in a new *.cpp file, and retire the
old and hacky psqtab.h that required to be included only
once to work correctly, this is not idiomatic for a header
file.
Give wide visibility to psq tables (previously visible only
in position.cpp), this will easy the use of psq tables outside
Position, for instance in move ordering.
Finally trivial code style fixes of the latest patches.
Original patch of Lucas Braesch.
No functional change.
Import C++11 branch from:
https://github.com/mcostalba/Stockfish/tree/c++11
The version imported is teh last one as of today:
6670e93e50
Branch is fully equivalent with master but syzygy
tablebases that are missing (but will be added with
next commit).
bench: 8080602
Adds support for Syzygy tablebases to Stockfish. See
the Readme for information on using the tablebases.
Tablebase support can be enabled/disabled at the Makefile
level as well, by setting syzygy=yes or syzygy=no.
Big/little endian are both supported.
No functional change (if Tablebases are not used).
Resolves#6
Make even more clear what are the terms that
contribute to evaluate connected pawns, and
completely separate them from the weights
that are now fully looked up in a table.
For future tuning makes sense to init the table with
a formula instead of hard-code it. This allows to
reduce problem space cardinality and makes tuning
easier.
And fix a MSVC warning while there:
warning C4804: '>>' : unsafe use of type 'bool' in operation
No functional change.
These two notions are very correlated. Since connected has the most
generality, it makes sense to generalize it to encompass what is
covered by candidate.
STC:
LLR: 4.03 (-2.94,2.94) [-3.00,1.00]
Total: 11970 W: 2577 L: 2379 D: 7014
LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13194 W: 2389 L: 2255 D: 8550
bench 7328585
Now that we use pre-increment on enums, it
make sense, for code style uniformity, to
swith to pre-increment also for native types,
although there is no speed difference.
No functional change.