Align the TT allocation by 2M to make it huge page friendly and advise the
kernel to use huge pages.
Benchmarks on my i7-8700K (6C/12T) box: (3 runs per bench per config)
vanilla (nps) hugepages (nps) avg
==================================================================================
bench | 3012490 3024364 3036331 3071052 3067544 3071052 +1.5%
bench 16 12 20 | 19237932 19050166 19085315 19266346 19207025 19548758 +1.1%
bench 16384 12 20 | 18182313 18371581 18336838 19381275 19738012 19620225 +7.0%
On my box, huge pages have a significant perf impact when using a big
hash size. They also speed up TT initialization big time:
vanilla (s) huge pages (s) speed-up
=======================================================================
time stockfish bench 16384 1 1 | 5.37 1.48 3.6x
In practice, huge pages with auto-defrag may always be enabled in the
system, in which case this patch has no effect. This
depends on the values in /sys/kernel/mm/transparent_hugepage/enabled
and /sys/kernel/mm/transparent_hugepage/defrag.
closes https://github.com/official-stockfish/Stockfish/pull/2463
No functional change
Official release version of Stockfish 11.
Bench: 5156767
-----------------------
It is our pleasure to release Stockfish 11 to our fans and supporters.
Downloads are freely available at http://stockfishchess.org/download/
This version 11 of Stockfish is 50 Elo stronger than the last version, and
150 Elo stronger than the version which famously lost a match to AlphaZero
two years ago. This makes Stockfish the strongest chess engine running on
your smartphone or normal desktop PC, and we estimate that on a modern four
cores CPU, Stockfish 11 could give 1:1000 time odds to the human chess champion
having classical time control, and be on par with him. More specific data,
including nice cumulative curves for the progression of Stockfish strength
over the last seven years, can be found on [our progression page][1], at
[Stefan Pohl site][2] or at [NextChessMove][3].
In October 2019 Stockfish has regained its crown in the TCEC competition,
beating in the superfinal of season 16 an evolution of the neural-network
engine Leela that had won the previous season. This clash of style between an
alpha-beta and an neural-network engine produced spectacular chess as always,
with Stockfish [emerging victorious this time][0].
Compared to Stockfish 10, we have made hundreds of improvements to the
[codebase][4], from the evaluation function (improvements in king attacks,
middlegame/endgame transitions, and many more) to the search algorithm (some
innovative coordination methods for the searching threads, better pruning of
unsound tactical lines, etc), and fixed a couple of bugs en passant.
Our testing framework [Fishtest][5] has also seen its share of improvements
to continue propelling Stockfish forward. Along with a lot of small enhancements,
Fishtest has switched to new SPRT bounds to increase the chance of catching Elo
gainers, along with a new testing book and the use of pentanomial statistics to
be more resource-efficient.
Overall the Stockfish project is an example of open-source at its best, as
its buzzing community of programmers sharing ideas and daily reviewing their
colleagues' patches proves to be an ideal form to develop innovative ideas for
chess programming, while the mathematical accuracy of the testing framework
allows us an unparalleled level of quality control for each patch we put in
the engine. If you wish, you too can help our ongoing efforts to keep improving
it, just [get involved][6] :-)
Stockfish is also special in that every chess fan, even if not a programmer,
[can easily help][7] the team to improve the engine by connecting their PC to
Fishtest and let it play some games in the background to test new patches.
Individual contributions vary from 1 to 32 cores, but this year Bojun Guo
made it a little bit special by plugging a whole data center during the whole
year: it was a vertiginous experience to see Fishtest spikes with 17466 cores
connected playing [25600 games/minute][8]. Thanks Guo!
The Stockfish team
[0]: <http://mytcecexperience.blogspot.com/2019/10/season-16-superfinal-games-91-100.html>
[1]: <https://github.com/glinscott/fishtest/wiki/Regression-Tests>
[2]: <https://www.sp-cc.de/index.htm>
[3]: <https://nextchessmove.com/dev-builds>
[4]: <https://github.com/official-stockfish/Stockfish>
[5]: <https://tests.stockfishchess.org/tests>
[6]: <https://stockfishchess.org/get-involved/>
[7]: <https://github.com/glinscott/fishtest/wiki>
[8]: <https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/lebEmG5vgng%5B1-25%5D>
This patch shows a description of the compiler used to compile Stockfish,
when starting from the console.
Usage:
```
./stockfish
compiler
```
Example of output:
```
Stockfish 120120 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
Compiled by clang++ 9.0.0 on Apple
__VERSION__ macro expands to: 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38)
```
No functional change
Stockfish crashes immediately if users enter a wrong file name (or even an existing
folder name) for debug log file. It may be hard for users to find out since it prints
nothing. If they enter the string via a chess GUI, the chess GUI may remember and
auto-send to Stockfish next time, makes Stockfish crashes all the time. Bug report by
Nguyen Hong Pham in this issue: https://github.com/official-stockfish/Stockfish/issues/2365
This patch avoids the crash and instead prefers to exit gracefully with a error
message on std:cerr, like we do with the fenFile for instance.
Closes https://github.com/official-stockfish/Stockfish/pull/2366
No functional change.
Revert the previous patch now that the binary for the super-final
of TCEC season 16 has been sent.
Maybe the feature of showing the name of compiler will be added to the
master branch in the future. But we may use a cleaner way to code it, see
some ideas using the Makefile approach at the end of pull request #2327 :
https://github.com/official-stockfish/Stockfish/pull/2327
Bench: 3618154
This patch shows a description of the compiler used to compile Stockfish,
when starting from the console.
Usage:
```
./stockfish
compiler
```
Example of output:
```
Stockfish 240919 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
Compiled by clang++ 9.0.0 on Apple
__VERSION__ macro expands to: 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38)
```
No functional change
This pull request fixes a rare crashing bug on Windows inside our NUMA code, first
reported by Dann Corbit in the following forum thread (thanks!):
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/gA6aoMEuOwg
The fix is to not use structure member beyond known size when iterating through
'SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX' structure. We note that the Microsoft
API is guaranteed to provide us at least one element upon successful, and no
element in the structure can have a zero size.
No functional change.
Official release version of Stockfish 10.
This is also the 10th anniversary version of the Stockfish project, which
started exactly ten years ago! I wish to extend a huge thank you to
all contributors and authors in our amazing community :-)
Bench: 3939338
Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.
No functional change
This patch implements some idea by Alain Savard and Mike Whiteley taken from the perpertual renaming/reformatting thread.
This is a pure code cleaning patch (so no change in functionality), but I use it as a pretext to correct the bogus bench number that I introduced in the previous commit.
Bench: 4413383
Silences the following warnings when compiling with GCC 8.
The fix is to use an intermediate pointer to anonymous function:
```
misc.cpp: In function 'int WinProcGroup::get_group(size_t)':
misc.cpp:241:77: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun1_t' {aka 'bool (*)(_LOGICAL_PROCESSOR_RELATIONSHIP, _SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*, long unsigned int*)'} [-Wcast-function-type]
auto fun1 = (fun1_t)GetProcAddress(k32, "GetLogicalProcessorInformationEx");
^
misc.cpp: In function 'void WinProcGroup::bindThisThread(size_t)':
misc.cpp:309:71: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun2_t' {aka 'bool (*)(short unsigned int, _GROUP_AFFINITY*)'} [-Wcast-function-type]
auto fun2 = (fun2_t)GetProcAddress(k32, "GetNumaNodeProcessorMaskEx");
^
misc.cpp:310:67: warning: cast between incompatible function types from 'FARPROC' {aka 'long long int (*)()'} to 'fun3_t' {aka 'bool (*)(void*, const _GROUP_AFFINITY*, _GROUP_AFFINITY*)'} [-Wcast-function-type]
auto fun3 = (fun3_t)GetProcAddress(k32, "SetThreadGroupAffinity");
^
```
No functional change.
The heuristic to avoid thread binding if less than 8 threads are requested resulted in the first 7 threads not being bound.
The branch was verified to yield a roughly 13% speedup by @CoffeeOne on the appropriate hardware and OS, and an earlier version of this patch tested well on his machine:
http://tests.stockfishchess.org/tests/view/5a3693480ebc590ccbb8be5a
ELO: 9.24 +-4.6 (95%) LOS: 100.0%
Total: 5000 W: 634 L: 501 D: 3865
To make sure all threads (including mainThread) are bound as soon as the total number exceeds 7, recreate all threads on a change of thread number.
To do this, unify Threads::init, Threads::exit and Threads::set are unified in a single Threads::set function that goes through the needed steps.
The code includes several suggestions from @joergoster.
Fixes issue #1312
No functional change
Some warnings after a run of:
$ clang-tidy-3.8 -checks='modernize-*' *.cpp syzygy/*.cpp -header-filter=.* -- -std=c++11
I have not fixed all suggestions, for instance I still prefer
to declare the type instead of a spread use of 'auto'. I also
perfer good old 'typedef' to the new 'using' form.
I have not fixed some warnings in the last functions of
syzygy code because those are still the original functions
and need to be completely rewritten anyhow.
Thanks to erbsenzaehler for the original idea.
No functional change.
The needed Windows API for processor groups could be missed from old Windows
versions, so instead of calling them directly (forcing the linker to resolve
the calls at compile time), try to load them at runtime. To do this we need
first to define the corresponding function pointers.
Also don't interfere with running fishtest on numa hardware with Windows.
Avoid all stockfish one-threaded processes will run on the same node
No functional change.
Under Windows it is not possible for a process to run on more than one
logical processor group. This usually means to be limited to use max 64
cores. To overcome this, some special platform specific API should be
called to set group affinity for each thread. Original code from Texel by
Peter sterlund.
Tested by Jean-Paul Vael on a Xeon E7-8890 v4 with 88 threads and confimed
speed up between 44 and 88 threads is about 30%, as expected.
No functional change.
Allow to specifiy the log file name, this comes
handy in case of self-matches so that each SF
instance writes into a different log file.
No functional change.
Fishtest is a key factor of SF success.
Thanks to Fishtest we have not only greately
improved ELO but, even more important, we
have enabled a kind of joint development and
testing that it is the herat of on open
source project like SF.
Open sourcing is not just about open code, it is
about commuity development. In case of a chess engine
this has never been possible before due to missing
a strong and strict testing environment that allows
many people to contribute in a safe and coordinate way.
Fishtest is a new way of developing chess engines,
something that has never exsisted before.
No functional change.
Funny enough, gcc __builtin_prefetch() expects
already a void*, instead Windows's _mm_prefetch()
requires a char*.
The patch allows to remove ugly casts from caller
sites.
No functional change.
Import C++11 branch from:
https://github.com/mcostalba/Stockfish/tree/c++11
The version imported is teh last one as of today:
6670e93e50
Branch is fully equivalent with master but syzygy
tablebases that are missing (but will be added with
next commit).
bench: 8080602
SSE4.2 has nothing to do with POPCNT. We must dispell this myth, because
Stockfish is a reference and many will copy this mistake if they see it in Stockfish:
* SSE is an SIMD instruction set, relative to vectorization (using special 128-bit registers).
* POPCNT/LZCNT work on normal registers (eg. AL, AX, EAX, RAX).
The confusion comes from the fact that, in the Intel product line, it just
so happens that SSE4.2 and POPCNT/LZCNT came out at the same time. But this
is not true for AMD. For example, all AMD Pheniom II have SSE3 but no
POPCNT/LZCNT, and that is why the modern compile uses -msse3 -popcnt and not -msse4.2.
No functional change.
Resolves#86
Objects that are only accessible at file-scope should be put in the anonymous namespace.
This is what the C++ standard recommends, rather than using static, which is really C-style and results in static linkage.
Stockfish already does this throughout the code. So let's weed out the few exceptions,
because... they have no reason to be exceptional.
No functional change.
Resolves#84