BadFish

mirror of https://github.com/sockspls/badfish synced 2025-05-02 09:39:36 +00:00

Author	SHA1	Message	Date
Joost VandeVondele	c4d67d77c9	Update copyright years No functional change	2021-01-08 17:04:23 +01:00
Tomasz Sobczyk	3f6451eff7	Manually align arrays on the stack as a workaround to issues with overaligned alignas() on stack variables in gcc < 9.3 on windows. closes https://github.com/official-stockfish/Stockfish/pull/3217 fixes #3216 No functional change	2020-11-04 19:52:42 +01:00
Sami Kiminki	485d517c68	Add large page support for NNUE weights and simplify TT mem management Use TT memory functions to allocate memory for the NNUE weights. This should provide a small speed-up on systems where large pages are not automatically used, including Windows and some Linux distributions. Further, since we now have a wrapper for std::aligned_alloc(), we can simplify the TT memory management a bit: - We no longer need to store separate pointers to the hash table and its underlying memory allocation. - We also get to merge the Linux-specific and default implementations of aligned_ttmem_alloc(). Finally, we'll enable the VirtualAlloc code path with large page support also for Win32. STC: https://tests.stockfishchess.org/tests/view/5f66595823a84a47b9036fba LLR: 2.94 (-2.94,2.94) {-0.25,1.25} Total: 14896 W: 1854 L: 1686 D: 11356 Ptnml(0-2): 65, 1224, 4742, 1312, 105 closes https://github.com/official-stockfish/Stockfish/pull/3081 No functional change.	2020-09-21 08:43:48 +02:00
Stéphane Nicolet	406979ea12	Embed default net, and simplify using non-default nets covers the most important cases from the user perspective: It embeds the default net in the binary, so a download of that binary will result in a working engine with the default net. The engine will be functional in the default mode without any additional user action. It allows non-default nets to be used, which will be looked for in up to three directories (working directory, location of the binary, and optionally a specific default directory). This mechanism is also kept for those developers that use MSVC, the one compiler that doesn't have an easy mechanism for embedding data. It is possible to disable embedding, and instead specify a specific directory, e.g. linux distros might want to use CXXFLAGS="-DNNUE_EMBEDDING_OFF -DDEFAULT_NNUE_DIRECTORY=/usr/share/games/stockfish/" make -j ARCH=x86-64 profile-build passed STC non-regression: https://tests.stockfishchess.org/tests/view/5f4a581c150f0aef5f8ae03a LLR: 2.95 (-2.94,2.94) {-1.25,-0.25} Total: 66928 W: 7202 L: 7147 D: 52579 Ptnml(0-2): 291, 5309, 22211, 5360, 293 closes https://github.com/official-stockfish/Stockfish/pull/3070 fixes https://github.com/official-stockfish/Stockfish/issues/3030 No functional change.	2020-08-29 21:56:00 +02:00
Joost VandeVondele	5f1843c9cb	Small trivial cleanups closes https://github.com/official-stockfish/Stockfish/pull/2801 No functional change	2020-08-23 01:53:41 +02:00
nodchip	84f3e86790	Add NNUE evaluation This patch ports the efficiently updatable neural network (NNUE) evaluation to Stockfish. Both the NNUE and the classical evaluations are available, and can be used to assign a value to a position that is later used in alpha-beta (PVS) search to find the best move. The classical evaluation computes this value as a function of various chess concepts, handcrafted by experts, tested and tuned using fishtest. The NNUE evaluation computes this value with a neural network based on basic inputs. The network is optimized and trained on the evaluations of millions of positions at moderate search depth. The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward. It can be evaluated efficiently on CPUs, and exploits the fact that only parts of the neural network need to be updated after a typical chess move. [The nodchip repository](https://github.com/nodchip/Stockfish) provides additional tools to train and develop the NNUE networks. This patch is the result of contributions of various authors, from various communities, including: nodchip, ynasu87, yaneurao (initial port and NNUE authors), domschl, FireFather, rqs, xXH4CKST3RXx, tttak, zz4032, joergoster, mstembera, nguyenpham, erbsenzaehler, dorzechowski, and vondele. This new evaluation needed various changes to fishtest and the corresponding infrastructure, for which tomtor, ppigazzini, noobpwnftw, daylen, and vondele are gratefully acknowledged. The first networks have been provided by gekkehenker and sergiovieri, with the latter net (nn-97f742aaefcd.nnue) being the current default. The evaluation function can be selected at run time with the `Use NNUE` (true/false) UCI option, provided the `EvalFile` option points to the network file (depending on the GUI, with full path). 
The performance of the NNUE evaluation relative to the classical evaluation depends somewhat on the hardware, and is expected to improve quickly, but is currently on > 80 Elo on fishtest: 60000 @ 10+0.1 th 1 https://tests.stockfishchess.org/tests/view/5f28fe6ea5abc164f05e4c4c ELO: 92.77 +-2.1 (95%) LOS: 100.0% Total: 60000 W: 24193 L: 8543 D: 27264 Ptnml(0-2): 609, 3850, 9708, 10948, 4885 40000 @ 20+0.2 th 8 https://tests.stockfishchess.org/tests/view/5f290229a5abc164f05e4c58 ELO: 89.47 +-2.0 (95%) LOS: 100.0% Total: 40000 W: 12756 L: 2677 D: 24567 Ptnml(0-2): 74, 1583, 8550, 7776, 2017 At the same time, the impact on the classical evaluation remains minimal, causing no significant regression: sprt @ 10+0.1 th 1 https://tests.stockfishchess.org/tests/view/5f2906a2a5abc164f05e4c5b LLR: 2.94 (-2.94,2.94) {-6.00,-4.00} Total: 34936 W: 6502 L: 6825 D: 21609 Ptnml(0-2): 571, 4082, 8434, 3861, 520 sprt @ 60+0.6 th 1 https://tests.stockfishchess.org/tests/view/5f2906cfa5abc164f05e4c5d LLR: 2.93 (-2.94,2.94) {-6.00,-4.00} Total: 10088 W: 1232 L: 1265 D: 7591 Ptnml(0-2): 49, 914, 3170, 843, 68 The needed networks can be found at https://tests.stockfishchess.org/nns It is recommended to use the default one as indicated by the `EvalFile` UCI option. Guidelines for testing new nets can be found at https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#nnue-net-tests Integration has been discussed in various issues: https://github.com/official-stockfish/Stockfish/issues/2823 https://github.com/official-stockfish/Stockfish/issues/2728 The integration branch will be closed after the merge: https://github.com/official-stockfish/Stockfish/pull/2825 https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip closes https://github.com/official-stockfish/Stockfish/pull/2912 This will be an exciting time for computer chess, looking forward to seeing the evolution of this approach. Bench: 4746616	2020-08-06 16:37:45 +02:00
mstembera	1ea488d34c	Use 128 bit multiply for TT index Remove super cluster stuff from TT and just use a 128 bit multiply. STC https://tests.stockfishchess.org/tests/view/5ee719b3aae8aec816ab7548 LLR: 2.94 (-2.94,2.94) {-1.50,0.50} Total: 12736 W: 2502 L: 2333 D: 7901 Ptnml(0-2): 191, 1452, 2944, 1559, 222 LTC https://tests.stockfishchess.org/tests/view/5ee732d1aae8aec816ab7556 LLR: 2.93 (-2.94,2.94) {-1.50,0.50} Total: 27584 W: 3431 L: 3350 D: 20803 Ptnml(0-2): 173, 2500, 8400, 2511, 208 Scheme back to being derived from https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/ Also the default optimized version of the index calculation now uses fewer instructions. https://godbolt.org/z/Tktxbv Might benefit from mulx (requires -mbmi2) closes https://github.com/official-stockfish/Stockfish/pull/2744 bench: 4320954	2020-06-17 07:32:16 +02:00
Sami Kiminki	beb327f910	Fix a Windows-only crash on exit without 'quit' There was a bug in commit `d4763424d2` (Add support for Windows large pages) that could result in trying to free memory allocated with VirtualAlloc incorrectly with free(). Fix this by reverting the TT.resize(0) logic in the previous commit, and instead, just call aligned_ttmem_free() in TranspositionTable::~TranspositionTable(). fixes https://github.com/official-stockfish/Stockfish/issues/2677 closes https://github.com/official-stockfish/Stockfish/pull/2679 No functional change	2020-05-14 20:35:40 +02:00
Sami Kiminki	d4763424d2	Add support for Windows large pages For users that set the needed privilege "Lock Pages in Memory", large pages will be automatically enabled (see Readme.md). This expert setting might improve speed by 5% - 30%, depending on the hardware, the number of threads and hash size. More for large hashes, large number of threads and NUMA. If the operating system cannot allocate large pages (easier after a reboot), default allocation is used automatically. The engine log provides details. closes https://github.com/official-stockfish/Stockfish/pull/2656 fixes https://github.com/official-stockfish/Stockfish/issues/2619 No functional change	2020-05-13 20:57:47 +02:00
Gary Heckman	37e3863927	Fix ambiguity between clamp implementations There is an ambiguity between global and std clamp implementations when compiling in c++17, and on certain toolchains that are not strictly conforming to c++11. This is solved by putting our clamp implementation in a namespace. closes https://github.com/official-stockfish/Stockfish/pull/2572 No functional change.	2020-03-07 11:14:27 +01:00
Joost VandeVondele	0c878adb36	Small cleanups. closes https://github.com/official-stockfish/Stockfish/pull/2532 Bench: 4869669	2020-02-05 15:32:29 +01:00
Sami Kiminki	39437f4e55	Advise the kernel to use huge pages (Linux) Align the TT allocation by 2M to make it huge page friendly and advise the kernel to use huge pages. Benchmarks on my i7-8700K (6C/12T) box: (3 runs per bench per config) vanilla (nps) hugepages (nps) avg ================================================================================== bench \| 3012490 3024364 3036331 3071052 3067544 3071052 +1.5% bench 16 12 20 \| 19237932 19050166 19085315 19266346 19207025 19548758 +1.1% bench 16384 12 20 \| 18182313 18371581 18336838 19381275 19738012 19620225 +7.0% On my box, huge pages have a significant perf impact when using a big hash size. They also speed up TT initialization big time: vanilla (s) huge pages (s) speed-up ======================================================================= time stockfish bench 16384 1 1 \| 5.37 1.48 3.6x In practice, huge pages with auto-defrag may always be enabled in the system, in which case this patch has no effect. This depends on the values in /sys/kernel/mm/transparent_hugepage/enabled and /sys/kernel/mm/transparent_hugepage/defrag. closes https://github.com/official-stockfish/Stockfish/pull/2463 No functional change	2020-01-27 11:16:10 +01:00
Stéphane Nicolet	9f800a2577	Show compiler info at startup This patch shows a description of the compiler used to compile Stockfish, when starting from the console. Usage: ``` ./stockfish compiler ``` Example of output: ``` Stockfish 120120 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott Compiled by clang++ 9.0.0 on Apple __VERSION__ macro expands to: 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38) ``` No functional change	2020-01-12 11:54:15 +01:00
Alain SAVARD	09bef14c76	Update lists of authors and contributors Preparing for version 11 of Stockfish: update lists of authors, contributors giving CPU time to the fishtest framework, etc. No functional change	2020-01-09 01:43:47 +01:00
Stéphane Nicolet	8726beba59	Restore development version (revert previous commit) Revert the previous patch now that the binary for the super-final of TCEC season 16 has been sent. Maybe the feature of showing the name of compiler will be added to the master branch in the future. But we may use a cleaner way to code it, see some ideas using the Makefile approach at the end of pull request #2327 : https://github.com/official-stockfish/Stockfish/pull/2327 Bench: 3618154	2019-09-26 23:27:48 +02:00
Stéphane Nicolet	0436f01d05	Temporary patch to show the compiler for TCEC submission This patch shows a description of the compiler used to compile Stockfish, when starting from the console. Usage: ``` ./stockfish compiler ``` Example of output: ``` Stockfish 240919 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott Compiled by clang++ 9.0.0 on Apple __VERSION__ macro expands to: 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.38) ``` No functional change	2019-09-25 22:28:51 +02:00
Marco Costalba	4e72e2a964	Assorted trivial cleanups 4/2019 No functional change.	2019-05-02 19:30:26 +02:00
Marco Costalba	82ad9ce9cf	Assorted trivial cleanups 3/2019 (#2030 ) No functional change.	2019-03-31 11:47:36 +02:00
Stéphane Nicolet	cf5d683408	Stockfish 10-beta Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019. No functional change	2018-11-19 11:18:21 +01:00
Stéphane Nicolet	a03e98dcd3	Switch time management to 64 bits This is a patch to fix issue #1498, switching the time management variables to 64 bits to avoid overflow of time variables after 25 days. There was a bug in Stockfish 9 causing the output to be wrong after 2^31 milliseconds search. Here is a long run from the starting position: info depth 64 seldepth 87 multipv 1 score cp 23 nodes 13928920239402 nps 0 tbhits 0 time -504995523 pv g1f3 d7d5 d2d4 g8f6 c2c4 d5c4 e2e3 e7e6 f1c4 c7c5 e1g1 b8c6 d4c5 d8d1 f1d1 f8c5 c4e2 e8g8 a2a3 c5e7 b2b4 f8d8 b1d2 b7b6 c1b2 c8b7 a1c1 a8c8 c1c2 c6e5 d1c1 c8c2 c1c2 e5f3 d2f3 a7a5 b4b5 e7c5 f3d4 d8c8 d4b3 c5d6 c2c8 b7c8 b3d2 c8b7 d2c4 d6c5 e2f3 b7d5 f3d5 e6d5 c4e5 a5a4 e5d3 f6e4 d3c5 e4c5 b2d4 c5e4 d4b6 e4d6 g2g4 d6b5 b6c5 b5c7 g1g2 c7e6 c5d6 g7g6 We check at compile time that the TimePoint type is exactly 64 bits long for the compiler (TimePoint is our alias in Stockfish for std::chrono::milliseconds -- it is a signed integer type of at least 45 bits according to the C++ standard, but will most probably be implemented as a 64 bits signed integer on modern compilers), and we use this TimePoint type consistently across the code. Bug report by user "fischerandom" on the TCEC chat (thanks), and the patch includes code and suggestions by user "WOnder93" and Ronald de Man. Fixes issue: https://github.com/official-stockfish/Stockfish/issues/1498 Closes pull request: https://github.com/official-stockfish/Stockfish/pull/1510 No functional change.	2018-03-27 16:25:41 +02:00
Joost VandeVondele	9afa1d7330	New Year 2018 Adjust copyright headers. No functional change.	2018-01-01 13:18:10 +01:00
mstembera	d01b66ae8f	Fix pawn entry prefetch No functional change Closes #1026	2017-03-14 20:56:26 -07:00
Joost VandeVondele	d8f683760c	Adjust copyright headers to 2017 (#965 ) No functional change.	2017-01-11 08:46:29 +01:00
Marco Costalba	0d9a9f5e98	Handle Windows Processors Groups Under Windows it is not possible for a process to run on more than one logical processor group. This usually means being limited to at most 64 cores. To overcome this, a platform-specific API must be called to set the group affinity for each thread. Original code from Texel by Peter Österlund. Tested by Jean-Paul Vael on a Xeon E7-8890 v4 with 88 threads; the confirmed speed-up between 44 and 88 threads is about 30%, as expected. No functional change.	2016-11-22 07:56:04 +01:00
lucasart	126036abb0	Do not hardcode Debug Log File Allow specifying the log file name; this comes in handy for self-matches, so that each SF instance writes into a different log file. No functional change.	2016-06-15 08:47:08 +02:00
ppigazzini	d4af15f682	Update AUTHORS and copyright notice No functional change Resolves #555	2016-01-02 09:43:51 +00:00
Marco Costalba	9742fb10fd	Update Copyright year No functional change. Resolves #554	2016-01-01 10:17:36 +00:00
Marco Costalba	0b36ba74fc	Don't assume the type of Time::point But instead use the proper definition. Also rewrite chrono functions while there. No functional change.	2015-02-24 14:08:14 +01:00
Marco Costalba	99c9cae586	Avoid casting to char* in prefetch() Funny enough, gcc __builtin_prefetch() already expects a void*, whereas Windows's _mm_prefetch() requires a char*. The patch allows removing ugly casts from call sites. No functional change.	2015-02-07 19:13:41 +01:00
Marco Costalba	8b0fee9998	Rename dbg_hit_on_c() to dbg_hit_on() Use an overload instead of a new named function. I have found this handier and easier when adding some quick debug code. No functional change.	2015-02-07 11:15:38 +01:00
Marco Costalba	96e36a7897	Explicitly defaulted and deleted members Better than a bit obscure implicit ones. No functional change.	2015-01-21 13:18:19 +01:00
Marco Costalba	3c07603dac	Import C++11 branch Import C++11 branch from: https://github.com/mcostalba/Stockfish/tree/c++11 The version imported is the last one as of today: `6670e93e50` The branch is fully equivalent to master except for the syzygy tablebases, which are missing (but will be added with the next commit). bench: 8080602	2015-01-18 08:00:50 +01:00
Marco Costalba	42b48b08e8	Update copyright year No functional change.	2015-01-10 11:46:28 +01:00
Marco Costalba	5943600a89	Assorted nitpicking code-style No functional change.	2014-12-10 12:38:13 +01:00
Ernesto Gatti	158864270a	Simpler PRNG and faster magics search This patch replaces RKISS by a simpler and faster PRNG, xorshift64* proposed by S. Vigna (2014). It is extremely simple, has a large enough period for Stockfish's needs (2^64), requires no warming-up (allowing such code to be removed), and offers slightly better randomness than MT19937. Paper: http://xorshift.di.unimi.it/ Reference source code (public domain): http://xorshift.di.unimi.it/xorshift64star.c The patch also simplifies how init_magics() searches for magics: - Old logic: seed the PRNG always with the same seed, then use optimized bit rotations to tailor the RNG sequence per rank. - New logic: seed the PRNG with an optimized seed per rank. This has two advantages: 1. Less code and less computation to perform during magics search (not ROTL). 2. More choices for random sequence tuning. The old logic only let us choose from 4096 bit rotation pairs. With the new one, we can look for the best seeds among 2^64 values. Indeed, the set of seeds[][] provided in the patch reduces the effort needed to find the magics: 64-bit SF: Old logic -> 5,783,789 rand64() calls needed to find the magics New logic -> 4,420,086 calls 32-bit SF: Old logic -> 2,175,518 calls New logic -> 1,895,955 calls In the 64-bit case, init_magics() take 25 ms less to complete (Intel Core i5). Finally, when playing with strength handicap, non-determinism is achieved by setting the seed of the static RNG only once. Afterwards, there is no need to skip output values. The bench only changes because the Zobrist keys are now different (since they are random numbers straight out of the PRNG). The RNG seed has been carefully chosen so that the resulting Zobrist keys are particularly well-behaved: 1. All triplets of XORed keys are unique, implying that it would take at least 7 keys to find a 64-bit collision (test suggested by ceebo) 2. All pairs of XORed keys are unique modulo 2^32 3. 
The cardinality of { (key1 ^ key2) >> 48 } is as close as possible to the maximum (65536) Point 2 aims at ensuring a good distribution among the bits that determine an TT entry's cluster, likewise point 3 among the bits that form the TT entry's key16 inside a cluster. Details: Bitset card(key1^key2) ------ --------------- RKISS key16 64894 = 99.020% of theoretical maximum low18 180117 = 99.293% low32 305362 = 99.997% Xorshift64, old seed key16 64918 = 99.057% low18 179994 = 99.225% low32 305350 = 99.993% Xorshift64, new seed key16 65027 = 99.223% low18 181118 = 99.845% low32 305371 = 100.000% Bench: 9324905 Resolves #148	2014-12-08 08:18:26 +08:00
hxim	fbb53524ef	Rename some variables for more clarity. No functional change. Resolves #131	2014-12-08 07:53:33 +08:00
mstembera	bc83515c9e	Removing some superfluous extern declarations No functional change. Resolves #93	2014-11-05 21:17:19 +00:00
Marco Costalba	ff480bdb83	Retire struct Log No longer used now that we have removed the "Write Search Log" UCI option. No functional change.	2014-09-16 21:13:50 +01:00
Marco Costalba	e4695f15bc	Additional renaming from DON Assorted renaming and triviality. No functional change.	2014-02-14 09:42:50 +01:00
Marco Costalba	41641e3b1e	Assorted tweaks from DON Mainly renames and some small code style improvements, inspired by looking at DON sources: https://github.com/erashid/DON No functional change.	2014-02-09 17:31:45 +01:00
Marco Costalba	c9dcda6ac4	Update copyright year No functional change.	2014-01-02 01:49:18 +01:00
Marco Costalba	f31847302d	Streamline time computation No functional change.	2013-08-03 18:30:43 +02:00
Joona Kiiski	a16ba5bbd1	Retire cpu_count() Always set the number of threads to 1 at startup and let the user explicitly choose the number of threads. Also preserve the useful behavior of automatically setting "Min Split Depth" according to the requested threads: this parameter is too technical for a casual user, so, when left at zero, we set it to a sensible value. No functional change	2013-08-02 16:48:25 +02:00
homoSapiensSapiens	002062ae93	Use #ifndef instead of #if !defined And #ifdef instead of #if defined This is more standard form (see for example iostream file). No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2013-07-24 19:49:17 +02:00
Marco Costalba	c5ec94d0f1	Update copyright year No functional change.	2013-02-19 07:54:14 +01:00
Marco Costalba	b50ce5ebfb	Get rid of struct Time We just need the milliseconds of current system time for our needs. This allows to simplify the API. No functional change.	2012-09-04 09:38:51 +02:00
Marco Costalba	5900ab76a0	Rename current_time() to now() Follow C++11 naming conventions. No functional change.	2012-09-02 17:04:22 +02:00
Marco Costalba	831f91b859	Retire Time::restart() Simplify API. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2012-08-31 19:47:07 +02:00
Marco Costalba	1258c7aabe	Don't need to memset HashTable Default c'tor Entry() already initializes to zero all its POD members. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2012-08-31 19:47:00 +02:00
Marco Costalba	92e759a676	Introduce serialization of accesses to std::cout When many threads print concurrently, access to std::cout must be serialized so that output lines from different threads are not intermixed. This is not strictly needed at the moment because only the main thread prints, although some ad-hoc test could trigger UCI::loop() printing while searching. Anyhow, we want to lift this easily avoidable constraint, also as a prerequisite for future work. This patch just introduces the support; the next one will enable the serialization. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2012-08-29 19:11:31 +02:00
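
Commit 1ea488d34c in the log above replaces the transposition-table index calculation with a 128-bit multiply, following the multiply-shift range reduction from the linked Lemire post. A minimal sketch of that idea follows; it is not the code from the patch. The names `mul_hi64` and `cluster_index` are chosen here for illustration, and the portable 32-bit decomposition is an assumption made so the sketch does not depend on a compiler-specific 128-bit type.

```cpp
#include <cassert>
#include <cstdint>

// High 64 bits of the 128-bit product a * b, computed portably from
// 32-bit halves (hypothetical helper, named here for illustration).
inline std::uint64_t mul_hi64(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t aL = static_cast<std::uint32_t>(a), aH = a >> 32;
    const std::uint64_t bL = static_cast<std::uint32_t>(b), bH = b >> 32;
    const std::uint64_t c1 = (aL * bL) >> 32;                    // carry out of the low product
    const std::uint64_t c2 = aH * bL + c1;                       // cannot overflow 64 bits
    const std::uint64_t c3 = aL * bH + static_cast<std::uint32_t>(c2);
    return aH * bH + (c2 >> 32) + (c3 >> 32);
}

// Map a 64-bit hash key onto [0, clusterCount) without a modulo:
// floor(key * clusterCount / 2^64).
inline std::uint64_t cluster_index(std::uint64_t key, std::uint64_t clusterCount) {
    assert(clusterCount > 0);
    return mul_hi64(key, clusterCount);
}
```

For any key, `cluster_index(key, n)` is below `n`. Unlike `key % n`, the result is driven by the high bits of the key rather than the low ones, and it avoids a division for table sizes that are not a power of two.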

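Commit a03e98dcd3 in the log above attributes the negative `time` value to millisecond counters overflowing 32 bits. The arithmetic can be checked with a short sketch. The helper names (`wrap_to_int32`, `ms_to_days`, `elapsed_ms`) are hypothetical, defining `TimePoint` through `std::chrono::milliseconds::rep` is an assumption of this example, and the elapsed time used in the note below is only one value consistent with the logged `time -504995523`, not a figure from the bug report.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// 64-bit millisecond count, with the compile-time size check the commit mentions.
// Using ::rep here is an assumption of this sketch.
using TimePoint = std::chrono::milliseconds::rep;
static_assert(sizeof(TimePoint) == sizeof(std::int64_t), "TimePoint should be 64 bits");

// Value a signed 32-bit variable would hold after storing `ms`
// (two's-complement wrap, written without implementation-defined casts).
inline std::int64_t wrap_to_int32(std::int64_t ms) {
    const std::uint32_t low = static_cast<std::uint32_t>(ms);
    return low >= 0x80000000u ? static_cast<std::int64_t>(low) - 4294967296LL
                              : static_cast<std::int64_t>(low);
}

// Convert milliseconds to days.
constexpr double ms_to_days(std::int64_t ms) {
    return static_cast<double>(ms) / (1000.0 * 60 * 60 * 24);
}

// Elapsed time kept in 64 bits from start to finish.
inline TimePoint elapsed_ms(TimePoint start, TimePoint now) {
    assert(now >= start);
    return now - start;
}
```

A signed 32-bit counter holds at most 2^31 - 1 ms, about 24.9 days, which matches the "after 25 days" in the commit message. As one consistent example, `wrap_to_int32(3789971773)` (roughly 43.9 days of search) yields -504995523.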

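Commit 158864270a in the log above switches the PRNG to xorshift64*. A minimal sketch of that generator is shown below, using the shift triple (12, 25, 27) and the multiplier from Vigna's public-domain reference code. The class name `Xorshift64Star` and its interface are assumptions made for this example, and the per-rank seeds tuned in the patch are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper around the xorshift64* step function.
class Xorshift64Star {
    std::uint64_t s;  // generator state; must never be zero

public:
    explicit Xorshift64Star(std::uint64_t seed) : s(seed) { assert(seed != 0); }

    // Advance the state with three xor-shifts, then scramble the
    // output with a multiplication modulo 2^64.
    std::uint64_t next() {
        s ^= s >> 12;
        s ^= s << 25;
        s ^= s >> 27;
        return s * 2685821657736338717ULL;
    }
};
```

The state update is an invertible linear map, so a nonzero seed never reaches the all-zero state, and two generators built from the same seed produce identical sequences.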
110 commits