1
0
Fork 0
mirror of https://github.com/sockspls/badfish synced 2025-05-01 17:19:36 +00:00
Commit graph

309 commits

Author SHA1 Message Date
Disservin
b187622233 use expanded variables for shell commands
Performance improvement for the shell commands in the Makefile.
By using expanded variables, the shell commands are only
evaluated once, instead of every time they are used.

closes https://github.com/official-stockfish/Stockfish/pull/4838

No functional change
2023-10-23 20:39:48 +02:00
Disservin
2d0237db3f add clang-format
This introduces clang-format to enforce a consistent code style for Stockfish.

Having a documented and consistent style across the code will make contributing easier
for new developers, and will make larger changes to the codebase easier to make.

To facilitate formatting, this PR includes a Makefile target (`make format`) to format the code,
this requires clang-format (version 17 currently) to be installed locally.

Installing clang-format is straightforward on most OS and distros
(e.g. with https://apt.llvm.org/, brew install clang-format, etc), as this is part of quite commonly
used suite of tools and compilers (llvm / clang).

Additionally, a CI action is present that will verify if the code requires formatting,
and comment on the PR as needed. Initially, correct formatting is not required, it will be
done by maintainers as part of the merge or in later commits, but obviously this is encouraged.

fixes https://github.com/official-stockfish/Stockfish/issues/3608
closes https://github.com/official-stockfish/Stockfish/pull/4790

Co-Authored-By: Joost VandeVondele <Joost.VandeVondele@gmail.com>
2023-10-22 16:06:27 +02:00
mstembera
8a912951de Remove handcrafted MMX code
too small a benefit to maintain this old target

closes https://github.com/official-stockfish/Stockfish/pull/4804

No functional change
2023-10-08 07:37:01 +02:00
Jasper Shovelton
ce99b4b2ef Increment minor section number from 3.7.1 to 3.8.1.
Pext has nothing to do with git commit sha/date, so separate the two
sub-sections.

closes https://github.com/official-stockfish/Stockfish/pull/4785

No functional change
2023-09-29 22:07:10 +02:00
Joost VandeVondele
22cdb6c1ea Explicitly invoke shell
in some cases the permission on the script might be incorrect (zip downloads?).
Explicitly invoke the shell

closes https://github.com/official-stockfish/Stockfish/pull/4803

No functional change
2023-09-24 20:04:42 +02:00
Sebastian Buchwald
952740b36c Let CI check C++ includes
The commit adds a CI workflow that uses the included-what-you-use (IWYU)
tool to check for missing or superfluous includes in .cpp files and
their corresponding .h files. This means that some .h files (especially
in the nnue folder) are not checked yet.

The CI setup looks like this:
- We build IWYU from source to include some yet unreleased fixes.
  This IWYU version targets LLVM 17. Thus, we get the latest release
  candidate of LLVM 17 from LLVM's nightly packages.
- The Makefile now has an analyze target that just build the object
  files (without linking)
- The CI uses the analyze target with the IWYU tool as compiler to
  analyze the compiled .cpp file and its corresponding .h file.
- If IWYU suggests a change the build fails (-Xiwyu --error).
- To avoid false positives we use LLVM's libc++ as standard library
- We have a custom mappings file that adds some mappings that are
  missing in IWYU's default mappings

We also had to add one IWYU pragma to prevent a false positive in
movegen.h.

https://github.com/official-stockfish/Stockfish/pull/4783

No functional change
2023-09-22 19:12:53 +02:00
Stéphane Nicolet
e594aa7429 Export makefile ARCH in binary
Example of the `./stockfish compiler` command:

Compiled by                : g++ (GNUC) 10.3.0 on Apple
Compilation architecture   : x86-64-bmi2
Compilation settings       : 64bit BMI2 AVX2 SSE41 SSSE3 SSE2 POPCNT
Compiler __VERSION__ macro : 10.3.0

closes https://github.com/official-stockfish/Stockfish/pull/4789

no functional change
2023-09-22 19:09:20 +02:00
Joost VandeVondele
708319a433 Enable a default native ARCH
uses a posix compatible script to find the native arch.
(based on ppigazzini's https://github.com/ppigazzini/stockfish-downloader )
use that native arch by default, no changes if ARCH is specified explicitly.

SF can now be compiled in an optimal way simply using

make -j profile-build

closes https://github.com/official-stockfish/Stockfish/pull/4777

No functional change
2023-09-22 19:06:37 +02:00
Sebastian Buchwald
b9319c4fa4 Cleanup code after dropping ICC support in favor of ICX
The commit removes all uses of ICC's __INTEL_COMPILER macro and other
references to ICC. It also adds ICX info to the compiler command and
fixes two typos in Makefile's help output.

closes https://github.com/official-stockfish/Stockfish/pull/4769

No functional change
2023-09-11 22:46:01 +02:00
Stéphane Nicolet
4c5919fa95 Fix some tabs in Makefile
Avoid mixing spaces and tabs for indentation in Makefile

closes https://github.com/official-stockfish/Stockfish/pull/4759

No functional change
2023-08-22 10:59:39 +02:00
Disservin
46756996e7 Add -funroll-loops to CXXFLAGS
Optimize profiling data accuracy by enabling -funroll-loops during the profile
generation phase, in addition to its default activation by -fprofile-use.
This seems to produce a slightly faster binary, for most compilers.

make -j profile-build ARCH=x86-64-avx2

sf_base =  1392875 +/-   5905 (95%)
sf_test =  1402332 +/-   7303 (95%)
diff    =     9457 +/-   4413 (95%)
speedup = 0.67896% +/- 0.317% (95%)

STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 34784 W: 8970 L: 8665 D: 17149
Ptnml(0-2): 115, 3730, 9405, 4019, 123
https://tests.stockfishchess.org/tests/view/64d944815b17f7c21c0e92e1

closes https://github.com/official-stockfish/Stockfish/pull/4750

No functional change
2023-08-16 21:25:42 +02:00
Tomasz Sobczyk
0d2ddb81ef Fix Makefile for incorrect nnue file
If an incorrect network file is present at the start of the compilation stage, the
Makefile script now correctly removes it before trying to download a clean version.

closes https://github.com/official-stockfish/Stockfish/pull/4726

No functional change
2023-08-11 19:20:29 +02:00
Linmiao Xu
0ad9b51dea Remove classical psqt
Based on vondele's deletepsqt branch:
https://github.com/vondele/Stockfish/commit/369f5b051

This huge simplification uses a weighted material differences instead of
the positional piece square tables (psqt) in the semi-classical complexity
calculation. Tuned weights using spsa at 45+0.45 with:

int pawnMult = 100;
int knightMult = 325;
int bishopMult = 350;
int rookMult = 500;
int queenMult = 900;
TUNE(SetRange(0, 200), pawnMult);
TUNE(SetRange(0, 650), knightMult);
TUNE(SetRange(0, 700), bishopMult);
TUNE(SetRange(200, 800), rookMult);
TUNE(SetRange(600, 1200), queenMult);

The values obtained via this tuning session were for a model where
the psqt replacement formula was always from the point of view of White,
even if the side to move was Black. We re-used the same values for an
implementation with a psqt replacement from the point of view of the side
to move, testing the result both on our standard book on positions with
a strong White bias, and an alternate book with positions with a strong
Black bias.

We note that with the patch the last use of the venerable "Score" type
disappears in Stockfish codebase (the Score type was used in classical
evaluation to get a tampered eval interpolating values smoothly from the
early midgame stage to the endgame stage). We leave it to another commit
to clean all occurrences of Score in the code and the comments.

-------

Passed non-regression LTC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 142542 W: 36264 L: 36168 D: 70110
Ptnml(0-2): 76, 15578, 39856, 15696, 65
https://tests.stockfishchess.org/tests/view/64c8cb495b17f7c21c0cf9f8

Passed non-regression LTC (with a book with Black bias):
https://tests.stockfishchess.org/tests/view/64c8f9295b17f7c21c0cfdaf
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 494814 W: 125565 L: 125827 D: 243422
Ptnml(0-2): 244, 53926, 139346, 53630, 261

------

closes https://github.com/official-stockfish/Stockfish/pull/4713

Bench: 1655985
2023-08-06 22:16:52 +02:00
Joost VandeVondele
d70a905ce3 Deprecate the x86-64-modern arch
Explicitly describe the architecture as deprecated,
it remains available as its current alias x86-64-sse41-popcnt

CPUs that support just this instruction set are now years old,
any few years old Intel or AMD CPU supports x86-64-avx2. However,
naming things 'modern' doesn't age well, so instead use explicit names.

Adjust CI accordingly. Wiki, fishtest, downloader done as well.

closes https://github.com/official-stockfish/Stockfish/pull/4691

No functional change.
2023-07-16 17:47:25 +02:00
Joost VandeVondele
a3a91f3f9f Build and test more binaries in CI
use a fixed compiler on Linux and Windows (right now gcc 11).
build avxvnni on Windows (Linux needs updated core utils)
build x86-32 on Linux (Windows needs other mingw)
fix a Makefile issue where a failed PGOBENCH would not stop the build
reuse the WINE_PATH for SDE as we do for QEMU
use WINE_PATH variable also for the signature
verify the bench for each of the binaries
do not build x86-64-avx2 on macos

closes https://github.com/official-stockfish/Stockfish/pull/4682

No functional change
2023-07-15 09:15:16 +02:00
Joost VandeVondele
af110e02ec Remove classical evaluation
since the introduction of NNUE (first released with Stockfish 12), we
have maintained the classical evaluation as part of SF in frozen form.
The idea that this code could lead to further inputs to the NN or
search did not materialize. Now, after five releases, this PR removes
the classical evaluation from SF. Even though this evaluation is
probably the best of its class, it has become unimportant for the
engine's strength, and there is little need to maintain this
code (roughly 25% of SF) going forward, or to expend resources on
trying to improve its integration in the NNUE eval.

Indeed, it had still a very limited use in the current SF, namely
for the evaluation of positions that are nearly decided based on
material difference, where the speed of the classical evaluation
outweights its inaccuracies. This impact on strength is small,
roughly 2Elo, and probably decreasing in importance as the TC grows.

Potentially, removal of this code could lead to the development of
techniques to have faster, but less accurate NN evaluation,
for certain positions.

STC
https://tests.stockfishchess.org/tests/view/64a320173ee09aa549c52157
Elo: -2.35 ± 1.1 (95%) LOS: 0.0%
Total: 100000 W: 24916 L: 25592 D: 49492
Ptnml(0-2): 287, 12123, 25841, 11477, 272
nElo: -4.62 ± 2.2 (95%) PairsRatio: 0.95

LTC
https://tests.stockfishchess.org/tests/view/64a320293ee09aa549c5215b
 Elo: -1.74 ± 1.0 (95%) LOS: 0.0%
Total: 100000 W: 25010 L: 25512 D: 49478
Ptnml(0-2): 44, 11069, 28270, 10579, 38
nElo: -3.72 ± 2.2 (95%) PairsRatio: 0.96

VLTC SMP
https://tests.stockfishchess.org/tests/view/64a3207c3ee09aa549c52168
 Elo: -1.70 ± 0.9 (95%) LOS: 0.0%
Total: 100000 W: 25673 L: 26162 D: 48165
Ptnml(0-2): 8, 9455, 31569, 8954, 14
nElo: -3.95 ± 2.2 (95%) PairsRatio: 0.95

closes https://github.com/official-stockfish/Stockfish/pull/4674

Bench: 1444646
2023-07-11 22:56:49 +02:00
Torom
f9d9c69bc3 Set the length of GIT_SHA to 8 characters
Previously, the length of git commit hashes could vary depending on the git environment.

closes https://github.com/official-stockfish/Stockfish/pull/4527

No functional change
2023-04-22 10:38:25 +02:00
Maxim Masiutin
bc50378ff1 Replace deprecated icc with icx
Replace the deprecated Intel compiler icc with its newer icx variant.
This newer compiler is based on clang, and yields good performance.
As before, currently only linux is supported.

closes https://github.com/official-stockfish/Stockfish/pull/4478

No functional change
2023-04-01 16:16:48 +02:00
Joost VandeVondele
66bf45b99e Stringify the git info passed
avoid escaping the string in the Makefile.

Alternative to https://github.com/official-stockfish/Stockfish/pull/4476

closes https://github.com/official-stockfish/Stockfish/pull/4481

No functional change.
2023-04-01 15:58:05 +02:00
Sebastian Buchwald
d1e17989b5 Fix Makefile for clang 16
The clang 16 release will remove the -fexperimental-new-pass-manager
flag (see 69b2b7282e).
Thus, the commit adapts the Makefile to use this flag only for older
clang versions.

closes https://github.com/official-stockfish/Stockfish/pull/4437

No functional change
2023-03-14 08:25:14 +01:00
Maxim Masiutin
70dfa141d5 Clarify the description of the x86-64-vnni256 and x86-64-avxvnni architectures
Now it is clearly explained that "x86-64-vnni256" requires full
support of AVX512-VNNI, but only 256-bit operands are used.

closes https://github.com/official-stockfish/Stockfish/pull/4427

No functional change
2023-03-08 07:14:07 +01:00
Sebastian Buchwald
b4ad3a3c4b Add support for ARM dot product instructions
The sdot instruction computes (and accumulates) a signed dot product,
which is quite handy for Stockfish's NNUE code. The instruction is
optional for Armv8.2 and Armv8.3, and mandatory for Armv8.4 and above.

The commit adds a new 'arm-dotprod' architecture with enabled dot
product support. It also enables dot product support for the existing
'apple-silicon' architecture, which is at least Armv8.5.

The following local speed test was performed on an Apple M1 with
ARCH=apple-silicon. I had to remove CPU pinning from the benchmark
script. However, the results were still consistent: Checking both
binaries against themselves reported a speedup of +0.0000 and +0.0005,
respectively.

```
Result of 100 runs
==================
base (...ish.037ef3e1) =    1917997  +/- 7152
test (...fish.dotprod) =    2159682  +/- 9066
diff                   =    +241684  +/- 2923

speedup        = +0.1260
P(speedup > 0) =  1.0000

CPU: 10 x arm
Hyperthreading: off
```

Fixes #4193

closes https://github.com/official-stockfish/Stockfish/pull/4400

No functional change
2023-02-23 13:22:03 +01:00
MinetaS
7fc0f589d6 Add -Wconditional-uninitialized when using Clang
Add -Wconditional-uninitialized as it is not controlled by -Wall.

closes https://github.com/official-stockfish/Stockfish/pull/4371

No functional change
2023-02-02 17:49:23 +01:00
Sebastian Buchwald
31acd6bab7 Warn if a global function has no previous declaration
If a global function has no previous declaration, either the declaration
is missing in the corresponding header file or the function should be
declared static. Static functions are local to the translation unit,
which allows the compiler to apply some optimizations earlier (when
compiling the translation unit rather than during link-time
optimization).

The commit enables the warning for gcc, clang, and mingw. It also fixes
the reported warnings by declaring the functions static or by adding a
header file (benchmark.h).

closes https://github.com/official-stockfish/Stockfish/pull/4325

No functional change
2023-01-09 20:18:39 +01:00
Sebastian Buchwald
b60f9cc451 Update copyright years
Happy New Year!

closes https://github.com/official-stockfish/Stockfish/pull/4315

No functional change
2023-01-02 19:07:38 +01:00
MinetaS
20b0226462 Fix a dependency bug
Instead of allowing .depend for specific build-related targets, filter
non-build-related targets (i.e. help, clean) so that other targets can
normally execute .depend target.

closes https://github.com/official-stockfish/Stockfish/pull/4293

No functional change
2022-12-20 08:14:19 +01:00
Joost VandeVondele
61ea1534ff No error if net available but wget/curl missing
do not error out on missing wget/curl if these tools are not needed later on,
i.e. if the net is available already.

closes https://github.com/official-stockfish/Stockfish/pull/4291
closes https://github.com/official-stockfish/Stockfish/pull/4253

No functional change
2022-12-19 18:17:50 +01:00
NguyenPham
3659a9fda0 Fixed the help of Makefile
make profile-build more prominent, adjust comments

closes https://github.com/official-stockfish/Stockfish/pull/4284

No functional change
2022-12-19 18:08:12 +01:00
MinetaS
74fb936dbd Invoke .depend only on build targets
Add a constraint so that the dependency build only occurs when users
actually run build tasks.

This fixes a bug on some systems where gcc/g++ is not available.

closes https://github.com/official-stockfish/Stockfish/pull/4255

No functional change
2022-12-08 20:48:20 +01:00
disservin
e048d11825 Change versioning and save binaries as CI artifacts
For development versions of Stockfish, the version will now look like
dev-20221107-dca9a0533
indicating a development version, the date of the last commit,
and the git SHA of that commit. If git is not available,
the fallback is the date of compilation. Releases will continue to be
versioned as before.

Additionally, this PR extends the CI to create binary artifacts,
i.e. pushes to master will automatically build Stockfish and upload
the binaries to github.

closes https://github.com/official-stockfish/Stockfish/pull/4220

No functional change
2022-11-07 07:56:58 +01:00
Clement
5604b255e6 Add RISC-V 64-bit support
adds a riscv64 target architecture to the Makefile to support RISC-V 64-bit.
Compiled and tested on VisionFive 2 board.

closes https://github.com/official-stockfish/Stockfish/pull/4205

No functional change.
2022-10-23 20:18:08 +02:00
disservin
804394b939 enable bit manipulation instruction set 1
bmi1 enables the use of _blsr_u64 for pop_lsb, and is availabe when avx2 is.

verified a small speedup (0.2 - 0.6%)

closes https://github.com/official-stockfish/Stockfish/pull/4202

No functional change
2022-10-23 20:08:18 +02:00
MinetaS
234d2156fd Apply -flto-partition=one / -flto=full
This patch fixes a potential bug derived from an incompatibility between LTO and top-level assembly code (INCBIN).

Passed non-regression STC (master e90341f):
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 119352 W: 31986 L: 31862 D: 55504
Ptnml(0-2): 439, 12624, 33400, 12800, 413
https://tests.stockfishchess.org/tests/view/634aacf84bc7650f0755188b

closes https://github.com/official-stockfish/Stockfish/pull/4201

No functional change
2022-10-23 19:58:47 +02:00
mstembera
93f71ecfe1 Optimize make_index() using templates and lookup tables.
https://tests.stockfishchess.org/tests/view/634517e54bc7650f07542f99
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 642672 W: 171819 L: 170658 D: 300195
Ptnml(0-2): 2278, 68077, 179416, 69336, 2229

this also introduces `-flto-partition=one` as suggested by MinetaS (Syine Mineta)
to avoid linking errors due to LTO on 32 bit mingw. This change was tested in isolation as well

https://tests.stockfishchess.org/tests/view/634aacf84bc7650f0755188b
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 119352 W: 31986 L: 31862 D: 55504
Ptnml(0-2): 439, 12624, 33400, 12800, 413

closes https://github.com/official-stockfish/Stockfish/pull/4199

No functional change
2022-10-16 11:42:19 +02:00
Joost VandeVondele
a4d18d23a9 Provide network download fallback
in case the base infrastructure for providing the networks

https://tests.stockfishchess.org/nns

is down, use an alternate github repo for downloading networks during the build.

fixes #4149
fixes #4140

closes https://github.com/official-stockfish/Stockfish/pull/4151

No functional change
2022-09-07 07:32:53 +02:00
proukornew
6ede1bed89 Improve handling of variables set in the make environment
removes duplication on the commandline for example in a profile-build

closes https://github.com/official-stockfish/Stockfish/pull/3859

No functional change
2022-05-29 19:04:25 +02:00
Giacomo Lorenzetti
f7d1491b3d Assorted small cleanups
closes https://github.com/official-stockfish/Stockfish/pull/3973

No functional change
2022-05-29 18:42:48 +02:00
ppigazzini
e178a09c47 Drop sse from target "x86-32"
have maximal compatibility on legacy target arch, now supporting AMD Athlon

The old behavior can anyway be selected by the user if needed, for example

make -j profile-build ARCH=x86-32 sse=yes

fixes #3904
closes https://github.com/official-stockfish/Stockfish/pull/3918

No functional change
2022-02-05 07:33:34 +01:00
pschneider1968
bddd38c45e Fix Makefile for Android NDK cross-compile
For cross-compiling to Android on windows, the Makefile needs some tweaks.

Tested with Android NDK 23.1.7779620 and 21.4.7075529, using
Windows 10 with clean MSYS2 environment (i.e. no MINGW/GCC/Clang
toolchain in PATH) and Fedora 35, with build target:
build ARCH=armv8 COMP=ndk

The resulting binary runs fine inside Droidfish on my Samsung
Galaxy Note20 Ultra and Samsung Galaxy Tab S7+

Other builds tested to exclude regressions: MINGW64/Clang64 build
on Windows; MINGW64 cross build, native Clang and GCC builds on Fedora.

wiki docs https://github.com/glinscott/fishtest/wiki/Cross-compiling-Stockfish-for-Android-on-Windows-and-Linux

closes https://github.com/official-stockfish/Stockfish/pull/3901

No functional change
2022-01-25 07:27:23 +01:00
Joost VandeVondele
77cf5704b6 Revert -flto=auto on mingw
causes issues on some installations (glinscott/fishtest#1255).

closes https://github.com/official-stockfish/Stockfish/pull/3898

No functional change
2022-01-20 18:34:16 +01:00
ppigazzini
67062637f4 Improve Makefile for Windows native builds
A Windows Native Build (WNB) can be done:
 - on Windows, using a recent mingw-w64 g++/clang compiler
   distributed by msys2, cygwin and others
 - on Linux, using mingw-w64 g++ to cross compile

Improvements:
 - check for a WNB in a proper way and set a variable to simplify the code
 - set the proper EXE for a WNB
 - use the proper name for the mingw-w64 clang compiler
 - use the static linking for a WNB
 - use wine to make a PGO cross compile on Linux (also with Intel SDE)
 - enable the LTO build for mingw-w64 g++ compiler
 - set `lto=auto` to use the make's job server, if available, or otherwise
   to fall back to autodetection of the number of CPU threads
 - clean up all the temporary LTO files saved in the local directory

Tested on:
 - msys2 MINGW64 (g++), UCRT64 (g++), MINGW32 (g++), CLANG64 (clang)
   environments
 - cygwin mingw-w64 g++
 - Ubuntu 18.04 & 21.10 mingw-w64 PGO cross compile (also with Intel SDE)

closes #3891

No functional change
2022-01-19 22:26:20 +01:00
proukornew
d11101e4c6 Improve logic on mingw
There is no need to point g++, if we explicitly choose mingw.

Now for cygwin:

make COMP=mingw ARCH=x86-64-modern build

closes https://github.com/official-stockfish/Stockfish/pull/3860

No functional change
2022-01-17 19:47:32 +01:00
pschneider1968
c5d45d3220 Fix Makefile for compilation with clang on Windows
use static compilation and
added exclusion of -latomic for Clang/MSYS2 as per ppigazzini's suggestion

fixes #3872

closes https://github.com/official-stockfish/Stockfish/pull/3873

No functional change
2022-01-13 22:17:27 +01:00
Brad Knox
ad926d34c0 Update copyright years
Happy New Year!

closes https://github.com/official-stockfish/Stockfish/pull/3881

No functional change
2022-01-06 15:45:45 +01:00
George Sobala
ca51b45649 Fixes build failure on Apple M1 Silicon
This pull request selectively avoids `-mdynamic-no-pic` for gcc on Apple Silicon
(there was no problem with the default clang compiler).

fixes https://github.com/official-stockfish/Stockfish/issues/3847
closes https://github.com/official-stockfish/Stockfish/pull/3850

No functional change
2021-12-19 11:43:18 +01:00
George Sobala
939b694bfd Fix for profile-build failure using gcc on MacOS
Fixes https://github.com/official-stockfish/Stockfish/issues/3846 ,
where the profiling SF binary generated by GCC on MacOS would launch
but failed to quit. Tested with gcc-8, gcc9, gcc10, gcc-11.

The problem can be fixed by adding -fvisibility=hidden to the compiler
flags, see for example the following piece of Apple documentation:
https://developer.apple.com/library/archive/documentation/DeveloperTools/Conceptual/CppRuntimeEnv/Articles/SymbolVisibility.html

For instance this now works:
   make -j8 profile-build ARCH=x86-64-avx2 COMP=gcc COMPCXX=g++-11

No functional change
2021-12-17 18:52:09 +01:00
Tomasz Sobczyk
4766dfc395 Optimize FT activation and affine transform for NEON.
This patch optimizes the NEON implementation in two ways.

    The activation layer after the feature transformer is rewritten to make it easier for the compiler to see through dependencies and unroll. This in itself is a minimal, but a positive improvement. Other architectures could benefit from this too in the future. This is not an algorithmic change.
    The affine transform for large matrices (first layer after FT) on NEON now utilizes the same optimized code path as >=SSSE3, which makes the memory accesses more sequential and makes better use of the available registers, which allows for code that has longer dependency chains.

Benchmarks from Redshift#161, profile-build with apple clang

george@Georges-MacBook-Air nets % ./stockfish-b82d93 bench 2>&1 | tail -4 (current master)
===========================
Total time (ms) : 2167
Nodes searched  : 4667742
Nodes/second    : 2154011
george@Georges-MacBook-Air nets % ./stockfish-7377b8 bench 2>&1 | tail -4 (this patch)
===========================
Total time (ms) : 1842
Nodes searched  : 4667742
Nodes/second    : 2534061

This is a solid 18% improvement overall, larger in a bench with NNUE-only, not mixed.

Improvement is also observed on armv7-neon (Raspberry Pi, and older phones), around 5% speedup.

No changes for architectures other than NEON.

closes https://github.com/official-stockfish/Stockfish/pull/3837

No functional changes.
2021-12-07 18:08:54 +01:00
Gian-Carlo Pascutto
c9977aa0a8 Add AVX-VNNI support for Alder Lake and later.
In their infinite wisdom, Intel axed AVX512 from Alder Lake
chips (well, not entirely, but we kind of want to use the Gracemont
cores for chess!) but still added VNNI support.
Confusingly enough, this is not the same as VNNI256 support.

This adds a specific AVX-VNNI target that will use this AVX-VNNI
mode, by prefixing the VNNI instructions with the appropriate VEX
prefix, and avoiding AVX512 usage.

This is about 1% faster on P cores:

Result of  20 runs
==================
base (./clang-bmi2   ) =    3306337  +/- 7519
test (./clang-vnni   ) =    3344226  +/- 7388
diff                   =     +37889  +/- 4153

speedup        = +0.0115
P(speedup > 0) =  1.0000

But a nice 3% faster on E cores:

Result of  20 runs
==================
base (./clang-bmi2   ) =    1938054  +/- 28257
test (./clang-vnni   ) =    1994606  +/- 31756
diff                   =     +56552  +/- 3735

speedup        = +0.0292
P(speedup > 0) =  1.0000

This was measured on Clang 13. GCC 11.2 appears to generate
worse code for Alder Lake, though the speedup on the E cores
is similar.

It is possible to run the engine specifically on the P or E using binding,
for example in linux it is possible to use (for an 8 P + 8 E setup like i9-12900K):
taskset -c 0-15 ./stockfish
taskset -c 16-23 ./stockfish
where the first call binds to the P-cores and the second to the E-cores.

closes https://github.com/official-stockfish/Stockfish/pull/3824

No functional change
2021-12-03 08:51:06 +01:00
Michel Van den Bergh
0e89d6e754 Do not output to stderr during the build.
To help with debugging, the worker sends the output of
stderr (suitable truncated) to the action log on the
server, in case a build fails. For this to work it is
important that there is no spurious output to stderr.

closes https://github.com/official-stockfish/Stockfish/pull/3773

No functional change
2021-10-31 22:40:41 +01:00
xoto10
f21a66f70d Small clean-up, Sept 2021
Closes https://github.com/official-stockfish/Stockfish/pull/3485

No functional change
2021-10-07 09:41:57 +02:00