1
0
Fork 0
mirror of https://github.com/sockspls/badfish synced 2025-05-02 17:49:35 +00:00
Commit graph

269 commits

Author SHA1 Message Date
Guy Vreuls
ea6220f381 This commit enables a mixed bench, to improve CI and allow for PGO (profile-build) of the NNUE part of the code.
Joint work gvreuls / vondele

* Download the default NNUE net in AppVeyor
* Download net in travis CI `make net`
* Adjust tests to cover more archs, speedup instrumented testing
* Introduce 'mixed' bench as default, with further options:

classical, NNUE, mixed.

mixed (default) and NNUE require the default net to be present,
which can be obtained with

```
make net
```

Further examples (first is equivalent to `./stockfish bench`):

```
./stockfish bench 16 1 13 default depth mixed
./stockfish bench 16 1 13 default depth classical
./stockfish bench 16 1 13 default depth NNUE
```

The net is now downloaded automatically if needed for `profile-build`
(usual `build` works fine without net present)

PGO gives a nice speedup on fishtest:

passed STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 3360 W: 469 L: 343 D: 2548
Ptnml(0-2): 20, 246, 1030, 356, 28
https://tests.stockfishchess.org/tests/view/5f31b5499081672066537569

passed LTC:
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 8824 W: 609 L: 502 D: 7713
Ptnml(0-2): 8, 430, 3438, 519, 17
https://tests.stockfishchess.org/tests/view/5f31c87b908167206653757c

closes https://github.com/official-stockfish/Stockfish/pull/2931

fixes https://github.com/official-stockfish/Stockfish/issues/2907

requires fishtest updates before commit

Bench: 4290577
2020-08-11 08:17:03 +02:00
mstembera
f46c73040c Fix AVX512 build with older compilers
avoids an intrinsic that is missing in gcc < 10.

For this target, might trigger another gcc bug on windows that
requires up-to-date gcc 8, 9, or 10, or usage of clang.

Fixes https://github.com/official-stockfish/Stockfish/issues/2975

closes https://github.com/official-stockfish/Stockfish/pull/2976

No functional change
2020-08-11 08:17:03 +02:00
Guy Vreuls
4ab8b0b738 Fix parallel LTO issues on Windows
This adds -save-temps to the linker flags when parallel LTO is used on
MinGW/MSYS.

fixes #2977

closes https://github.com/official-stockfish/Stockfish/pull/2978

No functional change.
2020-08-11 08:17:03 +02:00
Fanael Linithien
21df37d7fd Provide vectorized NNUE code for SSE2 and MMX targets
This patch allows old x86 CPUs, from AMD K8 (which the x86-64 baseline
targets) all the way down to the Pentium MMX, to benefit from NNUE with
comparable performance hit versus hand-written eval as on more modern
processors.

NPS of the bench with NNUE enabled on a Pentium III 1.13 GHz (using the
MMX code):
  master: 38951
  this patch: 80586

NPS of the bench with NNUE enabled using baseline x86-64 arch, which is
how linux distros are likely to package stockfish, on a modern CPU
(using the SSE2 code):
  master: 882584
  this patch: 1203945

closes https://github.com/official-stockfish/Stockfish/pull/2956

No functional change.
2020-08-10 19:17:57 +02:00
sf-x
cb0504028e Makefile rework/cleanup
Makefile targets x86-64-sse42, x86-sse3 are removed; x86-64-sse41
is renamed to x86-64-sse41-popcnt (it did enable popcnt).

Makefile variables sse3, sse42, their associated compilation flags
and code in misc.cpp are removed.

closes https://github.com/official-stockfish/Stockfish/pull/2922

No functional change
2020-08-10 14:32:11 +02:00
Joost VandeVondele
27b593a944 Fix a data race for NNUE
the stateInfo at the rootPos is no longer read-only, as the NNUE accumulator is part of it.
Threads can thus not share this object and need their own copy.

tested for no regression
https://tests.stockfishchess.org/tests/view/5f3022239081672066536bce
LLR: 2.96 (-2.94,2.94) {-1.50,0.50}
Total: 52800 W: 6843 L: 6802 D: 39155
Ptnml(0-2): 336, 4646, 16399, 4679, 340

closes https://github.com/official-stockfish/Stockfish/pull/2957

fixes https://github.com/official-stockfish/Stockfish/issues/2933

No functional change
2020-08-09 23:51:07 +02:00
Daniel Dugovic
d7a26899a9 Use fallback implementation for C++ aligned_alloc
fixes https://github.com/official-stockfish/Stockfish/issues/2921

closes https://github.com/official-stockfish/Stockfish/pull/2927

No functional change
2020-08-09 17:07:45 +02:00
Guy Vreuls
6d6267c378 Parallelize Link Time Optimization for GCC, CLANG and MINGW
This patch tries to run multiple LTO threads in parallel, speeding up
the build process of optimized builds if the -j make parameter is used.
This mitigates the longer linking times of optimized builds since the
integration of the NNUE code. Roughly 2x build speedup.

I've tried a similar patch some two years ago but it ran into trouble
with old compiler versions then. Since we're on the C++17 standard now
these old compilers should be obsolete.

closes https://github.com/official-stockfish/Stockfish/pull/2943

No functional change.
2020-08-08 22:35:18 +02:00
nodchip
84f3e86790 Add NNUE evaluation
This patch ports the efficiently updatable neural network (NNUE) evaluation to Stockfish.

Both the NNUE and the classical evaluations are available, and can be used to
assign a value to a position that is later used in alpha-beta (PVS) search to find the
best move. The classical evaluation computes this value as a function of various chess
concepts, handcrafted by experts, tested and tuned using fishtest. The NNUE evaluation
computes this value with a neural network based on basic inputs. The network is optimized
and trained on the evalutions of millions of positions at moderate search depth.

The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward.
It can be evaluated efficiently on CPUs, and exploits the fact that only parts
of the neural network need to be updated after a typical chess move.
[The nodchip repository](https://github.com/nodchip/Stockfish) provides additional
tools to train and develop the NNUE networks.

This patch is the result of contributions of various authors, from various communities,
including: nodchip, ynasu87, yaneurao (initial port and NNUE authors), domschl, FireFather,
rqs, xXH4CKST3RXx, tttak, zz4032, joergoster, mstembera, nguyenpham, erbsenzaehler,
dorzechowski, and vondele.

This new evaluation needed various changes to fishtest and the corresponding infrastructure,
for which tomtor, ppigazzini, noobpwnftw, daylen, and vondele are gratefully acknowledged.

The first networks have been provided by gekkehenker and sergiovieri, with the latter
net (nn-97f742aaefcd.nnue) being the current default.

The evaluation function can be selected at run time with the `Use NNUE` (true/false) UCI option,
provided the `EvalFile` option points the the network file (depending on the GUI, with full path).

The performance of the NNUE evaluation relative to the classical evaluation depends somewhat on
the hardware, and is expected to improve quickly, but is currently on > 80 Elo on fishtest:

60000 @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f28fe6ea5abc164f05e4c4c
ELO: 92.77 +-2.1 (95%) LOS: 100.0%
Total: 60000 W: 24193 L: 8543 D: 27264
Ptnml(0-2): 609, 3850, 9708, 10948, 4885

40000 @ 20+0.2 th 8
https://tests.stockfishchess.org/tests/view/5f290229a5abc164f05e4c58
ELO: 89.47 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 12756 L: 2677 D: 24567
Ptnml(0-2): 74, 1583, 8550, 7776, 2017

At the same time, the impact on the classical evaluation remains minimal, causing no significant
regression:

sprt @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f2906a2a5abc164f05e4c5b
LLR: 2.94 (-2.94,2.94) {-6.00,-4.00}
Total: 34936 W: 6502 L: 6825 D: 21609
Ptnml(0-2): 571, 4082, 8434, 3861, 520

sprt @ 60+0.6 th 1
https://tests.stockfishchess.org/tests/view/5f2906cfa5abc164f05e4c5d
LLR: 2.93 (-2.94,2.94) {-6.00,-4.00}
Total: 10088 W: 1232 L: 1265 D: 7591
Ptnml(0-2): 49, 914, 3170, 843, 68

The needed networks can be found at https://tests.stockfishchess.org/nns
It is recommended to use the default one as indicated by the `EvalFile` UCI option.

Guidelines for testing new nets can be found at
https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#nnue-net-tests

Integration has been discussed in various issues:
https://github.com/official-stockfish/Stockfish/issues/2823
https://github.com/official-stockfish/Stockfish/issues/2728

The integration branch will be closed after the merge:
https://github.com/official-stockfish/Stockfish/pull/2825
https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip

closes https://github.com/official-stockfish/Stockfish/pull/2912

This will be an exciting time for computer chess, looking forward to seeing the evolution of
this approach.

Bench: 4746616
2020-08-06 16:37:45 +02:00
Joost VandeVondele
ca41ee6632 Revert LTO for mingw on windows.
LTO with static linking is still only working with the latest versions of gcc,
causing problems for some devs.

on a modern mingw toolchain LTO optimizations can still be enabled as:

```
CXXFLAGS='-flto' make -j ARCH=x86-64-modern COMP=mingw profile-build
```

fixes https://github.com/official-stockfish/Stockfish/issues/2769

closes https://github.com/official-stockfish/Stockfish/pull/2774

No functional change.
2020-06-27 10:22:27 +02:00
Niklas Fiekas
aecfca2dc2 support popcnt on armv8
* Supports popcnt (thanks @daylen)
* bits = 64 is now the default

Tested with g++ (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0 on ThunderX CN8890,
yields about 9% speedup.

Also tested with clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final).

closes https://github.com/official-stockfish/Stockfish/pull/2770

No functional change.
2020-06-27 10:19:29 +02:00
UnaiCorzo
11483fe6d9 Makefile: support lto on mingw, default to 64bits
Clean and organize uppercase and spaces

fixes https://github.com/official-stockfish/Stockfish/issues/2731

closes  https://github.com/official-stockfish/Stockfish/pull/2763

No functional change
2020-06-24 22:14:25 +02:00
Niklas Fiekas
527d832a6d Support ARCH=armv8 in Makefile (#2355)
Tested with bench run after compiling with

- g++ (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
- clang version 3.8.1-24

on ThunderX CN8890.

closes https://github.com/official-stockfish/Stockfish/pull/2760

fixes https://github.com/official-stockfish/Stockfish/issues/2355

No functional change.
2020-06-24 21:59:57 +02:00
Joost VandeVondele
6f15e7fab2 small cleanups
closes https://github.com/official-stockfish/Stockfish/pull/2695

No functional change
2020-06-21 15:22:20 +02:00
xoto10
fcaf0736fe Fix syzygy dependencies issue
fixes https://github.com/official-stockfish/Stockfish/issues/2660

The problem was caused by .depend being created with a rule for tbprobe.o not for syzygy/tbprobe.o.
This patch keeps an explicit list of sources (SRCS), generates OBJS,
and compiles all object files to the src/ directory, consistent with .depend.
VPATH is used to search the syzygy directory as needed.

joint work with @gvreuls

closes https://github.com/official-stockfish/Stockfish/pull/2664

No functional change
2020-05-09 09:39:52 +02:00
Marco Costalba
c527c3ad44 Fishtest Tuning Framework
The purpose of the code is to allow developers to easily and flexibly
setup SF for a tuning session. Mainly you have just to remove 'const'
qualifiers from the variables you want to tune and flag them for
tuning, so if you have:

int myKing = 10;
Score myBonus = S(5, 15);
Value myValue[][2] = { { V(100), V(20) }, { V(7), V(78) } };

and at the end of the update you may want to call
a post update function:

void my_post_update();

If instead of default Option's min-max values,
you prefer your custom ones, returned by:

std::pair<int, int> my_range(int value)

Or you jus want to set the range directly, you can
simply add below:

TUNE(SetRange(my_range), myKing, SetRange(-200, 200), myBonus, myValue, my_post_update);

And all the magic happens :-)

At startup all the parameters are printed in a
format suitable to be copy-pasted in fishtest.

In case the post update function is slow and you have many
parameters to tune, you can add:

UPDATE_ON_LAST();

And the values update, including post update function call, will
be done only once, after the engine receives the last UCI option.
The last option is the one defined and created as the last one, so
this assumes that the GUI sends the options in the same order in
which have been defined.

closes https://github.com/official-stockfish/Stockfish/pull/2654

No functional change.
2020-05-02 17:32:11 +02:00
Joost VandeVondele
c6839a2615 Small cleanups
closes https://github.com/official-stockfish/Stockfish/pull/2546

No functional change.
2020-03-01 09:31:58 +01:00
Stéphane Nicolet
90c0385724 Assorted trivial cleanups
- Cleanups by Alain
- Group king attacks and king defenses
- Signature of futility_move_count()
- Use is_discovery_check_on_king()
- Simplify backward definition
- Use static asserts in move generator
- Factor a statement in move generator

No functional change
2019-10-26 00:29:12 +02:00
Stéphane Nicolet
8a04b3a13c Update Makefile documentation
Follow-up to previous commit. Update the documentation for the user when using `make`,
to show the preferred bmi2 compile in the advanced examples section.

Note: I made a mistake in the previous commit comment, the documentation is shown when
using `make` or `make help`, not `make --help`.

No functional change
2019-09-14 07:34:19 +02:00
Joost VandeVondele
db00e1625e Add sse4 if bmi2 is enabled
The only change done to the Makefile to get a somewhat faster binary as
discussed in #2291 is to add -msse4 to the compile options of the bmi2 build.
Since all processors supporting bmi2 also support sse4 this can be done easily.
It is a useful step to avoid sending around custom and poorly tested builds.

The speedup isn't enough to pass [0,4] but it is roughly 1.15Elo and a LOS of 90%:
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 93009 W: 20519 L: 20316 D: 52174

Also rewrite the documentation for the user when using `make --help`, so that
the order of architectures for x86-64 has the more performant build one on top.

Closes https://github.com/official-stockfish/Stockfish/pull/2300

No functional change
2019-09-14 07:11:23 +02:00
Daniel Axtens
fa1a2a0667 Enable popcount and prefetch for ppc-64
PowerPC has had popcount instructions for a long time, at least as far
back as POWER5 (released 2004). Enable them via a gcc builtin.

Using a gcc builtin has the added bonus that if compiled for a processor
that lacks a hardware instruction, gcc will include a software popcount
implementation that does not use the instruction. It might be slower
than the table lookups (or it might be faster) but it will certainly work.
So this isn't going to break anything.

On my POWER8 VM, this leads to a ~4.27% speedup.

Fir prefetch, the gcc builtin generates a 'dcbt' instruction, which is
supported at least as far back as the G5 (2002) and POWER4 (2001).

This leads to a ~5% speedup on my POWER8 VM.

No functional change
2019-07-11 11:30:09 +02:00
Joost VandeVondele
7ede1ed071 Allow for address sanitizer. (#2119)
Properly allow for sanitize=address (-fsanitize=address) as an argument to the Makefile.

No functional change
2019-04-27 20:47:06 +02:00
Stéphane Nicolet
cf5d683408 Stockfish 10-beta
Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.

No functional change
2018-11-19 11:18:21 +01:00
Joost VandeVondele
3df8cabb84 Fix 'make strip' for mingw.
Currently the make strip target is broken on mingw as the exe name is wrong (stockfish instead of stockfish.exe).

Needs some testing by mingw users (both profile-build and strip, native and cross).

No functional change.
2018-04-29 06:53:51 +02:00
Stéphane Nicolet
94b3cdd908 Better indentation in Makefile
No functional change
2018-03-03 11:07:23 +01:00
erbsenzaehler
d438720a1c Unify use of -mdynamic-no-pic
Apply -mdynamic-no-pic in a single place in the Makefile instead of 5 places.

Verified on three different Macs:
- a MacBook from 2013
- a MacBook running MacOS 10.9.5
- an iMac running MacOS 10.13.3

No functional change.
2018-02-27 00:30:47 +01:00
syzygy1
ef61886332 Enable LTO for clang
Enable link-time optimization in the Makefile when compiling with clang.
Also update travis.yml to use clang++-5.0 and llvm-5.0-dev.

No functional change.
2018-02-06 00:46:50 +01:00
Joost VandeVondele
9afa1d7330 New Year 2018
Adjust copyright headers.

No functional change.
2018-01-01 13:18:10 +01:00
Stéphane Nicolet
ccd6bad512 Compile without exceptions
Add the -fno-exceptions flag to the Makefile to avoid the unecessary exceptions support in the executable (we do not use any exception in Stockfish at the moment).

This change gives a 9.2% reduction in size for the executable binary.

Before : executable size = 376956 bytes
After: executable size = 347652 bytes

No functional change.
2017-12-03 12:30:09 +01:00
basepr1me
7dd1f4a7c0 OpenBSD friendly start. 2017-11-18 16:45:33 +01:00
Cooffe
e0d2fdc843 Update Copyright year inMakefile
No functional change.
2017-10-28 12:35:44 +02:00
Joost VandeVondele
323925b91c Remove unneeded compile options.
In light of issue #1232, a test was performed about the value of '-fno-exceptions' and a second one of the combination '-fno-exceptions -fno-rtti'. It turns out these options are can be removed without introducing slowdown.

STC for removing '-fno-exceptions'
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 13678 W: 2572 L: 2439 D: 8667

STC for removing '-fno-exceptions -fno-rtti' (current patch)
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32557 W: 6074 L: 5973 D: 20510

No functional change.
2017-09-02 16:58:23 +02:00
Joona Kiiski
d31f068312 Revert "Remove questionable gcc flags from profile-build"
This reverts commit 0371a8f8c4.
2017-07-13 16:36:27 -07:00
Joona Kiiski
0371a8f8c4 Remove questionable gcc flags from profile-build
Optimization options for official stockfish should be
consistent, easy, future proof and simple.

We don't want to optimize for any specific version of gcc

No functional change

Closes #1165
2017-07-08 14:20:46 -07:00
Joost VandeVondele
3cb0200459 Fix four data races.
the nodes, tbHits, rootDepth and lastInfoTime variables are read by multiple threads, but not declared atomic, leading to data races as found by -fsanitize=thread. This patch fixes this issue. It is based on top of the CI-threading branch (PR #1129), and should fix the corresponding CI error messages.

The patch passed an STC check for no regression:

http://tests.stockfishchess.org/tests/view/5925d5590ebc59035df34b9f
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 169597 W: 29938 L: 30066 D: 109593

Whereas rootDepth and lastInfoTime are not performance critical, nodes and tbHits are. Indeed, an earlier version using relaxed atomic updates on the latter two variables failed STC testing (http://tests.stockfishchess.org/tests/view/592001700ebc59035df34924), which can be shown to be due to x86-32 (http://tests.stockfishchess.org/tests/view/592330ac0ebc59035df34a89). Indeed, the latter have no instruction to atomically update a 64bit variable. The proposed solution thus uses a variable in Position that is accessed only by one thread, which is copied every few thousand nodes to the shared variable in Thread.

No functional change.

Closes #1130
Closes #1129
2017-06-21 13:37:58 -07:00
Marco Costalba
27ba611a3d Better naming in endgame code
And small clean-up of magic bitboards code.

No functional change.

Closes #1138
2017-06-16 19:33:44 -07:00
Marco Costalba
a6d6a2c2fa Assorted code style fixes
No functional change

Closes #1029
2017-03-14 21:02:21 -07:00
Joost VandeVondele
1e814e0ca0 Fix makefile: 32 bit builds without optimization.
Fixes failing build for

make ARCH=x86-32 clean && make ARCH=x86-32 optimize=no build

by passing -m32 also to the link step.

Extend travis testing accordingly.

No functional change.

Closes #999
2017-02-14 21:11:44 -08:00
theo77186
7a3844e6ef Fix PGO build with GCC (#904) 2016-11-27 14:43:52 +01:00
Michael Byrne
fbb2ffacfd Fix PGO Build for clang
This fixes https://github.com/official-stockfish/Stockfish/issues/167.

Additional improvments by Joost VandeVondele.
2016-11-27 10:03:52 +01:00
erbsenzaehler
ca464fc89e Cleanup Makfile for MacOs
1) Explicitly setting the default lib to the system-default is not
   needed on a Mac. See:
   http://libcxx.llvm.org/docs/UsingLibcxx.html

2) We do no longer need to exclude bmi2-builds from LTO. See:
   https://llvm.org/bugs/show_bug.cgi?id=19416

Changes tested and discussed on FishCooking:
   https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/acUQtKtEzMM

No functional change.
2016-11-25 08:46:20 +01:00
Joost VandeVondele
6036303bb6 Avoid touching source files in profile-build
This refines the profile-build target to avoid 'touch'ing the sources,
keeping meaningful modification dates and avoiding editor warnings like vi's:

WARNING: The file has been changed since reading it!!!
Do you really want to write to it (y/n)?

Instead of touching sources, the (instrumented) object files are removed,
which has the same effect of rebuilding them in the next step.

As a side effect, this simplifies the Makefile a bit.

No functional change.
2016-11-20 10:51:42 +01:00
Joost VandeVondele
67d19447f4 Makefile fix for sanitize
Small fixes for compilation with sanitize=yes optimize=no,
by always adding -fsanitize=undefined to the LDFLAGS as required.
Updates config-sanity to check&report the status of the flag.

No functional change.
2016-11-05 08:15:56 +01:00
Joost VandeVondele
bf51b4796a travis-ci: Enable undefined behavior checking 2016-10-27 06:29:24 +02:00
Jacques
16e1881126 Fixes for ARM compilation: take 2
The target:

Odroid U3 (http://www.hardkernel.com/main/products/prdt_info.php?g_code=g138745696275)
Debian Jessie
As listed in #550 and #638 three modifications are needed for compilation to work:

float-abi flag for GCC If an FPU is present and supported by the installed os then passed value need to be hard.
I didn't find any better solution than using readelf to check for the availibilty of Tag_ABI_VFP_args which sould indicate support for the FPU. The check is only done if the arch is arm and if readelf is not present
on the system, there will be an error (/bin/sh: 1: readelf: not found) but it will not break and will continue with the default softfp value. Outputing the error is not really acceptable but I wanted some feedback on the
check itself.

-lpthread is needed on armv7 outside of Android
I replaced UNAME with KERNEL and OS to allow to differentiate Android.

m32 flag
My understanding is that outside of Android the flag is generating errors on armv7.

These modifications should introduce change only for non Android armv7 build.

No functional change.
2016-10-14 08:58:07 +02:00
Marco Costalba
e1f600f186 Revert "Fixes for ARM compilation"
This reverts commit a3fe80c36a.

Break compilation on mingw for me.
2016-10-13 08:36:30 +02:00
Jacques
a3fe80c36a Fixes for ARM compilation
The target:

Odroid U3 (http://www.hardkernel.com/main/products/prdt_info.php?g_code=g138745696275)
Debian Jessie
As listed in #550 and #638 three modifications are needed for compilation to work:

float-abi flag for GCC If an FPU is present and supported by the installed os then passed value need to be hard.
I didn't find any better solution than using readelf to check for the availibilty of Tag_ABI_VFP_args which sould indicate support for the FPU. The check is only done if the arch is arm and if readelf is not present
on the system, there will be an error (/bin/sh: 1: readelf: not found) but it will not break and will continue with the default softfp value. Outputing the error is not really acceptable but I wanted some feedback on the
check itself.

-lpthread is needed on armv7 outside of Android
I replaced UNAME with KERNEL and OS to allow to differentiate Android.

m32 flag
My understanding is that outside of Android the flag is generating errors on armv7.

These modifications should introduce change only for non Android armv7 build.

No functional change.
2016-10-13 08:34:04 +02:00
Joost Vandevondele
4b0043ae7c Use fixed depth bench to make PGO builds more reproducible
Discussed on fishcooking

proposal and objdump verification:
https://groups.google.com/d/msg/fishcooking/4_ausUwMXP0/EGPsMYqOFAAJ

verified no significant speed difference between depth and time:
https://groups.google.com/d/msg/fishcooking/4_ausUwMXP0/KazW5QZmFgAJ

stockfish_time - stats:
mean = 2207232.56        std = 7079.51        std/mean = 0.003207

stockfish_depth - stats:
mean = 2201783.57        std = 6356.69        std/mean = 0.002887

No functional change
2016-09-18 08:13:34 +02:00
Krgp
8f934dff9a Remove useless -mbmi flag in Makefile
I could not find anything documented that is necessary that prepending -mbmi to -mbmi2 gives some benefit.
Instead at
https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.html#x86-Built-in-Functions

The following built-in functions are available when -mbmi is used. All of them generate the machine instruction that is part of the name.
unsigned int __builtin_ia32_bextr_u32(unsigned int, unsigned int);
unsigned long long __builtin_ia32_bextr_u64 (unsigned long long, unsigned long long);

The following built-in functions are available when -mbmi2 is used. All of them generate the machine instruction that is part of the name.
unsigned int _bzhi_u32 (unsigned int, unsigned int)
unsigned int _pdep_u32 (unsigned int, unsigned int)
unsigned int _pext_u32 (unsigned int, unsigned int)
unsigned long long _bzhi_u64 (unsigned long long, unsigned long long)
unsigned long long _pdep_u64 (unsigned long long, unsigned long long)
unsigned long long _pext_u64 (unsigned long long, unsigned long long)

and at
https://gcc.gnu.org/ml/gcc/2014-02/msg00204.html

( "... The real optimization comes from being able to use pext
(parallel bit extract), which can implement several bextr expressions in
parallel.")

Apart from that we don't use all -msse -msse2 -msse3 -msse4.2 etc. but just -msse3 (or -msse4.2) only.

As regards to the speedup within noise level - this pull request is actually reversal of mcostalba#198 wherein prepending -mbmi to -mbmi2 was claimed to be 0.3% faster and here (removing -mbmi) gives 0.4% speed gain.
2016-05-01 14:11:28 +02:00
erbsenzaehler
4048bae47b Use -O3 for all compilers (including ICC)
There seems to be no benefit from using -fast over -O3 with icc.
So use -O3 everywhere.

No functional change

Resolves #652
2016-04-24 00:55:56 +01:00