Add a Mac SSE4.2 target. Also change the Mac OS X minimum version to
10.6. Rationale: 97% of Macs run at least 10.6, version 10.9 is now
free, and using 10.6 as the minimum version gives a small 5% boost in
benchmark speed over versions using 10.0 as the minimum version.
Finally, enable Clang’s Link Time Optimization when compiling for the
Mac.
No functional change.
For some users -stack_size,0x4000 does not work,
so revert for now.
osX 10.6.8
gcc version 4.7.3 (MacPorts gcc47 4.7.3_2)
g++: error: unrecognized command line option '-stack_size,0x4000'
make[2]: *** [stockfish] Error 1
make[1]: *** [gcc-profile-make] Error 2
make: *** [profile-build] Error 2
No functional change.
This reverts commit 800410eef1 and instead increases
stack size.
I went through the old emails with Daylen that reported the
crash issue on Mac OS X, which was fixed by 0049d3f337.
It was reported that the default stack size for a thread on Mac OS X
is 8 megabytes, while the patch that we are reverting reduces stack
usage by at most about 217KB, so the MAX_MOVES value was only a
marginal cause of the crash. In those emails Daylen also
hinted at how to increase the stack size on Mac OS X to 16MB.
So prefer to increase the stack size to 16MB instead of re-inventing
the wheel with a home-grown stack, as we did with the patch
that we are now reverting (it will remain in git history anyway
for documentation purposes).
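For illustration only, here is a minimal sketch of requesting a 16MB
stack for a single thread through the POSIX threads API. This is a
swapped-in technique, not necessarily what this commit does (the
message does not say how the stack size is raised), and the names
run_with_stack and hungry are hypothetical.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper: runs fn(arg) on a new thread whose stack is
// stackBytes large, then waits for it. Returns true on success.
bool run_with_stack(void* (*fn)(void*), void* arg, std::size_t stackBytes) {
    assert(fn != nullptr);
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0)
        return false;
    // Ask for the larger stack before the thread is created.
    bool ok = pthread_attr_setstacksize(&attr, stackBytes) == 0;
    pthread_t th;
    ok = ok && pthread_create(&th, &attr, fn, arg) == 0;
    pthread_attr_destroy(&attr);
    return ok && pthread_join(th, nullptr) == 0;
}

// Hypothetical stack-hungry worker: keeps about 4MB of locals alive,
// standing in for a search that stores large move lists on the stack.
void* hungry(void* out) {
    volatile char buf[4 * 1024 * 1024];
    buf[0] = 1;
    buf[sizeof(buf) - 1] = 2;
    *static_cast<int*>(out) = buf[0] + buf[sizeof(buf) - 1];
    return nullptr;
}
```

Calling run_with_stack(hungry, &result, 16u * 1024 * 1024) gives the
worker a 16MB stack, so its 4MB of locals fit with room to spare.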
No functional change.
STANDALONE-TOOLCHAIN.html in Android NDK says:
It is recommended to use the -mthumb compiler flag to force the generation
of 16-bit Thumb-1 instructions (the default being 32-bit ARM ones).
If you want to target the 'armeabi-v7a' ABI, you will need ensure that the
following two flags are being used:
CFLAGS='-march=armv7-a -mfloat-abi=softfp'
Note: The first flag enables Thumb-2 instructions, and the second one
enables H/W FPU instructions while ensuring that floating-point
parameters are passed in core registers, which is critical for
ABI compatibility. Do *not* use these flags separately!
Thanks to Peter Osterlund for pointing out this doc and for showing me
an example Makefile to follow.
No functional change.
Instead of classical flags, throw an
exception when we want to immediately halt
the search. Currently a single exception
type is used for both UCI stop and thread
cut-offs.
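A minimal sketch of the idea, assuming a toy recursive search. All
names here (StopSearch, SearchState, search, run_search) are
hypothetical and not taken from the engine sources; a node limit
stands in for the real stop conditions.

```cpp
#include <cassert>

// Hypothetical empty tag type thrown to unwind the whole search.
struct StopSearch {};

struct SearchState {
    long nodes = 0;
    long nodeLimit = 0;  // stand-in for "UCI stop received" or "thread cut off"
};

// Toy search over a uniform binary tree; returns the number of leaves.
int search(SearchState& st, int depth) {
    if (++st.nodes > st.nodeLimit)
        throw StopSearch();  // no flag checks needed on the way back up
    if (depth == 0)
        return 1;
    return search(st, depth - 1) + search(st, depth - 1);
}

// Returns true if the search completed, false if it was halted.
// On a halt, result is left untouched.
bool run_search(SearchState& st, int depth, int& result) {
    assert(depth >= 0);
    try {
        result = search(st, depth);
        return true;
    } catch (const StopSearch&) {
        return false;
    }
}
```

The point of the design is that the recursion needs no "was I
stopped?" test after every child call: the exception unwinds every
frame in one step and is caught once at the root.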
No functional change.
It is somewhat redundant and could make the SF
name too long, so use just Version. In case
of a signature build, Version will be set to
'sig-xxx'; otherwise, if left empty, we fall
back on the usual date stamp.
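The fallback rule can be sketched as follows. The function name
engine_name, its parameters and the "Stockfish " prefix layout are
assumptions made for illustration; the real code may obtain and format
the date differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not the engine's actual function: builds the
// name string from a Version value and a caller-supplied date stamp.
std::string engine_name(const std::string& version, const std::string& buildDate) {
    assert(!buildDate.empty());
    // A non-empty Version (for example "sig-xxx" from a signature
    // build) wins; an empty one falls back on the date stamp.
    return "Stockfish " + (version.empty() ? buildDate : version);
}
```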
No functional change.
When compiling with:
make signature-build ARCH=xxx COMP=xxx
After the binary has been produced, it will be run to
get the signature from 'stockfish bench' and this
number will be used as Version, so that it
will be easy to track the original sources
from a binary.
No functional change.
Intel Compiler has 'invented' this pearl:
warning #1476: field uses tail padding of a base class
Just because we have subclassed MainThread and added
the field 'bool thinking'.
Pure nonsense. Silence the warning.
No functional change.
The existing Makefile is buggy for PowerPC: PowerPC has no
SSE, yet SSE is enabled for it when Prefetch is enabled,
simply because it isn't ARMv7.
Patch from Matthew Brades.
No functional change.
Implement lsb/msb using armv7 assembly instructions.
msb is the easiest one, using a gcc intrinsic that generates
code using the ARM's clz instruction. lsb is also using this
clz instruction, but with the help of ARM's 'rbit' (bit
reversing) instruction. This leads to a >2% speed gain.
I also renamed 'arm-32' to the more meaningful 'armv7' in the Makefile.
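A sketch of the clz/rbit idea on 32-bit words, assuming a compiler
that provides __builtin_clz (GCC and Clang do). The function names are
hypothetical, and reverse_bits is a portable stand-in for the ARM
'rbit' instruction, so the block also builds on non-ARM hosts.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for ARM's 'rbit': reverses the bit order of x.
inline uint32_t reverse_bits(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

// msb: count leading zeros, then convert to a bit index. b must be non-zero.
inline int msb32(uint32_t b) {
    assert(b != 0);
    return 31 - __builtin_clz(b);
}

// lsb: after bit reversal the lowest set bit becomes the highest one,
// so its index equals the leading-zero count. b must be non-zero.
inline int lsb32(uint32_t b) {
    assert(b != 0);
    return __builtin_clz(reverse_bits(b));
}

// 64-bit wrappers built from the two 32-bit halves.
inline int msb64(uint64_t b) {
    uint32_t hi = uint32_t(b >> 32);
    return hi ? 32 + msb32(hi) : msb32(uint32_t(b));
}

inline int lsb64(uint64_t b) {
    uint32_t lo = uint32_t(b);
    return lo ? lsb32(lo) : 32 + lsb32(uint32_t(b >> 32));
}
```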
No functional change.
This kind of arch-specific code is really nasty
to get right because you need to verify it on
all the platforms.
It should now also compile properly on ARM.
Reported by Jean-Francois.
No functional change.
First change: If Haiku is host platform, change
installation prefix to /boot/common/bin
Second change: Only link in pthreads if Haiku isn't
host platform.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of -O4 option that does not work with both mingw and
Linux gcc (tested with Clang 3.1).
As reported by Reed Kotler:
Turns out that -O4 is not a valid option for clang unless you have
the proper gold linker and plugins built. That's because -O4 enables
LTO, which writes out bitcode files during the compile, and then loads
those and optimizes them during the link phase.
It requires a linker that supports LLVM's LTO. There is a plugin for
Gold available as part of LLVM.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Set optimization level to 4 and get a 2.564% faster binary:
Stockfish (Clang, Level 4) bench:
$ make build ARCH=osx-x86-64 COMP=clang
(Clang does not support PGO)
Average of 4 trials:
Total time (ms): 5137.5
Nodes searched: 5631135
Nodes/second: 1096084.5
Stockfish (Clang, Level 3) bench:
$ make build ARCH=osx-x86-64 COMP=clang
(Clang does not support PGO)
Average of 4 trials:
Total time (ms): 5269.25
Nodes searched: 5631135
Nodes/second: 1068679.25
Stockfish (GCC, PGO) bench:
$ make profile-build ARCH=osx-x86-64
Average of 4 trials:
Total time (ms): 5286
Nodes searched: 5631135
Nodes/second: 1065292.25
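The quoted percentage follows from the nodes/second averages above.
A small sketch of the arithmetic, with speedup_percent as a
hypothetical helper name:

```cpp
#include <cassert>

// Percentage by which fasterNps exceeds baselineNps (both in nodes/second).
double speedup_percent(double fasterNps, double baselineNps) {
    assert(baselineNps > 0.0);
    return (fasterNps / baselineNps - 1.0) * 100.0;
}
```

speedup_percent(1096084.5, 1068679.25) is about 2.564 (Clang level 4
against Clang level 3), and speedup_percent(1096084.5, 1065292.25) is
about 2.89 (Clang level 4 against the GCC PGO build).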
Suggestion and performance tests by Daylen Yang.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Makefile modified to support the clang compiler under Mac.
This was tested using clang 4 under Mountain Lion, but should
also work fine under Lion and possibly under Snow Leopard.
It requires the 'Xcode 4.x Command Line Tools' to be installed.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With this change the sources are fully endianness
independent, so we can simplify the Makefile.
Somewhat surprisingly we don't have any speed
regression!
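The message does not say how the sources were made endianness
independent. As a general illustration only, one common technique is
to assemble multi-byte values from individual bytes with shifts, so
the result does not depend on the host byte order. The name
read_big_endian is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decodes size bytes (at most 8), stored most significant byte first,
// into an integer. Only shifts and ORs on values are used, so the same
// code gives the same result on little- and big-endian hosts.
uint64_t read_big_endian(const unsigned char* p, std::size_t size) {
    assert(size <= 8);
    uint64_t v = 0;
    for (std::size_t i = 0; i < size; ++i)
        v = (v << 8) | p[i];
    return v;
}
```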
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With this we should completely remove the need
to install a third-party POSIX threads library
when compiling with mingw-gcc under Windows.
Spotted by Trung Tu.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems we need to pass the full optimization
flags to the linker otherwise we end up in a
slow compile:
http://lists.debian.org/debian-devel/2011/06/msg00181.html
Regression reported by Benigno Hernandez.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The previous line, LDFLAGS += $(CXXFLAGS), does not make sense, and
breaks profile-build, thus changing it into: LDFLAGS += -flto.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Now that we no longer support popcount detection
at runtime, this target is obsolete.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use $(CXX) instead of assuming the compiler name is 'gcc'.
Spotted by Louis Zulli.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use the starting thread to wait for GUI input and use
the other threads to search. The consequence is that think()
is now always started on a different thread from the caller, which
returns immediately to wait for input. This rework greatly
simplifies the code and is more in line with the common way
to implement this feature.
As a side effect, we no longer need Makefile tricks
with sleep() to allow profile builds.
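A minimal sketch of this threading layout, assuming C++11 std::thread
rather than whatever thread wrappers the engine actually uses. The
names stopRequested, uci_loop and run_script are hypothetical, and
think() here is only a busy loop standing in for the real search.

```cpp
#include <atomic>
#include <istream>
#include <sstream>
#include <string>
#include <thread>

std::atomic<bool> stopRequested{false};  // hypothetical stop flag

// Stand-in for the real search: counts until asked to stop.
long think() {
    long nodes = 0;
    while (!stopRequested.load())
        ++nodes;
    return nodes;
}

// The calling thread only reads commands; the search runs on a worker.
// Returns the node count of the last search, or -1 if none was started.
long uci_loop(std::istream& in) {
    long nodes = -1;
    std::thread worker;
    std::string cmd;
    while (std::getline(in, cmd)) {
        if (cmd == "go" && !worker.joinable()) {
            stopRequested = false;
            worker = std::thread([&nodes] { nodes = think(); });
        } else if (cmd == "stop" || cmd == "quit") {
            stopRequested = true;
            if (worker.joinable())
                worker.join();
            if (cmd == "quit")
                break;
        }
    }
    // End of input also ends the search, so the loop never hangs.
    stopRequested = true;
    if (worker.joinable())
        worker.join();
    return nodes;
}

// Convenience wrapper that feeds a fixed command script to the loop.
long run_script(const std::string& script) {
    std::istringstream in(script);
    return uci_loop(in);
}
```

Because the reader thread is never busy searching, a "stop" typed by
the GUI is seen as soon as std::getline() returns.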
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of binding link time optimization to the choice of
popcount support, do the right thing and add -flto option
when gcc 4.5 or later is detected.
Although it should also be supported under mingw, in practice it
doesn't work, at least on my 4.6.1, due to some known bugs.
Thanks to Mike for helping me with this patch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Oliver reports profile build errors with the new gcc 4.6. He says:
"We need to add -lgcov with profile-generate AND profile-use.
So it has to be added to the second stage of building too.
The problem occurred first with the introduction of gcc4.6 and
I think this is because the previous version did find the gcov
library automatically. gcc4.6 needs more precise options and
does less guesses. I have seen it in debian, Ubuntu and also with
mingw on Windows. And all use gcc4.6."
This patch fixes the issue.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After the async I/O patches 'bench' changed behaviour and now waits for
input at the end of the test run. This is because the listener thread
stays blocked on std::getline() even after the test run is finished; as
soon as we feed it something, the thread unblocks and then quickly exits.
This is not a big problem, but has the bad side effect of breaking
profile builds, which hang forever at the end of the test run.
The tricky workaround is to create a pipe that connects to stockfish
input and then, when the test run is finished, break the pipe: this
makes std::getline() return immediately.
So this patch adds a 'sleep 10' piped into the 'stockfish bench' test run
command. After 10 seconds sleep ends, the pipe breaks and 'bench'
finishes as usual.
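The mechanism the workaround relies on is that a reader blocked on a
pipe gets end-of-file once the write end is closed. A small POSIX
sketch of that behaviour, using read() on a raw descriptor instead of
std::getline(); the names drain and pipe_roundtrip are hypothetical.

```cpp
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <string>

// Reads from fd until end-of-file and returns everything that arrived.
std::string drain(int fd) {
    char buf[256];
    std::string out;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        out.append(buf, static_cast<std::size_t>(n));
    return out;
}

// Writes msg (assumed small enough to fit in the pipe buffer) into a
// fresh pipe, closes the write end, and reports what the reader saw.
bool pipe_roundtrip(const std::string& msg, std::string& seen) {
    assert(msg.size() <= 4096);
    int fds[2];
    if (pipe(fds) != 0)
        return false;
    bool ok = write(fds[1], msg.data(), msg.size())
              == static_cast<ssize_t>(msg.size());
    close(fds[1]);         // "breaking" the pipe: the reader now gets EOF
    seen = drain(fds[0]);  // returns instead of blocking forever
    close(fds[0]);
    return ok;
}
```

With an empty message the reader returns at once, which mirrors the
'sleep 10' case: sleep writes nothing and its exit closes the pipe.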
Thanks to Oliver Korff for reporting the issue, and to Mike Whiteley
for having co-authored this solution.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Justin reports that it breaks the compilation on Fedora 15 and as Tom says:
-static is only needed to work around the gcc on ubuntu 11.10 beta bug.
If -static introduces issues on its own then it is better to remove it.
It will not be needed in most environments.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>