Get rid of macros and use templates instead,
this is safer and allows us fix the warning:
ISO C++ forbids braced-groups within expressions
That broke compilation with -pedantic flag under
gcc and POPCNT enabled.
No functional and no performance change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also don't use __cpuid() intrinsic for Intel under
Linux because gives wrong results when detecting HT,
use the gcc version instead. Finally clean up the code.
Error was due to changed __cpuid() signature for
gcc compiler.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This will allow to use the function also for other
purposes then detecting POPCNT.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Intrinsic __popcnt64() returns an unsigned __int64, cast
to an integer and silence the warning.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With new target 'make gcc-popcnt' it is now
possible to compile with enabled hardware POPCNT
support also with gcc. Until now was possible only
for Intel and MSVC compilers.
When this instruction is supported by CPU, for instance
on Intel i7 or i5 family, produced binary is a bit faster.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also remove some fallback templates that prevent a
compile error in case the user runs 'make icc-profile-popcnt'
from a non supported machine.
We want to loudly fail in that case instead of silently
fallback in a non-popcount compilation.
Updated documentation too.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Predefined macro __INTEL_COMPILER is defined only for Intel,
while _MSC_VER is defined for both Intel C++ and MSVC.
So rearrange ifdefs to take in account this and test __INTEL_COMPILER
first and only if not defined check _MSC_VER for MSVC.
Patch suggested by Joona.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This is mainly intended to allow 64 bit compiles on any
system and avoid to crash when the binary, compiled on a
box where POPCNT is not supported, is run on a Core i7
system or similar CPU.
What could happen is that when compiled in a standard 64 bit
system, because the correct headers for the POPCNT intrinsic
are not found, the compiler creates dummy bit count functions
instead, these are never called at runtime on the machine where
Stockfish has been compiled. But if we run the same binary on a
Core i7 system, because POPCNT is detected at run time, the dummy
bitcount functions will be called giving false results that will
crash the application.
Note that would be possible to fallback on software bit count in
these cases, but this is even more subtle because POPCNT path is not
optimized so that we have an application working but at sub-optimal
speed, so better to crash, at least user is loudly warned that there
is something wrong.
If, instead, Stockfish is compiled on a Core i7 system with POPCNT
enabled, then if the PGO compile has been done properly, the same binary
will run at optimal speed _both_ on the Core i7 machine and on any other
64 bit standard machine. This is the ideal mode for binary distribution.
Finally this patch disables bsfq support under Windows, because it seems
inline assembly is not supported both by MSVC and by Intel Windows version.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Also rename DISABLE_POPCNT_SUPPORT in NO_POPCNT and simplify a bit
the macro logic.
Always define a __popcnt64()or _mm_popcnt_u64() template, if the proper
function with the same name is defined in the intrinsics header, then it
will be choosen as first otherwise we fall back on the dummy template
that is never called at runtime anyway because cpu_has_popcnt() returns
false.
This fixes the compile error reported by Jim.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Was erroneusly changed with the 32bit in recent
patch "Retire USE_COMPACT_ROOK_ATTACKS...".
Also another clean up of define magics. Move compiler
specific definitions in types.h and remove redundant cruft.
Now this macro ugly mess seems more reasonable.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It shows almost no improvment and adds a good
bunch of complexity.
So remove for now. No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We don't need different names between a function and a
template. Compiler will know when use one or the other.
This let use restore original count_1s_xx() names instead of
sw_count_1s_xxx so to simplify a bit the code.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
With this patch at the applications startup a line is printed
with info about use of optimized 64 bit routines and hardware
POPCNT.
Also allow the possibility to disable POPCNT support during
PGO compiles to exercise the fallback software only path.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Runtime detect POPCNT instruction support and
use it.
Also if POPCNT is not supported we don't add _any_ overhead so
that we don't lose any speed in standard case.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>