This is a non-functional code style change.
If we add an accessor function for SquareBB we can consolidate all of the asserts. This is also a bit cleaner because all SquareBB accesses go through this method making future changes easier to manage.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 63406 W: 14084 L: 14045 D: 35277
http://tests.stockfishchess.org/tests/view/5c9ea6100ebc5925cfffc9af
No functional change.
This is a non-functional simplification / code-style change.
This popcount16 method does nothing but initialize the PopCnt16 arrays.
This can be done in a single bitset line, which is less lines and more clear. Performance for this code is moot.
No functional change.
This is non-functional. These 5 arrays are simple enough to calculate real-time and maintaining an array for them does not help. Decreases the memory footprint.
This seems a tiny bit slower on my machine, but passed STC well enough. Could someone verify speed?
STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 44745 W: 9780 L: 9704 D: 25261
http://tests.stockfishchess.org/tests/view/5c47aa2d0ebc5902bca13fc4
The slowdown is minimal even in 32 bit case (thanks to @mstembera for testing):
Compiled using make build ARCH=x86-32 CXX=i686-w64-mingw32-c++ and benched
This patch only:
```
Results for 40 tests for each version:
Base Test Diff
Mean 1455204 1450033 5171
StDev 49452 34533 59621
p-value: 0.465
speedup: -0.004
```
No functional change.
This patch saves (4-1) * 64 * 64 = 12KiB of cache.
STC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 176120 W: 38944 L: 38087 D: 99089
http://tests.stockfishchess.org/tests/view/5c4c9f840ebc593af5d4a7ce
LTC
As a pure speed up, I've been informed it should not require LTC.
No functional change
This is a non-functional simplification that removes the AdjacentFiles array.
This array is simple enough to calculate that the pre-calculated array provides
no benefit. Reduces the memory footprint.
STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74839 W: 16390 L: 16373 D: 42076
http://tests.stockfishchess.org/tests/view/5c3d75920ebc596a450cfb67
No functionnal change
Preparation commit for the upcoming Stockfish 10 version, giving a chance to catch last minute feature bugs and evaluation regression during the one-week code freeze period. Also changing the copyright dates to include 2019.
No functional change
This is a non-functional change. By pre-incrementing minKingPawnDistance
instead of post-incrementing, we can remove this -1.
This also makes DistanceRing more consistent with the rest of stockfish
since it now holds an actual "distance" instead of a less natural distance-1.
In current master, PseudoAttacks[KING][ksq] == DistanceRingBB[ksq][0]
With this patch, it will be PseudoAttacks[KING][ksq] == DistanceRingBB[ksq][1]
ie squares at distance 1 from the king. This is more natural use of distance.
The current array size DistanceRingBB[SQUARE_NB][8] is still OK with the new
definition, because maximum distance between two squares on a chess board is
seven (for example Kh1 and a8).
No functional change.
In current master, the function make_bitboard() does nothing apart from
helping initialize the SquareBB[] array. This seems like an unnecessary
abstraction layer.
The advantage of make_bitboard() is we can define a bitboard, in a simple
and general way, not only from a single square but also from a list of
squares. It is more elegant, faster and readable than combining multiple
SquareBB explicitly, but the last complex use case in evaluation was
simplified away a few months ago.
If make_bitboard() becomes useful again to define complicated bitboards,
it will be easy enough to reintroduce it using this pull request as
an implementation reference.
No functional change.
1) Use make_bitboard() in Bitboards::init()
2) Fix MSVC warning: search.h(85): warning C4244: '=': conversion from
'TimePoint' to 'int', possible loss of data.
Closes https://github.com/official-stockfish/Stockfish/pull/1524
No functional change.
The NO_BSF does not cover any real life use-case today. The only compilers that
can compile SF today, with the current Makefile and no source code changes, are
either GCC compatible (define __GNUC__) or MSVC compatible (define _MSC_VER). So
they all support LSB/MSB intrinsics.
This patch simplifies away the software fall-backs of LSB/MSB that were still
in Stockfish code, but unused in any of the officially supported compilers.
Note the (legacy) MSVC/WIN32 case, where we use a 32-bit BSF/BSR solution, as
64-bit intrinsics aren't available there.
Discussed in: https://github.com/official-stockfish/Stockfish/pull/1447
and: https://github.com/official-stockfish/Stockfish/pull/1479
No functional change.
Currently the NORTH/WEST/SOUTH/EAST values are of type Square, but conceptually they are not squares but directions. This patch separates these values into a Direction enum and overloads addition and subtraction to allow adding a Square to a Direction (to get a new Square).
I have also slightly trimmed the possible overloadings to improve type safety. For example, it would normally not make sense to add a Color to a Color or a Piece to a Piece, or to multiply or divide them by an integer. It would also normally not make sense to add a Square to a Square.
This is a non-functional change.
Make magic_index() a member of Magic since it uses all it's members
and keep us from having to pass the function pointer around to
init_magics().
No functional change
Closes#1146
Gather all magic relevant data into a struct.
This changes memory layout putting everything necessary for processing a single square
in the same memory location thus speeding up access.
Original patch by @snicolet
No functional change.
Closes#1127Closes#1128
StepAttacks[] is misdesigned, the color dependance is specific
to pawns, and trying to generalise to king and knights, proves
neither useful nor convinient in practice.
So this patch reformats the code with the following changes:
- Use PieceType instead of Piece in attacks_() functions
- Use PseudoAttacks for KING and KNIGHT
- Rename StepAttacks[] into PawnAttacks[]
Original patch and idea from Alain Savard.
No functional change.
Closes#1086
Rename shift_bb() to shift(), and DELTA_S to SOUTH, etc.
to improve code readability, especially in evaluate.cpp
when they are used together:
old b = shift_bb<DELTA_S>(pos.pieces(PAWN))
new b = shift<SOUTH>(pos.pieces(PAWN))
While there fix some small code style issues.
No functional change.
Also a speedup(about 1%) on 64-bit w/o hardware popcnt
Retire Max15 and Full template parameters
(Contributed by Marco Costalba)
Now that we have just SW and HW versions, use
template default parameter to get rid of explicit
template parameters.
Retire bitcount.h and move the only defined
function to bitboard.h
No functional change
Resolves#620
lsb(b) and msb(b) are undefined when b == 0. This can lead to subtle bugs, where
the resulting code behaves differently on different configurations:
- It can be the home grown software LSB/MSB
- It can be the compiler generated software LSB/MSB (when using compiler
intrinsics without the right compiler flags to allow compiler to use hardware
LSB/MSB). Which of course depends on the compiler.
- It can be hardware LSB/MSB generated by the compiler.
- Not to mention that hardware LSB/MSB can return different value on different
hardware when b == 0.
No functional change
Resolves#610
Use compiler intrinsics when possible to
avoid writing platform specific asm code.
Tested on Windows 7 with MSVC 2013 and mingw 4.8.3 (32 and 64 bit)
and on Linux Mint with g++ 4.8.4 and clang 3.4 (32 and 64 bit).
No functional change
Resolves#609
Under Windows with MSVC we use the IDE to compile,
in this case we infer some compiler flags usually
set by Makefile.
The condition to check this was wrong, namely when compiling
with mingw under Windows 64 bit we always set IS_64BIT and
USE_BSFQ even if compiled with ARCH=x86-32 (this is how I
found it).
Small code style touches while there.
No functional change.
Profiling shows that resetting attacks table after
a failed candidate magic attempt is the biggest
time consumer, so rewrite the logic avoiding the
memset()
Magics init for rook+bishop goes from 200msecs to
under 100msec.
No functional change.
This intrinsic to call BMI2 PEXT instruction is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.
But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.
The correct solution is of not using _pext_u64 directly.
This patch fixes a compile error with some mingw64
package when compiling with x86-64.
No functional change.
It is up to material (and pawn) table look up
code to know where the per-thread tables are,
so change API to reflect this.
Also some comment fixing while there
No functional change.
This patch replaces RKISS by a simpler and faster PRNG, xorshift64* proposed
by S. Vigna (2014). It is extremely simple, has a large enough period for
Stockfish's needs (2^64), requires no warming-up (allowing such code to be
removed), and offers slightly better randomness than MT19937.
Paper: http://xorshift.di.unimi.it/
Reference source code (public domain):
http://xorshift.di.unimi.it/xorshift64star.c
The patch also simplifies how init_magics() searches for magics:
- Old logic: seed the PRNG always with the same seed,
then use optimized bit rotations to tailor the RNG sequence per rank.
- New logic: seed the PRNG with an optimized seed per rank.
This has two advantages:
1. Less code and less computation to perform during magics search (not ROTL).
2. More choices for random sequence tuning. The old logic only let us choose
from 4096 bit rotation pairs. With the new one, we can look for the best seeds
among 2^64 values. Indeed, the set of seeds[][] provided in the patch reduces
the effort needed to find the magics:
64-bit SF:
Old logic -> 5,783,789 rand64() calls needed to find the magics
New logic -> 4,420,086 calls
32-bit SF:
Old logic -> 2,175,518 calls
New logic -> 1,895,955 calls
In the 64-bit case, init_magics() take 25 ms less to complete (Intel Core i5).
Finally, when playing with strength handicap, non-determinism is achieved
by setting the seed of the static RNG only once. Afterwards, there is no need
to skip output values.
The bench only changes because the Zobrist keys are now different (since they
are random numbers straight out of the PRNG).
The RNG seed has been carefully chosen so that the
resulting Zobrist keys are particularly well-behaved:
1. All triplets of XORed keys are unique, implying that it
would take at least 7 keys to find a 64-bit collision
(test suggested by ceebo)
2. All pairs of XORed keys are unique modulo 2^32
3. The cardinality of { (key1 ^ key2) >> 48 } is as close
as possible to the maximum (65536)
Point 2 aims at ensuring a good distribution among the bits
that determine an TT entry's cluster, likewise point 3
among the bits that form the TT entry's key16 inside a
cluster.
Details:
Bitset card(key1^key2)
------ ---------------
RKISS
key16 64894 = 99.020% of theoretical maximum
low18 180117 = 99.293%
low32 305362 = 99.997%
Xorshift64*, old seed
key16 64918 = 99.057%
low18 179994 = 99.225%
low32 305350 = 99.993%
Xorshift64*, new seed
key16 65027 = 99.223%
low18 181118 = 99.845%
low32 305371 = 100.000%
Bench: 9324905
Resolves#148
Bitboard init code is already noteasy to follow,
so don't make it even harder using 'smart' code.
Also reindent a while loop in standard way.
No functional change.
Retire software pext and introduce hardware
call when USE_PEXT is defined during compilation.
This is a full complete implementation of sliding
attacks using PEXT.
No functional change.