1
0
Fork 0
mirror of https://github.com/sockspls/badfish synced 2025-04-30 16:53:09 +00:00
Commit graph

175 commits

Author SHA1 Message Date
Tomasz Sobczyk
a169c78b6d Improve performance on NUMA systems
Allow for NUMA memory replication for NNUE weights.  Bind threads to ensure execution on a specific NUMA node.

This patch introduces NUMA memory replication, currently only utilized for the NNUE weights. Along with it comes all machinery required to identify NUMA nodes and bind threads to specific processors/nodes. It also comes with small changes to Thread and ThreadPool to allow easier execution of custom functions on the designated thread. Old thread binding (WinProcGroup) machinery is removed because it's incompatible with this patch. Small changes to unrelated parts of the code were made to ensure correctness, like some classes being made unmovable, raw pointers replaced with unique_ptr. etc.

Windows 7 and Windows 10 is partially supported. Windows 11 is fully supported. Linux is fully supported, with explicit exclusion of Android. No additional dependencies.

-----------------

A new UCI option `NumaPolicy` is introduced. It can take the following values:
```
system - gathers NUMA node information from the system (lscpu or windows api), for each threads binds it to a single NUMA node
none - assumes there is 1 NUMA node, never binds threads
auto - this is the default value, depends on the number of set threads and NUMA nodes, will only enable binding on multinode systems and when the number of threads reaches a threshold (dependent on node size and count)
[[custom]] -
  // ':'-separated numa nodes
  // ','-separated cpu indices
  // supports "first-last" range syntax for cpu indices,
  for example '0-15,32-47:16-31,48-63'
```

Setting `NumaPolicy` forces recreation of the threads in the ThreadPool, which in turn forces the recreation of the TT.

The threads are distributed among NUMA nodes in a round-robin fashion based on fill percentage (i.e. it will strive to fill all NUMA nodes evenly). Threads are bound to NUMA nodes, not specific processors, because that's our only requirement and the OS can schedule them better.

Special care is made that maximum memory usage on systems that do not require memory replication stays as previously, that is, unnecessary copies are avoided.

On linux the process' processor affinity is respected. This means that if you for example use taskset to restrict Stockfish to a single NUMA node then the `system` and `auto` settings will only see a single NUMA node (more precisely, the processors included in the current affinity mask) and act accordingly.

-----------------

We can't ensure that a memory allocation takes place on a given NUMA node without using libnuma on linux, or using appropriate custom allocators on windows (https://learn.microsoft.com/en-us/windows/win32/memory/allocating-memory-from-a-numa-node), so to avoid complications the current implementation relies on first-touch policy. Due to this we also rely on the memory allocator to give us a new chunk of untouched memory from the system. This appears to work reliably on linux, but results may vary.

MacOS is not supported, because AFAIK it's not affected, and implementation would be problematic anyway.

Windows is supported since Windows 7 (https://learn.microsoft.com/en-us/windows/win32/api/processtopologyapi/nf-processtopologyapi-setthreadgroupaffinity). Until Windows 11/Server 2022 NUMA nodes are split such that they cannot span processor groups. This is because before Windows 11/Server 2022 it's not possible to set thread affinity spanning processor groups. The splitting is done manually in some cases (required after Windows 10 Build 20348). Since Windows 11/Server 2022 we can set affinites spanning processor group so this splitting is not done, so the behaviour is pretty much like on linux.

Linux is supported, **without** libnuma requirement. `lscpu` is expected.

-----------------

Passed 60+1 @ 256t 16000MB hash: https://tests.stockfishchess.org/tests/view/6654e443a86388d5e27db0d8
```
LLR: 2.95 (-2.94,2.94) <0.00,10.00>
Total: 278 W: 110 L: 29 D: 139
Ptnml(0-2): 0, 1, 56, 82, 0
```

Passed SMP STC: https://tests.stockfishchess.org/tests/view/6654fc74a86388d5e27db1cd
```
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 67152 W: 17354 L: 17177 D: 32621
Ptnml(0-2): 64, 7428, 18408, 7619, 57
```

Passed STC: https://tests.stockfishchess.org/tests/view/6654fb27a86388d5e27db15c
```
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 131648 W: 34155 L: 34045 D: 63448
Ptnml(0-2): 426, 13878, 37096, 14008, 416
```

fixes #5253
closes https://github.com/official-stockfish/Stockfish/pull/5285

No functional change
2024-05-28 18:34:15 +02:00
cj5716
c6a1e7fd42 Optimise pairwise multiplication
This speedup was first inspired by a comment by @AndyGrant on my recent
PR "If mullo_epi16 would preserve the signedness, then this could be
used to remove 50% of the max operations during the halfkp-pairwise
mat-mul relu deal."

That got me thinking, because although mullo_epi16 did not preserve the
signedness, mulhi_epi16 did, and so we could shift left and then use
mulhi_epi16, instead of shifting right after the mullo.

However, due to some issues with shifting into the sign bit, the FT
weights and biases had to be multiplied by 2 for the optimisation to
work.

Speedup on "Arch=x86-64-bmi2 COMP=clang", courtesy of @Torom
Result of 50 runs
base (...es/stockfish) =     962946  +/- 1202
test (...ise-max-less) =     979696  +/- 1084
diff                   =     +16750  +/- 1794

speedup        = +0.0174
P(speedup > 0) =  1.0000

CPU: 4 x Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Hyperthreading: on

Also a speedup on "COMP=gcc", courtesy of Torom once again
Result of 50 runs
base (...tockfish_gcc) =     966033  +/- 1574
test (...max-less_gcc) =     983319  +/- 1513
diff                   =     +17286  +/- 2515

speedup        = +0.0179
P(speedup > 0) =  1.0000

CPU: 4 x Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Hyperthreading: on

Passed STC:
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 67712 W: 17715 L: 17358 D: 32639
Ptnml(0-2): 225, 7472, 18140, 7759, 260
https://tests.stockfishchess.org/tests/view/664c1d75830eb9f886616906

closes https://github.com/official-stockfish/Stockfish/pull/5282

No functional change
2024-05-23 21:37:46 +02:00
Michael Chaly
0c797367a3 Update correction history in case of successful null move pruning
Since null move pruning uses the same position it makes some sense to try to update correction history there in case of fail high.
Update value is 4 times less than normal update.

Passed STC:
https://tests.stockfishchess.org/tests/view/664a011cae57c1758ac5b4dd
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 419360 W: 108390 L: 107505 D: 203465
Ptnml(0-2): 1416, 49603, 106724, 50554, 1383

Passed LTC:
https://tests.stockfishchess.org/tests/view/664a53d95fc7b70b8817c65b
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 193518 W: 49076 L: 48434 D: 96008
Ptnml(0-2): 89, 21335, 53263, 21989, 83

closes https://github.com/official-stockfish/Stockfish/pull/5272

bench 1301487
2024-05-21 08:17:20 +02:00
cj5716
27eb49a221 Simplify ClippedReLU
Removes some max calls

Some speedup stats, courtesy of @AndyGrant (albeit measured in an alternate implementation)
Dev  749240 nps
Base 748495 nps
Gain 0.100%
289936 games

STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 203040 W: 52213 L: 52179 D: 98648
Ptnml(0-2): 480, 20722, 59139, 20642, 537
https://tests.stockfishchess.org/tests/view/664805fe6dcff0d1d6b05f2c

closes #5261

No functional change
2024-05-21 07:58:16 +02:00
Linmiao Xu
d92d1f3180 Move smallnet threshold logic into a function
Now that the smallnet threshold is no longer a constant,
use a function to organize it with other eval code.

Passed non-regression STC:
https://tests.stockfishchess.org/tests/view/66459fa093ce6da3e93b5ba2
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 217600 W: 56281 L: 56260 D: 105059
Ptnml(0-2): 756, 23787, 59729, 23736, 792

closes https://github.com/official-stockfish/Stockfish/pull/5255

No functional change
2024-05-18 09:21:00 +02:00
Linmiao Xu
47597641dc Lower smallnet threshold linearly as pawn count decreases
Passed STC:
https://tests.stockfishchess.org/tests/view/6644f677324e96f42f89d894
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 377920 W: 97135 L: 96322 D: 184463
Ptnml(0-2): 1044, 44259, 97588, 44978, 1091

Passed LTC:
https://tests.stockfishchess.org/tests/view/664548af93ce6da3e93b31b3
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 169056 W: 42901 L: 42312 D: 83843
Ptnml(0-2): 58, 18538, 46753, 19115, 64

closes https://github.com/official-stockfish/Stockfish/pull/5252

Bench: 1991750
2024-05-16 14:19:28 +02:00
xoto10
2682c2127d Use 5% less time on first move
Stockfish appears to take too much time on the first move of a game and
then not enough on moves 2,3,4... Probably caused by most of the factors
that increase time usually applying on the first move.

Attempts to give more time to the subsequent moves have not worked so
far, but this change to simply reduce first move time by 5% worked.

STC 10+0.1 :
LLR: 2.96 (-2.94,2.94) <0.00,2.00>
Total: 78496 W: 20516 L: 20135 D: 37845
Ptnml(0-2): 340, 8859, 20456, 9266, 327
https://tests.stockfishchess.org/tests/view/663d47bf507ebe1c0e9200ba

LTC 60+0.6 :
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 94872 W: 24179 L: 23751 D: 46942
Ptnml(0-2): 61, 9743, 27405, 10161, 66
https://tests.stockfishchess.org/tests/view/663e779cbb28828150dd9089

closes https://github.com/official-stockfish/Stockfish/pull/5235

Bench: 1876282
2024-05-15 16:09:30 +02:00
mstembera
e608eab8dd Optimize update_accumulator_refresh_cache()
STC https://tests.stockfishchess.org/tests/view/664105df26ac5f9b286d30e6
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 178528 W: 46235 L: 45750 D: 86543
Ptnml(0-2): 505, 17792, 52142, 18363, 462

Combo of two yellow speedups
https://tests.stockfishchess.org/tests/view/6640abf9d163897c63214f5c
LLR: -2.93 (-2.94,2.94) <0.00,2.00>
Total: 355744 W: 91714 L: 91470 D: 172560
Ptnml(0-2): 913, 36233, 103384, 36381, 961

https://tests.stockfishchess.org/tests/view/6628ce073fe04ce4cefc739c
LLR: -2.93 (-2.94,2.94) <0.00,2.00>
Total: 627040 W: 162001 L: 161339 D: 303700
Ptnml(0-2): 2268, 72379, 163532, 73105, 2236

closes https://github.com/official-stockfish/Stockfish/pull/5239

No functional change
2024-05-13 07:32:32 +02:00
cj5716
61f12a4c38 Simplify accumulator refreshes
Passed Non-Regression STC:
https://tests.stockfishchess.org/tests/view/6631f5d5d01fb9ac9bcdc7d0
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 57472 W: 14979 L: 14784 D: 27709
Ptnml(0-2): 185, 6486, 15192, 6695, 178

closes https://github.com/official-stockfish/Stockfish/pull/5207

No functional change
2024-05-05 15:11:37 +02:00
cj5716
8ee9905d8b Remove PSQT-only mode
Passed STC:
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 94208 W: 24270 L: 24112 D: 45826
Ptnml(0-2): 286, 11186, 24009, 11330, 293
https://tests.stockfishchess.org/tests/view/6635ddd773559a8aa8582826

Passed LTC:
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 114960 W: 29107 L: 28982 D: 56871
Ptnml(0-2): 37, 12683, 31924, 12790, 46
https://tests.stockfishchess.org/tests/view/663604a973559a8aa85881ed

closes #5214

Bench 1653939
2024-05-05 12:36:20 +02:00
mstembera
be142337d8 Accumulator cache bugfix and cleanup
STC:
https://tests.stockfishchess.org/tests/view/663068913a05f1bf7a511dc2
LLR: 2.98 (-2.94,2.94) <-1.75,0.25>
Total: 70304 W: 18211 L: 18026 D: 34067
Ptnml(0-2): 232, 7966, 18582, 8129, 243

1) Fixes a bug introduced in
   https://github.com/official-stockfish/Stockfish/pull/5194. Only one
   psqtOnly flag was used for two perspectives which was causing
   wrong entries to be cleared and marked.
2) The finny caches should be cleared like histories and not at the
   start of every search.

closes https://github.com/official-stockfish/Stockfish/pull/5203

No functional change
2024-05-01 14:17:32 +02:00
cj5716
6a9b8a0c7b Optimise NNUE Accumulator updates
Passed STC:
https://tests.stockfishchess.org/tests/view/662e3c6a5e9274400985a741
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 86176 W: 22284 L: 21905 D: 41987
Ptnml(0-2): 254, 9572, 23051, 9963, 248

closes https://github.com/official-stockfish/Stockfish/pull/5202

No functional change
2024-05-01 14:10:57 +02:00
mstembera
a129c0695b Combine remove and add in update_accumulator_refresh_cache()
Combine remove and add in update_accumulator_refresh_cache().
Move remove before add to match other parts of the code.

STC:
https://tests.stockfishchess.org/tests/view/662d96dc6115ff6764c7f4ca
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 364032 W: 94421 L: 93624 D: 175987
Ptnml(0-2): 1261, 41983, 94811, 42620, 1341

closes https://github.com/official-stockfish/Stockfish/pull/5194

Bench: 1836777
2024-04-28 21:35:48 +02:00
mstembera
940a3a7383 Cache small net w/ psqtOnly support
Caching the small net in the same way as the big net allows them to
share the same code path and completely removes
update_accumulator_refresh().

STC:
https://tests.stockfishchess.org/tests/view/662bfb5ed46f72253dcfed85
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 151712 W: 39252 L: 39158 D: 73302
Ptnml(0-2): 565, 17474, 39683, 17570, 564

closes https://github.com/official-stockfish/Stockfish/pull/5194

Bench: 1836777
2024-04-28 21:30:19 +02:00
Joost VandeVondele
bc45cbc820 Output some basic info about the used networks
Adds size in memory as well as layer sizes as in

info string NNUE evaluation using nn-ae6a388e4a1a.nnue (132MiB, (22528, 3072, 15, 32, 1))
info string NNUE evaluation using nn-baff1ede1f90.nnue (6MiB, (22528, 128, 15, 32, 1))

For example, the size in MiB is useful to keep the fishtest memory sizes up-to-date,
the L1-L3 sizes give a useful hint about the architecture used.

closes https://github.com/official-stockfish/Stockfish/pull/5193

No functional change
2024-04-28 21:27:28 +02:00
Disservin
3502c8ae42 Fix missing initialization of AccumulatorCaches in Eval::trace
Add a constructor to `AccumulatorCaches` instead of just calling
`clear(networks)` to prevent similar issues from appearing in the
future.

fixes https://github.com/official-stockfish/Stockfish/issues/5190

closes https://github.com/official-stockfish/Stockfish/pull/5191

No functional change
2024-04-28 21:26:36 +02:00
gab8192
49ef4c935a Implement accumulator refresh table
For each thread persist an accumulator cache for the network, where each
cache contains multiple entries for each of the possible king squares.
When the accumulator needs to be refreshed, the cached entry is used to more
efficiently update the accumulator, instead of rebuilding it from scratch.
This idea, was first described by Luecx (author of Koivisto) and
is commonly referred to as "Finny Tables".

When the accumulator needs to be refreshed, instead of filling it with
biases and adding every piece from scratch, we...

1. Take the `AccumulatorRefreshEntry` associated with the new king bucket
2. Calculate the features to activate and deactivate (from differences
   between bitboards in the entry and bitboards of the actual position)
3. Apply the updates on the refresh entry
4. Copy the content of the refresh entry accumulator to the accumulator
   we were refreshing
5. Copy the bitboards from the position to the refresh entry, to match
   the newly updated accumulator

Results at STC:
https://tests.stockfishchess.org/tests/view/662301573fe04ce4cefc1386
(first version)
https://tests.stockfishchess.org/tests/view/6627fa063fe04ce4cefc6560
(final)

Non-Regression between first and final:
https://tests.stockfishchess.org/tests/view/662801e33fe04ce4cefc660a

STC SMP:
https://tests.stockfishchess.org/tests/view/662808133fe04ce4cefc667c

closes https://github.com/official-stockfish/Stockfish/pull/5183

No functional change
2024-04-24 18:38:20 +02:00
Gahtan Nahdi
d0e72c19fa fix clang compiler warning for avx512 build
Initialize variable in constexpr function to get rid of clang compiler warning for avx512 build.

closes https://github.com/official-stockfish/Stockfish/pull/5176

Non-functional change
2024-04-21 14:38:16 +02:00
mstembera
94484db6e8 Avoid permuting inputs during transform()
Avoid permuting inputs during transform() and instead do it once at load time.
Affects AVX2 and newer Intel architectures only.

https://tests.stockfishchess.org/tests/view/661306613eb00c8ccc0033c7
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 108480 W: 28319 L: 27898 D: 52263
Ptnml(0-2): 436, 12259, 28438, 12662, 445

speedups measured such as e.g.

```
Result of 100 runs
==================
base (./stockfish.master       ) =    1241128  +/- 3757
test (./stockfish.patch        ) =    1247713  +/- 3689
diff                             =      +6585  +/- 2583

speedup        = +0.0053
P(speedup > 0) =  1.0000
```

closes https://github.com/official-stockfish/Stockfish/pull/5160

No functional change
2024-04-11 22:38:38 +02:00
Disservin
299707d2c2 Split UCI into UCIEngine and Engine
This is another refactor which aims to decouple uci from stockfish. A new engine
class manages all engine related logic and uci is a "small" wrapper around it.

In the future we should also try to remove the need for the Position object in
the uci and replace the options with an actual options struct instead of using a
map. Also convert the std::string's in the Info structs a string_view.

closes #5147

No functional change
2024-04-04 00:15:17 +02:00
Viren6
0716b845fd Update NNUE architecture to SFNNv9 and net nn-ae6a388e4a1a.nnue
Part 1: PyTorch Training, linrock

Trained with a 10-stage sequence from scratch, starting in May 2023:
https://github.com/linrock/nnue-tools/blob/master/exp-sequences/3072-10stage-SFNNv9.yml

While the training methods were similar to the L1-2560 training sequence,
the last two stages introduced min-v2 binpacks,
where bestmove capture and in-check position scores were not zeroed during minimization,
for compatibility with skipping SEE >= 0 positions and future research.

Training data can be found at:
https://robotmoon.com/nnue-training-data

This net was tested at epoch 679 of the 10th training stage:
https://tests.stockfishchess.org/tests/view/65f32e460ec64f0526c48dbc

Part 2: SPSA Training, Viren6

The net was then SPSA tuned.
This consisted of the output weights (32 * 8) and biases (8)
as well as the L3 biases (32 * 8) and L2 biases (16 * 8), totalling 648 params in total.

The SPSA tune can be found here:
https://tests.stockfishchess.org/tests/view/65fc33ba0ec64f0526c512e3

With the help of Disservin , the initial weights were extracted with:
https://github.com/Viren6/Stockfish/tree/new228

The net was saved with the tuned weights using:
https://github.com/Viren6/Stockfish/tree/new241

Earlier nets of the SPSA failed STC compared to the base 3072 net of part 1:
https://tests.stockfishchess.org/tests/view/65ff356e0ec64f0526c53c98
Therefore it is suspected that the SPSA at VVLTC has
added extra scaling on top of the scaling of increasing the L1 size.

Passed VVLTC 1:
https://tests.stockfishchess.org/tests/view/6604a9020ec64f0526c583da
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 53042 W: 13554 L: 13256 D: 26232
Ptnml(0-2): 12, 5147, 15903, 5449, 10

Passed VVLTC 2:
https://tests.stockfishchess.org/tests/view/660ad1b60ec64f0526c5dd23
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 17506 W: 4574 L: 4315 D: 8617
Ptnml(0-2): 1, 1567, 5362, 1818, 5

STC Elo estimate:
https://tests.stockfishchess.org/tests/view/660b834d01aaec5069f87cb0
Elo: -7.66 ± 3.8 (95%) LOS: 0.0%
Total: 9618 W: 2440 L: 2652 D: 4526
Ptnml(0-2): 80, 1281, 2261, 1145, 42
nElo: -13.94 ± 6.9 (95%) PairsRatio: 0.87

closes https://tests.stockfishchess.org/tests/view/660b834d01aaec5069f87cb0

bench 1823302

Co-Authored-By: Linmiao Xu <lin@robotmoon.com>
2024-04-02 08:49:48 +02:00
mstembera
5001d49f42 Update nnue_feature_transformer.h
Unroll update_accumulator_refresh to process two
active indices simultaneously.

The compiler might not unroll effectively because
the number of active indices isn't known at
compile time.

STC https://tests.stockfishchess.org/tests/view/65faa8850ec64f0526c4fca9
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 130464 W: 33882 L: 33431 D: 63151
Ptnml(0-2): 539, 14591, 34501, 15082, 519

closes https://github.com/official-stockfish/Stockfish/pull/5125

No functional change
2024-03-26 18:06:49 +01:00
Robert Nurnberg
9b92ada935 Base WDL model on material count and normalize evals dynamically
This PR proposes to change the parameter dependence of Stockfish's
internal WDL model from full move counter to material count. In addition
it ensures that an evaluation of 100 centipawns always corresponds to a
50% win probability at fishtest LTC, whereas for master this holds only
at move number 32. See also
https://github.com/official-stockfish/Stockfish/pull/4920 and the
discussion therein.

The new model was fitted based on about 340M positions extracted from
5.6M fishtest LTC games from the last three weeks, involving SF versions
from e67cc979fd (SF 16.1) to current
master.

The involved commands are for
[WDL_model](https://github.com/official-stockfish/WDL_model) are:
```
./updateWDL.sh --firstrev e67cc979fd
python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability
```

The anchor `58` for the material count value was chosen to be as close
as possible to the observed average material count of fishtest LTC games
at move 32 (`43`), while not changing the value of
`NormalizeToPawnValue` compared to the move-based WDL model by more than
1.

The patch only affects the displayed cp and wdl values.

closes https://github.com/official-stockfish/Stockfish/pull/5121

No functional change
2024-03-20 16:29:35 +01:00
Disservin
134e6d7bb4 Consistent use of anonymous namespace
Also change `bindThisThread` to match the current code style for function naming.

closes https://github.com/official-stockfish/Stockfish/pull/5118

No functional change
2024-03-20 16:15:37 +01:00
Disservin
55df0ee009 Fix Raspberry Pi Compilation
Reported by @Torom over discord.

> dev build fails on Raspberry Pi 5 with clang

```
clang++ -o stockfish benchmark.o bitboard.o evaluate.o main.o misc.o movegen.o movepick.o position.o search.o thread.o timeman.o tt.o uci.o ucioption.o tune.o tbprobe.o nnue_misc.o half_ka_v2_hm.o network.o  -fprofile-instr-generate -latomic -lpthread  -Wall -Wcast-qual -fno-exceptions -std=c++17 -fprofile-instr-generate  -pedantic -Wextra -Wshadow -Wmissing-prototypes -Wconditional-uninitialized -DUSE_PTHREADS -DNDEBUG -O3 -funroll-loops -DIS_64BIT -DUSE_POPCNT -DUSE_NEON=8 -march=armv8.2-a+dotprod -DUSE_NEON_DOTPROD -DGIT_SHA=627974c9 -DGIT_DATE=20240312 -DARCH=armv8-dotprod -flto=full
/tmp/lto-llvm-e9300e.o: in function `_GLOBAL__sub_I_network.cpp':
ld-temp.o:(.text.startup+0x704c): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `gEmbeddedNNUEBigEnd' defined in .rodata section in /tmp/lto-llvm-e9300e.o
/usr/bin/ld: ld-temp.o:(.text.startup+0x704c): warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
ld-temp.o:(.text.startup+0x7068): relocation truncated to fit: R_AARCH64_LDST64_ABS_LO12_NC against symbol `gEmbeddedNNUESmallEnd' defined in .rodata section in /tmp/lto-llvm-e9300e.o
/usr/bin/ld: ld-temp.o:(.text.startup+0x7068): warning: one possible cause of this error is that the symbol is being referenced in the indicated code as if it had a larger alignment than was declared where it was defined
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [Makefile:1051: stockfish] Error 1
make[2]: Leaving directory '/home/torsten/chess/Stockfish_master/src'
make[1]: *** [Makefile:1058: clang-profile-make] Error 2
make[1]: Leaving directory '/home/torsten/chess/Stockfish_master/src'
make: *** [Makefile:886: profile-build] Error 2
```

closes https://github.com/official-stockfish/Stockfish/pull/5106

No functional change
2024-03-12 19:09:50 +01:00
Disservin
1a26d698de Refactor Network Usage
Continuing from PR #4968, this update improves how Stockfish handles network
usage, making it easier to manage and modify networks in the future.

With the introduction of a dedicated Network class, creating networks has become
straightforward. See uci.cpp:
```cpp
NN::NetworkBig({EvalFileDefaultNameBig, "None", ""}, NN::embeddedNNUEBig)
```

The new `Network` encapsulates all network-related logic, significantly reducing
the complexity previously required to support multiple network types, such as
the distinction between small and big networks #4915.

Non-Regression STC:
https://tests.stockfishchess.org/tests/view/65edd26c0ec64f0526c43584
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 33760 W: 8887 L: 8661 D: 16212
Ptnml(0-2): 143, 3795, 8808, 3961, 173

Non-Regression SMP STC:
https://tests.stockfishchess.org/tests/view/65ed71970ec64f0526c42fdd
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 59088 W: 15121 L: 14931 D: 29036
Ptnml(0-2): 110, 6640, 15829, 6880, 85

Compiled with `make -j profile-build`
```
bash ./bench_parallel.sh ./stockfish ./stockfish-nnue 13 50

sf_base =  1568540 +/-   7637 (95%)
sf_test =  1573129 +/-   7301 (95%)
diff    =     4589 +/-   8720 (95%)
speedup = 0.29260% +/- 0.556% (95%)
```

Compiled with `make -j build`
```
bash ./bench_parallel.sh ./stockfish ./stockfish-nnue 13 50

sf_base =  1472653 +/-   7293 (95%)
sf_test =  1491928 +/-   7661 (95%)
diff    =    19275 +/-   7154 (95%)
speedup = 1.30886% +/- 0.486% (95%)
```

closes https://github.com/official-stockfish/Stockfish/pull/5100

No functional change
2024-03-12 16:41:08 +01:00
Muzhen Gaming
10e2732978 VVLTC search tune
Result of 32k games of tuning at 60+0.6 8-thread. Link to the tuning
attempt:
https://tests.stockfishchess.org/tests/view/65def7b04b19edc854ebdec8

Passed VVLTC first SPRT:
https://tests.stockfishchess.org/tests/view/65e51b53416ecd92c162ab7f
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 37570 W: 9613 L: 9342 D: 18615
Ptnml(0-2): 2, 3454, 11601, 3727, 1

Passed VVLTC second SPRT:
https://tests.stockfishchess.org/tests/view/65e87d1c0ec64f0526c3eb39
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 123158 W: 31463 L: 31006 D: 60689
Ptnml(0-2): 5, 11589, 37935, 12044, 6

Note: The small net and psqt-only thresholds have been moved to
evaluate.h. The reasoning is that these values are used in both
`evaluate.cpp` and `evaluate_nnue.cpp`, and thus unifying their usage
avoids inconsistencies during testing, where one occurrence is changed
without the other (this happened during the search tune SPRT).

closes https://github.com/official-stockfish/Stockfish/pull/5101

Bench: 1741218
2024-03-11 10:04:37 +01:00
Disservin
b6dfd6bd54 Assorted cleanups
- fix naming convention for `workingDirectory`
- use type alias for `EvalFiles` everywhere
- move `ponderMode` into `LimitsType`
- move limits parsing into standalone static function

closes https://github.com/official-stockfish/Stockfish/pull/5098

No functional change
2024-03-11 09:02:13 +01:00
mstembera
7831131591 Only evaluate the PSQT part of the small net for large evals.
Thanks to Viren6 for suggesting to set complexity to 0.

STC https://tests.stockfishchess.org/tests/view/65d7d6709b2da0226a5a203f
LLR: 2.92 (-2.94,2.94) <0.00,2.00>
Total: 328384 W: 85316 L: 84554 D: 158514
Ptnml(0-2): 1414, 39076, 82486, 39766, 1450

LTC https://tests.stockfishchess.org/tests/view/65dce6d290f639b028a54d2e
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 165162 W: 41918 L: 41330 D: 81914
Ptnml(0-2): 102, 18332, 45124, 18922, 101

closes https://github.com/official-stockfish/Stockfish/pull/5083

bench: 1504003
2024-03-03 15:29:58 +01:00
mstembera
9699f4f79a Fix the alignment of the transformer buffer
Fixes the issue mentioned in
584d9efedc (r138417600).
Thanks to @cj5716 and @peregrineshahin for
spotting this!

closes https://github.com/official-stockfish/Stockfish/pull/5042

No functional change
2024-02-09 19:06:25 +01:00
FauziAkram
59691d46a1 Assorted trivial cleanups
Renaming doubleExtensions variable to multiExtensions, since now we have also triple extensions.

Some extra cleanups.

Recent tests used to measure the elo worth:
https://tests.stockfishchess.org/tests/view/659fd0c379aa8af82b96abc3
https://tests.stockfishchess.org/tests/view/65a8f3da79aa8af82b9751e3
https://tests.stockfishchess.org/tests/view/65b51824c865510db0272740
https://tests.stockfishchess.org/tests/view/65b58fbfc865510db0272f5b

closes https://github.com/official-stockfish/Stockfish/pull/5032

No functional change
2024-02-09 19:06:24 +01:00
mstembera
32e46fc47f Remove some outdated SIMD functions
Since https://github.com/official-stockfish/Stockfish/pull/4391 the x2
SIMD functions no longer serve any useful purpose.

Passed non-regression STC:
https://tests.stockfishchess.org/tests/view/659cf42579aa8af82b966d55
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 67392 W: 17222 L: 17037 D: 33133
Ptnml(0-2): 207, 7668, 17762, 7851, 208

closes https://github.com/official-stockfish/Stockfish/pull/4974

No functional change
2024-01-17 18:04:29 +01:00
Disservin
a107910951 Refactor global variables
This aims to remove some of the annoying global structure which Stockfish has.

Overall there is no major elo regression to be expected.

Non regression SMP STC (paused, early version):
https://tests.stockfishchess.org/tests/view/65983d7979aa8af82b9608f1
LLR: 0.23 (-2.94,2.94) <-1.75,0.25>
Total: 76232 W: 19035 L: 19096 D: 38101
Ptnml(0-2): 92, 8735, 20515, 8690, 84

Non regression STC (early version):
https://tests.stockfishchess.org/tests/view/6595b3a479aa8af82b95da7f
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 185344 W: 47027 L: 46972 D: 91345
Ptnml(0-2): 571, 21285, 48943, 21264, 609

Non regression SMP STC:
https://tests.stockfishchess.org/tests/view/65a0715c79aa8af82b96b7e4
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 142936 W: 35761 L: 35662 D: 71513
Ptnml(0-2): 209, 16400, 38135, 16531, 193

These global structures/variables add hidden dependencies and allow data
to be mutable from where it shouldn't it be (i.e. options). They also
prevent Stockfish from internal selfplay, which would be a nice thing to
be able to do, i.e. instantiate two Stockfish instances and let them
play against each other. It will also allow us to make Stockfish a
library, which can be easier used on other platforms.

For consistency with the old search code, `thisThread` has been kept,
even though it is not strictly necessary anymore. This the first major
refactor of this kind (in recent time), and future changes are required,
to achieve the previously described goals. This includes cleaning up the
dependencies, transforming the network to be self contained and coming
up with a plan to deal with proper tablebase memory management (see
comments for more information on this).

The removal of these global structures has been discussed in parts with
Vondele and Sopel.

closes https://github.com/official-stockfish/Stockfish/pull/4968

No functional change
2024-01-13 19:40:53 +01:00
Disservin
99cdb920fc Cleanup Evalfile handling
This cleans up the EvalFile handling after the merge of #4915,
which has become a bit confusing on what it is actually doing.

closes https://github.com/official-stockfish/Stockfish/pull/4971

No functional change
2024-01-08 18:33:38 +01:00
Disservin
7c5e3f2865 Prefix abs with std:: 2024-01-07 21:41:52 +01:00
Linmiao Xu
f09adaa4a4 Update smallnet to nn-baff1ede1f90.nnue with wider eval range
Created by training an L1-128 net from scratch with a wider range of
evals in the training data and wld-fen-skipping disabled during
training. The differences in this training data compared to the first
dual nnue PR are:

- removal of all positions with 3 pieces
- when piece count >= 16, keep positions with simple eval above 750
- when piece count < 16, remove positions with simple eval above 3000

The asymmetric data filtering was meant to flatten the training data
piece count distribution, which was previously heavily skewed towards
positions with low piece counts.

Additionally, the simple eval range where the smallnet is used was
widened to cover more positions previously evaluated by the big net and
simple eval.

```yaml
experiment-name: 128--S1-hse-S7-v4-S3-v1-no-wld-skip

training-dataset:
  - /data/hse/S3/leela96-filt-v2.min.high-simple-eval-1k.binpack
  - /data/hse/S3/dfrc99-16tb7p-eval-filt-v2.min.high-simple-eval-1k.binpack
  - /data/hse/S3/test80-apr2022-16tb7p.min.high-simple-eval-1k.binpack

  - /data/hse/S7/test60-2020-2tb7p.v6-3072.high-simple-eval-v4.binpack
  - /data/hse/S7/test60-novdec2021-12tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack

  - /data/hse/S7/test77-nov2021-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test77-dec2021-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test77-jan2022-2tb7p.high-simple-eval-v4.binpack

  - /data/hse/S7/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test78-juntosep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack

  - /data/hse/S7/test79-apr2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test79-may2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack

  - /data/hse/S7/test80-may2022-16tb7p.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jul2022-16tb7p.v6-dd.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-aug2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-sep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-oct2022-16tb7p.v6-dd.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-nov2022-16tb7p-v6-dd.min.high-simple-eval-v4.binpack

  - /data/hse/S7/test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-mar2023-2tb7p.v6-sk16.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-apr2023-2tb7p-filter-v6-sk16.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-may2023-2tb7p.v6.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jun2023-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jul2023-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-aug2023-2tb7p.v6.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-sep2023-2tb7p.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-oct2023-2tb7p.high-simple-eval-v4.binpack

wld-fen-skipping: False
start-from-engine-test-net: False

nnue-pytorch-branch: linrock/nnue-pytorch/L1-128
engine-test-branch: linrock/Stockfish/L1-128-nolazy
engine-base-branch: linrock/Stockfish/L1-128

num-epochs: 500
start-lambda: 1.0
end-lambda: 1.0
```

Experiment yaml configs converted to easy_train.sh commands with:
https://github.com/linrock/nnue-tools/blob/4339954/yaml_easy_train.py

Binpacks interleaved at training time with:
https://github.com/official-stockfish/nnue-pytorch/pull/259

FT weights permuted with 10k positions from fishpack32.binpack with:
https://github.com/official-stockfish/nnue-pytorch/pull/254

Data filtered for high simple eval positions (v4) with:
https://github.com/linrock/Stockfish/blob/b9c8440/src/tools/transform.cpp#L640-L675

Training data can be found at:
https://robotmoon.com/nnue-training-data/

Local elo at 25k nodes per move of
L1-128 smallnet (nnue-only eval) vs. L1-128 trained on standard S1 data:
nn-epoch319.nnue : -241.7 +/- 3.2

Passed STC vs. 36db936:
https://tests.stockfishchess.org/tests/view/6576b3484d789acf40aabbfe
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 21920 W: 5680 L: 5381 D: 10859
Ptnml(0-2): 82, 2488, 5520, 2789, 81

Passed LTC vs. DualNNUE #4915:
https://tests.stockfishchess.org/tests/view/65775c034d789acf40aac7e3
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 147606 W: 36619 L: 36063 D: 74924
Ptnml(0-2): 98, 16591, 39891, 17103, 120

closes https://github.com/official-stockfish/Stockfish/pull/4919

Bench: 1438336
2024-01-07 21:20:15 +01:00
Linmiao Xu
584d9efedc Dual NNUE with L1-128 smallnet
Credit goes to @mstembera for:
- writing the code enabling dual NNUE:
  https://github.com/official-stockfish/Stockfish/pull/4898
- the idea of trying L1-128 trained exclusively on high simple eval
  positions

The L1-128 smallnet is:
- epoch 399 of a single-stage training from scratch
- trained only on positions from filtered data with high material
  difference
  - defined by abs(simple_eval) > 1000

```yaml
experiment-name: 128--S1-only-hse-v2

training-dataset:
  - /data/hse/S3/dfrc99-16tb7p-eval-filt-v2.min.high-simple-eval-1k.binpack
  - /data/hse/S3/leela96-filt-v2.min.high-simple-eval-1k.binpack
  - /data/hse/S3/test80-apr2022-16tb7p.min.high-simple-eval-1k.binpack

  - /data/hse/S7/test60-2020-2tb7p.v6-3072.high-simple-eval-1k.binpack
  - /data/hse/S7/test60-novdec2021-12tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack

  - /data/hse/S7/test77-nov2021-2tb7p.v6-3072.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test77-dec2021-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack
  - /data/hse/S7/test77-jan2022-2tb7p.high-simple-eval-1k.binpack

  - /data/hse/S7/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack
  - /data/hse/S7/test78-juntosep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack

  - /data/hse/S7/test79-apr2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack
  - /data/hse/S7/test79-may2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack

  # T80 2022
  - /data/hse/S7/test80-may2022-16tb7p.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-jul2022-16tb7p.v6-dd.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-aug2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-sep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-oct2022-16tb7p.v6-dd.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-nov2022-16tb7p-v6-dd.min.high-simple-eval-1k.binpack

  # T80 2023
  - /data/hse/S7/test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-mar2023-2tb7p.v6-sk16.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-apr2023-2tb7p-filter-v6-sk16.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-may2023-2tb7p.v6.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-jun2023-2tb7p.v6-3072.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-jul2023-2tb7p.v6-3072.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-aug2023-2tb7p.v6.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-sep2023-2tb7p.high-simple-eval-1k.binpack
  - /data/hse/S7/test80-oct2023-2tb7p.high-simple-eval-1k.binpack

start-from-engine-test-net: False

nnue-pytorch-branch: linrock/nnue-pytorch/L1-128
engine-test-branch: linrock/Stockfish/L1-128-nolazy
engine-base-branch: linrock/Stockfish/L1-128

num-epochs: 500
lambda: 1.0
```

Experiment yaml configs converted to easy_train.sh commands with:
https://github.com/linrock/nnue-tools/blob/4339954/yaml_easy_train.py

Binpacks interleaved at training time with:
https://github.com/official-stockfish/nnue-pytorch/pull/259

Data filtered for high simple eval positions with:
https://github.com/linrock/nnue-data/blob/32d6a68/filter_high_simple_eval_plain.py
https://github.com/linrock/Stockfish/blob/61dbfe/src/tools/transform.cpp#L626-L655

Training data can be found at:
https://robotmoon.com/nnue-training-data/

Local elo at 25k nodes per move of
L1-128 smallnet (nnue-only eval) vs. L1-128 trained on standard S1 data:
nn-epoch399.nnue : -318.1 +/- 2.1

Passed STC:
https://tests.stockfishchess.org/tests/view/6574cb9d95ea6ba1fcd49e3b
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 62432 W: 15875 L: 15521 D: 31036
Ptnml(0-2): 177, 7331, 15872, 7633, 203

Passed LTC:
https://tests.stockfishchess.org/tests/view/6575da2d4d789acf40aaac6e
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 64830 W: 16118 L: 15738 D: 32974
Ptnml(0-2): 43, 7129, 17697, 7497, 49

closes https://github.com/official-stockfish/Stockfish/pulls

Bench: 1330050

Co-Authored-By: mstembera <5421953+mstembera@users.noreply.github.com>
2024-01-07 21:15:52 +01:00
FauziAkram
8b4583bce7 Remove redundant int cast
Remove a redundant int cast in the calculation of fwdOut. The variable
OutputType is already defined as std::int32_t, which is an integer type, making
the cast unnecessary.

closes https://github.com/official-stockfish/Stockfish/pull/4961

No functional change
2024-01-04 15:56:53 +01:00
Disservin
b987d4f033 Use type aliases instead of enums for Value types
The primary rationale behind this lies in the fact that enums were not
originally designed to be employed in the manner we currently utilize them.

The Value enum was used like a type alias throughout the code and was often
misused. Furthermore, changing the underlying size of the enum to int16_t broke
everything, mostly because of the operator overloads for the Value enum, were
causing data to be truncated. Since Value is now a type alias, the operator
overloads are no longer required.

Passed Non-Regression STC:
https://tests.stockfishchess.org/tests/view/6593b8bb79aa8af82b95b401
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 235296 W: 59919 L: 59917 D: 115460
Ptnml(0-2): 743, 27085, 62054, 26959, 807

closes https://github.com/official-stockfish/Stockfish/pull/4960

No functional change
2024-01-04 15:54:23 +01:00
Disservin
444f03ee95 Update copyright year
closes https://github.com/official-stockfish/Stockfish/pull/4954

No functional change
2024-01-04 15:47:10 +01:00
FauziAkram
833a2e2bc0 Cleanup comments
Tests used to derive some Elo worth comments:
https://tests.stockfishchess.org/tests/view/656a7f4e136acbc573555a31
https://tests.stockfishchess.org/tests/view/6585fb455457644dc984620f

closes https://github.com/official-stockfish/Stockfish/pull/4945

No functional change
2023-12-31 19:54:27 +01:00
FauziAkram
a069a1bbbf Use std::abs over abs
closes https://github.com/official-stockfish/Stockfish/pull/4926
closes https://github.com/official-stockfish/Stockfish/pull/4909

No functional change

Co-Authored-By: fffelix-huang <72808219+fffelix-huang@users.noreply.github.com>
2023-12-19 18:22:10 +01:00
Joost VandeVondele
ec02714b62 Cleanup comments and some code reorg.
passed STC:
https://tests.stockfishchess.org/tests/view/6536dc7dcc309ae83955b04d
LLR: 2.93 (-2.94,2.94) <-1.75,0.25>
Total: 58048 W: 14693 L: 14501 D: 28854
Ptnml(0-2): 200, 6399, 15595, 6669, 161

closes https://github.com/official-stockfish/Stockfish/pull/4846

No functional change
2023-10-24 17:43:05 +02:00
cj5716
d6a5c2b085 Small formatting improvements
Changes some C style casts to C++ style, and fixes some incorrect comments and variable names.

closes #4845

No functional change
2023-10-24 17:42:13 +02:00
Disservin
a105978bbd remove blank line between function and it's description
- remove the blank line between the declaration of the function and it's
  comment, leads to better IDE support when hovering over a function to see it's
  description
- remove the unnecessary duplication of the function name in the functions
  description
- slightly refactored code for lsb, msb in bitboard.h There are still a few
  things we can be improved later on, move the description of a function where
  it was declared (instead of implemented) and add descriptions to functions
  which are behind macros ifdefs

closes https://github.com/official-stockfish/Stockfish/pull/4840

No functional change
2023-10-23 20:39:48 +02:00
Disservin
2d0237db3f add clang-format
This introduces clang-format to enforce a consistent code style for Stockfish.

Having a documented and consistent style across the code will make contributing easier
for new developers, and will make larger changes to the codebase easier to make.

To facilitate formatting, this PR includes a Makefile target (`make format`) to format the code,
this requires clang-format (version 17 currently) to be installed locally.

Installing clang-format is straightforward on most OS and distros
(e.g. with https://apt.llvm.org/, brew install clang-format, etc), as this is part of quite commonly
used suite of tools and compilers (llvm / clang).

Additionally, a CI action is present that will verify if the code requires formatting,
and comment on the PR as needed. Initially, correct formatting is not required, it will be
done by maintainers as part of the merge or in later commits, but obviously this is encouraged.

fixes https://github.com/official-stockfish/Stockfish/issues/3608
closes https://github.com/official-stockfish/Stockfish/pull/4790

Co-Authored-By: Joost VandeVondele <Joost.VandeVondele@gmail.com>
2023-10-22 16:06:27 +02:00
mstembera
d3d0c69dc1 Remove outdated Tile naming.
cleanup variable naming after  #4816

closes #4833

No functional change
2023-10-21 10:28:55 +02:00
FauziAkram
edb4ab924f Standardize Comments
use double slashes (//) only for comments.

closes #4820

No functional change.
2023-10-21 10:25:03 +02:00
mstembera
c17a657b04 Optimize the most common update accumalator cases w/o tiling
In the most common case where we only update a single state
it's faster to not use temporary accumulation registers and tiling.
(Also includes a couple of small cleanups.)

passed STC
https://tests.stockfishchess.org/tests/view/651918e3cff46e538ee0023b
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 34944 W: 8989 L: 8687 D: 17268
Ptnml(0-2): 88, 3743, 9512, 4037, 92

A simpler version
https://tests.stockfishchess.org/tests/view/65190dfacff46e538ee00155
also passed but this version is stronger still
https://tests.stockfishchess.org/tests/view/6519b95fcff46e538ee00fa2

closes https://github.com/official-stockfish/Stockfish/pull/4816

No functional change
2023-10-08 07:42:39 +02:00
mstembera
8a912951de Remove handcrafted MMX code
too small a benefit to maintain this old target

closes https://github.com/official-stockfish/Stockfish/pull/4804

No functional change
2023-10-08 07:37:01 +02:00