It is used mainly in a bunch of inline one-liners
just below its definition, so substitute it with its
explicit definition and avoid the information hiding.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It is clearer that only in that case is the move number
correct; otherwise it is only a partial quantity: the
number of moves of that phase.
In the case of PH_EVASIONS, instead, we have only one phase.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The arrays index[] and pieceList[] are not guaranteed to be
invariant across a do_move() + undo_move() sequence when a
capture move is involved.
The reason is that in do_move() the captured piece is removed
from the list and its slot is filled with the last one, while
in undo_move() it is added back, but at the end of the list.
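A minimal sketch of the two operations, with hypothetical names
(the real code keeps one list per piece type):

    // Hypothetical, simplified state: squares are plain ints here.
    int pieceList[16]; // squares of the pieces of one type
    int index[64];     // index[sq] = slot in pieceList[] of piece on sq
    int pieceCount;

    // do_move(): the captured piece's slot is refilled with the LAST entry
    void remove_piece(int sq) {
        int slot = index[sq];
        int lastSq = pieceList[--pieceCount];
        pieceList[slot] = lastSq;
        index[lastSq] = slot;
    }

    // undo_move(): the captured piece is appended at the END of the list,
    // so after remove_piece() + restore_piece() the ordering differs.
    void restore_piece(int sq) {
        pieceList[pieceCount] = sq;
        index[sq] = pieceCount++;
    }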
Because index[] and pieceList[] are used in move generation
to scan the pieces, moves will be generated in a different
order before and after a do_move() + undo_move() sequence,
such as, for instance, the one in Position::has_mate_threat().
After the latest patches, move generation can now be invoked
also by the MovePicker c'tor, and this explains why the order of
picked moves differs depending on whether the MovePicker object
is instantiated before or after a Position::has_mate_threat() call.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It seems that pos.has_mate_threat() changes the position!
So calling the MovePicker c'tor before or after the
has_mate_threat() call changes things!
The bug was unhidden by the previous patch, which makes the
MovePicker c'tor generate, score and sort good captures under
some circumstances.
Because scoring the captures is position dependent, it seems that
the moves returned by MovePicker are different when the c'tor is
called before has_mate_threat().
Of course this is only a workaround, because the real bug is still
hidden :-(
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
If we don't have TT moves to search, skip the useless
loop associated with the TT_MOVES phase.
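Something like the following sketch, with illustrative member names
(assuming the picker keeps two TT move slots):

    // Hypothetical shortcut: when no TT move was supplied, jump straight
    // past the TT_MOVES phase instead of entering its pick loop.
    if (ttMoves[0] == MOVE_NONE && ttMoves[1] == MOVE_NONE)
        go_next_phase();   // nothing to pick here, advance immediately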
Another 1% speed boost that brings this series
to +6.2% against the original revision 595a90df.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This not only cleans up the code but gives another
speed boost of 1.8%.
Since revision 595a90dfd0 we have increased the pgo-compiled
binary speed by a whopping +5.2% without any functional change!!
This is really awesome considering that we have also
cut the line count by 25 lines.
Sometimes we spend days squeezing an extra 1% out of move
generation, while the biggest optimizations instead come
from anonymous and apparently dull parts of the code.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Does not seem to improve on the standard; the latest results
from Joona after 2040 games are negative:
Orig - Mod: 454 - 424 - 1162
And this is more or less the same as I got a few days ago.
So revert for now.
Verified same functionality as 595a90dfd.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Loop along the move list in the same way as for the
other move kinds, so as to be consistent in get_next_move().
And a bit of the usual clean-up too, but just a bit.
It is even a bit (+0.3%) faster now. ;-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Always after TT move but before captures.
This seems a better setup than the version before this
patch.
After 999 games at 1+0
Mod - Orig +252 =527 -220 +11 ELO
Unfortunately it does not seem to improve on the standard
version, with null move outside of the MovePicker (595a90df) and
the latest speed-up patches added in.
After 999 games at 1+0
Mod - Standard +244 =506 -249 -2 ELO
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This avoids calculating the array entry position
at each access and gives another boost of almost 1%.
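The shape of the change, as a generic sketch with illustrative names:

    // Before: table[piece][square] was re-indexed at every access.
    // After: compute the entry address once and reuse it.
    void update(int table[16][64], int piece, int square,
                int bonus, int limit) {
        int& entry = table[piece][square]; // position calculated once
        entry += bonus;
        if (entry > limit)
            entry = limit;
    }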
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In MovePicker we get the next move with pick_move_from_list(),
then check whether the return value is MOVE_NONE, and in
that case we update the state to the new phase.
This patch reorders the flow so that pick_move_from_list(),
renamed get_next_move(), directly calls go_next_phase() to
generate and sort the next bunch of moves when there are no more
moves to try. This avoids always having to check the value
returned by pick_move_from_list(), and the flow is more linear
and natural.
Also use a local variable instead of a pointer dereference in the
time-critical switch statement in get_next_move().
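A simplified sketch of the new flow (member names are illustrative):

    Move MovePicker::get_next_move() {

        while (true)
        {
            while (movesPicked < numOfMoves)
            {
                Move m = pick_from_current_list(); // phase-specific pick
                if (m != MOVE_NONE)
                    return m;
            }
            if (phase == PH_STOP)
                return MOVE_NONE;  // no more phases for this node

            go_next_phase();       // generate and sort the next bunch
        }
    }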
With this patch alone we get an incredible speed-up of 3.2%!!!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Try good captures before null move when depth < 3 * OnePly.
Use this kind of null move also when depth == OnePly.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Original patch from Joona with optimizations added
by me.
Great cleanup of MovePicker, with a speed improvement of 1%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Explicitly write the conditions for a pawn push to the 7th
rank and for a passed pawn instead of wrapping them in
redundant helpers.
Also retire the now unused move_is_pawn_push_to_7th()
and the never used move_was_passed_pawn_push() and
move_is_deep_pawn_push().
Function extension() is so time critical that this
simple patch speeds up the pgo compile by 0.5%, and
it is also clearer what actually happens there.
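As a hedged sketch, the inlined pawn-to-7th test looks something
like this (helper and table names are illustrative):

    // Instead of: if (move_is_pawn_push_to_7th(m)) ...
    // spell the condition out directly in extension():
    if (   pos.type_of_piece_on(move_from(m)) == PAWN
        && relative_rank(pos.side_to_move(), move_to(m)) == RANK_7)
        result += PawnPushTo7thExtension[pvNode]; // hypothetical bonus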
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Remove the 'b' uint32_t local variable.
The optimized assembly is more or less the same
(one 'mov' instruction less), but the code is now
written in a way more similar to the final assembly
flow, so it should be easier for the compiler to optimize.
Also guarantee that BitTable[] is always aligned.
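A hedged sketch of the resulting shape (the folding constant is the
usual 32-bit one; treat the exact code as illustrative, not as the
patch itself):

    #include <cstdint>

    extern const uint8_t BitTable[64]; // defined elsewhere, aligned

    int pop_1st_bit(uint64_t* bb) {
        uint64_t x = *bb ^ (*bb - 1); // ones up to the lowest set bit
        *bb &= *bb - 1;               // clear the lowest set bit
        // Previously: uint32_t b = uint32_t(x ^ (x >> 32)); then index.
        // Folding and indexing in one expression mirrors the assembly:
        return BitTable[(uint32_t(x ^ (x >> 32)) * 0x783A9B23u) >> 26];
    }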
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Verified correct against tuning branch.
After 999 games at 1+0
Mod vs Orig +257 =510 -232 51.20% +9 ELO
A very small increase, but an increase anyway!
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The following new targets were added:
* osx-icc32: 32-bit x86 compiled with icpc.
* osx-icc64: 64-bit x86 compiled with icpc.
* osx-icc32-profile: 32-bit x86 compiled with icpc and pgo.
* osx-icc64-profile: 64-bit x86 compiled with icpc and pgo.
There were two asserts.
The first was raised because is_ok() was called at the
beginning of do_castle_move(), and this is wrong after
the last code reformatting because at that point the state
has already been modified by the caller do_move().
The second, raised by debugIncrementalEval, was due to a
rounding error in compute_value() that occurs because
TempoValueEndgame was updated to an odd value by the patch
"Merge Joona Kiiski evaluation tweaks" (3ed603cd) of 13/3/2009.
This line in compute_value() is the guilty one:
result += (side_to_move() == WHITE)? TempoValue / 2 : -TempoValue / 2;
The fix is to increment TempoValueEndgame so that it is even.
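To see the rounding, take a hypothetical odd value such as 19:
integer division gives 19 / 2 = 9, so the white and black cases
differ by 9 - (-9) = 18 instead of 19, and compute_value() drifts
one unit away from the incrementally updated score. With an even
value like 20 the two halves of 10 cancel exactly.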
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This patch seems bigger than it actually is.
It just moves some code around and adds a few coding-style fixes
to do_move() and undo_move(), so as to have uniform naming in both
functions.
The diffstat for the whole patch series is
239 insertions(+), 426 deletions(-)
And the final MSVC pgo build is even a bit faster:
Before 448.051 nodes/sec
After 453.810 nodes/sec (+1.3%)
No functional change (tested on more than 100M nodes)
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate undo_ep_move in undo_move(); this reduces the line
count and improves code readability.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate do_ep_move in undo_move(); this reduces the line
count and improves code readability.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate do_promotion_move() in do_move(); this reduces the line
count and improves code readability.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Integrate do_ep_move in do_move(); this reduces the line
count and improves code readability.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In the MovePicker c'tor we access, during initialization, one of
the MainSearchPhaseIndex..QsearchWithoutChecksPhaseIndex globals.
Move the definition of PhaseTable[] to just after them, so that
when PhaseTable[] is accessed later in get_next_move()
it is already present in L1/L2.
It works like an implicit prefetch of PhaseTable[].
Also shrink PhaseTable[] to 16 bytes, so that it fits in a single
L1 cache line, using uint8_t instead of int.
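A hedged sketch of the layout idea (the neighbouring globals and
the table values are illustrative):

    #include <cstdint>

    // Reading one of these globals in the c'tor drags the adjacent
    // PhaseTable[] into the same cache neighbourhood.
    int MainSearchPhaseIndex;
    int EvasionsPhaseIndex;
    int QsearchWithChecksPhaseIndex;
    int QsearchWithoutChecksPhaseIndex;

    // uint8_t entries keep the whole table within 16 bytes.
    static const uint8_t PhaseTable[16] = { 0, 1, 2, 3, 4, 5, 6, 7,
                                            8, 0, 0, 0, 0, 0, 0, 0 };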
This apparently innocuous patch gives an astonishing speed-up
of 1.6% under MSVC 2010 beta, pgo optimized!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
It was due to a missing -msse compiler option!
Without this option the CPU silently discards
prefetcht2 instructions during execution.
Also added a (gcc-documented) hack to prevent the Intel
compiler from optimizing away the prefetches.
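For reference, a sketch of what such a workaround looks like
(illustrative, not necessarily the exact patch):

    #include <xmmintrin.h> // _mm_prefetch; with gcc this needs -msse

    void prefetch(char* addr) {
    #if defined(__INTEL_COMPILER)
        // An empty inline-asm statement acts as an optimization
        // barrier, stopping icc from dropping the otherwise
        // side-effect-free prefetch.
        __asm__ ("");
    #endif
        _mm_prefetch(addr, _MM_HINT_T2); // emits prefetcht2
    }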
Special thanks to Heinz for testing and suggesting
improvements, and to Jim for testing icc on Windows.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
But this time with the guarantee of an always-aligned
access, so that prefetching is not adversely impacted.
On Joona's PC,
1+0, 64MB hash:
Orig - Mod: 174 - 237 - 359
Instead, after 1000 games at 1+0 with 128MB hash size,
we are at +1 ELO (just 4 games of difference).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After fixing the CPU frequency with the RightMark tool I was
able to speed-test all the different prefetch combinations.
Here are the results:
OS: Windows Vista 32bit, MSVC compile
CPU: Intel Core 2 Duo T5220 1.55 GHz
bench on depth 12, 1 thread, 26552844 nodes searched
results in nodes/sec
no-prefetch
402486, 402005, 402767, 401439, 403060
single prefetch (aligned 64)
410145, 409159, 408078, 410443, 409652
double prefetch (aligned 64) 0+32
414739, 411238, 413937, 414641, 413834
double prefetch (aligned 64) 0+64
413537, 414337, 413537, 414842, 414240
And now also some crazy stuff:
single prefetch (aligned 128)
410145, 407395, 406230, 410050, 409949
double prefetch (aligned 64) 0+0
409753, 410044, 409456
single prefetch (aligned 64) +32
408379, 408272, 406809
single prefetch (aligned 64) +64
408279, 409059, 407395
So it seems the best is a double prefetch at the address +32 or
+64; I will choose the second one because it seems more natural
to me.
It is still a mystery why it doesn't work under Linux :-(
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Always prefetch from a cache line boundary. It seems
that if the prefetch address is not cache line aligned then
performance is adversely impacted.
Hopefully we will reuse those 32 bits of padding for something
useful in the future.
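A hedged sketch of the padded layout (field names are illustrative):

    #include <cstdint>

    struct TTEntry {
        uint32_t key32;       // high half of the position key
        uint32_t data;        // move, depth, bound type, ...
        int16_t  value;
        int16_t  staticValue;
        uint32_t padding;     // the 32 bits we hope to reuse later
    };                        // back to 128 bits per entry

    struct TTCluster {
        TTEntry entry[4];     // 64 bytes, starts on a cache line
    };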
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
This fixes a compile error under Linux with gcc when
the Intel dev libraries are not present.
Also simplify the previous patch by moving the TT definition
from search.cpp to tt.cpp, so as to avoid passing a
pointer to TT to the current position.
Finally, simplify do_move(): we now miss a prefetch in the
rare case of setting an en-passant square, but the code is
much cleaner and the performance penalty is almost zero.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Move the prefetching code inside do_move() so as to allow
very early prefetching and to put as many instructions
as possible between the prefetch and the following retrieve().
With this patch retrieve() times are cut by another 25%.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
TT.retrieve() is the most time-consuming function
because it almost always involves a very slow RAM access.
The TT table is so big that it is never cached. This patch
prefetches TT data just after a move is done, so that the
subsequent TT.retrieve() will be very fast.
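A hypothetical sketch of the hook (names illustrative, body elided):

    void Position::do_move(Move m /* , ... */) {
        // ... update the board state and compute the new hash key ...
        TT.prefetch(key); // start pulling the TT cluster into cache
        // ... remaining bookkeeping runs while the load is in
        //     flight, so the later TT.retrieve(key) finds warm data.
    }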
Profiling with VTune shows that TT.retrieve() times are
almost cut in half!
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Shrink the key to 32 bits instead of 64. To still avoid
collisions, use the high 32 bits of the position key as the
TT key and the low 32 bits to retrieve the correct
cluster index in the table.
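A sketch of the split (illustrative; assumes a power-of-two number
of clusters):

    #include <cstdint>

    // High half is stored and compared, low half picks the cluster.
    uint32_t tt_key(uint64_t posKey) {
        return uint32_t(posKey >> 32);
    }
    size_t tt_cluster(uint64_t posKey, size_t numClusters) {
        return uint32_t(posKey) & (numClusters - 1);
    }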
With this patch the size of TTEntry shrinks to 96 bits instead
of 128, and a cluster of 4 TTEntries sums to 48 bytes instead
of 64.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Binaries are always built with the symbol table in, to ease
debugging and profiling.
It is now possible to run:
make strip
to remove the symbol table from the compiled binary. This
could be useful to prepare the release version.
Patch by Heinz van Saanen.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The current formula enables LMR when
i + MultiPV >= LMRPVMoves
It means that, for instance, if MultiPV == 1 then LMR
starts to be considered at move i = LMRPVMoves - 1,
while if MultiPV == 3 then it starts earlier,
at move i = LMRPVMoves - 3.
With this patch the formula becomes
i >= MultiPV + LMRPVMoves - 2
so that LMR always starts after LMRPVMoves - 1 moves
from the last PV move.
No functional change when MultiPV == 1 (both formulas
reduce to i >= LMRPVMoves - 1).
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Access pos.piece_count() only once and avoid some
branches in the inner loop.
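A minimal sketch of the hoisting (hypothetical names, not the
actual get_material_info() code):

    int count_material(const Position& pos, Color us) {
        int sum = 0;
        for (int pt = PAWN; pt <= QUEEN; ++pt) {
            const int count = pos.piece_count(us, PieceType(pt));
            sum += count * MidgameValue[pt]; // no per-iteration call
        }                                    // or branch on count
        return sum;
    }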
Profiling with VTune shows a 20% speed improvement in
get_material_info(), and the code is also a bit cleaner
this way ;-)
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>