BadFish

mirror of https://github.com/sockspls/badfish synced 2025-04-30 08:43:09 +00:00

Author	SHA1	Message	Date
Marco Costalba	f3d0b76feb	Use optimized pop_1st_bit() under Windows 64 with icc. The Intel compiler can handle this code even under Windows, so lift the constraint. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-14 12:47:49 +01:00
Marco Costalba	bfd4421f49	Better naming and document some endgame functions In particular the generic scaling functions. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-14 08:19:55 +01:00
Marco Costalba	fd12e8cb23	Finally fix prefetch on Linux. It was due to a missing -msse compiler option! Without this option the CPU silently discards prefetcht2 instructions during execution. Also added a (gcc documented) hack to prevent the Intel compiler from optimizing away the prefetches. Special thanks to Heinz for testing and suggesting improvements, and to Jim for testing icc on Windows. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-14 08:13:42 +01:00
Marco Costalba	166c09a7a0	Reuse 5 slots instead of 4 But this time with the guarantee of an always aligned access so that prefetching is not adversely impacted. On Joona PC 1+0, 64Mb hash: Orig - Mod: 174 - 237 - 359 Instead after 1000 games at 1+0 with 128MB hash size we are at + 1 ELO (just 4 games of difference). Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-14 08:13:13 +01:00
Marco Costalba	8d369600ec	Double prefetch on Windows. After fixing the CPU frequency with the RightMark tool I was able to speed-test all the different prefetch combinations. Here are the results. Setup: OS Windows Vista 32bit, MSVC compile; CPU Intel Core 2 Duo T5220 1.55 GHz; bench on depth 12, 1 thread, 26552844 nodes searched. Results in nodes/sec: no-prefetch: 402486, 402005, 402767, 401439, 403060; single prefetch (aligned 64): 410145, 409159, 408078, 410443, 409652; double prefetch (aligned 64) 0+32: 414739, 411238, 413937, 414641, 413834; double prefetch (aligned 64) 0+64: 413537, 414337, 413537, 414842, 414240. And now also some crazy stuff: single prefetch (aligned 128): 410145, 407395, 406230, 410050, 409949; double prefetch (aligned 64) 0+0: 409753, 410044, 409456; single prefetch (aligned 64) +32: 408379, 408272, 406809; single prefetch (aligned 64) +64: 408279, 409059, 407395. So it seems the best is a double prefetch at the address +32 or +64; I will choose the second one because it seems more natural to me. It is still a mystery why it doesn't work under Linux :-( Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-10 22:35:08 +01:00
Marco Costalba	f4140ecc0c	Prevent Intel compiler from optimizing away prefetching. Without this hack the Intel compiler happily optimizes away the gcc builtin call. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-10 13:49:12 +01:00
Marco Costalba	60b5da4cc8	Use aligned prefetch address. Always prefetch from a cache line boundary. It seems that if the prefetch address is not cache line aligned then performance is adversely impacted. Hopefully we will reuse those 32 bits of padding for something useful in the future. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-10 13:49:00 +01:00
Marco Costalba	55c46b2399	Remove old BishopPairBonus constants. Now that we have the polynomial imbalance these are no longer used. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-10 13:47:39 +01:00
Marco Costalba	76ae0e36be	Enable prefetch also for gcc. This fixes a compile error under Linux with gcc when the Intel dev libraries are not available. Also simplify the previous patch by moving the TT definition from search.cpp to tt.cpp, so as to avoid passing a pointer to TT to the current position. Finally simplify do_move(): now we miss a prefetch in the rare case of setting an en-passant square, but the code is much cleaner and the performance penalty is almost zero. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-10 01:42:35 +01:00
Marco Costalba	4251eac860	Try to prefetch as soon as position key is ready. Move the prefetching code inside do_move() to allow a very early prefetch and to put as many instructions as possible between the prefetch and the following retrieve(). With this patch retrieve() times are cut by another 25%. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-09 16:45:37 +01:00
Marco Costalba	cd4604b05c	Add TT prefetching support. TT.retrieve() is the most time-consuming function because it almost always involves a very slow RAM access. The TT table is so big that it is never cached. This patch prefetches TT data just after a move is done, so that the subsequent TT.retrieve() will be very fast. Profiling with VTune shows that TT.retrieve() times are almost cut in half! No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-09 14:18:15 +01:00
Marco Costalba	e6863f46de	Use 5 TTEntry slots instead of 4 Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-09 04:42:26 +01:00
Marco Costalba	6f1475b6fc	Use 32 bit key in TT. Shrink key to 32 bits instead of 64. To still avoid collisions use the high 32 bits of the position key as the TT key and the low 32 bits to retrieve the correct cluster index in the table. With this patch the size of TTEntry shrinks to 96 bits instead of 128 and the cluster of 4 TTEntry sums to 48 bytes instead of 64. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-09 04:42:07 +01:00
Marco Costalba	4a777954e1	Makefile: added 'make strip' target. Binaries are always built with the symbol table to ease debugging and profiling. It is now possible to run 'make strip' to remove the symbol table from the compiled binary. This could be useful to prepare the release version. Patch by Heinz van Saanen. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 17:37:13 +01:00
Marco Costalba	54382f8b07	Let LMR at root be independent of MultiPV value Current formula enable LMR when i + MultiPV >= LMRPVMoves It means that, for instance, if MultiPV == 1 then LMR will be started to be considered at move i = LMRPVMoves - 1, while if MultiPV == 3 then it will start before, at move i = LMRPVMoves - 3. With this patch the formula becomes i >= MultiPV + LMRPVMoves - 2 So that LMR will always start after LMRPVMoves - 1 moves from the last PV move. No functional change when MultiPV == 1 Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 17:30:46 +01:00
Marco Costalba	339bb8a524	Speed up polynomial material imbalance loop Access pos.piece_count() only once and avoid some branches in the inner loop. Profiling with VTune shows a 20% speed improvement in get_material_info(), and it is also a bit more cleaned up this way ;-) No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 14:12:04 +01:00
Marco Costalba	aa925a0e29	There is no need to special case KNNK ending It is always draw, so use the corresponding proper evaluation function. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 13:10:10 +01:00
Marco Costalba	23ceb66950	Move halfOpenFiles[] calculation out of a loop. And put it in an already existing one so as to optimize a bit. Also additional cleanups and code shuffles all over the place. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 09:21:42 +01:00
Marco Costalba	565d12bf42	Compile without DEBUG flag by default. And also build the symbol table. It can easily be stripped after the .exe is built, and it is necessary for profiling. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 09:21:29 +01:00
Marco Costalba	00eab73399	Revert material balance values after 100000 games. After Joona's direct testing with ~2000 games it seems the values after 100.000 games do not give any advantage, so revert for now. Score of Stockfish_0 vs Stockfish_15: 491 - 392 - 1102 Score of Stockfish_0 vs Stockfish_40: 461 - 439 - 1076 Score of Stockfish_0 vs Stockfish_65: 442 - 518 - 1018 (13 elo) Score of Stockfish_0 vs Stockfish_100: 504 - 502 - 984 Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 03:49:49 +01:00
Joona Kiiski	5be3d98d17	Do not adjust Minimum Split Depth automatically. Currently minimum split depth is set automatically to 6 when the number of CPUs is more than 4. I believe this is a bad idea since for example my quad (4CPU with hyperthreading) is detected as an 8CPU computer. I've manually lowered the number of Threads, but so far I have played all games with Minimum Split Depth set to 6! Since 4CPU computers with hyperthreading are quite common and 8CPU computers extremely rare (I expect we can get a direct jump to 16 or 32 cores), this automatic adjusting is likely to do more harm than good. Add a note in Readme.txt, so that those rare 8CPU owners can manually tweak the "Minimum Split Depth" parameter. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 03:36:20 +01:00
Marco Costalba	5b3fcab1ad	Polished Makefile for *nix. Greatly improved Makefile from Heinz van Saanen. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-08-08 03:30:27 +01:00
Tord Romstad	977ca40d6d	Supply the "upperbound" and "lowerbound" parameters in UCI search output when the score is outside the root window.	2009-08-07 16:26:24 +02:00
Tord Romstad	ae49677446	Fixed a bug in PV extraction from the transposition table: The previous used move_is_legal to verify that the move from the TT was legal, and the old version of move_is_legal only works when the side to move is not in check. Fixed this by adding a separate, slower version of move_is_legal which works even when the side to move is in check.	2009-08-06 18:07:32 +02:00
Tord Romstad	2fff532f4e	Moved the code for extracting the PV from the TT to tt.cpp, where it belongs.	2009-08-06 14:02:53 +02:00
Tord Romstad	da854fe83a	Added a new function build_pv(), which extends a PV by walking down the transposition table. When the search was stopped before a fail high at the root was resolved, Stockfish would often print a very short PV, sometimes consisting of just a single move. This was not only a little user-unfriendly, but also harmed the strength a little in ponder-on games: Single-move PVs mean that there is no ponder move to search. It is perhaps worth considering to remove the pv[][] array entirely, and always build the entire PV from the transposition table. This would simplify the source code somewhat and probably make the program infinitesimally faster, at the expense of sometimes getting shorter PVs or PVs with rubbish moves near the end.	2009-08-06 13:27:49 +02:00
Tord Romstad	a1096e55cf	Initial work towards adjustable playing strength. Added the UCI_LimitStrength and the UCI_Elo options, with an Elo range of 2100-2900. When UCI_LimitStrength is enabled, the number of threads is set to 1, and the search speed is slowed down according to the chosen Elo level. Todo: 1. Implement Elo levels below 2100 by blundering on purpose and/or crippling the evaluation. 2. Automatically calibrate the maximum Elo by measuring the CPU speed during program initialization, perhaps by doing some bitboard computations and measuring the time taken. No functional change when UCI_LimitStrength is false (the default).	2009-08-04 11:31:25 +02:00
Tord Romstad	dad632ce5b	Added LMR at the root. After 2000 games at 1+0 Mod vs Orig +534 =1033 -433 52.525% 1050.5/2000 +18 ELO	2009-08-03 09:08:59 +02:00
Joona Kiiski	2f7723fd44	Remove useless mate value special handling in null search After 1200 games (1CPU), time control 1+0: Mod vs Orig: +331 =564 -277 +16 ELO Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-26 18:55:17 +01:00
Marco Costalba	152f3b13b7	Yet another small touch to endgame functions handling It is like a never finished painting. Everyday a little touch more. But this time it is very little ;-) No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-26 17:42:48 +01:00
Marco Costalba	bb1b049b83	Remove unused members in Application class. Also rearrange a bit the remaining methods. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-26 16:11:20 +01:00
Marco Costalba	50f92bed06	Fix a spurious extra space This morning it seems there is nothing better to do... Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-26 09:07:42 +01:00
Marco Costalba	bdb586ac2b	Micro optimize extension() in search.cpp. Small micro-optimization in this very time-critical function. Use bitwise 'or' instead of logical 'or' to avoid branches in the assembly, and use the result to skip a handful of checks. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-25 16:48:28 +01:00
Marco Costalba	1b0303b6e9	Polynomial material balance after 100.000 games Verified it is equivalent to the tuning branch results with parameter values sampled after 100.000 games. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-24 14:26:49 +01:00
Marco Costalba	5f232e0667	Revert Makefile changes. Some unwanted changes to the Makefile slipped in with the patch "Introduced the UCI_AnalyseMode option". Revert them. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-24 14:18:03 +01:00
Marco Costalba	080a4995a3	Simplify king shelter cache handling This is more similar to how get_material_info() and get_pawn_info() work and also removes some clutter from evaluate_king(). No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-24 14:13:13 +01:00
Marco Costalba	20224a5bbf	Delay costly SEE call during captures ordering in MovePicker. When ordering moves we push all captures with negative SEE values to the badCaptures[] array during the scoring phase. This patch delays the costly SEE call until the move has been picked in pick_move_from_list(); this way we save some SEE calls in case we get a cutoff. It seems we have a speed gain of about 1-1.5 % in terms of nodes/sec and profiling seems to confirm the small but real speed increase. Idea from Pablo Vazquez on talkchess.com http://www.talkchess.com/forum/viewtopic.php?t=29018&start=20 It would be a no functional change, but actually it is not, because now the set being sorted is different and std::sort(), which is not a stable sort, does not guarantee that the order of equally scored moves remains the same as before. After 952 games at 1+0 we are below error bar, almost equal, just 6 games of difference (+2 ELO). Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-24 14:12:33 +01:00
Marco Costalba	8654fee18c	Micro-optimization in do_evaluate(). Do not call count_1s_max_15() if not necessary, as it is not in the common case (>95%). No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-23 22:01:42 +01:00
Marco Costalba	8b45b60327	Use do_move_bb() helpers when doing a castle Small cleanup. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-23 10:43:58 +01:00
Marco Costalba	044ad593b3	Add Tord's polynomial material balance. Use a polynomial weighted evaluation to calculate material value. This is far more flexible and elegant than applying a series of single heuristic rules as before. Also correct a design issue in which we returned two values, one for middle game and one for endgame, while instead, because game phase is a function of board material itself, only one value should be calculated and used for both mid and end game. Verified it is equivalent to the tuning branch results with parameter values sampled after 40.000 games. After 999 games at 1+0 Mod vs Orig +277 =482 -240 51.85% 518.0/999 +13 ELO Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-23 00:03:30 +01:00
Marco Costalba	5600d91cff	Rename int32 in int32_t To use the same naming rule of the other types and to be compatible with inttypes.h, used under Linux. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-20 10:53:41 +01:00
Marco Costalba	1cc44bcaae	Correctly set mateThreat in search(). We do not accept mate values returned by the null search, but always do a full search in those cases. So the variable mateThreat, which is set only if the null move search returns a mate value, is always false. Restore the functionality of mateThreat by moving the assignment to where it can be triggered. After 999 games at 1+0 Mod vs Orig +253 =517 -229 51.20% +8 ELO. Bug reported by xiaozhi. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-20 08:05:48 +01:00
Marco Costalba	15eb59683e	Use increased LMR horizon also in PV search. Tord says that using a lower horizon at PV nodes looks strange and inconsistent with the general philosophy of our search (i.e. always being more conservative at PV nodes). So set LMR at 3 also on search_pv(). Test result after 601 games seems to confirm this. Mod vs Orig +156 =318 -127 52.41% 315.0/601 +17 ELO Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-18 12:47:37 +02:00
Marco Costalba	620cfbb676	Reintroduce null move dynamic reduction. Test extension of LMR horizon to 3 plies alone, without touching null move search. To keep the patch minimal we still don't change LMR horizon in PV search. This will be the object of the next patch. Result seems good after 998 games: Mod vs Orig +252/=518/-228 51.20% 511.0/998 +8 ELO So dynamic null move reduction seems a bit stronger than fixed reduction even with LMR horizon set to 3. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-18 06:08:06 +01:00
Marco Costalba	fe523b2d18	Use increased LMR horizon only after a null move. Revert to LMR horizon of 2 plies. Only if the parent move is a null move increase to 3, so as to avoid the bad combination of null move reduction + LMR reduction. This is a more aggressive patch than the previous one, but it seems we are going in the wrong direction. After 531 games result is not good: Mod vs Orig +123/=265/-143 48.12% 255.5/531 -13 ELO Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-18 06:08:02 +01:00
Marco Costalba	2a203d8d6f	Combine increased LMR horizon and fixed null move reduction. Set null move reduction to R=4, but increase the LMR horizon to 3 plies. The two tweaks are related and should compensate the combined effect of null move + LMR reduction at shallow depths. Idea from Tord. After 999 games at 1+0 Mod vs Orig +251 =522 -225 51.30% + 9 ELO On Tord's iMac Core 2 Duo 2.8 GHz, one thread, Mac OS X 10.6, at 1+0 time control we have: Mod vs Orig 994-1006 -1.4 ELO But Orig version is pgo compiled and Mod is not. The PGO compiled version is about 8% faster, which corresponds to about 7 Elo points. This means that results are reasonably consistent. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-18 06:07:58 +01:00
Tord Romstad	b8326edea3	Introduced the UCI_AnalyseMode option, and made the evaluation function symmetrical in analyse mode. No functional change when playing games.	2009-07-17 22:26:01 +02:00
Marco Costalba	20e8738901	Fix two compile errors in new endgame code Code that compiles cleanly under MSVC triggers one compile error (correct) under Intel C++ and two(!) under gcc. The first is the same complained by Intel, but the second is an interesting corner case of C++ standard (there are many) that is correctly spotted only by gcc. Both MSVC and Intel pass this silently, probably to avoid breaking people code. Now we are fully C++ compliant ;-) Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-17 19:29:25 +01:00
Marco Costalba	b3b1d3aaa7	Move constant bitboard arrays from header to cpp file. This avoids duplicating storage allocation for every file where they are used. Note that simple numeric constants can remain in the header because they are automatically folded by the compiler. Patch suggested by Tord. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-17 16:25:53 +01:00
Marco Costalba	0d69ac33ff	Remove even more redundancy in endgame functions handling Push on the templatization even more to chip out some code and take the opportunity to show some neat template trick ;-) Ok. I would say we can stop here now....it is quickly becoming a style exercise but we are not boost developers so give it a stop. No functional change. Signed-off-by: Marco Costalba <mcostalba@gmail.com>	2009-07-17 16:05:19 +01:00
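
Note on commit 6f1475b6fc: the key split it describes (high 32 bits stored in the entry, low 32 bits used to pick the cluster) can be sketched as below. This is only an illustration, not the code from the commit. The names SplitKey and split_key are hypothetical, and reducing the low bits with a mask assumes the number of clusters is a power of two, which the commit message does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: the two halves of a 64-bit position key.
struct SplitKey {
    std::uint32_t stored;   // high 32 bits, kept inside the TT entry
    std::size_t   cluster;  // cluster index derived from the low 32 bits
};

// Assumption: clusterCount is a power of two, so a mask selects the index.
inline SplitKey split_key(std::uint64_t posKey, std::size_t clusterCount) {
    assert(clusterCount != 0 && (clusterCount & (clusterCount - 1)) == 0);
    const std::uint32_t low = static_cast<std::uint32_t>(posKey);
    return { static_cast<std::uint32_t>(posKey >> 32),
             low & (clusterCount - 1) };
}
```

Because the stored half and the indexing half come from disjoint bits of the key, two positions that land in the same cluster still differ in the stored 32 bits unless their full keys agree in the high half as well.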
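
Note on commits 60b5da4cc8 and 8d369600ec: the aligned, double prefetch they describe (round the address down to a cache line boundary, then prefetch that line and the one 64 bytes later) might look roughly like this. It is a sketch under assumptions: a 64-byte cache line, the gcc builtin __builtin_prefetch mentioned in f4140ecc0c, and the hypothetical names align_to_cache_line and prefetch_two_lines. The MSVC build measured in 8d369600ec would need a different intrinsic.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 64-byte cache lines, as in the "aligned 64" measurements.
constexpr std::uintptr_t CacheLineSize = 64;

// Round an address down to the start of its cache line.
inline std::uintptr_t align_to_cache_line(std::uintptr_t addr) {
    return addr & ~(CacheLineSize - 1);
}

// Prefetch the aligned line and the following one (the "0+64" variant).
inline void prefetch_two_lines(const void* p) {
    const std::uintptr_t base =
        align_to_cache_line(reinterpret_cast<std::uintptr_t>(p));
#if defined(__GNUC__)
    __builtin_prefetch(reinterpret_cast<const void*>(base));
    __builtin_prefetch(reinterpret_cast<const void*>(base + CacheLineSize));
#else
    (void)base;  // no portable prefetch; do nothing on other compilers
#endif
}
```

A prefetch is only a hint, so calling prefetch_two_lines() never changes program results; it can only change how long the later memory access takes.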
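
Note on commit 54382f8b07: the two root LMR conditions it quotes can be compared directly. The sketch below assumes a zero-based move index i and uses the hypothetical helper names lmr_old, lmr_new and first_lmr_index; only the two inequalities come from the commit message.

```cpp
#include <cassert>

// Old condition from the commit message: i + MultiPV >= LMRPVMoves.
inline bool lmr_old(int i, int multiPV, int lmrPVMoves) {
    return i + multiPV >= lmrPVMoves;
}

// New condition from the commit message: i >= MultiPV + LMRPVMoves - 2.
inline bool lmr_new(int i, int multiPV, int lmrPVMoves) {
    return i >= multiPV + lmrPVMoves - 2;
}

// First zero-based move index at which a rule allows LMR.
inline int first_lmr_index(bool (*rule)(int, int, int),
                           int multiPV, int lmrPVMoves) {
    assert(multiPV >= 1 && lmrPVMoves >= 1);
    int i = 0;
    while (!rule(i, multiPV, lmrPVMoves))
        ++i;  // both rules become true once i is large enough
    return i;
}
```

With an illustrative LMRPVMoves of 15, both rules start LMR at index 14 when MultiPV is 1. With MultiPV set to 3 the old rule starts at index 12 while the new one starts at index 16, which is again LMRPVMoves - 1 moves after the last PV move (index 2).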
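
Note on the match results quoted in several messages (for example dad632ce5b and 044ad593b3): the log does not say how the Elo figures were derived. The sketch below assumes the usual logistic Elo model, which does reproduce the quoted numbers after rounding; the function name elo_diff is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Elo difference implied by a match result, assuming the logistic model
//   score = 1 / (1 + 10^(-diff / 400)),
// where score counts a win as 1 and a draw as 0.5.
inline double elo_diff(int wins, int draws, int losses) {
    const int games = wins + draws + losses;
    assert(games > 0);
    const double score = (wins + 0.5 * draws) / games;
    assert(score > 0.0 && score < 1.0);  // undefined for 0% or 100%
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

For +534 =1033 -433 the score is 1050.5/2000 = 52.525%, and elo_diff(534, 1033, 433) rounds to the +18 quoted in dad632ce5b. For +277 =482 -240 it rounds to the +13 quoted in 044ad593b3.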


804 commits