Don't wait for the search to finish after a 'stop'
command, but keep processing the GUI input if any.
Also explicitly wake up the main thread (that could be
sleeping) after a 'stop' or 'quit' command and do not
rely on wait_for_search_finished() doing it for us.
This patch cleans up the code and the function definitions,
but it is risky and needs thorough testing under different
conditions to be sure it does not introduce hangs.
No functional change.
Also handle the SMP case. This has been quite tricky: it is not
trivial to enforce the node limit in the SMP case because
with the "helpful master" concept we can have recursive split
points and we cannot lock them all at once so there is the
risk of counting the same nodes more than once.
Anyhow this patch should be race free and counted nodes are
correct.
No functional change.
It is very difficult and risky to guarantee
that a running thread doesn't access a global
variable. Currently none does, but this could
change in the future and we don't want to rely
on code that works 'by accident'. The threads
are still running when ThreadPool destructor is
called (after main() returns) and this could
lead to crashes if a thread accesses a global
that has been already freed. The solution is to
use an exit() function and call it while we are
still in main(), ensuring global variables are
still alive at threads termination time.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Before the search we set up the root position by playing all the
moves (sent by the GUI) from the start position up to the position
from which the search starts.
To do this we use a set of StateInfo records used by each
do_move() call. These records shall be kept valid during all
the search because repetition draw detection uses them to back
track all the earlier position keys. The problem is that, while
searching, the GUI could send another 'position' command, which
calls set_position(), which clears the states! Of course a crash
follows shortly.
Before searching all the relevant parameters are copied in
start_searching() just for this reason: to fully detach data
accessed during the search from the UCI protocol handling.
So the natural solution would be to copy also the setup states.
Unfortunately this approach does not work because StateInfo
contains a pointer to the previous record, so naively copying and
then freeing the original memory leads to a crash.
That's why we use two std::auto_ptr (one belonging to UCI and another
to Search) to safely transfer ownership of the StateInfo records to
the search, after we have set up the root position.
As a nice side-effect all the possible memory leaks are magically
sorted out for us by std::auto_ptr semantics.
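The ownership hand-off described above can be sketched roughly as follows. Note that std::auto_ptr was deprecated in C++11 and removed in C++17, so this sketch swaps in std::unique_ptr with std::move; it is not the code of the patch. StateInfo is a reduced stand-in, and the names SetupStates, uciStates, searchStates, setup_position(), start_searching_sketch() and count_occurrences() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <utility>
#include <vector>

// Reduced stand-in for the real StateInfo: only a key and the back link.
struct StateInfo {
    unsigned long long key;
    StateInfo* previous;
};

// std::deque does not relocate existing elements on push_back, so the
// 'previous' pointers stay valid while the chain grows.
typedef std::deque<StateInfo> SetupStates;

// Hypothetical owners: one on the UCI side, one on the search side.
std::unique_ptr<SetupStates> uciStates;
std::unique_ptr<SetupStates> searchStates;

// UCI side: build a fresh chain, one record per move played.
StateInfo* setup_position(const std::vector<unsigned long long>& keys) {
    uciStates.reset(new SetupStates());
    StateInfo* prev = nullptr;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        uciStates->push_back(StateInfo{keys[i], prev});
        prev = &uciStates->back();
    }
    return prev; // state of the root position
}

// Search side: take ownership. The records themselves are not copied or
// moved in memory; only the owning pointer changes hands.
void start_searching_sketch() {
    assert(uciStates != nullptr);
    searchStates = std::move(uciStates);
}

// Walk back through the chain, as repetition detection would.
int count_occurrences(const StateInfo* st, unsigned long long key) {
    int n = 0;
    for (; st; st = st->previous)
        if (st->key == key)
            ++n;
    return n;
}
```

With this layout a later 'position' command only replaces uciStates; the chain owned by searchStates stays alive until the search side lets it go.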
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Instead of just size(). Although the code is longer, it
should be easier to understand at a glance.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
To mimic C++11 std::mutex and std::condition_variable,
also rename locks and condition variables to be more
uniform across the classes.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We can detect the split point master also from within idle_loop,
so we can call the function without parameters and remove an
overloaded member hack in Thread class.
Note that we don't need to take a lock around curSplitPoint
when entering idle_loop() because if we are the master then
curSplitPoint cannot change under our feet (because is_searching
is set and so we cannot be reallocated), if we are a slave
we enter idle_loop() only upon Thread creation and in that case
splitPointsCnt is always 0. This is true even in the very rare
case that curSplitPoint != NULL, if we have already been allocated
even before entering idle_loop().
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
After 6K games at 60" + 0.1 on a QUAD with 4 threads
this implementation fails to show a measurable increase;
the result is well within the error bar.
Perhaps with 8 or more threads the result is better, but we
don't have the hardware to test. So retire it for now and
possibly re-add it in the future if it proves good on big
machines.
The only good news is that we don't have a regression and
implementation is stable and bug-free, so could be reused
somewhere in the future.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
The check for detecting when a split point has all the
slaves still running is done with:
slavesMask == allSlavesMask
When a thread reparents, slavesMask is increased. Then, if
the same thread finishes because there are no more moves,
slavesMask returns to its original value and the above condition
becomes true again. So the thread that has just finished immediately
reparents to the same split point, starts, and
then immediately exits, in a tight loop that ends only when a
second slave finishes, so that slavesMask decrements and the
condition becomes false. This gives a spurious and anomalously
high number of fake reparents.
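The tight loop described above can be reproduced with a small single-threaded simulation. This is only a sketch of the faulty check, not the engine code: SplitPointSketch, can_reparent() and count_fake_reparents() are hypothetical names, and the assumption is that the reparenting thread's bit is not part of allSlavesMask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, reduced split point: only the two masks from the text.
struct SplitPointSketch {
    std::uint64_t allSlavesMask; // slaves booked when the split was made
    std::uint64_t slavesMask;    // slaves currently searching below it
};

// The check quoted above: every originally booked slave is still running.
bool can_reparent(const SplitPointSketch& sp) {
    return sp.slavesMask == sp.allSlavesMask;
}

// Simulate one idle thread that joins whenever the check passes and then
// finishes at once because no moves are left. 'otherSlaveFinishesAt' is
// the iteration at which a second slave (bit 0) completes; a negative
// value means it never completes within the simulation.
int count_fake_reparents(SplitPointSketch sp, int threadId,
                         int otherSlaveFinishesAt, int maxIterations) {
    const std::uint64_t bit = std::uint64_t(1) << threadId;
    int reparents = 0;
    for (int i = 0; i < maxIterations; ++i) {
        if (i == otherSlaveFinishesAt)
            sp.slavesMask &= ~std::uint64_t(1); // second slave is done
        if (!can_reparent(sp))
            break;
        sp.slavesMask |= bit;  // reparent: mask grows, check now fails
        ++reparents;
        sp.slavesMask &= ~bit; // nothing to search: mask is restored
    }
    return reparents;
}
```

In this model the number of fake reparents grows with the time the second slave takes to finish, which matches the inflated success rate reported above.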
With this patch, which rewrites the logic to avoid this pitfall,
the reparenting success rate drops to a more realistic 5-10%
for the 4 threads case.
As a side effect, note that the maxThreadsPerSplitPoint limit
no longer applies when reparenting.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And update master->splitPointsCnt under lock
protection. Not strictly necessary because
single_bit() condition takes care of false
positives anyhow, but it is a bit tricky and
moving under lock is the most natural thing
to do to avoid races with "reparenting".
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In the Young Brothers Wait Concept (YBWC) available slaves are
booked by the split point master, then start to search below
the assigned split point and, once finished, return to the idle
state waiting to be booked by another master.
This patch introduces "Active Reparenting" so that when a
slave finishes its job on the assigned split point, instead
of passively waiting to be booked, it searches for a suitable active
split point and reparents itself to that split point. Then it
immediately starts to search below the split point in exactly
the same way as the other slaves of that split point. This reduces
to zero the time spent waiting in the idle loop and should increase
scalability, especially with many (8 or more) cores.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Unfortunately accessing a thread local variable
is much slower than accessing object data (see the previous
patch log msg), so we have to revert to the old code
to avoid a speed regression.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Much faster than pthread_getspecific() but still a
speed regression against the original code.
Following are the nps on a bench:
Position      454165  454838  455433
tls           441046  442767  442767
ms (Win)      450521  447510  451105
ms (pthread)  422115  422115  424276
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
So as to avoid a crash when setting the moves in the
UCI "position startpos moves ...." command.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
But use the newly introduced local storage
for this. A good code simplification and also
the correct way to go.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Use thread local storage to store a pointer to the thread we
are running on. This will allow us to remove thread info from
the Position class.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A std::set (that is a rb_tree) seems really
overkill to store at most a handful of moves
and nothing in the common case.
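As a rough illustration of the replacement, a plain std::vector with a linear search is enough for a container that is usually empty and otherwise holds only a few moves. This is a sketch under assumptions, not the patched code: Move is a stand-in type, MoveList, contains() and is_allowed() are hypothetical names, and the "empty means no restriction" convention is assumed for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef int Move; // stand-in for the engine's move type

// Hypothetical container of moves; empty in the common case.
typedef std::vector<Move> MoveList;

// Linear membership test: for a handful of elements this scans one small
// contiguous buffer, with no tree nodes to allocate or rebalance.
bool contains(const MoveList& moves, Move m) {
    return std::find(moves.begin(), moves.end(), m) != moves.end();
}

// Convention assumed for this sketch: an empty list means "no restriction".
bool is_allowed(const MoveList& restricted, Move m) {
    return restricted.empty() || contains(restricted, m);
}
```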
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In particular before waking up the main thread, which
could take some random time. Until we reset the
search time we are not able to correctly track
the elapsed search time and this can be dangerous
under extreme time pressure.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
In this case SF stops searching and goes to sleep,
waiting for a stop / ponderhit before returning the
best move. So when a "stop" arrives we need to wake
up the main thread again.
Another regression introduced by 3aa471f2a9,
hopefully the last one.
Thanks to Otello1984 for reporting this.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Fixes a newly introduced, not so rare crash (once
every 100 games). Unfortunately I am still not
able to figure out why :-(
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And add the final touches to this long patch series.
The whole series has been verified against regressions with
20K games at fast TC.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We cannot set the do_sleep flag of the main thread before
"bestmove" is sent to the GUI, otherwise the GUI could immediately
send the next "go" command, which triggers
start_thinking(), and because do_sleep is set the UI
thread resets the flag to launch a new search. But
when shortly afterwards the main thread returns to main_loop(),
the flag is incorrectly reset and the main thread goes to sleep,
hanging the engine.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We store pointers instead of Thread objects because
Thread is neither copy-constructible nor copy-assignable
and the default ones are not suitable. So we cannot store
Thread objects directly in a std::vector.
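A minimal sketch of the pointer-based storage, assuming a reduced, non-copyable stand-in for Thread. ThreadSketch and PoolSketch are hypothetical names; the real classes carry much more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduced stand-in for Thread: copying is disabled, as in the text.
class ThreadSketch {
public:
    explicit ThreadSketch(int id) : idx(id) {}
    ThreadSketch(const ThreadSketch&) = delete;
    ThreadSketch& operator=(const ThreadSketch&) = delete;
    int idx;
};

// Hypothetical pool: the vector holds pointers, so growing or shrinking
// it copies only pointers and never a ThreadSketch object.
class PoolSketch {
public:
    PoolSketch() {}
    PoolSketch(const PoolSketch&) = delete;
    PoolSketch& operator=(const PoolSketch&) = delete;
    ~PoolSketch() { resize(0); }

    void resize(std::size_t n) {
        while (threads.size() < n)
            threads.push_back(new ThreadSketch(int(threads.size())));
        while (threads.size() > n) {
            delete threads.back();
            threads.pop_back();
        }
    }
    std::size_t size() const { return threads.size(); }
    ThreadSketch& operator[](std::size_t i) { return *threads[i]; }

private:
    std::vector<ThreadSketch*> threads;
};
```

A side benefit of this layout is that each object keeps its address when the vector reallocates, so references handed out earlier remain valid.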
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Associate the platform OS thread with the Thread class instead of
creating it from ThreadsManager.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Split the data allocation, now done (mostly once)
in read_uci_options(), from the wake up and sleeping
of the slave threads upon entering/exiting the search.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
A somewhat tricky function pointer cast allows us
to move the platform specifics to lock.h. The cast
is tricky because the return type is not the same as that of the
cast function on Linux (on Windows the return type is
a DWORD, which is an unsigned long), but this should not be a
problem as long as the size is the same.
From: http://stackoverflow.com/questions/188839/function-pointer-cast-to-different-signature
"OpenSSL was only casting functions pointers to
other function types taking and returning the same
number of values of the same exact sizes, and this
(assuming you're not dealing with floating-point)
happens to be safe across all the platforms and
calling conventions I know of. However, anything
else is potentially unsafe."
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
Introduce 'on change' actions that are triggered as soon as
a UCI option is changed by the GUI. This allows setting the hash
size before the game starts, which is helpful especially with very
fast TC and a big TT size.
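One way such an 'on change' action could look is sketched below: the option stores a callback and fires it whenever a new value is assigned. This is an illustration under assumptions, not the patch itself; OptionSketch, on_hash_size(), make_options() and the ttSizeMB variable are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

class OptionSketch {
public:
    typedef void (*OnChange)(const OptionSketch&);

    OptionSketch() : onChange(nullptr) {}
    explicit OptionSketch(const std::string& v, OnChange f = nullptr)
        : value(v), onChange(f) {}

    // Called when the GUI changes the option: store the new value, then
    // run the hook immediately instead of waiting for the next search.
    OptionSketch& operator=(const std::string& v) {
        value = v;
        if (onChange)
            onChange(*this);
        return *this;
    }

    int as_int() const { return std::stoi(value); }

private:
    std::string value;
    OnChange onChange;
};

// Hypothetical transposition-table size, updated by the hook below.
int ttSizeMB = 0;

void on_hash_size(const OptionSketch& o) { ttSizeMB = o.as_int(); }

// Build a small option map with a hook attached to "Hash".
std::map<std::string, OptionSketch> make_options() {
    std::map<std::string, OptionSketch> o;
    o["Hash"] = OptionSketch("32", on_hash_size);
    return o;
}
```

In this sketch the hook does not run on construction, only on assignment, so the default value has to be applied separately if that is needed.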
As a side effect remove the 'button' type option, which is
now managed as a 'check' type.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
And introduce SplitPoint bestMove to pass back the
best move after a split point.
This allows the search stack passed to split to be
defined as const.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>
We shouldn't need lock protection to increment
splitPointsCnt and set curSplitPoint of masterThread.
Anyhow, because this code is very tricky and prone to
races, confine the change to a single patch.
No functional change.
Signed-off-by: Marco Costalba <mcostalba@gmail.com>