Commit dfb06020 · Author: lateralusX · Committer: Alex Thibodeau

Wait on runtime threads to park on joinable thread list during shutdown.

https://github.com/mono/mono/pull/5599 fixed a race condition during shutdown
that occurs when runtime threads are partway through detach but still
depend on runtime resources, such as GC memory. The fix added runtime threads
to the joinable thread list just before they dropped off mono_thread_manage's
radar, making sure shutdown waited on each thread before cleaning up.

That fix slightly changed the behavior of the finalizer thread: since it
waits on joinable threads, it can now potentially block on threads that are
still executing code that involves runtime resources. There was an assumption
that threads on the joinable thread list are very close to completion when
added, so join calls coming from the finalizer thread should almost never
block, and if one does, the code that remains to execute should not involve
runtime operations that risk deadlock. Adding the thread to the list earlier
than before exposes shutdown to some theoretical problems.

To mitigate the risk and still solve the race condition, this commit adds a
mechanism that keeps track of active runtime threads until they park on the
joinable thread list. The shutdown thread waits on the pending counter just
before it does its regular wait on all joinable threads (after the finalizer
thread has stopped), to make sure all runtime threads have been added to the
joinable thread list before waiting on them. Threads are added to the joinable
thread list as late as possible, exactly as was done in the past by
sgen_client_thread_detach_with_lock. The shutdown thread waits a short time
for runtime threads to appear on the list; if the wait times out (the pending
runtime thread count does not reach 0 in time), it just prints a warning and
continues shutdown.
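
The pending-counter idea described above can be sketched with plain POSIX threads. This is only an illustration under assumptions, not the runtime's code: `pending_lock`, `pending_zero`, `pending_count`, and the three functions are hypothetical stand-ins for the joinable threads lock, the zero-pending condition variable, and the helpers added in the diff.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for the runtime's lock, counter, and condition. */
static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pending_zero = PTHREAD_COND_INITIALIZER;
static int pending_count = 0;

/* Called by a thread when it begins detaching from the runtime. */
static void
pending_add (void)
{
	pthread_mutex_lock (&pending_lock);
	pending_count++;
	pthread_mutex_unlock (&pending_lock);
}

/* Called by a thread once it has parked on the joinable list. */
static void
pending_remove (void)
{
	pthread_mutex_lock (&pending_lock);
	assert (pending_count > 0);
	if (--pending_count == 0)
		pthread_cond_broadcast (&pending_zero);
	pthread_mutex_unlock (&pending_lock);
}

/* Wait up to timeout_ms for the count to reach zero.
 * Returns true if it did, false if the wait timed out. */
static bool
pending_wait (uint32_t timeout_ms)
{
	struct timespec deadline;
	clock_gettime (CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock (&pending_lock);
	int res = 0;
	/* Re-check the predicate after every wakeup; stop at the deadline. */
	while (pending_count > 0 && res != ETIMEDOUT)
		res = pthread_cond_timedwait (&pending_zero, &pending_lock, &deadline);
	bool done = pending_count == 0;
	pthread_mutex_unlock (&pending_lock);
	return done;
}
```

In this sketch a shutdown path would call `pending_wait` once, log a warning if it returns false, and then continue with its regular join of the threads already on the list.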

Entering a wait state during shutdown because runtime threads have not yet been added to
the joinable thread list should be very rare (it requires hitting the previous race condition, which was itself rare).
Triggering the timeout should be rarer still. If that ever happens, we are exposed
to the same shutdown race condition as in the past, but now we at least get a warning in
the log, which makes further analysis simpler.

This commit also fixes a problem where the debugger thread hit the same race condition as above.
When stopping the debugger thread, the shutdown thread did not wait for it to completely stop using runtime
resources before continuing the shutdown sequence. This triggered the same race condition as when shutting
down regular runtime threads. This commit makes sure stop_debugger_thread waits for the debugger thread
handle to become signaled (which happens at the very end of the thread's lifetime) before continuing the shutdown
logic.
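
The gap between "the thread says it has exited" and "the thread has really finished" can be illustrated with a small POSIX sketch. Everything here is an assumption made for illustration: `worker_exited`, `worker_finished`, `worker_main`, and `stop_worker_and_wait` are hypothetical names, and `pthread_join` merely stands in for waiting on a thread handle that is signaled at the very end of the thread's lifetime.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical flags: "exited" is set early, "finished" at the very end. */
static atomic_bool worker_exited;
static atomic_bool worker_finished;

static void *
worker_main (void *arg)
{
	(void)arg;
	/* The thread announces that it is exiting ... */
	atomic_store (&worker_exited, true);
	/* ... but still runs teardown that may touch shared resources. */
	struct timespec pause = { 0, 20 * 1000 * 1000 };
	nanosleep (&pause, NULL);
	atomic_store (&worker_finished, true);
	return NULL;
}

/* Returns true if the worker had fully finished when the wait returned. */
static bool
stop_worker_and_wait (void)
{
	pthread_t worker;
	if (pthread_create (&worker, NULL, worker_main, NULL) != 0)
		return false;

	/* Waiting on the flag alone is not enough: teardown may still run. */
	while (!atomic_load (&worker_exited))
		sched_yield ();

	/* Waiting for real thread termination closes the window. */
	pthread_join (worker, NULL);
	return atomic_load (&worker_finished);
}
```

If the join were omitted, the caller could proceed to free shared resources while the worker is still inside its teardown, which is the shape of the race described above.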
Parent 940e4027
......@@ -160,6 +160,14 @@ static MonoGHashTable *threads_starting_up = NULL;
static GHashTable *joinable_threads;
static gint32 joinable_thread_count;
static GHashTable *pending_joinable_threads;
static gint32 pending_joinable_thread_count;
static mono_cond_t zero_pending_joinable_thread_event;
static void threads_add_pending_joinable_runtime_thread (MonoThreadInfo *mono_thread_info);
static gboolean threads_wait_pending_joinable_threads (uint32_t timeout);
#define SET_CURRENT_OBJECT(x) mono_tls_set_thread (x)
#define GET_CURRENT_OBJECT() (MonoInternalThread*) mono_tls_get_thread ()
......@@ -763,6 +771,17 @@ mono_thread_detach_internal (MonoInternalThread *thread)
MONO_PROFILER_RAISE (thread_stopping, (thread->tid));
/*
* Prevent race condition between thread shutdown and runtime shutdown.
* Including all runtime threads in the pending joinable count will make
* sure shutdown will wait for them to get onto the joinable thread list before
* critical resources have been cleaned up (like GC memory). Threads getting onto
* the joinable thread list should be just about to exit and should not block a potential
* join call. Owners of threads attached to the runtime but not identified as runtime
* threads need to make sure thread detach calls won't race with runtime shutdown.
*/
threads_add_pending_joinable_runtime_thread (info);
#ifndef HOST_WIN32
mono_w32mutex_abandon ();
#endif
......@@ -775,17 +794,6 @@ mono_thread_detach_internal (MonoInternalThread *thread)
thread->abort_exc = NULL;
thread->current_appcontext = NULL;
/*
* Prevent race condition between execution of this method and runtime shutdown.
* Adding runtime thread to the joinable threads list will make sure runtime shutdown
* won't complete until added runtime thread have exited. Owner of threads attached to the
* runtime but not identified as runtime threads needs to make sure thread detach calls won't
* race with runtime shutdown.
*/
#ifdef HOST_WIN32
mono_threads_add_joinable_runtime_thread (info);
#endif
/*
* thread->synch_cs can be NULL if this was called after
* ves_icall_System_Threading_InternalThread_Thread_free_internal.
......@@ -3031,6 +3039,8 @@ void mono_thread_init (MonoThreadStartCB start_cb,
mono_os_event_init (&background_change_event, FALSE);
mono_os_cond_init (&zero_pending_joinable_thread_event);
mono_init_static_data_info (&thread_static_info);
mono_init_static_data_info (&context_static_info);
......@@ -3122,6 +3132,13 @@ mono_thread_callbacks_init (void)
void
mono_thread_cleanup (void)
{
/* Wait for pending threads to park on joinable threads list */
/* NOTE, waiting on this should be extremely rare and will only happen */
/* under certain specific conditions. */
gboolean wait_result = threads_wait_pending_joinable_threads (2000);
if (!wait_result)
g_warning ("Waiting on threads to park on joinable thread list timed out.");
mono_threads_join_threads ();
#if !defined(RUN_IN_SUBTHREAD) && !defined(HOST_WIN32)
......@@ -3143,6 +3160,7 @@ mono_thread_cleanup (void)
mono_os_mutex_destroy (&interlocked_mutex);
mono_os_mutex_destroy (&delayed_free_table_mutex);
mono_os_mutex_destroy (&small_id_mutex);
mono_os_cond_destroy (&zero_pending_joinable_thread_event);
mono_os_event_destroy (&background_change_event);
#endif
}
......@@ -3367,6 +3385,7 @@ mono_thread_manage (void)
mono_threads_unlock ();
return;
}
mono_threads_unlock ();
do {
......@@ -5221,6 +5240,85 @@ threads_add_joinable_thread_nolock (gpointer tid)
}
#endif
static void
threads_add_pending_joinable_thread (gpointer tid)
{
joinable_threads_lock ();
if (!pending_joinable_threads)
pending_joinable_threads = g_hash_table_new (NULL, NULL);
gpointer orig_key;
gpointer value;
if (!g_hash_table_lookup_extended (pending_joinable_threads, tid, &orig_key, &value)) {
g_hash_table_insert (pending_joinable_threads, tid, tid);
UnlockedIncrement (&pending_joinable_thread_count);
}
joinable_threads_unlock ();
}
static void
threads_add_pending_joinable_runtime_thread (MonoThreadInfo *mono_thread_info)
{
g_assert (mono_thread_info);
if (mono_thread_info->runtime_thread) {
threads_add_pending_joinable_thread ((gpointer)(MONO_UINT_TO_NATIVE_THREAD_ID (mono_thread_info_get_tid (mono_thread_info))));
}
}
static void
threads_remove_pending_joinable_thread_nolock (gpointer tid)
{
gpointer orig_key;
gpointer value;
if (pending_joinable_threads && g_hash_table_lookup_extended (pending_joinable_threads, tid, &orig_key, &value)) {
g_hash_table_remove (pending_joinable_threads, tid);
if (UnlockedDecrement (&pending_joinable_thread_count) == 0)
mono_os_cond_broadcast (&zero_pending_joinable_thread_event);
}
}
static gboolean
threads_wait_pending_joinable_threads (uint32_t timeout)
{
if (UnlockedRead (&pending_joinable_thread_count) > 0) {
joinable_threads_lock ();
if (timeout == MONO_INFINITE_WAIT) {
while (UnlockedRead (&pending_joinable_thread_count) > 0)
mono_os_cond_wait (&zero_pending_joinable_thread_event, &joinable_threads_mutex);
} else {
gint64 start = mono_msec_ticks ();
gint64 elapsed = 0;
while (UnlockedRead (&pending_joinable_thread_count) > 0 && elapsed < timeout) {
mono_os_cond_timedwait (&zero_pending_joinable_thread_event, &joinable_threads_mutex, timeout - (uint32_t)elapsed);
elapsed = mono_msec_ticks () - start;
}
}
joinable_threads_unlock ();
}
return UnlockedRead (&pending_joinable_thread_count) == 0;
}
static void
threads_add_unique_joinable_thread_nolock (gpointer tid)
{
if (!joinable_threads)
joinable_threads = g_hash_table_new (NULL, NULL);
gpointer orig_key;
gpointer value;
if (!g_hash_table_lookup_extended (joinable_threads, tid, &orig_key, &value)) {
threads_add_joinable_thread_nolock (tid);
UnlockedIncrement (&joinable_thread_count);
}
}
void
mono_threads_add_joinable_runtime_thread (gpointer thread_info)
{
......@@ -5228,8 +5326,19 @@ mono_threads_add_joinable_runtime_thread (gpointer thread_info)
MonoThreadInfo *mono_thread_info = (MonoThreadInfo*)thread_info;
if (mono_thread_info->runtime_thread) {
if (mono_atomic_cas_i32 (&mono_thread_info->thread_pending_native_join, TRUE, FALSE) == FALSE)
mono_threads_add_joinable_thread ((gpointer)(MONO_UINT_TO_NATIVE_THREAD_ID (mono_thread_info_get_tid (mono_thread_info))));
gpointer tid = (gpointer)(MONO_UINT_TO_NATIVE_THREAD_ID (mono_thread_info_get_tid (mono_thread_info)));
joinable_threads_lock ();
// Add to joinable thread list, if not already included.
threads_add_unique_joinable_thread_nolock (tid);
// Remove thread from pending joinable list, if present.
threads_remove_pending_joinable_thread_nolock (tid);
joinable_threads_unlock ();
mono_gc_finalize_notify ();
}
}
......@@ -5248,15 +5357,7 @@ mono_threads_add_joinable_thread (gpointer tid)
* we have time (in the finalizer thread).
*/
joinable_threads_lock ();
if (!joinable_threads)
joinable_threads = g_hash_table_new (NULL, NULL);
gpointer orig_key;
gpointer value;
if (!g_hash_table_lookup_extended (joinable_threads, tid, &orig_key, &value)) {
threads_add_joinable_thread_nolock (tid);
UnlockedIncrement (&joinable_thread_count);
}
threads_add_unique_joinable_thread_nolock (tid);
joinable_threads_unlock ();
mono_gc_finalize_notify ();
......
......@@ -1861,7 +1861,8 @@ stop_debugger_thread (void)
mono_coop_mutex_unlock (&debugger_thread_exited_mutex);
} while (!debugger_thread_exited);
mono_native_thread_join (debugger_thread_id);
if (debugger_thread_handle)
mono_thread_info_wait_one_handle (debugger_thread_handle, MONO_INFINITE_WAIT, TRUE);
}
transport_close2 ();
......
......@@ -231,8 +231,6 @@ typedef struct {
*/
gint32 profiler_signal_ack;
gint32 thread_pending_native_join;
#ifdef USE_WINDOWS_BACKEND
gint32 win32_apc_info;
gpointer win32_apc_info_io_handle;
......