    pthread stack treatment overhaul for application-provided stacks, etc. · d5142642
    Rich Felker committed
    the main goal of these changes is to address the case where an
    application provides a stack of size N, but TLS has size M that's a
    significant portion of the size N (or even larger than N), thus giving
    the application less stack space than it expected or no stack at all!
    
    the new strategy pthread_create now uses is to only put TLS on the
    application-provided stack if TLS is smaller than 1/8 of the stack
    size or 2k, whichever is smaller. this ensures that the application
    always has "close enough" to what it requested, and the threshold is
    chosen heuristically to make sure "sane" amounts of TLS still end up
    in the application-provided stack.
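
the placement rule above can be sketched as a small predicate. this is an illustration of the rule as described, not the code in pthread_create.c; the function name and the 2048-byte constant are assumptions made for the example.

```c
#include <stddef.h>

/* hypothetical helper: returns nonzero if TLS of tls_size bytes would be
 * placed on an application-provided stack of stack_size bytes under the
 * rule "smaller than 1/8 of the stack size or 2k, whichever is smaller". */
int tls_fits_on_app_stack(size_t stack_size, size_t tls_size)
{
    /* assumed reading of "2k": 2048 bytes */
    const size_t cap = 2048;

    /* strict comparisons, since the text says "smaller than" */
    return tls_size < stack_size / 8 && tls_size < cap;
}
```

with a 4096-byte stack the limit is 512 bytes (1/8 of the stack); with a 64k stack the 2k cap is the binding limit.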
    
    if TLS does not fit the above criteria, pthread_create uses mmap to
    obtain space for TLS, but still uses the application-provided stack
    for the actual call frames. this is to avoid wasting memory, and for
    the sake of supporting ugly hacks like garbage collection based on
    assumptions that the implementation will use the provided stack range.
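
for context, this is roughly how an application hands its own stack to pthread_create, and how code could check the assumption that call frames live inside the provided range. it is a minimal sketch using the standard pthread_attr_setstack interface; the struct and function names are hypothetical, and the 4096-byte alignment is an assumption chosen for the example.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* hypothetical record describing the stack handed to pthread_create */
struct stack_probe {
    uintptr_t base;   /* lowest address of the provided stack */
    size_t size;      /* size of the provided stack in bytes */
    int on_stack;     /* set by the thread: is a local inside the range? */
};

static void *probe(void *arg)
{
    struct stack_probe *p = arg;
    char local;
    uintptr_t a = (uintptr_t)&local;

    /* a local variable's address tells us where the call frame lives */
    p->on_stack = a >= p->base && a - p->base < p->size;
    return 0;
}

/* returns 1 if the thread's call frames were inside the provided stack,
 * 0 if not, -1 on setup failure. size must be a multiple of 4096. */
int runs_on_provided_stack(size_t size)
{
    struct stack_probe p = { 0, size, 0 };
    pthread_attr_t attr;
    pthread_t td;
    int result = -1;
    void *mem = aligned_alloc(4096, size);

    if (!mem) return -1;
    p.base = (uintptr_t)mem;
    if (pthread_attr_init(&attr)) { free(mem); return -1; }
    if (!pthread_attr_setstack(&attr, mem, size)
        && !pthread_create(&td, &attr, probe, &p)
        && !pthread_join(td, 0))
        result = p.on_stack;
    pthread_attr_destroy(&attr);
    free(mem);
    return result;
}
```

the check only covers call frames; where TLS ends up is not observable through this interface.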
    
    in order for the above heuristics to ever succeed, the amount of TLS
    space wasted on POSIX TSD (pthread_key_create based) needed to be
    reduced. otherwise, these changes would preclude any use of
    pthread_create without mmap, which would have serious memory usage and
    performance costs for applications trying to create huge numbers of
    threads using pre-allocated stack space. the new value of
    PTHREAD_KEYS_MAX is the minimum allowed by POSIX, 128. this should
    still be plenty more than real-world applications need, especially now
    that C11/gcc-style TLS is supported in musl, and most apps and
    libraries choose to use that instead of POSIX TSD when available.
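
to make the TSD cost concrete, here is a rough calculation. it assumes each key costs one pointer-sized slot per thread; that layout is an assumption made for the example, not something stated above, and the names are hypothetical.

```c
#include <stddef.h>

/* the new limit named in the text */
#define SKETCH_KEYS_MAX 128

/* hypothetical helper: per-thread bytes reserved for POSIX TSD,
 * assuming one pointer-sized slot per key */
size_t tsd_bytes(size_t nkeys)
{
    return nkeys * sizeof(void *);
}
```

under that assumption, 128 keys cost 1024 bytes per thread with 8-byte pointers and 512 bytes with 4-byte pointers, which stays below the 2k threshold before any other TLS is counted.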
    
    at the same time, PTHREAD_STACK_MIN has been decreased. it was
    originally set to PAGE_SIZE back when there was no support for TLS or
    application-provided stacks, and requests smaller than a whole page
    did not make sense. now, there are two good reasons to support
    requests smaller than a page: (1) applications could provide
    pre-allocated stacks smaller than a page, and (2) with smaller stack
    sizes, stack+TLS+TSD can all fit in one page, making it possible for
    applications which need huge numbers of threads with minimal stack
    needs to allocate exactly one page per thread. the new value of
    PTHREAD_STACK_MIN, 2k, is aligned with the minimum size for
    sigaltstack.
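
the one-page-per-thread budget can be checked with simple arithmetic. the sketch below is illustrative only: the function name is hypothetical, TSD is assumed to cost one pointer per key, and the thread descriptor and any alignment padding are ignored.

```c
#include <stddef.h>

/* hypothetical helper: returns nonzero if a stack of `stack` bytes, `tls`
 * bytes of TLS and TSD slots for `nkeys` keys all fit in `page` bytes */
int fits_in_one_page(size_t page, size_t stack, size_t tls, size_t nkeys)
{
    size_t tsd = nkeys * sizeof(void *);

    /* subtract step by step so the unsigned arithmetic cannot wrap */
    if (stack > page) return 0;
    if (tls > page - stack) return 0;
    return tsd <= page - stack - tls;
}
```

for example, a 2k stack plus 256 bytes of TLS plus 128 TSD slots fits in a 4096-byte page, while a stack of a full page leaves no room for anything else.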
pthread_create.c 4.8 KB