    optimized C memset · a543369e
    Committed by Rich Felker
    this version of memset is optimized both for small and large values of
    n, and makes no misaligned writes, so it is usable (and near-optimal)
    on all archs. it is capable of filling up to 52 or 56 bytes without
    entering a loop and with at most 7 branches, all of which can be fully
    predicted if memset is called multiple times with the same size.
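
The loop-free small-size path described above can be illustrated with a minimal sketch. It assumes the technique is to write bytes from both ends of the buffer, so that one size comparison covers a whole range of lengths. The helper name `small_fill` and its `n <= 8` limit are hypothetical and chosen for illustration; this is not the code from memset.c, which handles far larger sizes.

```c
#include <stddef.h>

/* Hypothetical helper: fills n bytes at dest with c, for n <= 8 only.
 * Bytes are written from both ends, so overlapping stores cover every
 * length in a range with a single branch and no loop. */
static void *small_fill(void *dest, int c, size_t n)
{
    unsigned char *s = dest;

    if (!n) return dest;
    s[0] = c;
    s[n-1] = c;
    if (n <= 2) return dest;   /* n = 1 or 2 is fully covered */
    s[1] = c;
    s[2] = c;
    s[n-2] = c;
    s[n-3] = c;
    if (n <= 6) return dest;   /* n = 3..6 is fully covered */
    s[3] = c;
    s[n-4] = c;
    return dest;               /* n = 7 or 8 is fully covered */
}
```

Three branches suffice for eight sizes here, and each branch outcome depends only on `n`, which is why repeated calls with the same size predict well.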
    
    it also uses the attribute extension to inform the compiler that it is
    violating the aliasing rules, unlike the previous code which simply
    assumed it was safe to violate the aliasing rules since translation
    unit boundaries hide the violations from the compiler. for non-GNUC
    compilers, 100% portable fallback code in the form of a naive loop is
    provided. I intend to eventually apply this approach to all of the
    string/memory functions which are doing word-at-a-time accesses.
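
As a rough sketch of the aliasing point: the message does not name the attribute, but GNU C provides `may_alias` for this purpose, so the example below assumes that is the one meant. The function name `fill_words` and the typedef `u32_alias` are hypothetical, and the alignment handling is deliberately simplified with a byte loop rather than the branch-based approach the commit describes.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __GNUC__
/* may_alias: accesses through this type are allowed to alias objects
 * of any other type, so the word stores below do not violate the
 * compiler's aliasing assumptions. */
typedef uint32_t __attribute__((__may_alias__)) u32_alias;

void *fill_words(void *dest, int c, size_t n)
{
    unsigned char *s = dest;
    /* 0x01010101 * byte replicates the byte into all four lanes */
    uint32_t c32 = ((uint32_t)-1 / 255) * (unsigned char)c;

    /* byte stores until s is 4-byte aligned (no misaligned writes) */
    while (n && ((uintptr_t)s & 3)) { *s++ = (unsigned char)c; n--; }
    /* aligned word-at-a-time stores */
    for (; n >= 4; n -= 4, s += 4) *(u32_alias *)s = c32;
    /* remaining tail bytes */
    while (n--) *s++ = (unsigned char)c;
    return dest;
}
#else
/* Portable fallback for non-GNUC compilers: a naive byte loop. */
void *fill_words(void *dest, int c, size_t n)
{
    unsigned char *s = dest;
    for (; n; n--, s++) *s = (unsigned char)c;
    return dest;
}
#endif
```

Both branches of the `#ifdef` have the same observable behavior; only the GNU C version relies on the extension, which keeps the fallback free of any aliasing assumptions.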
memset.c 2.1 KB