• R
    lib/vsprintf.c: even faster binary to decimal conversion · 7c43d9a3
    Rasmus Villemoes 提交于
    The most expensive part of decimal conversion is the divisions by 10
    (albeit done using reciprocal multiplication with appropriately chosen
    constants).  I decided to see if one could eliminate around half of
    these multiplications by emitting two digits at a time, at the cost of a
    200 byte lookup table, and it does indeed seem like there is something
    to be gained, especially on 64 bits.  Microbenchmarking shows
    improvements ranging from -50% (for numbers uniformly distributed in [0,
    2^64-1]) to -25% (for numbers heavily biased toward the smaller end, a
    more realistic distribution).
    
    On a larger scale, perf shows that top, one of the big consumers of /proc
    data, uses 0.5-1.0% fewer cpu cycles.
    
    I had to jump through some hoops to get the 32 bit code to compile and run
    on my 64 bit machine, so I'm not sure how relevant these numbers are, but
    just for comparison the microbenchmark showed improvements between -30%
    and -10%.
    
    The bloat-o-meter costs are around 150 bytes (the generated code is a
    little smaller, so it's not the full 200 bytes) on both 32 and 64 bit.
    I'm aware that extra cache misses won't show up in a microbenchmark as
    used above, but on the other hand decimal conversions often happen in bulk
    (for example in the case of top).
    
    I have of course tested that the new code generates the same output as the
    old, for both the first and last 1e10 numbers in [0,2^64-1] and 4e9
    'random' numbers in-between.
    
    Test and verification code on github: https://github.com/Villemoes/dec.
    Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
    Tested-by: NJeff Epler <jepler@unpythonic.net>
    Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Joe Perches <joe@perches.com>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
    7c43d9a3
vsprintf.c 63.4 KB