• G
    bpf, x64: fix JIT emission for dead code · 1612a981
    Gianluca Borello 提交于
    Commit 2a5418a1 ("bpf: improve dead code sanitizing") replaced dead
    code with a series of ja-1 instructions, for safety. That made JIT
    compilation much more complex for some BPF programs. One instance of such
    programs is, for example:
    
    bool flag = false
    ...
    /* A bunch of other code */
    ...
    if (flag)
            do_something()
    
    In some cases llvm is not able to remove at compile time the code for
    do_something(), so the generated BPF program ends up with a large amount
    of dead instructions. In one specific real life example, there are two
    series of ~500 and ~1000 dead instructions in the program. When the
    verifier replaces them with a series of ja-1 instructions, it causes an
    interesting behavior at JIT time.
    
    During the first pass, since all the instructions are estimated at 64
    bytes, the ja-1 instructions end up being translated as 5 bytes JMP
    instructions (0xE9), since the jump offsets become increasingly large (>
    127) as each instruction gets discovered to be 5 bytes instead of the
    estimated 64.
    
    Starting from the second pass, the first N instructions of the ja-1
    sequence get translated into 2 bytes JMPs (0xEB) because the jump offsets
    become <= 127 this time. In particular, N is defined as roughly 127 / (5
    - 2) ~= 42. So, each further pass will make the subsequent N JMP
    instructions shrink from 5 to 2 bytes, making the image shrink every time.
    This means that in order to have the entire program converge, there need
    to be, in the real example above, at least ~1000 / 42 ~= 24 passes just
    for translating the dead code. If we add this number to the passes needed
    to translate the other non dead code, it brings such program to 40+
    passes, and JIT doesn't complete. Ultimately the userspace loader fails
    because such BPF program was supposed to be part of a prog array owner
    being JITed.
    
    While it is certainly possible to try to refactor such programs to help
    the compiler remove dead code, the behavior is not really intuitive and it
    puts further burden on the BPF developer who is not expecting such
    behavior. To make things worse, such programs are working just fine in all
    the kernel releases prior to the ja-1 fix.
    
    A possible approach to mitigate this behavior consists into noticing that
    for ja-1 instructions we don't really need to rely on the estimated size
    of the previous and current instructions, we know that a -1 BPF jump
    offset can be safely translated into a 0xEB instruction with a jump offset
    of -2.
    
    Such fix brings the BPF program in the previous example to complete again
    in ~9 passes.
    
    Fixes: 2a5418a1 ("bpf: improve dead code sanitizing")
    Signed-off-by: NGianluca Borello <g.borello@gmail.com>
    Acked-by: NAlexei Starovoitov <ast@kernel.org>
    Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
    1612a981
bpf_jit_comp.c 33.1 KB