• J
    bpf: sockmap with sk redirect support · 174a79ff
    John Fastabend 提交于
    Recently we added a new map type called dev map used to forward XDP
    packets between ports (6093ec2d). This patches introduces a
    similar notion for sockets.
    
    A sockmap allows users to add participating sockets to a map. When
    sockets are added to the map enough context is stored with the
    map entry to use the entry with a new helper
    
      bpf_sk_redirect_map(map, key, flags)
    
    This helper (analogous to bpf_redirect_map in XDP) is given the map
    and an entry in the map. When called from a sockmap program, discussed
    below, the skb will be sent on the socket using skb_send_sock().
    
    With the above we need a bpf program to call the helper from that will
    then implement the send logic. The initial site implemented in this
    series is the recv_sock hook. For this to work we implemented a map
    attach command to add attributes to a map. In sockmap we add two
    programs a parse program and a verdict program. The parse program
    uses strparser to build messages and pass them to the verdict program.
    The parse programs use the normal strparser semantics. The verdict
    program is of type SK_SKB.
    
    The verdict program returns a verdict SK_DROP, or  SK_REDIRECT for
    now. Additional actions may be added later. When SK_REDIRECT is
    returned, expected when bpf program uses bpf_sk_redirect_map(), the
    sockmap logic will consult per cpu variables set by the helper routine
    and pull the sock entry out of the sock map. This pattern follows the
    existing redirect logic in cls and xdp programs.
    
    This gives the flow,
    
     recv_sock -> str_parser (parse_prog) -> verdict_prog -> skb_send_sock
                                                         \
                                                          -> kfree_skb
    
    As an example use case a message based load balancer may use specific
    logic in the verdict program to select the sock to send on.
    
    Sample programs are provided in future patches that hopefully illustrate
    the user interfaces. Also selftests are in follow-on patches.
    Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
    Signed-off-by: NDavid S. Miller <davem@davemloft.net>
    174a79ff
syscall.c 33.5 KB