1. 22 Mar, 2018 19 commits
  2. 21 Mar, 2018 10 commits
  3. 20 Mar, 2018 11 commits
    • M
      mlx5: Remove call to ida_pre_get · c846d8da
      Matthew Wilcox committed
      The mlx5 driver calls ida_pre_get() in a loop for no readily apparent
      reason.  The driver uses ida_simple_get() which will call ida_pre_get()
      by itself and there's no need to use ida_pre_get() unless using
      ida_get_new().
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c846d8da
    • D
      Merge branch 'bpf-sockmap-ulp' · d48ce3e5
      Daniel Borkmann committed
      John Fastabend says:
      
      ====================
      This series adds a BPF hook for sendmsg and sendfile by using
      the ULP infrastructure and sockmap. A simple pseudocode example
      would be:
      
        // load the programs
        bpf_prog_load(SOCKMAP_TCP_MSG_PROG, BPF_PROG_TYPE_SK_MSG,
                      &obj, &msg_prog);
      
        // lookup the sockmap
        bpf_map_msg = bpf_object__find_map_by_name(obj, "my_sock_map");
      
        // get fd for sockmap
        map_fd_msg = bpf_map__fd(bpf_map_msg);
      
        // attach program to sockmap
        bpf_prog_attach(msg_prog, map_fd_msg, BPF_SK_MSG_VERDICT, 0);
      
        // Add a socket 'fd' to sockmap at location 'i'
        bpf_map_update_elem(map_fd_msg, &i, &fd, BPF_ANY);
      
      After the above snippet any socket attached to the map would run
      msg_prog on sendmsg and sendfile system calls.
      
      Three additional helpers are added: bpf_msg_apply_bytes(),
      bpf_msg_cork_bytes(), and bpf_msg_pull_data(). With
      bpf_msg_apply_bytes() BPF programs can tell the infrastructure how
      many bytes the given verdict should apply to. This has two cases.
      In the first, a BPF program applies a verdict to fewer bytes than
      are in the current sendmsg/sendfile msg. The infrastructure applies
      the verdict to the first N bytes of the message, then runs the BPF
      program again with data pointers recalculated to start at byte N+1.
      In the second, the BPF program applies a verdict to more bytes than
      the current sendmsg or sendfile system call carries. In this case
      the infrastructure caches the verdict and applies it to future
      sendmsg/sendfile calls until the byte limit is reached. This avoids
      the overhead of running BPF programs on large payloads.
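
      The two apply_bytes cases can be modelled in plain user-space C.
      This is a sketch under assumptions, not the kernel implementation:
      struct apply_state, model_send() and prog_apply_100() are
      hypothetical names, and the stand-in program always asks for its
      verdict to cover 100 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of the apply_bytes bookkeeping.
 * None of these names come from the kernel sources. */
struct apply_state {
    size_t remaining;   /* bytes still covered by the cached verdict */
    int verdict;        /* cached verdict (any integer code) */
    unsigned prog_runs; /* how often the stand-in program was run */
};

/* Stand-in for a BPF program: returns a verdict and stores the byte
 * count it wants that verdict to cover (0 = no apply_bytes request). */
typedef int (*prog_fn)(size_t *apply_bytes);

/* Model one sendmsg/sendfile call carrying msg_len bytes. */
static void model_send(struct apply_state *st, size_t msg_len, prog_fn prog)
{
    assert(st != NULL && prog != NULL);

    while (msg_len > 0) {
        if (st->remaining == 0) {
            /* No cached verdict left: run the program again. */
            size_t apply = 0;

            st->verdict = prog(&apply);
            st->prog_runs++;
            /* Without a request the verdict covers the rest of this msg. */
            st->remaining = apply ? apply : msg_len;
        }

        /* Consume as much of the msg as the verdict covers. */
        size_t chunk = st->remaining < msg_len ? st->remaining : msg_len;

        st->remaining -= chunk;
        msg_len -= chunk;
    }
}

/* Example stand-in program: verdict 1, applied to 100 bytes. */
static int prog_apply_100(size_t *apply_bytes)
{
    *apply_bytes = 100;
    return 1;
}
```

      In this model a 250-byte send runs the program three times (case
      one) and leaves 50 bytes of verdict cached, so a following 30-byte
      send is covered without running the program at all (case two).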
      
      The helper bpf_msg_cork_bytes() handles a different case, where
      a BPF program cannot reach a verdict on a msg until it receives
      more bytes AND the program doesn't want to forward the packet
      until it is known to be "good". The example case is a user
      (albeit probably a dumb one) sending an N-byte header in 1B system
      calls. The BPF program can call bpf_msg_cork_bytes() with the
      byte count it needs to reach a verdict, and the program will
      only be called again once N bytes are received.
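
      The corking behaviour can be modelled the same way. Again this is
      only a user-space sketch with hypothetical names (struct
      cork_state, model_cork_send(), prog_need_header()); it assumes the
      stand-in program needs hdr_len bytes before it can decide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of corking, not kernel code. */
struct cork_state {
    size_t corked;      /* bytes queued but not yet forwarded */
    size_t cork_bytes;  /* bytes the program asked for; 0 = not corking */
    unsigned prog_runs; /* how often the stand-in program was run */
};

/* Stand-in program: returns the total byte count it needs queued
 * before it can decide, or 0 once it has enough to reach a verdict. */
static size_t prog_need_header(size_t avail, size_t hdr_len)
{
    return avail < hdr_len ? hdr_len : 0;
}

/* Model one send of len bytes. Returns the bytes forwarded by it. */
static size_t model_cork_send(struct cork_state *st, size_t len,
                              size_t hdr_len)
{
    assert(st != NULL);

    st->corked += len;

    /* Still corked: queue the data and do not run the program. */
    if (st->cork_bytes && st->corked < st->cork_bytes)
        return 0;

    st->prog_runs++;
    st->cork_bytes = prog_need_header(st->corked, hdr_len);
    if (st->cork_bytes)
        return 0; /* program asked to cork until cork_bytes are queued */

    /* Verdict reached: release everything that was held back. */
    size_t out = st->corked;

    st->corked = 0;
    return out;
}
```

      With a 4-byte header sent one byte per call, the program in this
      model runs on the first byte, asks to cork 4 bytes, and runs again
      only on the fourth call, which then forwards all 4 bytes.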
      
      The last helper added in this series is bpf_msg_pull_data(). It
      is used to pull data in for modification or reading. Similar to
      how skb_pull_data() works, msg_pull_data can be used to access data
      not in the initial (data_start, data_end) range. For sendpage()
      calls this is needed if any data is accessed, because the BPF
      sendpage hook initializes the data_start and data_end pointers to
      zero. We do this because sendpage data is shared with the user
      and can be modified during or after the BPF verdict, possibly
      invalidating any verdict the BPF program decides. For sendmsg
      the data is already copied by the sendmsg bpf infrastructure, so
      we only copy the data if the user requests a data range that is
      not already linearized. This happens if the user requests larger
      blocks of data that are not in a single scatterlist element. The
      common case seems to be accessing headers, which normally are
      in the first scatterlist element and already linearized.
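
      Why a range that spans elements needs a copy can be shown with a
      small user-space model. It is a sketch only: struct seg is a
      hypothetical stand-in for a scatterlist element and linearize() is
      an illustrative name, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatterlist element. */
struct seg {
    const char *data;
    size_t len;
};

/* Copy bytes [start, end) of the logical message, which is spread
 * over nsegs elements, into the contiguous buffer out. Returns the
 * number of bytes copied, or -1 if the range is empty, does not fit
 * in out, or reaches past the end of the message. */
static long linearize(const struct seg *segs, size_t nsegs,
                      size_t start, size_t end,
                      char *out, size_t out_len)
{
    assert(segs != NULL && out != NULL);

    if (start >= end || end - start > out_len)
        return -1;

    size_t off = 0;    /* logical offset of the current element */
    size_t copied = 0;

    for (size_t i = 0; i < nsegs && off < end; i++) {
        size_t seg_start = off;
        size_t seg_end = off + segs[i].len;

        off = seg_end;
        if (seg_end <= start)
            continue; /* element lies entirely before the range */

        /* Part of this element that overlaps [start, end). */
        size_t from = start > seg_start ? start - seg_start : 0;
        size_t to = (end < seg_end ? end : seg_end) - seg_start;

        memcpy(out + copied, segs[i].data + from, to - from);
        copied += to - from;
    }

    return copied == end - start ? (long)copied : -1;
}
```

      A range that stays inside the first element could be read in
      place; only a range crossing an element boundary, as in bytes 2..6
      of a message split into 4-byte elements, forces the copy.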
      
      For more examples please review the sample program. There are
      examples for all the actions and helpers there.
      
      Patches 1-8 implement the above sockmap/BPF infrastructure. The
      remaining patches flesh out some minimal selftests and the sample
      sockmap program. The sockmap sample program is the main vehicle
      for testing this infrastructure and will be moved into selftests
      shortly. The final patch in this series is a simple shell script
      to run a set of tests. These are the tests I run after any changes
      to sockmap. The next task on the list after this series is to
      push those into selftests so we can avoid manually testing.
      
      A couple of notes on future items in the pipeline:
      
        0. move sample sockmap programs into selftests (noted above)
        1. add additional support for tcp flags, most are ignored now.
        2. add a Documentation/bpf/sockmap file with these details
        3. support stacked ULP types to allow this and ktls to cooperate
        4. Ingress flag support, redirect only supports egress here. The
           other redirect helpers support ingress and egress flags.
        5. add optimizations, I cut a few optimizations here in the
           first iteration of the code for later study/implementation
      
      -v3 updates
        : u32 data pointers in msg_md changed to void *
        : page_address NULL check and flag verification in msg_pull_data
        : remove old note in commit msg that is no longer relevant
        : remove enum sk_msg_action, it's not used anywhere
        : fixup test_verifier W -> DW insn to account for data pointers
        : unintentionally dropped a smap_stop_tx() call in sockmap.c
      
      I propagated the ACKs forward because above changes were small
      one/two line changes.
      
      -v2 updates (discussion):
      
      Dave noticed that the sendpage call was previously (in v1) running
      on the data directly. This allowed users to potentially modify
      the data after or during the BPF program. However, doing a copy
      automatically even if the data is not accessed has a measurable
      performance impact. So we added another helper modeled after
      the existing skb_pull_data() helper to allow users to selectively
      pull data from the msg. This is also useful in the sendmsg case
      when users need to access data outside the first scatterlist
      element or across scatterlist boundaries.
      
      While doing this I also unified the sendmsg and sendfile handlers
      a bit. Originally the sendfile call was optimized for never
      touching the data. I've decided for a first submission to drop
      this optimization and we can add it back later. It introduced
      unnecessary complexity, at least for a first posting, for a
      use case I have not entirely fleshed out yet. When the use
      case is deployed we can add it back if needed. Then we can
      review concrete performance deltas as well on real-world
      use-cases/applications.
      
      Lastly, I reorganized the patches a bit. Now all sockmap
      changes are in a single patch and each helper gets its own
      patch. This, at least IMO, makes it easier to review because
      sockmap changes are not spread across the patch series. On
      the other hand now apply_bytes, cork_bytes logic is only
      activated later in the series. But that should be OK.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      d48ce3e5
    • J
      bpf: sockmap test script · ae30727f
      John Fastabend committed
      This adds the test script I am currently using to validate
      the latest sockmap changes. Shortly sockmap will be ported
      to selftests and these will be run from the infrastructure
      there. Until then add the script here so we have a coverage
      checklist when porting into selftests.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      ae30727f
    • J
      bpf: sockmap sample test for bpf_msg_pull_data · 0dcbbf67
      John Fastabend committed
      This adds an option to test the msg_pull_data helper. This
      uses two options txmsg_start and txmsg_end to let the user
      specify start and end bytes to pull.
      
      The options can be used with txmsg_apply, txmsg_cork options
      as well as with any of the basic tests, txmsg, txmsg_redir and
      txmsg_drop (plus noisy variants) to run pull_data inline with
      those tests. By giving user direct control over the variables
      we can easily do negative testing as well as positive tests.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      0dcbbf67
    • J
      bpf: sockmap add SK_DROP tests · e6373ce7
      John Fastabend committed
      Add tests for SK_DROP.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      e6373ce7
    • J
      bpf: sockmap sample support for bpf_msg_cork_bytes() · 468b3fde
      John Fastabend committed
      Add sample application support for the bpf_msg_cork_bytes helper. This
      lets the user specify how many bytes each verdict should apply to.
      
      Similar to apply_bytes() tests these can be run as a stand-alone test
      when used without other options or inline with other tests by using
      the txmsg_cork option along with any of the basic tests txmsg,
      txmsg_redir, txmsg_drop.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      468b3fde
    • J
      bpf: sockmap, add sample option to test apply_bytes helper · 1c16c312
      John Fastabend committed
      This adds an option to test the apply_bytes helper. This option lets
      the user specify an int on the command line specifying how much data
      each verdict should apply to.
      
      When this is set a map entry is set with the bytes input by the user
      and then the specified program --txmsg or --txmsg_redir will use the
      value and set the applied data. If no other option is set then a
      default --txmsg_apply program is run. This program will drop pkts
      if an error is detected on the bytes map lookup. This is useful to
      verify that the map lookup and apply helper are working, and it
      causes a hard error if they are not.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      1c16c312
    • J
      bpf: sockmap sample, add data verification option · 6bce9d2c
      John Fastabend committed
      To verify data is not being dropped or corrupted this adds an option
      to verify test-patterns on recv.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      6bce9d2c
    • J
      bpf: sockmap sample, add sendfile test · e67463cb
      John Fastabend committed
      To exercise TX ULP sendpage implementation we need a test that does
      a sendfile. Add sendfile test option here.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      e67463cb
    • J
      bpf: sockmap sample, add option to attach SK_MSG program · 4c4c3c27
      John Fastabend committed
      Add sockmap option to use SK_MSG program types.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      4c4c3c27
    • J
      bpf: add verifier tests for BPF_PROG_TYPE_SK_MSG · 1acc60b6
      John Fastabend committed
      Test reads and writes for BPF_PROG_TYPE_SK_MSG.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      1acc60b6