1. 16 1月, 2020 1 次提交
    • J
      bpf: Sockmap, ensure sock lock held during tear down · 7e81a353
      John Fastabend 提交于
      The sock_map_free() and sock_hash_free() paths used to delete sockmap
      and sockhash maps walk the maps and destroy psock and bpf state associated
      with the socks in the map. When done the socks no longer have BPF programs
      attached and will function normally. This can happen while the socks in
      the map are still "live" meaning data may be sent/received during the walk.
      
      Currently, though we don't take the sock_lock when the psock and bpf state
      is removed through this path. Specifically, this means we can be writing
      into the ops structure pointers such as sendmsg, sendpage, recvmsg, etc.
      while they are also being called from the networking side. This is not
      safe, we never used proper READ_ONCE/WRITE_ONCE semantics here if we
      believed it was safe. Further its not clear to me its even a good idea
      to try and do this on "live" sockets while networking side might also
      be using the socket. Instead of trying to reason about using the socks
      from both sides lets realize that every use case I'm aware of rarely
      deletes maps, in fact kubernetes/Cilium case builds map at init and
      never tears it down except on errors. So lets do the simple fix and
      grab sock lock.
      
      This patch wraps sock deletes from maps in sock lock and adds some
      annotations so we catch any other cases easier.
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/bpf/20200111061206.8028-3-john.fastabend@gmail.com
      7e81a353
  2. 29 11月, 2019 1 次提交
    • J
      net: skmsg: fix TLS 1.3 crash with full sk_msg · 031097d9
      Jakub Kicinski 提交于
      TLS 1.3 started using the entry at the end of the SG array
      for chaining-in the single byte content type entry. This mostly
      works:
      
      [ E E E E E E . . ]
        ^           ^
         start       end
      
                       E < content type
                     /
      [ E E E E E E C . ]
        ^           ^
         start       end
      
      (Where E denotes a populated SG entry; C denotes a chaining entry.)
      
      If the array is full, however, the end will point to the start:
      
      [ E E E E E E E E ]
        ^
         start
         end
      
      And we end up overwriting the start:
      
          E < content type
         /
      [ C E E E E E E E ]
        ^
         start
         end
      
      The sg array is supposed to be a circular buffer with start and
      end markers pointing anywhere. In case where start > end
      (i.e. the circular buffer has "wrapped") there is an extra entry
      reserved at the end to chain the two halves together.
      
      [ E E E E E E . . l ]
      
      (Where l is the reserved entry for "looping" back to front.
      
      As suggested by John, let's reserve another entry for chaining
      SG entries after the main circular buffer. Note that this entry
      has to be pointed to by the end entry so its position is not fixed.
      
      Examples of full messages:
      
      [ E E E E E E E E . l ]
        ^               ^
         start           end
      
         <---------------.
      [ E E . E E E E E E l ]
            ^ ^
         end   start
      
      Now the end will always point to an unused entry, so TLS 1.3
      can always use it.
      
      Fixes: 130b392c ("net: tls: Add tls 1.3 support")
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: NSimon Horman <simon.horman@netronome.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      031097d9
  3. 22 11月, 2019 1 次提交
  4. 06 11月, 2019 1 次提交
  5. 25 8月, 2019 1 次提交
  6. 22 7月, 2019 1 次提交
    • J
      bpf: sockmap/tls, close can race with map free · 95fa1454
      John Fastabend 提交于
      When a map free is called and in parallel a socket is closed we
      have two paths that can potentially reset the socket prot ops, the
      bpf close() path and the map free path. This creates a problem
      with which prot ops should be used from the socket closed side.
      
      If the map_free side completes first then we want to call the
      original lowest level ops. However, if the tls path runs first
      we want to call the sockmap ops. Additionally there was no locking
      around prot updates in TLS code paths so the prot ops could
      be changed multiple times once from TLS path and again from sockmap
      side potentially leaving ops pointed at either TLS or sockmap
      when psock and/or tls context have already been destroyed.
      
      To fix this race first only update ops inside callback lock
      so that TLS, sockmap and lowest level all agree on prot state.
      Second and a ULP callback update() so that lower layers can
      inform the upper layer when they are being removed allowing the
      upper layer to reset prot ops.
      
      This gets us close to allowing sockmap and tls to be stacked
      in arbitrary order but will save that patch for *next trees.
      
      v4:
       - make sure we don't free things for device;
       - remove the checks which swap the callbacks back
         only if TLS is at the top.
      
      Reported-by: syzbot+06537213db7ba2745c4a@syzkaller.appspotmail.com
      Fixes: 02c558b2 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: NDirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      95fa1454
  7. 14 5月, 2019 2 次提交
    • J
      bpf: sockmap fix msg->sg.size account on ingress skb · cabede8b
      John Fastabend 提交于
      When converting a skb to msg->sg we forget to set the size after the
      latest ktls/tls code conversion. This patch can be reached by doing
      a redir into ingress path from BPF skb sock recv hook. Then trying to
      read the size fails.
      
      Fix this by setting the size.
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      cabede8b
    • J
      bpf: sockmap, only stop/flush strp if it was enabled at some point · 01489436
      John Fastabend 提交于
      If we try to call strp_done on a parser that has never been
      initialized, because the sockmap user is only using TX side for
      example we get the following error.
      
        [  883.422081] WARNING: CPU: 1 PID: 208 at kernel/workqueue.c:3030 __flush_work+0x1ca/0x1e0
        ...
        [  883.422095] Workqueue: events sk_psock_destroy_deferred
        [  883.422097] RIP: 0010:__flush_work+0x1ca/0x1e0
      
      This had been wrapped in a 'if (psock->parser.enabled)' logic which
      was broken because the strp_done() was never actually being called
      because we do a strp_stop() earlier in the tear down logic will
      set parser.enabled to false. This could result in a use after free
      if work was still in the queue and was resolved by the patch here,
      1d79895a ("sk_msg: Always cancel strp work before freeing the
      psock"). However, calling strp_stop(), done by the patch marked in
      the fixes tag, only is useful if we never initialized a strp parser
      program and never initialized the strp to start with. Because if
      we had initialized a stream parser strp_stop() would have been called
      by sk_psock_drop() earlier in the tear down process.  By forcing the
      strp to stop we get past the WARNING in strp_done that checks
      the stopped flag but calling cancel_work_sync on work that has never
      been initialized is also wrong and generates the warning above.
      
      To fix check if the parser program exists. If the program exists
      then the strp work has been initialized and must be sync'd and
      cancelled before free'ing any structures. If no program exists we
      never initialized the stream parser in the first place so skip the
      sync/cancel logic implemented by strp_done.
      
      Finally, remove the strp_done its not needed and in the case where we
      are using the stream parser has already been called.
      
      Fixes: e8e34377 ("bpf: Stop the psock parser before canceling its work")
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      01489436
  8. 07 3月, 2019 1 次提交
    • J
      bpf: Stop the psock parser before canceling its work · e8e34377
      Jakub Sitnicki 提交于
      We might have never enabled (started) the psock's parser, in which case it
      will not get stopped when destroying the psock. This leads to a warning
      when trying to cancel parser's work from psock's deferred destructor:
      
      [  405.325769] WARNING: CPU: 1 PID: 3216 at net/strparser/strparser.c:526 strp_done+0x3c/0x40
      [  405.326712] Modules linked in: [last unloaded: test_bpf]
      [  405.327359] CPU: 1 PID: 3216 Comm: kworker/1:164 Tainted: G        W         5.0.0 #42
      [  405.328294] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
      [  405.329712] Workqueue: events sk_psock_destroy_deferred
      [  405.330254] RIP: 0010:strp_done+0x3c/0x40
      [  405.330706] Code: 28 e8 b8 d5 6b ff 48 8d bb 80 00 00 00 e8 9c d5 6b ff 48 8b 7b 18 48 85 ff 74 0d e8 1e a5 e8 ff 48 c7 43 18 00 00 00 00 5b c3 <0f> 0b eb cf 66 66 66 66 90 55 89 f5 53 48 89 fb 48 83 c7 28 e8 0b
      [  405.332862] RSP: 0018:ffffc900026bbe50 EFLAGS: 00010246
      [  405.333482] RAX: ffffffff819323e0 RBX: ffff88812cb83640 RCX: ffff88812cb829e8
      [  405.334228] RDX: 0000000000000001 RSI: ffff88812cb837e8 RDI: ffff88812cb83640
      [  405.335366] RBP: ffff88813fd22680 R08: 0000000000000000 R09: 000073746e657665
      [  405.336472] R10: 8080808080808080 R11: 0000000000000001 R12: ffff88812cb83600
      [  405.337760] R13: 0000000000000000 R14: ffff88811f401780 R15: ffff88812cb837e8
      [  405.338777] FS:  0000000000000000(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000
      [  405.339903] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  405.340821] CR2: 00007fb11489a6b8 CR3: 000000012d4d6000 CR4: 00000000000406e0
      [  405.341981] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  405.343131] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  405.344415] Call Trace:
      [  405.344821]  sk_psock_destroy_deferred+0x23/0x1b0
      [  405.345585]  process_one_work+0x1ae/0x3e0
      [  405.346110]  worker_thread+0x3c/0x3b0
      [  405.346576]  ? pwq_unbound_release_workfn+0xd0/0xd0
      [  405.347187]  kthread+0x11d/0x140
      [  405.347601]  ? __kthread_parkme+0x80/0x80
      [  405.348108]  ret_from_fork+0x35/0x40
      [  405.348566] ---[ end trace a4a3af4026a327d4 ]---
      
      Stop psock's parser just before canceling its work.
      
      Fixes: 1d79895a ("sk_msg: Always cancel strp work before freeing the psock")
      Reported-by: Nkernel test robot <rong.a.chen@intel.com>
      Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      e8e34377
  9. 29 1月, 2019 1 次提交
    • J
      sk_msg: Always cancel strp work before freeing the psock · 1d79895a
      Jakub Sitnicki 提交于
      Despite having stopped the parser, we still need to deinitialize it
      by calling strp_done so that it cancels its work. Otherwise the worker
      thread can run after we have freed the parser, and attempt to access
      its workqueue resulting in a use-after-free:
      
      ==================================================================
      BUG: KASAN: use-after-free in pwq_activate_delayed_work+0x1b/0x1d0
      Read of size 8 at addr ffff888069975240 by task kworker/u2:2/93
      
      CPU: 0 PID: 93 Comm: kworker/u2:2 Not tainted 5.0.0-rc2-00335-g28f9d1a3-dirty #14
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      Workqueue:            (null) (kstrp)
      Call Trace:
       print_address_description+0x6e/0x2b0
       ? pwq_activate_delayed_work+0x1b/0x1d0
       kasan_report+0xfd/0x177
       ? pwq_activate_delayed_work+0x1b/0x1d0
       ? pwq_activate_delayed_work+0x1b/0x1d0
       pwq_activate_delayed_work+0x1b/0x1d0
       ? process_one_work+0x4aa/0x660
       pwq_dec_nr_in_flight+0x9b/0x100
       worker_thread+0x82/0x680
       ? process_one_work+0x660/0x660
       kthread+0x1b9/0x1e0
       ? __kthread_create_on_node+0x250/0x250
       ret_from_fork+0x1f/0x30
      
      Allocated by task 111:
       sk_psock_init+0x3c/0x1b0
       sock_map_link.isra.2+0x103/0x4b0
       sock_map_update_common+0x94/0x270
       sock_map_update_elem+0x145/0x160
       __se_sys_bpf+0x152e/0x1e10
       do_syscall_64+0xb2/0x3e0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Freed by task 112:
       kfree+0x7f/0x140
       process_one_work+0x40b/0x660
       worker_thread+0x82/0x680
       kthread+0x1b9/0x1e0
       ret_from_fork+0x1f/0x30
      
      The buggy address belongs to the object at ffff888069975180
       which belongs to the cache kmalloc-512 of size 512
      The buggy address is located 192 bytes inside of
       512-byte region [ffff888069975180, ffff888069975380)
      The buggy address belongs to the page:
      page:ffffea0001a65d00 count:1 mapcount:0 mapping:ffff88806d401280 index:0x0 compound_mapcount: 0
      flags: 0x4000000000010200(slab|head)
      raw: 4000000000010200 dead000000000100 dead000000000200 ffff88806d401280
      raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888069975100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff888069975180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff888069975200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                 ^
       ffff888069975280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff888069975300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
      Reported-by: NMarek Majkowski <marek@cloudflare.com>
      Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/netdev/CAJPywTLwgXNEZ2dZVoa=udiZmtrWJ0q5SuBW64aYs0Y1khXX3A@mail.gmail.comAcked-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      1d79895a
  10. 18 1月, 2019 1 次提交
    • V
      Optimize sk_msg_clone() by data merge to end dst sg entry · fda497e5
      Vakul Garg 提交于
      Function sk_msg_clone has been modified to merge the data from source sg
      entry to destination sg entry if the cloned data resides in same page
      and is contiguous to the end entry of destination sk_msg. This improves
      kernel tls throughput to the tune of 10%.
      
      When the user space tls application calls sendmsg() with MSG_MORE, it leads
      to calling sk_msg_clone() with new data being cloned placed continuous to
      previously cloned data. Without this optimization, a new SG entry in
      the destination sk_msg i.e. rec->msg_plaintext in tls_clone_plaintext_msg()
      gets used. This leads to exhaustion of sg entries in rec->msg_plaintext
      even before a full 16K of allowable record data is accumulated. Hence we
      lose oppurtunity to encrypt and send a full 16K record.
      
      With this patch, the kernel tls can accumulate full 16K of record data
      irrespective of the size of data passed in sendmsg() with MSG_MORE.
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fda497e5
  11. 22 12月, 2018 1 次提交
    • V
      Prevent overflow of sk_msg in sk_msg_clone() · 5c1e7e94
      Vakul Garg 提交于
      Fixed function sk_msg_clone() to prevent overflow of 'dst' while adding
      pages in scatterlist entries. The overflow of 'dst' causes crash in kernel
      tls module while doing record encryption.
      
      Crash fixed by this patch.
      
      [   78.796119] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
      [   78.804900] Mem abort info:
      [   78.807683]   ESR = 0x96000004
      [   78.810744]   Exception class = DABT (current EL), IL = 32 bits
      [   78.816677]   SET = 0, FnV = 0
      [   78.819727]   EA = 0, S1PTW = 0
      [   78.822873] Data abort info:
      [   78.825759]   ISV = 0, ISS = 0x00000004
      [   78.829600]   CM = 0, WnR = 0
      [   78.832576] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000bf8ee311
      [   78.839195] [0000000000000008] pgd=0000000000000000
      [   78.844081] Internal error: Oops: 96000004 [#1] PREEMPT SMP
      [   78.849642] Modules linked in: tls xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_filter ip6_tables xt_CHECKSUM cpve cpufreq_conservative lm90 ina2xx crct10dif_ce
      [   78.865377] CPU: 0 PID: 6007 Comm: openssl Not tainted 4.20.0-rc6-01647-g754d5da6-dirty #107
      [   78.874149] Hardware name: LS1043A RDB Board (DT)
      [   78.878844] pstate: 60000005 (nZCv daif -PAN -UAO)
      [   78.883632] pc : scatterwalk_copychunks+0x164/0x1c8
      [   78.888500] lr : scatterwalk_copychunks+0x160/0x1c8
      [   78.893366] sp : ffff00001d04b600
      [   78.896668] x29: ffff00001d04b600 x28: ffff80006814c680
      [   78.901970] x27: 0000000000000000 x26: ffff80006c8de786
      [   78.907272] x25: ffff00001d04b760 x24: 000000000000001a
      [   78.912573] x23: 0000000000000006 x22: ffff80006814e440
      [   78.917874] x21: 0000000000000100 x20: 0000000000000000
      [   78.923175] x19: 000081ffffffffff x18: 0000000000000400
      [   78.928476] x17: 0000000000000008 x16: 0000000000000000
      [   78.933778] x15: 0000000000000100 x14: 0000000000000001
      [   78.939079] x13: 0000000000001080 x12: 0000000000000020
      [   78.944381] x11: 0000000000001080 x10: 00000000ffff0002
      [   78.949683] x9 : ffff80006814c248 x8 : 00000000ffff0000
      [   78.954985] x7 : ffff80006814c318 x6 : ffff80006c8de786
      [   78.960286] x5 : 0000000000000f80 x4 : ffff80006c8de000
      [   78.965588] x3 : 0000000000000000 x2 : 0000000000001086
      [   78.970889] x1 : ffff7e0001b74e02 x0 : 0000000000000000
      [   78.976192] Process openssl (pid: 6007, stack limit = 0x00000000291367f9)
      [   78.982968] Call trace:
      [   78.985406]  scatterwalk_copychunks+0x164/0x1c8
      [   78.989927]  skcipher_walk_next+0x28c/0x448
      [   78.994099]  skcipher_walk_done+0xfc/0x258
      [   78.998187]  gcm_encrypt+0x434/0x4c0
      [   79.001758]  tls_push_record+0x354/0xa58 [tls]
      [   79.006194]  bpf_exec_tx_verdict+0x1e4/0x3e8 [tls]
      [   79.010978]  tls_sw_sendmsg+0x650/0x780 [tls]
      [   79.015326]  inet_sendmsg+0x2c/0xf8
      [   79.018806]  sock_sendmsg+0x18/0x30
      [   79.022284]  __sys_sendto+0x104/0x138
      [   79.025935]  __arm64_sys_sendto+0x24/0x30
      [   79.029936]  el0_svc_common+0x60/0xe8
      [   79.033588]  el0_svc_handler+0x2c/0x80
      [   79.037327]  el0_svc+0x8/0xc
      [   79.040200] Code: 6b01005f 54fff788 940169b1 f9000320 (b9400801)
      [   79.046283] ---[ end trace 74db007d069c1cf7 ]---
      
      Fixes: d829e9c4 ("tls: convert to generic sk_msg interface")
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5c1e7e94
  12. 21 12月, 2018 3 次提交
  13. 02 12月, 2018 1 次提交
    • P
      net/core/skmsg: Replace call_rcu_sched() with call_rcu() · 0245b80e
      Paul E. McKenney 提交于
      Now that call_rcu()'s callback is not invoked until after all
      preempt-disable regions of code have completed (in addition to explicitly
      marked RCU read-side critical sections), call_rcu() can be used in place
      of call_rcu_sched().  This commit therefore makes that change.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: <netdev@vger.kernel.org>
      0245b80e
  14. 16 10月, 2018 2 次提交
    • D
      tls: convert to generic sk_msg interface · d829e9c4
      Daniel Borkmann 提交于
      Convert kTLS over to make use of sk_msg interface for plaintext and
      encrypted scattergather data, so it reuses all the sk_msg helpers
      and data structure which later on in a second step enables to glue
      this to BPF.
      
      This also allows to remove quite a bit of open coded helpers which
      are covered by the sk_msg API. Recent changes in kTLs 80ece6a0
      ("tls: Remove redundant vars from tls record structure") and
      4e6d4720 ("tls: Add support for inplace records encryption")
      changed the data path handling a bit; while we've kept the latter
      optimization intact, we had to undo the former change to better
      fit the sk_msg model, hence the sg_aead_in and sg_aead_out have
      been brought back and are linked into the sk_msg sgs. Now the kTLS
      record contains a msg_plaintext and msg_encrypted sk_msg each.
      
      In the original code, the zerocopy_from_iter() has been used out
      of TX but also RX path. For the strparser skb-based RX path,
      we've left the zerocopy_from_iter() in decrypt_internal() mostly
      untouched, meaning it has been moved into tls_setup_from_iter()
      with charging logic removed (as not used from RX). Given RX path
      is not based on sk_msg objects, we haven't pursued setting up a
      dummy sk_msg to call into sk_msg_zerocopy_from_iter(), but it
      could be an option to prusue in a later step.
      
      Joint work with John.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      d829e9c4
    • D
      bpf, sockmap: convert to generic sk_msg interface · 604326b4
      Daniel Borkmann 提交于
      Add a generic sk_msg layer, and convert current sockmap and later
      kTLS over to make use of it. While sk_buff handles network packet
      representation from netdevice up to socket, sk_msg handles data
      representation from application to socket layer.
      
      This means that sk_msg framework spans across ULP users in the
      kernel, and enables features such as introspection or filtering
      of data with the help of BPF programs that operate on this data
      structure.
      
      Latter becomes in particular useful for kTLS where data encryption
      is deferred into the kernel, and as such enabling the kernel to
      perform L7 introspection and policy based on BPF for TLS connections
      where the record is being encrypted after BPF has run and came to
      a verdict. In order to get there, first step is to transform open
      coding of scatter-gather list handling into a common core framework
      that subsystems can use.
      
      The code itself has been split and refactored into three bigger
      pieces: i) the generic sk_msg API which deals with managing the
      scatter gather ring, providing helpers for walking and mangling,
      transferring application data from user space into it, and preparing
      it for BPF pre/post-processing, ii) the plain sock map itself
      where sockets can be attached to or detached from; these bits
      are independent of i) which can now be used also without sock
      map, and iii) the integration with plain TCP as one protocol
      to be used for processing L7 application data (later this could
      e.g. also be extended to other protocols like UDP). The semantics
      are the same with the old sock map code and therefore no change
      of user facing behavior or APIs. While pursuing this work it
      also helped finding a number of bugs in the old sockmap code
      that we've fixed already in earlier commits. The test_sockmap
      kselftest suite passes through fine as well.
      
      Joint work with John.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      604326b4
新手
引导
客服 返回
顶部