  1. 27 Oct, 2020 1 commit
  2. 24 Oct, 2020 1 commit
  3. 23 Oct, 2020 2 commits
  4. 22 Oct, 2020 3 commits
  5. 21 Oct, 2020 12 commits
  6. 20 Oct, 2020 7 commits
  7. 19 Oct, 2020 2 commits
    • net: core: use list_del_init() instead of list_del() in netdev_run_todo() · 0e8b8d6a
      Taehee Yoo committed
      dev->unlink_list is reused unless dev is deleted.
      So, list_del() should not be used.
      Due to using list_del(), dev->unlink_list can't be reused so that
      dev->nested_level update logic doesn't work.
      In order to fix this bug, list_del_init() should be used instead
      of list_del().
      
      Test commands:
          ip link add bond0 type bond
          ip link add bond1 type bond
          ip link set bond0 master bond1
          ip link set bond0 nomaster
          ip link set bond1 master bond0
          ip link set bond1 nomaster
      
      Splat looks like:
      [  255.750458][ T1030] ============================================
      [  255.751967][ T1030] WARNING: possible recursive locking detected
      [  255.753435][ T1030] 5.9.0-rc8+ #772 Not tainted
      [  255.754553][ T1030] --------------------------------------------
      [  255.756047][ T1030] ip/1030 is trying to acquire lock:
      [  255.757304][ T1030] ffff88811782a280 (&dev_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync_multiple+0xc2/0x150
      [  255.760056][ T1030]
      [  255.760056][ T1030] but task is already holding lock:
      [  255.761862][ T1030] ffff88811130a280 (&dev_addr_list_lock_key/1){+...}-{2:2}, at: bond_enslave+0x3d4d/0x43e0 [bonding]
      [  255.764581][ T1030]
      [  255.764581][ T1030] other info that might help us debug this:
      [  255.766645][ T1030]  Possible unsafe locking scenario:
      [  255.766645][ T1030]
      [  255.768566][ T1030]        CPU0
      [  255.769415][ T1030]        ----
      [  255.770259][ T1030]   lock(&dev_addr_list_lock_key/1);
      [  255.771629][ T1030]   lock(&dev_addr_list_lock_key/1);
      [  255.772994][ T1030]
      [  255.772994][ T1030]  *** DEADLOCK ***
      [  255.772994][ T1030]
      [  255.775091][ T1030]  May be due to missing lock nesting notation
      [  255.775091][ T1030]
      [  255.777182][ T1030] 2 locks held by ip/1030:
      [  255.778299][ T1030]  #0: ffffffffb1f63250 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x2e4/0x8b0
      [  255.780600][ T1030]  #1: ffff88811130a280 (&dev_addr_list_lock_key/1){+...}-{2:2}, at: bond_enslave+0x3d4d/0x43e0 [bonding]
      [  255.783411][ T1030]
      [  255.783411][ T1030] stack backtrace:
      [  255.784874][ T1030] CPU: 7 PID: 1030 Comm: ip Not tainted 5.9.0-rc8+ #772
      [  255.786595][ T1030] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [  255.789030][ T1030] Call Trace:
      [  255.789850][ T1030]  dump_stack+0x99/0xd0
      [  255.790882][ T1030]  __lock_acquire.cold.71+0x166/0x3cc
      [  255.792285][ T1030]  ? register_lock_class+0x1a30/0x1a30
      [  255.793619][ T1030]  ? rcu_read_lock_sched_held+0x91/0xc0
      [  255.794963][ T1030]  ? rcu_read_lock_bh_held+0xa0/0xa0
      [  255.796246][ T1030]  lock_acquire+0x1b8/0x850
      [  255.797332][ T1030]  ? dev_mc_sync_multiple+0xc2/0x150
      [  255.798624][ T1030]  ? bond_enslave+0x3d4d/0x43e0 [bonding]
      [  255.800039][ T1030]  ? check_flags+0x50/0x50
      [  255.801143][ T1030]  ? lock_contended+0xd80/0xd80
      [  255.802341][ T1030]  _raw_spin_lock_nested+0x2e/0x70
      [  255.803592][ T1030]  ? dev_mc_sync_multiple+0xc2/0x150
      [  255.804897][ T1030]  dev_mc_sync_multiple+0xc2/0x150
      [  255.806168][ T1030]  bond_enslave+0x3d58/0x43e0 [bonding]
      [  255.807542][ T1030]  ? __lock_acquire+0xe53/0x51b0
      [  255.808824][ T1030]  ? bond_update_slave_arr+0xdc0/0xdc0 [bonding]
      [  255.810451][ T1030]  ? check_chain_key+0x236/0x5e0
      [  255.811742][ T1030]  ? mutex_is_locked+0x13/0x50
      [  255.812910][ T1030]  ? rtnl_is_locked+0x11/0x20
      [  255.814061][ T1030]  ? netdev_master_upper_dev_get+0xf/0x120
      [  255.815553][ T1030]  do_setlink+0x94c/0x3040
      [ ... ]
      
      Reported-by: syzbot+4a0f7bc34e3997a6c7df@syzkaller.appspotmail.com
      Fixes: 1fc70edb ("net: core: add nested_level variable in net_device")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20201015162606.9377-1-ap420073@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: openvswitch: fix to make sure flow_lookup() is not preempted · f981fc3d
      Eelco Chaudron committed
      The flow_lookup() function uses per-CPU variables, so it must be called
      with BH disabled. This is fine in the general NAPI use case, where the
      local BH is disabled, but the function is also called from the netlink
      context. This patch makes sure that BH is disabled in the netlink path
      as well.
      
      In addition, u64_stats_update_begin() requires a lock to ensure one writer
      which is not ensured here. Making it per-CPU and disabling NAPI (softirq)
      ensures that there is always only one writer.
      
      Fixes: eac87c41 ("net: openvswitch: reorder masks array based on usage")
      Reported-by: Juri Lelli <jlelli@redhat.com>
      Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
      Link: https://lore.kernel.org/r/160295903253.7789.826736662555102345.stgit@ebuild
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  8. 17 Oct, 2020 5 commits
  9. 16 Oct, 2020 7 commits
    • Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH" · 2ecbc1f6
      Jakub Kicinski committed
      This reverts commit 1d273fcc.
      
      Alexei points out there's nothing implying headers will be built
      and therefore exist under usr/include, so this fix doesn't make
      much sense.

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net, sockmap: Don't call bpf_prog_put() on NULL pointer · 83c11c17
      Alex Dewar committed
      If bpf_prog_inc_not_zero() fails for skb_parser, then bpf_prog_put() is
      called unconditionally on skb_verdict, even though it may be NULL. Fix
      and tidy up error path.
      
      Fixes: 743df8b7 ("bpf, sockmap: Check skb_verdict and skb_parser programs explicitly")
      Addresses-Coverity-ID: 1497799: Null pointer dereferences (FORWARD_NULL)
      Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20201012170952.60750-1-alex.dewar90@gmail.com
    • bpf, sockmap: Add locking annotations to iterator · f58423ae
      Lorenz Bauer committed
      The sparse checker currently outputs the following warnings:
      
          include/linux/rcupdate.h:632:9: sparse: sparse: context imbalance in 'sock_hash_seq_start' - wrong count at exit
          include/linux/rcupdate.h:632:9: sparse: sparse: context imbalance in 'sock_map_seq_start' - wrong count at exit
      
      Add the necessary __acquires and __release annotations to make the
      iterator locking schema palatable to sparse. Also add __must_hold
      for good measure.
      
      The kernel codebase uses both __acquires(rcu) and __acquires(RCU).
      I couldn't find any guidance which one is preferred, so I used
      what is easier to type out.
      
      Fixes: 03653515 ("net: Allow iterating sockmap and sockhash")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/20201012091850.67452-1-lmb@cloudflare.com
    • netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements · 346e320c
      Davide Caratti committed
      nftables payload statements are used to mangle SCTP headers, but they can
      only replace the Internet Checksum. As a consequence, nftables rules that
      mangle sport/dport/vtag in SCTP headers potentially generate packets that
      are discarded by the receiver, unless the CRC-32C is "offloaded" (e.g. the
      rule mangles an skb having 'ip_summed' equal to 'CHECKSUM_PARTIAL').
      
      Fix this by extending the uAPI definitions and the L4 checksum update
      function, so that userspace programs (e.g. nft) can instruct the kernel
      to compute CRC-32C in SCTP headers. Also ensure that LIBCRC32C is built
      if NF_TABLES is 'y' or 'm' in the kernel build configuration.

      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: fix pos incrementment in ipv6_route_seq_next · 6617dfd4
      Yonghong Song committed
      Commit 4fc427e0 ("ipv6_route_seq_next should increase position index")
      tried to fix the issue where seq_file pos is not increased
      if a NULL element is returned with seq_ops->next(). See bug
        https://bugzilla.kernel.org/show_bug.cgi?id=206283
      The commit effectively does:
        - increase pos for all seq_ops->start()
        - increase pos for all seq_ops->next()
      
      For ipv6_route, increasing pos for all seq_ops->next() is correct.
      But increasing pos for seq_ops->start() is not correct
      since pos is used to determine how many items to skip during
      seq_ops->start():
        iter->skip = *pos;
      seq_ops->start() just fetches the *current* pos item.
      The item can be skipped only after seq_ops->show() which essentially
      is the beginning of seq_ops->next().
      
      For example, I have 7 ipv6 route entries,
        root@arch-fb-vm1:~/net-next dd if=/proc/net/ipv6_route bs=4096
        00000000000000000000000000000000 40 00000000000000000000000000000000 00 00000000000000000000000000000000 00000400 00000001 00000000 00000001     eth0
        fe800000000000000000000000000000 40 00000000000000000000000000000000 00 00000000000000000000000000000000 00000100 00000001 00000000 00000001     eth0
        00000000000000000000000000000000 00 00000000000000000000000000000000 00 00000000000000000000000000000000 ffffffff 00000001 00000000 00200200       lo
        00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001       lo
        fe800000000000002050e3fffebd3be8 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80200001     eth0
        ff000000000000000000000000000000 08 00000000000000000000000000000000 00 00000000000000000000000000000000 00000100 00000004 00000000 00000001     eth0
        00000000000000000000000000000000 00 00000000000000000000000000000000 00 00000000000000000000000000000000 ffffffff 00000001 00000000 00200200       lo
        0+1 records in
        0+1 records out
        1050 bytes (1.0 kB, 1.0 KiB) copied, 0.00707908 s, 148 kB/s
        root@arch-fb-vm1:~/net-next
      
      In the above, I specify buffer size 4096, so all records can be returned
      to user space with a single trip to the kernel.
      
      If I use buffer size 128, since each record size is 149, internally
      kernel seq_read() will read 149 into its internal buffer and return the data
      to user space in two read() syscalls. Then user read() syscall will trigger
      next seq_ops->start(). Since the current implementation increased pos even
      for seq_ops->start(), it will skip record #2, #4 and #6, assuming the first
      record is #1.
      
        root@arch-fb-vm1:~/net-next dd if=/proc/net/ipv6_route bs=128
        00000000000000000000000000000000 40 00000000000000000000000000000000 00 00000000000000000000000000000000 00000400 00000001 00000000 00000001     eth0
        00000000000000000000000000000000 00 00000000000000000000000000000000 00 00000000000000000000000000000000 ffffffff 00000001 00000000 00200200       lo
        fe800000000000002050e3fffebd3be8 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80200001     eth0
        00000000000000000000000000000000 00 00000000000000000000000000000000 00 00000000000000000000000000000000 ffffffff 00000001 00000000 00200200       lo
      4+1 records in
      4+1 records out
      600 bytes copied, 0.00127758 s, 470 kB/s
      
      To fix the problem, create a fake pos pointer so seq_ops->start()
      won't actually increase seq_file pos. With this fix, the
      above `dd` command with `bs=128` will show correct result.
      
      Fixes: 4fc427e0 ("ipv6_route_seq_next should increase position index")
      Cc: Alexei Starovoitov <ast@kernel.org>
      Suggested-by: Vasily Averin <vvs@virtuozzo.com>
      Reviewed-by: Vasily Averin <vvs@virtuozzo.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/smc: fix invalid return code in smcd_new_buf_create() · 6b1bbf94
      Karsten Graul committed
      smc_ism_register_dmb() returns error codes set by the ISM driver which
      are not guaranteed to be negative or in the errno range. Such values
      would not be handled by ERR_PTR() and finally the return code will be
      used as a memory address.
      Fix that by using a valid negative errno value with ERR_PTR().
      
      Fixes: 72b7f6c4 ("net/smc: unique reason code for exceeded max dmb count")
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/smc: fix valid DMBE buffer sizes · ef12ad45
      Karsten Graul committed
      The SMCD_DMBE_SIZES should include all valid DMBE buffer sizes, so the
      correct value is 6 which means 1MB. With 7 the registration of an ISM
      buffer would always fail because of the invalid size requested.
      Fix that and set the value to 6.
      
      Fixes: c6ba7c9b ("net/smc: add base infrastructure for SMC-D and ISM")
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>