1. 27 Aug, 2020 5 commits
    • net: Fix some comments · 645f0897
      Miaohe Lin committed
      Fix some comments, including wrong function name, duplicated word and so
      on.
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: disable netpoll on fresh napis · 96e97bc0
      Jakub Kicinski committed
      napi_disable() makes sure to set the NAPI_STATE_NPSVC bit to prevent
      netpoll from accessing rings before init is complete. However, the
      same is not done for fresh napi instances in netif_napi_add(),
      even though we expect NAPI instances to be added as disabled.
      
      This causes crashes during driver reconfiguration (enabling XDP,
      changing the channel count) - if there is any printk() after
      netif_napi_add() but before napi_enable().
      
      To ensure memory ordering is correct we need to use RCU accessors.
      Reported-by: Rob Sherwood <rsher@fb.com>
      Fixes: 2d8bff12 ("netpoll: Close race condition between poll_one_napi and napi_disable")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Silence suspicious RCU usage warning · 7f6f32bb
      Ido Schimmel committed
      fib_info_notify_update() is always called with RTNL held, but not from
      an RCU read-side critical section. This leads to the following warning
      [1] when the FIB table list is traversed with
      hlist_for_each_entry_rcu(), but without a proper lockdep expression.
      
      Since modification of the list is protected by RTNL, silence the warning
      by adding a lockdep expression which verifies RTNL is held.
      
      [1]
       =============================
       WARNING: suspicious RCU usage
       5.9.0-rc1-custom-14233-g2f26e122d62f #129 Not tainted
       -----------------------------
       net/ipv4/fib_trie.c:2124 RCU-list traversed in non-reader section!!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 2, debug_locks = 1
       1 lock held by ip/834:
        #0: ffffffff85a3b6b0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x49a/0xbd0
      
       stack backtrace:
       CPU: 0 PID: 834 Comm: ip Not tainted 5.9.0-rc1-custom-14233-g2f26e122d62f #129
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
       Call Trace:
        dump_stack+0x100/0x184
        lockdep_rcu_suspicious+0x143/0x14d
        fib_info_notify_update+0x8d1/0xa60
        __nexthop_replace_notify+0xd2/0x290
        rtm_new_nexthop+0x35e2/0x5946
        rtnetlink_rcv_msg+0x4f7/0xbd0
        netlink_rcv_skb+0x17a/0x480
        rtnetlink_rcv+0x22/0x30
        netlink_unicast+0x5ae/0x890
        netlink_sendmsg+0x98a/0xf40
        ____sys_sendmsg+0x879/0xa00
        ___sys_sendmsg+0x122/0x190
        __sys_sendmsg+0x103/0x1d0
        __x64_sys_sendmsg+0x7d/0xb0
        do_syscall_64+0x32/0x50
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
       RIP: 0033:0x7fde28c3be57
       Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
       RSP: 002b:00007ffc09330028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
       RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fde28c3be57
       RDX: 0000000000000000 RSI: 00007ffc09330090 RDI: 0000000000000003
       RBP: 000000005f45f911 R08: 0000000000000001 R09: 00007ffc0933012c
       R10: 0000000000000076 R11: 0000000000000246 R12: 0000000000000001
       R13: 00007ffc09330290 R14: 00007ffc09330eee R15: 00005610e48ed020
      
      Fixes: 1bff1a0c ("ipv4: Add function to send route updates")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: free acked data before waiting for more memory · 1cec170d
      Florian Westphal committed
      After subflow lock is dropped, more wmem might have been made available.
      
      This fixes a deadlock in mptcp_connect.sh 'mmap' mode: wmem is exhausted.
      But as the mptcp socket holds on to already-acked data (for retransmit)
      no wakeup will occur.
      
      Using 'goto restart' calls mptcp_clean_una(sk), which will free pages
      that have been acked completely in the meantime.
      
      Fixes: fb529e62 ("mptcp: break and restart in case mptcp sndbuf is full")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • taprio: Fix using wrong queues in gate mask · 09e31cf0
      Vinicius Costa Gomes committed
      Since commit 9c66d156 ("taprio: Add support for hardware
      offloading") there's a bit of inconsistency when offloading schedules
      to the hardware:
      
      In software mode, the gate masks are specified in terms of traffic
      classes, so "sched-entry S 03 20000", for example, means that traffic
      classes 0 and 1 are open for 20us; when taprio is offloaded to
      hardware, the gate masks are specified in terms of hardware queues.
      
      The idea here is to fix hardware offloading, so schedules in hardware
      and software mode have the same behavior. This requires mapping
      traffic classes to queues when applying the offload to the driver.
      
      Fixes: 9c66d156 ("taprio: Add support for hardware offloading")
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
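The mapping from traffic classes to queues described in the taprio commit above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: `struct tc_queue_range` and `tc_mask_to_queue_mask()` are hypothetical names, and each traffic class is assumed to own a contiguous range of queues given by an offset and a count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the TX queues that belong to one traffic class. */
struct tc_queue_range {
    unsigned int offset; /* first queue of the class */
    unsigned int count;  /* number of consecutive queues */
};

/*
 * Translate a gate mask expressed in traffic classes into a gate mask
 * expressed in queues: every queue of every selected class gets its bit set.
 */
static uint32_t tc_mask_to_queue_mask(const struct tc_queue_range *map,
                                      size_t num_tc, uint32_t tc_mask)
{
    uint32_t queue_mask = 0;

    assert(map != NULL);

    for (size_t tc = 0; tc < num_tc && tc < 32; tc++) {
        /* Skip classes whose gate is closed in this schedule entry. */
        if (!(tc_mask & (UINT32_C(1) << tc)))
            continue;

        /* Open the gate of each queue in the class (queues >= 32 are ignored). */
        for (unsigned int q = map[tc].offset;
             q < map[tc].offset + map[tc].count && q < 32; q++)
            queue_mask |= UINT32_C(1) << q;
    }

    return queue_mask;
}
```

With a layout in which class 0 owns queues 0 and 1 and class 1 owns queue 2, the software-mode mask 03 from the commit message would translate to a queue mask that covers queues 0 to 2.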
  2. 25 Aug, 2020 4 commits
    • net: caif: fix error code handling · e1046841
      Tong Zhang committed
      cfpkt_peek_head() returns 0 or 1, but the caller checks for errors using < 0.
      Signed-off-by: Tong Zhang <ztong0001@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
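The kind of mismatch fixed by the caif commit above can be shown with a small self-contained sketch. `peek_head()`, `read_header_buggy()` and `read_header_fixed()` are hypothetical stand-ins, and the assumption that 1 signals failure is made here for illustration only; the commit message says only that the function returns 0 and 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a helper that reports its status as 0 or 1.
 * Assumption for this sketch: 0 means success, 1 means failure.
 * It never returns a negative value.
 */
static int peek_head(const unsigned char *pkt, size_t pkt_len,
                     unsigned char *out, size_t n)
{
    assert(pkt != NULL && out != NULL);

    if (n > pkt_len)
        return 1;
    memcpy(out, pkt, n);
    return 0;
}

/* Buggy caller: "< 0" can never be true, so a failure goes unnoticed. */
static int read_header_buggy(const unsigned char *pkt, size_t pkt_len,
                             unsigned char hdr[6])
{
    if (peek_head(pkt, pkt_len, hdr, 6) < 0)
        return -1;
    return 0;
}

/* Fixed caller: any non-zero return value is treated as an error. */
static int read_header_fixed(const unsigned char *pkt, size_t pkt_len,
                             unsigned char hdr[6])
{
    if (peek_head(pkt, pkt_len, hdr, 6) != 0)
        return -1;
    return 0;
}
```

For a packet shorter than the 6-byte header, the buggy variant reports success while the fixed variant reports an error.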
    • net: Get rid of consume_skb when tracing is off · be769db2
      Herbert Xu committed
      The function consume_skb is only meaningful when tracing is enabled.
      This patch makes it conditional on CONFIG_TRACEPOINTS.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlabel: fix problems with mapping removal · d3b990b7
      Paul Moore committed
      This patch fixes two main problems seen when removing NetLabel
      mappings: memory leaks and potentially extra audit noise.
      
      The memory leaks are caused by not properly free'ing the mapping's
      address selector struct when free'ing the entire entry as well as
      not properly cleaning up a temporary mapping entry when adding new
      address selectors to an existing entry.  This patch fixes both these
      problems such that kmemleak reports no NetLabel associated leaks
      after running the SELinux test suite.
      
      The potentially extra audit noise was caused by the auditing code in
      netlbl_domhsh_remove_entry() being called regardless of the entry's
      validity.  If another thread had already marked the entry as invalid,
      but not removed/free'd it from the list of mappings, then it was
      possible that an additional mapping removal audit record would be
      generated.  This patch fixes this by returning early from the removal
      function when the entry was previously marked invalid.  This change
      also had the side benefit of improving the code by decreasing the
      indentation level of a large chunk of code by one (accounting for most
      of the diffstat).
      
      Fixes: 63c41688 ("netlabel: Add network address selectors to the NetLabel/LSM domain mapping")
      Reported-by: Stephen Smalley <stephen.smalley.work@gmail.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: not disable bh in the whole sctp_get_port_local() · 3106ecb4
      Xin Long committed
      Because bh is disabled for the whole of sctp_get_port_local(), when
      snum == 0 and too many ports are in use, the do-while loop can
      occupy the CPU for a long time and trigger a soft lockup:
      
        [ ] watchdog: BUG: soft lockup - CPU#11 stuck for 22s!
        [ ] RIP: 0010:native_queued_spin_lock_slowpath+0x4de/0x940
        [ ] Call Trace:
        [ ]  _raw_spin_lock+0xc1/0xd0
        [ ]  sctp_get_port_local+0x527/0x650 [sctp]
        [ ]  sctp_do_bind+0x208/0x5e0 [sctp]
        [ ]  sctp_autobind+0x165/0x1e0 [sctp]
        [ ]  sctp_connect_new_asoc+0x355/0x480 [sctp]
        [ ]  __sctp_connect+0x360/0xb10 [sctp]
      
      There's no need to disable bh for the whole of
      sctp_get_port_local(). Fix the soft lockup by removing the
      local_bh_disable() call at the beginning and using
      spin_lock_bh() instead.
      
      The same thing was done for inet_csk_get_port() in
      commit ea8add2b ("tcp/dccp: better use of ephemeral
      ports in bind()").
      
      Thanks to Marcelo for pointing the buggy code out.
      
      v1->v2:
        - use cond_resched() to yield cpu to other tasks if needed,
          as Eric noticed.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: Ying Xu <yinxu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 23 Aug, 2020 1 commit
    • net: nexthop: don't allow empty NHA_GROUP · eeaac363
      Nikolay Aleksandrov committed
      Currently the nexthop code accepts an empty NHA_GROUP attribute, but it
      requires at least 1 entry in order to function properly. Otherwise we
      end up dereferencing null or random pointers all over the place because
      no nh_grp_entry members are allocated; the nexthop code relies on having
      at least the first member present. An empty NHA_GROUP doesn't make any
      sense, so just disallow it.
      Also add a WARN_ON for any future users of nexthop_create_group().
      
       BUG: kernel NULL pointer dereference, address: 0000000000000080
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP
       CPU: 0 PID: 558 Comm: ip Not tainted 5.9.0-rc1+ #93
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
       RIP: 0010:fib_check_nexthop+0x4a/0xaa
       Code: 0f 84 83 00 00 00 48 c7 02 80 03 f7 81 c3 40 80 fe fe 75 12 b8 ea ff ff ff 48 85 d2 74 6b 48 c7 02 40 03 f7 81 c3 48 8b 40 10 <48> 8b 80 80 00 00 00 eb 36 80 78 1a 00 74 12 b8 ea ff ff ff 48 85
       RSP: 0018:ffff88807983ba00 EFLAGS: 00010213
       RAX: 0000000000000000 RBX: ffff88807983bc00 RCX: 0000000000000000
       RDX: ffff88807983bc00 RSI: 0000000000000000 RDI: ffff88807bdd0a80
       RBP: ffff88807983baf8 R08: 0000000000000dc0 R09: 000000000000040a
       R10: 0000000000000000 R11: ffff88807bdd0ae8 R12: 0000000000000000
       R13: 0000000000000000 R14: ffff88807bea3100 R15: 0000000000000001
       FS:  00007f10db393700(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000080 CR3: 000000007bd0f004 CR4: 00000000003706f0
       Call Trace:
        fib_create_info+0x64d/0xaf7
        fib_table_insert+0xf6/0x581
        ? __vma_adjust+0x3b6/0x4d4
        inet_rtm_newroute+0x56/0x70
        rtnetlink_rcv_msg+0x1e3/0x20d
        ? rtnl_calcit.isra.0+0xb8/0xb8
        netlink_rcv_skb+0x5b/0xac
        netlink_unicast+0xfa/0x17b
        netlink_sendmsg+0x334/0x353
        sock_sendmsg_nosec+0xf/0x3f
        ____sys_sendmsg+0x1a0/0x1fc
        ? copy_msghdr_from_user+0x4c/0x61
        ___sys_sendmsg+0x63/0x84
        ? handle_mm_fault+0xa39/0x11b5
        ? sockfd_lookup_light+0x72/0x9a
        __sys_sendmsg+0x50/0x6e
        do_syscall_64+0x54/0xbe
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
       RIP: 0033:0x7f10dacc0bb7
       Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 8b 05 9a 4b 2b 00 85 c0 75 2e 48 63 ff 48 63 d2 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 f2 2a 00 f7 d8 64 89 02 48
       RSP: 002b:00007ffcbe628bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
       RAX: ffffffffffffffda RBX: 00007ffcbe628f80 RCX: 00007f10dacc0bb7
       RDX: 0000000000000000 RSI: 00007ffcbe628c60 RDI: 0000000000000003
       RBP: 000000005f41099c R08: 0000000000000001 R09: 0000000000000008
       R10: 00000000000005e9 R11: 0000000000000246 R12: 0000000000000000
       R13: 0000000000000000 R14: 00007ffcbe628d70 R15: 0000563a86c6e440
       Modules linked in:
       CR2: 0000000000000080
      
      CC: David Ahern <dsahern@gmail.com>
      Fixes: 430a0491 ("nexthop: Add support for nexthop groups")
      Reported-by: syzbot+a61aa19b0c14c8770bd9@syzkaller.appspotmail.com
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
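The validation idea in the nexthop commit above (reject a group attribute that carries no entries) can be sketched as follows. This is a hedged userspace illustration rather than the kernel implementation: `struct group_entry` and `check_group_payload()` are hypothetical names, and the entry layout is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size group member as carried in the attribute payload. */
struct group_entry {
    uint32_t id;     /* identifier of the member nexthop */
    uint32_t weight; /* relative weight of the member */
};

/*
 * Validate the payload of a group attribute before any entry is read.
 * An empty payload, or one that is not a whole number of entries, is
 * rejected so later code can rely on at least one member being present.
 */
static int check_group_payload(const struct group_entry *entries, size_t len)
{
    assert(entries != NULL || len == 0);

    if (len == 0 || len % sizeof(struct group_entry) != 0)
        return -EINVAL;

    return 0;
}
```

Rejecting the empty case up front means code that later reads the first member never has to handle a group with zero entries.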
  4. 22 Aug, 2020 2 commits
    • netfilter: nf_tables: fix destination register zeroing · 1e105e6a
      Florian Westphal committed
      The following bug was reported via IRC:
      nft list ruleset
         set knock_candidates_ipv4 {
            type ipv4_addr . inet_service
            size 65535
            elements = { 127.0.0.1 . 123,
                         127.0.0.1 . 123 }
            }
       ..
         udp dport 123 add @knock_candidates_ipv4 { ip saddr . 123 }
         udp dport 123 add @knock_candidates_ipv4 { ip saddr . udp dport }
      
      It should not have been possible to add a duplicate set entry.
      
      After some debugging it turned out that the problem is the immediate
      value (123) in the second-to-last rule.
      
      Concatenations use 32-bit registers, i.e. the elements are 8 bytes each,
      not 6, and it turns out the kernel inserted
      
      inet firewall @knock_candidates_ipv4
              element 0100007f ffff7b00  : 0 [end]
              element 0100007f 00007b00  : 0 [end]
      
      Note the non-zero upper bits of the first element.  It turns out that
      nft_immediate doesn't zero the destination register, but this is needed
      when the length isn't a multiple of 4.
      
      Furthermore, the zeroing in nft_payload is broken.  We can't use
      [len / 4] = 0 -- if len is a multiple of 4, index is off by one.
      
      Skip zeroing in this case and use a conditional instead of (len -1) / 4.
      
      Fixes: 49499c3e ("netfilter: nf_tables: switch registers to 32 bit addressing")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
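The zeroing rule described in the nf_tables commit above can be illustrated with a short userspace sketch. It is not the kernel code: `store_padded()` and `REG32_SIZE` are hypothetical names, and the register file is modelled as a plain array of 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG32_SIZE 4u /* bytes per 32-bit register, as in the commit text */

/*
 * Copy len bytes into an array of 32-bit registers. When len is not a
 * multiple of 4, the last register is only partly written, so it is
 * cleared first to keep its unused bytes zero instead of stale.
 */
static void store_padded(uint32_t *dest, size_t nregs,
                         const void *src, size_t len)
{
    assert(dest != NULL && src != NULL);
    assert(len > 0 && len <= nregs * REG32_SIZE);

    /*
     * Zero only a partial trailing register. If len is a multiple of 4,
     * dest[len / 4] is one register past the data and must be left alone.
     */
    if (len % REG32_SIZE)
        dest[len / REG32_SIZE] = 0;

    memcpy(dest, src, len);
}
```

Storing a 2-byte port this way leaves the upper two bytes of the register at zero, which is the property the duplicate set element in the report was missing.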
    • netfilter: nf_tables: add NFTA_SET_USERDATA if not null · 6f03bf43
      Pablo Neira Ayuso committed
      The kernel sends an empty NFTA_SET_USERDATA attribute with no value if
      userspace adds a set with no NFTA_SET_USERDATA attribute.
      
      Fixes: e6d8ecac ("netfilter: nf_tables: Add new attributes into nft_set to store user data.")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  5. 21 Aug, 2020 8 commits
  6. 20 Aug, 2020 2 commits
  7. 19 Aug, 2020 9 commits
  8. 18 Aug, 2020 2 commits
  9. 17 Aug, 2020 6 commits
  10. 15 Aug, 2020 1 commit