1. 10 Nov, 2019 40 commits
    • selftests: fib_tests: add more tests for metric update · 0c3355cc
      Paolo Abeni committed
      [ Upstream commit 37de3b354150450ba12275397155e68113e99901 ]
      
      This patch adds two more tests to ipv4_addr_metric_test() to
      explicitly cover the scenarios fixed by the previous patch.
      Suggested-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c3355cc
    • ipv4: fix route update on metric change. · b166e883
      Paolo Abeni committed
      [ Upstream commit 0b834ba00ab5337e938c727e216e1f5249794717 ]
      
      Since commit af4d768a ("net/ipv4: Add support for specifying metric
      of connected routes"), when updating an IP address with a different metric,
      the associated connected route is updated, too.
      
      Still, the mentioned commit doesn't properly handle some corner cases:
      
      $ ip addr add dev eth0 192.168.1.0/24
      $ ip addr add dev eth0 192.168.2.1/32 peer 192.168.2.2
      $ ip addr add dev eth0 192.168.3.1/24
      $ ip addr change dev eth0 192.168.1.0/24 metric 10
      $ ip addr change dev eth0 192.168.2.1/32 peer 192.168.2.2 metric 10
      $ ip addr change dev eth0 192.168.3.1/24 metric 10
      $ ip -4 route
      192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.0
      192.168.2.2 dev eth0 proto kernel scope link src 192.168.2.1
      192.168.3.0/24 dev eth0 proto kernel scope link src 192.168.2.1 metric 10
      
      Only the last route is correctly updated.
      
      The problem is the current test in fib_modify_prefix_metric():
      
      	if (!(dev->flags & IFF_UP) ||
      	    ifa->ifa_flags & (IFA_F_SECONDARY | IFA_F_NOPREFIXROUTE) ||
      	    ipv4_is_zeronet(prefix) ||
      	    prefix == ifa->ifa_local || ifa->ifa_prefixlen == 32)
      
      Which should be the logical 'not' of the pre-existing test in
      fib_add_ifaddr():
      
      	if (!ipv4_is_zeronet(prefix) && !(ifa->ifa_flags & IFA_F_SECONDARY) &&
      	    (prefix != addr || ifa->ifa_prefixlen < 32))
      
      To properly negate the original expression, we need to change the last
      logical 'or' to a logical 'and'.
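
The negation argument above can be checked mechanically. The following is a minimal userspace sketch, not kernel code: the four boolean parameters are hypothetical stand-ins for the sub-tests (zeronet prefix, secondary address, prefix equal to the local address, prefix length of 32), and the IFF_UP and IFA_F_NOPREFIXROUTE checks are left out for brevity.

```c
#include <stdbool.h>

/* Condition under which fib_add_ifaddr() installs the prefix route,
 * with each sub-test reduced to a boolean (hypothetical parameter names). */
bool add_route(bool zeronet, bool secondary, bool prefix_is_local, bool len_is_32)
{
	return !zeronet && !secondary && (!prefix_is_local || !len_is_32);
}

/* Skip test as quoted in the message: the last operator is a logical 'or'. */
bool skip_before_fix(bool zeronet, bool secondary, bool prefix_is_local, bool len_is_32)
{
	return zeronet || secondary || prefix_is_local || len_is_32;
}

/* Skip test with the last 'or' changed to 'and', as the message proposes. */
bool skip_after_fix(bool zeronet, bool secondary, bool prefix_is_local, bool len_is_32)
{
	return zeronet || secondary || (prefix_is_local && len_is_32);
}

typedef bool (*skip_fn)(bool, bool, bool, bool);

/* Count the input combinations where skip() is NOT the negation of add_route(). */
int count_mismatches(skip_fn skip)
{
	int mismatches = 0;

	for (int bits = 0; bits < 16; bits++) {
		bool z = bits & 1, s = bits & 2, p = bits & 4, l = bits & 8;

		if (skip(z, s, p, l) != !add_route(z, s, p, l))
			mismatches++;
	}
	return mismatches;
}
```

Calling count_mismatches(skip_before_fix) finds two disagreeing input combinations: a prefix equal to the local address with a prefix length below 32, and a /32 whose prefix differs from the local address. These match the first two routes in the example. count_mismatches(skip_after_fix) finds none.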
      
      Fixes: af4d768a ("net/ipv4: Add support for specifying metric of connected routes")
      Reported-and-suggested-by: Beniamino Galvani <bgalvani@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b166e883
    • net: add READ_ONCE() annotation in __skb_wait_for_more_packets() · cd3bcb44
      Eric Dumazet committed
      [ Upstream commit 7c422d0ce97552dde4a97e6290de70ec6efb0fc6 ]
      
      __skb_wait_for_more_packets() can be called while other cpus
      can feed packets to the socket receive queue.
      
      KCSAN reported :
      
      BUG: KCSAN: data-race in __skb_wait_for_more_packets / __udp_enqueue_schedule_skb
      
      write to 0xffff888102e40b58 of 8 bytes by interrupt on cpu 0:
       __skb_insert include/linux/skbuff.h:1852 [inline]
       __skb_queue_before include/linux/skbuff.h:1958 [inline]
       __skb_queue_tail include/linux/skbuff.h:1991 [inline]
       __udp_enqueue_schedule_skb+0x2d7/0x410 net/ipv4/udp.c:1470
       __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline]
       udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057
       udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074
       udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233
       __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300
       udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      read to 0xffff888102e40b58 of 8 bytes by task 13035 on cpu 1:
       __skb_wait_for_more_packets+0xfa/0x320 net/core/datagram.c:100
       __skb_recv_udp+0x374/0x500 net/ipv4/udp.c:1683
       udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 13035 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd3bcb44
    • net: use skb_queue_empty_lockless() in busy poll contexts · 4f3df7f1
      Eric Dumazet committed
      [ Upstream commit 3f926af3f4d688e2e11e7f8ed04e277a14d4d4a4 ]
      
      Busy polling usually runs without locks.
      Let's use skb_queue_empty_lockless() instead of skb_queue_empty()
      
      Also use READ_ONCE() in __skb_try_recv_datagram() to address
      a similar potential problem.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f3df7f1
    • net: use skb_queue_empty_lockless() in poll() handlers · eaf548fe
      Eric Dumazet committed
      [ Upstream commit 3ef7cf57c72f32f61e97f8fa401bc39ea1f1a5d4 ]
      
      Many poll() handlers are lockless. Using skb_queue_empty_lockless()
      instead of skb_queue_empty() is more appropriate.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eaf548fe
    • udp: use skb_queue_empty_lockless() · afa1f5e9
      Eric Dumazet committed
      [ Upstream commit 137a0dbe3426fd7bcfe3f8117b36a87b3590e4eb ]
      
      syzbot reported a data-race [1].
      
      We should use skb_queue_empty_lockless() to document that we are
      not ensuring a mutual exclusion and silence KCSAN.
      
      [1]
      BUG: KCSAN: data-race in __skb_recv_udp / __udp_enqueue_schedule_skb
      
      write to 0xffff888122474b50 of 8 bytes by interrupt on cpu 0:
       __skb_insert include/linux/skbuff.h:1852 [inline]
       __skb_queue_before include/linux/skbuff.h:1958 [inline]
       __skb_queue_tail include/linux/skbuff.h:1991 [inline]
       __udp_enqueue_schedule_skb+0x2c1/0x410 net/ipv4/udp.c:1470
       __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline]
       udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057
       udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074
       udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233
       __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300
       udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      read to 0xffff888122474b50 of 8 bytes by task 8921 on cpu 1:
       skb_queue_empty include/linux/skbuff.h:1494 [inline]
       __skb_recv_udp+0x18d/0x500 net/ipv4/udp.c:1653
       udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 8921 Comm: syz-executor.4 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      afa1f5e9
    • net: add skb_queue_empty_lockless() · d5ac4232
      Eric Dumazet committed
      [ Upstream commit d7d16a89350ab263484c0aa2b523dd3a234e4a80 ]
      
      Some paths call skb_queue_empty() without holding
      the queue lock. We must use a barrier in order
      to not let the compiler do strange things, and avoid
      KCSAN splats.
      
      Adding a barrier in skb_queue_empty() might be overkill,
      I prefer adding a new helper to clearly identify
      points where the callers might be lockless. This might
      help us find real bugs.
      
      The corresponding WRITE_ONCE() should add zero cost
      for current compilers.
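
The idea of a separate lockless helper paired with WRITE_ONCE() on the writer side can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel's code: READ_ONCE() and WRITE_ONCE() here are simplified volatile-cast stand-ins that rely on the GNU `__typeof__` extension, and `struct node`, `queue_init()`, `queue_add_tail()` and `queue_empty_lockless()` are hypothetical names for a minimal circular list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros: the volatile access makes the
 * compiler emit exactly one load or store and forbids caching the value. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical circular doubly linked list; the head doubles as sentinel. */
struct node {
	struct node *next;
	struct node *prev;
};

void queue_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

/* Writer side (would run under the queue lock): the pointers that a lockless
 * reader may look at are published with WRITE_ONCE(). */
void queue_add_tail(struct node *head, struct node *n)
{
	struct node *last = head->prev;

	n->next = head;
	n->prev = last;
	WRITE_ONCE(last->next, n);
	WRITE_ONCE(head->prev, n);
}

/* Reader side: may be called without the lock, so the single pointer load
 * is annotated with READ_ONCE(). The result is only a hint. */
bool queue_empty_lockless(const struct node *head)
{
	return READ_ONCE(head->next) == head;
}
```

A caller that holds the lock can keep using a plain emptiness test, while lockless callers use queue_empty_lockless(), which keeps the unlocked call sites easy to find.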
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5ac4232
    • vxlan: check tun_info options_len properly · 83532eb4
      Xin Long committed
      [ Upstream commit eadf52cf1852196a1363044dcda22fa5d7f296f7 ]
      
      This patch improves the tun_info options_len check by dropping
      the skb when TUNNEL_VXLAN_OPT is set but options_len is less
      than the size of vxlan_metadata. This can avoid a potential
      out-of-bounds access on ip_tun_info.
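
The check described above amounts to a length validation before the options area is interpreted as metadata. A minimal sketch, assuming a hypothetical `struct demo_vxlan_md` in place of the real vxlan_metadata and a boolean in place of the TUNNEL_VXLAN_OPT flag test:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the metadata carried in the tunnel options. */
struct demo_vxlan_md {
	uint32_t gbp;
};

/* Returns true when the options may be used as-is, false when the packet
 * should be dropped because the options area is too short to hold the
 * metadata the flag promises. */
bool demo_tun_opts_ok(bool vxlan_opt_set, size_t options_len)
{
	if (!vxlan_opt_set)
		return true;	/* no metadata expected, nothing to validate */

	return options_len >= sizeof(struct demo_vxlan_md);
}
```

Validating the length first means a later read of the metadata can never reach past the end of the options buffer.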
      
      Fixes: ee122c79 ("vxlan: Flow based tunneling")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      83532eb4
    • udp: fix data-race in udp_set_dev_scratch() · a8a5adbb
      Eric Dumazet committed
      [ Upstream commit a793183caa9afae907a0d7ddd2ffd57329369bf5 ]
      
      KCSAN reported a data-race in udp_set_dev_scratch() [1]
      
      The issue here is that we must not write over skb fields
      if skb is shared. A similar issue has been fixed in commit
      89c22d8c ("net: Fix skb csum races when peeking")
      
      While we are at it, use a helper only dealing with
      udp_skb_scratch(skb)->csum_unnecessary, as this allows
      udp_set_dev_scratch() to be called once and thus inlined.
      
      [1]
      BUG: KCSAN: data-race in udp_set_dev_scratch / udpv6_recvmsg
      
      write to 0xffff888120278317 of 1 bytes by task 10411 on cpu 1:
       udp_set_dev_scratch+0xea/0x200 net/ipv4/udp.c:1308
       __first_packet_length+0x147/0x420 net/ipv4/udp.c:1556
       first_packet_length+0x68/0x2a0 net/ipv4/udp.c:1579
       udp_poll+0xea/0x110 net/ipv4/udp.c:2720
       sock_poll+0xed/0x250 net/socket.c:1256
       vfs_poll include/linux/poll.h:90 [inline]
       do_select+0x7d0/0x1020 fs/select.c:534
       core_sys_select+0x381/0x550 fs/select.c:677
       do_pselect.constprop.0+0x11d/0x160 fs/select.c:759
       __do_sys_pselect6 fs/select.c:784 [inline]
       __se_sys_pselect6 fs/select.c:769 [inline]
       __x64_sys_pselect6+0x12e/0x170 fs/select.c:769
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      read to 0xffff888120278317 of 1 bytes by task 10413 on cpu 0:
       udp_skb_csum_unnecessary include/net/udp.h:358 [inline]
       udpv6_recvmsg+0x43e/0xe90 net/ipv6/udp.c:310
       inet6_recvmsg+0xbb/0x240 net/ipv6/af_inet6.c:592
       sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 10413 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 2276f58a ("udp: use a separate rx queue for packet reception")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a8a5adbb
    • selftests: net: reuseport_dualstack: fix uninitalized parameter · 12fab163
      Wei Wang committed
      [ Upstream commit d64479a3e3f9924074ca7b50bd72fa5211dca9c1 ]
      
      This test reports EINVAL for getsockopt(SOL_SOCKET, SO_DOMAIN)
      occasionally due to the uninitialized length parameter.
      Initialize it to fix this, and also use int for "test_family" to comply
      with the API standard.
      
      Fixes: d6a61f80 ("soreuseport: test mixed v4/v6 sockets")
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Wei Wang <weiwan@google.com>
      Cc: Craig Gallek <cgallek@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      12fab163
    • net: Zeroing the structure ethtool_wolinfo in ethtool_get_wol() · 321c9915
      zhanglin committed
      [ Upstream commit 5ff223e86f5addbfae26419cbb5d61d98f6fbf7d ]
      
      memset() the structure ethtool_wolinfo: it has padding bytes,
      and those padding bytes were not being zeroed out.
      Signed-off-by: zhanglin <zhang.lin16@zte.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      321c9915
    • net: usb: lan78xx: Disable interrupts before calling generic_handle_irq() · 9da271c1
      Daniel Wagner committed
      [ Upstream commit 0a29ac5bd3a988dc151c8d26910dec2557421f64 ]
      
      lan78xx_status() will run with interrupts enabled due to the change in
      ed194d136769 ("usb: core: remove local_irq_save() around ->complete()
      handler"). generic_handle_irq() expects to be run with IRQs disabled.
      
      [    4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
      [    4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168
      [    4.896294] 000: Modules linked in:
      [    4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39
      [    4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT)
      [    4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO)
      [    4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168
      [    4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168
      [    4.896339] 000: sp : ffff000010003cc0
      [    4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060
      [    4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b
      [    4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400
      [    4.896373] 000: x23: 000000000000004f x22: ffff000010003d64
      [    4.896381] 000: x21: 0000000000000000 x20: 0000000000000002
      [    4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060
      [    4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb
      [    4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465
      [    4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0
      [    4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08
      [    4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920
      [    4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00
      [    4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510
      [    4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000
      [    4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000
      [    4.896466] 000: Call trace:
      [    4.896471] 000:  __handle_irq_event_percpu+0x154/0x168
      [    4.896481] 000:  handle_irq_event_percpu+0x50/0xb0
      [    4.896489] 000:  handle_irq_event+0x40/0x98
      [    4.896497] 000:  handle_simple_irq+0xa4/0xf0
      [    4.896505] 000:  generic_handle_irq+0x24/0x38
      [    4.896513] 000:  intr_complete+0xb0/0xe0
      [    4.896525] 000:  __usb_hcd_giveback_urb+0x58/0xd8
      [    4.896533] 000:  usb_giveback_urb_bh+0xd0/0x170
      [    4.896539] 000:  tasklet_action_common.isra.0+0x9c/0x128
      [    4.896549] 000:  tasklet_hi_action+0x24/0x30
      [    4.896556] 000:  __do_softirq+0x120/0x23c
      [    4.896564] 000:  irq_exit+0xb8/0xd8
      [    4.896571] 000:  __handle_domain_irq+0x64/0xb8
      [    4.896579] 000:  bcm2836_arm_irqchip_handle_irq+0x60/0xc0
      [    4.896586] 000:  el1_irq+0xb8/0x140
      [    4.896592] 000:  arch_cpu_idle+0x10/0x18
      [    4.896601] 000:  do_idle+0x200/0x280
      [    4.896608] 000:  cpu_startup_entry+0x20/0x28
      [    4.896615] 000:  rest_init+0xb4/0xc0
      [    4.896623] 000:  arch_call_rest_init+0xc/0x14
      [    4.896632] 000:  start_kernel+0x454/0x480
      
      Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
      Cc: Woojung Huh <woojung.huh@microchip.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Stefan Wahren <wahrenst@gmx.net>
      Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Daniel Wagner <dwagner@suse.de>
      Tested-by: Stefan Wahren <wahrenst@gmx.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9da271c1
    • netns: fix GFP flags in rtnl_net_notifyid() · 40400fdd
      Guillaume Nault committed
      [ Upstream commit d4e4fdf9e4a27c87edb79b1478955075be141f67 ]
      
      In rtnl_net_notifyid(), we certainly can't pass a null GFP flag to
      rtnl_notify(). A GFP_KERNEL flag would be fine in most circumstances,
      but there are a few paths calling rtnl_net_notifyid() from atomic
      context or from RCU critical sections. The latter also precludes the use
      of gfp_any() as it wouldn't detect the RCU case. Also, the nlmsg_new()
      call is wrong too, as it uses GFP_KERNEL unconditionally.
      
      Therefore, we need to pass the GFP flags as parameter and propagate it
      through function calls until the proper flags can be determined.
      
      In most cases, GFP_KERNEL is fine. The exceptions are:
        * openvswitch: ovs_vport_cmd_get() and ovs_vport_cmd_dump()
          indirectly call rtnl_net_notifyid() from RCU critical section,
      
        * rtnetlink: rtmsg_ifinfo_build_skb() already receives GFP flags as
          parameter.
      
      Also, in ovs_vport_cmd_build_info(), let's change the GFP flags used
      by nlmsg_new(). The function is allowed to sleep, so better make the
      flags consistent with the ones used in the following
      ovs_vport_cmd_fill_info() call.
      
      Found by code inspection.
      
      Fixes: 9a963454 ("netns: notify netns id events")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      40400fdd
    • net/mlx4_core: Dynamically set guaranteed amount of counters per VF · 1d72dbb4
      Eran Ben Elisha committed
      [ Upstream commit e19868efea0c103f23b4b7e986fd0a703822111f ]
      
      Prior to this patch, the amount of counters guaranteed per VF in the
      resource tracker was MLX4_VF_COUNTERS_PER_PORT * MLX4_MAX_PORTS. It was
      set regardless of whether the VF was single or dual port.
      This caused several VFs to have no guaranteed counters although the
      system could satisfy their request.
      
      The fix is to dynamically guarantee counters, based on each VF
      specification.
      
      Fixes: 9de92c60 ("net/mlx4_core: Adjust counter grant policy in the resource tracker")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d72dbb4
    • net: hisilicon: Fix ping latency when deal with high throughput · f05975d9
      Jiangfeng Xiao committed
      [ Upstream commit e56bd641ca61beb92b135298d5046905f920b734 ]
      
      The latency is due to an error in over-budget processing.
      When dealing with high throughput, the used buffers
      that exceed the budget are not cleaned up. In addition,
      it takes a lot of cycles to clean up the used buffers,
      and only then can the buffer holding the valid data take effect.
      Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f05975d9
    • net: fix sk_page_frag() recursion from memory reclaim · 1d5cb12a
      Tejun Heo committed
      [ Upstream commit 20eb4f29b60286e0d6dc01d9c260b4bd383c58fb ]
      
      sk_page_frag() optimizes skb_frag allocations by using per-task
      skb_frag cache when it knows it's the only user.  The condition is
      determined by seeing whether the socket allocation mask allows
      blocking - if the allocation may block, it obviously owns the task's
      context and ergo exclusively owns current->task_frag.
      
      Unfortunately, this misses recursion through memory reclaim path.
      Please take a look at the following backtrace.
      
       [2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10
           ...
           tcp_sendmsg+0x27/0x40
           sock_sendmsg+0x30/0x40
           sock_xmit.isra.24+0xa1/0x170 [nbd]
           nbd_send_cmd+0x1d2/0x690 [nbd]
           nbd_queue_rq+0x1b5/0x3b0 [nbd]
           __blk_mq_try_issue_directly+0x108/0x1b0
           blk_mq_request_issue_directly+0xbd/0xe0
           blk_mq_try_issue_list_directly+0x41/0xb0
           blk_mq_sched_insert_requests+0xa2/0xe0
           blk_mq_flush_plug_list+0x205/0x2a0
           blk_flush_plug_list+0xc3/0xf0
       [1] blk_finish_plug+0x21/0x2e
           _xfs_buf_ioapply+0x313/0x460
           __xfs_buf_submit+0x67/0x220
           xfs_buf_read_map+0x113/0x1a0
           xfs_trans_read_buf_map+0xbf/0x330
           xfs_btree_read_buf_block.constprop.42+0x95/0xd0
           xfs_btree_lookup_get_block+0x95/0x170
           xfs_btree_lookup+0xcc/0x470
           xfs_bmap_del_extent_real+0x254/0x9a0
           __xfs_bunmapi+0x45c/0xab0
           xfs_bunmapi+0x15/0x30
           xfs_itruncate_extents_flags+0xca/0x250
           xfs_free_eofblocks+0x181/0x1e0
           xfs_fs_destroy_inode+0xa8/0x1b0
           destroy_inode+0x38/0x70
           dispose_list+0x35/0x50
           prune_icache_sb+0x52/0x70
           super_cache_scan+0x120/0x1a0
           do_shrink_slab+0x120/0x290
           shrink_slab+0x216/0x2b0
           shrink_node+0x1b6/0x4a0
           do_try_to_free_pages+0xc6/0x370
           try_to_free_mem_cgroup_pages+0xe3/0x1e0
           try_charge+0x29e/0x790
           mem_cgroup_charge_skmem+0x6a/0x100
           __sk_mem_raise_allocated+0x18e/0x390
           __sk_mem_schedule+0x2a/0x40
       [0] tcp_sendmsg_locked+0x8eb/0xe10
           tcp_sendmsg+0x27/0x40
           sock_sendmsg+0x30/0x40
           ___sys_sendmsg+0x26d/0x2b0
           __sys_sendmsg+0x57/0xa0
           do_syscall_64+0x42/0x100
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      In [0], tcp_sendmsg_locked() was using current->task_frag when it
      called sk_wmem_schedule().  It already calculated how many bytes can
      be fit into current->task_frag.  Due to memory pressure,
      sk_wmem_schedule() called into memory reclaim path which called into
      xfs and then IO issue path.  Because the filesystem in question is
      backed by nbd, the control goes back into the tcp layer - back into
      tcp_sendmsg_locked().
      
      nbd sets sk_allocation to (GFP_NOIO | __GFP_MEMALLOC) which makes
      sense - it's in the process of freeing memory and wants to be able to,
      e.g., drop clean pages to make forward progress.  However, this
      confused sk_page_frag() called from [2].  Because it only tests
      whether the allocation allows blocking, which it does, it now thinks
      current->task_frag can be used again although it already was being
      used in [0].
      
      After [2] used current->task_frag, the offset would be increased by
      the used amount.  When the control returns to [0],
      current->task_frag's offset is increased and the previously calculated
      number of bytes now may overrun the end of allocated memory leading to
      silent memory corruptions.
      
      Fix it by adding gfpflags_normal_context() which tests sleepable &&
      !reclaim and use it to determine whether to use current->task_frag.
      
      v2: Eric didn't like gfp flags being tested twice.  Introduce a new
          helper gfpflags_normal_context() and combine the two tests.
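
The combined "sleepable && !reclaim" test can be expressed as a single mask comparison. The sketch below is a userspace illustration under assumptions: the `DEMO_GFP_*` bit values are made up (the real flags are defined in the kernel's GFP headers), and `demo_gfpflags_normal_context()` is a hypothetical name mirroring the helper described above rather than its actual source.

```c
#include <stdbool.h>

/* Illustrative bit values only, not the kernel's. */
#define DEMO_GFP_DIRECT_RECLAIM 0x1u	/* allocation is allowed to block */
#define DEMO_GFP_MEMALLOC       0x2u	/* caller is part of memory reclaim */

/* True only for a blocking allocation that is not issued from the reclaim
 * path: both bits are examined in one masked comparison, so the flags are
 * tested once rather than twice. */
bool demo_gfpflags_normal_context(unsigned int gfp_flags)
{
	return (gfp_flags & (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_MEMALLOC)) ==
	       DEMO_GFP_DIRECT_RECLAIM;
}
```

With these definitions, a blocking allocation that also carries the memalloc bit (the nbd case above) is rejected, so the per-task fragment would not be reused from inside the reclaim path.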
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d5cb12a
    • net: ethernet: ftgmac100: Fix DMA coherency issue with SW checksum · 189982d1
      Benjamin Herrenschmidt committed
      [ Upstream commit 88824e3bf29a2fcacfd9ebbfe03063649f0f3254 ]
      
      We are calling the checksum helper after the dma_map_single()
      call to map the packet. This is incorrect as the checksumming
      code will touch the packet from the CPU. This means the cache
      won't be properly flushed (or the bounce buffering will leave
      us with the unmodified packet to DMA).
      
      This moves the calculation of the checksum & vlan tags to
      before the DMA mapping.
      
      This also has the side effect of fixing another bug: If the
      checksum helper fails, we goto "drop" to drop the packet, which
      will not unmap the DMA mapping.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Fixes: 05690d63 ("ftgmac100: Upgrade to NETIF_F_HW_CSUM")
      Reviewed-by: Vijay Khemka <vijaykhemka@fb.com>
      Tested-by: Vijay Khemka <vijaykhemka@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      189982d1
    • net: dsa: bcm_sf2: Fix IMP setup for port different than 8 · 5536fc89
      Florian Fainelli committed
      [ Upstream commit 5fc0f21246e50afdf318b5a3a941f7f4f57b8947 ]
      
      Since it became possible for the DSA core to use a CPU port different
      than 8, our bcm_sf2_imp_setup() function was broken because it assumes
      that registers are applicable to port 8. In particular, the port's MAC
      is going to stay disabled, so make sure we clear the RX_DIS and TX_DIS
      bits if we are not configured for port 8.
      
      Fixes: 9f91484f ("net: dsa: make "label" property optional for dsa2")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5536fc89
    • net: annotate lockless accesses to sk->sk_napi_id · 2c50a36d
      Eric Dumazet committed
      [ Upstream commit ee8d153d46a3b98c064ee15c0c0a3bbf1450e5a1 ]
      
      We already annotated most accesses to sk->sk_napi_id
      
      We missed sk_mark_napi_id() and sk_mark_napi_id_once()
      which might be called without socket lock held in UDP stack.
      
      KCSAN reported :
      BUG: KCSAN: data-race in udpv6_queue_rcv_one_skb / udpv6_queue_rcv_one_skb
      
      write to 0xffff888121c6d108 of 4 bytes by interrupt on cpu 0:
       sk_mark_napi_id include/net/busy_poll.h:125 [inline]
       __udpv6_queue_rcv_skb net/ipv6/udp.c:571 [inline]
       udpv6_queue_rcv_one_skb+0x70c/0xb40 net/ipv6/udp.c:672
       udpv6_queue_rcv_skb+0xb5/0x400 net/ipv6/udp.c:689
       udp6_unicast_rcv_skb.isra.0+0xd7/0x180 net/ipv6/udp.c:832
       __udp6_lib_rcv+0x69c/0x1770 net/ipv6/udp.c:913
       udpv6_rcv+0x2b/0x40 net/ipv6/udp.c:1015
       ip6_protocol_deliver_rcu+0x22a/0xbe0 net/ipv6/ip6_input.c:409
       ip6_input_finish+0x30/0x50 net/ipv6/ip6_input.c:450
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip6_input+0x177/0x190 net/ipv6/ip6_input.c:459
       dst_input include/net/dst.h:442 [inline]
       ip6_rcv_finish+0x110/0x140 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ipv6_rcv+0x1a1/0x1b0 net/ipv6/ip6_input.c:284
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
      
      write to 0xffff888121c6d108 of 4 bytes by interrupt on cpu 1:
       sk_mark_napi_id include/net/busy_poll.h:125 [inline]
       __udpv6_queue_rcv_skb net/ipv6/udp.c:571 [inline]
       udpv6_queue_rcv_one_skb+0x70c/0xb40 net/ipv6/udp.c:672
       udpv6_queue_rcv_skb+0xb5/0x400 net/ipv6/udp.c:689
       udp6_unicast_rcv_skb.isra.0+0xd7/0x180 net/ipv6/udp.c:832
       __udp6_lib_rcv+0x69c/0x1770 net/ipv6/udp.c:913
       udpv6_rcv+0x2b/0x40 net/ipv6/udp.c:1015
       ip6_protocol_deliver_rcu+0x22a/0xbe0 net/ipv6/ip6_input.c:409
       ip6_input_finish+0x30/0x50 net/ipv6/ip6_input.c:450
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip6_input+0x177/0x190 net/ipv6/ip6_input.c:459
       dst_input include/net/dst.h:442 [inline]
       ip6_rcv_finish+0x110/0x140 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ipv6_rcv+0x1a1/0x1b0 net/ipv6/ip6_input.c:284
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 10890 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: e68b6e50 ("udp: enable busy polling for all sockets")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c50a36d
    • E
      net: annotate accesses to sk->sk_incoming_cpu · 0cfaf03c
      Eric Dumazet committed
      [ Upstream commit 7170a977743b72cf3eb46ef6ef89885dc7ad3621 ]
      
      This socket field can be read and written by concurrent cpus.
      
      Use READ_ONCE() and WRITE_ONCE() annotations to document this,
      and avoid some compiler 'optimizations'.
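      
      A minimal user-space sketch of the annotation pattern described above.
      The two macros are simplified stand-ins for the kernel's READ_ONCE() and
      WRITE_ONCE() (the real definitions are more elaborate), and demo_sock
      and demo_incoming_cpu_update() are hypothetical names used only for
      illustration. It relies on the GCC/Clang __typeof__ extension.

```c
#include <assert.h>

/* Simplified stand-ins: force exactly one load or store through a
 * volatile-qualified access, so the compiler cannot tear, fuse or
 * re-read the access. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical socket with a field that several CPUs may touch. */
struct demo_sock {
    int incoming_cpu;
};

/* Record the CPU that handled the packet; write only when it changes. */
static void demo_incoming_cpu_update(struct demo_sock *sk, int cpu)
{
    if (READ_ONCE(sk->incoming_cpu) != cpu)
        WRITE_ONCE(sk->incoming_cpu, cpu);
}
```

      The annotations do not add locking. They only document that the
      access is intentionally lockless and constrain the compiler.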
      
      KCSAN reported :
      
      BUG: KCSAN: data-race in tcp_v4_rcv / tcp_v4_rcv
      
      write to 0xffff88812220763c of 4 bytes by interrupt on cpu 0:
       sk_incoming_cpu_update include/net/sock.h:953 [inline]
       tcp_v4_rcv+0x1b3c/0x1bb0 net/ipv4/tcp_ipv4.c:1934
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
       __do_softirq+0x115/0x33f kernel/softirq.c:292
       do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
       do_softirq.part.0+0x6b/0x80 kernel/softirq.c:337
       do_softirq kernel/softirq.c:329 [inline]
       __local_bh_enable_ip+0x76/0x80 kernel/softirq.c:189
      
      read to 0xffff88812220763c of 4 bytes by interrupt on cpu 1:
       sk_incoming_cpu_update include/net/sock.h:952 [inline]
       tcp_v4_rcv+0x181a/0x1bb0 net/ipv4/tcp_ipv4.c:1934
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       process_backlog+0x1d3/0x420 net/core/dev.c:5955
       napi_poll net/core/dev.c:6392 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6460
       __do_softirq+0x115/0x33f kernel/softirq.c:292
       run_ksoftirqd+0x46/0x60 kernel/softirq.c:603
       smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0cfaf03c
    • E
      inet: stop leaking jiffies on the wire · 07de7389
      Eric Dumazet committed
      [ Upstream commit a904a0693c189691eeee64f6c6b188bd7dc244e9 ]
      
      Historically, Linux tried to stick to RFC 791, 1122 and 2003
      for IPv4 ID field generation.
      
      RFC 6864 made clear that no matter how hard we try,
      we cannot ensure uniqueness of the IP ID within the maximum
      lifetime for all datagrams with a given source
      address/destination address/protocol tuple.
      
      Linux uses a per-socket inet generator (inet_id), initialized
      at connection startup with an XOR of 'jiffies' and other
      fields that appear in the clear on the wire.
      
      Thiemo Nagel pointed out that this strategy is a privacy
      concern, as it provides 16 bits of entropy to fingerprint
      devices.
      
      Let's switch to a random starting point, this is just as
      good as far as RFC 6864 is concerned and does not leak
      anything critical.
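      
      To illustrate why the XOR-based seed leaks information, here is a small
      sketch. The function names and the exact fields mixed in are assumptions
      made for this example and are not taken from the kernel source. The point
      is only that XOR with a value the observer can also see is trivially
      reversible.

```c
#include <assert.h>
#include <stdint.h>

/* Old-style seed: XOR of a timer value with a field that is visible
 * in the clear on the wire (hypothetical helper). */
static uint16_t demo_seed_from_jiffies(uint32_t wire_visible, uint32_t jiffies)
{
    return (uint16_t)(wire_visible ^ jiffies);
}

/* What an observer can do: undo the XOR with the field it also sees,
 * recovering the low 16 bits of the timer. */
static uint16_t demo_recover_jiffies_low16(uint16_t ip_id, uint32_t wire_visible)
{
    return (uint16_t)(ip_id ^ (uint16_t)wire_visible);
}
```

      A random starting point has no such inverse, which is why it does not
      expose the timer value.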
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Thiemo Nagel <tnagel@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      07de7389
    • X
      erspan: fix the tun_info options_len check for erspan · 163901dc
      Xin Long committed
      [ Upstream commit 2eb8d6d2910cfe3dc67dc056f26f3dd9c63d47cd ]
      
      The check for !md doesn't really work for ip_tunnel_info_opts(info), which
      only does info + 1 and so never returns NULL. Also, to avoid out-of-bounds
      access on info, it should ensure options_len is not less than the size of
      erspan_metadata in both erspan_xmit() and ip6erspan_tunnel_xmit().
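      
      The idea can be sketched in user space as follows. The struct layouts
      and the demo_* names are hypothetical simplifications, not the kernel's
      definitions. The sketch shows that a pointer computed as info + 1 is
      never NULL, so only a length check can tell whether metadata is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified metadata and tunnel-info layouts. */
struct demo_erspan_md {
    uint32_t version;
    uint32_t index;
};

struct demo_tun_info {
    uint8_t options_len;   /* number of option bytes stored after the struct */
};

/* Options live directly behind the info struct, so this is never NULL. */
static void *demo_info_opts(struct demo_tun_info *info)
{
    return info + 1;
}

/* The meaningful check: are there enough option bytes for the metadata? */
static int demo_md_usable(const struct demo_tun_info *info)
{
    return info->options_len >= sizeof(struct demo_erspan_md);
}
```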
      
      Fixes: 1a66a836 ("gre: add collect_md mode to ERSPAN tunnel")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      163901dc
    • E
      dccp: do not leak jiffies on the wire · 96df1ec2
      Eric Dumazet committed
      [ Upstream commit 3d1e5039f5f87a8731202ceca08764ee7cb010d3 ]
      
      For some reason I missed the case of DCCP passive
      flows in my previous patch.
      
      Fixes: a904a0693c18 ("inet: stop leaking jiffies on the wire")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Thiemo Nagel <tnagel@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      96df1ec2
    • V
      cxgb4: fix panic when attaching to ULD fail · f291613f
      Vishal Kulkarni committed
      [ Upstream commit fc89cc358fb64e2429aeae0f37906126636507ec ]
      
      Release resources when attaching to ULD fails. Otherwise, a data
      mismatch is seen between LLD and ULD later on, which leads to a
      kernel panic when accessing resources that should not even
      exist in the first place.
      
      Fixes: 94cdb8bb ("cxgb4: Add support for dynamic allocation of resources for ULD")
      Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com>
      Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f291613f
    • J
      nbd: handle racing with error'ed out commands · 1f032ca2
      Josef Bacik committed
      [ Upstream commit 7ce23e8e0a9cd38338fc8316ac5772666b565ca9 ]
      
      We hit the following warning in production
      
      print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700
      ------------[ cut here ]------------
      refcount_t: underflow; use-after-free.
      WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
      Workqueue: knbd-recv recv_work [nbd]
      RIP: 0010:refcount_sub_and_test_checked+0x53/0x60
      Call Trace:
       blk_mq_free_request+0xb7/0xf0
       blk_mq_complete_request+0x62/0xf0
       recv_work+0x29/0xa1 [nbd]
       process_one_work+0x1f5/0x3f0
       worker_thread+0x2d/0x3d0
       ? rescuer_thread+0x340/0x340
       kthread+0x111/0x130
       ? kthread_create_on_node+0x60/0x60
       ret_from_fork+0x1f/0x30
      ---[ end trace b079c3c67f98bb7c ]---
      
      This was preceded by us timing out everything and shutting down the
      sockets for the device.  The problem is we had a request in the queue at
      the same time, so we completed the request twice.  This can actually
      happen in a lot of cases: we fail to get a ref on our config, we only
      have one connection and just error out the command, etc.
      
      Fix this by checking cmd->status in nbd_read_stat.  We only change this
      under the cmd->lock, so we are safe to check this here and see if we've
      already error'ed this command out, which would indicate that we've
      completed it as well.
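      
      A rough user-space sketch of the pattern, assuming a POSIX mutex in
      place of the kernel's cmd->lock. The demo_cmd struct and both functions
      are hypothetical and only model the "check status under the lock before
      completing" idea, not the actual nbd code.

```c
#include <assert.h>
#include <pthread.h>

struct demo_cmd {
    pthread_mutex_t lock;
    int status;        /* non-zero once the command has been errored out */
    int completions;   /* how many times the request was completed */
};

/* Timeout/teardown path: error the command out and complete it once. */
static void demo_error_out(struct demo_cmd *cmd, int err)
{
    pthread_mutex_lock(&cmd->lock);
    if (cmd->status == 0) {
        cmd->status = err;
        cmd->completions++;
    }
    pthread_mutex_unlock(&cmd->lock);
}

/* Receive path: returns 1 if it completed the request, 0 if the command
 * had already been errored out (and therefore completed) elsewhere. */
static int demo_handle_reply(struct demo_cmd *cmd)
{
    int completed = 0;

    pthread_mutex_lock(&cmd->lock);
    if (cmd->status == 0) {
        cmd->completions++;
        completed = 1;
    }
    pthread_mutex_unlock(&cmd->lock);
    return completed;
}
```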
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1f032ca2
    • J
      nbd: protect cmd->status with cmd->lock · 82b7c99e
      Josef Bacik committed
      [ Upstream commit de6346ecbc8f5591ebd6c44ac164e8b8671d71d7 ]
      
      We already do this for the most part, except in timeout and clear_req.
      For the timeout case we take the lock after we grab a ref on the config,
      but that isn't really necessary because we're safe to touch the cmd at
      this point, so just move the order around.
      
      For the clear_req case, this is initiated by the user, so it is again safe.
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      82b7c99e
    • D
      cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs · 80b42f43
      Dave Wysochanski committed
      [ Upstream commit d46b0da7a33dd8c99d969834f682267a45444ab3 ]
      
      A deadlock is possible and can easily be seen with a test where
      multiple readers open/read/close the same file and a disruption
      occurs that causes a reconnect.  The deadlock is due to a reader
      thread inside cifs_strict_readv calling down_read and obtaining
      lock_sem, and then, after the reconnect, calling down_read a second
      time inside cifs_reopen_file.  If a down_write comes from another
      process in between the two down_read calls, deadlock occurs.
      
              CPU0                    CPU1
              ----                    ----
      cifs_strict_readv()
       down_read(&cifsi->lock_sem);
                                     _cifsFileInfo_put
                                        OR
                                     cifs_new_fileinfo
                                      down_write(&cifsi->lock_sem);
      cifs_reopen_file()
       down_read(&cifsi->lock_sem);
      
      Fix the above by changing all down_write(lock_sem) calls to
      down_write_trylock(lock_sem)/msleep() loop, which in turn
      makes the second down_read call benign since it will never
      block behind the writer while holding lock_sem.
      Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
      Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      80b42f43
    • A
      i2c: stm32f7: remove warning when compiling with W=1 · a7448991
      Alain Volmat committed
      [ Upstream commit 348e46fbb4cdb2aead79aee1fd8bb25ec5fd25db ]
      
      Remove the following warning:
      
      drivers/i2c/busses/i2c-stm32f7.c:315:
      warning: cannot understand function prototype:
      'struct stm32f7_i2c_spec i2c_specs[] =
      
      Replace a comment starting with /** by simply /* to avoid having
      it interpreted as a kernel-doc comment.
      
      Fixes: aeb068c5 ("i2c: i2c-stm32f7: add driver")
      Signed-off-by: Alain Volmat <alain.volmat@st.com>
      Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a7448991
    • F
      i2c: stm32f7: fix a race in slave mode with arbitration loss irq · 86fd9e33
      Fabrice Gasnier committed
      [ Upstream commit 6d6b0d0d5afc8c4c84b08261260ba11dfa5206f2 ]
      
      When in slave mode, an arbitration loss (ARLO) may be detected before the
      slave has had a chance to detect the stop condition (STOPF in ISR).
      This is seen when two master + slave adapters switch their roles. It
      causes the i2c bus to get stuck busy, as the SCL line is stretched:
      - The I2C_SLAVE_STOP event is never generated because the STOPF flag is
        set but doesn't generate an irq (race with the ARLO irq, STOPIE is
        masked). The STOPF flag remains set until the next master xfer (e.g.
        when the STOPIE irq gets unmasked). In this case, completion is
        generated too early: immediately upon a new transfer request (so it
        doesn't send all data).
      - Some data gets stuck in the TXDR register. As a consequence, the
        controller stretches the SCL line: the bus stays busy until a future
        master transfer triggers the bus busy / recovery mechanism (this can
        take time... and may never happen at all).
      
      So the choice is to let STOPF be detected by the slave isr handler,
      to properly handle this stop condition, i.e. don't mask IRQs in the
      error handler when the slave is running.
      
      Fixes: 60d609f3 ("i2c: i2c-stm32f7: Add slave support")
      Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
      Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      86fd9e33
    • F
      i2c: stm32f7: fix first byte to send in slave mode · d746ce64
      Fabrice Gasnier committed
      [ Upstream commit 02e64276c6dbcc4c5f39844f33d18180832a58f3 ]
      
      The slave-interface documentation [1] states "the bus driver should
      transmit the first byte" upon I2C_SLAVE_READ_REQUESTED slave event:
      - 'val': backend returns first byte to be sent
      The driver currently ignores the 1st byte to send on this event.
      
      [1] https://www.kernel.org/doc/Documentation/i2c/slave-interface
      
      Fixes: 60d609f3 ("i2c: i2c-stm32f7: Add slave support")
      Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
      Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d746ce64
    • Z
      irqchip/gic-v3-its: Use the exact ITSList for VMOVP · 18e7fae3
      Zenghui Yu committed
      [ Upstream commit 8424312516e5d9baeeb0a95d0e4523579b7aa395 ]
      
      On a system without Single VMOVP support (say GITS_TYPER.VMOVP == 0),
      we will map vPEs only on ITSs that will actually control interrupts
      for the given VM.  And when moving a vPE, the VMOVP command will be
      issued only for those ITSs.
      
      But when issuing VMOVPs, we seem to fail to present the exact ITSList
      to the ITSs that are actually included in the synchronization operation.
      The its_list_map we're currently using includes all ITSs in the system,
      even though some of them don't have the corresponding vPE mapping at all.
      
      Introduce get_its_list() to get the per-VM its_list_map, which indicates
      which ITSs have vPE mappings for the given VM, and use this map as
      the expected ITSList when building VMOVP. This is hopefully a performance
      gain, as it avoids synchronizing with the ITSs that are not involved.
      Also initialize the whole command descriptor to zero at the beginning, since
      the seq_num and its_list should be RES0 when GITS_TYPER.VMOVP == 1.
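      
      As an illustration of building a per-VM map, the sketch below sets one
      bit per ITS that has mappings for the VM. The demo_vm struct, its
      vlpi_count field and the 16-entry limit are assumptions made for this
      example. The commit message does not show how the kernel tracks the
      mappings.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_ITS 16   /* assumed width of the ITSList bitmap */

/* Hypothetical per-VM bookkeeping: mappings held on each ITS. */
struct demo_vm {
    unsigned int vlpi_count[DEMO_MAX_ITS];
};

/* Build a bitmap with bit N set only if ITS number N has mappings
 * for this VM, instead of a bitmap covering every ITS in the system. */
static uint16_t demo_get_its_list(const struct demo_vm *vm)
{
    uint16_t its_list = 0;

    for (unsigned int nr = 0; nr < DEMO_MAX_ITS; nr++) {
        if (vm->vlpi_count[nr])
            its_list |= (uint16_t)(1u << nr);
    }
    return its_list;
}
```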
      Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/1571802386-2680-1-git-send-email-yuzenghui@huawei.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      18e7fae3
    • J
      MIPS: bmips: mark exception vectors as char arrays · 39637aaf
      Jonas Gorski committed
      [ Upstream commit e4f5cb1a9b27c0f94ef4f5a0178a3fde2d3d0e9e ]
      
      The vectors span more than one byte, so mark them as arrays.
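      
      A small sketch of the declaration style described above. The symbol
      demo_movevec and the helper are hypothetical. In the kernel the vector
      is provided elsewhere, whereas here it is defined in the same file so
      the example is self-contained.

```c
#include <assert.h>
#include <string.h>

#define DEMO_VEC_LEN 0x20

/* Declared as `extern char sym;` the compiler assumes a 1-byte object and
 * may reject a 0x20-byte memcpy from it. An array of unknown bound carries
 * no such size limit. */
extern char demo_movevec[];

/* Copy the whole vector to its destination. */
static void demo_install_vector(void *dst)
{
    memcpy(dst, demo_movevec, DEMO_VEC_LEN);
}

/* Definition for the sketch only; remaining bytes are zero-initialized. */
char demo_movevec[DEMO_VEC_LEN] = { 0x11, 0x22 };
```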
      
      Fixes the following build error when building with GCC 8.3:
      
      In file included from ./include/linux/string.h:19,
                       from ./include/linux/bitmap.h:9,
                       from ./include/linux/cpumask.h:12,
                       from ./arch/mips/include/asm/processor.h:15,
                       from ./arch/mips/include/asm/thread_info.h:16,
                       from ./include/linux/thread_info.h:38,
                       from ./include/asm-generic/preempt.h:5,
                       from ./arch/mips/include/generated/asm/preempt.h:1,
                       from ./include/linux/preempt.h:81,
                       from ./include/linux/spinlock.h:51,
                       from ./include/linux/mmzone.h:8,
                       from ./include/linux/bootmem.h:8,
                       from arch/mips/bcm63xx/prom.c:10:
      arch/mips/bcm63xx/prom.c: In function 'prom_init':
      ./arch/mips/include/asm/string.h:162:11: error: '__builtin_memcpy' forming offset [2, 32] is out of the bounds [0, 1] of object 'bmips_smp_movevec' with type 'char' [-Werror=array-bounds]
         __ret = __builtin_memcpy((dst), (src), __len); \
                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      arch/mips/bcm63xx/prom.c:97:3: note: in expansion of macro 'memcpy'
         memcpy((void *)0xa0000200, &bmips_smp_movevec, 0x20);
         ^~~~~~
      In file included from arch/mips/bcm63xx/prom.c:14:
      ./arch/mips/include/asm/bmips.h:80:13: note: 'bmips_smp_movevec' declared here
       extern char bmips_smp_movevec;
      
      Fixes: 18a1eef9 ("MIPS: BMIPS: Introduce bmips.h")
      Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Paul Burton <paulburton@kernel.org>
      Cc: linux-mips@vger.kernel.org
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      39637aaf
    • N
      of: unittest: fix memory leak in unittest_data_add · fcc3f7c8
      Navid Emamdoost committed
      [ Upstream commit e13de8fe0d6a51341671bbe384826d527afe8d44 ]
      
      In unittest_data_add, a copy buffer is created via kmemdup. This buffer
      is leaked if of_fdt_unflatten_tree fails. The release for the
      unittest_data buffer is added.
      
      Fixes: b951f9dc ("Enabling OF selftest to run without machine's devicetree")
      Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
      Reviewed-by: Frank Rowand <frowand.list@gmail.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fcc3f7c8
    • A
      ARM: 8926/1: v7m: remove register save to stack before svc · c56b9da7
      afzal mohammed committed
      [ Upstream commit 2ecb287998a47cc0a766f6071f63bc185f338540 ]
      
      r0-r3 & r12 registers are saved & restored, before & after svc
      respectively. Intention was to preserve those registers across thread to
      handler mode switch.
      
      On v7-M, hardware saves the register context upon exception in an AAPCS
      compliant way. Restoring r0-r3 & r12 is done from the stack location where
      hardware saves them, not from the location on the stack where these
      registers were saved manually.
      
      To clarify, on stm32f429 discovery board:
      
      1. before svc, sp - 0x90009ff8
      2. r0-r3,r12 saved to 0x90009ff8 - 0x9000a00b
      3. upon svc, h/w decrements sp by 32 & pushes registers onto stack
      4. after svc,  sp - 0x90009fd8
      5. r0-r3,r12 restored from 0x90009fd8 - 0x90009feb
      
      Above means r0-r3,r12 is not restored from the location where they are
      saved, but since hardware pushes the registers onto stack, the registers
      are restored correctly.
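      
      The address arithmetic in the steps above can be checked with a short
      sketch. The helper names are hypothetical. The constants (five 4-byte
      registers, a 32-byte hardware frame) are taken from the numbers quoted
      in this message.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_REGS  5u    /* r0-r3 and r12 */
#define DEMO_REG_BYTES 4u
#define DEMO_HW_FRAME  32u   /* bytes the hardware pushes on exception entry */

/* Last byte covered by five registers stored upward from the given address. */
static uint32_t demo_reg_block_end(uint32_t start)
{
    return start + DEMO_NUM_REGS * DEMO_REG_BYTES - 1u;
}

/* Stack pointer after the hardware has pushed its exception frame. */
static uint32_t demo_sp_after_exception(uint32_t sp)
{
    return sp - DEMO_HW_FRAME;
}
```

      With sp at 0x90009ff8, the manual save ends at 0x9000a00b, which is
      past 0x9000a000, while the restore reads 0x90009fd8 to 0x90009feb.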
      
      Note that during register saving to stack (step 2), it goes past
      0x9000a000. And it seems, based on objdump, there are global symbols
      residing there, and it perhaps can cause issues on a non-XIP Kernel
      (on XIP, data section is setup later).
      
      Based on the analysis above, manually saving registers onto stack is at
      best no-op and at worst can cause data section corruption. Hence remove
      storing of registers onto stack before svc.
      
      Fixes: b70cd406 ("ARM: 8671/1: V7M: Preserve registers across switch from Thread to Handler mode")
      Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com>
      Acked-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c56b9da7
    • Z
      tracing: Fix "gfp_t" format for synthetic events · fa18f803
      Zhengjun Xing committed
      [ Upstream commit 9fa8c9c647be624e91b09ecffa7cd97ee0600b40 ]
      
      In the format of synthetic events, the "gfp_t" is shown as "signed:1",
      but "gfp_t" is in fact unsigned and should be shown as "signed:0".
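      
      One way such a signedness lookup could work is sketched below. The
      function name and the "types starting with u are unsigned" rule are
      assumptions for illustration and are not quoted from the tracing code.
      The sketch only shows why a type name like gfp_t needs an explicit
      special case.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: decide the "signed:" value from a type name. */
static int demo_field_is_signed(const char *type)
{
    /* u8, u64, "unsigned int", ... all start with 'u'. */
    if (strncmp(type, "u", 1) == 0)
        return 0;
    /* gfp_t is unsigned but does not start with 'u'. */
    if (strcmp(type, "gfp_t") == 0)
        return 0;
    return 1;
}
```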
      
      The issue can be reproduced by the following commands:
      
      echo 'memlatency u64 lat; unsigned int order; gfp_t gfp_flags; int migratetype' > /sys/kernel/debug/tracing/synthetic_events
      cat  /sys/kernel/debug/tracing/events/synthetic/memlatency/format
      
      name: memlatency
      ID: 2233
      format:
              field:unsigned short common_type;       offset:0;       size:2; signed:0;
              field:unsigned char common_flags;       offset:2;       size:1; signed:0;
              field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
              field:int common_pid;   offset:4;       size:4; signed:1;
      
              field:u64 lat;  offset:8;       size:8; signed:0;
              field:unsigned int order;       offset:16;      size:4; signed:0;
              field:gfp_t gfp_flags;  offset:24;      size:4; signed:1;
              field:int migratetype;  offset:32;      size:4; signed:1;
      
      print fmt: "lat=%llu, order=%u, gfp_flags=%x, migratetype=%d", REC->lat, REC->order, REC->gfp_flags, REC->migratetype
      
      Link: http://lkml.kernel.org/r/20191018012034.6404-1-zhengjun.xing@linux.intel.com
      Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fa18f803
    • B
      scsi: target: core: Do not overwrite CDB byte 1 · 63571a1f
      Bodo Stroesser committed
      [ Upstream commit 27e84243cb63601a10e366afe3e2d05bb03c1cb5 ]
      
      passthrough_parse_cdb() - used by TCMU and PSCSI - attempts to reset the LUN
      field of SCSI-2 CDBs (bits 5,6,7 of byte 1).  The current code is wrong as
      for newer commands not having the LUN field it overwrites relevant command
      bits (e.g. for SECURITY PROTOCOL IN / OUT). We think this code was
      unnecessary from the beginning or at least it is no longer useful. So we
      remove it entirely.
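      
      The effect of clearing bits 5, 6 and 7 of byte 1 can be shown with a
      tiny sketch. The helper name is hypothetical, and the example value 0xec
      for byte 1 is chosen arbitrarily to have its top bits set.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SECURITY_PROTOCOL_IN 0xa2   /* opcode of SECURITY PROTOCOL IN */

/* Clear the SCSI-2 LUN field: bits 5, 6 and 7 of CDB byte 1. */
static void demo_clear_scsi2_lun(uint8_t *cdb)
{
    cdb[1] &= 0x1f;
}
```

      For a command that uses all of byte 1 for something else, the mask
      silently changes the request: 0xec becomes 0x0c.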
      
      Link: https://lore.kernel.org/r/12498eab-76fd-eaad-1316-c2827badb76a@ts.fujitsu.com
      Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      63571a1f
    • C
      drm/amdgpu: fix potential VM faults · 1df8da33
      Christian König committed
      [ Upstream commit 3122051edc7c27cc08534be730f4c7c180919b8a ]
      
      When we allocate new page tables under memory
      pressure we should not evict old ones.
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1df8da33
    • P
      ARM: davinci: dm365: Fix McBSP dma_slave_map entry · 3cd2b649
      Peter Ujfalusi committed
      [ Upstream commit 564b6bb9d42d31fc80c006658cf38940a9b99616 ]
      
      dm365 has only a single McBSP, so the device name has no .0 suffix.
      
      Fixes: 0c750e1f ("ARM: davinci: dm365: Add dma_slave_map to edma")
      Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: Sekhar Nori <nsekhar@ti.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3cd2b649
    • Y
      perf kmem: Fix memory leak in compact_gfp_flags() · e18bf407
      Yunfeng Ye committed
      [ Upstream commit 1abecfcaa7bba21c9985e0136fa49836164dd8fd ]
      
      The memory @orig_flags is allocated by strdup(). It is freed on the
      normal path, but leaked on the error path.
      
      Fix this by adding free(orig_flags) on the error path.
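      
      The shape of such a fix can be sketched as follows. demo_compact() and
      demo_dup() are hypothetical helpers, not the perf code. They only show a
      working copy being released on both the normal path and the allocation
      failure path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for strdup(). */
static char *demo_dup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p)
        memcpy(p, s, n);
    return p;
}

/* Join the '|'-separated tokens of flags with ',' in a fresh buffer.
 * Returns NULL on allocation failure or if there are no tokens. */
static char *demo_compact(const char *flags)
{
    char *orig = demo_dup(flags);   /* working copy, modified by strtok() */
    char *out = NULL;
    size_t len = 0;

    if (orig == NULL)
        return NULL;

    for (char *tok = strtok(orig, "|"); tok; tok = strtok(NULL, "|")) {
        size_t tlen = strlen(tok);
        char *tmp = realloc(out, len + tlen + 2);

        if (tmp == NULL) {
            free(out);
            free(orig);   /* the release the error path must not forget */
            return NULL;
        }
        out = tmp;
        if (len)
            out[len++] = ',';
        memcpy(out + len, tok, tlen + 1);
        len += tlen;
    }

    free(orig);           /* normal path */
    return out;
}
```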
      
      Fixes: 0e111156 ("perf kmem: Print gfp flags in human readable string")
      Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Feilong Lin <linfeilong@huawei.com>
      Cc: Hu Shiyuan <hushiyuan@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/f9e9f458-96f3-4a97-a1d5-9feec2420e07@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e18bf407
    • C
      8250-men-mcb: fix error checking when get_num_ports returns -ENODEV · 05dd6283
      Colin Ian King committed
      [ Upstream commit f50b6805dbb993152025ec04dea094c40cc93a0c ]
      
      The current check for failure on the number of ports does not work when
      -ENODEV is returned from the call to get_num_ports. Fix this by making
      num_ports and the loop counter i signed rather than unsigned ints. Also
      add a check for num_ports being less than zero to catch negative error
      returns.
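      
      The class of bug can be reproduced in a few lines. All names here are
      hypothetical and the exact condition used by the driver is not shown in
      this message. The sketch only demonstrates that a negative errno stored
      in an unsigned variable no longer looks like a failure.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in: a port count, or -ENODEV for unknown hardware. */
static int demo_get_num_ports(int known_device)
{
    return known_device ? 2 : -ENODEV;
}

/* Buggy variant: -ENODEV wraps to a large positive value, so the
 * "no ports" test never fires and 0 (success) is returned. */
static int demo_probe_unsigned(int known_device)
{
    unsigned int num_ports = (unsigned int)demo_get_num_ports(known_device);

    if (num_ports == 0)
        return -ENODEV;
    return 0;
}

/* Fixed variant: a signed count lets negative error codes be detected. */
static int demo_probe_signed(int known_device)
{
    int num_ports = demo_get_num_ports(known_device);

    if (num_ports <= 0)
        return num_ports ? num_ports : -ENODEV;
    return 0;
}
```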
      
      Addresses-Coverity: ("Unsigned compared against 0")
      Fixes: e2fea54e ("8250-men-mcb: add support for 16z025 and 16z057")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Michael Moese <mmoese@suse.de>
      Link: https://lore.kernel.org/r/20191013220016.9369-1-colin.king@canonical.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      05dd6283