1. 04 6月, 2019 1 次提交
    • E
      net-gro: fix use-after-free read in napi_gro_frags() · 39fd0dc4
      Eric Dumazet 提交于
      [ Upstream commit a4270d6795b0580287453ea55974d948393e66ef ]
      
      If a network driver provides to napi_gro_frags() an
      skb with a page fragment of exactly 14 bytes, the call
      to gro_pull_from_frag0() will 'consume' the fragment
      by calling skb_frag_unref(skb, 0), and the page might
      be freed and reused.
      
      Reading eth->h_proto at the end of napi_frags_skb() might
      read mangled data, or crash under specific debugging features.
      
      BUG: KASAN: use-after-free in napi_frags_skb net/core/dev.c:5833 [inline]
      BUG: KASAN: use-after-free in napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841
      Read of size 2 at addr ffff88809366840c by task syz-executor599/8957
      
      CPU: 1 PID: 8957 Comm: syz-executor599 Not tainted 5.2.0-rc1+ #32
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
       __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       kasan_report+0x12/0x20 mm/kasan/common.c:614
       __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:142
       napi_frags_skb net/core/dev.c:5833 [inline]
       napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841
       tun_get_user+0x2f3c/0x3ff0 drivers/net/tun.c:1991
       tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2037
       call_write_iter include/linux/fs.h:1872 [inline]
       do_iter_readv_writev+0x5f8/0x8f0 fs/read_write.c:693
       do_iter_write fs/read_write.c:970 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:951
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1015
       do_writev+0x15b/0x330 fs/read_write.c:1058
      
      Fixes: a50e233c ("net-gro: restore frag0 optimization")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      39fd0dc4
  2. 26 5月, 2019 1 次提交
  3. 27 4月, 2019 1 次提交
    • S
      failover: allow name change on IFF_UP slave interfaces · fae6053d
      Si-Wei Liu 提交于
      [ Upstream commit 8065a779f17e94536a1c4dcee4f9d88011672f97 ]
      
      When a netdev appears through hot plug then gets enslaved by a failover
      master that is already up and running, the slave will be opened
      right away after getting enslaved. Today there's a race that userspace
      (udev) may fail to rename the slave if the kernel (net_failover)
      opens the slave earlier than when the userspace rename happens.
      Unlike bond or team, the primary slave of failover can't be renamed by
      userspace ahead of time, since the kernel initiated auto-enslavement is
      unable to, or rather, is never meant to be synchronized with the rename
      request from userspace.
      
      As the failover slave interfaces are not designed to be operated
      directly by userspace apps: IP configuration, filter rules with
      regard to network traffic passing and etc., should all be done on master
      interface. In general, userspace apps only care about the
      name of master interface, while slave names are less important as long
      as admin users can see reliable names that may carry
      other information describing the netdev. For e.g., they can infer that
      "ens3nsby" is a standby slave of "ens3", while for a
      name like "eth0" they can't tell which master it belongs to.
      
      Historically the name of IFF_UP interface can't be changed because
      there might be admin script or management software that is already
      relying on such behavior and assumes that the slave name can't be
      changed once UP. But failover is special: with the in-kernel
      auto-enslavement mechanism, the userspace expectation for device
      enumeration and bring-up order is already broken. Previously initramfs
      and various userspace config tools were modified to bypass failover
      slaves because of auto-enslavement and duplicate MAC address. Similarly,
      in case that users care about seeing reliable slave name, the new type
      of failover slaves needs to be taken care of specifically in userspace
      anyway.
      
      It's less risky to lift up the rename restriction on failover slave
      which is already UP. Although it's possible this change may potentially
      break userspace component (most likely configuration scripts or
      management software) that assumes slave name can't be changed while
      UP, it's relatively a limited and controllable set among all userspace
      components, which can be fixed specifically to listen for the rename
      events on failover slaves. Userspace component interacting with slaves
      is expected to be changed to operate on failover master interface
      instead, as the failover slave is dynamic in nature which may come and
      go at any point.  The goal is to make the role of failover slaves less
      relevant, and userspace components should only deal with failover master
      in the long run.
      
      Fixes: 30c8bd5a ("net: Introduce generic failover module")
      Signed-off-by: NSi-Wei Liu <si-wei.liu@oracle.com>
      Reviewed-by: NLiran Alon <liran.alon@oracle.com>
      Acked-by: NSridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fae6053d
  4. 17 4月, 2019 1 次提交
    • A
      net: core: netif_receive_skb_list: unlist skb before passing to pt->func · 55a7f7b2
      Alexander Lobakin 提交于
      [ Upstream commit 9a5a90d167b0e5fe3d47af16b68fd09ce64085cd ]
      
      __netif_receive_skb_list_ptype() leaves skb->next poisoned before passing
      it to pt_prev->func handler, what may produce (in certain cases, e.g. DSA
      setup) crashes like:
      
      [ 88.606777] CPU 0 Unable to handle kernel paging request at virtual address 0000000e, epc == 80687078, ra == 8052cc7c
      [ 88.618666] Oops[#1]:
      [ 88.621196] CPU: 0 PID: 0 Comm: swapper Not tainted 5.1.0-rc2-dlink-00206-g4192a172-dirty #1473
      [ 88.630885] $ 0 : 00000000 10000400 00000002 864d7850
      [ 88.636709] $ 4 : 87c0ddf0 864d7800 87c0ddf0 00000000
      [ 88.642526] $ 8 : 00000000 49600000 00000001 00000001
      [ 88.648342] $12 : 00000000 c288617b dadbee27 25d17c41
      [ 88.654159] $16 : 87c0ddf0 85cff080 80790000 fffffffd
      [ 88.659975] $20 : 80797b20 ffffffff 00000001 864d7800
      [ 88.665793] $24 : 00000000 8011e658
      [ 88.671609] $28 : 80790000 87c0dbc0 87cabf00 8052cc7c
      [ 88.677427] Hi : 00000003
      [ 88.680622] Lo : 7b5b4220
      [ 88.683840] epc : 80687078 vlan_dev_hard_start_xmit+0x1c/0x1a0
      [ 88.690532] ra : 8052cc7c dev_hard_start_xmit+0xac/0x188
      [ 88.696734] Status: 10000404	IEp
      [ 88.700422] Cause : 50000008 (ExcCode 02)
      [ 88.704874] BadVA : 0000000e
      [ 88.708069] PrId : 0001a120 (MIPS interAptiv (multi))
      [ 88.713005] Modules linked in:
      [ 88.716407] Process swapper (pid: 0, threadinfo=(ptrval), task=(ptrval), tls=00000000)
      [ 88.725219] Stack : 85f61c28 00000000 0000000e 80780000 87c0ddf0 85cff080 80790000 8052cc7c
      [ 88.734529] 87cabf00 00000000 00000001 85f5fb40 807b0000 864d7850 87cabf00 807d0000
      [ 88.743839] 864d7800 8655f600 00000000 85cff080 87c1c000 0000006a 00000000 8052d96c
      [ 88.753149] 807a0000 8057adb8 87c0dcc8 87c0dc50 85cfff08 00000558 87cabf00 85f58c50
      [ 88.762460] 00000002 85f58c00 864d7800 80543308 fffffff4 00000001 85f58c00 864d7800
      [ 88.771770] ...
      [ 88.774483] Call Trace:
      [ 88.777199] [<80687078>] vlan_dev_hard_start_xmit+0x1c/0x1a0
      [ 88.783504] [<8052cc7c>] dev_hard_start_xmit+0xac/0x188
      [ 88.789326] [<8052d96c>] __dev_queue_xmit+0x6e8/0x7d4
      [ 88.794955] [<805a8640>] ip_finish_output2+0x238/0x4d0
      [ 88.800677] [<805ab6a0>] ip_output+0xc8/0x140
      [ 88.805526] [<805a68f4>] ip_forward+0x364/0x560
      [ 88.810567] [<805a4ff8>] ip_rcv+0x48/0xe4
      [ 88.815030] [<80528d44>] __netif_receive_skb_one_core+0x44/0x58
      [ 88.821635] [<8067f220>] dsa_switch_rcv+0x108/0x1ac
      [ 88.827067] [<80528f80>] __netif_receive_skb_list_core+0x228/0x26c
      [ 88.833951] [<8052ed84>] netif_receive_skb_list+0x1d4/0x394
      [ 88.840160] [<80355a88>] lunar_rx_poll+0x38c/0x828
      [ 88.845496] [<8052fa78>] net_rx_action+0x14c/0x3cc
      [ 88.850835] [<806ad300>] __do_softirq+0x178/0x338
      [ 88.856077] [<8012a2d4>] irq_exit+0xbc/0x100
      [ 88.860846] [<802f8b70>] plat_irq_dispatch+0xc0/0x144
      [ 88.866477] [<80105974>] handle_int+0x14c/0x158
      [ 88.871516] [<806acfb0>] r4k_wait+0x30/0x40
      [ 88.876462] Code: afb10014 8c8200a0 00803025 <9443000c> 94a20468 00000000 10620042 00a08025 9605046a
      [ 88.887332]
      [ 88.888982] ---[ end trace eb863d007da11cf1 ]---
      [ 88.894122] Kernel panic - not syncing: Fatal exception in interrupt
      [ 88.901202] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      Fix this by pulling skb off the sublist and zeroing skb->next pointer
      before calling ptype callback.
      
      Fixes: 88eb1944 ("net: core: propagate SKB lists through packet_type lookup")
      Reviewed-by: NEdward Cree <ecree@solarflare.com>
      Signed-off-by: NAlexander Lobakin <alobakin@dlink.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      55a7f7b2
  5. 23 2月, 2019 1 次提交
    • H
      net: Fix for_each_netdev_feature on Big endian · b2975c2e
      Hauke Mehrtens 提交于
      [ Upstream commit 3b89ea9c5902acccdbbdec307c85edd1bf52515e ]
      
      The features attribute is of type u64 and stored in the native endianes on
      the system. The for_each_set_bit() macro takes a pointer to a 32 bit array
      and goes over the bits in this area. On little Endian systems this also
      works with an u64 as the most significant bit is on the highest address,
      but on big endian the words are swapped. When we expect bit 15 here we get
      bit 47 (15 + 32).
      
      This patch converts it more or less to its own for_each_set_bit()
      implementation which works on 64 bit integers directly. This is then
      completely in host endianness and should work like expected.
      
      Fixes: fd867d51 ("net/core: generic support for disabling netdev features down stack")
      Signed-off-by: NHauke Mehrtens <hauke.mehrtens@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      b2975c2e
  6. 07 2月, 2019 1 次提交
  7. 17 12月, 2018 3 次提交
    • S
      net: fix XPS static_key accounting · 9ac60749
      Sabrina Dubroca 提交于
      [ Upstream commit 867d0ad476db89a1e8af3f297af402399a54eea5 ]
      
      Commit 04157469 ("net: Use static_key for XPS maps") introduced a
      static key for XPS, but the increments/decrements don't match.
      
      First, the static key's counter is incremented once for each queue, but
      only decremented once for a whole batch of queues, leading to large
      unbalances.
      
      Second, the xps_rxqs_needed key is decremented whenever we reset a batch
      of queues, whether they had any rxqs mapping or not, so that if we setup
      cpu-XPS on em1 and RXQS-XPS on em2, resetting the queues on em1 would
      decrement the xps_rxqs_needed key.
      
      This reworks the accounting scheme so that the xps_needed key is
      incremented only once for each type of XPS for all the queues on a
      device, and the xps_rxqs_needed key is incremented only once for all
      queues. This is sufficient to let us retrieve queues via
      get_xps_queue().
      
      This patch introduces a new reset_xps_maps(), which reinitializes and
      frees the appropriate map (xps_rxqs_map or xps_cpus_map), and drops a
      reference to the needed keys:
       - both xps_needed and xps_rxqs_needed, in case of rxqs maps,
       - only xps_needed, in case of CPU maps.
      
      Now, we also need to call reset_xps_maps() at the end of
      __netif_set_xps_queue() when there's no active map left, for example
      when writing '00000000,00000000' to all queues' xps_rxqs setting.
      
      Fixes: 04157469 ("net: Use static_key for XPS maps")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ac60749
    • S
      net: restore call to netdev_queue_numa_node_write when resetting XPS · b4b8a71c
      Sabrina Dubroca 提交于
      [ Upstream commit f28c020fb488e1a8b87469812017044bef88aa2b ]
      
      Before commit 80d19669 ("net: Refactor XPS for CPUs and Rx queues"),
      netif_reset_xps_queues() did netdev_queue_numa_node_write() for all the
      queues being reset. Now, this is only done when the "active" variable in
      clean_xps_maps() is false, ie when on all the CPUs, there's no active
      XPS mapping left.
      
      Fixes: 80d19669 ("net: Refactor XPS for CPUs and Rx queues")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4b8a71c
    • E
      net: use skb_list_del_init() to remove from RX sublists · 7fafda16
      Edward Cree 提交于
      [ Upstream commit 22f6bbb7bcfcef0b373b0502a7ff390275c575dd ]
      
      list_del() leaves the skb->next pointer poisoned, which can then lead to
       a crash in e.g. OVS forwarding.  For example, setting up an OVS VXLAN
       forwarding bridge on sfc as per:
      
      ========
      $ ovs-vsctl show
      5dfd9c47-f04b-4aaa-aa96-4fbb0a522a30
          Bridge "br0"
              Port "br0"
                  Interface "br0"
                      type: internal
              Port "enp6s0f0"
                  Interface "enp6s0f0"
              Port "vxlan0"
                  Interface "vxlan0"
                      type: vxlan
                      options: {key="1", local_ip="10.0.0.5", remote_ip="10.0.0.4"}
          ovs_version: "2.5.0"
      ========
      (where 10.0.0.5 is an address on enp6s0f1)
      and sending traffic across it will lead to the following panic:
      ========
      general protection fault: 0000 [#1] SMP PTI
      CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.20.0-rc3-ehc+ #701
      Hardware name: Dell Inc. PowerEdge R710/0M233H, BIOS 6.4.0 07/23/2013
      RIP: 0010:dev_hard_start_xmit+0x38/0x200
      Code: 53 48 89 fb 48 83 ec 20 48 85 ff 48 89 54 24 08 48 89 4c 24 18 0f 84 ab 01 00 00 48 8d 86 90 00 00 00 48 89 f5 48 89 44 24 10 <4c> 8b 33 48 c7 03 00 00 00 00 48 8b 05 c7 d1 b3 00 4d 85 f6 0f 95
      RSP: 0018:ffff888627b437e0 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: dead000000000100 RCX: ffff88862279c000
      RDX: ffff888614a342c0 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: ffff888618a88000 R08: 0000000000000001 R09: 00000000000003e8
      R10: 0000000000000000 R11: ffff888614a34140 R12: 0000000000000000
      R13: 0000000000000062 R14: dead000000000100 R15: ffff888616430000
      FS:  0000000000000000(0000) GS:ffff888627b40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f6d2bc6d000 CR3: 000000000200a000 CR4: 00000000000006e0
      Call Trace:
       <IRQ>
       __dev_queue_xmit+0x623/0x870
       ? masked_flow_lookup+0xf7/0x220 [openvswitch]
       ? ep_poll_callback+0x101/0x310
       do_execute_actions+0xaba/0xaf0 [openvswitch]
       ? __wake_up_common+0x8a/0x150
       ? __wake_up_common_lock+0x87/0xc0
       ? queue_userspace_packet+0x31c/0x5b0 [openvswitch]
       ovs_execute_actions+0x47/0x120 [openvswitch]
       ovs_dp_process_packet+0x7d/0x110 [openvswitch]
       ovs_vport_receive+0x6e/0xd0 [openvswitch]
       ? dst_alloc+0x64/0x90
       ? rt_dst_alloc+0x50/0xd0
       ? ip_route_input_slow+0x19a/0x9a0
       ? __udp_enqueue_schedule_skb+0x198/0x1b0
       ? __udp4_lib_rcv+0x856/0xa30
       ? __udp4_lib_rcv+0x856/0xa30
       ? cpumask_next_and+0x19/0x20
       ? find_busiest_group+0x12d/0xcd0
       netdev_frame_hook+0xce/0x150 [openvswitch]
       __netif_receive_skb_core+0x205/0xae0
       __netif_receive_skb_list_core+0x11e/0x220
       netif_receive_skb_list+0x203/0x460
       ? __efx_rx_packet+0x335/0x5e0 [sfc]
       efx_poll+0x182/0x320 [sfc]
       net_rx_action+0x294/0x3c0
       __do_softirq+0xca/0x297
       irq_exit+0xa6/0xb0
       do_IRQ+0x54/0xd0
       common_interrupt+0xf/0xf
       </IRQ>
      ========
      So, in all listified-receive handling, instead pull skbs off the lists with
       skb_list_del_init().
      
      Fixes: 9af86f93 ("net: core: fix use-after-free in __netif_receive_skb_list_core")
      Fixes: 7da517a3 ("net: core: Another step of skb receive list processing")
      Fixes: a4ca8b7d ("net: ipv4: fix drop handling in ip_list_rcv() and ip_list_rcv_finish()")
      Fixes: d8269e2c ("net: ipv6: listify ipv6_rcv() and ip6_rcv_finish()")
      Signed-off-by: NEdward Cree <ecree@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7fafda16
  8. 06 12月, 2018 1 次提交
  9. 23 11月, 2018 1 次提交
    • E
      net-gro: reset skb->pkt_type in napi_reuse_skb() · a21a82a9
      Eric Dumazet 提交于
      [ Upstream commit 33d9a2c72f086cbf1087b2fd2d1a15aa9df14a7f ]
      
      eth_type_trans() assumes initial value for skb->pkt_type
      is PACKET_HOST.
      
      This is indeed the value right after a fresh skb allocation.
      
      However, it is possible that GRO merged a packet with a different
      value (like PACKET_OTHERHOST in case macvlan is used), so
      we need to make sure napi->skb will have pkt_type set back to
      PACKET_HOST.
      
      Otherwise, valid packets might be dropped by the stack because
      their pkt_type is not PACKET_HOST.
      
      napi_reuse_skb() was added in commit 96e93eab ("gro: Add
      internal interfaces for VLAN"), but this bug always has
      been there.
      
      Fixes: 96e93eab ("gro: Add internal interfaces for VLAN")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a21a82a9
  10. 04 11月, 2018 1 次提交
  11. 11 10月, 2018 1 次提交
    • S
      net: ipv4: update fnhe_pmtu when first hop's MTU changes · af7d6cce
      Sabrina Dubroca 提交于
      Since commit 5aad1de5 ("ipv4: use separate genid for next hop
      exceptions"), exceptions get deprecated separately from cached
      routes. In particular, administrative changes don't clear PMTU anymore.
      
      As Stefano described in commit e9fa1495 ("ipv6: Reflect MTU changes
      on PMTU of exceptions for MTU-less routes"), the PMTU discovered before
      the local MTU change can become stale:
       - if the local MTU is now lower than the PMTU, that PMTU is now
         incorrect
       - if the local MTU was the lowest value in the path, and is increased,
         we might discover a higher PMTU
      
      Similarly to what commit e9fa1495 did for IPv6, update PMTU in those
      cases.
      
      If the exception was locked, the discovered PMTU was smaller than the
      minimal accepted PMTU. In that case, if the new local MTU is smaller
      than the current PMTU, let PMTU discovery figure out if locking of the
      exception is still needed.
      
      To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU
      notifier. By the time the notifier is called, dev->mtu has been
      changed. This patch adds the old MTU as additional information in the
      notifier structure, and a new call_netdevice_notifiers_u32() function.
      
      Fixes: 5aad1de5 ("ipv4: use separate genid for next hop exceptions")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af7d6cce
  12. 30 8月, 2018 1 次提交
  13. 10 8月, 2018 1 次提交
    • A
      net: allow to call netif_reset_xps_queues() under cpus_read_lock · 4d99f660
      Andrei Vagin 提交于
      The definition of static_key_slow_inc() has cpus_read_lock in place. In the
      virtio_net driver, XPS queues are initialized after setting the queue:cpu
      affinity in virtnet_set_affinity() which is already protected within
      cpus_read_lock. Lockdep prints a warning when we are trying to acquire
      cpus_read_lock when it is already held.
      
      This patch adds an ability to call __netif_set_xps_queue under
      cpus_read_lock().
      Acked-by: NJason Wang <jasowang@redhat.com>
      
      ============================================
      WARNING: possible recursive locking detected
      4.18.0-rc3-next-20180703+ #1 Not tainted
      --------------------------------------------
      swapper/0/1 is trying to acquire lock:
      00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: static_key_slow_inc+0xe/0x20
      
      but task is already holding lock:
      00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(cpu_hotplug_lock.rw_sem);
        lock(cpu_hotplug_lock.rw_sem);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      3 locks held by swapper/0/1:
       #0: 00000000244bc7da (&dev->mutex){....}, at: __driver_attach+0x5a/0x110
       #1: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0
       #2: 000000005cd8463f (xps_map_mutex){+.+.}, at: __netif_set_xps_queue+0x8d/0xc60
      
      v2: move cpus_read_lock() out of __netif_set_xps_queue()
      
      Cc: "Nambiar, Amritha" <amritha.nambiar@intel.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Fixes: 8af2c06f ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
      Signed-off-by: NAndrei Vagin <avagin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d99f660
  14. 06 8月, 2018 1 次提交
  15. 31 7月, 2018 1 次提交
    • P
      net/tc: introduce TC_ACT_REINSERT. · cd11b164
      Paolo Abeni 提交于
      This is similar TC_ACT_REDIRECT, but with a slightly different
      semantic:
      - on ingress the mirred skbs are passed to the target device
      network stack without any additional check not scrubbing.
      - the rcu-protected stats provided via the tcf_result struct
        are updated on error conditions.
      
      This new tcfa_action value is not exposed to the user-space
      and can be used only internally by clsact.
      
      v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce
       a new action type instead
      v2 -> v3:
       - rename the new action value TC_ACT_REINJECT, update the
         helper accordingly
       - take care of uncloned reinjected packets in XDP generic
         hook
      v3 -> v4:
       - renamed again the new action value (JiriP)
      v4 -> v5:
       - fix build error with !NET_CLS_ACT (kbuild bot)
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cd11b164
  16. 30 7月, 2018 1 次提交
  17. 27 7月, 2018 1 次提交
  18. 21 7月, 2018 1 次提交
  19. 17 7月, 2018 2 次提交
  20. 14 7月, 2018 2 次提交
  21. 13 7月, 2018 1 次提交
    • P
      net: gro: properly remove skb from list · 68d2f84a
      Prashant Bhole 提交于
      Following crash occurs in validate_xmit_skb_list() when same skb is
      iterated multiple times in the loop and consume_skb() is called.
      
      The root cause is calling list_del_init(&skb->list) and not clearing
      skb->next in d4546c25. list_del_init(&skb->list) sets skb->next
      to point to skb itself. skb->next needs to be cleared because other
      parts of network stack uses another kind of SKB lists.
      validate_xmit_skb_list() uses such list.
      
      A similar type of bugfix was reported by Jesper Dangaard Brouer.
      https://patchwork.ozlabs.org/patch/942541/
      
      This patch clears skb->next and changes list_del_init() to list_del()
      so that list->prev will maintain the list poison.
      
      [  148.185511] ==================================================================
      [  148.187865] BUG: KASAN: use-after-free in validate_xmit_skb_list+0x4b/0xa0
      [  148.190158] Read of size 8 at addr ffff8801e52eefc0 by task swapper/1/0
      [  148.192940]
      [  148.193642] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.18.0-rc3+ #25
      [  148.195423] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
      [  148.199129] Call Trace:
      [  148.200565]  <IRQ>
      [  148.201911]  dump_stack+0xc6/0x14c
      [  148.203572]  ? dump_stack_print_info.cold.1+0x2f/0x2f
      [  148.205083]  ? kmsg_dump_rewind_nolock+0x59/0x59
      [  148.206307]  ? validate_xmit_skb+0x2c6/0x560
      [  148.207432]  ? debug_show_held_locks+0x30/0x30
      [  148.208571]  ? validate_xmit_skb_list+0x4b/0xa0
      [  148.211144]  print_address_description+0x6c/0x23c
      [  148.212601]  ? validate_xmit_skb_list+0x4b/0xa0
      [  148.213782]  kasan_report.cold.6+0x241/0x2fd
      [  148.214958]  validate_xmit_skb_list+0x4b/0xa0
      [  148.216494]  sch_direct_xmit+0x1b0/0x680
      [  148.217601]  ? dev_watchdog+0x4e0/0x4e0
      [  148.218675]  ? do_raw_spin_trylock+0x10/0x120
      [  148.219818]  ? do_raw_spin_lock+0xe0/0xe0
      [  148.221032]  __dev_queue_xmit+0x1167/0x1810
      [  148.222155]  ? sched_clock+0x5/0x10
      [...]
      
      [  148.474257] Allocated by task 0:
      [  148.475363]  kasan_kmalloc+0xbf/0xe0
      [  148.476503]  kmem_cache_alloc+0xb4/0x1b0
      [  148.477654]  __build_skb+0x91/0x250
      [  148.478677]  build_skb+0x67/0x180
      [  148.479657]  e1000_clean_rx_irq+0x542/0x8a0
      [  148.480757]  e1000_clean+0x652/0xd10
      [  148.481772]  net_rx_action+0x4ea/0xc20
      [  148.482808]  __do_softirq+0x1f9/0x574
      [  148.483831]
      [  148.484575] Freed by task 0:
      [  148.485504]  __kasan_slab_free+0x12e/0x180
      [  148.486589]  kmem_cache_free+0xb4/0x240
      [  148.487634]  kfree_skbmem+0xed/0x150
      [  148.488648]  consume_skb+0x146/0x250
      [  148.489665]  validate_xmit_skb+0x2b7/0x560
      [  148.490754]  validate_xmit_skb_list+0x70/0xa0
      [  148.491897]  sch_direct_xmit+0x1b0/0x680
      [  148.493949]  __dev_queue_xmit+0x1167/0x1810
      [  148.495103]  br_dev_queue_push_xmit+0xce/0x250
      [  148.496196]  br_forward_finish+0x276/0x280
      [  148.497234]  __br_forward+0x44f/0x520
      [  148.498260]  br_forward+0x19f/0x1b0
      [  148.499264]  br_handle_frame_finish+0x65e/0x980
      [  148.500398]  NF_HOOK.constprop.10+0x290/0x2a0
      [  148.501522]  br_handle_frame+0x417/0x640
      [  148.502582]  __netif_receive_skb_core+0xaac/0x18f0
      [  148.503753]  __netif_receive_skb_one_core+0x98/0x120
      [  148.504958]  netif_receive_skb_internal+0xe3/0x330
      [  148.506154]  napi_gro_complete+0x190/0x2a0
      [  148.507243]  dev_gro_receive+0x9f7/0x1100
      [  148.508316]  napi_gro_receive+0xcb/0x260
      [  148.509387]  e1000_clean_rx_irq+0x2fc/0x8a0
      [  148.510501]  e1000_clean+0x652/0xd10
      [  148.511523]  net_rx_action+0x4ea/0xc20
      [  148.512566]  __do_softirq+0x1f9/0x574
      [  148.513598]
      [  148.514346] The buggy address belongs to the object at ffff8801e52eefc0
      [  148.514346]  which belongs to the cache skbuff_head_cache of size 232
      [  148.517047] The buggy address is located 0 bytes inside of
      [  148.517047]  232-byte region [ffff8801e52eefc0, ffff8801e52ef0a8)
      [  148.519549] The buggy address belongs to the page:
      [  148.520726] page:ffffea000794bb00 count:1 mapcount:0 mapping:ffff880106f4dfc0 index:0xffff8801e52ee840 compound_mapcount: 0
      [  148.524325] flags: 0x17ffffc0008100(slab|head)
      [  148.525481] raw: 0017ffffc0008100 ffff880106b938d0 ffff880106b938d0 ffff880106f4dfc0
      [  148.527503] raw: ffff8801e52ee840 0000000000190011 00000001ffffffff 0000000000000000
      [  148.529547] page dumped because: kasan: bad access detected
      
      Fixes: d4546c25 ("net: Convert GRO SKB handling to list_head.")
      Signed-off-by: NPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Reported-by: NTyler Hicks <tyhicks@canonical.com>
      Tested-by: NTyler Hicks <tyhicks@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68d2f84a
  22. 10 7月, 2018 7 次提交
  23. 05 7月, 2018 1 次提交
  24. 04 7月, 2018 7 次提交