1. 19 Mar, 2021 5 commits
  2. 10 Feb, 2021 1 commit
  3. 09 Feb, 2021 1 commit
  4. 29 Dec, 2020 4 commits
    • A
      net-sysfs: take the rtnl lock when accessing xps_rxqs_map and num_tc · 4ae2bb81
      Antoine Tenart committed
      Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be
      protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
      see an actual bug being triggered, but let's be safe here and take the
      rtnl lock while accessing the map in sysfs.
      
      Fixes: 8af2c06f ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4ae2bb81
    • A
      net-sysfs: take the rtnl lock when storing xps_rxqs · 2d57b4f1
      Antoine Tenart committed
      Two race conditions can be triggered when storing xps rxqs, resulting in
      various oops and invalid memory accesses:
      
      1. Calling netdev_set_num_tc while netif_set_xps_queue is running:

         - netif_set_xps_queue uses dev->num_tc as one of the parameters to
           compute the size of new_dev_maps when allocating it. dev->num_tc is
           also used to access the map, and the compiler may generate code to
           retrieve this field multiple times in the function.

         - netdev_set_num_tc sets dev->num_tc.

         If new_dev_maps is allocated using dev->num_tc and then dev->num_tc
         is set to a higher value through netdev_set_num_tc, later accesses to
         new_dev_maps in netif_set_xps_queue could lead to accessing memory
         outside of new_dev_maps, triggering an oops.
      
      2. Calling netif_set_xps_queue while netdev_set_num_tc is running:
      
         2.1. netdev_set_num_tc starts by resetting the xps queues;
              dev->num_tc isn't updated yet.

         2.2. netif_set_xps_queue is called, setting up the map with the
              *old* dev->num_tc.

         2.3. netdev_set_num_tc updates dev->num_tc.

         2.4. Later accesses to the map lead to out-of-bounds accesses and
              oops.
      
         A similar issue can be found with netdev_reset_tc.
      
      One way of triggering this is to set an iface up (for which the driver
      uses netdev_set_num_tc in the open path, such as bnx2x) and write to
      xps_rxqs in a concurrent thread. With the right timing an oops is
      triggered.
      
      Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
      and netdev_reset_tc should be mutually exclusive. We do that by taking
      the rtnl lock in xps_rxqs_store.
      
      Fixes: 8af2c06f ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2d57b4f1
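The mutual exclusion described in this commit message can be sketched in userspace C. This is only an illustrative model, not the kernel code: `model_rtnl` is a pthread mutex standing in for the rtnl lock, and `model_dev`, `model_set_num_tc` and `model_xps_store` are hypothetical names. The point it shows is that sizing the map and filling it happen under the same lock that guards updates to `num_tc`, so the map length can never disagree with the traffic-class count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the rtnl lock; not a kernel API. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;

struct model_dev {
    int num_tc;       /* number of traffic classes */
    int *xps_map;     /* one slot per traffic class */
    int xps_map_len;  /* length the map was allocated with */
};

/* Models netdev_set_num_tc: reset the map, then publish the new count. */
static void model_set_num_tc(struct model_dev *dev, int num_tc)
{
    assert(dev != NULL);
    pthread_mutex_lock(&model_rtnl);
    free(dev->xps_map);
    dev->xps_map = NULL;
    dev->xps_map_len = 0;
    dev->num_tc = num_tc;
    pthread_mutex_unlock(&model_rtnl);
}

/* Models the sysfs store path: read num_tc once and build the map
 * while holding the same lock, so num_tc cannot change in between. */
static int model_xps_store(struct model_dev *dev, int value)
{
    int ret = -1;

    assert(dev != NULL);
    pthread_mutex_lock(&model_rtnl);
    int n = dev->num_tc;
    if (n > 0) {
        int *map = calloc((size_t)n, sizeof(*map));
        if (map) {
            for (int tc = 0; tc < n; tc++)
                map[tc] = value;
            free(dev->xps_map);
            dev->xps_map = map;
            dev->xps_map_len = n;
            ret = 0;
        }
    }
    pthread_mutex_unlock(&model_rtnl);
    return ret;
}
```

Without the shared lock, a writer could change `num_tc` between the allocation and the fill loop, which is the out-of-bounds scenario the commit message walks through.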
    • A
      net-sysfs: take the rtnl lock when accessing xps_cpus_map and num_tc · fb250385
      Antoine Tenart committed
      Accesses to dev->xps_cpus_map (when using dev->num_tc) should be
      protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
      see an actual bug being triggered, but let's be safe here and take the
      rtnl lock while accessing the map in sysfs.
      
      Fixes: 184c449f ("net: Add support for XPS with QoS via traffic classes")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fb250385
    • A
      net-sysfs: take the rtnl lock when storing xps_cpus · 1ad58225
      Antoine Tenart committed
      Two race conditions can be triggered when storing xps cpus, resulting in
      various oops and invalid memory accesses:
      
      1. Calling netdev_set_num_tc while netif_set_xps_queue is running:

         - netif_set_xps_queue uses dev->num_tc as one of the parameters to
           compute the size of new_dev_maps when allocating it. dev->num_tc is
           also used to access the map, and the compiler may generate code to
           retrieve this field multiple times in the function.

         - netdev_set_num_tc sets dev->num_tc.

         If new_dev_maps is allocated using dev->num_tc and then dev->num_tc
         is set to a higher value through netdev_set_num_tc, later accesses to
         new_dev_maps in netif_set_xps_queue could lead to accessing memory
         outside of new_dev_maps, triggering an oops.
      
      2. Calling netif_set_xps_queue while netdev_set_num_tc is running:
      
         2.1. netdev_set_num_tc starts by resetting the xps queues;
              dev->num_tc isn't updated yet.

         2.2. netif_set_xps_queue is called, setting up the map with the
              *old* dev->num_tc.

         2.3. netdev_set_num_tc updates dev->num_tc.

         2.4. Later accesses to the map lead to out-of-bounds accesses and
              oops.
      
         A similar issue can be found with netdev_reset_tc.
      
      One way of triggering this is to set an iface up (for which the driver
      uses netdev_set_num_tc in the open path, such as bnx2x) and write to
      xps_cpus in a concurrent thread. With the right timing an oops is
      triggered.
      
      Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
      and netdev_reset_tc should be mutually exclusive. We do that by taking
      the rtnl lock in xps_cpus_store.
      
      Fixes: 184c449f ("net: Add support for XPS with QoS via traffic classes")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1ad58225
  5. 02 Oct, 2020 1 commit
  6. 19 Aug, 2020 1 commit
    • C
      net: Use generic ns_common::count · 8b8f3e66
      Christian Brauner committed
      Switch over network namespaces to use the newly introduced common lifetime
      counter.
      Network namespaces have an additional counter named "passive". This counter
      does not guarantee that the network namespace is not already de-initialized
      and so isn't concerned with the actual lifetime of the network namespace;
      only the "count" counter is. So the latter is moved into struct ns_common.
      
      Currently every namespace type has its own lifetime counter which is stored
      in the specific namespace struct. The lifetime counters are used
      identically for all namespace types. Namespaces may of course have
      additional unrelated counters and these are not altered.
      
      This introduces a common lifetime counter into struct ns_common. The
      ns_common struct encompasses information that all namespaces share. That
      should include the lifetime counter since it's common for all of them.
      
      It also allows us to unify the type of the counters across all namespaces.
      Most of them use refcount_t but one uses atomic_t and at least one uses
      kref. Especially the last one doesn't make much sense since it's just a
      wrapper around refcount_t since 2016 and actually complicates cleanup
      operations by having to use container_of() to cast the correct namespace
      struct out of struct ns_common.
      
      Having the lifetime counter for the namespaces in one place reduces
      maintenance cost. Not just because after switching all namespaces over we
      will have removed more code than we added but also because the logic is
      more easily understandable and we indicate to the user that the basic
      lifetime requirements for all namespaces are currently identical.
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
      [christian.brauner@ubuntu.com: rewrite commit]
      Link: https://lore.kernel.org/r/159644977635.604812.1319877322927063560.stgit@localhost.localdomain
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      8b8f3e66
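The idea of keeping one lifetime counter in the shared structure can be illustrated with a small userspace model. Everything here is an assumption made for illustration: `model_ns_common`, `model_net` and the helper functions are hypothetical names, and C11 atomics stand in for the kernel's `refcount_t`. The sketch shows the lifetime counter living in the common struct while a type-specific counter such as "passive" stays in the specific namespace struct.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a structure shared by all namespace types. */
struct model_ns_common {
    atomic_int count;   /* lifetime counter, common to every namespace */
};

/* Hypothetical model of one specific namespace type. */
struct model_net {
    int passive;                 /* unrelated, type-specific counter */
    struct model_ns_common ns;   /* embedded common part */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define model_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void model_ns_init(struct model_ns_common *ns)
{
    atomic_init(&ns->count, 1);
}

static void model_ns_get(struct model_ns_common *ns)
{
    atomic_fetch_add(&ns->count, 1);
}

/* Returns true when the caller dropped the last reference. */
static bool model_ns_put(struct model_ns_common *ns)
{
    int old = atomic_fetch_sub(&ns->count, 1);

    assert(old > 0);
    return old == 1;
}

static struct model_net *model_to_net(struct model_ns_common *ns)
{
    return model_container_of(ns, struct model_net, ns);
}
```

Because the get and put helpers only touch the common part, the same code could serve every namespace type in this model, which is the maintenance benefit the commit message argues for.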
  7. 13 Aug, 2020 1 commit
  8. 22 Jul, 2020 1 commit
  9. 08 Jul, 2020 1 commit
  10. 16 May, 2020 1 commit
  11. 24 Apr, 2020 2 commits
    • E
      net: napi: use READ_ONCE()/WRITE_ONCE() · 7e417a66
      Eric Dumazet committed
      gro_flush_timeout and napi_defer_hard_irqs can be read
      from napi_complete_done() while other cpus write the value,
      without explicit synchronization.
      
      Use READ_ONCE()/WRITE_ONCE() to annotate the races.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7e417a66
    • E
      net: napi: add hard irqs deferral feature · 6f8b12d6
      Eric Dumazet committed
      Back in commit 3b47d303 ("net: gro: add a per device gro flush timer")
      we added the ability to arm one high resolution timer, that we used
      to keep not-complete packets in GRO engine a bit longer, hoping that further
      frames might be added to them.
      
      Since then, we added the napi_complete_done() interface, and commit
      364b6055 ("net: busy-poll: return busypolling status to drivers")
      allowed drivers to avoid re-arming NIC interrupts if we made a promise
      that their NAPI poll() handler would be called in the near future.
      
      This infrastructure can be leveraged, thanks to a new device parameter,
      which allows arming the napi hrtimer instead of re-arming the device
      hard IRQ.
      
      We have noticed that on some servers with 32 RX queues or more, the chit-chat
      between the NIC and the host caused by IRQ delivery and re-arming could hurt
      throughput by ~20% on 100Gbit NIC.
      
      In contrast, hrtimers are using local (percpu) resources and might have lower
      cost.
      
      The new tunable, named napi_defer_hard_irqs, is placed in the same hierarchy
      as gro_flush_timeout (/sys/class/net/ethX/)
      
      By default, both gro_flush_timeout and napi_defer_hard_irqs are zero.
      
      This patch does not change the prior behavior of gro_flush_timeout
      if used alone : NIC hard irqs should be rearmed as before.
      
      One concrete usage can be :
      
      echo 20000 >/sys/class/net/eth1/gro_flush_timeout
      echo 10 >/sys/class/net/eth1/napi_defer_hard_irqs
      
      If at least one packet is retired, then we will reset napi counter
      to 10 (napi_defer_hard_irqs), ensuring at least 10 periodic scans
      of the queue.
      
      On busy queues, this should avoid NIC hard IRQ, while before this patch IRQ
      avoidance was only possible if napi->poll() exhausted its budget
      and did not call napi_complete_done().
      
      This feature also can be used to work around some non-optimal NIC irq
      coalescing strategies.
      
      Having the ability to insert XX usec delays between each napi->poll()
      can increase cache efficiency, since we increase batch sizes.
      
      It also keeps serving cpus from being idle too long, reducing tail latencies.
      Co-developed-by: Luigi Rizzo <lrizzo@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f8b12d6
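The counter behaviour described in this commit message can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions, not the kernel's napi_complete_done(): `model_napi`, `defer_count` and `model_poll_complete` are hypothetical names, and the function only reports whether the hard IRQ would be re-armed or whether the timer would be used instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-queue state. */
struct model_napi {
    unsigned long gro_flush_timeout;  /* timer period; 0 disables the timer */
    int napi_defer_hard_irqs;         /* configured number of deferred polls */
    int defer_count;                  /* deferred polls still allowed */
};

/* Called when a poll finishes below budget.
 * Returns true if the hard IRQ should be re-armed, false if the
 * timer is armed instead and the queue will be scanned again later. */
static bool model_poll_complete(struct model_napi *n, int work_done)
{
    assert(n != NULL);

    /* Any retired packet resets the counter to the configured value. */
    if (work_done > 0)
        n->defer_count = n->napi_defer_hard_irqs;

    if (n->defer_count > 0) {
        n->defer_count--;
        if (n->gro_flush_timeout > 0)
            return false;  /* defer: rely on the timer, skip the IRQ */
    }
    return true;           /* counter exhausted or feature disabled */
}
```

With the values from the usage example in the commit message (timeout 20000, deferral count 10), one poll that retires a packet followed by nine empty polls all defer in this model, and the next empty poll re-arms the IRQ. With a deferral count of zero the function always returns true, matching the statement that the prior behaviour of gro_flush_timeout alone is unchanged.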
  12. 21 Apr, 2020 1 commit
  13. 10 Apr, 2020 1 commit
  14. 27 Feb, 2020 2 commits
    • C
      net-sysfs: add queue_change_owner() · d755407d
      Christian Brauner committed
      Add a function to change the owner of the queue entries for a network device
      when it is moved between network namespaces.
      
      Currently, when moving network devices between network namespaces the
      ownership of the corresponding queue sysfs entries is not changed. This leads
      to problems when tools try to operate on the corresponding sysfs files. Fix
      this.
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d755407d
    • C
      net-sysfs: add netdev_change_owner() · e6dee9f3
      Christian Brauner committed
      Add a function to change the owner of a network device when it is moved
      between network namespaces.
      
      Currently, when moving network devices between network namespaces the
      ownership of the corresponding sysfs entries is not changed. This leads
      to problems when tools try to operate on the corresponding sysfs files.
      This leads to a bug whereby a network device that is created in a
      network namespace owned by a user namespace will have its corresponding
      sysfs entry owned by the root user of the corresponding user namespace.
      If such a network device has to be moved back to the host network
      namespace the permissions will still be set to the user namespace. This
      means unprivileged users can e.g. trigger uevents for such incorrectly
      owned devices. They can also modify the settings of the device itself.
      Both of these things are unwanted.
      
      For example, workloads will create network devices in the host network
      namespace. Other tools will then proceed to move such devices between
      network namespaces owned by other user namespaces. While the ownership
      of the device itself is updated in
      net/core/dev.c:dev_change_net_namespace() the corresponding sysfs
      entry for the device is not:
      
      drwxr-xr-x 5 nobody nobody    0 Jan 25 18:08 .
      drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_assign_type
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_len
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 address
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 broadcast
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_changes
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_down_count
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_up_count
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_id
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_port
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dormant
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 duplex
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 flags
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 gro_flush_timeout
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifalias
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifindex
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 iflink
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 link_mode
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 mtu
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 name_assign_type
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 netdev_group
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 operstate
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_id
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_name
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_switch_id
      drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 power
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 proto_down
      drwxr-xr-x 4 nobody nobody    0 Jan 25 18:09 queues
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 speed
      drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 statistics
      lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:08 subsystem -> ../../../../class/net
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 tx_queue_len
      -r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 type
      -rw-r--r-- 1 nobody nobody 4096 Jan 25 18:08 uevent
      
      However, if a device is created directly in the network namespace then
      the device's sysfs permissions will be correctly updated:
      
      drwxr-xr-x 5 root   root      0 Jan 25 18:12 .
      drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_assign_type
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_len
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 address
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 broadcast
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 carrier
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_changes
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_down_count
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_up_count
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_id
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_port
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 dormant
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 duplex
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 flags
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 gro_flush_timeout
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 ifalias
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 ifindex
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 iflink
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 link_mode
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 mtu
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 name_assign_type
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 netdev_group
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 operstate
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_id
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_name
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_switch_id
      drwxr-xr-x 2 root   root      0 Jan 25 18:12 power
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 proto_down
      drwxr-xr-x 4 root   root      0 Jan 25 18:12 queues
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 speed
      drwxr-xr-x 2 root   root      0 Jan 25 18:12 statistics
      lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:12 subsystem -> ../../../../class/net
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 tx_queue_len
      -r--r--r-- 1 root   root   4096 Jan 25 18:12 type
      -rw-r--r-- 1 root   root   4096 Jan 25 18:12 uevent
      
      Now, when creating a network device in a network namespace owned by a
      user namespace and moving it to the host the permissions will be set to
      the id that the user namespace root user has been mapped to on the host
      leading to all sorts of permission issues:
      
      458752
      drwxr-xr-x 5 458752 458752      0 Jan 25 18:12 .
      drwxr-xr-x 9 root   root        0 Jan 25 18:08 ..
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_assign_type
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_len
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 address
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 broadcast
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_changes
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_down_count
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_up_count
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_id
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_port
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dormant
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 duplex
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 flags
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 gro_flush_timeout
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 ifalias
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 ifindex
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 iflink
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 link_mode
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 mtu
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 name_assign_type
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 netdev_group
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 operstate
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_id
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_name
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_switch_id
      drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 power
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 proto_down
      drwxr-xr-x 4 458752 458752      0 Jan 25 18:12 queues
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 speed
      drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 statistics
      lrwxrwxrwx 1 root   root        0 Jan 25 18:12 subsystem -> ../../../../class/net
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 tx_queue_len
      -r--r--r-- 1 458752 458752   4096 Jan 25 18:12 type
      -rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 uevent
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6dee9f3
  15. 18 Dec, 2019 1 commit
  16. 07 Dec, 2019 1 commit
  17. 21 Nov, 2019 2 commits
    • E
      net-sysfs: fix netdev_queue_add_kobject() breakage · 48a322b6
      Eric Dumazet committed
      kobject_put() should only be called in the error path.
      
      Fixes: b8eb7183 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jouni Hogander <jouni.hogander@unikie.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      48a322b6
    • J
      net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject · b8eb7183
      Jouni Hogander committed
      kobject_init_and_add takes a reference even when it fails. This has
      to be given up by the caller in error handling. Otherwise memory
      allocated by kobject_init_and_add is never freed. Originally found
      by Syzkaller:
      
      BUG: memory leak
      unreferenced object 0xffff8880679f8b08 (size 8):
        comm "netdev_register", pid 269, jiffies 4294693094 (age 12.132s)
        hex dump (first 8 bytes):
          72 78 2d 30 00 36 20 d4                          rx-0.6 .
        backtrace:
          [<000000008c93818e>] __kmalloc_track_caller+0x16e/0x290
          [<000000001f2e4e49>] kvasprintf+0xb1/0x140
          [<000000007f313394>] kvasprintf_const+0x56/0x160
          [<00000000aeca11c8>] kobject_set_name_vargs+0x5b/0x140
          [<0000000073a0367c>] kobject_init_and_add+0xd8/0x170
          [<0000000088838e4b>] net_rx_queue_update_kobjects+0x152/0x560
          [<000000006be5f104>] netdev_register_kobject+0x210/0x380
          [<00000000e31dab9d>] register_netdevice+0xa1b/0xf00
          [<00000000f68b2465>] __tun_chr_ioctl+0x20d5/0x3dd0
          [<000000004c50599f>] tun_chr_ioctl+0x2f/0x40
          [<00000000bbd4c317>] do_vfs_ioctl+0x1c7/0x1510
          [<00000000d4c59e8f>] ksys_ioctl+0x99/0xb0
          [<00000000946aea81>] __x64_sys_ioctl+0x78/0xb0
          [<0000000038d946e5>] do_syscall_64+0x16f/0x580
          [<00000000e0aa5d8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
          [<00000000285b3d1a>] 0xffffffffffffffff
      
      Cc: David Miller <davem@davemloft.net>
      Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b8eb7183
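The reference rule in this commit, together with the follow-up fix listed above it, can be sketched with a toy reference-counted object. All names here (`model_kobj`, `model_init_and_add`, `model_kobj_put`, `model_queue_add`) are hypothetical and only mimic the contract described in the commit messages: the init-and-add step leaves the caller holding a reference and an allocated name even when it fails, so the caller must drop that reference on the error path, and only on the error path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy object; not the kernel's struct kobject. */
struct model_kobj {
    int refs;
    char *name;
};

static int model_released;  /* counts how many times an object was released */

/* Drop one reference; free the name when the last one goes away. */
static void model_kobj_put(struct model_kobj *k)
{
    assert(k != NULL && k->refs > 0);
    if (--k->refs == 0) {
        free(k->name);
        k->name = NULL;
        model_released++;
    }
}

/* Mimics the described contract: a reference is taken and the name is
 * allocated before the add step, so both exist even if the add fails. */
static int model_init_and_add(struct model_kobj *k, const char *name,
                              int fail_add)
{
    k->refs = 1;
    k->name = malloc(strlen(name) + 1);
    if (!k->name)
        return -1;
    strcpy(k->name, name);
    return fail_add ? -1 : 0;
}

/* Caller side: put the object only when init-and-add failed. */
static int model_queue_add(struct model_kobj *k, const char *name,
                           int fail_add)
{
    int err = model_init_and_add(k, name, fail_add);

    if (err) {
        model_kobj_put(k);  /* error path: give up the reference */
        return err;
    }
    return 0;               /* success: the object keeps its reference */
}
```

Calling the put unconditionally would release a live object on success, which is the breakage the later commit corrects; omitting it on failure leaks the name, which is the leak reported here.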
  18. 31 May, 2019 1 commit
  19. 26 Apr, 2019 1 commit
  20. 16 Apr, 2019 1 commit
  21. 24 Mar, 2019 1 commit
  22. 22 Mar, 2019 1 commit
    • W
      net-sysfs: Fix memory leak in netdev_register_kobject · 6b70fc94
      Wang Hai committed
      When registering struct net_device, it will call
      	register_netdevice ->
      		netdev_register_kobject ->
      			device_initialize(dev);
      			dev_set_name(dev, "%s", ndev->name)
      			device_add(dev)
      			register_queue_kobjects(ndev)
      
      If device_add(dev) or register_queue_kobjects(ndev) fails in
      netdev_register_kobject(), register_netdevice() returns an
      error, which causes netdev_freemem(ndev) to be called to free
      the net_device. However, put_device(&dev->dev)->..->
      kobject_cleanup() is never called, resulting in a memory leak.
      
      syzkaller reports this:
      BUG: memory leak
      unreferenced object 0xffff8881f4fad168 (size 8):
      comm "syz-executor.0", pid 3575, jiffies 4294778002 (age 20.134s)
      hex dump (first 8 bytes):
        77 70 61 6e 30 00 ff ff                          wpan0...
      backtrace:
        [<000000006d2d91d7>] kstrdup_const+0x3d/0x50 mm/util.c:73
        [<00000000ba9ff953>] kvasprintf_const+0x112/0x170 lib/kasprintf.c:48
        [<000000005555ec09>] kobject_set_name_vargs+0x55/0x130 lib/kobject.c:281
        [<0000000098d28ec3>] dev_set_name+0xbb/0xf0 drivers/base/core.c:1915
        [<00000000b7553017>] netdev_register_kobject+0xc0/0x410 net/core/net-sysfs.c:1727
        [<00000000c826a797>] register_netdevice+0xa51/0xeb0 net/core/dev.c:8711
        [<00000000857bfcfd>] cfg802154_update_iface_num.isra.2+0x13/0x90 [ieee802154]
        [<000000003126e453>] ieee802154_llsec_fill_key_id+0x1d5/0x570 [ieee802154]
        [<00000000e4b3df51>] 0xffffffffc1500e0e
        [<00000000b4319776>] platform_drv_probe+0xc6/0x180 drivers/base/platform.c:614
        [<0000000037669347>] really_probe+0x491/0x7c0 drivers/base/dd.c:509
        [<000000008fed8862>] driver_probe_device+0xdc/0x240 drivers/base/dd.c:671
        [<00000000baf52041>] device_driver_attach+0xf2/0x130 drivers/base/dd.c:945
        [<00000000c7cc8dec>] __driver_attach+0x10e/0x210 drivers/base/dd.c:1022
        [<0000000057a757c2>] bus_for_each_dev+0x154/0x1e0 drivers/base/bus.c:304
        [<000000005f5ae04b>] bus_add_driver+0x427/0x5e0 drivers/base/bus.c:645
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: 1fa5ae85 ("driver core: get rid of struct device's bus_id string array")
      Signed-off-by: Wang Hai <wanghai26@huawei.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b70fc94
  23. 20 Mar, 2019 1 commit
  24. 05 Mar, 2019 1 commit
  25. 04 Mar, 2019 1 commit
    • Y
      net-sysfs: Fix mem leak in netdev_register_kobject · 895a5e96
      YueHaibing committed
      syzkaller reports this:
      BUG: memory leak
      unreferenced object 0xffff88837a71a500 (size 256):
        comm "syz-executor.2", pid 9770, jiffies 4297825125 (age 17.843s)
        hex dump (first 32 bytes):
          00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
          ff ff ff ff ff ff ff ff 20 c0 ef 86 ff ff ff ff  ........ .......
        backtrace:
          [<00000000db12624b>] netdev_register_kobject+0x124/0x2e0 net/core/net-sysfs.c:1751
          [<00000000dc49a994>] register_netdevice+0xcc1/0x1270 net/core/dev.c:8516
          [<00000000e5f3fea0>] tun_set_iff drivers/net/tun.c:2649 [inline]
          [<00000000e5f3fea0>] __tun_chr_ioctl+0x2218/0x3d20 drivers/net/tun.c:2883
          [<000000001b8ac127>] vfs_ioctl fs/ioctl.c:46 [inline]
          [<000000001b8ac127>] do_vfs_ioctl+0x1a5/0x10e0 fs/ioctl.c:690
          [<0000000079b269f8>] ksys_ioctl+0x89/0xa0 fs/ioctl.c:705
          [<00000000de649beb>] __do_sys_ioctl fs/ioctl.c:712 [inline]
          [<00000000de649beb>] __se_sys_ioctl fs/ioctl.c:710 [inline]
          [<00000000de649beb>] __x64_sys_ioctl+0x74/0xb0 fs/ioctl.c:710
          [<000000007ebded1e>] do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
          [<00000000db315d36>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<00000000115be9bb>] 0xffffffffffffffff
      
      The error path of register_queue_kobjects should call kset_unregister
      to free 'dev->queues_kset'; otherwise the kset is leaked.
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: 1d24eb48 ("xps: Transmit Packet Steering")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      895a5e96
  26. 07 Feb, 2019 2 commits
  27. 07 Dec, 2018 1 commit
  28. 10 Aug, 2018 1 commit
    • A
      net: allow to call netif_reset_xps_queues() under cpus_read_lock · 4d99f660
      Andrei Vagin committed
      The definition of static_key_slow_inc() has cpus_read_lock in place. In the
      virtio_net driver, XPS queues are initialized after setting the queue:cpu
      affinity in virtnet_set_affinity() which is already protected within
      cpus_read_lock. Lockdep prints a warning when we are trying to acquire
      cpus_read_lock when it is already held.
      
      This patch adds the ability to call __netif_set_xps_queue under
      cpus_read_lock().
      Acked-by: Jason Wang <jasowang@redhat.com>
      
      ============================================
      WARNING: possible recursive locking detected
      4.18.0-rc3-next-20180703+ #1 Not tainted
      --------------------------------------------
      swapper/0/1 is trying to acquire lock:
      00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: static_key_slow_inc+0xe/0x20
      
      but task is already holding lock:
      00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(cpu_hotplug_lock.rw_sem);
        lock(cpu_hotplug_lock.rw_sem);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      3 locks held by swapper/0/1:
       #0: 00000000244bc7da (&dev->mutex){....}, at: __driver_attach+0x5a/0x110
       #1: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0
       #2: 000000005cd8463f (xps_map_mutex){+.+.}, at: __netif_set_xps_queue+0x8d/0xc60
      
      v2: move cpus_read_lock() out of __netif_set_xps_queue()
      
      Cc: "Nambiar, Amritha" <amritha.nambiar@intel.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Fixes: 8af2c06f ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
      Signed-off-by: Andrei Vagin <avagin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4d99f660
  29. 21 Jul, 2018 1 commit