1. 22 Nov, 2018 2 commits
  2. 17 Oct, 2018 1 commit
  3. 16 Oct, 2018 1 commit
    • IB/ipoib: Clear IPCB before icmp_send · 4d6e4d12
      Denis Drozdov committed
      IPCB should be cleared before icmp_send, since it may contain data from
      previous layers and the data could be misinterpreted as ip header options,
      which later caused the ihl to be set to an invalid value and resulted in
      the following stack corruption:
      
      [ 1083.031512] ib0: packet len 57824 (> 2048) too long to send, dropping
      [ 1083.031843] ib0: packet len 37904 (> 2048) too long to send, dropping
      [ 1083.032004] ib0: packet len 4040 (> 2048) too long to send, dropping
      [ 1083.032253] ib0: packet len 63800 (> 2048) too long to send, dropping
      [ 1083.032481] ib0: packet len 23960 (> 2048) too long to send, dropping
      [ 1083.033149] ib0: packet len 63800 (> 2048) too long to send, dropping
      [ 1083.033439] ib0: packet len 63800 (> 2048) too long to send, dropping
      [ 1083.033700] ib0: packet len 63800 (> 2048) too long to send, dropping
      [ 1083.034124] ib0: packet len 63800 (> 2048) too long to send, dropping
      [ 1083.034387] ==================================================================
      [ 1083.034602] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0xf08/0x1310
      [ 1083.034798] Write of size 4 at addr ffff880353457c5f by task kworker/u16:0/7
      [ 1083.034990]
      [ 1083.035104] CPU: 7 PID: 7 Comm: kworker/u16:0 Tainted: G           O      4.19.0-rc5+ #1
      [ 1083.035316] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
      [ 1083.035573] Workqueue: ipoib_wq ipoib_cm_skb_reap [ib_ipoib]
      [ 1083.035750] Call Trace:
      [ 1083.035888]  dump_stack+0x9a/0xeb
      [ 1083.036031]  print_address_description+0xe3/0x2e0
      [ 1083.036213]  kasan_report+0x18a/0x2e0
      [ 1083.036356]  ? __ip_options_echo+0xf08/0x1310
      [ 1083.036522]  __ip_options_echo+0xf08/0x1310
      [ 1083.036688]  icmp_send+0x7b9/0x1cd0
      [ 1083.036843]  ? icmp_route_lookup.constprop.9+0x1070/0x1070
      [ 1083.037018]  ? netif_schedule_queue+0x5/0x200
      [ 1083.037180]  ? debug_show_all_locks+0x310/0x310
      [ 1083.037341]  ? rcu_dynticks_curr_cpu_in_eqs+0x85/0x120
      [ 1083.037519]  ? debug_locks_off+0x11/0x80
      [ 1083.037673]  ? debug_check_no_obj_freed+0x207/0x4c6
      [ 1083.037841]  ? check_flags.part.27+0x450/0x450
      [ 1083.037995]  ? debug_check_no_obj_freed+0xc3/0x4c6
      [ 1083.038169]  ? debug_locks_off+0x11/0x80
      [ 1083.038318]  ? skb_dequeue+0x10e/0x1a0
      [ 1083.038476]  ? ipoib_cm_skb_reap+0x2b5/0x650 [ib_ipoib]
      [ 1083.038642]  ? netif_schedule_queue+0xa8/0x200
      [ 1083.038820]  ? ipoib_cm_skb_reap+0x544/0x650 [ib_ipoib]
      [ 1083.038996]  ipoib_cm_skb_reap+0x544/0x650 [ib_ipoib]
      [ 1083.039174]  process_one_work+0x912/0x1830
      [ 1083.039336]  ? wq_pool_ids_show+0x310/0x310
      [ 1083.039491]  ? lock_acquire+0x145/0x3a0
      [ 1083.042312]  worker_thread+0x87/0xbb0
      [ 1083.045099]  ? process_one_work+0x1830/0x1830
      [ 1083.047865]  kthread+0x322/0x3e0
      [ 1083.050624]  ? kthread_create_worker_on_cpu+0xc0/0xc0
      [ 1083.053354]  ret_from_fork+0x3a/0x50
      
      For instance, __ip_options_echo fails when it is handed an invalid srr
      and optlen from another layer via IPCB:
      
      [  762.139568] IPv4: __ip_options_echo rr=0 ts=0 srr=43 cipso=0
      [  762.139720] IPv4: ip_options_build: IPCB 00000000f3cd969e opt 000000002ccb3533
      [  762.139838] IPv4: __ip_options_echo in srr: optlen 197 soffset 84
      [  762.139852] IPv4: ip_options_build srr=0 is_frag=0 rr_needaddr=0 ts_needaddr=0 ts_needtime=0 rr=0 ts=0
      [  762.140269] ==================================================================
      [  762.140713] IPv4: __ip_options_echo rr=0 ts=0 srr=0 cipso=0
      [  762.141078] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x12ec/0x1680
      [  762.141087] Write of size 4 at addr ffff880353457c7f by task kworker/u16:0/7
      Signed-off-by: Denis Drozdov <denisd@mellanox.com>
      Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Feras Daoud <ferasda@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
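The failure mode described in 4d6e4d12 can be illustrated with a small userspace model. This is only a sketch: `fake_skb`, `fake_link_cb`, `fake_ip_cb` and the helper functions are hypothetical stand-ins, not kernel types, and the real IPCB layout differs. It shows a shared control buffer whose stale bytes look like IP options until they are zeroed, which is presumably what the fix does with a memset over IPCB(skb) before calling icmp_send.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for sk_buff: only the shared control buffer matters. */
struct fake_skb {
    unsigned char cb[48];
};

/* What a lower layer might keep in cb (made-up layout). */
struct fake_link_cb {
    uint8_t hwaddr[20];
};

/* What the IP layer expects to find in the same bytes (heavily simplified). */
struct fake_ip_cb {
    uint8_t optlen;
    uint8_t srr;
};

/* Lower layer leaves arbitrary non-zero data behind in the control buffer. */
static void link_layer_fill(struct fake_skb *skb)
{
    memset(skb->cb, 0xc5, sizeof(struct fake_link_cb));
}

/* The fix in miniature: zero the IP view of cb before the IP layer reads it. */
static void clear_ip_cb(struct fake_skb *skb)
{
    memset(skb->cb, 0, sizeof(struct fake_ip_cb));
}

/* Returns 1 if the IP layer would believe there are options to echo. */
static int ip_sees_options(const struct fake_skb *skb)
{
    struct fake_ip_cb cb;

    memcpy(&cb, skb->cb, sizeof(cb)); /* copy out to avoid aliasing issues */
    return cb.optlen != 0 || cb.srr != 0;
}
```

Without the clear step, `ip_sees_options()` reports options that were never there; after `clear_ip_cb()` it does not.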
  4. 11 Oct, 2018 2 commits
  5. 01 Oct, 2018 1 commit
  6. 28 Sep, 2018 1 commit
  7. 27 Sep, 2018 1 commit
  8. 20 Sep, 2018 1 commit
  9. 13 Sep, 2018 2 commits
    • IB/ipoib: Log sysfs 'dev_id' accesses from userspace · f6350da4
      Arseny Maslennikov committed
      Some tools may currently be using only the deprecated attribute;
      let's print an elaborate and clear deprecation notice to kmsg.
      
      To do that, we have to replace the whole sysfs file, since we inherit
      the original one from netdev.
      Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ipoib: Use dev_port to expose network interface port numbers · 9b8b2a32
      Arseny Maslennikov committed
      Some InfiniBand network devices have multiple ports on the same PCI
      function. This initializes the `dev_port' sysfs field of those
      network interfaces with their port number.
      
      Prior to this the kernel erroneously used the `dev_id' sysfs
      field of those network interfaces to convey the port number to userspace.
      
      The use of `dev_id' was considered correct until Linux 3.15,
      when another field, `dev_port', was defined for this particular
      purpose and `dev_id' was reserved for distinguishing stacked ifaces
      (e.g: VLANs) with the same hardware address as their parent device.
      
      Similar fixes to net/mlx4_en and many other drivers, which started
      exporting this information through `dev_id' before 3.15, were accepted
      into the kernel 4 years ago.
      See 76a066f2 (`net/mlx4_en: Expose port number through sysfs').
      Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  10. 07 Sep, 2018 1 commit
  11. 06 Sep, 2018 2 commits
    • IB/srp: Remove unnecessary unlikely() · 882dff28
      Igor Stoppa committed
      WARN_ON() already contains an unlikely(), so it's not necessary to wrap it
      into another.
      Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
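To see why the outer unlikely() in 882dff28 is redundant, here is a simplified userspace model. It is a sketch under stated assumptions: the macros below only imitate the shape of the kernel's WARN_ON() and unlikely() (the real definitions are more elaborate), `warn_count`, `check_before()` and `check_after()` are hypothetical names, and the code relies on the GCC/Clang statement-expression and `__builtin_expect` extensions.

```c
#include <stdio.h>

/* Simplified model of the kernel's branch hint. */
#define unlikely(x) __builtin_expect(!!(x), 0)

static int warn_count; /* hypothetical counter so the effect is observable */

/* Simplified model of WARN_ON(): it already applies unlikely() internally. */
#define WARN_ON(cond) ({                                              \
    int ret_warn_on_ = !!(cond);                                      \
    if (unlikely(ret_warn_on_)) {                                     \
        warn_count++;                                                 \
        fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__);    \
    }                                                                 \
    unlikely(ret_warn_on_);                                           \
})

/* Before: the outer unlikely() adds nothing. */
static int check_before(int bad)
{
    if (unlikely(WARN_ON(bad)))
        return -1;
    return 0;
}

/* After: same behavior, one hint fewer. */
static int check_after(int bad)
{
    if (WARN_ON(bad))
        return -1;
    return 0;
}
```

Both functions return the same values and warn the same number of times; only the redundant hint is gone.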
    • IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handler · 816e846c
      Aaron Knister committed
      Inside start_xmit(), checking whether the connection is up and queueing
      the packet for later transmission are not done atomically. This leaves a
      window in which cm_rep_handler can run, set the connection up and dequeue
      the pending packets, so that packets queued afterwards by start_xmit() sit
      on neigh->queue until they are dropped when the connection is torn down.
      This only applies to connected mode. These dropped packets can really
      upset TCP, for example, and cause multi-minute delays in transmission for
      open connections.
      
      Here's the code in start_xmit where we check to see if the connection is
      up:
      
             if (ipoib_cm_get(neigh)) {
                     if (ipoib_cm_up(neigh)) {
                             ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                             goto unref;
                     }
             }
      
      The race occurs if cm_rep_handler execution occurs after the above
      connection check (specifically if it gets to the point where it acquires
      priv->lock to dequeue pending skb's) but before the below code snippet in
      start_xmit where packets are queued.
      
             if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
                     push_pseudo_header(skb, phdr->hwaddr);
                     spin_lock_irqsave(&priv->lock, flags);
                     __skb_queue_tail(&neigh->queue, skb);
                     spin_unlock_irqrestore(&priv->lock, flags);
             } else {
                     ++dev->stats.tx_dropped;
                     dev_kfree_skb_any(skb);
             }
      
      The patch acquires the netif tx lock in cm_rep_handler for the section
      where it sets the connection up and dequeues and retransmits deferred
      skb's.
      
      Fixes: 839fcaba ("IPoIB: Connected mode experimental support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Aaron Knister <aaron.s.knister@nasa.gov>
      Tested-by: Ira Weiny <ira.weiny@intel.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
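The locking idea in 816e846c can be sketched in userspace as follows. This is a minimal model, not the driver code: `fake_neigh`, `fake_start_xmit()` and `fake_cm_rep_handler()` are hypothetical names, the spinlock built on C11 `atomic_flag` merely stands in for the netif tx lock, and the model takes the lock explicitly on the transmit side, whereas in the kernel the transmit path is normally already serialized by that lock. The point is that "check whether the connection is up, else enqueue" and "mark up, then drain the queue" each run as one critical section, so no packet can be queued after the drain.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the per-neighbour state. */
struct fake_neigh {
    atomic_flag tx_lock; /* stands in for the netif tx lock */
    bool cm_up;          /* connection established? */
    int queued;          /* packets waiting on the pending queue */
    int sent;            /* packets handed to the connection */
};

#define FAKE_NEIGH_INIT { ATOMIC_FLAG_INIT, false, 0, 0 }

static void tx_lock(struct fake_neigh *n)
{
    while (atomic_flag_test_and_set(&n->tx_lock)) {
        /* spin until the lock is free */
    }
}

static void tx_unlock(struct fake_neigh *n)
{
    atomic_flag_clear(&n->tx_lock);
}

/* Transmit path: the up-check and the enqueue happen under one lock. */
static void fake_start_xmit(struct fake_neigh *n)
{
    tx_lock(n);
    if (n->cm_up)
        n->sent++;
    else
        n->queued++;
    tx_unlock(n);
}

/* Reply handler: mark the connection up and drain under the same lock. */
static void fake_cm_rep_handler(struct fake_neigh *n)
{
    tx_lock(n);
    n->cm_up = true;
    n->sent += n->queued;
    n->queued = 0;
    tx_unlock(n);
}
```

Because both paths hold the same lock, a packet is either queued before the drain (and then retransmitted by it) or sees `cm_up` set and is sent directly.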
  12. 03 Aug, 2018 11 commits
    • IB/ipoib: Consolidate checking of the proposed child interface · 76010976
      Jason Gunthorpe committed
      Move all the checking for pkey and other validity to the __ipoib_vlan_add
      function. This removes the last difference from the control flow
      of the __ipoib_vlan_add to make the overall design simpler to
      understand.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    • IB/ipoib: Maintain the child_intfs list from ndo_init/uninit · 13476d35
      Jason Gunthorpe committed
      This fixes a bug in the netlink path where the vlan_rwsem was not
      held around __ipoib_vlan_add causing the child_intfs to be manipulated
      unsafely.
      
      In the process this greatly simplifies the vlan_rwsem write side locking
      to only cover a single non-sleeping statement.
      
      This also further increases the safety of the removal ordering by holding
      the netdev of the parent while the child is active to ensure most bugs
      become either an oops on a NULL priv or a deadlock on the netdev refcount.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    • IB/ipoib: Do not remove child devices from within the ndo_uninit · 25405d98
      Jason Gunthorpe committed
      Switching to priv_destructor and needs_free_netdev created a subtle
      ordering problem in ipoib_remove_one.
      
      Now that unregister_netdev frees the netdev and priv, we must ensure that
      the children are unregistered before trying to unregister the parent,
      or the child unregister will hit a use-after-free.
      
      The solution is to unregister the children, then parent, in the same batch
      all while holding the rtnl_lock. This closes all the races where a new
      child could have been added and ensures proper ordering.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    • IB/ipoib: Get rid of the sysfs_mutex · ee190ab7
      Jason Gunthorpe committed
      This mutex was introduced to deal with the deadlock formed by calling
      unregister_netdev from within the sysfs callback of a netdev.
      
      Now that we have priv_destructor and needs_free_netdev we can switch
      to the more targeted solution of running the unregister from a
      work queue. This avoids the deadlock and gets rid of the mutex.
      
      The next patch in the series needs this mutex eliminated to create
      atomicity of unregistration.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    • RDMA/netdev: Use priv_destructor for netdev cleanup · 9f49a5b5
      Jason Gunthorpe committed
      Now that the unregister_netdev flow for IPoIB no longer relies on external
      code, we can introduce the use of priv_destructor and
      needs_free_netdev.
      
      The rdma_netdev flow is switched to use the netdev common priv_destructor
      instead of the special free_rdma_netdev and the IPOIB ULP adjusted:
       - priv_destructor needs to switch to point to the ULP's destructor
         which will then call the rdma_ndev's in the right order
       - We need to be careful around the error unwind of register_netdev
         as it sometimes calls priv_destructor on failure
       - ULPs need to use ndo_init/uninit to ensure proper ordering
         of failures around register_netdev
      
      Switching to priv_destructor is a necessary pre-requisite to using
      the rtnl new_link mechanism.
      
      The VNIC user for rdma_netdev should also be revised, but that is left for
      another patch.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Denis Drozdov <denisd@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    • IB/ipoib: Move init code to ndo_init · eaeb3984
      Jason Gunthorpe committed
      Now that we have a proper ndo_uninit, move code that naturally pairs
      with the ndo_uninit into ndo_init. This allows the netdev core to naturally
      handle ordering.
      
      This fixes the situation where register_netdev can fail before calling
      ndo_init, in which case it wouldn't call ndo_uninit either.
      
      Also move a bunch of duplicated init code that is shared between child
      and parent for clarity. Now the child and parent register functions look
      very similar.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
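The pairing argument in eaeb3984 can be made concrete with a toy model of a registration function. Everything here is an assumption for illustration: `fake_ops`, `fake_register()`, its `fail_stage` parameter and the `my_init`/`my_uninit` callbacks are hypothetical and only mimic the ordering the commit message describes (a failure before the init callback runs no callbacks at all, a failure after it runs the uninit callback). It is not the netdev core's actual code.

```c
#include <stddef.h>

/* Hypothetical callback table, loosely shaped like net_device_ops. */
struct fake_ops {
    int  (*ndo_init)(void *priv);
    void (*ndo_uninit)(void *priv);
};

/*
 * Toy registration function.
 * fail_stage: 0 = succeed, 1 = fail before ndo_init, 2 = fail after ndo_init.
 */
static int fake_register(const struct fake_ops *ops, void *priv, int fail_stage)
{
    if (fail_stage == 1)
        return -1;                  /* neither callback has run */

    if (ops->ndo_init) {
        int ret = ops->ndo_init(priv);
        if (ret)
            return ret;
    }

    if (fail_stage == 2) {
        if (ops->ndo_uninit)
            ops->ndo_uninit(priv);  /* unwind what ndo_init set up */
        return -1;
    }
    return 0;
}

/* Example driver state: init acquires a resource, uninit releases it. */
struct fake_priv {
    int resources;
};

static int my_init(void *p)
{
    ((struct fake_priv *)p)->resources++;
    return 0;
}

static void my_uninit(void *p)
{
    ((struct fake_priv *)p)->resources--;
}

static const struct fake_ops my_ops = { my_init, my_uninit };
```

When setup lives in the init callback and teardown in the uninit callback, the resource count stays balanced on every failure path without any driver-side unwind code.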
    • IB/ipoib: Move all uninit code into ndo_uninit · 7cbee87c
      Jason Gunthorpe committed
      Currently uninit is sometimes done twice in error flows, and is sprinkled
      a bit all over the place.
      
      Improve the clarity of the design by moving all uninit only into
      ndo_uninit.
      
      Some duplication is removed:
       - Sometimes IPOIB_STOP_NEIGH_GC was done before unregister, but
         this duplicates the process in ipoib_neigh_hash_init
       - Flushing priv->wq was sometimes done before unregister,
         but that duplicates what has been done in ndo_uninit
      
      Uniniting the IB event queue must remain before unregister_netdev as it
      requires the RTNL lock to be dropped. This is moved to a helper to make
      that flow really clear and remove some duplication in error flows.
      
      If register_netdev fails (and ndo_init is NULL) then it almost always
      calls ndo_uninit, which lets us remove all the extra code from the error
      unwinds. The next patch in the series will close the 'almost always' hole
      by pairing a proper ndo_init with ndo_uninit.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    • IB/ipoib: Use cancel_delayed_work_sync for neigh-clean task · cda8daf1
      Erez Shitrit committed
      The neigh_reap_task is self restarting, but so long as we call
      cancel_delayed_work_sync() it will be guaranteed to not be running and
      never start again. Thus we don't need to have the racy
      IPOIB_STOP_NEIGH_GC bit, or the confusing mismatch of places sometimes
      calling flush_workqueue after the cancel.
      
      This fixes a situation where the GC work could have been left running
      in some rare situations.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • IB/ipoib: Get rid of IPOIB_FLAG_GOING_DOWN · 577e07ff
      Jason Gunthorpe committed
      This essentially duplicates the netdev's reg_state, so just use that
      directly. The reg_state is updated under the rtnl_lock, and all places
      using GOING_DOWN already acquire the rtnl_lock so checking is safe.
      
      Since the only place we use GOING_DOWN is for the parent device this
      does not fix any bugs, but it is a step to tidy up the unregister flow
      so that after later patches the flow is uniform and sane.
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
    • scsi: target: srp, vscsi, sbp, qla: use target_remove_session · b287e351
      Mike Christie committed
      This converts the drivers that called transport_deregister_session_configfs
      and then immediately called transport_deregister_session to use
      target_remove_session.
      Signed-off-by: Mike Christie <mchristi@redhat.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Chris Boot <bootc@bootc.net>
      Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Cc: Michael Cyr <mikecyr@linux.vnet.ibm.com>
      Cc: <qla2xxx-upstream@qlogic.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: target: rename target_alloc_session · fa834287
      Mike Christie committed
      Rename target_alloc_session to target_setup_session to avoid confusion with
      the other transport session allocation function that only allocates the
      session and because the target_alloc_session does so much more. It
      allocates the session, sets up the nacl and registers the session.
      
      The next patch will then add a remove function to match the setup in this
      one, so it should make sense for all drivers, except iscsi, to just call
      those 2 functions to setup and remove a session.
      
      iscsi will continue to be the odd driver.
      Signed-off-by: Mike Christie <mchristi@redhat.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Chris Boot <bootc@bootc.net>
      Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Cc: Michael Cyr <mikecyr@linux.vnet.ibm.com>
      Cc: <qla2xxx-upstream@qlogic.com>
      Cc: Johannes Thumshirn <jth@kernel.org>
      Cc: Felipe Balbi <balbi@kernel.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  13. 02 Aug, 2018 1 commit
  14. 01 Aug, 2018 1 commit
  15. 31 Jul, 2018 3 commits
  16. 30 Jul, 2018 1 commit
  17. 25 Jul, 2018 6 commits
  18. 24 Jul, 2018 1 commit
  19. 14 Jul, 2018 1 commit