1. 02 Feb, 2018 3 commits
    • K
      IB/hfi1: Convert PortXmitWait/PortVLXmitWait counters to flit times · 07190076
      Committed by Kamenee Arumugam
      HFI's counters SendWaitCnt and SendWaitVLCnt are in units
      of TXE cycle time (at 805 MHz). OPA counters PortXmitWait and
      PortVLXmitWait are in units of flit times.
      Convert the counter values to flit units using the following
      conversion formula:
      
      PortXmitWait =
      	SendWaitCnt * 2 * (4 / link_width) * (25 Gbps / link_speed)
      PortVLXmitWait =
      	SendWaitVLCnt * 2 * (4 / link_width) * (25 Gbps / link_speed)
      
      At link up or downgrade events, the link width can change. To ensure
      accurate counter calculations, sample the counters after the events,
      during counter requests, and then aggregate the OPA counters.
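The conversion and the aggregation across link changes can be sketched as follows. This is a minimal Python illustration, not the driver's C code: the function and class names are hypothetical, counter wraparound is ignored, and it assumes that cycles counted before a link event are converted with the width and speed that were in effect before that event.

```python
def cycles_to_flit_times(send_wait_cnt, link_width, link_speed_gbps):
    """Convert a TXE-cycle wait count to flit times using the formula above.

    link_width is the number of active lanes (1 to 4) and link_speed_gbps
    is the per-lane speed in Gbps; both parameter names are illustrative.
    """
    if link_width <= 0 or link_speed_gbps <= 0:
        raise ValueError("link width and speed must be positive")
    return send_wait_cnt * 2 * (4 / link_width) * (25 / link_speed_gbps)


class XmitWaitAccumulator:
    """Hypothetical helper that aggregates flit times across link changes."""

    def __init__(self, link_width, link_speed_gbps):
        self.link_width = link_width
        self.link_speed_gbps = link_speed_gbps
        self.last_raw = 0      # raw cycle counter at the previous sample
        self.flit_total = 0.0  # aggregated count in flit times

    def sample(self, raw_count):
        """Fold the cycles since the previous sample into the flit total."""
        delta = raw_count - self.last_raw
        self.last_raw = raw_count
        self.flit_total += cycles_to_flit_times(
            delta, self.link_width, self.link_speed_gbps)
        return self.flit_total

    def link_changed(self, raw_count, new_width, new_speed_gbps):
        """Sample under the old link parameters, then switch to the new ones."""
        self.sample(raw_count)
        self.link_width = new_width
        self.link_speed_gbps = new_speed_gbps
```

With 4 lanes at 25 Gbps each cycle counts as 2 flit times; after a downgrade to 2 lanes each cycle counts as 4. Converting the raw total once with the final width would therefore misstate the wait time, which is why the sketch converts each interval separately.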
      Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • A
      IB/hfi1: Fix for early release of sdma context · 473291b3
      Committed by Alex Estrin
      With the IRQF_SHARED flag set and CONFIG_DEBUG_SHIRQ enabled,
      module removal may panic in the sdma_interrupt() routine
      if the associated SDMA context was released before pci_free_irq():
      
      [ 9198.939885] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [ 9198.940514] IP: sdma_make_progress+0xa5/0x450 [hfi1]
      [ 9198.941114] PGD 170bdc0067 P4D 170bdc0067 PUD 172063e067 PMD 0
      [ 9198.941783] Oops: 0000 [#1] SMP
      .....
      [ 9198.958877] CPU: 132 PID: 64173 Comm: rmmod Tainted: G           OE   4.14.0-rc4+ #1
      [ 9198.961032] Hardware name: Intel Corporation S7200AP/S7200AP, BIOS S72C610.86B.01.02.0118.080620171935 08/06/2017
      [ 9198.963323] task: ffff9681397f0000 task.stack: ffffae1647c40000
      [ 9198.965695] RIP: 0010:sdma_make_progress+0xa5/0x450 [hfi1]
      [ 9198.968082] RSP: 0018:ffffae1647c43be8 EFLAGS: 00010046
      [ 9198.970503] RAX: 0000000000000000 RBX: ffff9680ce8b5ca8 RCX: 0000000000000000
      [ 9198.973006] RDX: 0000000000000000 RSI: 0000000001a00d28 RDI: ffff9680ce8b5ca0
      [ 9198.975546] RBP: ffffae1647c43c40 R08: ffff96814325ec00 R09: 00000000ffffffff
      [ 9198.978142] R10: 000000004325e501 R11: ffff96814325ec00 R12: ffff9680ce8b5c44
      [ 9198.980779] R13: ffff9680ce8b5ca0 R14: 0000000000000000 R15: ffff9680ce8b5b00
      [ 9198.983462] FS:  00007f31196ba740(0000) GS:ffff96819df00000(0000) knlGS:0000000000000000
      [ 9198.986231] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9198.989036] CR2: 0000000000000000 CR3: 000000170833f000 CR4: 00000000001406e0
      [ 9198.991911] Call Trace:
      [ 9198.994847]  sdma_engine_interrupt+0x82/0x100 [hfi1]
      [ 9198.997852]  sdma_interrupt+0x61/0xc0 [hfi1]
      [ 9199.000852]  __free_irq+0x1b3/0x2d0
      [ 9199.003873]  free_irq+0x35/0x70
      [ 9199.006909]  pci_free_irq+0x1c/0x30
      [ 9199.009999]  clean_up_interrupts+0x53/0xf0 [hfi1]
      [ 9199.013137]  hfi1_start_cleanup+0x117/0x190 [hfi1]
      [ 9199.016315]  postinit_cleanup+0x1d/0x270 [hfi1]
      [ 9199.019529]  remove_one+0x1f3/0x210 [hfi1]
      [ 9199.022738]  pci_device_remove+0x39/0xc0
      [ 9199.025974]  device_release_driver_internal+0x141/0x210
      [ 9199.029268]  driver_detach+0x3f/0x80
      [ 9199.032580]  bus_remove_driver+0x55/0xd0
      [ 9199.035931]  driver_unregister+0x2c/0x50
      [ 9199.039321]  pci_unregister_driver+0x2a/0xa0
      [ 9199.042755]  hfi1_mod_cleanup+0x10/0xb50 [hfi1]
      [ 9199.046196]  SyS_delete_module+0x171/0x250
      ...
      
      Fix by exporting sdma_clean() and removing its call from sdma_exit().
      sdma_exit() now only manipulates the engine state,
      leaving the freeing of memory to sdma_clean(), which is now called
      just before the dd is freed.
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: Michael J Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Alex Estrin <alex.estrin@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • M
      IB/hfi1: Re-order IRQ cleanup to address driver cleanup race · 82a97926
      Committed by Michael J. Ruhl
      The pci_request_irq() interface always adds the IRQF_SHARED bit to
      all IRQ requests.
      
      When the kernel is built with the CONFIG_DEBUG_SHIRQ config flag and
      the IRQF_SHARED bit is set, the IRQ handler is called from the
      __free_irq() function. This tests for a race between the IRQ cleanup
      and an interrupt arriving during that cleanup. The HFI driver should
      be able to handle this race, but does not.
      
      This race can cause traces that start with this footprint:
      
      BUG: unable to handle kernel NULL pointer dereference at   (null)
      Call Trace:
       <hfi1 irq handler>
       ...
       __free_irq+0x1b3/0x2d0
       free_irq+0x35/0x70
       pci_free_irq+0x1c/0x30
       clean_up_interrupts+0x53/0xf0 [hfi1]
       hfi1_start_cleanup+0x122/0x190 [hfi1]
       postinit_cleanup+0x1d/0x280 [hfi1]
       remove_one+0x233/0x250 [hfi1]
       pci_device_remove+0x39/0xc0
      
      Export IRQ cleanup function so it can be called from other modules.
      
      Using the exported cleanup function:
      
        Re-order the driver cleanup code to clean up IRQ resources before
        other resources, eliminating the race.
      
        Re-order error path for init so that the race does not occur.
      
      Reduce severity on spurious error message for SDMA IRQs to info.
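The ordering problem can be illustrated with a toy model. The Python sketch below is not kernel code: the class and function names are hypothetical, and `free_irq()` here only imitates the CONFIG_DEBUG_SHIRQ behavior described above by invoking the handler once while the IRQ is being freed.

```python
class ToyDevice:
    """Hypothetical stand-in for a device with one shared IRQ."""

    def __init__(self):
        self.ctx = {"handled": 0}  # state the IRQ handler dereferences
        self.handler = None

    def request_irq(self):
        self.handler = self.irq_handler

    def irq_handler(self):
        # Fails if the context has already been released.
        self.ctx["handled"] += 1

    def free_irq(self):
        # Imitates CONFIG_DEBUG_SHIRQ: call the handler once during free.
        handler, self.handler = self.handler, None
        if handler is not None:
            handler()

    def release_ctx(self):
        self.ctx = None


def cleanup_resources_first(dev):
    """Problematic order: the handler can run against released state."""
    dev.release_ctx()
    dev.free_irq()


def cleanup_irq_first(dev):
    """Re-ordered: no handler can run once the state is released."""
    dev.free_irq()
    dev.release_ctx()
```

In this model `cleanup_resources_first()` raises a `TypeError`, the toy analogue of the NULL pointer dereference in the trace, while `cleanup_irq_first()` completes normally.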
      Reviewed-by: Alex Estrin <alex.estrin@intel.com>
      Reviewed-by: Patel Jay P <jay.p.patel@intel.com>
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  2. 06 Jan, 2018 1 commit
  3. 14 Nov, 2017 1 commit
  4. 31 Oct, 2017 1 commit
  5. 18 Oct, 2017 1 commit
    • K
      IB/hfi1: Convert timers to use timer_setup() · 8064135e
      Committed by Kees Cook
      In preparation for unconditionally passing the struct timer_list pointer to
      all timer callbacks, switch to using the new timer_setup() and from_timer()
      to pass the timer pointer explicitly. Switches test of .data field to
      .function, since .data will be going away.
      
      Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: linux-rdma@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  6. 27 Sep, 2017 3 commits
  7. 23 Aug, 2017 3 commits
  8. 01 Aug, 2017 3 commits
  9. 28 Jun, 2017 2 commits
  10. 05 May, 2017 7 commits
  11. 29 Apr, 2017 3 commits
  12. 21 Apr, 2017 3 commits
  13. 06 Apr, 2017 1 commit
  14. 19 Feb, 2017 1 commit
  15. 16 Nov, 2016 5 commits
  16. 02 Oct, 2016 2 commits