1. 14 Nov, 2016 15 commits
  2. 08 Sep, 2016 1 commit
  3. 16 Aug, 2016 2 commits
    • A
      xhci: really enqueue zero length TRBs. · 0d2daade
      Alban Browaeys committed
      Enqueue the first TRB even if full_len is zero.
      Without this "adb install <apk>" freezes the system.
      Signed-off-by: Alban Browaeys <alban.browaeys@gmail.com>
      Fixes: 86065c27 ("xhci: don't rely on precalculated value of needed trbs in the enqueue loop")
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
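The zero-length case above can be illustrated with a small stand-alone sketch. It is not the driver's code: `count_enqueued_trbs` and `max_trb_len` are hypothetical names, and the loop only counts iterations instead of writing TRBs. It assumes `max_trb_len` is greater than zero.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count how many TRBs would be enqueued for a
 * transfer of full_len bytes when one TRB carries at most max_trb_len
 * bytes. Assumes max_trb_len > 0.
 */
unsigned int count_enqueued_trbs(size_t full_len, size_t max_trb_len)
{
	unsigned int trbs = 0;
	int first_trb = 1;
	size_t enqd_len;
	size_t trb_buff_len;

	/*
	 * The "first_trb ||" term lets the body run once even when
	 * full_len is 0, so a zero-length transfer still gets one TRB.
	 */
	for (enqd_len = 0; first_trb || enqd_len < full_len;
	     enqd_len += trb_buff_len) {
		trb_buff_len = full_len - enqd_len;
		if (trb_buff_len > max_trb_len)
			trb_buff_len = max_trb_len;
		first_trb = 0;
		trbs++;
	}
	return trbs;
}
```

With a plain `enqd_len < full_len` condition the body would never run for `full_len == 0`, which matches the symptom described in the commit message.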
    • M
      xhci: always handle "Command Ring Stopped" events · 33be1265
      Mathias Nyman committed
      Fix "Command completion event does not match command" errors by always
      handling the command ring stopped events.
      
      The command ring stopped event is generated as a result of aborting
      or stopping the command ring with a register write. It is not caused
      by a command in the command queue, and thus won't have a matching command
      in the command list.
      
      Solve it by handling the command ring stopped event before checking for a
      matching command.
      
      In most command time out cases we abort the command ring, and get
      a command ring stopped event. The event's command pointer will point at
      the current command ring dequeue, which in most cases matches the timed
      out command in the command list, and no error messages are seen.
      
      If we instead get a command aborted event before the command ring stopped
      event, the abort event will increase the command ring dequeue pointer, and
      the following command ring stopped event's command pointer will point at the
      next, not yet queued command. This case triggered the error message.
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
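The ordering described above (check for a "command ring stopped" event first, only then compare against the expected command) can be sketched as follows. All names here (`comp_code`, `cmd_event`, `classify_cmd_event`, the `VERDICT_*` values) are hypothetical and exist only for illustration; they are not the driver's types or constants.

```c
#include <stdint.h>

/* Illustrative completion codes, not the xHCI register values */
enum comp_code {
	COMP_SUCCESS,
	COMP_CMD_ABORTED,
	COMP_CMD_RING_STOPPED
};

enum verdict {
	VERDICT_RING_STOPPED,	/* handled without looking at the command list */
	VERDICT_COMPLETED,	/* event matches the expected command */
	VERDICT_MISMATCH	/* would log "does not match command" */
};

struct cmd_event {
	enum comp_code code;
	uint64_t cmd_trb;	/* command pointer carried by the event */
};

enum verdict classify_cmd_event(const struct cmd_event *ev,
				uint64_t expected_trb)
{
	/*
	 * A ring-stopped event comes from a register write, not from a
	 * queued command, so handle it before any matching is attempted.
	 */
	if (ev->code == COMP_CMD_RING_STOPPED)
		return VERDICT_RING_STOPPED;

	if (ev->cmd_trb != expected_trb)
		return VERDICT_MISMATCH;

	return VERDICT_COMPLETED;
}
```

In this sketch a ring-stopped event whose pointer refers to a not yet queued command is classified as `VERDICT_RING_STOPPED` rather than as a mismatch.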
  4. 01 Jul, 2016 1 commit
    • A
      xhci: free the correct ring · f76a28a6
      Arnd Bergmann committed
      gcc warns about what first looks like a reference to an uninitialized
      variable:
      
      drivers/usb/host/xhci-ring.c: In function 'handle_cmd_completion':
      drivers/usb/host/xhci-ring.c:753:4: error: 'ep_ring' may be used uninitialized in this function [-Werror=maybe-uninitialized]
          xhci_unmap_td_bounce_buffer(xhci, ep_ring, cur_td);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/usb/host/xhci-ring.c:647:20: note: 'ep_ring' was declared here
        struct xhci_ring *ep_ring;
                          ^~~~~~~
      
      It's clear to see that the list_empty() check means it can never be
      uninitialized, however it still looks wrong:
      
      When ep->cancelled_td_list contains more than one entry, the
      ep_ring variable will point to the ring that was retrieved
      from the last urb, and we have to look it up again in the
      second loop instead, which fixes the behavior and gets rid of the
      warning too.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: f9c589e1 ("xhci: TD-fragment, align the unsplittable case with a bounce buffer")
      Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
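A minimal sketch of the pattern described above, assuming simplified stand-in types: `struct ring`, `struct td`, and `cleanup_cancelled_tds` are hypothetical and much smaller than the driver's structures. The point is only that the second loop looks the ring up again for each TD instead of reusing whatever the first loop left behind.

```c
#include <stddef.h>

struct ring {
	int visited;	/* touched by the first pass */
	int unmapped;	/* touched by the second pass */
};

struct td {
	struct ring *ring;	/* ring this TD was queued on */
};

void cleanup_cancelled_tds(struct td *tds, size_t n)
{
	size_t i;

	/* First pass: per-TD work that needs the TD's ring */
	for (i = 0; i < n; i++) {
		struct ring *ep_ring = tds[i].ring;

		ep_ring->visited++;
	}

	/*
	 * Second pass: look the ring up again for every TD. Reusing a
	 * pointer left over from the first pass would always refer to
	 * the ring of the last TD.
	 */
	for (i = 0; i < n; i++) {
		struct ring *ep_ring = tds[i].ring;

		ep_ring->unmapped++;
	}
}
```

Scoping `ep_ring` inside each loop body also removes the variable that the compiler warning pointed at.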
  5. 27 Jun, 2016 11 commits
  6. 02 Jun, 2016 2 commits
    • M
      xhci: Fix handling timeouted commands on hosts in weird states. · 3425aa03
      Mathias Nyman committed
      If commands time out we mark them for abortion, then stop the command
      ring, turn the commands into no-ops, and finally restart the command
      ring.
      
      If the host is working properly the no-op commands will finish and
      pending completions are called.
      If we notice the host is failing, driver clears the command ring and
      completes, deletes and frees all pending commands.
      
      There are two separate cases reported where host is believed to work
      properly but is not. In the first case we successfully stop the ring
      but no abort or stop command ring event is ever sent and host locks up.
      
      The second case is if a host is removed, command times out and driver
      believes the ring is stopped, and assumes it will be restarted, but
      actually ends up timing out on the same command forever.
      If one of the pending commands has the xhci->mutex held it will block
      xhci_stop() in the remove codepath which otherwise would cleanup pending
      commands.
      
      Add a check that clears all pending commands in case host is removed,
      or we are stuck timing out on the same command. Also restart the
      command timeout timer when stopping the command ring to ensure we
      receive a ring stop/abort event.
      
      Cc: stable <stable@vger.kernel.org>
      Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
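The added check can be sketched as a small decision function. This is an illustration under assumptions, not the driver's implementation: `cmd_timeout_state`, `on_command_timeout`, and the `ACTION_*` values are hypothetical names, and a command is identified simply by its address.

```c
#include <stddef.h>

enum timeout_action {
	ACTION_ABORT_RING,	/* normal path: stop ring, no-op the command, restart */
	ACTION_CLEANUP_ALL	/* give up: complete and free all pending commands */
};

struct cmd_timeout_state {
	int host_removed;		/* set once the host is being removed */
	const void *last_timed_out;	/* command that timed out last time */
};

enum timeout_action on_command_timeout(struct cmd_timeout_state *st,
				       const void *cmd)
{
	/*
	 * Host gone, or the same command timed out again: aborting the
	 * ring will not make progress, so clear everything instead.
	 */
	if (st->host_removed || st->last_timed_out == cmd)
		return ACTION_CLEANUP_ALL;

	st->last_timed_out = cmd;
	return ACTION_ABORT_RING;
}
```

A first timeout on a command takes the abort path; a second timeout on the same command, or any timeout after removal, takes the cleanup path.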
    • G
      xhci: Cleanup only when releasing primary hcd · 27a41a83
      Gabriel Krisman Bertazi committed
      Under stress, some TI devices might not return early when
      reading the status register during the quirk invocation of xhci_irq made
      by usb_hcd_pci_remove.  This means that instead of returning, we end up
      handling this interrupt in the middle of a shutdown.  Since
      xhci->event_ring has already been freed in xhci_mem_cleanup, we end up
      accessing freed memory, causing the Oops below.
      
      commit 8c24d6d7 ("usb: xhci: stop everything on the first call to
      xhci_stop") is the one that changed the instant in which we clean up the
      event queue when stopping a device.  Before, we didn't call
      xhci_mem_cleanup at the first time xhci_stop is executed (for the shared
      HCD), instead, we only did it after the invocation for the primary HCD,
      much later at the removal path.  The code flow for this oops looks like
      this:
      
      xhci_pci_remove()
      	usb_remove_hcd(xhci->shared)
      		xhci_stop(xhci->shared)
      			xhci_halt()
      			xhci_mem_cleanup(xhci);  // Free the event_queue
      	usb_hcd_pci_remove(primary)
      		xhci_irq()  // Access the event_queue if STS_EINT is set. Crash.
      		xhci_stop()
      			xhci_halt()
      			// return early
      
      The fix modifies xhci_stop to only cleanup the xhci data when releasing
      the primary HCD.  This way, we still have the event_queue configured
      when invoking xhci_irq.  We still halt the device on the first call to
      xhci_stop, though.
      
      I could reproduce this issue several times on the mainline kernel by
      doing a bind-unbind stress test with a specific storage gadget attached.
      I also ran the same test over-night with my patch applied and didn't
      observe the issue anymore.
      
      [  113.334124] Unable to handle kernel paging request for data at address 0x00000028
      [  113.335514] Faulting instruction address: 0xd00000000d4f767c
      [  113.336839] Oops: Kernel access of bad area, sig: 11 [#1]
      [  113.338214] SMP NR_CPUS=1024 NUMA PowerNV
      
      [c000000efe47ba90] c000000000720850 usb_hcd_irq+0x50/0x80
      [c000000efe47bac0] c00000000073d328 usb_hcd_pci_remove+0x68/0x1f0
      [c000000efe47bb00] d00000000daf0128 xhci_pci_remove+0x78/0xb0
      [xhci_pci]
      [c000000efe47bb30] c00000000055cf70 pci_device_remove+0x70/0x110
      [c000000efe47bb70] c00000000061c6bc __device_release_driver+0xbc/0x190
      [c000000efe47bba0] c00000000061c7d0 device_release_driver+0x40/0x70
      [c000000efe47bbd0] c000000000619510 unbind_store+0x120/0x150
      [c000000efe47bc20] c0000000006183c4 drv_attr_store+0x64/0xa0
      [c000000efe47bc60] c00000000039f1d0 sysfs_kf_write+0x80/0xb0
      [c000000efe47bca0] c00000000039e14c kernfs_fop_write+0x18c/0x1f0
      [c000000efe47bcf0] c0000000002e962c __vfs_write+0x6c/0x190
      [c000000efe47bd90] c0000000002eab40 vfs_write+0xc0/0x200
      [c000000efe47bde0] c0000000002ec85c SyS_write+0x6c/0x110
      [c000000efe47be30] c000000000009260 system_call+0x38/0x108
      Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Cc: Roger Quadros <rogerq@ti.com>
      Cc: joel@jms.id.au
      Cc: stable@vger.kernel.org
      Reviewed-by: Roger Quadros <rogerq@ti.com>
      Cc: <stable@vger.kernel.org> #v4.3+
      Tested-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
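The behaviour described by the fix (halt on the first stop call, free memory only when the primary HCD is released) can be modelled with a toy state struct. `struct host_model` and `stop_hcd` are hypothetical stand-ins; the real xhci_stop() takes a `struct usb_hcd` and does considerably more.

```c
/* Toy model of the controller state shared by both HCDs */
struct host_model {
	int halted;		/* controller has been halted */
	int mem_freed;		/* event ring and other memory released */
};

void stop_hcd(struct host_model *h, int is_primary)
{
	/* Halt the controller on whichever stop call comes first */
	if (!h->halted)
		h->halted = 1;

	/*
	 * Shared HCD: return before cleanup so the event ring is still
	 * valid if an interrupt handler runs during removal.
	 */
	if (!is_primary)
		return;

	h->mem_freed = 1;
}
```

Calling `stop_hcd(&h, 0)` followed by `stop_hcd(&h, 1)` mirrors the removal order in the call flow above: the controller is halted by the first call and memory is released only by the second.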
  7. 27 Apr, 2016 3 commits
  8. 19 Apr, 2016 1 commit
  9. 14 Apr, 2016 1 commit
    • M
      xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers · 98d74f9c
      Mathias Nyman committed
      PCI hotpluggable xhci controllers such as some Alpine Ridge solutions will
      remove the xhci controller from the PCI bus when the last USB device is
      disconnected.
      
      Add a flag to indicate that the host is being removed to avoid queueing
      configure_endpoint commands for the dropped endpoints.
      For PCI hotplugged controllers this will prevent 5 second command timeouts.
      For static xhci controllers the configure_endpoint command is not needed
      in the removal case as everything will be returned, freed, and the
      controller is reset.
      
      For now the flag is only set for PCI connected host controllers.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
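The flag-based check can be sketched in a few lines. `HOST_STATE_REMOVING` and `should_queue_configure_endpoint` are hypothetical names chosen for this illustration; the commit message does not name the flag.

```c
/* Hypothetical state bit, set when the host controller is being removed */
#define HOST_STATE_REMOVING (1u << 0)

/*
 * Return nonzero if a configure_endpoint command should be queued for
 * dropped endpoints. During removal the command is skipped, since the
 * controller may already be gone and the command would only time out.
 */
int should_queue_configure_endpoint(unsigned int host_state)
{
	return !(host_state & HOST_STATE_REMOVING);
}
```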
  10. 15 Feb, 2016 2 commits
  11. 04 Feb, 2016 1 commit
    • M
      Revert "xhci: don't finish a TD if we get a short-transfer event mid TD" · a6835090
      Mathias Nyman committed
      This reverts commit e210c422 ("xhci: don't finish a TD if we get a
      short transfer event mid TD")
      
      Turns out that most host controllers do not follow the xHCI specs and never
      send the second event for the last TRB in the TD if there was a short event
      mid-TD.
      
      Returning the URB directly after the first short-transfer event is far
      better than never returning the URB (class drivers usually time out
      after 30 sec). For the hosts that do send the second event we will go
      back to treating it as a misplaced event and print an error message for it.
      
      The original patch was sent to stable kernels and needs to be reverted from
      there as well.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>