1. 09 Aug, 2012 1 commit
    • S
      xhci: Fix bug after deq ptr set to link TRB. · 50d0206f
      Sarah Sharp committed
      This patch fixes a particularly nasty bug that was revealed by the ring
      expansion patches.  The bug has been present since the very beginning of
      the xHCI driver history, and could have caused general protection faults
      from bad memory accesses.
      
      The first thing to note is that a Set TR Dequeue Pointer command can
      move the dequeue pointer to a link TRB, if the canceled or stalled
      transfer TD ended just before a link TRB.  The function to increment the
      dequeue pointer, inc_deq, was written before cancellation and stall
      support was added.  It assumed that the dequeue pointer could never
      point to a link TRB.  It would unconditionally increment the dequeue
      pointer at the start of the function, check if the pointer was now on a
      link TRB, and move it to the top of the next segment if so.
      
      This means that if a Set TR Dequeue Pointer command moved the dequeue
      pointer to a link TRB, a subsequent call to inc_deq() would move the
      pointer off the segment and into la-la-land.  It would then read from
      that memory to determine if it was a link TRB.  Other functions would
      often call inc_deq() until the dequeue pointer matched some other
      pointer, which means this function would quite happily read all of
      system memory before wrapping around to the right pointer value.
      
      Often, there would be another endpoint segment from a different ring
      allocated from the same DMA pool, which would be contiguous to the
      segment inc_deq just stepped off of.  inc_deq would eventually find the
      link TRB in that segment, and blindly move the dequeue pointer back to
      the top of the correct ring segment.
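
A minimal sketch of the link-TRB handling described above, not the actual patch. The struct layout, the `is_link` flag, `ring_init()` and `inc_deq_sketch()` are simplified, hypothetical stand-ins for the driver's real types and functions. The idea is to inspect the current TRB before stepping, and to move the dequeue segment together with the dequeue pointer:

```c
#include <stdbool.h>

#define TRBS_PER_SEG 4  /* tiny segments, for illustration only */

struct trb { bool is_link; };

struct segment {
	struct trb trbs[TRBS_PER_SEG];  /* the last entry is the link TRB */
	struct segment *next;
};

struct ring {
	struct segment *deq_seg;  /* segment that holds the dequeue pointer */
	struct trb *dequeue;
};

/* Link two segments into a circular ring and start at the top of segment a. */
void ring_init(struct ring *r, struct segment *a, struct segment *b)
{
	for (int i = 0; i < TRBS_PER_SEG; i++) {
		a->trbs[i].is_link = (i == TRBS_PER_SEG - 1);
		b->trbs[i].is_link = (i == TRBS_PER_SEG - 1);
	}
	a->next = b;
	b->next = a;
	r->deq_seg = a;
	r->dequeue = a->trbs;
}

/*
 * Advance the dequeue pointer by one position. If the pointer is already
 * on a link TRB, follow the link instead of incrementing past the end of
 * the segment. The loop also skips a link TRB that the increment lands on.
 */
void inc_deq_sketch(struct ring *r)
{
	do {
		if (r->dequeue->is_link) {
			r->deq_seg = r->deq_seg->next;
			r->dequeue = r->deq_seg->trbs;
		} else {
			r->dequeue++;
		}
	} while (r->dequeue->is_link);
}
```

With the pointer parked on the link TRB of the first segment, one call moves both `deq_seg` and `dequeue` to the top of the second segment, so the two never get out of sync.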
      
      The only reason the original code worked at all is because there was
      only one ring segment.  With the ring expansion patches, the dequeue
      pointer would eventually wrap into place, but the dequeue segment would
      be out-of-sync.  On the second TD after the dequeue pointer was moved to
      a link TRB, trb_in_td() would fail (because the dequeue pointer and
      dequeue segment were out-of-sync), and this message would appear:
      
      ERROR Transfer event TRB DMA ptr not part of current TD
      
      This fixes bugzilla entry 43333 (option-based modem unhappy on USB 3.0
      port: "Transfer event TRB DMA ptr not part of current TD", "rejecting
      I/O to offline device"),
      
      	https://bugzilla.kernel.org/show_bug.cgi?id=43333
      
      and possibly other general protection fault bugs as well.
      
      This patch should be backported to kernels as old as 2.6.31.  A separate
      patch will be created for kernels older than 3.4, since inc_deq was
      modified in 3.4 and this patch will not apply.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Tested-by: James Ettle <theholyettlz@googlemail.com>
      Tested-by: Matthew Hall <mhall@mhcomputing.net>
      Cc: stable@vger.kernel.org
  2. 08 Aug, 2012 1 commit
  3. 03 Jul, 2012 1 commit
    • S
      xhci: Fix hang on back-to-back Set TR Deq Ptr commands. · 0d9f78a9
      Sarah Sharp committed
      The Microsoft LifeChat 3000 USB headset was causing a very reproducible
      hang whenever it was plugged in.  At first, I thought the host
      controller was producing bad transfer events, because the log was filled
      with errors like:
      
      xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD
      
      However, it turned out to be an xHCI driver bug in the ring expansion
      patches.  The bug is triggered when there are two ring segments, and a
      TD that ends just before a link TRB, like so:
      
       ______________                     _____________
      |              |              ---> | setup TRB B |
       ______________               |     _____________
      |              |              |    |  data TRB B |
       ______________               |     _____________
      | setup TRB A  | <-- deq      |    |  data TRB B |
       ______________               |     _____________
      | data TRB A   |              |    |             | <-- enq, deq''
       ______________               |     _____________
      | status TRB A |              |    |             |
       ______________               |     _____________
      |  link TRB    |---------------    |  link TRB   |
       _____________  <--- deq'           _____________
      
      TD A (the first control transfer) stalls on the data phase.  That halts
      the ring.  The xHCI driver moves the hardware dequeue pointer to the
      first TRB after the stalled transfer, which happens to be the link TRB.
      
      Once the Set TR dequeue pointer command completes, the function
      update_ring_for_set_deq_completion runs.  That function is supposed to
      update the xHCI driver's dequeue pointer to match the internal hardware
      dequeue pointer.  On the first call this would work fine, and the
      software dequeue pointer would move to deq'.
      
      However, if the transfer immediately after that stalled (TD B in this
      case), another Set TR Dequeue command would be issued.  That would move
      the hardware dequeue pointer to deq''.  Once that command completed,
      update_ring_for_set_deq_completion would run again.
      
      The original code would unconditionally increment the software dequeue
      pointer, which moved the pointer off the ring segment into la-la-land.
      The while loop would happily increment the dequeue pointer (possibly
      wrapping it) until it matched the hardware pointer value.
      
      The while loop would also access all the memory in between the first
      ring segment and the second ring segment to determine if it was a link
      TRB.  This could cause general protection faults, although it was
      unlikely because the ring segments came from a DMA pool, and would often
      have consecutive memory addresses.
      
      If nothing in that space looked like a link TRB, the deq_seg pointer for
      the ring would remain on the first segment.  Thus, the deq_seg and the
      software dequeue pointer would get out of sync.
      
      When the next transfer event came in after the stalled transfer, the
      xHCI driver code would attempt to convert the software dequeue pointer
      into a DMA address in order to compare the DMA address for the completed
      transfer.  Since the deq_seg and the dequeue pointer were out of sync,
      xhci_trb_virt_to_dma would return NULL.
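
To illustrate why an out-of-sync deq_seg makes the conversion fail, here is a rough sketch of a virtual-to-DMA lookup that only succeeds when the TRB pointer lies inside the given segment. The types and the name `trb_virt_to_dma_sketch()` are hypothetical simplifications, not the driver's real definitions, and the failure value is modeled as 0:

```c
#include <stddef.h>
#include <stdint.h>

#define TRBS_PER_SEG 4  /* tiny segments, for illustration only */

struct trb { uint32_t field[4]; };  /* 16 bytes per TRB */

struct segment {
	struct trb trbs[TRBS_PER_SEG];
	uint64_t dma;  /* bus address of trbs[0] */
};

/*
 * Translate a TRB pointer into a DMA address, but only if the pointer
 * really belongs to seg. Returns 0 when it does not.
 */
uint64_t trb_virt_to_dma_sketch(const struct segment *seg,
				const struct trb *trb)
{
	uintptr_t base, p;

	if (!seg || !trb)
		return 0;
	base = (uintptr_t)seg->trbs;
	p = (uintptr_t)trb;
	if (p < base || p >= base + sizeof(seg->trbs))
		return 0;
	return seg->dma + (uint64_t)(p - base);
}
```

If the dequeue pointer has moved into the second segment while deq_seg still names the first, the range check fails and the caller gets 0 instead of a usable address.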
      
      The transfer event would get ignored, the transfer would eventually
      time out, and we would mistakenly convert the finished transfer to no-op
      TRBs.  Some kernel driver (maybe xHCI?) would then get stuck in an
      infinite loop in interrupt context, and the whole machine would hang.
      
      This patch should be backported to kernels as old as 3.4, that contain
      the commit b008df60 "xHCI: count free
      TRBs on transfer ring"
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@vger.kernel.org
  4. 19 May, 2012 1 commit
    • S
      xhci: Some Evaluate Context commands must succeed. · 4b266541
      Sarah Sharp committed
      The upcoming USB 3.0 Link PM patches will introduce new API to enable
      and disable low-power link states.  We must be able to disable LPM in
      order to reset a device, or place the device into U3 (device suspend).
      Therefore, we need to make sure the Evaluate Context command to disable
      the LPM timeouts can't fail due to there being no room on the command
      ring.
      
      Introduce a new flag to the function that queues the Evaluate Context
      command, command_must_succeed.  This tells the ring handler that a TRB
      has already been reserved for the command (by incrementing
      xhci->cmd_ring_reserved_trbs), and basically ensures that prepare_ring()
      won't fail.  A similar flag was already implemented for the Configure
      Endpoint command queuing function.
      
      All functions that currently call xhci_configure_endpoint() to issue an
      Evaluate Context command pass "false" for the "must_succeed" parameter,
      so this patch should have no effect on current xHCI driver behavior.
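
The reservation idea can be sketched roughly as follows. This is a simplified model, not the driver code: `struct cmd_ring`, its fields and `queue_command_sketch()` are hypothetical names, and the sketch releases the reservation inside the queuing function purely to keep the example short.

```c
#include <stdbool.h>

struct cmd_ring {
	unsigned int free_trbs;      /* empty slots left on the command ring */
	unsigned int reserved_trbs;  /* slots promised to must-succeed commands */
};

/*
 * Queue one command. Ordinary commands need one slot on top of every
 * reserved slot, so they can never consume a reservation. A must-succeed
 * command uses a slot that its caller reserved earlier.
 * Returns 0 on success, -1 when the ring has no room.
 */
int queue_command_sketch(struct cmd_ring *ring, bool command_must_succeed)
{
	unsigned int needed = ring->reserved_trbs;

	if (!command_must_succeed)
		needed++;
	/* needed == 0 means a must-succeed call without a reservation. */
	if (needed == 0 || ring->free_trbs < needed)
		return -1;

	ring->free_trbs--;
	if (command_must_succeed)
		ring->reserved_trbs--;  /* the reservation is now used up */
	return 0;
}
```

With one free slot and one reservation, an ordinary command is refused while a must-succeed command still goes through.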
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
  5. 18 May, 2012 1 commit
    • S
      xhci: Add new short TX quirk for Fresco Logic host. · 1530bbc6
      Sarah Sharp committed
      Sergio reported that when he recorded audio from a USB headset mic
      plugged into the USB 3.0 port on his ASUS N53SV-DH72, the audio sounded
      "robotic".  When plugged into the USB 2.0 port under EHCI on the same
      laptop, the audio sounded fine.  The device is:
      
      Bus 002 Device 004: ID 046d:0a0c Logitech, Inc. Clear Chat Comfort USB Headset
      
      The problem was tracked down to the Fresco Logic xHCI host controller
      not correctly reporting short transfers on isochronous IN endpoints.
      The driver would submit a 96 byte transfer, the device would only send
      88 or 90 bytes, and the xHCI host would report the transfer had a
      "successful" completion code, with an untransferred buffer length of 8
      or 6 bytes.
      
      The successful completion code and non-zero untransferred length is a
      contradiction.  The xHCI host is supposed to only mark a transfer as
      successful if all the bytes are transferred.  Otherwise, the transfer
      should be marked with a short packet completion code.  Without the EHCI
      bus trace, we wouldn't know whether the xHCI driver should trust the
      completion code or the untransferred length.  With it, we know to trust
      the untransferred length.
      
      Add a new xHCI quirk for the Fresco Logic host controller.  If a
      transfer is reported as successful, but the untransferred length is
      non-zero, print a warning.  For the Fresco Logic host, change the
      completion code to COMP_SHORT_TX and process the transfer like a short
      transfer.
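
A small sketch of the decision described above, under simplifying assumptions: the enum values are placeholders rather than the real completion code numbers, and `struct transfer_event`, `effective_comp_code()`, `bytes_transferred()` and the `trust_tx_length` flag are hypothetical names standing in for the event TRB fields and the quirk bit.

```c
#include <stdbool.h>
#include <stdio.h>

/* Placeholder values, not the xHCI completion code numbers. */
enum comp_code { COMP_SUCCESS, COMP_SHORT_TX };

struct transfer_event {
	enum comp_code code;
	unsigned int untransferred;  /* bytes the host reports as not moved */
};

/*
 * "Success" with a non-zero untransferred length is contradictory.
 * Always warn about it; on hosts with the quirk, believe the length
 * and treat the event as a short transfer.
 */
enum comp_code effective_comp_code(const struct transfer_event *ev,
				   bool trust_tx_length)
{
	if (ev->code == COMP_SUCCESS && ev->untransferred != 0) {
		fprintf(stderr, "WARN: successful completion on short TX\n");
		if (trust_tx_length)
			return COMP_SHORT_TX;
	}
	return ev->code;
}

/* Actual payload size for a transfer of `requested` bytes. */
unsigned int bytes_transferred(unsigned int requested,
			       const struct transfer_event *ev)
{
	return requested - ev->untransferred;
}
```

For the 96-byte submission from the report, an untransferred length of 8 yields 88 bytes of audio, which is what the device actually sent.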
      
      This should be backported to stable kernels that contain the commit
      f5182b41 "xhci: Disable MSI for some
      Fresco Logic hosts."  That commit was marked for stable kernels as old
      as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Reported-by: Sergio Correia <lists@uece.net>
      Tested-by: Sergio Correia <lists@uece.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
  6. 08 May, 2012 1 commit
  7. 04 May, 2012 2 commits
    • H
      usb-xhci: Handle COMP_TX_ERR for isoc tds · 9c745995
      Hans de Goede committed
      While testing unplugging an UVC HD webcam with usb-redirection (so through
      usbdevfs), my userspace usb-redir code was getting a value of -1 in
      iso_frame_desc[n].status, which according to Documentation/usb/error-codes.txt
      is not a valid value.
      
      The source of this -1 is the default case in xhci-ring.c:process_isoc_td()
      adding a kprintf there showed the value of trb_comp_code to be COMP_TX_ERR
      in this case, so this patch adds handling for that completion code to
      process_isoc_td().
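
The shape of the change can be sketched as a switch with an explicit case for the transaction error, so that it no longer falls through to the default. The enum and `isoc_frame_status()` are hypothetical simplifications, and the choice of -EPROTO for the new case is an assumption made for this sketch; the message above does not say which status the patch uses.

```c
#include <errno.h>

/* Placeholder values, not the xHCI completion code numbers. */
enum comp_code { COMP_SUCCESS, COMP_TX_ERR, COMP_SOMETHING_ELSE };

/* Map a completion code to the status stored in iso_frame_desc[n].status. */
int isoc_frame_status(enum comp_code code)
{
	switch (code) {
	case COMP_SUCCESS:
		return 0;
	case COMP_TX_ERR:
		/* Assumed errno: a documented value instead of the bare -1. */
		return -EPROTO;
	default:
		/* Unhandled codes still end up as -1, as noted above. */
		return -1;
	}
}
```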
      
      This was observed and tested with the following xhci controller:
      1033:0194 NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04)
      
      Note: I also wonder if setting frame->status to -1 (-EPERM) is the best we can
      do, but since I cannot come up with anything better I've left that as is.
      
      This patch should be backported to kernels as old as 2.6.36, which contain the
      commit 04e51901 "USB: xHCI: Isochronous
      transfer implementation".
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: stable@vger.kernel.org
    • A
      xHCI: keep track of ports being resumed and indicate in hub_status_data · f370b996
      Andiry Xu committed
      This commit adds a bit-array to xhci bus_state for keeping track of
      which ports are undergoing a resume transition. If any of the bits
      are set when xhci_hub_status_data() is called, the routine will return
      a non-zero value even if no ports have any status changes pending.
      This will allow usbcore to handle races between root-hub suspend and
      port wakeup.
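
A minimal sketch of the bit-array idea, with hypothetical names throughout (`struct bus_state_sketch`, `mark_port_resuming()`, `mark_port_resumed()`, `hub_status_data_sketch()`); the real routine also fills in a status buffer, which is left out here:

```c
#include <stdbool.h>

struct bus_state_sketch {
	unsigned long port_change;     /* bit n: port n has a pending change */
	unsigned long resuming_ports;  /* bit n: port n is in the middle of a resume */
};

void mark_port_resuming(struct bus_state_sketch *bs, unsigned int port)
{
	bs->resuming_ports |= 1UL << port;
}

void mark_port_resumed(struct bus_state_sketch *bs, unsigned int port)
{
	bs->resuming_ports &= ~(1UL << port);
}

/*
 * Report "something to do" if any port has a pending change or is still
 * resuming, so the hub is not suspended underneath a waking port.
 */
bool hub_status_data_sketch(const struct bus_state_sketch *bs)
{
	return (bs->port_change | bs->resuming_ports) != 0;
}
```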
      
      This patch should be backported to kernels as old as 3.4, that contain
      the commit 879d38e6 "USB: fix race
      between root-hub suspend and remote wakeup".
      Signed-off-by: Andiry Xu <andiry.xu@amd.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: stable@vger.kernel.org
  8. 11 Apr, 2012 2 commits
    • D
      xHCI: use gfp flags from caller instead of GFP_ATOMIC · 3fc8206d
      Dan Carpenter committed
      The caller is allowed to specify the GFP flags for these functions.
      We should prefer their flags unless we have good reason.  For
      example, if we take a spin_lock ourselves we'd need to use
      GFP_ATOMIC.  But in this case it's safe to use the caller's GFP
      flags.
      
      The callers all pass GFP_ATOMIC here, so this change doesn't affect
      how the kernel behaves but we may add other callers later and this
      is a cleanup.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
    • F
      xhci: don't re-enable IE constantly · 4e833c0b
      Felipe Balbi committed
      While we're at that, define IMAN bitfield to aid readability.
      
      The interrupt enable bit should be set once on driver init, and we
      shouldn't need to continually re-enable it.  Commit c21599a3 introduced
      a read of the irq_pending register, and that allows us to preserve the
      state of the IE bit.  Before that commit, we were blindly writing 0x3 to
      the register.
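
The difference between the two behaviors can be sketched on a plain 32-bit value. The bit positions follow the IMAN register layout (interrupt pending in bit 0, interrupt enable in bit 1); the two helper functions are hypothetical and only compute the value that would be written back after reading the register:

```c
#include <stdint.h>

#define IMAN_IP (1u << 0)  /* interrupt pending, cleared by writing 1 */
#define IMAN_IE (1u << 1)  /* interrupt enable */

/* Old behavior: OR in 0x3, which sets IE on every acknowledgement. */
uint32_t iman_ack_value_old(uint32_t reg)
{
	return reg | 0x3;
}

/* New behavior: acknowledge the pending bit and leave IE as it was read. */
uint32_t iman_ack_value_new(uint32_t reg)
{
	return reg | IMAN_IP;
}
```

If IE was deliberately cleared, the old variant turns interrupts back on as a side effect of acknowledging one, while the new variant keeps them off.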
      
      This patch should be backported to kernels as old as 2.6.36, or ones
      that contain the commit c21599a3 "USB:
      xhci: Reduce reads and writes of interrupter registers".
      Signed-off-by: Felipe Balbi <balbi@ti.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: stable@vger.kernel.org
  9. 14 Mar, 2012 3 commits
  10. 13 Mar, 2012 1 commit
  11. 02 Mar, 2012 1 commit
  12. 15 Feb, 2012 3 commits
    • S
      USB/xHCI: Support device-initiated USB 3.0 resume. · 4ee823b8
      Sarah Sharp committed
      USB 3.0 hubs don't have a port suspend change bit (that bit is now
      reserved).  Instead, when a host-initiated resume finishes, the hub sets
      the port link state change bit.
      
      When a USB 3.0 device initiates remote wakeup, the parent hubs with
      their upstream links in U3 will pass the LFPS up the chain.  The first
      hub that has an upstream link in U0 (which may be the roothub) will
      reflect that LFPS back down the path to the device.
      
      However, the parent hubs in the resumed path will not set their link
      state change bit.  Instead, the device that initiated the resume has to
      send an asynchronous "Function Wake" Device Notification up to the host
      controller.  Therefore, we need a way to notify the USB core of a device
      resume without going through the normal hub URB completion method.
      
      First, make the xHCI roothub act like an external USB 3.0 hub and not
      pass up the port link state change bit when a device-initiated resume
      finishes.  Introduce a new xHCI bit field, port_remote_wakeup, so that
      we can tell the difference between a port coming out of the U3Exit state
      (host-initiated resume) and the RExit state (ending state of
      device-initiated resume).
      
      Since the USB core can't tell whether a port on a hub has resumed by
      looking at the Hub Status buffer, we need to introduce a bitfield,
      wakeup_bits, that indicates which ports have resumed.  When the xHCI
      driver notices a port finishing a device-initiated resume, we call into
      a new USB core function, usb_wakeup_notification(), that will set
      the right bit in wakeup_bits, and kick khubd for that hub.
      
      We also call usb_wakeup_notification() when the Function Wake Device
      Notification is received by the xHCI driver.  This covers the case where
      the link between the roothub and the first-tier hub is in U0, and the
      hub reflects the resume signaling back to the device without giving any
      indication it has done so until the device sends the Function Wake
      notification.
      
      Change the code in khubd that handles the remote wakeup to look at the
      state the USB core thinks the device is in, and handle the remote wakeup
      if the port's wakeup bit is set.
      
      This patch only takes care of the case where the device is attached
      directly to the roothub, or the USB 3.0 hub that is attached to the root
      hub is the device sending the Function Wake Device Notification (e.g.
      because a new USB device was attached).  The other cases will be covered
      in a second patch.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
    • S
      USB/xhci: Enable remote wakeup for USB3 devices. · 623bef9e
      Sarah Sharp committed
      When the USB 3.0 hub support went in, I disabled selective suspend for
      all external USB 3.0 hubs because they used a different mechanism to
      enable remote wakeup.  In fact, other USB 3.0 devices that could signal
      remote wakeup would have been prevented from going into suspend because
      they would have stalled the SetFeature Device Remote Wakeup request.
      
      This patch adds support for the USB 3.0 way of enabling remote wake up
      (with a SetFeature Function Suspend request), and enables selective
      suspend for all hubs during hub_probe.  It assumes that all USB 3.0 devices have
      only one "function" as defined by the interface association descriptor,
      which is true of all the USB 3.0 devices I've seen so far.  FIXME if
      that turns out to change later.
      
      After a device signals a remote wakeup, it is supposed to send a Device
      Notification packet to the host controller, signaling which function
      sent the remote wakeup.  The host can then put any other functions back
      into function suspend.  Since we don't have support for function suspend
      (and no devices currently support it), we'll just assume the hub
      function will resume the device properly when it receives the port
      status change notification, and simply ignore any device notification
      events from the xHCI host controller.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
    • S
      xHCI: Kick khubd when USB3 resume really completes. · d93814cf
      Sarah Sharp committed
      xHCI roothubs go through slightly different port state machines when
      either a device initiates a remote wakeup and signals resume, or when
      the host initiates a resume.
      
      According to section 4.19.1.2.13 of the xHCI 1.0 spec, on host-initiated
      resume, the xHC port state machine automatically goes through the U3Exit
      state into the U0 state, setting the port link state change (PLC) bit in
      the process.
      
      When a device initiates resume, the xHCI port state machine goes into
      the "Resume" state and sets the PLC bit.  Then the xHCI driver writes U0
      into the port link state register to transition the port to U0 from the
      Resume state.
      
      We can't be sure the device is actually in the U0 state until we receive
      the next port status change event with the PLC bit set.  We really don't
      want khubd to be polling the roothub port status bits until the device
      is really in U0.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Acked-by: Andiry Xu <andiry.xu@amd.com>
  13. 26 Jan, 2012 1 commit
  14. 11 Jan, 2012 1 commit
    • S
      xhci: Fix USB 3.0 device restart on resume. · d0cd5d48
      Sarah Sharp committed
      The xHCI hub port code gets passed a zero-based port number by the USB
      core.  It then adds one to it in order to find a device slot by port number
      and device speed by calling xhci_find_slot_id_by_port.  That function
      clearly states it requires a one-based port number.  The xHCI port
      status change event handler was using a zero-based port number that it
      got from find_faked_portnum_from_hw_portnum, not a one-based port
      number.  This led to the doorbells never being rung for a device after
      a resume, or worse, a different device with the same speed having its
      doorbell rung (which could lead to bad power management in the xHCI host
      controller).
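
Both failure modes can be reproduced with a toy lookup table. Everything here is a hypothetical simplification: `struct slot`, `find_slot_id_by_port_sketch()` and `slot_for_faked_portnum()` stand in for the driver's slot bookkeeping, and device speed is ignored.

```c
#define MAX_SLOTS 4

struct slot {
	int in_use;
	int port;  /* one-based root hub port the device is attached to */
};

/* Expects a one-based port number. Returns a one-based slot ID, or 0. */
int find_slot_id_by_port_sketch(const struct slot *slots, int port)
{
	for (int i = 0; i < MAX_SLOTS; i++) {
		if (slots[i].in_use && slots[i].port == port)
			return i + 1;
	}
	return 0;
}

/* Correct call from a handler that holds a zero-based port index. */
int slot_for_faked_portnum(const struct slot *slots, int faked_port_index)
{
	return find_slot_id_by_port_sketch(slots, faked_port_index + 1);
}
```

Passing the zero-based index straight through either finds no slot at all (index 0) or finds the device on the neighboring port, which matches the two symptoms described above.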
      
      This patch should be backported to kernels as old as 2.6.39.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Acked-by: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@vger.kernel.org
  15. 05 Jan, 2012 1 commit
    • S
      xhci: Clean up 32-bit build warnings. · e910b440
      Sarah Sharp committed
      Randy Dunlap points out that commit 9258c0b2 "xhci: Better debugging for
      critical host errors." introduces some new build warnings on 32-bit
      builds:
      
      drivers/usb/host/xhci-ring.c:1936:3: warning: format '%016llx' expects type 'long long unsigned int', but argument 3 has type 'dma_addr_t'
      drivers/usb/host/xhci-ring.c:1958:3: warning: format '%016llx' expects type 'long long unsigned int', but argument 3 has type 'dma_addr_t'
      
      Cast the results of xhci_trb_virt_to_dma() from a dma_addr_t to an
      unsigned long long.
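
A userspace sketch of the same cast, assuming a 32-bit address type. `dma_addr_sketch_t` and `format_dma()` are hypothetical stand-ins for dma_addr_t and the driver's warning call:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for dma_addr_t on a 32-bit build. */
typedef uint32_t dma_addr_sketch_t;

/*
 * %llx always expects an unsigned long long, so the address is widened
 * explicitly. Without the cast, the argument type depends on the build
 * and the compiler warns when it is narrower.
 */
int format_dma(char *buf, size_t len, dma_addr_sketch_t addr)
{
	return snprintf(buf, len, "@%016llx", (unsigned long long)addr);
}
```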
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Reported-by: Randy Dunlap <rdunlap@xenotime.net>
  16. 03 Jan, 2012 1 commit
    • S
      xhci: Better debugging for critical host errors. · 9258c0b2
      Sarah Sharp committed
      When a host controller gives a bad event TRB, we should print out the
      contents of the TRB as a warning so that users don't have to recompile
      their kernel to get information about what went wrong.  Also, print out
      the event ring if they have xHCI debugging turned on, since previous
      events can often explain what happened before the bad TRB occurred.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
  17. 23 Dec, 2011 5 commits
  18. 10 Dec, 2011 1 commit
    • C
      usb: fix number of mapped SG DMA entries · bc677d5b
      Clemens Ladisch committed
      Add a new field num_mapped_sgs to struct urb so that we have a place to
      store the number of mapped entries and can also retain the original
      value of entries in num_sgs.  Previously, usb_hcd_map_urb_for_dma()
      would overwrite this with the number of mapped entries, which would
      break dma_unmap_sg() because it requires the original number of entries.
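
The bookkeeping can be sketched without any real DMA. `struct urb_sketch`, `merge_all()`, `map_urb_sketch()` and `unmap_count_sketch()` are hypothetical; `merge_all()` merely imitates a mapping that coalesces every entry into one, as in the warning below where four entries were mapped as one:

```c
struct urb_sketch {
	int num_sgs;         /* entries in the original scatterlist */
	int num_mapped_sgs;  /* entries after mapping, possibly fewer */
};

/* Imitates a mapping call that merges all entries into a single one. */
int merge_all(int nents)
{
	return nents > 0 ? 1 : 0;
}

/* Store the mapped count in its own field and leave num_sgs untouched. */
int map_urb_sketch(struct urb_sketch *urb, int (*map_sg)(int nents))
{
	int n = map_sg(urb->num_sgs);

	if (n <= 0)
		return -1;
	urb->num_mapped_sgs = n;
	return 0;
}

/* Unmapping must be given the original entry count, not the mapped one. */
int unmap_count_sketch(const struct urb_sketch *urb)
{
	return urb->num_sgs;
}
```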
      
      This fixes warnings like the following when using USB storage devices:
       ------------[ cut here ]------------
       WARNING: at lib/dma-debug.c:902 check_unmap+0x4e4/0x695()
       ehci_hcd 0000:00:12.2: DMA-API: device driver frees DMA sg list with different entry count [map count=4] [unmap count=1]
       Modules linked in: ohci_hcd ehci_hcd
       Pid: 0, comm: kworker/0:1 Not tainted 3.2.0-rc2+ #319
       Call Trace:
        <IRQ>  [<ffffffff81036d3b>] warn_slowpath_common+0x80/0x98
        [<ffffffff81036de7>] warn_slowpath_fmt+0x41/0x43
        [<ffffffff811fa5ae>] check_unmap+0x4e4/0x695
        [<ffffffff8105e92c>] ? trace_hardirqs_off+0xd/0xf
        [<ffffffff8147208b>] ? _raw_spin_unlock_irqrestore+0x33/0x50
        [<ffffffff811fa84a>] debug_dma_unmap_sg+0xeb/0x117
        [<ffffffff8137b02f>] usb_hcd_unmap_urb_for_dma+0x71/0x188
        [<ffffffff8137b166>] unmap_urb_for_dma+0x20/0x22
        [<ffffffff8137b1c5>] usb_hcd_giveback_urb+0x5d/0xc0
        [<ffffffffa0000d02>] ehci_urb_done+0xf7/0x10c [ehci_hcd]
        [<ffffffffa0001140>] qh_completions+0x429/0x4bd [ehci_hcd]
        [<ffffffffa000340a>] ehci_work+0x95/0x9c0 [ehci_hcd]
        ...
       ---[ end trace f29ac88a5a48c580 ]---
       Mapped at:
        [<ffffffff811faac4>] debug_dma_map_sg+0x45/0x139
        [<ffffffff8137bc0b>] usb_hcd_map_urb_for_dma+0x22e/0x478
        [<ffffffff8137c494>] usb_hcd_submit_urb+0x63f/0x6fa
        [<ffffffff8137d01c>] usb_submit_urb+0x2c7/0x2de
        [<ffffffff8137dcd4>] usb_sg_wait+0x55/0x161
      Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  19. 15 Nov, 2011 1 commit
    • A
      USB: Remove the SAW_IRQ hcd flag · 968b822c
      Alan Stern committed
      The HCD_FLAG_SAW_IRQ flag was introduced in order to catch IRQ routing
      errors: If an URB was unlinked and the host controller hadn't gotten
      any IRQs, it seemed likely that the IRQs were directed to the wrong
      vector.
      
      This warning hasn't come up in many years, as far as I know; interrupt
      routing now seems to be well under control.  Therefore there's no
      reason to keep the flag around any more.  This patch (as1495) finally
      removes it.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  20. 03 Nov, 2011 1 commit
    • D
      usb, xhci: fix lockdep warning on endpoint timeout · f43d6231
      Don Zickus committed
      While debugging a usb3 problem, I stumbled upon this lockdep warning.
      
      Oct 18 21:41:17 dhcp47-74 kernel: =================================
      Oct 18 21:41:17 dhcp47-74 kernel: [ INFO: inconsistent lock state ]
      Oct 18 21:41:17 dhcp47-74 kernel: 3.1.0-rc4nmi+ #456
      Oct 18 21:41:17 dhcp47-74 kernel: ---------------------------------
      Oct 18 21:41:17 dhcp47-74 kernel: inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
      Oct 18 21:41:17 dhcp47-74 kernel: swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
      Oct 18 21:41:17 dhcp47-74 kernel: (&(&xhci->lock)->rlock){?.-...}, at: [<ffffffffa0228990>] xhci_stop_endpoint_command_watchdog+0x30/0x340 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: {IN-HARDIRQ-W} state was registered at:
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8109a941>] __lock_acquire+0x781/0x1660
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8109bed7>] lock_acquire+0x97/0x170
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81501b46>] _raw_spin_lock+0x46/0x80
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa02299fa>] xhci_irq+0x3a/0x1960 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa022b351>] xhci_msi_irq+0x31/0x40 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810d2305>] handle_irq_event_percpu+0x85/0x320
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810d25e8>] handle_irq_event+0x48/0x70
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810d537d>] handle_edge_irq+0x6d/0x130
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810048c9>] handle_irq+0x49/0xa0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8150d56d>] do_IRQ+0x5d/0xe0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff815029b0>] ret_from_intr+0x0/0x13
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81388aca>] usb_set_device_state+0x8a/0x180
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8138f038>] usb_add_hcd+0x2b8/0x730
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa022ed7e>] xhci_pci_probe+0x9e/0xd4 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8127915f>] local_pci_probe+0x5f/0xd0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8127a569>] pci_device_probe+0x119/0x120
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81334473>] driver_probe_device+0xa3/0x2c0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8133473b>] __driver_attach+0xab/0xb0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8133373c>] bus_for_each_dev+0x6c/0xa0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff813341fe>] driver_attach+0x1e/0x20
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81333b88>] bus_add_driver+0x1f8/0x2b0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81334df6>] driver_register+0x76/0x140
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8127a7c6>] __pci_register_driver+0x66/0xe0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa013c04a>] snd_timer_find+0x4a/0x70 [snd_timer]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa013c00e>] snd_timer_find+0xe/0x70 [snd_timer]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810001d3>] do_one_initcall+0x43/0x180
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810a9ed2>] sys_init_module+0x92/0x1f0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8150ab6b>] system_call_fastpath+0x16/0x1b
      Oct 18 21:41:17 dhcp47-74 kernel: irq event stamp: 631984
      Oct 18 21:41:17 dhcp47-74 kernel: hardirqs last  enabled at (631984): [<ffffffff81502720>] _raw_spin_unlock_irq+0x30/0x50
      Oct 18 21:41:17 dhcp47-74 kernel: hardirqs last disabled at (631983): [<ffffffff81501c49>] _raw_spin_lock_irq+0x19/0x90
      Oct 18 21:41:17 dhcp47-74 kernel: softirqs last  enabled at (631980): [<ffffffff8105ff63>] _local_bh_enable+0x13/0x20
      Oct 18 21:41:17 dhcp47-74 kernel: softirqs last disabled at (631981): [<ffffffff8150ce6c>] call_softirq+0x1c/0x30
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel: other info that might help us debug this:
      Oct 18 21:41:17 dhcp47-74 kernel: Possible unsafe locking scenario:
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel:       CPU0
      Oct 18 21:41:17 dhcp47-74 kernel:       ----
      Oct 18 21:41:17 dhcp47-74 kernel:  lock(&(&xhci->lock)->rlock);
      Oct 18 21:41:17 dhcp47-74 kernel:  <Interrupt>
      Oct 18 21:41:17 dhcp47-74 kernel:    lock(&(&xhci->lock)->rlock);
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel: *** DEADLOCK ***
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel: 1 lock held by swapper/0:
      Oct 18 21:41:17 dhcp47-74 kernel: #0:  (&ep->stop_cmd_timer){+.-...}, at: [<ffffffff8106abf2>] run_timer_softirq+0x162/0x570
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel: stack backtrace:
      Oct 18 21:41:17 dhcp47-74 kernel: Pid: 0, comm: swapper Tainted: G        W   3.1.0-rc4nmi+ #456
      Oct 18 21:41:17 dhcp47-74 kernel: Call Trace:
      Oct 18 21:41:17 dhcp47-74 kernel: <IRQ>  [<ffffffff81098ed7>] print_usage_bug+0x227/0x270
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff810999c6>] mark_lock+0x346/0x410
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8109a7de>] __lock_acquire+0x61e/0x1660
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81099893>] ? mark_lock+0x213/0x410
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8109bed7>] lock_acquire+0x97/0x170
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffffa0228990>] ? xhci_stop_endpoint_command_watchdog+0x30/0x340 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81501b46>] _raw_spin_lock+0x46/0x80
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffffa0228990>] ? xhci_stop_endpoint_command_watchdog+0x30/0x340 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffffa0228990>] xhci_stop_endpoint_command_watchdog+0x30/0x340 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8106abf2>] ? run_timer_softirq+0x162/0x570
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8106ac9d>] run_timer_softirq+0x20d/0x570
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8106abf2>] ? run_timer_softirq+0x162/0x570
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffffa0228960>] ? xhci_queue_isoc_tx_prepare+0x8e0/0x8e0 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff810604d2>] __do_softirq+0xf2/0x3f0
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81020edd>] ? lapic_next_event+0x1d/0x30
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81090d4e>] ? clockevents_program_event+0x5e/0x90
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8150ce6c>] call_softirq+0x1c/0x30
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8100484d>] do_softirq+0x8d/0xc0
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8105ff35>] irq_exit+0xe5/0x100
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8150d65e>] smp_apic_timer_interrupt+0x6e/0x99
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8150b6f0>] apic_timer_interrupt+0x70/0x80
      Oct 18 21:41:17 dhcp47-74 kernel: <EOI>  [<ffffffff81095d8d>] ? trace_hardirqs_off+0xd/0x10
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff812ddb76>] ? acpi_idle_enter_bm+0x227/0x25b
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff812ddb71>] ? acpi_idle_enter_bm+0x222/0x25b
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff813eda63>] cpuidle_idle_call+0x103/0x290
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81002155>] cpu_idle+0xe5/0x160
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff814e7f50>] rest_init+0xe0/0xf0
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff814e7e70>] ? csum_partial_copy_generic+0x170/0x170
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81df8e23>] start_kernel+0x3fc/0x407
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81df8321>] x86_64_start_reservations+0x131/0x135
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81df8412>] x86_64_start_kernel+0xed/0xf4
      Oct 18 21:41:17 dhcp47-74 kernel: xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command.
      Oct 18 21:41:17 dhcp47-74 kernel: xhci_hcd 0000:00:14.0: Assuming host is dying, halting host.
      Oct 18 21:41:17 dhcp47-74 kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
      Oct 18 21:41:17 dhcp47-74 kernel: usb 3-4: device descriptor read/8, error -110
      Oct 18 21:41:17 dhcp47-74 kernel: usb 3-4: device descriptor read/8, error -22
      Oct 18 21:41:17 dhcp47-74 kernel: hub 3-0:1.0: cannot disable port 4 (err = -19)
      
      Basically what is happening is in xhci_stop_endpoint_command_watchdog()
      the xhci->lock is grabbed with just spin_lock.  What lockdep deduces is
      that if an interrupt occurred while in this function it would deadlock
      with xhci_irq because that function also grabs the xhci->lock.
      
      Fixing it is trivial by using spin_lock_irqsave instead.
      
      This should be queued to stable kernels as far back as 2.6.33.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: stable@kernel.org
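
The lockdep report and the spin_lock_irqsave fix in the commit above can be modelled in plain userspace C. This is only a toy sketch, not kernel code: `toy_cpu`, `toy_spin_lock`, `toy_spin_lock_irqsave`, `toy_spin_unlock_irqrestore`, and `irq_would_deadlock` are hypothetical names that mimic the behaviour of the kernel primitives, and the model assumes a single CPU whose interrupt handler takes the same lock.

```c
#include <stdbool.h>

/* Toy single-CPU model (hypothetical names, not kernel code). */
struct toy_cpu {
	bool irqs_enabled;	/* can an interrupt handler run right now? */
	bool lock_held;		/* is the one driver lock currently taken? */
};

/* Mimics spin_lock(): takes the lock, leaves interrupts untouched. */
void toy_spin_lock(struct toy_cpu *cpu)
{
	cpu->lock_held = true;
}

/* Mimics spin_unlock(). */
void toy_spin_unlock(struct toy_cpu *cpu)
{
	cpu->lock_held = false;
}

/* Mimics spin_lock_irqsave(): save the interrupt state, disable
 * interrupts, then take the lock. */
unsigned long toy_spin_lock_irqsave(struct toy_cpu *cpu)
{
	unsigned long flags = cpu->irqs_enabled ? 1UL : 0UL;

	cpu->irqs_enabled = false;
	cpu->lock_held = true;
	return flags;
}

/* Mimics spin_unlock_irqrestore(): drop the lock, restore the saved
 * interrupt state. */
void toy_spin_unlock_irqrestore(struct toy_cpu *cpu, unsigned long flags)
{
	cpu->lock_held = false;
	cpu->irqs_enabled = (flags != 0);
}

/* An interrupt handler that needs the lock would spin forever if it can
 * preempt the lock holder on the same CPU. */
bool irq_would_deadlock(const struct toy_cpu *cpu)
{
	return cpu->irqs_enabled && cpu->lock_held;
}
```

In this model the plain lock leaves a window in which the interrupt handler could run and spin on a lock its own CPU already holds; the irqsave variant closes that window.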
  21. 27 Sep, 2011 4 commits
  22. 20 Sep, 2011 1 commit
    • A
      USB: xHCI: prevent infinite loop when processing MSE event · c2d7b49f
      Andiry Xu committed
      When an xHC host is unable to service an isochronous transfer within
      its interval, it reports a Missed Service Error (MSE) event and skips
      some TDs.
      
      Currently the xhci driver handles an MSE event as follows:
      
      1. On an MSE event, set the ep->skip flag, update the event ring
         dequeue pointer, and return.
      
      2. On the next event for this ep, run the do-while loop, fetching
         TDs from the ep's td_list to find the TD corresponding to this
         event.  All missed TDs are marked as short transfers (-EXDEV).
      
      The do-while loop ends in one of two ways:
      
      1. The TD pointed to by the event TRB is found;
      
      2. The ep ring's td_list is empty.
      
      However, if buggy hardware reports an unexpected event (for example, an
      overrun event following an MSE event while the ep ring is actually not
      empty), the driver will never find the TD, and it will loop until the
      td_list is empty.
      
      Unfortunately, the spinlock is dropped when a URB is given back inside
      the do-while loop.  While the spinlock is released, the class driver
      may still submit URBs and add TDs to the td_list.  This can be
      disastrous: the td_list never becomes empty, the loop never ends,
      and the system hangs.
      
      To fix this, count the number of TDs on the ep ring before skipping TDs,
      and quit the loop after skipping that many TDs.  This guarantees that
      the do-while loop ends after a bounded number of iterations, so the
      driver cannot be trapped in an infinite loop.
      Signed-off-by: Andiry Xu <andiry.xu@amd.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
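
The bounded skip loop described in the MSE commit above can be sketched in userspace C. This is a minimal model under stated assumptions, not the driver's code: `toy_ep`, `toy_giveback_one`, `toy_skip_missed_tds`, and the `td_num` counter are hypothetical names, the td_list is reduced to a count, and the event's TD is assumed never to be found (the buggy-hardware case).

```c
#include <stdbool.h>

/* Toy endpoint: only the number of TDs queued on td_list is modelled. */
struct toy_ep {
	int tds_queued;
	bool resubmit_on_giveback;	/* class driver queues a new TD on each giveback */
};

/* Give back one skipped TD.  The lock is "dropped" here, so the class
 * driver may immediately queue a replacement TD. */
static void toy_giveback_one(struct toy_ep *ep)
{
	ep->tds_queued--;
	if (ep->resubmit_on_giveback)
		ep->tds_queued++;
}

/*
 * Skip TDs for an event whose TD is never found.  The bound is the number
 * of TDs that were on the list before skipping began, so resubmissions
 * cannot extend the loop.  Returns the number of TDs skipped.
 */
int toy_skip_missed_tds(struct toy_ep *ep)
{
	int td_num = ep->tds_queued;	/* counted once, before the loop */
	int skipped = 0;

	while (ep->tds_queued > 0 && td_num > 0) {
		toy_giveback_one(ep);
		td_num--;
		skipped++;
	}
	return skipped;
}
```

Without the `td_num` bound, the resubmitting case would keep `tds_queued` above zero forever; with it, the loop stops after the original number of TDs even though the list is still non-empty.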
  23. 10 Sep, 2011 1 commit
    • S
      xhci: Don't print short isoc packets. · fd984d24
      Sarah Sharp committed
      Now that the xHCI driver always returns a status value of zero for isochronous
      URBs, when the last TD of an isochronous URB is short, the local variable
      "status" stays set to -EINPROGRESS.  When xHCI driver debugging is turned on,
      this causes the log file to fill with messages like this:
      
      [   38.859282] xhci_hcd 0000:00:14.0: Giveback URB ffff88013ad47800, len = 1408, expected = 580, status = -115
      
      Don't print out the status of an URB for isochronous URBs.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  24. 24 Aug, 2011 1 commit
    • K
      USB: use usb_endpoint_maxp() instead of le16_to_cpu() · 29cc8897
      Kuninori Morimoto committed
      Code under ${LINUX}/drivers/usb/* can now use usb_endpoint_maxp(desc) to get
      the maximum packet size instead of le16_to_cpu(desc->wMaxPacketSize).
      This patch makes that conversion.
      
      Cc: Armin Fuerst <fuerst@in.tum.de>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Johannes Erdfelt <johannes@erdfelt.com>
      Cc: Vojtech Pavlik <vojtech@suse.cz>
      Cc: Oliver Neukum <oliver@neukum.name>
      Cc: David Kubicek <dave@awk.cz>
      Cc: Johan Hovold <jhovold@gmail.com>
      Cc: Brad Hards <bhards@bigpond.net.au>
      Acked-by: Felipe Balbi <balbi@ti.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Dahlmann <dahlmann.thomas@arcor.de>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: David Lopo <dlopo@chipidea.mips.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Michal Nazarewicz <m.nazarewicz@samsung.com>
      Cc: Xie Xiaobo <X.Xie@freescale.com>
      Cc: Li Yang <leoli@freescale.com>
      Cc: Jiang Bo <tanya.jiang@freescale.com>
      Cc: Yuan-hsin Chen <yhchen@faraday-tech.com>
      Cc: Darius Augulis <augulis.darius@gmail.com>
      Cc: Xiaochen Shen <xiaochen.shen@intel.com>
      Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Cc: OKI SEMICONDUCTOR, <toshiharu-linux@dsn.okisemi.com>
      Cc: Robert Jarzmik <robert.jarzmik@free.fr>
      Cc: Ben Dooks <ben@simtec.co.uk>
      Cc: Thomas Abraham <thomas.ab@samsung.com>
      Cc: Herbert Pötzl <herbert@13thfloor.at>
      Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
      Cc: Roman Weissgaerber <weissg@vienna.at>
      Acked-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Tony Olech <tony.olech@elandigitalsystems.com>
      Cc: Florian Floe Echtler <echtler@fs.tum.de>
      Cc: Christian Lucht <lucht@codemercs.com>
      Cc: Juergen Stuber <starblue@sourceforge.net>
      Cc: Georges Toth <g.toth@e-biz.lu>
      Cc: Bill Ryder <bryder@sgi.com>
      Cc: Kuba Ober <kuba@mareimbrium.org>
      Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
      Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
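
The idea behind the usb_endpoint_maxp() conversion above, hiding the little-endian decode of wMaxPacketSize behind one accessor, can be illustrated in userspace C. This is a hedged sketch: `toy_endpoint_desc`, `toy_le16_to_cpu`, and `toy_endpoint_maxp` are hypothetical stand-ins, and the descriptor is reduced to the single field of interest stored as raw bytes.

```c
#include <stdint.h>

/* Toy stand-in for an endpoint descriptor: only wMaxPacketSize, kept as
 * two little-endian bytes as it appears on the wire. */
struct toy_endpoint_desc {
	uint8_t wMaxPacketSize[2];
};

/* Toy equivalent of le16_to_cpu(): decodes little-endian bytes correctly
 * on a host of any endianness. */
static uint16_t toy_le16_to_cpu(const uint8_t b[2])
{
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Toy accessor in the spirit of usb_endpoint_maxp(desc): callers no
 * longer repeat the byte-order conversion themselves. */
int toy_endpoint_maxp(const struct toy_endpoint_desc *desc)
{
	return toy_le16_to_cpu(desc->wMaxPacketSize);
}
```

A caller then writes `toy_endpoint_maxp(&desc)` wherever it previously open-coded the conversion, which keeps the byte-order handling in one place.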
  25. 17 Aug, 2011 1 commit
    • S
      xhci: Handle zero-length isochronous packets. · 48df4a6f
      Sarah Sharp committed
      For a long time, the xHCI driver has had this note:
      	/* FIXME: Ignoring zero-length packets, can those happen? */
      
      It turns out that, yes, there are drivers that need to queue zero-length
      transfers for isochronous OUT transfers.  Without this patch, users will
      see kernel hang messages when a driver attempts to enqueue an isochronous
      URB with a zero length transfer (because count_isoc_trbs_needed will return
      zero for that TD, xhci_td->last_trb will never be set, and updating the
      dequeue pointer will cause an infinite loop).
      
      Matěj ran into this issue when using an NI Audio4DJ USB soundcard
      with the snd-usb-caiaq driver.  See
      	https://bugzilla.kernel.org/show_bug.cgi?id=40702
      
      Fix count_isoc_trbs_needed() to return 1 for zero-length transfers (thanks
      to Alan for the math help).  Update the various TRB field calculations to deal
      with zero-length transfers.  We're still transferring one packet with a
      zero-length data payload, so the total_packet_count should be 1. The
      Transfer Burst Count (TBC) and Transfer Last Burst Packet Count (TLBPC)
      fields should be set to zero.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Tested-by: Matěj Laitl <matej@laitl.cz>
      Cc: Daniel Mack <zonque@gmail.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: stable@kernel.org
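
The zero-length case in the commit above comes down to a rounding-up division that yields 0 when there is nothing to transfer. A minimal sketch of the corrected count follows, assuming a 64 KB per-TRB buffer boundary; `toy_count_isoc_trbs` and `TOY_TRB_MAX_BUFF_SIZE` are hypothetical names and this is not the driver's actual function.

```c
#include <stdint.h>

#define TOY_TRB_MAX_BUFF_SIZE 65536u	/* assumed 64 KB per-TRB buffer boundary */

/*
 * Number of TRBs needed for one isochronous TD whose buffer starts at
 * addr and is td_len bytes long.  A TRB buffer may not cross a 64 KB
 * boundary, so the offset within the first 64 KB block counts too.
 * A zero-length TD still needs one TRB to carry the empty packet.
 */
unsigned int toy_count_isoc_trbs(uint64_t addr, unsigned int td_len)
{
	uint64_t span = (addr & (TOY_TRB_MAX_BUFF_SIZE - 1)) + td_len;
	unsigned int num_trbs =
		(unsigned int)((span + TOY_TRB_MAX_BUFF_SIZE - 1) / TOY_TRB_MAX_BUFF_SIZE);

	if (num_trbs == 0)	/* zero-length transfer at an aligned address */
		num_trbs = 1;
	return num_trbs;
}
```

Only an aligned zero-length buffer hits the `num_trbs == 0` branch; any other input already rounds up to at least one TRB.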
  26. 10 Aug, 2011 2 commits
    • S
      xhci: Remove TDs from TD lists when URBs are canceled. · 585df1d9
      Sarah Sharp committed
      When a driver tries to cancel an URB, and the host controller is dying,
      xhci_urb_dequeue will giveback the URB without removing the xhci_tds
      that comprise that URB from the td_list or the cancelled_td_list.  This
      can cause a race condition between the driver calling URB dequeue and
      the stop endpoint command watchdog timer.
      
      If the timer fires on a dying host, and a driver attempts to resubmit
      while the watchdog timer has dropped the xhci->lock to giveback a
      cancelled URB, URBs may be given back by the xhci_urb_dequeue() function.
      At that point, the URB's priv pointer will be freed and set to NULL, but
      the TDs will remain on the td_list.  This will cause an oops in
      xhci_giveback_urb_in_irq() when the watchdog timer attempts to loop
      through the endpoints' td_lists, giving back killed URBs.
      
      Make sure that xhci_urb_dequeue() removes TDs from the TD lists and
      canceled TD lists before it gives back the URB.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
    • S
      xhci: Fix failed enqueue in the middle of isoch TD. · 522989a2
      Sarah Sharp committed
      When an isochronous transfer is enqueued, xhci_queue_isoc_tx_prepare()
      will ensure that there is enough room on the transfer rings for all of the
      isochronous TDs for that URB.  However, when xhci_queue_isoc_tx() is
      enqueueing individual isoc TDs, the prepare_transfer() function can fail
      if the endpoint state has changed to disabled, error, or some other
      unknown state.
      
      With the current code, if the Nth TD (not the first TD) fails, the ring is
      left in a sorry state.  The partially enqueued TDs are left on the ring,
      and the first TRB of the TD is not given back to the hardware.  The
      enqueue pointer is left on the TRB after the last successfully enqueued
      TD.  This means the ring is basically useless.  Any new transfers will be
      enqueued after the failed TDs, which the hardware will never read because
      the cycle bit indicates it does not own them.  The ring will fill up with
      untransferred TDs, and the endpoint will be basically unusable.
      
      The untransferred TDs will also remain on the TD list.  Since the td_list
      is a FIFO, this basically means the ring handler will be waiting on TDs
      that will never be completed (or worse, dereference memory that doesn't
      exist any more).
      
      Change the code to clean up the isochronous ring after a failed transfer.
      If the first TD failed, simply return and allow the xhci_urb_enqueue
      function to free the urb_priv.  If the Nth TD failed, first remove the TDs
      from the td_list.  Then convert the TRBs that were enqueued into No-op
      TRBs.  Make sure to flip the cycle bit on all enqueued TRBs (including any
      link TRBs in the middle or between TDs), but leave the cycle bit of the
      first TRB (which will show software-owned) intact.  Then move the ring
      enqueue pointer back to the first TRB and make sure to change the
      xhci_ring's cycle state to what is appropriate for that ring segment.
      
      This ensures that the No-op TRBs will be overwritten by subsequent TDs,
      and the hardware will not start executing random TRBs because the cycle
      bit was left as hardware-owned.
      
      This bug is unlikely to be hit, but it was something I noticed while
      tracking down the watchdog timer issue.  I verified that the fix works by
      injecting some errors on the 250th isochronous URB queued, although I
      could not verify that the ring is in the correct state because uvcvideo
      refused to talk to the device after the first usb_submit_urb() failed.
      Ring debugging shows that the ring looks correct, however.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
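
The ring cleanup described in the commit above (turn the queued TRBs into No-ops, flip the cycle bit on all but the first, then rewind the enqueue pointer) can be sketched on a toy ring. All names here (`toy_trb`, `toy_ring`, `toy_cleanup_failed_isoc`) are hypothetical, and the sketch assumes a single segment with no link TRBs and no wrap-around, so the ring's cycle state never changes during queueing; the real driver also has to handle link TRBs and restore the cycle state for the segment it rewinds into.

```c
#include <stdbool.h>

#define TOY_RING_SIZE 8

enum toy_trb_type { TOY_TRB_ISOC, TOY_TRB_NOOP };

struct toy_trb {
	enum toy_trb_type type;
	bool cycle;		/* cycle bit as written in the TRB */
};

struct toy_ring {
	struct toy_trb trbs[TOY_RING_SIZE];
	int enqueue;		/* index of the next TRB to write */
	bool cycle_state;	/* cycle value that marks a TRB hardware-owned */
};

/*
 * Undo a partially queued URB whose TRBs occupy [first, ring->enqueue).
 * The first TRB was written with the inverted cycle bit and never handed
 * to hardware, so its cycle bit is left alone; every later TRB has its
 * cycle bit flipped back to software-owned.
 */
void toy_cleanup_failed_isoc(struct toy_ring *ring, int first)
{
	for (int i = first; i < ring->enqueue; i++) {
		ring->trbs[i].type = TOY_TRB_NOOP;
		if (i != first)
			ring->trbs[i].cycle = !ring->trbs[i].cycle;
	}
	ring->enqueue = first;	/* later TDs overwrite the No-op TRBs */
}
```

After the call, no TRB in the rewound range carries the hardware-owned cycle value, so the controller has nothing stale to execute and the next enqueue starts from the first TRB again.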