1. 03 Jan, 2012 1 commit
    • xhci: Better debugging for critical host errors. · 9258c0b2
      Sarah Sharp committed
      When a host controller gives a bad event TRB, we should print out the
      contents of the TRB as a warning so that users don't have to recompile
      their kernel to get information about what went wrong.  Also, print out
      the event ring if they have xHCI debugging turned on, since previous
      events can often explain what happened before the bad TRB occurred.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
  2. 23 Dec, 2011 5 commits
  3. 10 Dec, 2011 1 commit
    • usb: fix number of mapped SG DMA entries · bc677d5b
      Clemens Ladisch committed
      Add a new field num_mapped_sgs to struct urb so that we have a place to
      store the number of mapped entries and can also retain the original
      value of entries in num_sgs.  Previously, usb_hcd_map_urb_for_dma()
      would overwrite this with the number of mapped entries, which would
      break dma_unmap_sg() because it requires the original number of entries.
      
      This fixes warnings like the following when using USB storage devices:
       ------------[ cut here ]------------
       WARNING: at lib/dma-debug.c:902 check_unmap+0x4e4/0x695()
       ehci_hcd 0000:00:12.2: DMA-API: device driver frees DMA sg list with different entry count [map count=4] [unmap count=1]
       Modules linked in: ohci_hcd ehci_hcd
       Pid: 0, comm: kworker/0:1 Not tainted 3.2.0-rc2+ #319
       Call Trace:
        <IRQ>  [<ffffffff81036d3b>] warn_slowpath_common+0x80/0x98
        [<ffffffff81036de7>] warn_slowpath_fmt+0x41/0x43
        [<ffffffff811fa5ae>] check_unmap+0x4e4/0x695
        [<ffffffff8105e92c>] ? trace_hardirqs_off+0xd/0xf
        [<ffffffff8147208b>] ? _raw_spin_unlock_irqrestore+0x33/0x50
        [<ffffffff811fa84a>] debug_dma_unmap_sg+0xeb/0x117
        [<ffffffff8137b02f>] usb_hcd_unmap_urb_for_dma+0x71/0x188
        [<ffffffff8137b166>] unmap_urb_for_dma+0x20/0x22
        [<ffffffff8137b1c5>] usb_hcd_giveback_urb+0x5d/0xc0
        [<ffffffffa0000d02>] ehci_urb_done+0xf7/0x10c [ehci_hcd]
        [<ffffffffa0001140>] qh_completions+0x429/0x4bd [ehci_hcd]
        [<ffffffffa000340a>] ehci_work+0x95/0x9c0 [ehci_hcd]
        ...
       ---[ end trace f29ac88a5a48c580 ]---
       Mapped at:
        [<ffffffff811faac4>] debug_dma_map_sg+0x45/0x139
        [<ffffffff8137bc0b>] usb_hcd_map_urb_for_dma+0x22e/0x478
        [<ffffffff8137c494>] usb_hcd_submit_urb+0x63f/0x6fa
        [<ffffffff8137d01c>] usb_submit_urb+0x2c7/0x2de
        [<ffffffff8137dcd4>] usb_sg_wait+0x55/0x161
      Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
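The bookkeeping described in this commit can be sketched in user space. This is a toy model, not the kernel code: `struct toy_urb`, `toy_map_sg()`, and the `merge` factor are hypothetical names made up for illustration, and the "IOMMU merges adjacent entries" behaviour is an assumption used only to make the mapped count differ from the original count.

```c
#include <assert.h>

/* Toy stand-in for the two struct urb fields discussed above. */
struct toy_urb {
    int num_sgs;        /* entries the driver submitted; never overwritten */
    int num_mapped_sgs; /* entries actually mapped; may be smaller */
};

/*
 * Hypothetical mapping step: pretend every `merge` adjacent entries are
 * coalesced into one mapped entry. Returns the mapped count.
 */
int toy_map_sg(int nents, int merge)
{
    assert(nents > 0 && merge > 0);
    return (nents + merge - 1) / merge;
}

/* Map: store the result in its own field instead of clobbering num_sgs. */
void toy_map_urb(struct toy_urb *urb, int merge)
{
    urb->num_mapped_sgs = toy_map_sg(urb->num_sgs, merge);
}

/* Unmap: the count passed to the unmap call must be the original one. */
int toy_unmap_count(const struct toy_urb *urb)
{
    return urb->num_sgs;
}
```

With four submitted entries merged into one mapping, the unmap side still sees four, which is the pairing the DMA debug warning above complains about when it is violated.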
  4. 15 Nov, 2011 1 commit
    • USB: Remove the SAW_IRQ hcd flag · 968b822c
      Alan Stern committed
      The HCD_FLAG_SAW_IRQ flag was introduced in order to catch IRQ routing
      errors: If an URB was unlinked and the host controller hadn't gotten
      any IRQs, it seemed likely that the IRQs were directed to the wrong
      vector.
      
      This warning hasn't come up in many years, as far as I know; interrupt
      routing now seems to be well under control.  Therefore there's no
      reason to keep the flag around any more.  This patch (as1495) finally
      removes it.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  5. 03 Nov, 2011 1 commit
    • usb, xhci: fix lockdep warning on endpoint timeout · f43d6231
      Don Zickus committed
      While debugging a usb3 problem, I stumbled upon this lockdep warning.
      
      Oct 18 21:41:17 dhcp47-74 kernel: =================================
      Oct 18 21:41:17 dhcp47-74 kernel: [ INFO: inconsistent lock state ]
      Oct 18 21:41:17 dhcp47-74 kernel: 3.1.0-rc4nmi+ #456
      Oct 18 21:41:17 dhcp47-74 kernel: ---------------------------------
      Oct 18 21:41:17 dhcp47-74 kernel: inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
      Oct 18 21:41:17 dhcp47-74 kernel: swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
      Oct 18 21:41:17 dhcp47-74 kernel: (&(&xhci->lock)->rlock){?.-...}, at: [<ffffffffa0228990>] xhci_stop_endpoint_command_watchdog+0x30/0x340 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: {IN-HARDIRQ-W} state was registered at:
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8109a941>] __lock_acquire+0x781/0x1660
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8109bed7>] lock_acquire+0x97/0x170
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81501b46>] _raw_spin_lock+0x46/0x80
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa02299fa>] xhci_irq+0x3a/0x1960 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa022b351>] xhci_msi_irq+0x31/0x40 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810d2305>] handle_irq_event_percpu+0x85/0x320
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810d25e8>] handle_irq_event+0x48/0x70
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810d537d>] handle_edge_irq+0x6d/0x130
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810048c9>] handle_irq+0x49/0xa0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8150d56d>] do_IRQ+0x5d/0xe0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff815029b0>] ret_from_intr+0x0/0x13
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81388aca>] usb_set_device_state+0x8a/0x180
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8138f038>] usb_add_hcd+0x2b8/0x730
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa022ed7e>] xhci_pci_probe+0x9e/0xd4 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8127915f>] local_pci_probe+0x5f/0xd0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8127a569>] pci_device_probe+0x119/0x120
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81334473>] driver_probe_device+0xa3/0x2c0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8133473b>] __driver_attach+0xab/0xb0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8133373c>] bus_for_each_dev+0x6c/0xa0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff813341fe>] driver_attach+0x1e/0x20
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81333b88>] bus_add_driver+0x1f8/0x2b0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff81334df6>] driver_register+0x76/0x140
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8127a7c6>] __pci_register_driver+0x66/0xe0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa013c04a>] snd_timer_find+0x4a/0x70 [snd_timer]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffffa013c00e>] snd_timer_find+0xe/0x70 [snd_timer]
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810001d3>] do_one_initcall+0x43/0x180
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff810a9ed2>] sys_init_module+0x92/0x1f0
      Oct 18 21:41:17 dhcp47-74 kernel:  [<ffffffff8150ab6b>] system_call_fastpath+0x16/0x1b
      Oct 18 21:41:17 dhcp47-74 kernel: irq event stamp: 631984
      Oct 18 21:41:17 dhcp47-74 kernel: hardirqs last  enabled at (631984): [<ffffffff81502720>] _raw_spin_unlock_irq+0x30/0x50
      Oct 18 21:41:17 dhcp47-74 kernel: hardirqs last disabled at (631983): [<ffffffff81501c49>] _raw_spin_lock_irq+0x19/0x90
      Oct 18 21:41:17 dhcp47-74 kernel: softirqs last  enabled at (631980): [<ffffffff8105ff63>] _local_bh_enable+0x13/0x20
      Oct 18 21:41:17 dhcp47-74 kernel: softirqs last disabled at (631981): [<ffffffff8150ce6c>] call_softirq+0x1c/0x30
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel: other info that might help us debug this:
      Oct 18 21:41:17 dhcp47-74 kernel: Possible unsafe locking scenario:
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel:       CPU0
      Oct 18 21:41:17 dhcp47-74 kernel:       ----
      Oct 18 21:41:17 dhcp47-74 kernel:  lock(&(&xhci->lock)->rlock);
      Oct 18 21:41:17 dhcp47-74 kernel:  <Interrupt>
      Oct 18 21:41:17 dhcp47-74 kernel:    lock(&(&xhci->lock)->rlock);
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel: *** DEADLOCK ***
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel: 1 lock held by swapper/0:
      Oct 18 21:41:17 dhcp47-74 kernel: #0:  (&ep->stop_cmd_timer){+.-...}, at: [<ffffffff8106abf2>] run_timer_softirq+0x162/0x570
      Oct 18 21:41:17 dhcp47-74 kernel:
      Oct 18 21:41:17 dhcp47-74 kernel: stack backtrace:
      Oct 18 21:41:17 dhcp47-74 kernel: Pid: 0, comm: swapper Tainted: G        W   3.1.0-rc4nmi+ #456
      Oct 18 21:41:17 dhcp47-74 kernel: Call Trace:
      Oct 18 21:41:17 dhcp47-74 kernel: <IRQ>  [<ffffffff81098ed7>] print_usage_bug+0x227/0x270
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff810999c6>] mark_lock+0x346/0x410
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8109a7de>] __lock_acquire+0x61e/0x1660
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81099893>] ? mark_lock+0x213/0x410
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8109bed7>] lock_acquire+0x97/0x170
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffffa0228990>] ? xhci_stop_endpoint_command_watchdog+0x30/0x340 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81501b46>] _raw_spin_lock+0x46/0x80
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffffa0228990>] ? xhci_stop_endpoint_command_watchdog+0x30/0x340 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffffa0228990>] xhci_stop_endpoint_command_watchdog+0x30/0x340 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8106abf2>] ? run_timer_softirq+0x162/0x570
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8106ac9d>] run_timer_softirq+0x20d/0x570
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8106abf2>] ? run_timer_softirq+0x162/0x570
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffffa0228960>] ? xhci_queue_isoc_tx_prepare+0x8e0/0x8e0 [xhci_hcd]
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff810604d2>] __do_softirq+0xf2/0x3f0
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81020edd>] ? lapic_next_event+0x1d/0x30
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81090d4e>] ? clockevents_program_event+0x5e/0x90
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8150ce6c>] call_softirq+0x1c/0x30
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8100484d>] do_softirq+0x8d/0xc0
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8105ff35>] irq_exit+0xe5/0x100
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8150d65e>] smp_apic_timer_interrupt+0x6e/0x99
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff8150b6f0>] apic_timer_interrupt+0x70/0x80
      Oct 18 21:41:17 dhcp47-74 kernel: <EOI>  [<ffffffff81095d8d>] ? trace_hardirqs_off+0xd/0x10
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff812ddb76>] ? acpi_idle_enter_bm+0x227/0x25b
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff812ddb71>] ? acpi_idle_enter_bm+0x222/0x25b
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff813eda63>] cpuidle_idle_call+0x103/0x290
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81002155>] cpu_idle+0xe5/0x160
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff814e7f50>] rest_init+0xe0/0xf0
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff814e7e70>] ? csum_partial_copy_generic+0x170/0x170
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81df8e23>] start_kernel+0x3fc/0x407
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81df8321>] x86_64_start_reservations+0x131/0x135
      Oct 18 21:41:17 dhcp47-74 kernel: [<ffffffff81df8412>] x86_64_start_kernel+0xed/0xf4
      Oct 18 21:41:17 dhcp47-74 kernel: xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command.
      Oct 18 21:41:17 dhcp47-74 kernel: xhci_hcd 0000:00:14.0: Assuming host is dying, halting host.
      Oct 18 21:41:17 dhcp47-74 kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
      Oct 18 21:41:17 dhcp47-74 kernel: usb 3-4: device descriptor read/8, error -110
      Oct 18 21:41:17 dhcp47-74 kernel: usb 3-4: device descriptor read/8, error -22
      Oct 18 21:41:17 dhcp47-74 kernel: hub 3-0:1.0: cannot disable port 4 (err = -19)
      
      Basically what is happening is in xhci_stop_endpoint_command_watchdog()
      the xhci->lock is grabbed with just spin_lock.  What lockdep deduces is
      that if an interrupt occurred while in this function it would deadlock
      with xhci_irq because that function also grabs the xhci->lock.
      
      Fixing it is trivial by using spin_lock_irqsave instead.
      
      This should be queued to stable kernels as far back as 2.6.33.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: stable@kernel.org
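The difference between the two locking calls can be illustrated with a small single-CPU model. Everything here is hypothetical (`struct toy_cpu` and the `toy_*` helpers are not kernel APIs); it only captures the rule lockdep is checking: a lock that an interrupt handler also takes must not be held with interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model: one lock bit and one "interrupts enabled" bit. */
struct toy_cpu {
    bool irqs_enabled;
    bool lock_held;
};

/* Models spin_lock(): takes the lock, leaves interrupts as they were. */
void toy_spin_lock(struct toy_cpu *cpu)
{
    assert(!cpu->lock_held);
    cpu->lock_held = true;
}

/* Models spin_lock_irqsave(): save interrupt state, disable, then lock. */
bool toy_spin_lock_irqsave(struct toy_cpu *cpu)
{
    bool saved = cpu->irqs_enabled;

    cpu->irqs_enabled = false;
    toy_spin_lock(cpu);
    return saved;
}

/* Models spin_unlock_irqrestore(): unlock, then restore interrupt state. */
void toy_spin_unlock_irqrestore(struct toy_cpu *cpu, bool saved)
{
    cpu->lock_held = false;
    cpu->irqs_enabled = saved;
}

/*
 * Would an interrupt handler that needs the same lock spin forever if the
 * interrupt fired on this CPU right now?
 */
bool toy_irq_would_deadlock(const struct toy_cpu *cpu)
{
    return cpu->irqs_enabled && cpu->lock_held;
}
```

In this model the plain lock leaves a window in which the interrupt handler could spin on a lock its own CPU holds, while the irqsave variant closes that window and restores the previous interrupt state on unlock.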
  6. 27 Sep, 2011 4 commits
  7. 20 Sep, 2011 1 commit
    • USB: xHCI: prevent infinite loop when processing MSE event · c2d7b49f
      Andiry Xu committed
      When an xHC host is unable to handle an isochronous transfer in the
      interval, it reports a Missed Service Error (MSE) event and skips some TDs.

      Currently the xhci driver handles an MSE event as follows:

      1. When it encounters an MSE event, it sets the ep->skip flag, updates
         the event ring dequeue pointer, and returns.

      2. When it encounters the next event on this ep, the driver runs the
         do-while loop, fetching TDs from the ep's td_list to find the TD
         corresponding to this event.  All missed TDs are marked as short
         transfers (-EXDEV).

      The do-while loop ends in one of two ways:

      1. The TD pointed to by the event TRB is found;

      2. The ep ring's td_list is empty.

      However, if buggy hardware reports an unexpected event (for example, an
      overrun event following an MSE event while the ep ring is actually not
      empty), the driver will never find the TD, and it will loop until the
      td_list is empty.

      Unfortunately, the spinlock is dropped when an URB is given back in the
      do-while loop.  While the spinlock is released, the class driver
      may still submit URBs and add TDs to the td_list.  This can be
      disastrous: the td_list never becomes empty, the loop never ends,
      and the system hangs.

      To fix this, count the number of TDs on the ep ring before skipping TDs,
      and quit the loop once that number of TDs has been skipped.  This
      guarantees that the do-while loop ends after a bounded number of
      iterations, so the driver cannot be trapped in an infinite loop.
      Signed-off-by: Andiry Xu <andiry.xu@amd.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
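The bounded-loop idea from this commit can be sketched as follows. This is a simplified model under stated assumptions: `struct toy_ep` and `toy_skip_missed_tds()` are hypothetical names, the "matching TD found" exit is left out on purpose to model the buggy-hardware case, and `resubmit` stands in for a class driver that queues a new TD every time one is given back.

```c
#include <assert.h>

/* Toy endpoint: only the number of TDs currently queued matters here. */
struct toy_ep {
    int queued_tds;
};

/*
 * Skip missed TDs without ever finding a matching one. A non-zero
 * `resubmit` models a driver that adds a TD each time one is given back
 * while the lock is dropped. Returns the number of TDs skipped.
 */
int toy_skip_missed_tds(struct toy_ep *ep, int resubmit)
{
    int budget = ep->queued_tds; /* counted once, before skipping starts */
    int skipped = 0;

    assert(budget >= 0);
    while (ep->queued_tds > 0 && skipped < budget) {
        ep->queued_tds--;        /* give one TD back as missed */
        skipped++;
        if (resubmit)
            ep->queued_tds++;    /* the driver refills the list */
    }
    return skipped;
}
```

Without the `skipped < budget` condition, the resubmitting case would never see an empty list; with it, the loop stops after as many iterations as there were TDs when skipping began.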
  8. 10 Sep, 2011 1 commit
    • xhci: Don't print short isoc packets. · fd984d24
      Sarah Sharp committed
      Now that the xHCI driver always returns a status value of zero for isochronous
      URBs, when the last TD of an isochronous URB is short, the local variable
      "status" stays set to -EINPROGRESS.  When xHCI driver debugging is turned on,
      this causes the log file to fill with messages like this:
      
      [   38.859282] xhci_hcd 0000:00:14.0: Giveback URB ffff88013ad47800, len = 1408, expected = 580, status = -115
      
      Don't print out the status of an URB for isochronous URBs.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  9. 24 Aug, 2011 1 commit
    • USB: use usb_endpoint_maxp() instead of le16_to_cpu() · 29cc8897
      Kuninori Morimoto committed
      Now ${LINUX}/drivers/usb/* can use usb_endpoint_maxp(desc) to get the maximum packet size
      instead of le16_to_cpu(desc->wMaxPacketSize).
      This patch fixes up the existing users.
      
      Cc: Armin Fuerst <fuerst@in.tum.de>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Johannes Erdfelt <johannes@erdfelt.com>
      Cc: Vojtech Pavlik <vojtech@suse.cz>
      Cc: Oliver Neukum <oliver@neukum.name>
      Cc: David Kubicek <dave@awk.cz>
      Cc: Johan Hovold <jhovold@gmail.com>
      Cc: Brad Hards <bhards@bigpond.net.au>
      Acked-by: Felipe Balbi <balbi@ti.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Dahlmann <dahlmann.thomas@arcor.de>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: David Lopo <dlopo@chipidea.mips.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Michal Nazarewicz <m.nazarewicz@samsung.com>
      Cc: Xie Xiaobo <X.Xie@freescale.com>
      Cc: Li Yang <leoli@freescale.com>
      Cc: Jiang Bo <tanya.jiang@freescale.com>
      Cc: Yuan-hsin Chen <yhchen@faraday-tech.com>
      Cc: Darius Augulis <augulis.darius@gmail.com>
      Cc: Xiaochen Shen <xiaochen.shen@intel.com>
      Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Cc: OKI SEMICONDUCTOR, <toshiharu-linux@dsn.okisemi.com>
      Cc: Robert Jarzmik <robert.jarzmik@free.fr>
      Cc: Ben Dooks <ben@simtec.co.uk>
      Cc: Thomas Abraham <thomas.ab@samsung.com>
      Cc: Herbert Pötzl <herbert@13thfloor.at>
      Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
      Cc: Roman Weissgaerber <weissg@vienna.at>
      Acked-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Tony Olech <tony.olech@elandigitalsystems.com>
      Cc: Florian Floe Echtler <echtler@fs.tum.de>
      Cc: Christian Lucht <lucht@codemercs.com>
      Cc: Juergen Stuber <starblue@sourceforge.net>
      Cc: Georges Toth <g.toth@e-biz.lu>
      Cc: Bill Ryder <bryder@sgi.com>
      Cc: Kuba Ober <kuba@mareimbrium.org>
      Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
      Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
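What such an accessor does can be shown with a user-space stand-in. This is only a sketch: `struct toy_endpoint_desc` and `toy_endpoint_maxp()` are hypothetical, and the byte-array layout is an assumption chosen so the little-endian decoding is explicit rather than hidden behind le16_to_cpu().

```c
#include <assert.h>
#include <stdint.h>

/* Toy endpoint descriptor: the field is stored little-endian. */
struct toy_endpoint_desc {
    uint8_t wMaxPacketSize[2];
};

/* One named helper instead of open-coded byte-order conversion. */
unsigned int toy_endpoint_maxp(const struct toy_endpoint_desc *desc)
{
    assert(desc != 0);
    return (unsigned int)desc->wMaxPacketSize[0] |
           ((unsigned int)desc->wMaxPacketSize[1] << 8);
}
```

Callers then read the maximum packet size through one function, so the conversion lives in a single place.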
  10. 17 Aug, 2011 1 commit
    • xhci: Handle zero-length isochronous packets. · 48df4a6f
      Sarah Sharp committed
      For a long time, the xHCI driver has had this note:
      	/* FIXME: Ignoring zero-length packets, can those happen? */
      
      It turns out that, yes, there are drivers that need to queue zero-length
      transfers for isochronous OUT transfers.  Without this patch, users will
      see kernel hang messages when a driver attempts to enqueue an isochronous
      URB with a zero length transfer (because count_isoc_trbs_needed will return
      zero for that TD, xhci_td->last_trb will never be set, and updating the
      dequeue pointer will cause an infinite loop).
      
      Matěj ran into this issue when using an NI Audio4DJ USB soundcard
      with the snd-usb-caiaq driver.  See
      	https://bugzilla.kernel.org/show_bug.cgi?id=40702
      
      Fix count_isoc_trbs_needed() to return 1 for zero-length transfers (thanks
      to Alan for the math help).  Update the various TRB field calculations to deal
      with zero-length transfers.  We're still transferring one packet with a
      zero-length data payload, so the total_packet_count should be 1. The
      Transfer Burst Count (TBC) and Transfer Last Burst Packet Count (TLBPC)
      fields should be set to zero.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Tested-by: Matěj Laitl <matej@laitl.cz>
      Cc: Daniel Mack <zonque@gmail.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: stable@kernel.org
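The counting rule described in this commit can be sketched in user space. The names are hypothetical (`toy_count_isoc_trbs_needed`, `TOY_TRB_MAX_BUFF_SIZE`), and the 64 KB per-TRB buffer boundary is an assumption of this sketch; the point is only that a zero-length TD must still occupy one TRB.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_TRB_MAX_BUFF_SIZE (1u << 16) /* assumed 64 KB TRB boundary */

/*
 * Number of TRBs needed for one isochronous TD whose buffer starts at
 * `addr` and is `td_len` bytes long. A buffer may not cross a 64 KB
 * boundary within a single TRB, so round up over the boundaries touched.
 */
unsigned int toy_count_isoc_trbs_needed(uint64_t addr, unsigned int td_len)
{
    uint64_t span = td_len + (addr & (TOY_TRB_MAX_BUFF_SIZE - 1));
    unsigned int num_trbs = (unsigned int)
        ((span + TOY_TRB_MAX_BUFF_SIZE - 1) / TOY_TRB_MAX_BUFF_SIZE);

    /* A zero-length packet at an aligned address would otherwise need 0. */
    if (num_trbs == 0)
        num_trbs = 1;

    assert(num_trbs >= 1);
    return num_trbs;
}
```

The rounding alone already yields 1 for a zero-length TD at an unaligned address; the explicit check covers the aligned case, where the division would return zero.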
  11. 10 Aug, 2011 3 commits
    • xhci: Remove TDs from TD lists when URBs are canceled. · 585df1d9
      Sarah Sharp committed
      When a driver tries to cancel an URB, and the host controller is dying,
      xhci_urb_dequeue will giveback the URB without removing the xhci_tds
      that comprise that URB from the td_list or the cancelled_td_list.  This
      can cause a race condition between the driver calling URB dequeue and
      the stop endpoint command watchdog timer.
      
      If the timer fires on a dying host, and a driver attempts to resubmit
      while the watchdog timer has dropped the xhci->lock to giveback a
      cancelled URB, URBs may be given back by the xhci_urb_dequeue() function.
      At that point, the URB's priv pointer will be freed and set to NULL, but
      the TDs will remain on the td_list.  This will cause an oops in
      xhci_giveback_urb_in_irq() when the watchdog timer attempts to loop
      through the endpoints' td_lists, giving back killed URBs.
      
      Make sure that xhci_urb_dequeue() removes TDs from the TD lists and
      canceled TD lists before it gives back the URB.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
    • xhci: Fix failed enqueue in the middle of isoch TD. · 522989a2
      Sarah Sharp committed
      When an isochronous transfer is enqueued, xhci_queue_isoc_tx_prepare()
      will ensure that there is enough room on the transfer rings for all of the
      isochronous TDs for that URB.  However, when xhci_queue_isoc_tx() is
      enqueueing individual isoc TDs, the prepare_transfer() function can fail
      if the endpoint state has changed to disabled, error, or some other
      unknown state.
      
      With the current code, if the Nth TD (not the first TD) fails, the ring is
      left in a sorry state.  The partially enqueued TDs are left on the ring,
      and the first TRB of the TD is not given back to the hardware.  The
      enqueue pointer is left on the TRB after the last successfully enqueued
      TD.  This means the ring is basically useless.  Any new transfers will be
      enqueued after the failed TDs, which the hardware will never read because
      the cycle bit indicates it does not own them.  The ring will fill up with
      untransferred TDs, and the endpoint will be basically unusable.
      
      The untransferred TDs will also remain on the TD list.  Since the td_list
      is a FIFO, this basically means the ring handler will be waiting on TDs
      that will never be completed (or worse, dereference memory that doesn't
      exist any more).
      
      Change the code to clean up the isochronous ring after a failed transfer.
      If the first TD failed, simply return and allow the xhci_urb_enqueue
      function to free the urb_priv.  If the Nth TD failed, first remove the TDs
      from the td_list.  Then convert the TRBs that were enqueued into No-op
      TRBs.  Make sure to flip the cycle bit on all enqueued TRBs (including any
      link TRBs in the middle or between TDs), but leave the cycle bit of the
      first TRB (which will show software-owned) intact.  Then move the ring
      enqueue pointer back to the first TRB and make sure to change the
      xhci_ring's cycle state to what is appropriate for that ring segment.
      
      This ensures that the No-op TRBs will be overwritten by subsequent TDs,
      and the hardware will not start executing random TRBs because the cycle
      bit was left as hardware-owned.
      
      This bug is unlikely to be hit, but it was something I noticed while
      tracking down the watchdog timer issue.  I verified that the fix works by
      injecting some errors on the 250th isochronous URB queued, although I
      could not verify that the ring is in the correct state because uvcvideo
      refused to talk to the device after the first usb_submit_urb() failed.
      Ring debugging shows that the ring looks correct, however.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
    • xhci: Fix memory leak during failed enqueue. · d13565c1
      Sarah Sharp committed
      When the isochronous transfer support was introduced, and the xHCI driver
      switched to using urb->hcpriv to store an "urb_priv" pointer, a couple of
      memory leaks were introduced into the URB enqueue function in its error
      handling paths.
      
      xhci_urb_enqueue allocates urb_priv, but it doesn't free it if changing
      the control endpoint's max packet size fails or the bulk endpoint is in
      the middle of allocating or deallocating streams.
      
      xhci_urb_enqueue also doesn't free urb_priv if any of the four endpoint
      types' enqueue functions fail.  Instead, it expects those functions to
      free urb_priv if an error occurs.  However, the bulk, control, and
      interrupt enqueue functions do not free urb_priv if the endpoint ring is
      NULL.  It will, however, get freed if prepare_transfer() fails in those
      enqueue functions.
      
      Several of the error paths in the isochronous endpoint enqueue function
      also fail to free it.  xhci_queue_isoc_tx_prepare() doesn't free urb_priv
      if prepare_ring() indicates there is not enough room for all the
      isochronous TDs in this URB.  If individual isochronous TDs fail to be
      queued (perhaps due to an endpoint state change), urb_priv is also leaked.
      
      This argues that the freeing of urb_priv should be done in the function
      that allocated it, xhci_urb_enqueue.
      
      This patch looks rather ugly, but refactoring the code will have to wait
      because this patch needs to be backported to stable kernels.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
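The ownership rule argued for in this commit (free in the function that allocated) can be sketched with ordinary heap allocation. All names are hypothetical (`struct toy_urb`, `toy_queue_transfer()`, `toy_urb_enqueue()`), and the `fail` flag merely simulates an error in a per-endpoint-type enqueue step.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Toy URB: only the private pointer matters for this sketch. */
struct toy_urb {
    void *hcpriv;
};

/* Hypothetical per-type enqueue step. It never frees hcpriv itself. */
int toy_queue_transfer(struct toy_urb *urb, int fail)
{
    (void)urb;
    return fail ? -EINVAL : 0;
}

/* The allocator owns the cleanup: one free covers every error path. */
int toy_urb_enqueue(struct toy_urb *urb, int fail)
{
    int ret;

    assert(urb != NULL);
    urb->hcpriv = malloc(16);
    if (!urb->hcpriv)
        return -ENOMEM;

    ret = toy_queue_transfer(urb, fail);
    if (ret) {
        free(urb->hcpriv);
        urb->hcpriv = NULL;
    }
    return ret;
}
```

Keeping the free next to the allocation means a new error path in a callee cannot reintroduce the leak.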
  12. 18 Jun, 2011 2 commits
    • xhci: Always set urb->status to zero for isoc endpoints. · b3df3f9c
      Sarah Sharp committed
      When the xHCI driver encounters a Missed Service Interval event for an
      isochronous endpoint ring, it means the host controller skipped over
      one or more isochronous TDs.  For each TD that is skipped, skip_isoc_td() is
      called.  This sets the frame descriptor status to -EXDEV, and also sets
      the value stored in the int pointed to by status to -EXDEV.
      
      If the isochronous TD happens to be the last TD in an URB,
      handle_tx_event() will use the status variable to give back the URB to
      the USB core.  That means drivers will see urb->status as -EXDEV.
      
      It turns out that EHCI, UHCI, and OHCI always set urb->status to zero for
      an isochronous urb, regardless of what the frame status is.  See
      itd_complete() in ehci-sched.c:
      
                      } else {
                              /* URB was too late */
                              desc->status = -EXDEV;
                      }
              }
      
              /* handle completion now? */
              if (likely ((urb_index + 1) != urb->number_of_packets))
                      goto done;
      
              /* ASSERT: it's really the last itd for this urb
              list_for_each_entry (itd, &stream->td_list, itd_list)
                      BUG_ON (itd->urb == urb);
               */
      
              /* give urb back to the driver; completion often (re)submits */
              dev = urb->dev;
              ehci_urb_done(ehci, urb, 0);
      
      ehci_urb_done() completes the URB with the status of the third argument, which
      is always zero in this case.
      
      It turns out that many USB webcam drivers, such as uvcvideo, cannot
      handle urb->status set to a non-zero value.  They will not resubmit
      their isochronous URBs in that case, and userspace will see a frozen
      video.
      
      Change the xHCI driver to be consistent with the EHCI and UHCI drivers,
      and always set urb->status to 0 for isochronous URBs.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: "Xu, Andiry" <Andiry.Xu@amd.com>
      Cc: stable@kernel.org
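The split between per-frame status and URB-level status can be sketched like this. The types and helpers are hypothetical (`struct toy_isoc_urb`, `toy_skip_isoc_frame()`, and so on), and the resubmit check is an assumption modeled on the driver behaviour the commit describes.

```c
#include <assert.h>
#include <errno.h>

#define TOY_MAX_FRAMES 8

/* Toy isochronous URB with an overall status and per-frame statuses. */
struct toy_isoc_urb {
    int status;
    int number_of_packets;
    int frame_status[TOY_MAX_FRAMES];
};

/* A skipped frame is recorded only in its own frame descriptor. */
void toy_skip_isoc_frame(struct toy_isoc_urb *urb, int frame)
{
    assert(frame >= 0 && frame < urb->number_of_packets);
    urb->frame_status[frame] = -EXDEV;
}

/* Giveback: the URB-level status is always zero for isochronous URBs. */
void toy_giveback_isoc(struct toy_isoc_urb *urb)
{
    urb->status = 0;
}

/* Models a driver that stops resubmitting on a non-zero URB status. */
int toy_driver_should_resubmit(const struct toy_isoc_urb *urb)
{
    return urb->status == 0;
}
```

A driver that wants to know about missed frames inspects the per-frame statuses; the stream itself keeps running because the URB status stays zero.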
    • xHCI 1.0: Incompatible Device Error · f6ba6fe2
      Alex He committed
      This is a new TRB Completion Code in the xHCI spec v1.0.
      It is asserted if the xHC detects a problem with a device that does not allow it to
      be successfully accessed, e.g. due to a device compliance or compatibility
      problem. This error may be returned by any command or transfer, and is fatal
      as far as the Slot is concerned. For transfers, return -EPROTO in urb->status,
      or in frame->status for isochronous transfers. For the configure endpoint,
      evaluate context, and address device commands, return -ENODEV if there is an
      Incompatible Device Error. The error codes are passed back to the USB core,
      which decides what to do. Handling this for other commands is unnecessary,
      because once those three commands have run successfully the device has been
      accepted.
      Signed-off-by: Alex He <alex.he@amd.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
  13. 16 Jun, 2011 1 commit
    • xHCI 1.0: Force Stopped Event(FSE) · e1cf486d
      Alex He committed
      An FSE shall occur on a natural TD boundary. In the cases listed in Table-3,
      the software ep_ring dequeue pointer exceeds the hardware ep_ring dequeue
      pointer. As a result, the event_trb of the FSE (pointed to by the hardware
      dequeue pointer) can't be found in the current TD (pointed to by the software
      dequeue pointer). What we should do is identify the FSE case and skip over it.
      Signed-off-by: Alex He <alex.he@amd.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
  14. 03 Jun, 2011 1 commit
  15. 02 Jun, 2011 1 commit
  16. 28 May, 2011 2 commits
    • Intel xhci: Limit number of active endpoints to 64. · 2cf95c18
      Sarah Sharp committed
      The Panther Point chipset has an xHCI host controller that has a limit to
      the number of active endpoints it can handle.  Ideally, it would signal
      that it can't handle any more endpoints by returning a Resource Error for
      the Configure Endpoint command, but it doesn't.  Instead it needs software
      to keep track of the number of active endpoints, across configure endpoint
      commands, reset device commands, disable slot commands, and address device
      commands.
      
      Add a new endpoint context counter, xhci_hcd->num_active_eps, and use it
      to track the number of endpoints the xHC has active.  This gets a little
      tricky, because commands to change the number of active endpoints can
      fail.  This patch adds a new xHCI quirk for these Intel hosts, and the new
      code should not have any effect on other xHCI host controllers.
      
      Fail a new device allocation if we don't have room for the new default
      control endpoint.  Use the endpoint ring pointers to determine what
      endpoints were active before a Reset Device command or a Disable Slot
      command, and drop those once the command completes.
      
      Fail a configure endpoint command if it would add too many new endpoints.
      We have to be a bit overzealous here, and only count the number of new
      endpoints to be added, without subtracting the number of dropped
      endpoints.  That's because a second configure endpoint command for a
      different device could sneak in before we know if the first command is
      completed.  If the first command dropped resources, the host controller
      fails the command for some reason, and we're nearing the limit of
      endpoints, we could end up oversubscribing the host.
      
      To fix this race condition, when evaluating whether a configure endpoint
      command will fit in our bandwidth budget, only add the new endpoints to
      xhci->num_active_eps, and don't subtract the dropped endpoints.  Ignore
      changed endpoints (ones that are dropped and then re-added), as that
      shouldn't affect the host's endpoint resources.  When the configure
      endpoint command completes, subtract off the dropped endpoints.
      
      This may mean some configuration changes may temporarily fail, but it's
      always better to under-subscribe than over-subscribe resources.
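
      The accounting described above can be sketched roughly as follows.
      This is a standalone illustration, not the driver code: struct
      host_state, its limit_active_eps field, and the three helper names
      are hypothetical stand-ins for the real xhci_hcd bookkeeping.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct xhci_hcd. */
struct host_state {
	unsigned int num_active_eps;   /* endpoints counted against the host */
	unsigned int limit_active_eps; /* assumed hardware endpoint limit */
};

/*
 * Before issuing a Configure Endpoint command: count only the endpoints
 * being added.  Dropped endpoints are not credited yet, because the
 * command may still fail and leave them active.
 */
static bool reserve_endpoints(struct host_state *host, unsigned int added)
{
	if (host->num_active_eps + added > host->limit_active_eps)
		return false;
	host->num_active_eps += added;
	return true;
}

/* Command succeeded: the dropped endpoints are really gone now. */
static void finish_endpoints_success(struct host_state *host,
				     unsigned int dropped)
{
	host->num_active_eps -= dropped;
}

/* Command failed: the added endpoints never became active. */
static void finish_endpoints_failure(struct host_state *host,
				     unsigned int added)
{
	host->num_active_eps -= added;
}
```

      Because the reservation ignores dropped endpoints, a request that
      would fit after the drop can still be refused, which is the
      temporary under-subscription mentioned above.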
      
      (Originally my plan had been to push the resource allocation down into the
      ring allocation functions.  However, that would cause us to allocate
      unnecessary resources when endpoints were changed, because the xHCI driver
      allocates a new ring for the changed endpoint, and only deletes the old
      ring once the Configure Endpoint command succeeds.  A further complication
      would have been dealing with the per-device endpoint ring cache.)
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      2cf95c18
    • S
      Intel xhci: Ignore spurious successful event. · ad808333
      Sarah Sharp committed
      The xHCI host controller in the Panther Point chipset sometimes produces
      spurious events on the event ring.  If it receives a short packet, it
      first puts a Transfer Event with a short transfer completion code on the
      event ring.  Then it puts a Transfer Event with a successful completion
      code on the ring for the same TD.  The xHCI driver correctly processes the
      short transfer completion code, gives the URB back to the driver, and then
      prints a warning in dmesg about the spurious event.  These warning
      messages really fill up dmesg when an HD webcam is plugged into xHCI.
      
      This spurious successful event behavior isn't technically disallowed by
      the xHCI specification, so make the xHCI driver just ignore the spurious
      completion event.
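
      One way to picture the filtering is the sketch below.  It assumes
      the driver remembers whether the last TD on a ring ended with a
      short transfer; the struct, the enum values, and the function name
      are illustrative only and are not taken from the driver.

```c
#include <stdbool.h>

/* Illustrative completion codes, not the real xHCI encodings. */
enum comp_code { COMP_SUCCESS, COMP_SHORT_TX };

/* Hypothetical per-ring state. */
struct ep_ring_state {
	bool last_td_was_short;
};

/*
 * Returns true when an event should be silently ignored: it is a
 * successful completion that matches no pending TD, and the previous TD
 * on this ring ended with a short transfer.
 */
static bool is_spurious_success(struct ep_ring_state *ring,
				enum comp_code code, bool td_found)
{
	if (td_found) {
		/* A real completion: remember how this TD ended. */
		ring->last_td_was_short = (code == COMP_SHORT_TX);
		return false;
	}
	if (code == COMP_SUCCESS && ring->last_td_was_short) {
		/* Swallow at most one follow-up event per short TD. */
		ring->last_td_was_short = false;
		return true;
	}
	return false;
}
```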
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      ad808333
  17. 26 May, 2011 4 commits
    • S
      xhci: STFU: Be quieter during URB submission and completion. · f444ff27
      Sarah Sharp committed
      Unsurprisingly, URBs get submitted and completed a lot in the xHCI
      driver.  If we have to print 10 lines of debug for every URB submitted
      or completed, then that can cause the whole system to stay in the
      interrupt handler too long, and can cause Missed Service completion
      codes for isochronous transfers.
      
      Cut down the debugging in the URB submission and completion paths:
       - Don't squawk about successful transfers, only unsuccessful ones.
       - Only print the number of bytes transferred if this was a short
         transfer.
       - Don't print the endpoint index for successful transfers (will add
         more debug to failed transfers to show endpoint index there later).
       - Stop printing MMIO writes.  This debugging shows up when the endpoint
         doorbell is rung to start a transfer (basically for every URB).
       - Don't print out the ring enqueue and dequeue pointers.
       - Stop printing when we're pointing to a link TRB.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      f444ff27
    • S
      xhci: STFU: Don't print event ring dequeue pointer. · 5153b7b3
      Sarah Sharp committed
      Stop printing out the event ring dequeue pointer and status register in
      the operational register set.  The host will report an OK status 99% of
      the time the interrupt handler is called, and usually when it's really
      hosed, a host controller won't even call the interrupt handler.  So the
      line is really useless.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      5153b7b3
    • S
      xhci: STFU: Remove function tracing. · 380032c3
      Sarah Sharp committed
      Remove unnecessary debugging from the xHCI driver.  We don't need to
      know what function we're calling or returning from.  Now I know how to
      use markup-oops.pl to de-mystify stack dumps of crashes.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      380032c3
    • S
      xhci: Clear stopped_td when Stop Endpoint command completes. · 0714a57c
      Sarah Sharp committed
      When an URB is cancelled, the xHCI driver issues a Stop Endpoint command
      so that it can manipulate the ring and remove the transfer.  The xHC
      hardware then places a transfer event with the completion code "Stopped"
      or "Stopped Invalid" to let the driver know what TD it was in the middle
      of processing.  This TD and TRB are stored in ep->stopped_td and
      ep->stopped_trb.  These pointers are also used in handling stalled
      endpoints.
      
      By design, the Stop Endpoint command can race with URB completion.  By
      the time the Stop Endpoint command is handled, the URBs to be cancelled
      may have been given back to the driver.  Unfortunately, the stopped_td
      and stopped_trb pointers were not getting cleared in this case.
      
      The USB core unconditionally tries to reset the toggle bits on any
      endpoints when a new alternate interface setting is installed.  When the
      xHCI driver saw that ep->stopped_td was still set from the Stop Endpoint
      command, xhci_reset_endpoint assumed the endpoint was actually stalled,
      and attempted to clean up the endpoint rings.  This would manifest
      itself in a failed Reset Endpoint command and failed Set TR dequeue
      Pointer command after a successful Configure Endpoint command.  It may
      have also been causing driver oops when the stopped_td was accessed.
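
      The shape of the fix can be sketched as below.  This is a
      simplified model with hypothetical names (struct virt_ep,
      finish_stop_endpoint(), endpoint_looks_stalled()); the real
      handler does considerably more work on the rings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of the endpoint structure. */
struct virt_ep {
	const void *stopped_td;
	const void *stopped_trb;
};

/*
 * Called when a Stop Endpoint command completes.  Even when every URB
 * was already given back (nothing left to cancel), the stopped pointers
 * are cleared so that later code does not mistake them for a stall.
 */
static void finish_stop_endpoint(struct virt_ep *ep,
				 unsigned int cancelled_tds_remaining)
{
	if (cancelled_tds_remaining > 0) {
		/* A real driver would remove the cancelled TDs here. */
	}
	ep->stopped_td = NULL;
	ep->stopped_trb = NULL;
}

/* The check that went wrong when stale pointers were left behind. */
static bool endpoint_looks_stalled(const struct virt_ep *ep)
{
	return ep->stopped_td != NULL;
}
```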
      
      This patch should be backported to stable kernels since 2.6.31.  Before
      2.6.33, stopped_td was found in the xhci_endpoint_ring, not the
      xhci_virt_ep.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      0714a57c
  18. 12 May, 2011 1 commit
    • S
      xhci: Fix bug in control transfer cancellation. · 3abeca99
      Sarah Sharp committed
      When the xHCI driver attempts to cancel a transfer, it issues a Stop
      Endpoint command and waits for the host controller to indicate which TRB
      it was in the middle of processing.  The host will put an event TRB with
      completion code COMP_STOP on the event ring if it stops on a control
      transfer TRB (or other types of transfer TRBs).  The ring handling code
      is supposed to set ep->stopped_trb to the TRB that the host stopped on
      when this happens.
      
      Unfortunately, there is a long-standing bug in the control transfer
      completion code.  It doesn't actually check to see if COMP_STOP is set
      before attempting to process the transfer based on which part of the
      control TD completed.  So when we get an event on the data phase of the
      control TRB with COMP_STOP set, it thinks it's a normal completion of
      the transfer and doesn't set ep->stopped_td or ep->stopped_trb.
      
      When the ring handling code goes on to process the completion of the Stop
      Endpoint command, it sees that ep->stopped_trb is not a part of the TD
      it's trying to cancel.  It thinks the hardware has its enqueue pointer
      somewhere further up in the ring, and thinks it's safe to turn the control
      TRBs into no-op TRBs.  Since the hardware was in the middle of the control
      TRBs to be cancelled, the proper software behavior is to issue a Set TR
      dequeue pointer command.
      
      It turns out that the NEC host controllers can handle active TRBs being
      set to no-op TRBs after a stop endpoint command, but other host
      controllers have issues with this out-of-spec software behavior.  Fix this
      behavior.
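
      The missing check amounts to something like the sketch below.  It
      is a simplified model, not process_ctrl_td() itself: the enum
      values, struct virt_ep, and handle_ctrl_event() are illustrative
      names only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative completion codes, not the real xHCI encodings. */
enum comp_code { COMP_SUCCESS, COMP_STOP };

/* Hypothetical stand-in for the fields of the endpoint structure. */
struct virt_ep {
	const void *stopped_td;
	const void *stopped_trb;
};

/*
 * Returns true if the control TD should be treated as completed.  A
 * stopped event is checked first, before looking at which phase of the
 * control transfer the event refers to, and records where the hardware
 * stopped instead of completing the TD.
 */
static bool handle_ctrl_event(struct virt_ep *ep, enum comp_code code,
			      const void *td, const void *event_trb)
{
	if (code == COMP_STOP) {
		ep->stopped_td = td;
		ep->stopped_trb = event_trb;
		return false;
	}
	/* Normal setup/data/status phase handling would go here. */
	return true;
}
```

      With the pointers recorded, the Stop Endpoint completion path can
      see that the hardware stopped inside the TD being cancelled and
      choose a Set TR Dequeue Pointer command over no-op TRBs.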
      
      This patch should be backported to kernels as far back as 2.6.31, but it
      may be a bit challenging, since process_ctrl_td() was introduced in some
      refactoring done in 2.6.36, and some endian-safe patches added in 2.6.40
      that touch the same lines.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: stable@kernel.org
      3abeca99
  19. 10 May, 2011 2 commits
  20. 03 May, 2011 6 commits
    • S
      xhci 1.0: Set transfer burst last packet count field. · b61d378f
      Sarah Sharp committed
      The xHCI 1.0 specification defines a new isochronous TRB field, called
      transfer burst last packet count (TBLPC).  This field defines the number
      of packets in the last "burst" of packets in a TD.  Only SuperSpeed
      endpoints can handle more than one burst, so this is set to the number of
      packets in a TD for all non-SuperSpeed devices (minus one, since the field
      is zero based).
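
      As a rough sketch of the calculation: the non-SuperSpeed branch
      follows the description above, while the SuperSpeed branch is an
      assumption about how the last burst would be derived from a
      zero-based max_burst value.  The function and parameter names are
      illustrative.

```c
#include <stdbool.h>

/*
 * Zero-based number of packets in the last burst of a TD.
 * max_burst is assumed to be zero-based (a burst holds max_burst + 1
 * packets) and is only consulted for SuperSpeed endpoints.
 */
static unsigned int last_burst_packet_count(bool superspeed,
					    unsigned int max_burst,
					    unsigned int total_packets)
{
	unsigned int residue;

	if (total_packets == 0)
		return 0;
	if (!superspeed)
		return total_packets - 1; /* one burst holds the whole TD */

	residue = total_packets % (max_burst + 1);
	if (residue == 0)
		return max_burst;         /* last burst is a full one */
	return residue - 1;
}
```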
      
      This patch should have no effect on host controllers that don't advertise
      the xHCI 1.0 (0x100) version number in their hci_version field.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      b61d378f
    • S
      xhci 1.0: Set transfer burst count field. · 5cd43e33
      Sarah Sharp committed
      The xHCI 1.0 specification adds a new field to the fourth dword in an
      isochronous TRB: the transfer burst count (TBC).  This field is only
      non-zero for SuperSpeed devices.  Each SS endpoint sets the bMaxBurst
      field in the SuperSpeed endpoint companion descriptor, which indicates how
      many max-packet-sized "bursts" it can handle in one service interval.  The
      device driver may choose to burst fewer max-packet-sized chunks each
      service interval (which is defined by one TD).  The xHCI driver indicates
      to the host controller how many bursts it needs to schedule through the
      transfer burst count field.
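
      A minimal sketch of the calculation follows, assuming the field is
      zero-based (number of bursts minus one) and that max_burst is the
      zero-based bMaxBurst value.  The function name is illustrative and
      the rounding is an assumption, not a copy of the driver code.

```c
#include <stdbool.h>

/*
 * Zero-based number of bursts needed to move total_packets packets,
 * where each burst carries up to max_burst + 1 packets.  Always zero
 * for non-SuperSpeed endpoints.
 */
static unsigned int burst_count(bool superspeed, unsigned int max_burst,
				unsigned int total_packets)
{
	unsigned int per_burst = max_burst + 1;

	if (!superspeed || total_packets == 0)
		return 0;
	/* Round up to whole bursts, then subtract one for zero-basing. */
	return (total_packets + per_burst - 1) / per_burst - 1;
}
```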
      
      This patch will only affect xHCI hosts that advertise 1.0 support (0x100)
      in the HCI version field of their capabilities register.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      5cd43e33
    • S
      xhci 1.0: Update TD size field format. · 4da6e6f2
      Sarah Sharp committed
      The xHCI 1.0 specification changes the format of the TD size field in
      Normal and Isochronous TRBs.  The field in control TRBs is still set to
      reserved zero.  Instead of representing the number of bytes left to
      transfer in the TD (including the current TRB's buffer), it now represents
      the number of packets left to transfer (*not* including this TRB).
      
      See section 4.11.2.4 of the xHCI 1.0 specification for details.  The math
      is basically copied straight from there.
      
      Create a new function, xhci_v1_0_td_remainder(), that should be called for
      all xHCI 1.0 host controllers.  The field location and maximum value are
      still the same, so reuse the old function, xhci_td_remainder(), to handle
      the bit shifting.
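
      The packet-based remainder can be sketched as below.  This is a
      standalone approximation rather than xhci_v1_0_td_remainder()
      itself; the parameter names and the 5-bit maximum of 31 are
      assumptions made for the example.

```c
/* Assumed maximum of the TD size field (5 bits). */
#define TD_SIZE_MAX 31u

/*
 * Number of packets left in the TD after this TRB, clamped to the
 * field's maximum.  running_total is the number of bytes queued in
 * earlier TRBs of the same TD; trb_len is this TRB's buffer length.
 */
static unsigned int td_remainder_packets(unsigned int running_total,
					 unsigned int trb_len,
					 unsigned int max_packet,
					 unsigned int total_packets)
{
	unsigned int done, left;

	/* A single TRB carrying a zero-length packet. */
	if (running_total == 0 && trb_len == 0)
		return 0;

	done = (running_total + trb_len) / max_packet;
	left = total_packets > done ? total_packets - done : 0;
	return left > TD_SIZE_MAX ? TD_SIZE_MAX : left;
}
```

      For a 2048-byte TD on a 512-byte endpoint split into two 1024-byte
      TRBs, the first TRB would carry 2 and the last TRB would carry 0.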
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      4da6e6f2
    • S
      xhci 1.0: Only interrupt on short packet for IN EPs. · af8b9e63
      Sarah Sharp committed
      It doesn't make sense to set the interrupt on short packet (TRB_ISP) flag
      for TRBs queued to endpoints that only receive packets from the host
      controller (i.e. OUT endpoints).  Packets can only be short when they are
      sent from a USB device.  Plus, the xHCI 1.0 specification forbids setting
      the flag for anything but IN endpoints.
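
      In code, the rule reduces to a direction check when building the
      TRB control flags.  The helper below is a hypothetical sketch; the
      bit position used for TRB_ISP is illustrative and should not be
      read as the authoritative value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position for the interrupt-on-short-packet flag. */
#define TRB_ISP (1u << 2)

/*
 * Add the interrupt-on-short-packet flag only for IN endpoints, where
 * the device sends the data and a packet can actually come up short.
 */
static uint32_t trb_control_flags(bool is_in_endpoint, uint32_t flags)
{
	if (is_in_endpoint)
		flags |= TRB_ISP;
	return flags;
}
```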
      
      While we're at it, remove some of my snide remarks about the inefficiency
      of event data TRBs.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      af8b9e63
    • M
      xhci: Remove recursive call to xhci_handle_event · 9dee9a21
      Matt Evans committed
      Make the caller loop while there are events to handle, instead.
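
      The pattern looks roughly like the sketch below, where struct
      event_ring and both function names are hypothetical and the ring
      is reduced to a counter of pending events.

```c
/* Hypothetical, heavily simplified event ring. */
struct event_ring {
	unsigned int pending;
	unsigned int handled;
};

/*
 * Handle at most one event.  Returns 1 if an event was handled and
 * more may follow, 0 if the ring is empty.  It never calls itself.
 */
static int handle_event(struct event_ring *ring)
{
	if (ring->pending == 0)
		return 0;
	ring->pending--;
	ring->handled++;
	return 1;
}

/* The caller loops, so stack depth no longer grows with the backlog. */
static void drain_events(struct event_ring *ring)
{
	while (handle_event(ring) > 0)
		;
}
```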
      Signed-off-by: Matt Evans <matt@ozlabs.org>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      9dee9a21
    • M
      xhci: Add rmb() between reading event validity & event data access. · 92a3da41
      Matt Evans committed
      On weakly-ordered systems, the reading of an event's content must occur
      after reading the event's validity.
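
      A userspace analogue of the ordering requirement is sketched
      below.  It uses a C11 acquire fence where the kernel would use
      rmb(); struct event_trb, TRB_CYCLE, and read_event() are
      simplified stand-ins, not the driver's definitions.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative validity (cycle) bit in the control word. */
#define TRB_CYCLE 1u

/* Simplified event: one data word plus a control word. */
struct event_trb {
	uint32_t data;
	_Atomic uint32_t control;
};

/*
 * Copy the event data out only if the event is valid.  The fence keeps
 * the data read from being reordered before the validity check.
 */
static bool read_event(struct event_trb *trb, uint32_t expected_cycle,
		       uint32_t *out)
{
	uint32_t control = atomic_load_explicit(&trb->control,
						memory_order_relaxed);

	if ((control & TRB_CYCLE) != expected_cycle)
		return false;               /* not written by the host yet */

	atomic_thread_fence(memory_order_acquire); /* stands in for rmb() */
	*out = trb->data;
	return true;
}
```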
      Signed-off-by: Matt Evans <matt@ozlabs.org>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      92a3da41