1. 02 6月, 2016 1 次提交
    • G
      xhci: Cleanup only when releasing primary hcd · 27a41a83
      Gabriel Krisman Bertazi 提交于
      Under stress occasions some TI devices might not return early when
      reading the status register during the quirk invocation of xhci_irq made
      by usb_hcd_pci_remove.  This means that instead of returning, we end up
      handling this interruption in the middle of a shutdown.  Since
      xhci->event_ring has already been freed in xhci_mem_cleanup, we end up
      accessing freed memory, causing the Oops below.
      
      commit 8c24d6d7 ("usb: xhci: stop everything on the first call to
      xhci_stop") is the one that changed the instant in which we clean up the
      event queue when stopping a device.  Before, we didn't call
      xhci_mem_cleanup at the first time xhci_stop is executed (for the shared
      HCD), instead, we only did it after the invocation for the primary HCD,
      much later at the removal path.  The code flow for this oops looks like
      this:
      
      xhci_pci_remove()
      	usb_remove_hcd(xhci->shared)
      	        xhci_stop(xhci->shared)
       			xhci_halt()
      			xhci_mem_cleanup(xhci);  // Free the event_queue
      	usb_hcd_pci_remove(primary)
      		xhci_irq()  // Access the event_queue if STS_EINT is set. Crash.
      		xhci_stop()
      			xhci_halt()
      			// return early
      
      The fix modifies xhci_stop to only cleanup the xhci data when releasing
      the primary HCD.  This way, we still have the event_queue configured
      when invoking xhci_irq.  We still halt the device on the first call to
      xhci_stop, though.
      
      I could reproduce this issue several times on the mainline kernel by
      doing a bind-unbind stress test with a specific storage gadget attached.
      I also ran the same test over-night with my patch applied and didn't
      observe the issue anymore.
      
      [  113.334124] Unable to handle kernel paging request for data at address 0x00000028
      [  113.335514] Faulting instruction address: 0xd00000000d4f767c
      [  113.336839] Oops: Kernel access of bad area, sig: 11 [#1]
      [  113.338214] SMP NR_CPUS=1024 NUMA PowerNV
      
      [c000000efe47ba90] c000000000720850 usb_hcd_irq+0x50/0x80
      [c000000efe47bac0] c00000000073d328 usb_hcd_pci_remove+0x68/0x1f0
      [c000000efe47bb00] d00000000daf0128 xhci_pci_remove+0x78/0xb0
      [xhci_pci]
      [c000000efe47bb30] c00000000055cf70 pci_device_remove+0x70/0x110
      [c000000efe47bb70] c00000000061c6bc __device_release_driver+0xbc/0x190
      [c000000efe47bba0] c00000000061c7d0 device_release_driver+0x40/0x70
      [c000000efe47bbd0] c000000000619510 unbind_store+0x120/0x150
      [c000000efe47bc20] c0000000006183c4 drv_attr_store+0x64/0xa0
      [c000000efe47bc60] c00000000039f1d0 sysfs_kf_write+0x80/0xb0
      [c000000efe47bca0] c00000000039e14c kernfs_fop_write+0x18c/0x1f0
      [c000000efe47bcf0] c0000000002e962c __vfs_write+0x6c/0x190
      [c000000efe47bd90] c0000000002eab40 vfs_write+0xc0/0x200
      [c000000efe47bde0] c0000000002ec85c SyS_write+0x6c/0x110
      [c000000efe47be30] c000000000009260 system_call+0x38/0x108
      Signed-off-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Cc: Roger Quadros <rogerq@ti.com>
      Cc: joel@jms.id.au
      Cc: stable@vger.kernel.org
      Reviewed-by: NRoger Quadros <rogerq@ti.com>
      Cc: <stable@vger.kernel.org> #v4.3+
      Tested-by: NJoel Stanley <joel@jms.id.au>
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      27a41a83
  2. 27 4月, 2016 3 次提交
  3. 19 4月, 2016 1 次提交
  4. 14 4月, 2016 1 次提交
    • M
      xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers · 98d74f9c
      Mathias Nyman 提交于
      PCI hotpluggable xhci controllers such as some Alpine Ridge solutions will
      remove the xhci controller from the PCI bus when the last USB device is
      disconnected.
      
      Add a flag to indicate that the host is being removed to avoid queueing
      configure_endpoint commands for the dropped endpoints.
      For PCI hotplugged controllers this will prevent 5 second command timeouts
      For static xhci controllers the configure_endpoint command is not needed
      in the removal case as everything will be returned, freed, and the
      controller is reset.
      
      For now the flag is only set for PCI connected host controllers.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98d74f9c
  5. 15 2月, 2016 2 次提交
  6. 04 2月, 2016 2 次提交
  7. 12 12月, 2015 1 次提交
    • M
      xhci: fix usb2 resume timing and races. · f69115fd
      Mathias Nyman 提交于
      According to USB 2 specs ports need to signal resume for at least 20ms,
      in practice even longer, before moving to U0 state.
      Both host and devices can initiate resume.
      
      On device initiated resume, a port status interrupt with the port in resume
      state in issued. The interrupt handler tags a resume_done[port]
      timestamp with current time + USB_RESUME_TIMEOUT, and kick roothub timer.
      Root hub timer requests for port status, finds the port in resume state,
      checks if resume_done[port] timestamp passed, and set port to U0 state.
      
      On host initiated resume, current code sets the port to resume state,
      sleep 20ms, and finally sets the port to U0 state. This should also
      be changed to work in a similar way as the device initiated resume, with
      timestamp tagging, but that is not yet tested and will be a separate
      fix later.
      
      There are a few issues with this approach
      
      1. A host initiated resume will also generate a resume event. The event
         handler will find the port in resume state, believe it's a device
         initiated resume, and act accordingly.
      
      2. A port status request might cut the resume signalling short if a
         get_port_status request is handled during the host resume signalling.
         The port will be found in resume state. The timestamp is not set leading
         to time_after_eq(jiffies, timestamp) returning true, as timestamp = 0.
         get_port_status will proceed with moving the port to U0.
      
      3. If an error, or anything else happens to the port during device
         initiated resume signalling it will leave all the device resume
         parameters hanging uncleared, preventing further suspend, returning
         -EBUSY, and cause the pm thread to busyloop trying to enter suspend.
      
      Fix this by using the existing resuming_ports bitfield to indicate that
      resume signalling timing is taken care of.
      Check if the resume_done[port] is set before using it for timestamp
      comparison, and also clear out any resume signalling related variables
      if port is not in U0 or Resume state
      
      This issue was discovered when a PM thread busylooped, trying to runtime
      suspend the xhci USB 2 roothub on a Dell XPS
      
      Cc: stable <stable@vger.kernel.org>
      Reported-by: NDaniel J Blueman <daniel@quora.org>
      Tested-by: NDaniel J Blueman <daniel@quora.org>
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f69115fd
  8. 02 12月, 2015 1 次提交
  9. 19 11月, 2015 1 次提交
  10. 17 10月, 2015 3 次提交
  11. 04 10月, 2015 2 次提交
  12. 22 9月, 2015 2 次提交
  13. 09 8月, 2015 3 次提交
  14. 04 8月, 2015 1 次提交
    • M
      xhci: fix off by one error in TRB DMA address boundary check · 7895086a
      Mathias Nyman 提交于
      We need to check that a TRB is part of the current segment
      before calculating its DMA address.
      
      Previously a ring segment didn't use a full memory page, and every
      new ring segment got a new memory page, so the off by one
      error in checking the upper bound was never seen.
      
      Now that we use a full memory page, 256 TRBs (4096 bytes), the off by one
      didn't catch the case when a TRB was the first element of the next segment.
      
      This is triggered if the virtual memory pages for a ring segment are
      next to each in increasing order where the ring buffer wraps around and
      causes errors like:
      
      [  106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
      [  106.398230] xhci_hcd 0000:00:14.0: Looking for event-dma fffd3000 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0
      
      The trb-end address is one outside the end-seg address.
      
      Cc: <stable@vger.kernel.org>
      Tested-by: NArkadiusz Miśkiewicz <arekm@maven.pl>
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7895086a
  15. 23 7月, 2015 1 次提交
  16. 31 5月, 2015 2 次提交
    • M
      xhci: Return correct number of tranferred bytes for stalled control endpoints · 22ae47e6
      Mathias Nyman 提交于
      Fix the xhci driver from bluntly setting the transferred length to 0 if
      we get a STALL on anything else than the data stage of a control transfer.
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      22ae47e6
    • R
      usb: xhci: fix xhci locking up during hcd remove · ad6b1d91
      Roger Quadros 提交于
      The problem seems to be that if a new device is detected
      while we have already removed the shared HCD, then many of the
      xhci operations (e.g.  xhci_alloc_dev(), xhci_setup_device())
      hang as command never completes.
      
      I don't think XHCI can operate without the shared HCD as we've
      already called xhci_halt() in xhci_only_stop_hcd() when shared HCD
      goes away. We need to prevent new commands from being queued
      not only when HCD is dying but also when HCD is halted.
      
      The following lockup was detected while testing the otg state
      machine.
      
      [  178.199951] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
      [  178.205799] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 1
      [  178.214458] xhci-hcd xhci-hcd.0.auto: hcc params 0x0220f04c hci version 0x100 quirks 0x00010010
      [  178.223619] xhci-hcd xhci-hcd.0.auto: irq 400, io mem 0x48890000
      [  178.230677] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
      [  178.237796] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
      [  178.245358] usb usb1: Product: xHCI Host Controller
      [  178.250483] usb usb1: Manufacturer: Linux 4.0.0-rc1-00024-g6111320 xhci-hcd
      [  178.257783] usb usb1: SerialNumber: xhci-hcd.0.auto
      [  178.267014] hub 1-0:1.0: USB hub found
      [  178.272108] hub 1-0:1.0: 1 port detected
      [  178.278371] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
      [  178.284171] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 2
      [  178.294038] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
      [  178.301183] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
      [  178.308776] usb usb2: Product: xHCI Host Controller
      [  178.313902] usb usb2: Manufacturer: Linux 4.0.0-rc1-00024-g6111320 xhci-hcd
      [  178.321222] usb usb2: SerialNumber: xhci-hcd.0.auto
      [  178.329061] hub 2-0:1.0: USB hub found
      [  178.333126] hub 2-0:1.0: 1 port detected
      [  178.567585] dwc3 48890000.usb: usb_otg_start_host 0
      [  178.572707] xhci-hcd xhci-hcd.0.auto: remove, state 4
      [  178.578064] usb usb2: USB disconnect, device number 1
      [  178.586565] xhci-hcd xhci-hcd.0.auto: USB bus 2 deregistered
      [  178.592585] xhci-hcd xhci-hcd.0.auto: remove, state 1
      [  178.597924] usb usb1: USB disconnect, device number 1
      [  178.603248] usb 1-1: new high-speed USB device number 2 using xhci-hcd
      [  190.597337] INFO: task kworker/u4:0:6 blocked for more than 10 seconds.
      [  190.604273]       Not tainted 4.0.0-rc1-00024-g6111320 #1058
      [  190.610228] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  190.618443] kworker/u4:0    D c05c0ac0     0     6      2 0x00000000
      [  190.625120] Workqueue: usb_otg usb_otg_work
      [  190.629533] [<c05c0ac0>] (__schedule) from [<c05c10ac>] (schedule+0x34/0x98)
      [  190.636915] [<c05c10ac>] (schedule) from [<c05c1318>] (schedule_preempt_disabled+0xc/0x10)
      [  190.645591] [<c05c1318>] (schedule_preempt_disabled) from [<c05c23d0>] (mutex_lock_nested+0x1ac/0x3fc)
      [  190.655353] [<c05c23d0>] (mutex_lock_nested) from [<c046cf8c>] (usb_disconnect+0x3c/0x208)
      [  190.664043] [<c046cf8c>] (usb_disconnect) from [<c0470cf0>] (_usb_remove_hcd+0x98/0x1d8)
      [  190.672535] [<c0470cf0>] (_usb_remove_hcd) from [<c0485da8>] (usb_otg_start_host+0x50/0xf4)
      [  190.681299] [<c0485da8>] (usb_otg_start_host) from [<c04849a4>] (otg_set_protocol+0x5c/0xd0)
      [  190.690153] [<c04849a4>] (otg_set_protocol) from [<c0484b88>] (otg_set_state+0x170/0xbfc)
      [  190.698735] [<c0484b88>] (otg_set_state) from [<c0485740>] (otg_statemachine+0x12c/0x470)
      [  190.707326] [<c0485740>] (otg_statemachine) from [<c0053c84>] (process_one_work+0x1b4/0x4a0)
      [  190.716162] [<c0053c84>] (process_one_work) from [<c00540f8>] (worker_thread+0x154/0x44c)
      [  190.724742] [<c00540f8>] (worker_thread) from [<c0058f88>] (kthread+0xd4/0xf0)
      [  190.732328] [<c0058f88>] (kthread) from [<c000e810>] (ret_from_fork+0x14/0x24)
      [  190.739898] 5 locks held by kworker/u4:0/6:
      [  190.744274]  #0:  ("%s""usb_otg"){.+.+.+}, at: [<c0053bf4>] process_one_work+0x124/0x4a0
      [  190.752799]  #1:  ((&otgd->work)){+.+.+.}, at: [<c0053bf4>] process_one_work+0x124/0x4a0
      [  190.761326]  #2:  (&otgd->fsm.lock){+.+.+.}, at: [<c048562c>] otg_statemachine+0x18/0x470
      [  190.769934]  #3:  (usb_bus_list_lock){+.+.+.}, at: [<c0470ce8>] _usb_remove_hcd+0x90/0x1d8
      [  190.778635]  #4:  (&dev->mutex){......}, at: [<c046cf8c>] usb_disconnect+0x3c/0x208
      [  190.786700] INFO: task kworker/1:0:14 blocked for more than 10 seconds.
      [  190.793633]       Not tainted 4.0.0-rc1-00024-g6111320 #1058
      [  190.799567] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  190.807783] kworker/1:0     D c05c0ac0     0    14      2 0x00000000
      [  190.814457] Workqueue: usb_hub_wq hub_event
      [  190.818866] [<c05c0ac0>] (__schedule) from [<c05c10ac>] (schedule+0x34/0x98)
      [  190.826252] [<c05c10ac>] (schedule) from [<c05c4e40>] (schedule_timeout+0x13c/0x1ec)
      [  190.834377] [<c05c4e40>] (schedule_timeout) from [<c05c19f0>] (wait_for_common+0xbc/0x150)
      [  190.843062] [<c05c19f0>] (wait_for_common) from [<bf068a3c>] (xhci_setup_device+0x164/0x5cc [xhci_hcd])
      [  190.852986] [<bf068a3c>] (xhci_setup_device [xhci_hcd]) from [<c046b7f4>] (hub_port_init+0x3f4/0xb10)
      [  190.862667] [<c046b7f4>] (hub_port_init) from [<c046eb64>] (hub_event+0x704/0x1018)
      [  190.870704] [<c046eb64>] (hub_event) from [<c0053c84>] (process_one_work+0x1b4/0x4a0)
      [  190.878919] [<c0053c84>] (process_one_work) from [<c00540f8>] (worker_thread+0x154/0x44c)
      [  190.887503] [<c00540f8>] (worker_thread) from [<c0058f88>] (kthread+0xd4/0xf0)
      [  190.895076] [<c0058f88>] (kthread) from [<c000e810>] (ret_from_fork+0x14/0x24)
      [  190.902650] 5 locks held by kworker/1:0/14:
      [  190.907023]  #0:  ("usb_hub_wq"){.+.+.+}, at: [<c0053bf4>] process_one_work+0x124/0x4a0
      [  190.915454]  #1:  ((&hub->events)){+.+.+.}, at: [<c0053bf4>] process_one_work+0x124/0x4a0
      [  190.924070]  #2:  (&dev->mutex){......}, at: [<c046e490>] hub_event+0x30/0x1018
      [  190.931768]  #3:  (&port_dev->status_lock){+.+.+.}, at: [<c046eb50>] hub_event+0x6f0/0x1018
      [  190.940558]  #4:  (&bus->usb_address0_mutex){+.+.+.}, at: [<c046b458>] hub_port_init+0x58/0xb10
      Signed-off-by: NRoger Quadros <rogerq@ti.com>
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad6b1d91
  17. 10 5月, 2015 2 次提交
  18. 08 4月, 2015 1 次提交
  19. 18 3月, 2015 1 次提交
  20. 11 3月, 2015 1 次提交
  21. 07 3月, 2015 1 次提交
    • A
      xhci: fix reporting of 0-sized URBs in control endpoint · 45ba2154
      Aleksander Morgado 提交于
      When a control transfer has a short data stage, the xHCI controller generates
      two transfer events: a COMP_SHORT_TX event that specifies the untransferred
      amount, and a COMP_SUCCESS event. But when the data stage is not short, only the
      COMP_SUCCESS event occurs. Therefore, xhci-hcd must set urb->actual_length to
      urb->transfer_buffer_length while processing the COMP_SUCCESS event, unless
      urb->actual_length was set already by a previous COMP_SHORT_TX event.
      
      The driver checks this by seeing whether urb->actual_length == 0, but this alone
      is the wrong test, as it is entirely possible for a short transfer to have an
      urb->actual_length = 0.
      
      This patch changes the xhci driver to rely on a new td->urb_length_set flag,
      which is set to true when a COMP_SHORT_TX event is received and the URB length
      updated at that stage.
      
      This fixes a bug which affected the HSO plugin, which relies on URBs with
      urb->actual_length == 0 to halt re-submitting the RX URB in the control
      endpoint.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAleksander Morgado <aleksander@aleksander.es>
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      45ba2154
  22. 25 2月, 2015 1 次提交
  23. 10 1月, 2015 3 次提交
  24. 03 12月, 2014 2 次提交
  25. 22 11月, 2014 1 次提交
    • M
      USB: xhci: Reset a halted endpoint immediately when we encounter a stall. · 8e71a322
      Mathias Nyman 提交于
      If a device is halted and reuturns a STALL, then the halted endpoint
      needs to be cleared both on the host and device side. The host
      side halt is cleared by issueing a xhci reset endpoint command. The device side
      is cleared with a ClearFeature(ENDPOINT_HALT) request, which should
      be issued by the device driver if a URB reruen -EPIPE.
      
      Previously we cleared the host side halt after the device side was cleared.
      To make sure the host side halt is cleared in time we want to issue the
      reset endpoint command immedialtely when a STALL status is encountered.
      
      Otherwise we end up not following the specs and not returning -EPIPE
      several times in a row when trying to transfer data to a halted endpoint.
      
      Fixes: bcef3fd5 (USB: xhci: Handle errors that cause endpoint halts.)
      Cc: <stable@vger.kernel.org> # v2.6.33+
      Tested-by: NFelipe Balbi <balbi@ti.com>
      Signed-off-by: NMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8e71a322