1. 10 Aug, 2011 5 commits
    • xhci: Remove TDs from TD lists when URBs are canceled. · 585df1d9
      Sarah Sharp committed
      When a driver tries to cancel an URB, and the host controller is dying,
      xhci_urb_dequeue will giveback the URB without removing the xhci_tds
      that comprise that URB from the td_list or the cancelled_td_list.  This
      can cause a race condition between the driver calling URB dequeue and
      the stop endpoint command watchdog timer.
      
      If the timer fires on a dying host, and a driver attempts to resubmit
      while the watchdog timer has dropped the xhci->lock to giveback a
      cancelled URB, URBs may be given back by the xhci_urb_dequeue() function.
      At that point, the URB's priv pointer will be freed and set to NULL, but
      the TDs will remain on the td_list.  This will cause an oops in
      xhci_giveback_urb_in_irq() when the watchdog timer attempts to loop
      through the endpoints' td_lists, giving back killed URBs.
      
      Make sure that xhci_urb_dequeue() removes TDs from the TD lists and
      canceled TD lists before it gives back the URB.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
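The ordering this commit asks for can be pictured with a minimal userspace sketch. The names `td`, `urb_sketch`, `queue_urb`, and `dequeue_urb` are hypothetical and are not the xHCI driver's structures or functions; the sketch only shows the idea of unlinking every TD from its list before the URB's private data is freed, so that a later walk of the list cannot reach freed memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transfer descriptor on a circular list. */
struct td {
    struct td *prev, *next;
};

/* Hypothetical stand-in for an URB and its private TD array. */
struct urb_sketch {
    struct td *tds;   /* freed when the URB is given back */
    size_t     ntds;
};

static void list_init(struct td *head)
{
    head->prev = head->next = head;
}

static void list_add_tail(struct td *head, struct td *td)
{
    td->prev = head->prev;
    td->next = head;
    head->prev->next = td;
    head->prev = td;
}

/* Unlink the TD and point it at itself, so a second unlink is harmless. */
static void list_del_init(struct td *td)
{
    td->prev->next = td->next;
    td->next->prev = td->prev;
    td->prev = td->next = td;
}

static size_t list_len(const struct td *head)
{
    size_t n = 0;
    for (const struct td *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Allocate ntds TDs for the URB and queue them all on the list. */
static int queue_urb(struct td *head, struct urb_sketch *urb, size_t ntds)
{
    urb->tds = calloc(ntds, sizeof(*urb->tds));
    if (!urb->tds)
        return -1;
    urb->ntds = ntds;
    for (size_t i = 0; i < ntds; i++)
        list_add_tail(head, &urb->tds[i]);
    return 0;
}

/* Cancel: take every TD off the list first, only then free the memory. */
static void dequeue_urb(struct urb_sketch *urb)
{
    for (size_t i = 0; i < urb->ntds; i++)
        list_del_init(&urb->tds[i]);
    free(urb->tds);
    urb->tds = NULL;
    urb->ntds = 0;
}
```

Freeing first and unlinking later (or never) would leave list nodes pointing into freed memory, which is the shape of the oops described in the commit message.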
    • xhci: Fix failed enqueue in the middle of isoch TD. · 522989a2
      Sarah Sharp committed
      When an isochronous transfer is enqueued, xhci_queue_isoc_tx_prepare()
      will ensure that there is enough room on the transfer rings for all of the
      isochronous TDs for that URB.  However, when xhci_queue_isoc_tx() is
      enqueueing individual isoc TDs, the prepare_transfer() function can fail
      if the endpoint state has changed to disabled, error, or some other
      unknown state.
      
      With the current code, if the Nth TD (not the first TD) fails, the ring is
      left in a sorry state.  The partially enqueued TDs are left on the ring,
      and the first TRB of the TD is not given back to the hardware.  The
      enqueue pointer is left on the TRB after the last successfully enqueued
      TD.  This means the ring is basically useless.  Any new transfers will be
      enqueued after the failed TDs, which the hardware will never read because
      the cycle bit indicates it does not own them.  The ring will fill up with
      untransferred TDs, and the endpoint will be basically unusable.
      
      The untransferred TDs will also remain on the TD list.  Since the td_list
      is a FIFO, this basically means the ring handler will be waiting on TDs
      that will never be completed (or worse, dereference memory that doesn't
      exist any more).
      
      Change the code to clean up the isochronous ring after a failed transfer.
      If the first TD failed, simply return and allow the xhci_urb_enqueue
      function to free the urb_priv.  If the Nth TD failed, first remove the TDs
      from the td_list.  Then convert the TRBs that were enqueued into No-op
      TRBs.  Make sure to flip the cycle bit on all enqueued TRBs (including any
      link TRBs in the middle or between TDs), but leave the cycle bit of the
      first TRB (which will show software-owned) intact.  Then move the ring
      enqueue pointer back to the first TRB and make sure to change the
      xhci_ring's cycle state to what is appropriate for that ring segment.
      
      This ensures that the No-op TRBs will be overwritten by subsequent TDs,
      and the hardware will not start executing random TRBs because the cycle
      bit was left as hardware-owned.
      
      This bug is unlikely to be hit, but it was something I noticed while
      tracking down the watchdog timer issue.  I verified that the fix works by
      injecting some errors on the 250th isochronous URB queued, although I
      could not verify that the ring is in the correct state because uvcvideo
      refused to talk to the device after the first usb_submit_urb() failed.
      Ring debugging shows that the ring looks correct, however.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
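The cleanup steps described in this commit can be sketched on a toy ring. This is a simplified model under stated assumptions, not the driver's code: the ring is a single fixed array (`RING_SIZE` is hypothetical), wrapping toggles the cycle state the way a link TRB would, and `queue_trb` and `unwind` are illustrative names. A TRB counts as hardware-owned when its cycle bit equals the ring's cycle state for that lap.

```c
#include <stddef.h>

#define RING_SIZE 8   /* hypothetical; real rings use linked segments */

enum trb_type { TRB_NORMAL, TRB_NOOP };

struct trb {
    enum trb_type type;
    unsigned cycle;         /* 0 or 1 */
};

struct ring {
    struct trb trbs[RING_SIZE];
    size_t enq;             /* next slot software will write */
    unsigned cycle_state;   /* cycle value that marks a TRB as hardware-owned */
};

static void ring_init(struct ring *r)
{
    for (size_t i = 0; i < RING_SIZE; i++) {
        r->trbs[i].type = TRB_NOOP;
        r->trbs[i].cycle = 0;
    }
    r->enq = 0;
    r->cycle_state = 1;
}

/* Advance one slot; wrapping toggles the cycle, as a link TRB would. */
static void advance(size_t *pos, unsigned *cycle)
{
    if (++*pos == RING_SIZE) {
        *pos = 0;
        *cycle ^= 1;
    }
}

/* Write one TRB. hold_back leaves it software-owned (inverted cycle bit),
 * which is how the first TRB of a transfer is kept from the hardware. */
static void queue_trb(struct ring *r, int hold_back)
{
    struct trb *t = &r->trbs[r->enq];

    t->type = TRB_NORMAL;
    t->cycle = hold_back ? !r->cycle_state : r->cycle_state;
    advance(&r->enq, &r->cycle_state);
}

/* Undo count TRBs queued from (start, start_cycle); the first was held back. */
static void unwind(struct ring *r, size_t start, unsigned start_cycle,
                   size_t count)
{
    size_t pos = start;
    unsigned cycle = start_cycle;

    for (size_t i = 0; i < count; i++) {
        r->trbs[pos].type = TRB_NOOP;
        if (i != 0)
            r->trbs[pos].cycle ^= 1;   /* flip back to software-owned */
        advance(&pos, &cycle);
    }
    /* Rewind so the next transfer overwrites the No-op TRBs. */
    r->enq = start;
    r->cycle_state = start_cycle;
}
```

A caller would record `enq` and `cycle_state` before queueing the first TRB of an URB and pass them to `unwind` if a later TD fails. The first TRB's cycle bit is left alone because it was never handed to the hardware.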
    • xhci: Fix memory leak during failed enqueue. · d13565c1
      Sarah Sharp committed
      When the isochronous transfer support was introduced, and the xHCI driver
      switched to using urb->hcpriv to store an "urb_priv" pointer, a couple of
      memory leaks were introduced into the URB enqueue function in its error
      handling paths.
      
      xhci_urb_enqueue allocates urb_priv, but it doesn't free it if changing
      the control endpoint's max packet size fails or the bulk endpoint is in
      the middle of allocating or deallocating streams.
      
      xhci_urb_enqueue also doesn't free urb_priv if any of the four endpoint
      types' enqueue functions fail.  Instead, it expects those functions to
      free urb_priv if an error occurs.  However, the bulk, control, and
      interrupt enqueue functions do not free urb_priv if the endpoint ring is
      NULL.  It will, however, get freed if prepare_transfer() fails in those
      enqueue functions.
      
      Several of the error paths in the isochronous endpoint enqueue function
      also fail to free it.  xhci_queue_isoc_tx_prepare() doesn't free urb_priv
      if prepare_ring() indicates there is not enough room for all the
      isochronous TDs in this URB.  If individual isochronous TDs fail to be
      queued (perhaps due to an endpoint state change), urb_priv is also leaked.
      
      This argues that the freeing of urb_priv should be done in the function
      that allocated it, xhci_urb_enqueue.
      
      This patch looks rather ugly, but refactoring the code will have to wait
      because this patch needs to be backported to stable kernels.
      
      This patch should be backported to kernels as old as 2.6.36.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Andiry Xu <andiry.xu@amd.com>
      Cc: stable@kernel.org
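The ownership rule argued for in this commit (the function that allocates urb_priv is the one that frees it on every error path) can be shown in a small sketch. The `urb_sketch` type and the `queue_tx` and `urb_enqueue` functions are hypothetical simplifications, not the driver's real signatures, and the allocation size is arbitrary.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an URB carrying a private pointer. */
struct urb_sketch {
    void *hcpriv;
};

/* Hypothetical per-endpoint-type enqueue step. It never frees hcpriv;
 * it only reports success (0) or a negative error code. */
static int queue_tx(struct urb_sketch *urb, int simulate_failure)
{
    (void)urb;
    return simulate_failure ? -EINVAL : 0;
}

/* The allocator owns the cleanup: one label handles every failure. */
static int urb_enqueue(struct urb_sketch *urb, int simulate_failure)
{
    int ret;

    urb->hcpriv = malloc(64);
    if (!urb->hcpriv)
        return -ENOMEM;

    ret = queue_tx(urb, simulate_failure);
    if (ret)
        goto free_priv;
    return 0;

free_priv:
    free(urb->hcpriv);
    urb->hcpriv = NULL;
    return ret;
}
```

With cleanup in one place, a callee gaining a new early return cannot leak the allocation, which is the class of bug the commit message lists.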
    • xHCI: report USB2 port in resuming as suspend · 8a8ff2f9
      Andiry Xu committed
      When a USB2 port initiates a remote wakeup, software shall ensure that
      resume is signaled for at least 20ms, and then write '0' to the PLS field.
      Accordingly, the xhci driver does the following:
      
      1. On receiving a remote wakeup event in irq_handler, set the resume_done
         value to jiffies + 20ms, and modify rh_timer to poll root hub status at
         that time;
      2. On receiving a GetPortStatus request, if jiffies is after the
         resume_done value, clear the resume signal and resume_done.
      
      However, if usb_port_resume() is called before the rh_timer has triggered,
      it will indicate the port as Suspend Cleared and skip the clear resume
      signal part. The device will fail the usb_get_status request in
      finish_port_resume(), and usbcore will try a reset-resume instead. The
      device will work OK after reset-resume, but the resume_done value is not
      cleared in this case, and xhci_bus_suspend() will fail because when it
      finds a non-zero resume_done value, it will regard the port as resuming
      and return -EBUSY.
      
      This causes an issue on some platforms where the system fails to suspend
      after a remote wakeup from suspend by USB2 devices connected to an xHCI port.
      
      To fix this issue, report the port status as suspended if resume has been
      signaled for less than 20ms. usb_port_resume() will then wait 25ms and
      check the port status again, so the xHCI driver can clear the resume
      signaling and the resume_done value.
      
      This should be backported to kernels as old as 2.6.37.
      Signed-off-by: Andiry Xu <andiry.xu@amd.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: stable@kernel.org
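The timing rule in this commit can be sketched with a plain tick counter. Everything here is a hypothetical simplification: `port_sketch`, `STAT_SUSPEND`, and the function names are illustrative and not the driver's, and ticks stand in for jiffies. The comparison uses the wraparound-safe signed-difference idiom, and the sketch (like any scheme that uses 0 as "not resuming") ignores the corner case where the deadline itself lands on tick 0.

```c
#include <stdbool.h>

#define STAT_SUSPEND 0x4u   /* hypothetical flag bit for this sketch */

struct port_sketch {
    unsigned long resume_done;   /* tick when resume may end; 0 = not resuming */
};

/* True when tick a is at or after tick b, even across counter wraparound. */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* Remote wakeup seen: resume must be signaled for at least ticks_20ms. */
static void remote_wakeup(struct port_sketch *p, unsigned long now,
                          unsigned long ticks_20ms)
{
    p->resume_done = now + ticks_20ms;
}

/* Status query: keep reporting "suspended" until the deadline has passed,
 * then finish the resume and clear the deadline. */
static unsigned get_port_status(struct port_sketch *p, unsigned long now)
{
    if (p->resume_done == 0)
        return 0;
    if (tick_after_eq(now, p->resume_done)) {
        p->resume_done = 0;
        return 0;
    }
    return STAT_SUSPEND;
}
```

Because an early query still sees the port as suspended, the caller waits and asks again, and that second query is what clears `resume_done`.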
    • xHCI: fix port U3 status check condition · 5ac04bf1
      Andiry Xu committed
      Fix the port U3 status check when clearing the PORT_SUSPEND feature.
      The port status should be masked with PORT_PLS_MASK to check if it's in
      U3 state.
      
      This should be backported to kernels as old as 2.6.37.
      Signed-off-by: Andiry Xu <andiry.xu@amd.com>
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: stable@kernel.org
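Why the mask matters can be seen in a short sketch. The macro values below are believed to mirror the driver's header but should be treated as assumptions, and the commit message does not show the pre-fix expression, so the unmasked test is only an illustration of the kind of check that goes wrong. The link state is a multi-bit field, so testing a single state's bits without masking also matches other states that share those bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout: a 4-bit link state at bits 5..8 of the port status. */
#define PORT_PLS_MASK (0xfu << 5)
#define XDEV_U0       (0x0u << 5)
#define XDEV_U3       (0x3u << 5)
#define XDEV_RESUME   (0xfu << 5)

/* Illustrative buggy form: true for any state sharing a set bit with U3. */
static bool in_u3_unmasked(uint32_t portsc)
{
    return (portsc & XDEV_U3) != 0;
}

/* Masked form: isolate the field, then compare against the exact state. */
static bool in_u3(uint32_t portsc)
{
    return (portsc & PORT_PLS_MASK) == XDEV_U3;
}
```

With these values a port in the Resume state passes the unmasked test but not the masked one.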
  2. 09 Aug, 2011 17 commits
  3. 08 Aug, 2011 8 commits
  4. 07 Aug, 2011 10 commits
    • Fix POSIX ACL permission check · 206b1d09
      Ari Savolainen committed
      After commit 3567866b: "RCUify freeing acls, let check_acl() go ahead in
      RCU mode if acl is cached" posix_acl_permission is being called with an
      unsupported flag and the permission check fails. This patch fixes the issue.
      Signed-off-by: Ari Savolainen <ari.m.savolainen@gmail.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd · c2f340a6
      Linus Torvalds committed
      * 'for-linus' of git://git.open-osd.org/linux-open-osd:
        ore: Make ore its own module
        exofs: Rename raid engine from exofs/ios.c => ore
        exofs: ios: Move to a per inode components & device-table
        exofs: Move exofs specific osd operations out of ios.c
        exofs: Add offset/length to exofs_get_io_state
        exofs: Fix truncate for the raid-groups case
        exofs: Small cleanup of exofs_fill_super
        exofs: BUG: Avoid sbi realloc
        exofs: Remove pnfs-osd private definitions
        nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
    • vfs: optimize inode cache access patterns · 3ddcd056
      Linus Torvalds committed
      The inode structure layout is largely random, and some of the vfs paths
      really do care.  The path lookup in particular is already quite D$
      intensive, and profiles show that accessing the 'inode->i_op->xyz'
      fields is quite costly.
      
      We already optimized the dcache to not unnecessarily load the d_op
      structure for members that are often NULL using the DCACHE_OP_xyz bits
      in dentry->d_flags, and this does something very similar for the inode
      ops that are used during pathname lookup.
      
      It also re-orders the fields so that the fields accessed by 'stat' are
      together at the beginning of the inode structure, and roughly in the
      order accessed.
      
      The effect of this seems to be in the 1-2% range for an empty kernel
      "make -j" run (which is fairly kernel-intensive, mostly in filename
      lookup), so it's visible.  The numbers are fairly noisy, though, and
      likely depend a lot on exact microarchitecture.  So there's more tuning
      to be done.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
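The idea of caching "this operation is not overridden" in a flag word inside the inode, so the hot path never has to load the separate ops structure, can be sketched as follows. All names here (`inode_sketch`, `OPF_FASTPERM`, `do_permission`, the two hooks) are hypothetical and chosen for illustration; the commit message does not list the actual flags or functions.

```c
#include <stddef.h>

#define OPF_FASTPERM 0x1u   /* hypothetical: "no custom permission hook" */

struct inode_ops_sketch {
    int (*permission)(int mask);          /* may be NULL */
};

struct inode_sketch {
    unsigned opflags;                     /* hot: stored in the inode itself */
    const struct inode_ops_sketch *ops;   /* cold: lives elsewhere in memory */
};

/* Default check used when the filesystem supplies no hook. */
static int generic_permission(int mask)
{
    (void)mask;
    return 0;
}

/* Example custom hook that always refuses (13 is EACCES on Linux). */
static int deny_all(int mask)
{
    (void)mask;
    return -13;
}

static int do_permission(struct inode_sketch *inode, int mask)
{
    if (!(inode->opflags & OPF_FASTPERM)) {
        if (inode->ops->permission)
            return inode->ops->permission(mask);
        /* No hook: remember that, so later calls skip the ops load. */
        inode->opflags |= OPF_FASTPERM;
    }
    return generic_permission(mask);
}
```

After the first call on an inode without a hook, later calls touch only `opflags`, which sits with the other frequently used inode fields.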
    • vfs: renumber DCACHE_xyz flags, remove some stale ones · 830c0f0e
      Linus Torvalds committed
      Gcc tends to generate better code with small integers, including the
      DCACHE_xyz flag tests - so move the common ones to be first in the list.
      Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and
      DCACHE_AUTOFS_PENDING values, their users no longer exist in the source
      tree.
      
      And add an "unlikely()" to the DCACHE_OP_COMPARE test, since we want the
      common case to be a nice straight-line fall-through.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7cd4767e
      Linus Torvalds committed
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: Compute protocol sequence numbers and fragment IDs using MD5.
        crypto: Move md5_transform to lib/md5.c
    • ore: Make ore its own module · cf283ade
      Boaz Harrosh committed
      Export everything from ore that needs exporting. Change Kbuild and Kconfig
      to build ore.ko as an independent module. Import ore from exofs.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
    • exofs: Rename raid engine from exofs/ios.c => ore · 8ff660ab
      Boaz Harrosh committed
      ORE stands for "Objects Raid Engine"
      
      This patch is a mechanical rename of everything that was in ios.c
      and its API declaration to an ore.c and an osd_ore.h header. The ore
      engine will later be used by the pnfs objects layout driver.
      
      * File ios.c => ore.c
      
      * Declaration of types and API are moved from exofs.h to a new
        osd_ore.h
      
      * All used types are prefixed by ore_ from their exofs_ name.
      
      * Shift includes from exofs.h to osd_ore.h so osd_ore.h is
        independent, include it from exofs.h.
      
      Other than a pure rename there are no other changes. The next patch
      will move the ore into its own module and will export the API
      to be used by exofs and later the layout driver.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
    • exofs: ios: Move to a per inode components & device-table · 9e9db456
      Boaz Harrosh committed
      Exofs raid engine was saving on memory space by having a single layout-info,
      single pid, and a single device-table, global to the filesystem. Then passing
      a credential and object_id info at the io_state level, private for each
      inode. It would also devise this contraption of rotating the device table
      view for each inode->ino to spread out the device usage.
      
      This is not compatible with the pnfs-objects standard, demanding that
      each inode can have its own layout-info, device-table, and each object
      component its own pid, oid and creds.
      
      So: Bring exofs raid engine to be usable for generic pnfs-objects use by:
      
      * Define an exofs_comp structure that holds obj_id and credential info.
      
      * Break up exofs_layout struct to an exofs_components structure that holds a
        possible array of exofs_comp and the array of devices + the size of the
        arrays.
      
      * Add a "comps" parameter to get_io_state() that specifies the ids creds
        and device array to use for each IO.
      
        This makes it possible to keep the layout global, but the device-table
        view, creds and IDs at the inode level. It only adds two 64-bit members
        to each inode, since some of these members already existed in another form.
      
      * The ios raid engine now accesses layout-info and comps-info through the
        passed pointers. Everything is pre-prepared by the caller for generic
        access of these structures and arrays.
      
      At the exofs Level:
      
      * Super block holds an exofs_components struct that holds the device
        array, previously in layout. The devices there are in device-table
        order. The device-array is twice as big and repeats the device-table
        twice, so each inode's device array can now point to a random device
        and have a round-robin view of the table, making it compatible with
        previous exofs versions.
      
      * Each inode has an exofs_components struct that is initialized at
        load time, with its own view of the device table IDs and creds.
        When doing IO this gets passed to the io_state together with the
        layout.
      
      While performing this change, bugs were found where credentials with the
      wrong IDs were used to access the different SB objects (super.c), as well
      as some dead code. They were never noticed because the target we use does
      not check the credentials.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
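The doubled device array described in this commit can be sketched in a few lines. The names `sb_sketch`, `sb_init`, and `inode_dev_view` are hypothetical, `NUM_DEVS` is an arbitrary size, and picking the start offset as the inode number modulo the table size is an assumption made for illustration. The point is that storing the table twice lets each inode hold a plain pointer into it and still see a full, rotated copy without any index wrapping.

```c
#include <stddef.h>

#define NUM_DEVS 4   /* hypothetical device-table size */

/* Hypothetical superblock: the device table stored twice, back to back. */
struct sb_sketch {
    int devs[2 * NUM_DEVS];
};

static void sb_init(struct sb_sketch *sb, const int *table)
{
    for (size_t i = 0; i < 2 * NUM_DEVS; i++)
        sb->devs[i] = table[i % NUM_DEVS];
}

/* Per-inode view: a pointer into the doubled table. Indexes 0..NUM_DEVS-1
 * of the returned pointer are always in bounds and cover every device once. */
static const int *inode_dev_view(const struct sb_sketch *sb, unsigned long ino)
{
    return &sb->devs[ino % NUM_DEVS];
}
```

For a table {10, 11, 12, 13}, an inode whose view starts at offset 2 reads the devices in the order 12, 13, 10, 11.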
    • exofs: Move exofs specific osd operations out of ios.c · 85e44df4
      Boaz Harrosh committed
      ios.c will be moving to an external library, for use by the
      objects-layout-driver. Remove from it some exofs specific functions.
      
      Also, g_attr_logical_length is used by both inode.c and ios.c;
      move its definition to the latter, to keep it independent.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
    • exofs: Add offset/length to exofs_get_io_state · e1042ba0
      Boaz Harrosh committed
      In future raid code we will need to know the IO offset/length
      and if it's a read or write to determine some of the array
      sizes we'll need.
      
      So add a new exofs_get_rw_state() API for use when
      writing/reading. All other simple cases are left using the
      old way.
      
      The major change to this is that now we need to call
      exofs_get_io_state later at inode.c::read_exec and
      inode.c::write_exec when we actually know these things. So this
      patch is kept separate so I can test things apart from other
      changes.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>