1. 03 Sep, 2013 5 commits
    • lockref: implement lockless reference count updates using cmpxchg() · bc08b449
      Linus Torvalds committed
      Instead of taking the spinlock, the lockless versions atomically check
      that the lock is not taken, and do the reference count update using a
      cmpxchg() loop.  This is semantically identical to doing the reference
      count update protected by the lock, but avoids the "wait for lock"
      contention that you get when accesses to the reference count are
      contended.
      
      Note that a "lockref" is absolutely _not_ equivalent to an atomic_t.
      Even when the lockref reference counts are updated atomically with
      cmpxchg, the fact that they also verify the state of the spinlock means
      that the lockless updates can never happen while somebody else holds the
      spinlock.
      
      So while "lockref_put_or_lock()" looks a lot like just another name for
      "atomic_dec_and_lock()", and both optimize to lockless updates, they are
      fundamentally different: the decrement done by atomic_dec_and_lock() is
      truly independent of any lock (as long as it doesn't decrement to zero),
      so a locked region can still see the count change.
      
      The lockref structure, in contrast, really is a *locked* reference
      count.  If you hold the spinlock, the reference count will be stable and
      you can modify the reference count without using atomics, because even
      the lockless updates will see and respect the state of the lock.
      
      In order to enable the cmpxchg lockless code, the architecture needs to
      do three things:
      
       (1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit
           in an aligned u64, and have a "cmpxchg()" implementation that works
           on such a u64 data type.
      
       (2) define a helper function to test for a spinlock being unlocked
           ("arch_spin_value_unlocked()")
      
       (3) select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its
           Kconfig file.
      
      This enables it for x86-64 (but not 32-bit, we'd need to make sure
      cmpxchg() turns into the proper cmpxchg8b in order to enable it for
      32-bit mode).
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
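The cmpxchg() loop described in this commit can be sketched in userspace C11. This is only an illustration under stated assumptions, not the kernel code: the packing of a 32-bit lock word and a 32-bit count into one `uint64_t`, the convention that a lock word of 0 means "unlocked", and the names `lockref_get_lockless`, `pack`, `lock_word` and `count_of` are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low 32 bits hold the lock word (0 == unlocked),
 * high 32 bits hold the reference count. */
struct lockref {
    _Atomic uint64_t lock_count;
};

static uint32_t lock_word(uint64_t v) { return (uint32_t)v; }
static uint32_t count_of(uint64_t v)  { return (uint32_t)(v >> 32); }

static uint64_t pack(uint32_t lock, uint32_t count)
{
    return ((uint64_t)count << 32) | lock;
}

/* Try to increment the count without taking the lock.  Gives up and
 * returns false as soon as the lock word is observed as held; a real
 * caller would then fall back to the ordinary spinlock path. */
static bool lockref_get_lockless(struct lockref *ref)
{
    uint64_t old = atomic_load(&ref->lock_count);

    while (lock_word(old) == 0) {
        uint64_t next = pack(0, count_of(old) + 1);

        /* The compare covers lock word and count together, so the
         * update only succeeds if the lock is still free.  On failure
         * 'old' is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&ref->lock_count, &old, next))
            return true;
    }
    return false;
}
```

Because the comparison spans the whole 64-bit word, an update can never slip in while another thread holds the lock, which is the property that separates a lockref from a plain atomic counter.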
    • lockref: uninline lockref helper functions · 2f4f12e5
      Linus Torvalds committed
      They aren't very good to inline, since they already call external
      functions (the spinlock code), and we're going to create rather more
      complicated versions of them that can do the reference count updates
      locklessly.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock() · 15570086
      Linus Torvalds committed
      This moves __d_rcu_to_refcount() from <linux/dcache.h> into fs/namei.c
      and re-implements it using the lockref infrastructure instead.  It also
      adds a lot of comments about what is actually going on, because turning
      a dentry that was looked up using RCU into a long-lived reference
      counted entry is one of the more subtle parts of the rcu walk.
      
      We also used to be _particularly_ subtle in unlazy_walk() where we
      re-validate both the dentry and its parent using the same sequence
      count.  We used to do it by nesting the locks and then verifying the
      sequence count just once.
      
      That was silly, because nested locking is expensive, but the sequence
      count check is not.  So this just re-validates the dentry and the parent
      separately, avoiding the nested locking, and making the lockref lookup
      possible.
      Acked-by: Waiman Long <waiman.long@hp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vfs: use lockref_get_not_zero() for optimistic lockless dget_parent() · df3d0bbc
      Waiman Long committed
      A valid parent pointer is always going to have a non-zero reference
      count, but if we look up the parent optimistically without locking, we
      have to protect against the (very unlikely) race against renaming
      changing the parent from under us.
      
      We do that by using lockref_get_not_zero(), and then re-checking the
      parent pointer after getting a valid reference.
      
      [ This is a re-implementation of a chunk from the original patch by
        Waiman Long: "dcache: Enable lockless update of dentry's refcount".
        I've completely rewritten the patch-series and split it up, but I'm
        attributing this part to Waiman as it's close enough to his earlier
        patch  - Linus ]
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
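The "take a reference, then re-check the parent pointer" pattern from this commit might look roughly like the following userspace sketch. It makes simplifying assumptions: a plain atomic counter stands in for the dentry lockref, RCU (which in the kernel keeps the parent's memory valid during the lookup) is left out, and `struct node`, `get_not_zero`, `put` and `get_parent_optimistic` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry. */
struct node {
    atomic_uint refcount;          /* simplified: plain atomic, not a lockref */
    struct node *_Atomic parent;   /* a concurrent rename may change this */
};

/* Increment the count unless it is zero; returns 1 on success. */
static int get_not_zero(struct node *n)
{
    unsigned int old = atomic_load(&n->refcount);

    while (old != 0) {
        unsigned int next = old + 1;

        if (atomic_compare_exchange_weak(&n->refcount, &old, next))
            return 1;
    }
    return 0;
}

static void put(struct node *n)
{
    atomic_fetch_sub(&n->refcount, 1);
}

/* Returns the parent with a reference held, or NULL when the caller
 * has to retry through the locked slow path. */
static struct node *get_parent_optimistic(struct node *n)
{
    struct node *p = atomic_load(&n->parent);

    if (get_not_zero(p)) {
        /* Re-check: a rename may have swapped the parent meanwhile. */
        if (p == atomic_load(&n->parent))
            return p;
        put(p);   /* stale parent: drop the reference we just took */
    }
    return NULL;
}
```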
    • lockref: add 'lockref_get_or_lock()' helper · b3abd802
      Linus Torvalds committed
      This behaves like "lockref_get_not_zero()", but instead of doing nothing
      if the count was zero, it returns with the lock held.
      
      This allows callers to revalidate the lockref-protected data structure
      if required even if the count was zero to begin with, and possibly
      increment the count if it passes muster.
      
      In particular, the dentry code wants this when it wants to turn an
      RCU-protected dentry into a stable refcounted one: if the dentry count
      is zero, but the sequence number still validates the dentry, we can take
      a reference to it.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
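The difference between the two helpers can be shown with a small lock-based sketch. It follows the semantics described in the message (do nothing on a zero count versus return with the lock held), but it is not the kernel source: the `atomic_flag` spinlock and the `ref_lock`/`ref_unlock` helpers are hypothetical userspace stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>

struct lockref {
    atomic_flag lock;     /* stand-in for the kernel spinlock */
    unsigned int count;   /* only modified with the lock held */
};

static void ref_lock(struct lockref *r)
{
    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void ref_unlock(struct lockref *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Returns 1 if the count was incremented, 0 if it was zero.
 * The lock is always released before returning. */
static int lockref_get_not_zero(struct lockref *r)
{
    int got = 0;

    ref_lock(r);
    if (r->count) {
        r->count++;
        got = 1;
    }
    ref_unlock(r);
    return got;
}

/* Returns 1 (lock released) if the count was incremented.
 * Returns 0 WITH THE LOCK STILL HELD if the count was zero, so the
 * caller can revalidate and decide whether to take a reference. */
static int lockref_get_or_lock(struct lockref *r)
{
    ref_lock(r);
    if (!r->count)
        return 0;
    r->count++;
    ref_unlock(r);
    return 1;
}
```

A caller that gets 0 back from `lockref_get_or_lock()` owns the lock: it can check its own validity condition (the dentry sequence number, in the case above), bump `count` directly, and then unlock.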
  2. 29 Aug, 2013 3 commits
    • vfs: make the dentry cache use the lockref infrastructure · 98474236
      Waiman Long committed
      This just replaces the dentry count/lock combination with the lockref
      structure that contains both a count and a spinlock, and does the
      mechanical conversion to use the lockref infrastructure.
      
      There are no semantic changes here, it's purely syntactic.  The
      reference lockref implementation uses the spinlock exactly the same way
      that the old dcache code did, and the bulk of this patch is just
      expanding the internal "d_count" use in the dcache code to use
      "d_lockref.count" instead.
      
      This is purely preparation for the real change to make the reference
      count updates be lockless during the 3.12 merge window.
      
      [ As with the previous commit, this is a rewritten version of a concept
        originally from Waiman, so credit goes to him, blame for any errors
        goes to me.
      
        Waiman's patch had some semantic differences for taking advantage of
        the lockless update in dget_parent(), while this patch is
        intentionally a pure search-and-replace change with no semantic
        changes.     - Linus ]
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Add new lockref infrastructure reference implementation · 0f8f2aaa
      Waiman Long committed
      This introduces a new "lockref" structure that supports the concept of
      lockless updates of reference counts that still honor an attached
      spinlock.
      
      NOTE! This reference implementation is not the optimized lockless
      version, rather it is the fallback implementation using standard
      spinlocks.  The actual optimized versions will be merged into 3.12, but
      I wanted to get the infrastructure in place and document the new
      interfaces.
      
      [ Also note that this particular commit is a drastically cut-down minimal
        version of the original patch by Waiman.  In order to properly credit
        the original author I'm marking Waiman as the author here, but in the
        end this patch bears little resemblance to the patch by Waiman.  So
        blame any errors on me editing things down to the point where I can
        introduce the infrastructure before the merge window for 3.12 actually
        opens.     - Linus ]
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
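A spinlock-only fallback of the kind described here could look like the sketch below: a count and a lock in one structure, with every update done under the lock. It is a userspace approximation, not the kernel header. The `atomic_flag` lock and the `ref_lock`/`ref_unlock` helpers are hypothetical, and the return convention of `lockref_put_or_lock()` (0 with the lock held when the count is 1 or less) is an assumption about the interface.

```c
#include <assert.h>
#include <stdatomic.h>

struct lockref {
    atomic_flag lock;     /* stand-in for spinlock_t */
    unsigned int count;   /* plain integer, protected by the lock */
};

static void ref_lock(struct lockref *r)
{
    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ;   /* spin */
}

static void ref_unlock(struct lockref *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Unconditionally take a reference. */
static void lockref_get(struct lockref *r)
{
    ref_lock(r);
    r->count++;
    ref_unlock(r);
}

/* Drop a reference unless it is the last one.  Returns 1 if the count
 * was decremented; returns 0 with the lock held when the count is <= 1,
 * leaving the final teardown decision to the caller. */
static int lockref_put_or_lock(struct lockref *r)
{
    ref_lock(r);
    if (r->count <= 1)
        return 0;
    r->count--;
    ref_unlock(r);
    return 1;
}
```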
    • Revert "fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink" · f0cc6ffb
      Linus Torvalds committed
      This reverts commit bb2314b4.
      
      It wasn't necessarily wrong per se, but we're still busily discussing
      the exact details of this all, so I'm going to revert it for now.
      
      It's true that you can already do flink() through /proc and that flink()
      isn't new.  But as Brad Spengler points out, some secure environments do
      not mount proc, and flink adds a new interface that can avoid path
      lookup of the source for those kinds of environments.
      
      We may re-do this (and even mark it for stable backporting back in 3.11
      and possibly earlier) once the whole discussion about the interface is done.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 28 Aug, 2013 3 commits
    • Merge tag 'regmap-v3.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · fa8218de
      Linus Torvalds committed
      Pull regmap fixes from Mark Brown:
       "Two changes here:
      
         - Fix a bug in the rbtree code which could cause it to create two
           different cache entries for the same register by adding a single
           register at a time to the cache.  This isn't awesome for
           performance but it's non-invasive which we need for this late in
           the release cycle and the I/O costs we're trying to avoid are high.
      
         - Add another header used in the !CONFIG_REGMAP stubs where we had
           been relying on implicit inclusion"
      
      * tag 'regmap-v3.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: rbtree: Fix overlapping rbnodes.
        regmap: Add another missing header for !CONFIG_REGMAP stubs
    • Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 0c6b5c5b
      Linus Torvalds committed
      Pull powerpc fixes from Ben Herrenschmidt:
       "Here are 3 bug fixes that should probably go into 3.11 since I'm also
        tagging them for stable.
      
        One fixes our old /proc/powerpc/lparcfg file, which provides partition
        information when running under our hypervisor and also acts as a
        user-triggerable Oops when not :-(
      
        The other two respectively are a one liner to fix a HVSI protocol
        handshake problem causing the console to fail to show up on a bunch of
        machines until we reach userspace, which I deem annoying enough to
        warrant going to stable, and a nasty gcc miscompile causing us to pass
        virtual instead of physical addresses to the firmware under some
        circumstances"
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc/hvsi: Increase handshake timeout from 200ms to 400ms.
        powerpc: Work around gcc miscompilation of __pa() on 64-bit
        powerpc: Don't Oops when accessing /proc/powerpc/lparcfg without hypervisor
    • mm: move_ptes -- Set soft dirty bit depending on pte type · 6dec97dc
      Cyrill Gorcunov committed
      Dave reported corrupted swap entries
      
       | [ 4588.541886] swap_free: Unused swap offset entry 00002d15
       | [ 4588.541952] BUG: Bad page map in process trinity-kid12  pte:005a2a80 pmd:22c01f067
      
      and Hugh pointed out that move_ptes sets the _PAGE_SOFT_DIRTY bit
      regardless of the type of entry the pte holds.  The trick here is that
      when we carry soft dirty status in swap entries we have to use
      _PAGE_SWP_SOFT_DIRTY instead, because this is the only place in the pte
      which can be used for our own needs without intersecting with bits owned
      by the swap entry type/offset.
      Reported-and-tested-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Analyzed-by: Hugh Dickins <hughd@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
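The idea of picking the soft-dirty bit by pte type can be sketched as follows. Everything concrete here is an assumption made for illustration: the bit positions are invented (the real layout is architecture-specific), swap type/offset is pretended to live in bits 8 and up, file-mapped ptes are ignored, and `move_soft_dirty`, `pte_is_present` and `pte_is_swap` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pte_t;

/* Illustrative bit positions only, NOT the real x86 layout. */
#define PTE_PRESENT        (1ull << 0)
#define PTE_SOFT_DIRTY     (1ull << 11)  /* meaningful only in present ptes */
#define PTE_SWP_SOFT_DIRTY (1ull << 7)   /* spare bit below swap type/offset */

static int pte_is_present(pte_t pte)
{
    return (pte & PTE_PRESENT) != 0;
}

/* In this toy model every non-empty, non-present pte is a swap entry. */
static int pte_is_swap(pte_t pte)
{
    return pte != 0 && !pte_is_present(pte);
}

/* Mark a pte soft-dirty using the bit that is valid for its type.
 * Setting PTE_SOFT_DIRTY on a swap entry would land inside the swap
 * type/offset field and corrupt it. */
static pte_t move_soft_dirty(pte_t pte)
{
    if (pte_is_present(pte))
        return pte | PTE_SOFT_DIRTY;
    if (pte_is_swap(pte))
        return pte | PTE_SWP_SOFT_DIRTY;
    return pte;   /* empty entry: nothing to mark */
}
```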
  4. 27 Aug, 2013 6 commits
  5. 26 Aug, 2013 5 commits
  6. 25 Aug, 2013 8 commits
  7. 24 Aug, 2013 10 commits
    • Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 89b53e50
      Linus Torvalds committed
      Pull libata fixes from Tejun Heo:
       "This contains three commits all of which are updates for specific
        devices which aren't too widespread.  Pretty limited scope and nothing
        too interesting or dangerous"
      
      * 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        sata_fsl: save irqs while coalescing
        libata: apply behavioral quirks to sil3826 PMP
        sata, highbank: fix ordering of SGPIO signals
    • Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · e2982a04
      Linus Torvalds committed
      Pull cgroup fix from Tejun Heo:
       "A late fix for cgroup.
      
        This fixes a behavior regression visible to userland which was created
        by a commit merged during -rc1.  While the behavior change isn't too
        likely to be noticeable, the fix is relatively low risk and we'll need
        to backport it through -stable anyway if the bug gets released"
      
      * 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cpuset: fix a regression in validating config change
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · f07823e1
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "Ben was on holidays for a week so a few nouveau regression fixes
        backed up, but they all seem necessary.
      
        Otherwise one i915 and one gma500 fix"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        gma500: Fix SDVO turning off randomly
        drm/nv04/disp: fix framebuffer pin refcounting
        drm/nouveau/mc: fix race condition between constructor and request_irq()
        drm/nouveau: fix reclocking on nv40
        drm/nouveau/ltcg: fix allocating memory as free
        drm/nouveau/ltcg: fix ltcg memory initialization after suspend
        drm/nouveau/fb: fix null derefs in nv49 and nv4e init
        drm/i915: Invalidate TLBs for the rings after a reset
    • usb: phy: fix build breakage · 52d5b9ab
      Anatolij Gustschin committed
      Commit 94ae9843 (usb: phy: rename all phy drivers to phy-$name-usb.c)
      renamed drivers/usb/phy/otg_fsm.h to drivers/usb/phy/phy-fsm-usb.h
      but changed drivers/usb/phy/phy-fsm-usb.c to include the non-existent
      "phy-otg-fsm.h" instead of the new "phy-fsm-usb.h". This breaks the build:
        ...
        drivers/usb/phy/phy-fsm-usb.c:32:25: fatal error: phy-otg-fsm.h: No such file or directory
        compilation terminated.
        make[3]: *** [drivers/usb/phy/phy-fsm-usb.o] Error 1
      
      That commit also failed to update drivers/usb/phy/phy-fsl-usb.h
      to include the new "phy-fsm-usb.h" instead of "otg_fsm.h", resulting
      in another build breakage:
        ...
        In file included from drivers/usb/phy/phy-fsl-usb.c:46:0:
        drivers/usb/phy/phy-fsl-usb.h:18:21: fatal error: otg_fsm.h: No such file or directory
        compilation terminated.
        make[3]: *** [drivers/usb/phy/phy-fsl-usb.o] Error 1
      
      Fix both issues.
      Signed-off-by: Anatolij Gustschin <agust@denx.de>
      Cc: stable <stable@vger.kernel.org> # 3.10+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • USB: OHCI: add missing PCI PM callbacks to ohci-pci.c · 9a11899c
      Alan Stern committed
      Commit c1117afb (USB: OHCI: make ohci-pci a separate driver)
      neglected to preserve the entries for the pci_suspend and pci_resume
      driver callbacks.  As a result, OHCI controllers don't work properly
      during suspend and after hibernation.
      
      This patch adds the missing callbacks to the driver.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Reported-and-tested-by: Steve Cotton <steve@s.cotton.clara.co.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: comedi: bug-fix NULL pointer dereference on failed attach · 3955dfa8
      Ian Abbott committed
      Commit dcd7b8bd ("staging: comedi: put
      module _after_ detach" by myself) reversed a couple of calls in
      `comedi_device_attach()` when recovering from an error returned by the
      low-level driver's 'attach' handler.  Unfortunately, that introduced a
      NULL pointer dereference bug as `dev->driver` is NULL after the call to
      `comedi_device_detach()`.   We still have a pointer to the low-level
      comedi driver structure in the `driv` variable, so use that instead.
      Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
      Cc: <stable@vger.kernel.org> # 3.10+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
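The shape of this fix, reduced to a toy model: once detach has cleared the device's driver pointer, the error path has to release the module through the pointer it saved earlier. The types and functions below (`struct driver`, `struct device`, `device_detach`, `module_put`, `attach_failed`) are hypothetical userspace stand-ins and only mimic the ordering described in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the low-level driver and the device. */
struct driver {
    int module_refs;          /* pretend module reference count */
};

struct device {
    struct driver *driver;    /* cleared by detach */
};

static void device_detach(struct device *dev)
{
    dev->driver = NULL;       /* after this, dev->driver must not be used */
}

static void module_put(struct driver *drv)
{
    drv->module_refs--;
}

/* Error path after a failed attach: detach first, then drop the module
 * reference through the saved 'driv' pointer.  Using dev->driver here
 * would dereference NULL, which is the bug being fixed. */
static void attach_failed(struct device *dev, struct driver *driv)
{
    device_detach(dev);
    module_put(driv);
}
```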
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 41a00f79
      Linus Torvalds committed
      Merge networking fixes from David Miller:
      
       1) Revert Johannes Berg's genetlink locking fix, because it causes
          regressions.
      
          Johannes and Pravin Shelar are working on fixing things properly.
      
       2) Do not drop ipv6 ICMP messages without a redirected header option,
          they are legal.  From Duan Jiong.
      
       3) Missing error return propagation in probing of via-ircc driver.
          From Alexey Khoroshilov.
      
       4) Do not clear out broadcast/multicast/unicast/WOL bits in r8169 when
          initializing, from Peter Wu.
      
       5) realtek phy driver programs wrong interrupt status bit, from
          Giuseppe CAVALLARO.
      
       6) Fix statistics regression in AF_PACKET code, from Willem de Bruijn.
      
       7) Bridge code uses wrong bitmap length, from Toshiaki Makita.
      
       8) SFC driver uses wrong indexes to look up MAC filters, from Ben
          Hutchings.
      
       9) Don't pass stack buffers into usb control operations in hso driver,
          from Daniel Gimpelevich.
      
      10) Multiple ipv6 fragmentation headers in one packet is illegal and
          such packets should be dropped, from Hannes Frederic Sowa.
      
      11) When TCP sockets are "repaired" as part of checkpoint/restart, the
          timestamp field of SKBs need to be refreshed otherwise RTOs can be
          wildly off.  From Andrey Vagin.
      
      12) Fix memcpy args (uses 'address of pointer' instead of 'pointer') in
          hostp driver.  From Dan Carpenter.
      
      13) nl80211hdr_put() doesn't return an ERR_PTR, but some code believes
          it does.  From Dan Carpenter.
      
      14) Fix regression in wireless SME disconnects, from Johannes Berg.
      
      15) Don't use a stack buffer for DMA in zd1201 USB wireless driver, from
          Jussi Kivilinna.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
        ipv4: expose IPV4_DEVCONF
        ipv6: handle Redirect ICMP Message with no Redirected Header option
        be2net: fix disabling TX in be_close()
        Revert "genetlink: fix family dump race"
        hso: Fix stack corruption on some architectures
        hso: Earlier catch of error condition
        sfc: Fix lookup of default RX MAC filters when steered using ethtool
        bridge: Use the correct bit length for bitmap functions in the VLAN code
        packet: restore packet statistics tp_packets to include drops
        net: phy: rtl8211: fix interrupt on status link change
        r8169: remember WOL preferences on driver load
        via-ircc: don't return zero if via_ircc_open() failed
        macvtap: Ignore tap features when VNET_HDR is off
        macvtap: Correctly set tap features when IFF_VNET_HDR is disabled.
        macvtap: simplify usage of tap_features
        tcp: set timestamps for restored skb-s
        bnx2x: set VF DMAE when first function has 0 supported VFs
        bnx2x: Protect against VFs' ndos when SR-IOV is disabled
        bnx2x: prevent VF benign attentions
        bnx2x: Consider DCBX remote error
        ...
    • Merge branch 'akpm' (patches from Andrew Morton) · 3db0d4de
      Linus Torvalds committed
      Merge fixes from Andrew Morton:
       "A few fixes.  One is a licensing change and I don't do licensing, so
        please eyeball that one"
      
      Licensing eye-balled.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        lib/lz4: correct the LZ4 license
        memcg: get rid of swapaccount leftovers
        nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection
        nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error
        drivers/platform/olpc/olpc-ec.c: initialise earlier
    • lib/lz4: correct the LZ4 license · ee8a99bd
      Richard Laager committed
      The LZ4 code is listed as using the "BSD 2-Clause License".
      Signed-off-by: Richard Laager <rlaager@wiktel.com>
      Acked-by: Kyungsik Lee <kyungsik.lee@lge.com>
      Cc: Chanho Min <chanho.min@lge.com>
      Cc: Richard Yao <ryao@gentoo.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ The 2-clause BSD can be just converted into GPL, but that's rude and
        pointless, so don't do it   - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: get rid of swapaccount leftovers · 07555ac1
      Michal Hocko committed
      The swapaccount kernel parameter without any values has been removed by
      commit a2c8990a ("memsw: remove noswapaccount kernel parameter") but
      it seems that we didn't get rid of all the leftovers.
      
      Make sure that the menuconfig help text and kernel-parameters.txt are
      clear about the value for the parameter, and remove the stale comment,
      which is not very useful on its own.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Reported-by: Gergely Risko <gergely@risko.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>