1. 09 May 2016 (1 commit)
    • nfs: per-name sillyunlink exclusion · 884be175
      Committed by Al Viro
      Use d_alloc_parallel() for sillyunlink/lookup exclusion, and an
      explicit rwsem (nfs_rmdir() being a writer and nfs_call_unlink()
      a reader) for rmdir/sillyunlink exclusion.
      
      That ought to make lookup/readdir/!O_CREAT atomic_open really
      parallel on NFS.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
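
      A minimal sketch of the writer side of that rwsem, assuming the
      mainline field name rmdir_sem in struct nfs_inode (the function
      name example_nfs_rmdir is hypothetical; the real read side is
      taken in nfs_async_unlink() and released from an RPC callback,
      hence the _non_owner rwsem primitives there):

        #include <linux/fs.h>
        #include <linux/nfs_fs.h>

        /* hypothetical illustration of the rmdir/sillyunlink exclusion */
        static int example_nfs_rmdir(struct inode *dir, struct dentry *dentry)
        {
                struct nfs_inode *nfsi = NFS_I(d_inode(dentry));
                int error;

                /* writer: no sillyunlink may be in flight inside this dir */
                down_write(&nfsi->rmdir_sem);
                error = NFS_PROTO(dir)->rmdir(dir, &dentry->d_name);
                up_write(&nfsi->rmdir_sem);
                return error;
        }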
  2. 03 May 2016 (7 commits)
    • introduce a parallel variant of ->iterate() · 61922694
      Committed by Al Viro
      New method: ->iterate_shared().  Same arguments as in ->iterate(),
      called with the directory locked only shared.  Once all filesystems
      switch, the old one will be gone.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
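
      For illustration, converting a (hypothetical) filesystem is a
      matter of moving the callback to the new field once it is safe
      without exclusive locking:

        #include <linux/fs.h>

        /* examplefs is hypothetical; same callback, now called with the
         * directory's lock held only shared, so it must not rely on
         * exclusion against concurrent readers of the same directory */
        static int examplefs_readdir(struct file *file, struct dir_context *ctx)
        {
                if (!dir_emit_dots(file, ctx))
                        return 0;
                /* walk entries under the filesystem's own locking */
                return 0;
        }

        const struct file_operations examplefs_dir_operations = {
                .llseek         = generic_file_llseek,
                .read           = generic_read_dir,
                .iterate_shared = examplefs_readdir,    /* was: .iterate */
        };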
    • give readdir(2)/getdents(2)/etc. uniform exclusion with lseek() · 63b6df14
      Committed by Al Viro
      The same exclusion that read() on regular files has, and for the
      same reason.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
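
      Sketch of the pattern, under the assumption that it mirrors
      regular-file read(): fdget_pos()/fdput_pos() take and drop
      ->f_pos_lock when the struct file is shared, so getdents() and
      lseek() on the same open file are serialized (example_getdents is
      a hypothetical wrapper around the real syscall body):

        #include <linux/file.h>
        #include <linux/fs.h>

        static long example_getdents(unsigned int fd, struct dir_context *ctx)
        {
                struct fd f = fdget_pos(fd);    /* takes ->f_pos_lock if shared */
                long error = -EBADF;

                if (f.file) {
                        error = iterate_dir(f.file, ctx);
                        fdput_pos(f);           /* drops ->f_pos_lock */
                }
                return error;
        }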
    • parallel lookups: actual switch to rwsem · 9902af79
      Committed by Al Viro
      ta-da!
      
      The main issue is the lack of down_write_killable(), so places
      like readdir.c have switched to plain inode_lock(); once killable
      variants of the rwsem primitives appear, that will be dealt with.
      
      The lockdep side might also need more work.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
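
      The resulting wrappers, roughly as they read in mainline after
      the switch (i_mutex replaced by i_rwsem):

        static inline void inode_lock(struct inode *inode)
        {
                down_write(&inode->i_rwsem);
        }

        static inline void inode_unlock(struct inode *inode)
        {
                up_write(&inode->i_rwsem);
        }

        /* new: the shared flavour used by parallel lookups and readdir */
        static inline void inode_lock_shared(struct inode *inode)
        {
                down_read(&inode->i_rwsem);
        }

        static inline void inode_unlock_shared(struct inode *inode)
        {
                up_read(&inode->i_rwsem);
        }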
    • parallel lookups machinery, part 4 (and last) · d9171b93
      Committed by Al Viro
      If we *do* run into an in-lookup match, we need to wait for it to
      cease being in-lookup.  Fortunately, we do have unused space in
      in-lookup dentries - d_lru is never looked at until it stops being
      in-lookup.
      
      So we can stash a pointer to a wait_queue_head from the stack
      frame of the caller of ->lookup().  Some precautions are needed
      while waiting, but it's not that hard - we do hold a reference to
      the dentry we are waiting for, so it can't go away.  If it's
      found to be in-lookup, the wait_queue_head is still alive and
      will remain so at least while ->d_lock is held.  Moreover, the
      condition we are waiting for becomes true at the same point where
      everything on that wq gets woken up, so we can just add ourselves
      to the queue once.
      
      d_alloc_parallel() gets a pointer to wait_queue_head_t from its
      caller; lookup_slow() is adjusted, and d_add_ci() is taught to
      use d_alloc_parallel() if the dentry passed to it happens to be
      an in-lookup one (i.e. if it's been called from a parallel lookup).
      
      That's pretty much it - all that remains is to switch ->i_mutex
      to rwsem and have lookup_slow() take it shared.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
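
      Sketched caller side, modeled on lookup_slow() after this series
      (simplified: error handling and the retry loop are omitted, and
      example_lookup_slow is a hypothetical name):

        #include <linux/dcache.h>
        #include <linux/wait.h>

        static struct dentry *example_lookup_slow(const struct qstr *name,
                                                  struct dentry *dir,
                                                  unsigned int flags)
        {
                struct inode *inode = dir->d_inode;
                DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);    /* lives on our stack */
                struct dentry *dentry, *old;

                dentry = d_alloc_parallel(dir, name, &wq);
                if (IS_ERR(dentry))
                        return dentry;
                if (!d_in_lookup(dentry))
                        return dentry;  /* match found; any in-lookup wait is done */

                /* we own the in-lookup dentry; do the actual lookup */
                old = inode->i_op->lookup(inode, dentry, flags);
                d_lookup_done(dentry);  /* leave in-lookup state, wake waiters */
                if (old) {
                        dput(dentry);
                        dentry = old;
                }
                return dentry;
        }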
    • parallel lookups machinery, part 3 · 94bdd655
      Committed by Al Viro
      We will need to be able to check if there is an in-lookup dentry
      with a matching parent/name.  Right now that's impossible, but as
      soon as we start locking directories shared, such beasts will
      appear.
      
      Add a secondary hash for locating those.  Hash chains go through
      the same space where d_alias will be once it's not in-lookup anymore.
      Search is done under the same bitlock we use for modifications -
      with the primary hash we can rely on d_rehash() into the wrong
      chain being the worst that could happen, but here the pointers are
      buggered once it's removed from the chain.  On the other hand,
      the chains are not going to be long and normally we'll end up
      adding to the chain anyway.  That allows us to avoid bothering with
      ->d_lock when doing the comparisons - everything is stable until
      removed from the chain.
      
      New helper: d_alloc_parallel().  Right now it allocates a dentry,
      verifies that no hashed or in-lookup match exists, and adds it to
      the in-lookup hash.
      
      Returns ERR_PTR() for error, hashed match (in the unlikely case it's
      been found) or new dentry.  In-lookup matches trigger BUG() for
      now; that will change in the next commit when we introduce waiting
      for ongoing lookup to finish.  Note that in-lookup matches won't be
      possible until we actually go for shared locking.
      
      lookup_slow() switched to use of d_alloc_parallel().
      
      Again, these commits are separated only to make review easier.
      All this machinery will start doing something useful only when we
      go for shared locking; it's just that the combination is too
      large for my taste.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
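
      A conceptual sketch of the search over the secondary hash
      (heavily simplified from fs/dcache.c: find_in_lookup is a
      hypothetical name, d_same_name() stands in for the
      ->d_compare()-aware comparison, and the sequence-count retry
      logic from the next commit is omitted):

        #include <linux/dcache.h>
        #include <linux/list_bl.h>

        static struct dentry *find_in_lookup(struct hlist_bl_head *b,
                                             struct dentry *parent,
                                             const struct qstr *name)
        {
                struct hlist_bl_node *node;
                struct dentry *dentry;

                hlist_bl_lock(b);       /* same bitlock used for modifications */
                hlist_bl_for_each_entry(dentry, node, b, d_u.d_in_lookup_hash) {
                        if (dentry->d_parent != parent)
                                continue;
                        if (!d_same_name(dentry, parent, name))
                                continue;
                        /* no ->d_lock needed: parent and name are stable
                         * until the dentry leaves this chain */
                        dget(dentry);
                        hlist_bl_unlock(b);
                        return dentry;
                }
                hlist_bl_unlock(b);
                return NULL;
        }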
    • parallel lookups machinery, part 2 · 84e710da
      Committed by Al Viro
      We'll need to verify that there is neither a hashed nor an
      in-lookup dentry with the desired parent/name before adding to
      the in-lookup set.
      
      One possible solution would be to hold the parent's ->d_lock through
      both checks, but while the in-lookup set is relatively small at any
      time, dcache is not.  And holding the parent's ->d_lock through
      something like __d_lookup_rcu() would suck too badly.
      
      So we leave the parent's ->d_lock alone, which means that we need
      to watch out for the following scenario:
      	* we verify that there's no hashed match
      	* existing in-lookup match gets hashed by another process
      	* we verify that there's no in-lookup matches and decide
      that everything's fine.
      
      Solution: per-directory kinda-sorta seqlock, bumped around the times
      we hash something that used to be in-lookup or move (and hash)
      something in place of in-lookup.  Then the above would turn into
      	* read the counter
      	* do dcache lookup
      	* if no matches found, check for in-lookup matches
      	* if there had been none of those either, check if the
      counter has changed; repeat if it has.
      
      The "kinda-sorta" part is due to the fact that we don't have much spare
      space in inode.  There is a spare word (shared with i_bdev/i_cdev/i_pipe),
      so the counter part is not a problem, but spinlock is a different story.
      
      We could use the parent's ->d_lock, which would be less painful
      in terms of contention, but it would be rather inconvenient for
      __d_add() to grab; we could do that (using lock_parent()), but...
      
      Fortunately, we can get serialization on the counter itself, and it
      might be a good idea in general; we can use cmpxchg() in a loop to
      get from even to odd and smp_store_release() from odd to even.
      
      This commit adds the counter and the updating logic; the readers
      will be added in the next commit.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
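
      The writer side of that protocol, roughly as introduced (the
      counter is i_dir_seq in mainline, sharing space with
      i_bdev/i_cdev/i_pipe):

        static inline unsigned start_dir_add(struct inode *dir)
        {
                for (;;) {
                        unsigned n = dir->i_dir_seq;
                        /* even -> odd: the cmpxchg() is the "lock" */
                        if (!(n & 1) && cmpxchg(&dir->i_dir_seq, n, n + 1) == n)
                                return n;
                        cpu_relax();
                }
        }

        static inline void end_dir_add(struct inode *dir, unsigned n)
        {
                /* odd -> even: release, so readers observe the hashing
                 * before they see the counter change */
                smp_store_release(&dir->i_dir_seq, n + 2);
        }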
    • beginning of transition to parallel lookups - marking in-lookup dentries · 85c7f810
      Committed by Al Viro
      Dentries are marked as such when a (would-be) parallel lookup is
      about to pass them to the actual ->lookup(); they are unmarked when
      	* __d_add() is about to make the dentry hashed, positive or not,
      	* __d_move() (from d_splice_alias(), directly or via
      __d_unalias()) puts a preexisting dentry in its place, or
      	* the caller of ->lookup() unmarks them, if they have escaped
      all of the above.
      We bug (WARN_ON, actually) if a dentry reaches the final dput()
      or d_instantiate() while still marked as such.
      
      As a result, we are guaranteed that for as long as the flag is
      set, the dentry will
      	* remain negative unhashed with positive refcount
      	* never have its ->d_alias looked at
      	* never have its ->d_lru looked at
      	* never have its ->d_parent and ->d_name changed
      
      Right now we have at most one such dentry for any given parent
      directory.  With parallel lookups that restriction will weaken to:
      	* only exist when parent is locked shared
      	* at most one with given (parent,name) pair (comparison of
      names is according to ->d_compare())
      	* only exist when there's no hashed dentry with the same
      (parent,name)
      
      Transition will take the next several commits; unfortunately, we'll
      only be able to switch to rwsem at the end of this series.  The
      reason for not making it a single patch is to simplify review.
      
      New primitives: d_in_lookup() (a predicate checking whether a
      dentry is in the in-lookup state) and d_lookup_done() (tells the
      system that we are done with the lookup and that, if the dentry
      is still marked as in-lookup, it should cease to be such).
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
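
      The two primitives, roughly as introduced (DCACHE_PAR_LOOKUP is
      the in-lookup flag; __d_lookup_done() clears it and, once the
      later commits land, also wakes any waiters):

        static inline bool d_in_lookup(struct dentry *dentry)
        {
                return dentry->d_flags & DCACHE_PAR_LOOKUP;
        }

        static inline void d_lookup_done(struct dentry *dentry)
        {
                if (unlikely(d_in_lookup(dentry))) {
                        spin_lock(&dentry->d_lock);
                        __d_lookup_done(dentry);
                        spin_unlock(&dentry->d_lock);
                }
        }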
  3. 11 Apr 2016 (3 commits)
  4. 09 Apr 2016 (1 commit)
  5. 08 Apr 2016 (1 commit)
    • GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU · a0ca153f
      Committed by Alexander Duyck
      This patch fixes an issue I found in which we were dropping frames if we
      had enabled checksums on GRE headers that were encapsulated by either FOU
      or GUE.  Without this patch I was barely able to get 1 Gb/s of throughput.
      With this patch applied I am now at least getting around 6 Gb/s.
      
      The issue is due to the fact that with FOU or GUE applied we do not provide
      a transport offset pointing to the GRE header, nor do we offload it in
      software as the GRE header is completely skipped by GSO and treated like a
      VXLAN or GENEVE type header.  As such we need to prevent the stack from
      generating it and also prevent GRE from generating it via any interface we
      create.
      
      Fixes: c3483384 ("gro: Allow tunnel stacking in the case of FOU/GUE")
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
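
      A hedged sketch of the guard this reasoning implies (the helper
      gre_csum_offload_ok is hypothetical, not the literal patch): when
      the checksum start already lies before the current headers, an
      outer FOU/GUE encapsulation owns the offload, so GRE must not
      request its own:

        #include <linux/skbuff.h>

        /* hypothetical helper, for illustration only */
        static bool gre_csum_offload_ok(const struct sk_buff *skb, bool csum)
        {
                /* csum_start before skb->data: the outer encapsulation
                 * has already claimed the one checksum offload */
                return !(csum && skb_checksum_start(skb) < skb->data);
        }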
  6. 07 Apr 2016 (2 commits)
  7. 06 Apr 2016 (1 commit)
  8. 05 Apr 2016 (8 commits)
  9. 04 Apr 2016 (1 commit)
  10. 02 Apr 2016 (5 commits)
    • mm/page_isolation: fix tracepoint to mirror check function behavior · bbe3de25
      Committed by Lucas Stach
      Page isolation has not failed if the final pfn extends beyond the
      end pfn, and test_pages_isolated() checks this correctly.  Fix
      the tracepoint to report the same result as the actual check
      function.
      Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/huge_mm.h: return NULL instead of false for pmd_trans_huge_lock() · 969e8d7e
      Committed by Chen Gang
      The return value of pmd_trans_huge_lock() is a pointer, not a boolean
      value, so use NULL instead of false as the return value.
      Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
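
      The corrected inline, roughly the shape of the fix (assuming, as
      the subject suggests, the stub for builds without transparent
      hugepages was the offender):

        static inline spinlock_t *pmd_trans_huge_lock(pmd_t *pmd,
                                                      struct vm_area_struct *vma)
        {
                return NULL;    /* was: return false; */
        }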
    • stmmac: fix MDIO settings · a7657f12
      Committed by Giuseppe Cavallaro
      Initially, phy_bus_name was added to manipulate the driver name,
      but more recently it was used only to manage the fixed-link case
      and to take some decisions at run-time.  So this patch uses
      is_pseudo_fixed_link() and removes the phy_bus_name variable,
      which is no longer necessary.
      
      The driver can manage the MDIO registration by using phy-handle,
      dwmac-mdio and its own parameter, e.g. snps,phy-addr.  This patch
      takes care of all these possible configurations and fixes the
      MDIO registration in case there is a real transceiver or a switch
      (which needs to be managed by using fixed-link).
      Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Reviewed-by: Andreas Färber <afaerber@suse.de>
      Tested-by: Frank Schäfer <fschaefer.oss@googlemail.com>
      Cc: Gabriel Fernandez <gabriel.fernandez@linaro.org>
      Cc: Dinh Nguyen <dinh.linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Phil Reid <preid@electromag.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "stmmac: Fix 'eth0: No PHY found' regression" · d7e944c8
      Committed by Giuseppe Cavallaro
      This reverts commit 88f8b1bb due to problems on the GeekBox and
      Banana Pi M1 boards when connected to a real transceiver instead
      of a switch via fixed-link.
      Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: Gabriel Fernandez <gabriel.fernandez@linaro.org>
      Cc: Andreas Färber <afaerber@suse.de>
      Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
      Cc: Dinh Nguyen <dinh.linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter · 5a5abb1f
      Committed by Daniel Borkmann
      Sasha Levin reported a suspicious rcu_dereference_protected() warning
      found while fuzzing with trinity that is similar to this one:
      
        [   52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
        [   52.765688] other info that might help us debug this:
        [   52.765695] rcu_scheduler_active = 1, debug_locks = 1
        [   52.765701] 1 lock held by a.out/1525:
        [   52.765704]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20
        [   52.765721] stack backtrace:
        [   52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
        [...]
        [   52.765768] Call Trace:
        [   52.765775]  [<ffffffff813e488d>] dump_stack+0x85/0xc8
        [   52.765784]  [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110
        [   52.765792]  [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90
        [   52.765801]  [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun]
        [   52.765810]  [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun]
        [   52.765818]  [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210
        [   52.765827]  [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun]
        [   52.765834]  [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690
        [   52.765843]  [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60
        [   52.765850]  [<ffffffff81261519>] SyS_ioctl+0x79/0x90
        [   52.765858]  [<ffffffff81003ba2>] do_syscall_64+0x62/0x140
        [   52.765866]  [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25
      
      The same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY)
      enabled from tun_attach_filter() when user space calls
      ioctl(tun_fd, TUN{ATTACH,DETACH}FILTER, ...) to add or remove a
      BPF filter on tap devices.
      
      Since the fix in f91ff5b9 ("net: sk_{detach|attach}_filter() rcu
      fixes"), sk_attach_filter()/sk_detach_filter() dereference the
      filter with rcu_dereference_protected(), checking whether the
      socket lock is held in the control path.
      
      Since its introduction in 99405162 ("tun: socket filter support"),
      tap filters are managed under the RTNL lock from
      __tun_chr_ioctl().  Thus the sock_owned_by_user(sk) check doesn't
      apply in this specific case and triggers the false positive.
      
      Extend the BPF API with the __sk_attach_filter()/__sk_detach_filter()
      pair, used by tap filters, and pass in lockdep_rtnl_is_held() for
      the rcu_dereference_protected() checks instead.
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
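
      A sketch of the generalized detach path, reconstructed from the
      description above (the real code has more checks; the caller
      passes the lockdep expression proving which lock protects
      sk_filter, and example_tun_detach_filter is a hypothetical
      wrapper):

        #include <linux/filter.h>
        #include <linux/rtnetlink.h>

        int __sk_detach_filter(struct sock *sk, bool locked)
        {
                struct sk_filter *filter;

                filter = rcu_dereference_protected(sk->sk_filter, locked);
                if (!filter)
                        return -ENOENT;
                RCU_INIT_POINTER(sk->sk_filter, NULL);
                sk_filter_uncharge(sk, filter);
                return 0;
        }

        /* tun's filters are managed under RTNL, not the socket lock: */
        static void example_tun_detach_filter(struct sock *sk)
        {
                __sk_detach_filter(sk, lockdep_rtnl_is_held());
        }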
  11. 31 Mar 2016 (8 commits)
  12. 30 Mar 2016 (1 commit)
  13. 29 Mar 2016 (1 commit)