1. 24 Oct, 2015 1 commit
    • sunrpc/cache: make cache flushing more reliable. · 77862036
      Neil Brown committed
      The caches used to store sunrpc authentication information can be
      flushed by writing a timestamp to a file in /proc.
      
      This timestamp has a one-second resolution and any entry in cache that
      was last_refreshed *before* that time is treated as expired.
      
      This is problematic as it is not possible to reliably flush the cache
      without interrupting NFS service.
      If the current time is written to the "flush" file, any entry that was
      added since the current second started will still be treated as valid.
      If one second beyond the current time is written to the file
      then no entries can be valid until the second ticks over.  This will
      mean that no NFS request will be handled for up to 1 second.
      
      To resolve this issue we make two changes:
      
      1/ treat an entry as expired if the timestamp when it was last_refreshed
        is before *or the same as* the expiry time.  This means that current
        code which writes out the current time will now flush the cache
        reliably.
      
      2/ when a new entry is added to the cache, set the last_refresh timestamp
        to 1 second *beyond* the current flush time, when that is not in the
        past.
        This ensures that newly added entries will always be valid.
      
      Now that we have a very reliable way to flush the cache, and also
      since we are using "since-boot" timestamps which are monotonic,
      change cache_purge() to set the smallest future flush_time which
      will work, and leave it there: don't revert to '1'.
      
      Also disable the setting of the 'flush_time' far into the future.
      That has never been useful and is now awkward as it would cause
      last_refresh times to be strange.
      Finally: if a request is made to set the 'flush_time' to the current
      second, assume the intent is to flush the cache and advance it, if
      necessary, to 1 second beyond the current 'flush_time' so that all
      active entries will be deemed to be expired.
      
      As part of this we need to add a 'cache_detail' arg to cache_init()
      and cache_fresh_locked() so they can find the current ->flush_time.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Reported-by: Olaf Kirch <okir@suse.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
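The two rules in the commit above (expire on "before or the same as", and stamp new entries one second beyond a current-or-future flush time) can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the kernel code: the struct and function names (`cache_sketch`, `entry_sketch`, `entry_is_flushed`, `entry_refresh`, `cache_flush_now`) are hypothetical, and timestamps are plain `long` seconds.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the cache and its entries. */
struct cache_sketch { long flush_time; };
struct entry_sketch { long last_refresh; };

/* Rule 1: an entry is expired if it was refreshed before
 * *or in the same second as* the flush time. */
bool entry_is_flushed(const struct cache_sketch *cd,
                      const struct entry_sketch *e)
{
    return e->last_refresh <= cd->flush_time;
}

/* Rule 2: a newly added entry is stamped one second beyond the
 * flush time when the flush time is not in the past, so that it
 * is always valid. */
void entry_refresh(const struct cache_sketch *cd,
                   struct entry_sketch *e, long now)
{
    e->last_refresh = now;
    if (now <= cd->flush_time)
        e->last_refresh = cd->flush_time + 1;
}

/* Writing the current second means "flush": advance flush_time past
 * its current value if needed so all active entries expire. */
void cache_flush_now(struct cache_sketch *cd, long now)
{
    if (cd->flush_time >= now)
        now = cd->flush_time + 1;
    cd->flush_time = now;
}
```

With these rules, flushing twice within one second still expires entries added between the two flushes, because the second flush moves `flush_time` one step further.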
  2. 13 Aug, 2015 3 commits
  3. 16 Apr, 2015 1 commit
    • lib/string_helpers.c: change semantics of string_escape_mem · 41416f23
      Rasmus Villemoes committed
      The current semantics of string_escape_mem are inadequate for one of its
      current users, vsnprintf().  If that is to honour its contract, it must
      know how much space would be needed for the entire escaped buffer, and
      string_escape_mem provides no way of obtaining that (short of allocating a
      large enough buffer (~4 times input string) to let it play with, and
      that's definitely a big no-no inside vsnprintf).
      
      So change the semantics for string_escape_mem to be more snprintf-like:
      Return the size of the output that would be generated if the destination
      buffer was big enough, but of course still only write to the part of dst
      it is allowed to, and (contrary to snprintf) don't do '\0'-termination.
      It is then up to the caller to detect whether output was truncated and to
      append a '\0' if desired.  Also, we must output partial escape sequences,
      otherwise a call such as snprintf(buf, 3, "%1pE", "\123") would cause
      printf to write a \0 to buf[2] but leaving buf[0] and buf[1] with whatever
      they previously contained.
      
      This also fixes a bug in the escaped_string() helper function, which used
      to unconditionally pass a length of "end-buf" to string_escape_mem();
      since the latter doesn't check osz for being insanely large, it would
      happily write to dst.  For example, kasprintf(GFP_KERNEL, "something and
      then %pE", ...); is an easy way to trigger an oops.
      
      In test-string_helpers.c, the -ENOMEM test is replaced with testing for
      getting the expected return value even if the buffer is too small.  We
      also ensure that nothing is written (by relying on a NULL pointer deref)
      if the output size is 0 by passing NULL - this has to work for
      kasprintf("%pE") to work.
      
      In net/sunrpc/cache.c, I think qword_add still has the same semantics.
      Someone should definitely double-check this.
      
      In fs/proc/array.c, I made the minimum possible change, but longer-term it
      should stop poking around in seq_file internals.
      
      [andriy.shevchenko@linux.intel.com: simplify qword_add]
      [andriy.shevchenko@linux.intel.com: add missed curly braces]
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
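The snprintf-like contract described in the commit above (return the full would-be length, write only what fits, emit partial escape sequences, never NUL-terminate) can be illustrated with a small sketch. `escape_octal` is a hypothetical helper written for this example; it is not the kernel's `string_escape_mem()` and supports only octal escapes.

```c
#include <stddef.h>
#include <ctype.h>

/* Hypothetical helper: escape non-printable bytes (and backslash) as
 * \ooo. Returns the size the full output would need; writes at most
 * osz bytes to dst and does not append a '\0'. */
size_t escape_octal(const char *src, size_t isz, char *dst, size_t osz)
{
    size_t out = 0;

    for (size_t i = 0; i < isz; i++) {
        unsigned char c = (unsigned char)src[i];
        char seq[4];
        size_t n;

        if (isprint(c) && c != '\\') {
            seq[0] = (char)c;
            n = 1;
        } else {
            seq[0] = '\\';
            seq[1] = (char)('0' + ((c >> 6) & 7));
            seq[2] = (char)('0' + ((c >> 3) & 7));
            seq[3] = (char)('0' + (c & 7));
            n = 4;
        }
        /* Keep counting past the end of dst, but only store what
         * fits; a truncated escape sequence is written partially. */
        for (size_t k = 0; k < n; k++, out++) {
            if (out < osz)
                dst[out] = seq[k];
        }
    }
    return out;
}
```

A caller detects truncation by comparing the return value with the buffer size. Passing `NULL` with a size of 0 only measures the output, since `dst` is never dereferenced in that case.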
  4. 09 Mar, 2015 1 commit
  5. 10 Dec, 2014 1 commit
  6. 09 May, 2014 1 commit
  7. 15 Jan, 2014 1 commit
  8. 13 Dec, 2013 1 commit
  9. 14 Jul, 2013 1 commit
  10. 02 Jul, 2013 5 commits
    • sunrpc: Don't schedule an upcall on a replaced cache entry. · 0bebc633
      NeilBrown committed
      When a cache entry is replaced, the "expiry_time" gets set to
      zero by a call to "cache_fresh_locked(..., 0)" at the end of
      "sunrpc_cache_update".
      
      This low expiry time makes cache_check() think that the 'refresh_age'
      is negative, so the 'age' is comparatively large and a refresh is
      triggered.
      However, refreshing a replaced entry is pointless; it cannot achieve
      anything useful.
      
      So teach cache_check to ignore a low refresh_age when expiry_time
      is zero.
      Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • net/sunrpc: xpt_auth_cache should be ignored when expired. · 7715cde8
      NeilBrown committed
      commit d202cce8
          sunrpc: never return expired entries in sunrpc_cache_lookup
      
      moved the 'entry is expired' test from cache_check to
      sunrpc_cache_lookup, so that it happened early and some races could
      safely be ignored.
      
      However the ip_map (in svcauth_unix.c) has a separate single-item
      cache which allows quick lookup without locking.  An entry in this
      cache would not be subject to the expiry test and so could be used
      well after it has expired.
      
      This is not normally a big problem because the first time it is used
      after it is expired an up-call will be scheduled to refresh the entry
      (if it hasn't been scheduled already) and the old entry will then
      be invalidated.  So on the second attempt to use it after it has
      expired, ip_map_cached_get will discard it.
      
      However that is subtle and not ideal, so replace the "!cache_valid"
      test with "cache_is_expired".
      In doing this we drop the test on the "CACHE_VALID" bit.  This is
      unnecessary as the bit is never cleared, and an entry will only
      be cached if the bit is set.
      Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • sunrpc/cache: ensure items removed from cache do not have pending upcalls. · 013920eb
      NeilBrown committed
      It is possible for a race to set CACHE_PENDING after cache_clean()
      has removed a cache entry from the cache.
      If CACHE_PENDING is still set when the entry is finally 'put',
      the cache_dequeue() will never happen and we can leak memory.
      
      So set a new flag 'CACHE_CLEANED' when we remove something from
      the cache, and don't queue any upcall if it is set.
      
      If CACHE_PENDING is set before CACHE_CLEANED, the call that
      cache_clean() makes to cache_fresh_unlocked() will free memory
      as needed.  If CACHE_PENDING is set after CACHE_CLEANED, the
      test in sunrpc_cache_pipe_upcall will ensure that the memory
      is not allocated.
      
      Reported-by: <bstroesser@ts.fujitsu.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • sunrpc/cache: use cache_fresh_unlocked consistently and correctly. · 2a1c7f53
      NeilBrown committed
      cache_fresh_unlocked() is called when a cache entry
      has been updated and ensures that if there were any
      pending upcalls, they are cleared.
      
      So every time we update a cache entry, we should call this,
      and this should be the only way that we try to clear
      pending calls (that sort of uniformity makes code sooo much
      easier to read).
      
      try_to_negate_entry() will (possibly) mark an entry as
      negative.  If it doesn't, it is because the entry already
      is VALID.
      So the entry will be valid on exit, so it is appropriate to
      call cache_fresh_unlocked().
      So tidy up try_to_negate_entry() to do that, and remove
      partial open-coded cache_fresh_unlocked() from the one
      call-site of try_to_negate_entry().
      
      In the other branch of the 'switch(cache_make_upcall())',
      we again have a partial open-coded version of cache_fresh_unlocked().
      Replace that with a real call.
      
      And again in cache_clean(), use a real call to cache_fresh_unlocked().
      
      These call sites might previously have called
      cache_revisit_request() if CACHE_PENDING wasn't set.
      This is never necessary because cache_revisit_request() can
      only do anything if the item is in the cache_defer_hash.
      However, any time that an item is added to the cache_defer_hash
      (setup_deferral), the code immediately tests CACHE_PENDING,
      and removes the entry again if it is clear.  So in all other
      places we only need to 'cache_revisit_request' if we've
      just cleared CACHE_PENDING.
      Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • sunrpc/cache: remove races with queuing an upcall. · f9e1aedc
      NeilBrown committed
      We currently queue an upcall after setting CACHE_PENDING,
      and dequeue after clearing CACHE_PENDING.
      So a request should only be present when CACHE_PENDING is set.
      
      However we don't combine the test and the enqueue/dequeue in
      a protected region, so it is possible (if unlikely) for a race
      to result in a request being queued without CACHE_PENDING set,
      or a request to be absent despite CACHE_PENDING.
      
      So: include a test for CACHE_PENDING inside the regions of
      enqueue and dequeue where queue_lock is held, and abort
      the operation if the value is not as expected.
      
      Also remove the early 'return' from cache_dequeue() to ensure that it
      always removes all entries: As there is no locking between setting
      CACHE_PENDING and calling sunrpc_cache_pipe_upcall it is not
      inconceivable for some other thread to clear CACHE_PENDING and then
      someone else to set it and call sunrpc_cache_pipe_upcall, both before
      the original threads completed the call.
      
      With this, it is perfectly safe and correct to:
       - call cache_dequeue() if and only if we have just
         cleared CACHE_PENDING
       - call sunrpc_cache_pipe_upcall() (via cache_make_upcall)
         if and only if we have just set CACHE_PENDING.
      Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
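The "if and only if we have just set / just cleared CACHE_PENDING" rule in the commit above relies on an atomic test-and-modify whose return value says whether this caller made the transition. The following is a user-space sketch using C11 atomics as a stand-in for the kernel's bit operations; the bit value and the helper names `set_pending` and `clear_pending` are assumptions made for illustration, and the queue_lock-protected re-check is not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit position; not taken from the kernel headers. */
#define PENDING_BIT (1u << 1)

/* Returns true only for the caller that changed the bit from 0 to 1.
 * Only that caller should go on to queue the upcall. */
bool set_pending(atomic_uint *flags)
{
    return !(atomic_fetch_or(flags, PENDING_BIT) & PENDING_BIT);
}

/* Returns true only for the caller that changed the bit from 1 to 0.
 * Only that caller should go on to dequeue the request. */
bool clear_pending(atomic_uint *flags)
{
    return (atomic_fetch_and(flags, ~PENDING_BIT) & PENDING_BIT) != 0;
}
```

Because exactly one caller observes each transition, the enqueue and dequeue calls pair up even when several threads race on the same entry.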
  11. 21 May, 2013 1 commit
  12. 30 Apr, 2013 1 commit
  13. 10 Apr, 2013 1 commit
    • procfs: new helper - PDE_DATA(inode) · d9dda78b
      Al Viro committed
      The only part of proc_dir_entry the code outside of fs/proc
      really cares about is PDE(inode)->data.  Provide a helper
      for that; static inline for now, eventually will be moved
      to fs/proc, along with the knowledge of struct proc_dir_entry
      layout.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  14. 04 Apr, 2013 1 commit
  15. 28 Feb, 2013 1 commit
    • hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin committed
      I'm not sure why, but the hlist for each entry iterators were conceived
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
       were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
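To show why the extra node cursor is unnecessary, here is a simplified, self-contained sketch of a three-argument hlist iterator. It is an illustration only: the list node omits the kernel's `pprev` field, the macro names (`entry_of`, `entry_or_null`, `hlist_for_each_entry_sketch`) are hypothetical, and `__typeof__` is a GCC/Clang extension.

```c
#include <stddef.h>

/* Simplified singly linked hash-list node and head. */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Recover the containing object from a pointer to its node member. */
#define entry_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Same, but maps a NULL node to a NULL entry so the loop can stop. */
#define entry_or_null(ptr, type, member) \
    ((ptr) ? entry_of(ptr, type, member) : NULL)

/* Three-argument form: the entry pointer itself is the cursor, so no
 * separate 'struct hlist_node *' variable is needed. */
#define hlist_for_each_entry_sketch(pos, head, member)                    \
    for (pos = entry_or_null((head)->first, __typeof__(*(pos)), member);  \
         pos;                                                             \
         pos = entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

/* Example payload type and a loop that uses the iterator. */
struct item {
    int value;
    struct hlist_node node;
};

int sum_items(struct hlist_head *head)
{
    struct item *it;
    int sum = 0;

    hlist_for_each_entry_sketch(it, head, node)
        sum += it->value;
    return sum;
}
```

The call site now has the same shape as `list_for_each_entry(pos, head, member)`, which is the uniformity the commit message asks for.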
  16. 23 Feb, 2013 1 commit
  17. 15 Feb, 2013 3 commits
  18. 24 Jan, 2013 1 commit
  19. 05 Nov, 2012 1 commit
  20. 18 Oct, 2012 1 commit
    • SUNRPC: Prevent kernel stack corruption on long values of flush · 212ba906
      Sasha Levin committed
      The buffer size in read_flush() is too small for the longest possible values
      for it. This can lead to a kernel stack corruption:
      
      [   43.047329] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff833e64b4
      [   43.047329]
      [   43.049030] Pid: 6015, comm: trinity-child18 Tainted: G        W    3.5.0-rc7-next-20120716-sasha #221
      [   43.050038] Call Trace:
      [   43.050435]  [<ffffffff836c60c2>] panic+0xcd/0x1f4
      [   43.050931]  [<ffffffff833e64b4>] ? read_flush.isra.7+0xe4/0x100
      [   43.051602]  [<ffffffff810e94e6>] __stack_chk_fail+0x16/0x20
      [   43.052206]  [<ffffffff833e64b4>] read_flush.isra.7+0xe4/0x100
      [   43.052951]  [<ffffffff833e6500>] ? read_flush_pipefs+0x30/0x30
      [   43.053594]  [<ffffffff833e652c>] read_flush_procfs+0x2c/0x30
      [   43.053596]  [<ffffffff812b9a8c>] proc_reg_read+0x9c/0xd0
      [   43.053596]  [<ffffffff812b99f0>] ? proc_reg_write+0xd0/0xd0
      [   43.053596]  [<ffffffff81250d5b>] do_loop_readv_writev+0x4b/0x90
      [   43.053596]  [<ffffffff81250fd6>] do_readv_writev+0xf6/0x1d0
      [   43.053596]  [<ffffffff812510ee>] vfs_readv+0x3e/0x60
      [   43.053596]  [<ffffffff812511b8>] sys_readv+0x48/0xb0
      [   43.053596]  [<ffffffff8378167d>] system_call_fastpath+0x1a/0x1f
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
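The class of bug fixed above, a stack buffer too short for the longest decimal rendering of a value, can be avoided by deriving the buffer size from the type and using a bounded formatter. The sketch below is a generic user-space illustration, not the actual patch; the macro `ULONG_DEC_BUFSZ` and the function `format_flush` are hypothetical names.

```c
#include <stdio.h>
#include <limits.h>
#include <stddef.h>

/* Upper bound on decimal digits of an unsigned long (bits * 3 / 10 + 1
 * covers the common 16/32/64-bit widths), plus '\n' and '\0'. */
#define ULONG_DEC_BUFSZ (sizeof(unsigned long) * CHAR_BIT * 3 / 10 + 1 + 2)

/* Formats the value followed by a newline. snprintf never writes past
 * 'size' bytes; a return value >= size would indicate truncation. */
int format_flush(unsigned long value, char *buf, size_t size)
{
    return snprintf(buf, size, "%lu\n", value);
}
```

A caller would declare `char buf[ULONG_DEC_BUFSZ];` and treat a return value of `sizeof buf` or more as an error rather than trusting the contents.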
  21. 22 Aug, 2012 1 commit
  22. 12 Jul, 2012 1 commit
    • SUNRPC/cache: fix reporting of expired cache entries in 'content' file. · 200724a7
      NeilBrown committed
      Entries that are in a sunrpc cache but are not valid should be reported
      with a leading '#' so they look like a comment.
      Commit  d202cce8 (sunrpc: never return expired entries in sunrpc_cache_lookup)
      broke this for expired entries.
      
      This particularly applies to entries that have been replaced by newer entries.
      sunrpc_cache_update sets the expiry of the replaced entry to '0', but it
      remains in the cache until the next 'cache_clean'.
      The result is that if you
      
        echo 0 2000000000 1 0 > /proc/net/rpc/auth.unix.gid/channel
      
      several times, then
      
        cat /proc/net/rpc/auth.unix.gid/content
      
      It will display multiple entries for the one uid, which is at least confusing:
      
        #uid cnt: gids...
        0 1: 0
        0 1: 0
        0 1: 0
      
      With this patch, expired entries are marked as comments so you get
      
        #uid cnt: gids...
        0 1: 0
        # 0 1: 0
        # 0 1: 0
      
      These expired entries will never be seen by cache_check() as they are always
      *after* a non-expired entry with the same key - so the extra check is only
      needed in c_show().
      Signed-off-by: NeilBrown <neilb@suse.de>
      
      --
      It's not a big problem, but it had me confused for a while, so it could
      well confuse others.
      Thanks,
      NeilBrown
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
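The output convention described in this commit, where expired entries are printed with a leading '#' so they read as comments, can be sketched as follows. `show_entry` is a hypothetical formatter written for this example that mimics the `uid cnt: gids...` lines shown above for a single gid; it is not the kernel's c_show().

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical formatter: one uid with exactly one gid. Expired
 * entries get a "# " prefix so that readers treat them as comments. */
int show_entry(char *out, size_t size, unsigned uid, unsigned gid,
               bool expired)
{
    return snprintf(out, size, "%s%u 1: %u\n",
                    expired ? "# " : "", uid, gid);
}
```

Calling it with `expired` set to false and then true for uid 0, gid 0 yields the two line shapes from the commit message: `0 1: 0` and `# 0 1: 0`.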
  23. 16 Apr, 2012 1 commit
  24. 04 Feb, 2012 1 commit
  25. 01 Feb, 2012 3 commits
  26. 04 Jan, 2012 1 commit
  27. 08 Dec, 2011 1 commit
  28. 05 Jan, 2011 3 commits
    • svcrpc: ensure cache_check caller sees updated entry · fdef7aa5
      J. Bruce Fields committed
      Suppose cache_check runs simultaneously with an update on a different
      CPU:
      
      	cache_check			task doing update
      	^^^^^^^^^^^			^^^^^^^^^^^^^^^^^
      
      	1. test for CACHE_VALID		1'. set entry->data
      	   & !CACHE_NEGATIVE
      
      	2. use entry->data		2'. set CACHE_VALID
      
      If the two memory writes performed in step 1' and 2' appear misordered
      with respect to the reads in step 1 and 2, then the caller could get
      stale data at step 2 even though it saw CACHE_VALID set on the cache
      entry.
      
      Add memory barriers to prevent this.
      Reviewed-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
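The ordering problem in the diagram above is the classic publish/consume pattern: the writer must make the data visible before the flag, and the reader must check the flag before reading the data. The sketch below expresses that with C11 release/acquire atomics as a user-space analogue; it does not use the kernel's barrier primitives, and `struct pub_entry`, `VALID_BIT`, `entry_publish` and `entry_read` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CACHE_VALID flag. */
#define VALID_BIT 1u

struct pub_entry {
    int data;
    atomic_uint flags;
};

/* Writer (steps 1' and 2'): store the data, then set the flag with
 * release ordering so the data store cannot be reordered after it. */
void entry_publish(struct pub_entry *e, int value)
{
    e->data = value;
    atomic_fetch_or_explicit(&e->flags, VALID_BIT, memory_order_release);
}

/* Reader (steps 1 and 2): load the flag with acquire ordering, and
 * only read the data if the flag was seen set. */
bool entry_read(struct pub_entry *e, int *out)
{
    if (!(atomic_load_explicit(&e->flags, memory_order_acquire) & VALID_BIT))
        return false;
    *out = e->data;
    return true;
}
```

A reader that sees the flag set is then guaranteed to see the data written before it, which is the property the added memory barriers are meant to provide.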
    • svcrpc: take lock on turning entry NEGATIVE in cache_check · 6bab93f8
      J. Bruce Fields committed
      We attempt to turn a cache entry negative in place.  But that entry may
      already have been filled in by some other task since we last checked
      whether it was valid, so we could be modifying an already-valid entry.
      If nothing else there's a likely leak in such a case when the entry is
      eventually put() and contents are not freed because it has
      CACHE_NEGATIVE set.
      
      So, take the cache_lock just as sunrpc_cache_update() does.
      Reviewed-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • svcrpc: avoid double reply caused by deferral race · d76d1815
      J. Bruce Fields committed
      Commit d29068c4 "sunrpc: Simplify cache_defer_req and related
      functions." asserted that cache_check() could determine success or
      failure of cache_defer_req() by checking the CACHE_PENDING bit.
      
      This isn't quite right.
      
      We need to know whether cache_defer_req() created a deferred request,
      in which case sending an rpc reply has become the responsibility of the
      deferred request, and it is important that we not send our own reply,
      resulting in two different replies to the same request.
      
      And the CACHE_PENDING bit doesn't tell us that; we could have
      successfully created a deferred request at the same time as another
      thread cleared the CACHE_PENDING bit.
      
      So, partially revert that commit, to ensure that cache_check() returns
      -EAGAIN if and only if a deferred request has been created.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: NeilBrown <neilb@suse.de>