1. 05 Jan, 2011 3 commits
    • svcrpc: ensure cache_check caller sees updated entry · fdef7aa5
      J. Bruce Fields committed
      Suppose cache_check runs simultaneously with an update on a different
      CPU:
      
      	cache_check			task doing update
      	^^^^^^^^^^^			^^^^^^^^^^^^^^^^^
      
      	1. test for CACHE_VALID		1'. set entry->data
      	   & !CACHE_NEGATIVE
      
      	2. use entry->data		2'. set CACHE_VALID
      
      If the two memory writes performed in step 1' and 2' appear misordered
      with respect to the reads in step 1 and 2, then the caller could get
      stale data at step 2 even though it saw CACHE_VALID set on the cache
      entry.
      
      Add memory barriers to prevent this.
      Reviewed-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
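The ordering problem described in commit fdef7aa5 can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: `struct pub_entry`, `model_publish()` and `model_read()` are hypothetical names, and the release/acquire pair stands in for the kernel memory barriers the commit adds; it is not the kernel code itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache entry; not the kernel structure. */
struct pub_entry {
    int data;           /* plays the role of entry->data */
    atomic_bool valid;  /* plays the role of the CACHE_VALID bit */
};

/* Updater side: step 1' (write the data), then step 2' (set the flag).
 * The release store keeps the data write from being reordered after
 * the flag write. */
static void model_publish(struct pub_entry *e, int value)
{
    assert(e != NULL);
    e->data = value;
    atomic_store_explicit(&e->valid, true, memory_order_release);
}

/* Reader side: step 1 (test the flag), then step 2 (use the data).
 * The acquire load keeps the data read from being reordered before
 * the flag read. */
static bool model_read(struct pub_entry *e, int *out)
{
    assert(e != NULL && out != NULL);
    if (!atomic_load_explicit(&e->valid, memory_order_acquire))
        return false;
    *out = e->data;
    return true;
}
```

A reader that sees the flag set through `model_read()` is then guaranteed to also see the data written before the flag was set, which is the property the commit message asks for.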
    • svcrpc: take lock on turning entry NEGATIVE in cache_check · 6bab93f8
      J. Bruce Fields committed
      We attempt to turn a cache entry negative in place.  But that entry may
      already have been filled in by some other task since we last checked
      whether it was valid, so we could be modifying an already-valid entry.
      If nothing else there's a likely leak in such a case when the entry is
      eventually put() and contents are not freed because it has
      CACHE_NEGATIVE set.
      
      So, take the cache_lock just as sunrpc_cache_update() does.
      Reviewed-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • svcrpc: avoid double reply caused by deferral race · d76d1815
      J. Bruce Fields committed
      Commit d29068c4 "sunrpc: Simplify cache_defer_req and related
      functions." asserted that cache_check() could determine success or
      failure of cache_defer_req() by checking the CACHE_PENDING bit.
      
      This isn't quite right.
      
      We need to know whether cache_defer_req() created a deferred request,
      in which case sending an rpc reply has become the responsibility of the
      deferred request, and it is important that we not send our own reply,
      resulting in two different replies to the same request.
      
      And the CACHE_PENDING bit doesn't tell us that; we could have
      successfully created a deferred request at the same time as another
      thread cleared the CACHE_PENDING bit.
      
      So, partially revert that commit, to ensure that cache_check() returns
      -EAGAIN if and only if a deferred request has been created.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: NeilBrown <neilb@suse.de>
  2. 19 Oct, 2010 1 commit
  3. 15 Oct, 2010 1 commit
    • llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann committed
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
  4. 12 Oct, 2010 2 commits
    • sunrpc/cache: centralise handling of size limit on deferred list. · e33534d5
      NeilBrown committed
      We limit the number of 'defer' requests to DFR_MAX.
      
      The imposition of this limit is spread about a bit - sometimes we don't
      add new things to the list, sometimes we remove old things.
      
      Also it is currently applied to requests which we are 'waiting' for
      rather than 'deferring'.  This doesn't seem ideal as 'waiting'
      requests are naturally limited by the number of threads.
      
      So gather the DFR_MAX handling code to one place and only apply it to
      requests that are actually being deferred.
      
      This means that not all 'cache_deferred_req' structures go on the
      'cache_defer_list', so we need to be careful when adding and removing
      things.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • sunrpc: Simplify cache_defer_req and related functions. · d29068c4
      NeilBrown committed
      The return value from cache_defer_req is somewhat confusing.
      Various different error codes are returned, but the single caller is
      only interested in success or failure.
      
      In fact it can measure this success or failure itself by checking
      CACHE_PENDING, which makes the point of the code more explicit.
      
      So change cache_defer_req to return 'void' and test CACHE_PENDING
      after it completes, to see if the request was actually deferred or
      not.
      
      Similarly setup_deferral and cache_wait_req don't need a return value,
      so make them void and remove some code.
      
      The call to cache_revisit_request (to guard against a race) is only
      needed for the second call to setup_deferral, so move it out of
      setup_deferral to after that second call.  With the first call the
      race is handled differently (by explicitly calling
      'wait_for_completion').
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  5. 02 Oct, 2010 1 commit
    • sunrpc: fix race in new cache_wait code. · 277f68db
      NeilBrown committed
      If we set up to wait for a cache item to be filled in, and then find
      that it is no longer pending, it could be that some other thread is
      in 'cache_revisit_request' and has moved our request to its 'pending' list.
      So when our setup_deferral calls cache_revisit_request it will find nothing to
      put on the pending list, and do nothing.
      
      We then return from cache_wait_req, thus leaving the 'sleeper'
      on-stack structure open to being corrupted by subsequent stack usage.
      
      However that 'sleeper' could still be on the 'pending' list that the
      other thread is looking at and so any corruption could cause it to behave badly.
      
      To avoid this race we simply take the same path as if the
      'wait_for_completion_interruptible_timeout' was interrupted and if the
      sleeper is no longer on the list (which it won't be) we wait on the
      completion - which will ensure that any other cache_revisit_request
      will have let go of the sleeper.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  6. 27 Sep, 2010 2 commits
  7. 23 Sep, 2010 1 commit
  8. 22 Sep, 2010 2 commits
  9. 20 Sep, 2010 1 commit
    • nfsd4: fix hang on fast-booting nfs servers · 06497524
      J. Bruce Fields committed
      The last_close field of a cache_detail is initialized to zero, so the
      condition
      
      	detail->last_close < seconds_since_boot() - 30
      
      may be false even for a cache that was never opened.
      
      However, we want to immediately fail upcalls to caches that were never
      opened: in the case of the auth_unix_gid cache, especially, which may
      never be opened by mountd (if the --manage-gids option is not set), we
      want to fail the upcall immediately.  Otherwise client requests will be
      dropped unnecessarily on reboot.
      
      Also document these conditions.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
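The condition discussed in commit 06497524 can be checked with a small user-space model. The sketch below rests on assumptions: `struct detail_model`, `recently_closed()` and `listeners_may_exist()` are hypothetical names, and the explicit zero check is one plausible way to express "fail immediately for caches that were never opened"; the commit message does not show the actual change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the relevant cache_detail fields. */
struct detail_model {
    int  readers;     /* number of current readers of the cache channel */
    long last_close;  /* seconds since boot at last close; 0 = never opened */
};

/* The timing test alone: true unless the last close is over 30 seconds old.
 * This negates the condition quoted in the commit message. */
static bool recently_closed(const struct detail_model *d, long now)
{
    assert(d != NULL);
    return !(d->last_close < now - 30);
}

/* Assumed fix: treat last_close == 0 as "never opened" and give up at once. */
static bool listeners_may_exist(const struct detail_model *d, long now)
{
    assert(d != NULL);
    if (d->readers > 0)
        return true;
    if (d->last_close == 0)
        return false;
    return recently_closed(d, now);
}
```

With `last_close` at 0 and only 10 seconds of uptime, `now - 30` is negative, so the timing test alone still reports a possible listener; the zero check is what lets the upcall fail immediately on a fast-booting server.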
  10. 08 Sep, 2010 4 commits
    • svcrpc: cache deferral cleanup · 3211af11
      J. Bruce Fields committed
      Attempt to make obvious the first-try-sleeping-then-try-deferral logic
      by putting that logic into a top-level function that calls helpers.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • svcrpc: minor cache cleanup · 6610f720
      J. Bruce Fields committed
      Pull out some code into helper functions, fix a typo.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • sunrpc/cache: allow threads to block while waiting for cache update. · f16b6e8d
      NeilBrown committed
      The current practice of waiting for cache updates by queueing the
      whole request to be retried has (at least) two problems.
      
      1/ With NFSv4, requests can be quite complex and re-trying a whole
        request when a later part fails should only be a last-resort, not a
        normal practice.
      
      2/ Large requests, and in particular any 'write' request, will not be
        queued by the current code and doing so would be undesirable.
      
      In many cases only a very short wait is needed before the cache gets
      valid data.
      
      So, providing the underlying transport permits it by setting
      ->thread_wait, arrange to wait briefly for an upcall to be completed
      (as reflected in the clearing of CACHE_PENDING). If the short wait was
      not long enough and CACHE_PENDING is still set, fall back on the old
      approach.
      
      The 'thread_wait' value is set to 5 seconds when there are spare
      threads, and 1 second when there are no spare threads.
      
      These values are probably much higher than needed, but will ensure
      some forward progress.
      
      Note that as we only request an update for a non-valid item, and as
      non-valid items are updated in place it is extremely unlikely that
      cache_check will return -ETIMEDOUT.  Normally cache_defer_req will
      sleep for a short while and then find that the item is_valid.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • sunrpc: use seconds since boot in expiry cache · c5b29f88
      NeilBrown committed
      This protects us from confusion when the wallclock time changes.
      
      We convert to and from wallclock when setting or reading expiry
      times.
      
      Also use seconds since boot for last_close time.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
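The idea in commit c5b29f88 (store expiry times relative to boot, convert only at the boundary) can be illustrated with a few lines of arithmetic. The helper names and the `boot_wall` parameter (the wallclock time corresponding to boot, in seconds) are assumptions made for this sketch, not the kernel's functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Convert a wallclock expiry (e.g. one supplied by userspace) into
 * seconds since boot, which is what the cache would store. */
static long model_to_boot_seconds(long wall, long boot_wall)
{
    return wall - boot_wall;
}

/* Convert a stored boot-relative time back to wallclock for reporting. */
static long model_to_wallclock(long since_boot, long boot_wall)
{
    return since_boot + boot_wall;
}

/* Expiry is decided purely on boot-relative values, so a wallclock
 * jump cannot change the outcome. */
static bool model_expired(long expiry_since_boot, long now_since_boot)
{
    assert(now_since_boot >= 0);
    return expiry_since_boot < now_since_boot;
}
```

If the administrator moves the wallclock forward by an hour, `boot_wall` shifts by the same hour, the stored boot-relative expiry is untouched, and only the value reported back through `model_to_wallclock()` changes.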
  11. 07 Aug, 2010 1 commit
  12. 07 Jul, 2010 1 commit
    • sunrpc: make the cache cleaner workqueue deferrable · 8eab945c
      Artem Bityutskiy committed
      This patch makes the cache_cleaner workqueue deferrable, to prevent
      unnecessary system wake-ups, which is very important for embedded
      battery-powered devices.
      
      do_cache_clean() is called every 30 seconds at the moment, and often
      makes the system wake up from its power-save sleep state. With this
      change, when the workqueue uses a deferrable timer, the
      do_cache_clean() invocation will be delayed and combined with the
      closest "real" wake-up. This improves the power consumption situation.
      
      Note, I tried to create a DECLARE_DELAYED_WORK_DEFERRABLE() helper
      macro, similar to DECLARE_DELAYED_WORK(), but failed because of the
      way the timer wheel core stores the deferrable flag (it is the
      least significant bit of the timer->base pointer). My attempt to define
      a static variable with this bit set ended up with the "initializer
      element is not constant" error.
      
      Thus, I have to use run-time initialization, so I created a new
      cache_initialize() function which is called once when sunrpc is
      being initialized.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
  13. 06 Jul, 2010 1 commit
  14. 22 May, 2010 1 commit
  15. 17 May, 2010 2 commits
    • sunrpc: Include missing smp_lock.h · 99df95a2
      Frederic Weisbecker committed
      Now that cache_ioctl_procfs() calls the bkl explicitly, we need to
      include the relevant header as well.
      
      This fixes the following build error:
      
      	net/sunrpc/cache.c: In function 'cache_ioctl_procfs':
      	net/sunrpc/cache.c:1355: error: implicit declaration of function 'lock_kernel'
      	net/sunrpc/cache.c:1359: error: implicit declaration of function 'unlock_kernel'
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • procfs: Push down the bkl from ioctl · d79b6f4d
      Frederic Weisbecker committed
      Push down the bkl from procfs's ioctl main handler to its users.
      Only three procfs users implement an ioctl (non unlocked) handler.
      Turn them into unlocked_ioctl and push down the Devil inside.
      
      v2: PDE(inode)->data doesn't need to be under bkl
      v3: And don't forget to git-add the result
      v4: Use wrappers to pushdown instead of an invasive and error prone
          handlers surgery.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
  16. 24 Mar, 2010 1 commit
  17. 15 Mar, 2010 3 commits
    • sunrpc: never return expired entries in sunrpc_cache_lookup · d202cce8
      NeilBrown committed
      If sunrpc_cache_lookup finds an expired entry, remove it from
      the cache and return a freshly created non-VALID entry instead.
      This ensures that we only ever get a usable entry, or an
      entry that will become usable once an update arrives.
      i.e. we will never need to repeat the lookup.
      
      This allows us to remove the 'is_expired' test from cache_check
      (i.e. from cache_is_valid).  cache_check should never get an expired
      entry as 'lookup' will never return one.  If it does happen - due to
      inconvenient timing - then just accept it as still valid, it won't be
      very much past its use-by date.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
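The lookup behaviour described in commit d202cce8 can be modelled with a tiny direct-mapped table. Everything here is a simplification made for illustration: `struct lookup_entry`, `model_table` and `model_lookup()` are hypothetical, the expiry test ignores any flush handling, and a colliding key simply overwrites the slot, which a real hash chain would not do.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SLOTS 8

/* Hypothetical stand-in for a cache entry. */
struct lookup_entry {
    unsigned key;
    bool in_use;
    bool valid;        /* stands in for CACHE_VALID */
    long expiry_time;  /* seconds since boot */
    int generation;    /* counts how often this slot was (re)created */
};

static struct lookup_entry model_table[MODEL_SLOTS];

/* Simplified expiry test: only filled-in entries can expire. */
static bool model_is_expired(const struct lookup_entry *e, long now)
{
    return e->valid && e->expiry_time < now;
}

/* Return a usable entry, or a fresh non-valid one that an update can
 * fill in later. An expired entry is never handed back to the caller. */
static struct lookup_entry *model_lookup(unsigned key, long now)
{
    assert(now >= 0);
    struct lookup_entry *e = &model_table[key % MODEL_SLOTS];

    if (e->in_use && e->key == key && !model_is_expired(e, now))
        return e;

    /* Missing or expired: replace it with a freshly created entry. */
    int generation = e->generation + 1;
    *e = (struct lookup_entry){ .key = key, .in_use = true,
                                .generation = generation };
    return e;
}
```

Because the replacement happens inside the lookup, a caller in this model only ever holds an entry that is usable now or will become usable once an update arrives, so it never needs to repeat the lookup.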
    • sunrpc/cache: factor out cache_is_expired · 2f50d8b6
      NeilBrown committed
      This removes a tiny bit of code duplication but, more importantly,
      prepares for the following patch, which will perform the expiry check in
      cache_lookup and the rest of the validity check in cache_check.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
    • sunrpc: don't keep expired entries in the auth caches. · 3af4974e
      NeilBrown committed
      Currently, expired entries remain in the auth caches as long
      as there is a reference.
      This was needed long ago when the auth_domain cache used the same
      cache infrastructure.  But since that (being a very different sort
      of cache) was separated, this test is no longer needed.
      
      So remove the test on refcnt and tidy up the surrounding code.
      
      This allows the cache_dequeue call (which needed to be there to
      drop a potentially awkward reference) to be moved outside of the
      spinlock, which is a better place for it.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
  18. 30 Nov, 2009 1 commit
  19. 19 Sep, 2009 1 commit
  20. 18 Sep, 2009 1 commit
  21. 12 Sep, 2009 2 commits
  22. 20 Aug, 2009 1 commit
  23. 10 Aug, 2009 6 commits