1. 02 Apr, 2015 10 commits
    • FS-Cache: Retain the netfs context in the retrieval op earlier · 4a47132f
      David Howells committed
      Now that the retrieval operation may be disposed of by fscache_put_operation()
      before we actually set the context, the retrieval-specific cleanup operation
      can produce a NULL-pointer dereference when it tries to unconditionally clean
      up the netfs context.
      
      Given that it is expected that we'll get at least as far as the place where we
      currently set the context pointer and it is unlikely we'll go through the
      error handling paths prior to that point, retain the context right from the
      point that the retrieval op is allocated.
      
      Concomitant to this, we need to retain the cookie pointer in the retrieval op
      also so that we can call the netfs to release its context in the release
      method.
      
      In addition, we might now get into fscache_release_retrieval_op() with the op
      only initialised.  To this end, set the operation to DEAD only after the
      release method has been called and skip the n_pages test upon cleanup if the
      op is still in the INITIALISED state.
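
The release-path rule described above can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the kernel code: every `model_` name, the reduced state enum and the release counter are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced set of operation states (hypothetical; the kernel has more). */
enum op_state { OP_ST_INITIALISED, OP_ST_PENDING, OP_ST_COMPLETE, OP_ST_DEAD };

struct model_cookie {
    int contexts_released;        /* counts calls into the netfs release hook */
};

struct model_retrieval {
    enum op_state state;
    struct model_cookie *cookie;  /* retained so release can reach the netfs */
    void *context;                /* netfs context, retained at allocation */
    int n_pages;                  /* pages still outstanding */
};

/* Retain cookie and context as soon as the op exists, so the release
 * path never finds a NULL context on an early error exit. */
static void model_retrieval_init(struct model_retrieval *op,
                                 struct model_cookie *cookie,
                                 void *context, int n_pages)
{
    op->state = OP_ST_INITIALISED;
    op->cookie = cookie;
    op->context = context;
    op->n_pages = n_pages;
}

static void model_release_retrieval(struct model_retrieval *op)
{
    /* An op that never got past INITIALISED still has its full page
     * count, so the "no pages left" check only applies to later states. */
    if (op->state != OP_ST_INITIALISED)
        assert(op->n_pages == 0);

    op->cookie->contexts_released++;  /* stand-in for the netfs callback */
    op->context = NULL;
    op->state = OP_ST_DEAD;           /* marked dead only after release ran */
}
```

Releasing an op that is still INITIALISED with three pages outstanding is then legal in this model, which is the case the commit message says can now occur.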
      
      Without these changes, the following oops might be seen:
      
      	BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
      	...
      	RIP: 0010:[<ffffffffa0089c98>] fscache_release_retrieval_op+0xae/0x100
      	...
      	Call Trace:
      	 [<ffffffffa0088560>] fscache_put_operation+0x117/0x2e0
      	 [<ffffffffa008b8f5>] __fscache_read_or_alloc_pages+0x351/0x3ac
      	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
      	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
      	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
      	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
      	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
      	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
      	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
      	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
      	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
      	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
      	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
      	 [<ffffffff81137164>] __vfs_read+0x13/0x38
      	 [<ffffffff8113721e>] vfs_read+0x95/0x121
      	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
      	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: The operation cancellation method needs calling in more places · d3b97ca4
      David Howells committed
      Any time an incomplete operation is cancelled, the operation cancellation
      function needs to be called to clean up.  This is currently being passed
      directly to some of the functions that might want to call it, but not all.
      
      Instead, pass the cancellation method pointer to the fscache_operation_init()
      and have that cache it in the operation struct.  Further, plug in a dummy
      cancellation handler if the caller declines to set one as this allows us to
      call the function unconditionally (the extra overhead isn't worth bothering
      about as we don't expect to be calling this typically).
      
      The cancellation method must thence be called everywhere the CANCELLED state
      is set.  Note that we call it *before* setting the CANCELLED state such that
      the method can use the old state value to guide its operation.
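
A minimal userspace sketch of this arrangement, assuming a cut-down operation struct: the cancel pointer is cached at init time, a dummy handler is plugged in when none is supplied, and the handler runs before the state changes. All `model_` names and the `state_seen_by_cancel` field are hypothetical and are not part of FS-Cache.

```c
#include <assert.h>
#include <stddef.h>

enum op_state { OP_ST_INITIALISED, OP_ST_PENDING, OP_ST_IN_PROGRESS,
                OP_ST_CANCELLED };

struct model_op;
typedef void (*model_cancel_fn)(struct model_op *op);

struct model_op {
    enum op_state state;
    enum op_state state_seen_by_cancel;  /* illustration only */
    model_cancel_fn cancel;              /* cached at init time */
    int n_pages;
};

/* Dummy handler so callers can invoke op->cancel unconditionally. */
static void model_dummy_cancel(struct model_op *op)
{
    (void)op;
}

static void model_op_init(struct model_op *op, model_cancel_fn cancel)
{
    op->state = OP_ST_INITIALISED;
    op->state_seen_by_cancel = OP_ST_INITIALISED;
    op->cancel = cancel ? cancel : model_dummy_cancel;
    op->n_pages = 0;
}

/* Example handler: a retrieval drops its outstanding page count. */
static void model_cancel_retrieval(struct model_op *op)
{
    op->state_seen_by_cancel = op->state;  /* still the pre-cancel state */
    op->n_pages = 0;
}

/* The handler runs first, then the state flips to CANCELLED. */
static void model_mark_cancelled(struct model_op *op)
{
    op->cancel(op);
    op->state = OP_ST_CANCELLED;
}
```

Calling the handler before the state change is what lets it tell a pending op from an in-progress one, as the paragraph above notes.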
      
      fscache_do_cancel_retrieval() needs moving higher up in the sources so that
      the init function can use it now.
      
      Without this, the following oops may be seen:
      
      	FS-Cache: Assertion failed
      	FS-Cache: 3 == 0 is false
      	------------[ cut here ]------------
      	kernel BUG at ../fs/fscache/page.c:261!
      	...
      	RIP: 0010:[<ffffffffa0089c1b>]  fscache_release_retrieval_op+0x77/0x100
      	 [<ffffffffa008853d>] fscache_put_operation+0x114/0x2da
      	 [<ffffffffa008b8c2>] __fscache_read_or_alloc_pages+0x358/0x3b3
      	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
      	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
      	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
      	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
      	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
      	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
      	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
      	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
      	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
      	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
      	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
      	 [<ffffffff81137164>] __vfs_read+0x13/0x38
      	 [<ffffffff8113721e>] vfs_read+0x95/0x121
      	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
      	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17
      
      The assertion is showing that the remaining number of pages (n_pages) is not 0
      when the operation is being released.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: Put an aborted initialised op so that it is accounted correctly · a39caadf
      David Howells committed
      Call fscache_put_operation() or a wrapper on any op that has gone through
      fscache_operation_init() so that the accounting shown in /proc is done
      correctly, specifically fscache_n_op_release.
      
      fscache_put_operation() therefore now allows an op in the INITIALISED state as
      well as in the CANCELLED and COMPLETE states.
      
      Note that this means that an operation can get put that doesn't have its
      ->object pointer filled in, so anything that depends on the object needs to be
      conditional in fscache_put_operation().
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: Fix cancellation of in-progress operation · 73c04a47
      David Howells committed
      Cancellation of an in-progress operation needs to update the relevant counters
      and start any operations that are pending waiting on this one.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: Count the number of initialised operations · 03cdd0e4
      David Howells committed
      Count and display through /proc/fs/fscache/stats the number of initialised
      operations.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: Out of line fscache_operation_init() · 1339ec98
      David Howells committed
      Out of line fscache_operation_init() so that it can access internal FS-Cache
      features, such as stats, in a later commit.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: Permit fscache_cancel_op() to cancel in-progress operations too · 418b7eb9
      David Howells committed
      Currently, fscache_cancel_op() only cancels pending operations - attempts to
      cancel in-progress operations are ignored.  This leads to a problem in
      fscache_wait_for_operation_activation() whereby the wait is terminated, but
      the object has been killed.
      
      The check at the end of the function now triggers because it's no longer
      contingent on the cache having produced an I/O error since the commit that
      fixed the logic error in fscache_object_is_dead().
      
      The result of the check is that it tries to cancel the operation - but since
      the operation may no longer be pending by this point, the cancellation request
      may be ignored - with the result that the operation is just put by the caller
      and fscache_put_operation() has an assertion failure because the operation
      isn't in either the COMPLETE or the CANCELLED state.
      
      To fix this, we permit in-progress ops to be cancelled under some
      circumstances.
      
      The bug results in an oops that looks something like this:
      
      	FS-Cache: fscache_wait_for_operation_activation() = -ENOBUFS [obj dead 3]
      	FS-Cache:
      	FS-Cache: Assertion failed
      	FS-Cache: 3 == 5 is false
      	------------[ cut here ]------------
      	kernel BUG at ../fs/fscache/operation.c:432!
      	...
      	RIP: 0010:[<ffffffffa0088574>] fscache_put_operation+0xf2/0x2cd
      	Call Trace:
      	 [<ffffffffa008b92a>] __fscache_read_or_alloc_pages+0x2ec/0x3b3
      	 [<ffffffffa00b761f>] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
      	 [<ffffffffa00b06c5>] nfs_readpages+0x10c/0x185 [nfs]
      	 [<ffffffff81124925>] ? alloc_pages_current+0x119/0x13e
      	 [<ffffffff810ee5fd>] ? __page_cache_alloc+0xfb/0x10a
      	 [<ffffffff810f87f8>] __do_page_cache_readahead+0x188/0x22c
      	 [<ffffffff810f8b3a>] ondemand_readahead+0x29e/0x2af
      	 [<ffffffff810f8c92>] page_cache_sync_readahead+0x38/0x3a
      	 [<ffffffff810ef337>] generic_file_read_iter+0x1a2/0x55a
      	 [<ffffffffa00a9dff>] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
      	 [<ffffffffa00a6a23>] nfs_file_read+0x49/0x70 [nfs]
      	 [<ffffffff811363be>] new_sync_read+0x78/0x9c
      	 [<ffffffff81137164>] __vfs_read+0x13/0x38
      	 [<ffffffff8113721e>] vfs_read+0x95/0x121
      	 [<ffffffff811372f6>] SyS_read+0x4c/0x8a
      	 [<ffffffff81557a52>] system_call_fastpath+0x12/0x17
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: Handle a new operation submitted against a killed object · 6515d1db
      David Howells committed
      Reject new operations that are being submitted against an object if that
      object has failed its lookup or creation states or has been killed by the
      cache backend for some other reason, such as having been culled.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: When submitting an op, cancel it if the target object is dying · 30ceec62
      David Howells committed
      When submitting an operation, prefer to cancel the operation immediately
      rather than queuing it for later processing if the object is marked as dying
      (ie. the object state machine has reached the KILL_OBJECT state).
      
      Whilst we're at it, change the series of related test_bit() calls into a
      READ_ONCE() and bitwise-AND operators to reduce the number of load
      instructions (test_bit() has a volatile address).
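
The load-once-then-mask idea can be illustrated in userspace C as below. This is a sketch under assumptions: `MODEL_READ_ONCE` is a local stand-in for the kernel's `READ_ONCE()`, and the flag names and the choice of which flags trigger cancellation are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical object flags, expressed as masks rather than bit numbers. */
#define MODEL_FLAG_IS_LIVE  (1UL << 0)
#define MODEL_FLAG_DYING    (1UL << 1)
#define MODEL_FLAG_KILLED   (1UL << 2)

/* Userspace stand-in for READ_ONCE(): one volatile load of the word. */
#define MODEL_READ_ONCE(x) (*(const volatile unsigned long *)&(x))

struct model_object {
    unsigned long flags;
};

/* Take a single snapshot of the flags word and test several bits in it,
 * instead of issuing one volatile load per flag. */
static bool model_should_cancel(const struct model_object *obj)
{
    unsigned long flags = MODEL_READ_ONCE(obj->flags);

    return (flags & (MODEL_FLAG_DYING | MODEL_FLAG_KILLED)) != 0;
}
```

Besides saving loads, the snapshot means all the tests see one consistent value of the flags word, even if another CPU changes it between checks.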
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
    • FS-Cache: Move fscache_report_unexpected_submission() to make it more available · 3c305984
      David Howells committed
      Move fscache_report_unexpected_submission() up within operation.c so that it
      can be called from fscache_submit_exclusive_op() too.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      Acked-by: Jeff Layton <jeff.layton@primarydata.com>
  2. 05 Jun, 2014 1 commit
  3. 19 Jun, 2013 4 commits
    • FS-Cache: Don't use spin_is_locked() in assertions · dcfae32f
      David Howells committed
      Under certain circumstances, spin_is_locked() is hardwired to 0 - even when the
      code would normally be in a locked section where it should return 1.  This
      means it cannot be used for an assertion that checks that a spinlock is locked.
      
      Remove such usages from FS-Cache.
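
To see why such an assertion misfires, consider the following userspace model. It assumes a hypothetical build switch `MODEL_SMP`; when the switch is absent the lock query collapses to a constant 0, which is the situation the paragraph above describes. None of these names are kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

struct model_spinlock {
    bool held;
};

#ifdef MODEL_SMP
/* Configuration where the lock word is really inspected. */
#define model_spin_is_locked(l) ((l)->held)
#else
/* Configuration where the query is a constant, whatever the lock state. */
#define model_spin_is_locked(l) ((void)(l), 0)
#endif

static void model_spin_lock(struct model_spinlock *l)
{
    l->held = true;
}

static void model_spin_unlock(struct model_spinlock *l)
{
    l->held = false;
}
```

With `MODEL_SMP` undefined, a check of the form "assert the lock is held" fails even between `model_spin_lock()` and `model_spin_unlock()`, so the query cannot serve as a locking assertion.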
      
      The following oops might otherwise be observed:
      
      FS-Cache: Assertion failed
      BUG: failure at fs/fscache/operation.c:270/fscache_start_operations()!
      Kernel panic - not syncing: BUG!
      CPU: 0 PID: 10 Comm: kworker/u2:1 Not tainted 3.10.0-rc1-00133-ge7ebb75 #2
      Workqueue: fscache_operation fscache_op_work_func [fscache]
      7f091c48 603c8947 7f090000 7f9b1361 7f25f080 00000001 7f26d440 7f091c90
      60299eb8 7f091d90 602951c5 7f26d440 3000000008 7f091da0 7f091cc0 7f091cd0
      00000007 00000007 00000006 7f091ae0 00000010 0000010e 7f9af330 7f091ae0
      Call Trace:
      7f091c88: [<60299eb8>] dump_stack+0x17/0x19
      7f091c98: [<602951c5>] panic+0xf4/0x1e9
      7f091d38: [<6002b10e>] set_signals+0x1e/0x40
      7f091d58: [<6005b89e>] __wake_up+0x4e/0x70
      7f091d98: [<7f9aa003>] fscache_start_operations+0x43/0x50 [fscache]
      7f091da8: [<7f9aa1e3>] fscache_op_complete+0x1d3/0x220 [fscache]
      7f091db8: [<60082985>] unlock_page+0x55/0x60
      7f091de8: [<7fb25bb0>] cachefiles_read_copier+0x250/0x330 [cachefiles]
      7f091e58: [<7f9ab03c>] fscache_op_work_func+0xac/0x120 [fscache]
      7f091e88: [<6004d5b0>] process_one_work+0x250/0x3a0
      7f091ef8: [<6004edc7>] worker_thread+0x177/0x2a0
      7f091f38: [<6004ec50>] worker_thread+0x0/0x2a0
      7f091f58: [<60054418>] kthread+0xd8/0xe0
      7f091f68: [<6005bb27>] finish_task_switch.isra.64+0x37/0xa0
      7f091fd8: [<600185cf>] new_thread_handler+0x8f/0xb0
      Reported-by: Milosz Tanski <milosz@adfin.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-and-tested-By: Milosz Tanski <milosz@adfin.com>
    • FS-Cache: Simplify cookie retention for fscache_objects, fixing oops · 1362729b
      David Howells committed
      Simplify the way fscache cache objects retain their cookie.  The way I
      implemented the cookie storage handling made synchronisation a pain (ie. the
      object state machine can't rely on the cookie actually still being there).
      
      Instead of the object being detached from the cookie and the cookie being
      freed in __fscache_relinquish_cookie(), we defer both operations:
      
       (*) The detachment of the object from the list in the cookie now takes place
           in fscache_drop_object() and is thus governed by the object state machine
           (fscache_detach_from_cookie() has been removed).
      
       (*) The release of the cookie is now in fscache_object_destroy() - which is
           called by the cache backend just before it frees the object.
      
      This means that the fscache_cookie struct is now available to the cache all the
      way through from ->alloc_object() to ->drop_object() and ->put_object() -
      meaning that it's no longer necessary to take object->lock to guarantee access.
      
      However, __fscache_relinquish_cookie() doesn't wait for the object to go all
      the way through to destruction before letting the netfs proceed.  That would
      massively slow down the netfs.  Since __fscache_relinquish_cookie() leaves the
      cookie around, it must therefore break all attachments to the netfs - which
      includes ->def, ->netfs_data and any outstanding page read/writes.
      
      To handle this, struct fscache_cookie now has an n_active counter:
      
       (1) This starts off initialised to 1.
      
       (2) Any time the cache needs to get at the netfs data, it calls
           fscache_use_cookie() to increment it - if it is not zero.  If it was zero,
           then access is not permitted.
      
       (3) When the cache has finished with the data, it calls fscache_unuse_cookie()
           to decrement it.  This does a wake-up on it if it reaches 0.
      
       (4) __fscache_relinquish_cookie() decrements n_active and then waits for it to
           reach 0.  The initialisation to 1 in step (1) ensures that we only get
           wake ups when we're trying to get rid of the cookie.
      
      This leaves __fscache_relinquish_cookie() a lot simpler.
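
Steps (1) to (4) can be modelled in userspace with C11 atomics. This is a sketch under assumptions: the `model_` names are hypothetical, the wake-up is reduced to a counter, and the real relinquish path would sleep until the count reaches 0 rather than return a flag. In the kernel the "increment unless zero" step would more likely be expressed with `atomic_inc_not_zero()` than with an open-coded compare-and-swap loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_cookie {
    atomic_int n_active;
    int wakeups;              /* stand-in for waking the relinquishing task */
};

/* Step (1): the counter starts at 1. */
static void model_cookie_init(struct model_cookie *c)
{
    atomic_init(&c->n_active, 1);
    c->wakeups = 0;
}

/* Step (2): increment unless the counter is already zero. */
static bool model_use_cookie(struct model_cookie *c)
{
    int old = atomic_load(&c->n_active);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&c->n_active, &old, old + 1))
            return true;
    }
    return false;             /* access to netfs data is not permitted */
}

/* Step (3): decrement, waking the waiter when the count hits zero. */
static void model_unuse_cookie(struct model_cookie *c)
{
    if (atomic_fetch_sub(&c->n_active, 1) == 1)
        c->wakeups++;
}

/* Step (4), first half: drop the initial count. Returns true if no user
 * remains; otherwise the caller would wait for the wake-up. */
static bool model_relinquish_begin(struct model_cookie *c)
{
    return atomic_fetch_sub(&c->n_active, 1) == 1;
}
```

Because the count starts at 1, it cannot reach 0 through use/unuse pairs alone, so a wake-up only happens once relinquishment has dropped the initial count.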
      
      
      ***
      This fixes a problem in the current code whereby if fscache_invalidate() is
      followed sufficiently quickly by fscache_relinquish_cookie() then it is
      possible for __fscache_relinquish_cookie() to have detached the cookie from the
      object and cleared the pointer before a thread is dispatched to process the
      invalidation state in the object state machine.
      
      Since the pending write clearance was deferred to the invalidation state to
      make it asynchronous, we need to either wait in relinquishment for the stores
      tree to be cleared in the invalidation state or we need to handle the clearance
      in relinquishment.
      
      Further, if the relinquishment code does clear the tree, then the invalidation
      state needs to make the clearance contingent on still having the cookie to hand
      (since that's where the tree is rooted) and we have to prevent the cookie from
      disappearing for the duration.
      
      This can lead to an oops like the following:
      
      BUG: unable to handle kernel NULL pointer dereference at 000000000000000c
      ...
      RIP: 0010:[<ffffffff8151023e>] _spin_lock+0xe/0x30
      ...
      CR2: 000000000000000c ...
      ...
      Process kslowd002 (...)
      ....
      Call Trace:
       [<ffffffffa01c3278>] fscache_invalidate_writes+0x38/0xd0 [fscache]
       [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
       [<ffffffff8105e759>] ? find_busiest_queue+0x69/0x150
       [<ffffffff8110ddd4>] ? slow_work_enqueue+0x104/0x180
       [<ffffffffa01c1303>] fscache_object_slow_work_execute+0x5e3/0x9d0 [fscache]
       [<ffffffff81096b67>] ? bit_waitqueue+0x17/0xd0
       [<ffffffff8110e233>] slow_work_execute+0x233/0x310
       [<ffffffff8110e515>] slow_work_thread+0x205/0x360
       [<ffffffff81096ca0>] ? autoremove_wake_function+0x0/0x40
       [<ffffffff8110e310>] ? slow_work_thread+0x0/0x360
       [<ffffffff81096936>] kthread+0x96/0xa0
       [<ffffffff8100c0ca>] child_rip+0xa/0x20
       [<ffffffff810968a0>] ? kthread+0x0/0xa0
       [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      
      The parameter to fscache_invalidate_writes() was object->cookie which is NULL.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-By: Milosz Tanski <milosz@adfin.com>
      Acked-by: Jeff Layton <jlayton@redhat.com>
    • FS-Cache: Fix object state machine to have separate work and wait states · caaef690
      David Howells committed
      Fix object state machine to have separate work and wait states as that makes
      it easier to envision.
      
      There are now three kinds of state:
      
       (1) Work state.  This is an execution state.  No event processing is performed
           by a work state.  The function attached to a work state returns a pointer
           indicating the next state to which the OSM should transition.  Returning
           NO_TRANSIT repeats the current state, but goes back to the scheduler
           first.
      
       (2) Wait state.  This is an event processing state.  No execution is
           performed by a wait state.  Wait states are just tables of "if event X
           occurs, clear it and transition to state Y".  The dispatcher returns to
           the scheduler if none of the events in which the wait state has an
           interest are currently pending.
      
       (3) Out-of-band state.  This is a special work state.  Transitions to normal
           states can be overridden when an unexpected event occurs (eg. I/O error).
           Instead the dispatcher disables and clears the OOB event and transits to
           the specified work state.  This then acts as an ordinary work state,
           though object->state points to the overridden destination.  Returning
           NO_TRANSIT resumes the overridden transition.
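
The split between work states and wait states can be sketched as a small table-driven machine. This is an illustrative userspace model only: the struct layout, the `model_` names and the three example states are assumptions, and the out-of-band mechanism from (3) is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

struct model_object;
struct model_state;

/* One row of a wait state's table: "if event X occurs, go to state Y". */
struct model_transition {
    unsigned long events;                  /* event mask; 0 ends the table */
    const struct model_state *transit_to;
};

struct model_state {
    const char *name;                      /* name lives in the definition */
    /* Work states set this; it returns the next state or NO_TRANSIT. */
    const struct model_state *(*work)(struct model_object *obj);
    /* Wait states set this instead and do no work of their own. */
    const struct model_transition *transitions;
};

struct model_object {
    const struct model_state *state;
    unsigned long events;                  /* pending events */
    int lookups;
};

#define MODEL_EV_KILL    (1UL << 0)
#define MODEL_NO_TRANSIT NULL

static const struct model_state state_wait_for_cmd;
static const struct model_state state_kill_object;

static const struct model_state *model_lookup_work(struct model_object *obj)
{
    obj->lookups++;
    return &state_wait_for_cmd;
}

static const struct model_state *model_kill_work(struct model_object *obj)
{
    (void)obj;
    return MODEL_NO_TRANSIT;               /* stay put; back to scheduler */
}

static const struct model_transition wait_for_cmd_transitions[] = {
    { MODEL_EV_KILL, &state_kill_object },
    { 0, NULL }
};

static const struct model_state state_look_up =
    { "LOOK_UP", model_lookup_work, NULL };
static const struct model_state state_wait_for_cmd =
    { "WAIT_FOR_CMD", NULL, wait_for_cmd_transitions };
static const struct model_state state_kill_object =
    { "KILL_OBJECT", model_kill_work, NULL };

/* One dispatcher pass: run work states, match events in wait states, and
 * return to the "scheduler" when nothing more can be done right now. */
static void model_dispatch(struct model_object *obj)
{
    for (;;) {
        const struct model_state *st = obj->state;
        const struct model_transition *t;

        if (st->work) {
            const struct model_state *next = st->work(obj);
            if (next == MODEL_NO_TRANSIT)
                return;
            obj->state = next;
            continue;
        }
        for (t = st->transitions; t->events; t++) {
            if (obj->events & t->events) {
                obj->events &= ~t->events;  /* clear the event, then move */
                obj->state = t->transit_to;
                break;
            }
        }
        if (!t->events)
            return;                         /* no interesting event pending */
    }
}
```

Since each state is a distinct struct, the only meaningful comparison between states in this model is pointer equality.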
      
      In addition, the states have names in their definitions, so there's no need for
      tables of state names.  Further, the EV_REQUEUE event is no longer necessary as
      that is automatic for work states.
      
      Since the states are now separate structs rather than values in an enum, it's
      not possible to use comparisons other than (non-)equality between them, so use
      some object->flags to indicate what phase an object is in.
      
      The EV_RELEASE, EV_RETIRE and EV_WITHDRAW events have been squished into one
      (EV_KILL).  An object flag now carries the information about retirement.
      
      Similarly, the RELEASING, RECYCLING and WITHDRAWING states have been merged
      into a KILL_OBJECT state and additional states have been added for handling
      waiting dependent objects (JUMPSTART_DEPS and KILL_DEPENDENTS).
      
      A state has also been added for synchronising with parent object initialisation
      (WAIT_FOR_PARENT) and another for initiating look up (PARENT_READY).
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-By: Milosz Tanski <milosz@adfin.com>
      Acked-by: Jeff Layton <jlayton@redhat.com>
    • FS-Cache: Wrap checks on object state · 493f7bc1
      David Howells committed
      Wrap checks on object state (mostly outside of fs/fscache/object.c) with
      inline functions so that the mechanism can be replaced.
      
      Some of the state checks within object.c are left as-is as they will be
      replaced.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-By: Milosz Tanski <milosz@adfin.com>
      Acked-by: Jeff Layton <jlayton@redhat.com>
  4. 21 Dec, 2012 6 commits
    • FS-Cache: Clear remaining page count on retrieval cancellation · 91c7fbbf
      David Howells committed
      Provide fscache_cancel_op() with a pointer to a function it should invoke under
      lock if it cancels an operation.
      
      Use this to clear the remaining page count upon cancellation of a pending
      retrieval operation so that fscache_release_retrieval_op() doesn't get an
      assertion failure (see below).  This can happen when a signal occurs, say from
      CTRL-C being pressed during data retrieval.
      
      FS-Cache: Assertion failed
      3 == 0 is false
      ------------[ cut here ]------------
      kernel BUG at fs/fscache/page.c:237!
      invalid opcode: 0000 [#641] SMP
      Modules linked in: cachefiles(F) nfsv4(F) nfsv3(F) nfsv2(F) nfs(F) fscache(F) auth_rpcgss(F) nfs_acl(F) lockd(F) sunrpc(F)
      CPU 0
      Pid: 6075, comm: slurp-q Tainted: GF     D      3.7.0-rc8-fsdevel+ #411                  /DG965RY
      RIP: 0010:[<ffffffffa007f328>]  [<ffffffffa007f328>] fscache_release_retrieval_op+0x75/0xff [fscache]
      RSP: 0000:ffff88001c6d7988  EFLAGS: 00010296
      RAX: 000000000000000f RBX: ffff880014cdfe00 RCX: ffffffff6c102000
      RDX: ffffffff8102d1ad RSI: ffffffff6c102000 RDI: ffffffff8102d1d6
      RBP: ffff88001c6d7998 R08: 0000000000000002 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffe00
      R13: ffff88001c6d7ab4 R14: ffff88001a8638a0 R15: ffff88001552b190
      FS:  00007f877aaf0700(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007fff11378fd2 CR3: 000000001c6c6000 CR4: 00000000000007f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process slurp-q (pid: 6075, threadinfo ffff88001c6d6000, task ffff88001c6c4080)
      Stack:
       ffffffffa007ec07 ffff880014cdfe00 ffff88001c6d79c8 ffffffffa007db4d
       ffffffffa007ec07 ffff880014cdfe00 00000000fffffe00 ffff88001c6d7ab4
       ffff88001c6d7a38 ffffffffa008116d 0000000000000000 ffff88001c6c4080
      Call Trace:
       [<ffffffffa007ec07>] ? fscache_cancel_op+0x194/0x1cf [fscache]
       [<ffffffffa007db4d>] fscache_put_operation+0x135/0x2ed [fscache]
       [<ffffffffa007ec07>] ? fscache_cancel_op+0x194/0x1cf [fscache]
       [<ffffffffa008116d>] __fscache_read_or_alloc_pages+0x413/0x4bc [fscache]
       [<ffffffff810ac8ae>] ? __alloc_pages_nodemask+0x195/0x75c
       [<ffffffffa00aab0f>] __nfs_readpages_from_fscache+0x86/0x13d [nfs]
       [<ffffffffa00a5fe0>] nfs_readpages+0x186/0x1bd [nfs]
       [<ffffffff810d23c8>] ? alloc_pages_current+0xc7/0xe4
       [<ffffffff810a68b5>] ? __page_cache_alloc+0x84/0x91
       [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0
       [<ffffffff810afaa3>] __do_page_cache_readahead+0x237/0x2e0
       [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0
       [<ffffffff810afe3e>] ra_submit+0x1c/0x20
       [<ffffffff810b019b>] ondemand_readahead+0x359/0x382
       [<ffffffff810b0279>] page_cache_sync_readahead+0x38/0x3a
       [<ffffffff810a77b5>] generic_file_aio_read+0x26b/0x637
       [<ffffffffa00f1852>] ? nfs_mark_delegation_referenced+0xb/0xb [nfsv4]
       [<ffffffffa009cc85>] nfs_file_read+0xaa/0xcf [nfs]
       [<ffffffff810db5b3>] do_sync_read+0x91/0xd1
       [<ffffffff810dbb8b>] vfs_read+0x9b/0x144
       [<ffffffff810dbc78>] sys_read+0x44/0x75
       [<ffffffff81422892>] system_call_fastpath+0x16/0x1b
      Signed-off-by: David Howells <dhowells@redhat.com>
    • FS-Cache: Mark cancellation of in-progress operation · 1f372dff
      David Howells committed
      Mark as cancelled an operation that is in progress rather than pending at the
      time it is cancelled, and call fscache_complete_op() to cancel an operation so
      that blocked ops can be started.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • FS-Cache: Exclusive op submission can BUG if there's been an I/O error · 8d76349d
      David Howells committed
      The function to submit an exclusive op (fscache_submit_exclusive_op()) can BUG
      if there's been an I/O error because it may see the parent cache object in an
      unexpected state.  It should only BUG if there hasn't been an I/O error.
      
      In this case the problem was produced by remounting the cache partition to be
      R/O.  The EROFS state was detected and the cache was aborted, but not
      everything handled the aborting correctly.
      
      SysRq : Emergency Remount R/O
      EXT4-fs (sda6): re-mounted. Opts: (null)
      Emergency Remount complete
      CacheFiles: I/O Error: Failed to update xattr with error -30
      FS-Cache: Cache cachefiles stopped due to I/O error
      ------------[ cut here ]------------
      kernel BUG at fs/fscache/operation.c:128!
      invalid opcode: 0000 [#1] SMP 
      CPU 0 
      Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
      
      Pid: 6612, comm: kworker/u:2 Not tainted 3.1.0-rc8-fsdevel+ #1093                  /DG965RY
      RIP: 0010:[<ffffffffa00739c0>]  [<ffffffffa00739c0>] fscache_submit_exclusive_op+0x2ad/0x2c2 [fscache]
      RSP: 0018:ffff880000853d40  EFLAGS: 00010206
      RAX: ffff880038ac72a8 RBX: ffff8800181f2260 RCX: ffffffff81f2b2b0
      RDX: 0000000000000001 RSI: ffffffff8179a478 RDI: ffff8800181f2280
      RBP: ffff880000853d60 R08: 0000000000000002 R09: 0000000000000000
      R10: 0000000000000001 R11: 0000000000000001 R12: ffff880038ac7268
      R13: ffff8800181f2280 R14: ffff88003a359190 R15: 000000010122b162
      FS:  0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000034cc4a77f0 CR3: 0000000010e96000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process kworker/u:2 (pid: 6612, threadinfo ffff880000852000, task ffff880014c3c040)
      Stack:
       ffff8800181f2260 ffff8800181f2310 ffff880038ac7268 ffff8800181f2260
       ffff880000853dc0 ffffffffa0072375 ffff880037ecfe00 ffff88003a359198
       ffff880000853dc0 0000000000000246 0000000000000000 ffff88000a91d308
      Call Trace:
       [<ffffffffa0072375>] fscache_object_work_func+0x792/0xe65 [fscache]
       [<ffffffff81047e44>] process_one_work+0x1eb/0x37f
       [<ffffffff81047de6>] ? process_one_work+0x18d/0x37f
       [<ffffffffa0071be3>] ? fscache_enqueue_dependents+0xd8/0xd8 [fscache]
       [<ffffffff810482e4>] worker_thread+0x15a/0x21a
       [<ffffffff8104818a>] ? rescuer_thread+0x188/0x188
       [<ffffffff8104bf96>] kthread+0x7f/0x87
       [<ffffffff813ad6f4>] kernel_thread_helper+0x4/0x10
       [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
       [<ffffffff813abd1d>] ? retint_restore_args+0xe/0xe
       [<ffffffff8104bf17>] ? __init_kthread_worker+0x53/0x53
       [<ffffffff813ad6f0>] ? gs_change+0xb/0xb
      Signed-off-by: David Howells <dhowells@redhat.com>
    • FS-Cache: Provide proper invalidation · ef778e7a
      David Howells committed
      Provide a proper invalidation method rather than relying on the netfs retiring
      the cookie it has and getting a new one.  The problem with this is that it isn't
      easy for the netfs to make sure that it has completed/cancelled all its
      outstanding storage and retrieval operations on the cookie it is retiring.
      
      Instead, have the cache provide an invalidation method that will cancel or wait
      for all currently outstanding operations before invalidating the cache, and
      will cause new operations to queue up behind that.  Whilst invalidation is in
      progress, some requests will be rejected until the cache can stack a barrier on
      the operation queue to cause new operations to be deferred behind it.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • FS-Cache: Fix operation state management and accounting · 9f10523f
      David Howells committed
      Fix the state management of internal fscache operations and the accounting of
      what operations are in what states.
      
      This is done by:
      
       (1) Give struct fscache_operation an enum variable that directly represents the
           state it's currently in, rather than spreading this knowledge over a bunch
           of flags, who's processing the operation at the moment and whether it is
           queued or not.
      
           This makes it easier to write assertions to check the state at various
           points and to prevent invalid state transitions.
      
       (2) Add an 'operation complete' state and supply a function to indicate the
           completion of an operation (fscache_op_complete()) and make things call
           it.  The final call to fscache_put_operation() can then check that an op
           is in the appropriate state (complete or cancelled).
      
       (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
           govern the state of an object:
      
      	(a) The ->n_ops is now the number of extant operations on the object
      	    and is now decremented by fscache_put_operation() only.
      
	(b) The ->n_in_progress is simply the number of operations that have been
      	    taken off of the object's pending queue for the purposes of being
      	    run.  This is decremented by fscache_op_complete() only.
      
      	(c) The ->n_exclusive is the number of exclusive ops that have been
      	    submitted and queued or are in progress.  It is decremented by
      	    fscache_op_complete() and by fscache_cancel_op().
      
           fscache_put_operation() and fscache_operation_gc() now no longer try to
           clean up ->n_exclusive and ->n_in_progress.  That was leading to double
           decrements against fscache_cancel_op().
      
           fscache_cancel_op() now no longer decrements ->n_ops.  That was leading to
           double decrements against fscache_put_operation().
      
           fscache_submit_exclusive_op() now decides whether it has to queue an op
           based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
           will persist in being true even after all preceding operations have been
           cancelled or completed.  Furthermore, if an object is active and there are
           runnable ops against it, there must be at least one op running.
      
       (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
           provide a function to record completion of the pages as they complete.
      
           When n_pages reaches 0, the operation is deemed to be complete and
           fscache_op_complete() is called.
      
           Add calls to fscache_retrieval_complete() anywhere we've finished with a
           page we've been given to read or allocate for.  This includes places where
           we just return pages to the netfs for reading from the server and where
           accessing the cache fails and we discard the proposed netfs page.
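
      As a rough illustration of points (1) to (4), the accounting rules can be
      modelled in user space as below.  This is only a sketch under stated
      assumptions: the acct_* names, the list of states and the structures are
      hypothetical stand-ins and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names come from the kernel. */
enum acct_op_state {
	ACCT_OP_INITIALISED,
	ACCT_OP_PENDING,	/* queued, waiting to run */
	ACCT_OP_IN_PROGRESS,	/* taken off the pending queue */
	ACCT_OP_COMPLETE,
	ACCT_OP_CANCELLED,
	ACCT_OP_DEAD,
};

struct acct_object {
	int n_ops;		/* extant operations */
	int n_in_progress;	/* ops taken off the pending queue to run */
	int n_exclusive;	/* exclusive ops queued or running */
};

struct acct_op {
	struct acct_object *object;
	enum acct_op_state state;
	bool exclusive;
	int n_pages;		/* remaining pages (retrieval ops only) */
};

static void acct_submit(struct acct_object *obj, struct acct_op *op,
			bool exclusive, int n_pages)
{
	op->object = obj;
	op->exclusive = exclusive;
	op->n_pages = n_pages;
	op->state = ACCT_OP_PENDING;
	obj->n_ops++;
	if (exclusive)
		obj->n_exclusive++;
}

static void acct_run(struct acct_op *op)
{
	assert(op->state == ACCT_OP_PENDING);
	op->state = ACCT_OP_IN_PROGRESS;
	op->object->n_in_progress++;
}

/* Completion is the only place that drops n_in_progress. */
static void acct_op_complete(struct acct_op *op)
{
	assert(op->state == ACCT_OP_IN_PROGRESS);
	op->object->n_in_progress--;
	if (op->exclusive)
		op->object->n_exclusive--;
	op->state = ACCT_OP_COMPLETE;
}

/* Cancelling a pending op drops n_exclusive but never n_ops. */
static void acct_cancel_op(struct acct_op *op)
{
	assert(op->state == ACCT_OP_PENDING);
	if (op->exclusive)
		op->object->n_exclusive--;
	op->state = ACCT_OP_CANCELLED;
}

/* Record n finished pages; the op completes when none remain. */
static void acct_retrieval_complete(struct acct_op *op, int n)
{
	assert(n <= op->n_pages);
	op->n_pages -= n;
	if (op->n_pages == 0)
		acct_op_complete(op);
}

/* The final put is the only place that drops n_ops. */
static void acct_put_operation(struct acct_op *op)
{
	assert(op->state == ACCT_OP_COMPLETE ||
	       op->state == ACCT_OP_CANCELLED);
	op->object->n_ops--;
	op->state = ACCT_OP_DEAD;
}
```

      Because each counter has a single owner in this model, the double
      decrements described above cannot occur, and the assertion in the final put
      catches an op that is released in an unexpected state.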
      
      The bugs in the unfixed state management manifest themselves as oopses like the
      following where the operation completion gets out of sync with return of the
      cookie by the netfs.  This is possible because the cache unlocks and returns
      all the netfs pages before recording its completion - which means that there's
      nothing to stop the netfs discarding them and returning the cookie.
      
      
      FS-Cache: Cookie 'NFS.fh' still has outstanding reads
      ------------[ cut here ]------------
      kernel BUG at fs/fscache/cookie.c:519!
      invalid opcode: 0000 [#1] SMP
      CPU 1
      Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
      
      Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090                  /DG965RY
      RIP: 0010:[<ffffffffa007050a>]  [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
      RSP: 0018:ffff8800368cfb00  EFLAGS: 00010282
      RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
      RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
      RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
      R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
      R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
      FS:  0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
      Stack:
       ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
       ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
       ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
      Call Trace:
       [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
       [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
       [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
       [<ffffffff810d8d47>] evict+0xa1/0x15c
       [<ffffffff810d8e2e>] dispose_list+0x2c/0x38
       [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
       [<ffffffff810c56b7>] prune_super+0xd5/0x140
       [<ffffffff8109b615>] shrink_slab+0x102/0x1ab
       [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
       [<ffffffff8103e009>] ? process_timeout+0xb/0xb
       [<ffffffff8109dba3>] kswapd+0x270/0x289
       [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
       [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
       [<ffffffff8104bf7a>] kthread+0x7f/0x87
       [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
       [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
       [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
       [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
       [<ffffffff813ad6b0>] ? gs_change+0xb/0xb
      Signed-off-by: David Howells <dhowells@redhat.com>
    • D
      FS-Cache: Make cookie relinquishment wait for outstanding reads · ef46ed88
      David Howells committed
      Make fscache_relinquish_cookie() log a warning and wait if there are any
      outstanding reads left on the cookie it was given.
      Signed-off-by: David Howells <dhowells@redhat.com>
  5. 25 May, 2011 1 commit
  6. 15 Jan, 2011 1 commit
  7. 23 Jul, 2010 1 commit
    • T
      fscache: convert operation to use workqueue instead of slow-work · 8af7c124
      Tejun Heo committed
      Make fscache operations use only a workqueue instead of a combination of
      workqueue and slow-work.  FSCACHE_OP_SLOW is dropped and
      FSCACHE_OP_FAST is renamed to FSCACHE_OP_ASYNC and uses newly added
      fscache_op_wq workqueue to execute op->processor().
      fscache_operation_init_slow() is dropped and fscache_operation_init()
      now takes @processor argument directly.
      
      * Unbound workqueue is used.
      
      * fscache_retrieval_work() is no longer necessary as OP_ASYNC now does
        the equivalent thing.
      
      * sysctl fscache.operation_max_active added to control concurrency.
        The default value is nr_cpus clamped between 2 and
        WQ_UNBOUND_MAX_ACTIVE.
      
      * debugfs support is dropped for now.  Tracing API based debug
        facility is planned to be added.
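
      The default for fscache.operation_max_active described above amounts to a
      simple clamp.  A minimal sketch, assuming a placeholder upper bound (the
      real WQ_UNBOUND_MAX_ACTIVE is defined by the workqueue code and is not
      stated in this message; the wq_model_* names are hypothetical):

```c
#include <assert.h>

/* Assumed placeholder value, for illustration only. */
#define WQ_MODEL_UNBOUND_MAX_ACTIVE 512

/* Clamp the CPU count to the range [2, WQ_MODEL_UNBOUND_MAX_ACTIVE]. */
static int wq_model_default_max_active(int nr_cpus)
{
	if (nr_cpus < 2)
		return 2;
	if (nr_cpus > WQ_MODEL_UNBOUND_MAX_ACTIVE)
		return WQ_MODEL_UNBOUND_MAX_ACTIVE;
	return nr_cpus;
}
```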
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David Howells <dhowells@redhat.com>
  8. 30 Mar, 2010 2 commits
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
    • D
      SLOW_WORK: CONFIG_SLOW_WORK_PROC should be CONFIG_SLOW_WORK_DEBUG · a53f4f9e
      David Howells committed
      CONFIG_SLOW_WORK_PROC was changed to CONFIG_SLOW_WORK_DEBUG, but not in all
      instances.  Change the remaining instances.  This makes the debugfs file
      display the time mark and the owner's description again.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 20 Nov, 2009 5 commits
    • D
      FS-Cache: Handle read request vs lookup, creation or other cache failure · e3d4d28b
      David Howells committed
      FS-Cache doesn't correctly handle the netfs requesting a read from the cache
      on an object that failed or was withdrawn by the cache.  A trace similar to
      the following might be seen:
      
      	CacheFiles: Lookup failed error -105
      	[exe   ] unexpected submission OP165afe [OBJ6cac OBJECT_LC_DYING]
      	[exe   ] objstate=OBJECT_LC_DYING [OBJECT_LC_DYING]
      	[exe   ] objflags=0
      	[exe   ] objevent=9 [fffffffffffffffb]
      	[exe   ] ops=0 inp=0 exc=0
      	Pid: 6970, comm: exe Not tainted 2.6.32-rc6-cachefs #50
      	Call Trace:
      	 [<ffffffffa0076477>] fscache_submit_op+0x3ff/0x45a [fscache]
      	 [<ffffffffa0077997>] __fscache_read_or_alloc_pages+0x187/0x3c4 [fscache]
      	 [<ffffffffa00b6480>] ? nfs_readpage_from_fscache_complete+0x0/0x66 [nfs]
      	 [<ffffffffa00b6388>] __nfs_readpages_from_fscache+0x7e/0x176 [nfs]
      	 [<ffffffff8108e483>] ? __alloc_pages_nodemask+0x11c/0x5cf
      	 [<ffffffffa009d796>] nfs_readpages+0x114/0x1d7 [nfs]
      	 [<ffffffff81090314>] __do_page_cache_readahead+0x15f/0x1ec
      	 [<ffffffff81090228>] ? __do_page_cache_readahead+0x73/0x1ec
      	 [<ffffffff810903bd>] ra_submit+0x1c/0x20
      	 [<ffffffff810906bb>] ondemand_readahead+0x227/0x23a
      	 [<ffffffff81090762>] page_cache_sync_readahead+0x17/0x19
      	 [<ffffffff8108a99e>] generic_file_aio_read+0x236/0x5a0
      	 [<ffffffffa00937bd>] nfs_file_read+0xe4/0xf3 [nfs]
      	 [<ffffffff810b2fa2>] do_sync_read+0xe3/0x120
      	 [<ffffffff81354cc3>] ? _spin_unlock_irq+0x2b/0x31
      	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
      	 [<ffffffff811848e5>] ? selinux_file_permission+0x5d/0x10f
      	 [<ffffffff81352bdb>] ? thread_return+0x3e/0x101
      	 [<ffffffff8117d7b0>] ? security_file_permission+0x11/0x13
      	 [<ffffffff810b3b06>] vfs_read+0xaa/0x16f
      	 [<ffffffff81058df0>] ? trace_hardirqs_on_caller+0x10c/0x130
      	 [<ffffffff810b3c84>] sys_read+0x45/0x6c
      	 [<ffffffff8100ae2b>] system_call_fastpath+0x16/0x1b
      
      The object state might also be OBJECT_DYING or OBJECT_WITHDRAWING.
      
      This should be handled by simply rejecting the new operation with ENOBUFS.
      There's no need to log an error for it.  Events of this type now appear in the
      stats file under Ops:rej.
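
      The rejection rule can be sketched in user space as follows.  This is an
      illustrative model only: the rej_* names, the reduced state list and the
      counter are assumptions and do not reproduce the fscache_submit_op() code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, reduced set of object states for this sketch. */
enum rej_obj_state {
	REJ_OBJECT_LOOKING_UP,
	REJ_OBJECT_ACTIVE,
	REJ_OBJECT_LC_DYING,
	REJ_OBJECT_DYING,
	REJ_OBJECT_WITHDRAWING,
};

struct rej_stats {
	unsigned int ops_rejected;	/* stands in for the Ops:rej counter */
};

/* Quietly reject submissions against an object that is going away. */
static int rej_submit_op(enum rej_obj_state state, struct rej_stats *stats)
{
	switch (state) {
	case REJ_OBJECT_LC_DYING:
	case REJ_OBJECT_DYING:
	case REJ_OBJECT_WITHDRAWING:
		stats->ops_rejected++;
		return -ENOBUFS;	/* no error is logged */
	default:
		return 0;		/* run or defer as usual */
	}
}
```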
      Signed-off-by: David Howells <dhowells@redhat.com>
    • D
      FS-Cache: Permit cache retrieval ops to be interrupted in the initial wait phase · 5753c441
      David Howells committed
      Permit the operations to retrieve data from the cache or to allocate space in
      the cache for future writes to be interrupted whilst they're waiting for
      permission for the operation to proceed.  Typically this wait occurs whilst the
      cache object is being looked up on disk in the background.
      
      If an interruption occurs, and the operation has not yet been given the
      go-ahead to run, the operation is dequeued and cancelled, and control returns
      to the read operation of the netfs routine with none of the requested pages
      having been read or in any way marked as known by the cache.
      
      This means that the initial wait is done interruptibly rather than
      uninterruptibly.
      
      In addition, extra stats values are made available to show the number of ops
      cancelled and the number of cache space allocations interrupted.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • D
      FS-Cache: Allow the current state of all objects to be dumped · 4fbf4291
      David Howells committed
      Allow the current state of all fscache objects to be dumped by doing:
      
      	cat /proc/fs/fscache/objects
      
      By default, all objects and all fields will be shown.  This can be restricted
      by adding a suitable key to one of the caller's keyrings (such as the session
      keyring):
      
      	keyctl add user fscache:objlist "<restrictions>" @s
      
      The <restrictions> are:
      
      	K	Show hexdump of object key (don't show if not given)
      	A	Show hexdump of object aux data (don't show if not given)
      
      And paired restrictions:
      
      	C	Show objects that have a cookie
      	c	Show objects that don't have a cookie
      	B	Show objects that are busy
      	b	Show objects that aren't busy
      	W	Show objects that have pending writes
      	w	Show objects that don't have pending writes
      	R	Show objects that have outstanding reads
      	r	Show objects that don't have outstanding reads
      	S	Show objects that have slow work queued
      	s	Show objects that don't have slow work queued
      
      If neither side of a restriction pair is given, then both are implied.  For
      example:
      
      	keyctl add user fscache:objlist KB @s
      
      shows objects that are busy, and lists their object keys, but does not dump
      their auxiliary data.  It also implies "CcWwRrSs", but as 'B' is given, 'b' is
      not implied.
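
      The pairing rule can be illustrated with a small user-space parser.  This
      is a sketch only: the objlist_* names and the bit layout are hypothetical,
      and ignoring unknown letters is an assumption rather than documented
      behaviour.

```c
#include <assert.h>
#include <string.h>

/* One bit per letter: K and A first, then the five upper/lower pairs. */
static const char objlist_letters[] = "KACcBbWwRrSs";

/* Bit for a known restriction letter (must be one of the letters above). */
#define OBJLIST_BIT(ch) \
	(1u << (unsigned int)(strchr(objlist_letters, (ch)) - objlist_letters))

static unsigned int objlist_parse(const char *spec)
{
	unsigned int flags = 0, i;

	for (; *spec; spec++) {
		const char *p = strchr(objlist_letters, *spec);

		if (p)	/* unknown letters are ignored in this sketch */
			flags |= 1u << (unsigned int)(p - objlist_letters);
	}

	/* Pairs occupy bits 2..11; if neither side is given, imply both. */
	for (i = 2; i < 12; i += 2)
		if (!(flags & (3u << i)))
			flags |= 3u << i;

	return flags;
}
```

      With the "KB" example, the 'K' and 'B' bits are set, 'A' and 'b' stay
      clear, and the four untouched pairs are implied in full.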
      Signed-off-by: David Howells <dhowells@redhat.com>
    • D
      FS-Cache: Annotate slow-work runqueue proc lines for FS-Cache work items · 440f0aff
      David Howells committed
      Annotate slow-work runqueue proc lines for FS-Cache work items.  Objects
      include the object ID and the state.  Operations include the object ID, the
      operation ID and the operation type and state.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • D
      SLOW_WORK: Wait for outstanding work items belonging to a module to clear · 3d7a641e
      David Howells committed
      Wait for outstanding slow work items belonging to a module to clear when
      unregistering that module as a user of the facility.  This prevents the put_ref
      code of a work item from being taken away before it returns.
      Signed-off-by: David Howells <dhowells@redhat.com>
  10. 03 Apr, 2009 1 commit
    • D
      FS-Cache: Add and document asynchronous operation handling · 952efe7b
      David Howells committed
      Add and document asynchronous operation handling for use by FS-Cache's data
      storage and retrieval routines.
      
      The following documentation is added to:
      
      	Documentation/filesystems/caching/operations.txt
      
      		       ================================
      		       ASYNCHRONOUS OPERATIONS HANDLING
      		       ================================
      
      ========
      OVERVIEW
      ========
      
      FS-Cache has an asynchronous operations handling facility that it uses for its
      data storage and retrieval routines.  Its operations are represented by
      fscache_operation structs, though these are usually embedded into some other
      structure.
      
      This facility is available to and expected to be used by the cache backends,
      and FS-Cache will create operations and pass them off to the appropriate cache
      backend for completion.
      
      To make use of this facility, <linux/fscache-cache.h> should be #included.
      
      ===============================
      OPERATION RECORD INITIALISATION
      ===============================
      
      An operation is recorded in an fscache_operation struct:
      
      	struct fscache_operation {
      		union {
      			struct work_struct fast_work;
      			struct slow_work slow_work;
      		};
      		unsigned long		flags;
      		fscache_operation_processor_t processor;
      		...
      	};
      
      Someone wanting to issue an operation should allocate something with this
      struct embedded in it.  They should initialise it by calling:
      
      	void fscache_operation_init(struct fscache_operation *op,
      				    fscache_operation_release_t release);
      
      with the operation to be initialised and the release function to use.
      
      The op->flags parameter should be set to indicate the CPU time provision and
      the exclusivity (see the Parameters section).
      
      The op->fast_work, op->slow_work and op->processor fields should be set as
      appropriate for the CPU time provision (see the Parameters section).
      
      FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the
      operation and waited for afterwards.
      
      ==========
      PARAMETERS
      ==========
      
      There are a number of parameters that can be set in the operation record's flag
      parameter.  There are three options for the provision of CPU time in these
      operations:
      
       (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD).  A thread
           may decide it wants to handle an operation itself without deferring it to
           another thread.
      
           This is, for example, used in read operations for calling readpages() on
           the backing filesystem in CacheFiles.  Although readpages() does an
           asynchronous data fetch, the determination of whether pages exist is done
           synchronously - and the netfs does not proceed until this has been
           determined.
      
           If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags
           before submitting the operation, and the operating thread must wait for it
           to be cleared before proceeding:
      
      		wait_on_bit(&op->flags, FSCACHE_OP_WAITING,
      			    fscache_wait_bit, TASK_UNINTERRUPTIBLE);
      
       (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it
           will be given to keventd to process.  Such an operation is not permitted
           to sleep on I/O.
      
           This is, for example, used by CacheFiles to copy data from a backing fs
           page to a netfs page after the backing fs has read the page in.
      
           If this option is used, op->fast_work and op->processor must be
           initialised before submitting the operation:
      
      		INIT_WORK(&op->fast_work, do_some_work);
      
       (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it
           will be given to the slow work facility to process.  Such an operation is
           permitted to sleep on I/O.
      
           This is, for example, used by FS-Cache to handle background writes of
           pages that have just been fetched from a remote server.
      
           If this option is used, op->slow_work and op->processor must be
           initialised before submitting the operation:
      
      		fscache_operation_init_slow(op, processor);
      
      Furthermore, operations may be one of two types:
      
       (1) Exclusive (FSCACHE_OP_EXCLUSIVE).  Operations of this type may not run in
           conjunction with any other operation on the object being operated upon.
      
           An example of this is the attribute change operation, in which the file
           being written to may need truncation.
      
       (2) Shareable.  Operations of this type may be running simultaneously.  It's
           up to the operation implementation to prevent interference between other
           operations running at the same time.
      
      =========
      PROCEDURE
      =========
      
      Operations are used through the following procedure:
      
       (1) The submitting thread must allocate the operation and initialise it
           itself.  Normally this would be part of a more specific structure with the
           generic op embedded within.
      
       (2) The submitting thread must then submit the operation for processing using
           one of the following two functions:
      
      	int fscache_submit_op(struct fscache_object *object,
      			      struct fscache_operation *op);
      
      	int fscache_submit_exclusive_op(struct fscache_object *object,
      					struct fscache_operation *op);
      
           The first function should be used to submit non-exclusive ops and the
           second to submit exclusive ones.  The caller must still set the
           FSCACHE_OP_EXCLUSIVE flag.
      
           If successful, both functions will assign the operation to the specified
           object and return 0.  -ENOBUFS will be returned if the object specified is
           permanently unavailable.
      
           The operation manager will defer operations on an object that is still
           undergoing lookup or creation.  The operation will also be deferred if an
           operation of conflicting exclusivity is in progress on the object.
      
           If the operation is asynchronous, the manager will retain a reference to
           it, so the caller should put their reference to it by passing it to:
      
      	void fscache_put_operation(struct fscache_operation *op);
      
       (3) If the submitting thread wants to do the work itself, and has marked the
           operation with FSCACHE_OP_MYTHREAD, then it should monitor
           FSCACHE_OP_WAITING as described above and check the state of the object if
           necessary (the object might have died whilst the thread was waiting).
      
           When it has finished doing its processing, it should call
           fscache_put_operation() on it.
      
       (4) The operation holds an effective lock upon the object, preventing other
           exclusive ops conflicting until it is released.  The operation can be
           enqueued for further immediate asynchronous processing by adjusting the
           CPU time provisioning option if necessary, eg:
      
      	op->flags &= ~FSCACHE_OP_TYPE;
      	op->flags |= FSCACHE_OP_FAST;
      
           and calling:
      
      	void fscache_enqueue_operation(struct fscache_operation *op);
      
           This can be used to allow other things to have use of the worker thread
           pools.
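
      The reference handling in the procedure above can be modelled in user
      space as follows.  This is a sketch under stated assumptions: the proc_*
      names and the flag values are hypothetical, and nothing is actually
      queued or run.

```c
#include <assert.h>

/* Hypothetical flag layout for this sketch only. */
#define PROC_OP_TYPE		0x3u	/* mask for the CPU time option */
#define PROC_OP_FAST		0x1u
#define PROC_OP_SLOW		0x2u
#define PROC_OP_MYTHREAD	0x3u
#define PROC_OP_EXCLUSIVE	0x4u

struct proc_op;
typedef void (*proc_release_t)(struct proc_op *op);

struct proc_op {
	unsigned int flags;
	int usage;		/* reference count */
	proc_release_t release;
};

static int proc_released;	/* counts calls to the release function */

static void proc_count_release(struct proc_op *op)
{
	(void)op;
	proc_released++;
}

static void proc_operation_init(struct proc_op *op, proc_release_t release)
{
	op->flags = 0;
	op->usage = 1;		/* the submitter's reference */
	op->release = release;
}

/* Step (2): the manager keeps its own reference on asynchronous ops. */
static int proc_submit_op(struct proc_op *op)
{
	if ((op->flags & PROC_OP_TYPE) != PROC_OP_MYTHREAD)
		op->usage++;
	return 0;
}

/* Dropping the last reference calls the release function. */
static void proc_put_operation(struct proc_op *op)
{
	assert(op->usage > 0);
	if (--op->usage == 0)
		op->release(op);
}

/* Step (4): clear the old CPU time option, then set the new one. */
static void proc_set_type(struct proc_op *op, unsigned int type)
{
	op->flags &= ~PROC_OP_TYPE;
	op->flags |= type & PROC_OP_TYPE;
}
```

      Clearing the type mask before setting the new option leaves unrelated
      bits, such as the exclusivity flag, untouched.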
      
      =====================
      ASYNCHRONOUS CALLBACK
      =====================
      
      When used in asynchronous mode, the worker thread pool will invoke the
      processor method with a pointer to the operation.  This should then get at the
      container struct by using container_of():
      
      	static void fscache_write_op(struct fscache_operation *_op)
      	{
      		struct fscache_storage *op =
      			container_of(_op, struct fscache_storage, op);
      	...
      	}
      
      The caller holds a reference on the operation, and will invoke
      fscache_put_operation() when the processor function returns.  The processor
      function is at liberty to call fscache_enqueue_operation() or to take extra
      references.
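
      A self-contained user-space version of the container_of() pattern is
      sketched below.  The macro is a simplified stand-in for the kernel's, and
      the cb_* structures and field names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's container_of(). */
#define cb_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct cb_operation {
	void (*processor)(struct cb_operation *op);
};

/* A more specific structure with the generic op embedded within. */
struct cb_storage {
	int pages_written;
	struct cb_operation op;
};

/* Processor: recover the container from the embedded op pointer. */
static void cb_write_op(struct cb_operation *_op)
{
	struct cb_storage *op =
		cb_container_of(_op, struct cb_storage, op);

	op->pages_written++;
}
```

      Calling st.op.processor(&st.op) on a struct cb_storage st whose processor
      is cb_write_op increments st.pages_written, which shows that the
      container was recovered from the embedded pointer.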
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Tested-by: Daire Byrne <Daire.Byrne@framestore.com>