1. 06 12月, 2009 1 次提交
  2. 05 12月, 2009 1 次提交
  3. 04 12月, 2009 2 次提交
    • C
      SUNRPC: Allow RPCs to fail quickly if the server is unreachable · 09a21c41
      Chuck Lever 提交于
      The kernel sometimes makes RPC calls to services that aren't running.
      Because the kernel's RPC client always assumes the hard retry semantic
      when reconnecting a connection-oriented RPC transport, the underlying
      reconnect logic takes a long while to time out, even though the remote
      may have responded immediately with ECONNREFUSED.
      
      In certain cases, like upcalls to our local rpcbind daemon, or for NFS
      mount requests, we'd like the kernel to fail immediately if the remote
      service isn't reachable.  This allows another transport to be tried
      immediately, or the pending request can be abandoned quickly.
      
      Introduce a per-request flag which controls how call_transmit_status()
      behaves when request transmission fails because the server cannot be
      reached.
      
      We don't want soft connection semantics to apply to other errors.  The
      default case of the switch statement in call_transmit_status() no
      longer falls through; the fall through code is copied to the default
      case, and a "break;" is added.
      
      The transport's connection re-establishment timeout is also ignored for
      such requests.  We want the request to fail immediately, so the
      reconnect delay is skipped.  Additionally, we don't want a connect
      failure here to further increase the reconnect timeout value, since
      this request will not be retried.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      09a21c41
    • R
      NFS: reorder nfs4_sequence_regs to remove 8 bytes of padding on 64 bits · a01878aa
      Richard Kennedy 提交于
      reorder nfs4_sequence_args to remove 8 bytes of padding on 64 bit
      builds.
      
      The size of this structure drops to 24 bytes from 32 and reduces the
      text size of nfs.ko.
      On my x86_64 size reports
      
      		text       data     bss
      2.6.32-rc5 	200996	   8512	    432	 209940	  33414	nfs.ko
      +patch 		200884	   8512	    432	 209828	  333a4	nfs.ko
      Signed-off-by: NRichard Kennedy <richard@rsk.demon.co.uk>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      a01878aa
  4. 02 12月, 2009 1 次提交
  5. 01 12月, 2009 2 次提交
    • M
      mfd: Correct WM831X_MAX_ISEL_VALUE · 7716977b
      Mark Brown 提交于
      There was confusion between the array size and the highest ISEL
      value possible.
      Reported-by: NDan Carpenter <error27@gmail.com>
      Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
      Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>
      7716977b
    • J
      mac80211: fix spurious delBA handling · 827d42c9
      Johannes Berg 提交于
      Lennert Buytenhek noticed that delBA handling in mac80211
      was broken and has remotely triggerable problems, some of
      which are due to some code shuffling I did that ended up
      changing the order in which things were done -- this was
      
        commit d75636ef
        Author: Johannes Berg <johannes@sipsolutions.net>
        Date:   Tue Feb 10 21:25:53 2009 +0100
      
          mac80211: RX aggregation: clean up stop session
      
      and other parts were already present in the original
      
        commit d92684e6
        Author: Ron Rindjunsky <ron.rindjunsky@intel.com>
        Date:   Mon Jan 28 14:07:22 2008 +0200
      
            mac80211: A-MPDU Tx add delBA from recipient support
      
      The first problem is that I moved a BUG_ON before various
      checks -- thereby making it possible to hit. As the comment
      indicates, the BUG_ON can be removed since the ampdu_action
      callback must already exist when the state is != IDLE.
      
      The second problem isn't easily exploitable but there's a
      race condition due to unconditionally setting the state to
      OPERATIONAL when a delBA frame is received, even when no
      aggregation session was ever initiated. All the drivers
      accept stopping the session even then, but that opens a
      race window where crashes could happen before the driver
      accepts it. Right now, a WARN_ON may happen with non-HT
      drivers, while the race opens only for HT drivers.
      
      For this case, there are two things necessary to fix it:
       1) don't process spurious delBA frames, and be more careful
          about the session state; don't drop the lock
      
       2) HT drivers need to be prepared to handle a session stop
          even before the session was really started -- this is
          true for all drivers (that support aggregation) but
          iwlwifi which can be fixed easily. The other HT drivers
          (ath9k and ar9170) are behaving properly already.
      Reported-by: NLennert Buytenhek <buytenh@marvell.com>
      Cc: stable@kernel.org
      Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
      827d42c9
  6. 29 11月, 2009 1 次提交
    • A
      sctp: on T3_RTX retransmit all the in-flight chunks · 5fdd4bae
      Andrei Pelinescu-Onciul 提交于
      When retransmitting due to T3 timeout, retransmit all the
      in-flight chunks for the corresponding  transport/path, including
      chunks sent less then 1 rto ago.
      This is the correct behaviour according to rfc4960 section 6.3.3
      E3 and
      "Note: Any DATA chunks that were sent to the address for which the
       T3-rtx timer expired but did not fit in one MTU (rule E3 above)
       should be marked for retransmission and sent as soon as cwnd
       allows (normally, when a SACK arrives). ".
      
      This fixes problems when more then one path is present and the T3
      retransmission of the first chunk that timeouts stops the T3 timer
      for the initial active path, leaving all the other in-flight
      chunks waiting forever or until a new chunk is transmitted on the
      same path and timeouts (and this will happen only if the cwnd
      allows sending new chunks, but since cwnd was dropped to MTU by
      the timeout => it will wait until the first heartbeat).
      
      Example: 10 packets in flight, sent at 0.1 s intervals on the
      primary path. The primary path is down and the first packet
      timeouts. The first packet is retransmitted on another path, the
      T3 timer for the primary path is stopped and cwnd is set to MTU.
      All the other 9 in-flight packets will not be retransmitted
      (unless more new packets are sent on the primary path which depend
      on cwnd allowing it, and even in this case the 9 packets will be
      retransmitted only after a new packet timeouts which even in the
      best case would be more then RTO).
      
      This commit reverts d0ce9291 and
      also removes the now unused transport->last_rto, introduced in
       b6157d8e.
      
      p.s  The problem is not only when multiple paths are there.  It
      can happen in a single homed environment.  If the application
      stops sending data, it possible to have a hung association.
      Signed-off-by: NAndrei Pelinescu-Onciul <andrei@iptel.org>
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5fdd4bae
  7. 26 11月, 2009 1 次提交
    • J
      [SCSI] fix async scan add/remove race resulting in an oops · 860dc736
      James Bottomley 提交于
      Async scanning introduced a very wide window where the SCSI device is
      up and running but has not yet been added to sysfs.  We delay the
      adding until all scans have completed to retain the same ordering as
      sync scanning.
      
      This delay in visibility causes an oops if a device is removed before
      we make it visible because the SCSI removal routines have an inbuilt
      assumption that if a device is in SDEV_RUNNING state, it must be
      visible (which is not necessarily true in the async scanning case).
      
      Fix this by introducing an additional is_visible flag which we can use
      to condition the tear down so we do the right thing for running but
      not yet made visible.
      Reported-by: NAlexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Signed-off-by: NJames Bottomley <James.Bottomley@suse.de>
      860dc736
  8. 20 11月, 2009 15 次提交
    • K
      i2c: i2c-pnx: Made buf type unsigned to prevent sign extension · 4ced24c8
      Kevin Wells 提交于
      Made buf type unsigned to prevent sign extension
      Signed-off-by: NKevin Wells <kevin.wells@nxp.com>
      Signed-off-by: NBen Dooks <ben-linux@fluff.org>
      4ced24c8
    • A
      vt: Fix use of "new" in a struct field · 308efab5
      Alan Cox 提交于
      As this struct is exposed to user space and the API was added for this
      release it's a bit of a pain for the C++ world and we still have time to
      fix it. Rename the fields before we end up with that pain in an actual
      release.
      Signed-off-by: NAlan Cox <alan@linux.intel.com>
      Reported-by: Olivier Goffart
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      308efab5
    • D
      CacheFiles: Catch an overly long wait for an old active object · fee096de
      David Howells 提交于
      Catch an overly long wait for an old, dying active object when we want to
      replace it with a new one.  The probability is that all the slow-work threads
      are hogged, and the delete can't get a look in.
      
      What we do instead is:
      
       (1) if there's nothing in the slow work queue, we sleep until either the dying
           object has finished dying or there is something in the slow work queue
           behind which we can queue our object.
      
       (2) if there is something in the slow work queue, we return ETIMEDOUT to
           fscache_lookup_object(), which then puts us back on the slow work queue,
           presumably behind the deletion that we're blocked by.  We are then
           deferred for a while until we work our way back through the queue -
           without blocking a slow-work thread unnecessarily.
      
      A backtrace similar to the following may appear in the log without this patch:
      
      	INFO: task kslowd004:5711 blocked for more than 120 seconds.
      	"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      	kslowd004     D 0000000000000000     0  5711      2 0x00000080
      	 ffff88000340bb80 0000000000000046 ffff88002550d000 0000000000000000
      	 ffff88002550d000 0000000000000007 ffff88000340bfd8 ffff88002550d2a8
      	 000000000000ddf0 00000000000118c0 00000000000118c0 ffff88002550d2a8
      	Call Trace:
      	 [<ffffffff81058e21>] ? trace_hardirqs_on+0xd/0xf
      	 [<ffffffffa011c4d8>] ? cachefiles_wait_bit+0x0/0xd [cachefiles]
      	 [<ffffffffa011c4e1>] cachefiles_wait_bit+0x9/0xd [cachefiles]
      	 [<ffffffff81353153>] __wait_on_bit+0x43/0x76
      	 [<ffffffff8111ae39>] ? ext3_xattr_get+0x1ec/0x270
      	 [<ffffffff813531ef>] out_of_line_wait_on_bit+0x69/0x74
      	 [<ffffffffa011c4d8>] ? cachefiles_wait_bit+0x0/0xd [cachefiles]
      	 [<ffffffff8104c125>] ? wake_bit_function+0x0/0x2e
      	 [<ffffffffa011bc79>] cachefiles_mark_object_active+0x203/0x23b [cachefiles]
      	 [<ffffffffa011c209>] cachefiles_walk_to_object+0x558/0x827 [cachefiles]
      	 [<ffffffffa011a429>] cachefiles_lookup_object+0xac/0x12a [cachefiles]
      	 [<ffffffffa00aa1e9>] fscache_lookup_object+0x1c7/0x214 [fscache]
      	 [<ffffffffa00aafc5>] fscache_object_state_machine+0xa5/0x52d [fscache]
      	 [<ffffffffa00ab4ac>] fscache_object_slow_work_execute+0x5f/0xa0 [fscache]
      	 [<ffffffff81082093>] slow_work_execute+0x18f/0x2d1
      	 [<ffffffff8108239a>] slow_work_thread+0x1c5/0x308
      	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
      	 [<ffffffff810821d5>] ? slow_work_thread+0x0/0x308
      	 [<ffffffff8104be91>] kthread+0x7a/0x82
      	 [<ffffffff8100beda>] child_rip+0xa/0x20
      	 [<ffffffff8100b87c>] ? restore_args+0x0/0x30
      	 [<ffffffff8104be17>] ? kthread+0x0/0x82
      	 [<ffffffff8100bed0>] ? child_rip+0x0/0x20
      	1 lock held by kslowd004/5711:
      	 #0:  (&sb->s_type->i_mutex_key#7/1){+.+.+.}, at: [<ffffffffa011be64>] cachefiles_walk_to_object+0x1b3/0x827 [cachefiles]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      fee096de
    • D
      CacheFiles: Don't write a full page if there's only a partial page to cache · a17754fb
      David Howells 提交于
      cachefiles_write_page() writes a full page to the backing file for the last
      page of the netfs file, even if the netfs file's last page is only a partial
      page.
      
      This causes the EOF on the backing file to be extended beyond the EOF of the
      netfs, and thus the backing file will be truncated by cachefiles_attr_changed()
      called from cachefiles_lookup_object().
      
      So we need to limit the write we make to the backing file on that last page
      such that it doesn't push the EOF too far.
      
      Also, if a backing file that has a partial page at the end is expanded, we
      discard the partial page and refetch it on the basis that we then have a hole
      in the file with invalid data, and should the power go out...  A better way to
      deal with this could be to record a note that the partial page contains invalid
      data until the correct data is written into it.
      
      This isn't a problem for netfs's that discard the whole backing file if the
      file size changes (such as NFS).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      a17754fb
    • D
      FS-Cache: Start processing an object's operations on that object's death · 60d543ca
      David Howells 提交于
      Start processing an object's operations when that object moves into the DYING
      state as the object cannot be destroyed until all its outstanding operations
      have completed.
      
      Furthermore, make sure that read and allocation operations handle being woken
      up on a dead object.  Such events are recorded in the Allocs.abt and
      Retrvls.abt statistics as viewable through /proc/fs/fscache/stats.
      
      The code for waiting for object activation for the read and allocation
      operations is also extracted into its own function as it is much the same in
      all cases, differing only in the stats incremented.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      60d543ca
    • D
      FS-Cache: Handle pages pending storage that get evicted under OOM conditions · 201a1542
      David Howells 提交于
      Handle netfs pages that the vmscan algorithm wants to evict from the pagecache
      under OOM conditions, but that are waiting for write to the cache.  Under these
      conditions, vmscan calls the releasepage() function of the netfs, asking if a
      page can be discarded.
      
      The problem is typified by the following trace of a stuck process:
      
      	kslowd005     D 0000000000000000     0  4253      2 0x00000080
      	 ffff88001b14f370 0000000000000046 ffff880020d0d000 0000000000000007
      	 0000000000000006 0000000000000001 ffff88001b14ffd8 ffff880020d0d2a8
      	 000000000000ddf0 00000000000118c0 00000000000118c0 ffff880020d0d2a8
      	Call Trace:
      	 [<ffffffffa00782d8>] __fscache_wait_on_page_write+0x8b/0xa7 [fscache]
      	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
      	 [<ffffffffa0078240>] ? __fscache_check_page_write+0x63/0x70 [fscache]
      	 [<ffffffffa00b671d>] nfs_fscache_release_page+0x4e/0xc4 [nfs]
      	 [<ffffffffa00927f0>] nfs_release_page+0x3c/0x41 [nfs]
      	 [<ffffffff810885d3>] try_to_release_page+0x32/0x3b
      	 [<ffffffff81093203>] shrink_page_list+0x316/0x4ac
      	 [<ffffffff8109372b>] shrink_inactive_list+0x392/0x67c
      	 [<ffffffff813532fa>] ? __mutex_unlock_slowpath+0x100/0x10b
      	 [<ffffffff81058df0>] ? trace_hardirqs_on_caller+0x10c/0x130
      	 [<ffffffff8135330e>] ? mutex_unlock+0x9/0xb
      	 [<ffffffff81093aa2>] shrink_list+0x8d/0x8f
      	 [<ffffffff81093d1c>] shrink_zone+0x278/0x33c
      	 [<ffffffff81052d6c>] ? ktime_get_ts+0xad/0xba
      	 [<ffffffff81094b13>] try_to_free_pages+0x22e/0x392
      	 [<ffffffff81091e24>] ? isolate_pages_global+0x0/0x212
      	 [<ffffffff8108e743>] __alloc_pages_nodemask+0x3dc/0x5cf
      	 [<ffffffff81089529>] grab_cache_page_write_begin+0x65/0xaa
      	 [<ffffffff8110f8c0>] ext3_write_begin+0x78/0x1eb
      	 [<ffffffff81089ec5>] generic_file_buffered_write+0x109/0x28c
      	 [<ffffffff8103cb69>] ? current_fs_time+0x22/0x29
      	 [<ffffffff8108a509>] __generic_file_aio_write+0x350/0x385
      	 [<ffffffff8108a588>] ? generic_file_aio_write+0x4a/0xae
      	 [<ffffffff8108a59e>] generic_file_aio_write+0x60/0xae
      	 [<ffffffff810b2e82>] do_sync_write+0xe3/0x120
      	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
      	 [<ffffffff810b18e1>] ? __dentry_open+0x1a5/0x2b8
      	 [<ffffffff810b1a76>] ? dentry_open+0x82/0x89
      	 [<ffffffffa00e693c>] cachefiles_write_page+0x298/0x335 [cachefiles]
      	 [<ffffffffa0077147>] fscache_write_op+0x178/0x2c2 [fscache]
      	 [<ffffffffa0075656>] fscache_op_execute+0x7a/0xd1 [fscache]
      	 [<ffffffff81082093>] slow_work_execute+0x18f/0x2d1
      	 [<ffffffff8108239a>] slow_work_thread+0x1c5/0x308
      	 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34
      	 [<ffffffff810821d5>] ? slow_work_thread+0x0/0x308
      	 [<ffffffff8104be91>] kthread+0x7a/0x82
      	 [<ffffffff8100beda>] child_rip+0xa/0x20
      	 [<ffffffff8100b87c>] ? restore_args+0x0/0x30
      	 [<ffffffff8102ef83>] ? tg_shares_up+0x171/0x227
      	 [<ffffffff8104be17>] ? kthread+0x0/0x82
      	 [<ffffffff8100bed0>] ? child_rip+0x0/0x20
      
      In the above backtrace, the following is happening:
      
       (1) A page storage operation is being executed by a slow-work thread
           (fscache_write_op()).
      
       (2) FS-Cache farms the operation out to the cache to perform
           (cachefiles_write_page()).
      
       (3) CacheFiles is then calling Ext3 to perform the actual write, using Ext3's
           standard write (do_sync_write()) under KERNEL_DS directly from the netfs
           page.
      
       (4) However, for Ext3 to perform the write, it must allocate some memory, in
           particular, it must allocate at least one page cache page into which it
           can copy the data from the netfs page.
      
       (5) Under OOM conditions, the memory allocator can't immediately come up with
           a page, so it uses vmscan to find something to discard
           (try_to_free_pages()).
      
       (6) vmscan finds a clean netfs page it might be able to discard (possibly the
           one it's trying to write out).
      
       (7) The netfs is called to throw the page away (nfs_release_page()) - but it's
           called with __GFP_WAIT, so the netfs decides to wait for the store to
           complete (__fscache_wait_on_page_write()).
      
       (8) This blocks a slow-work processing thread - possibly against itself.
      
      The system ends up stuck because it can't write out any netfs pages to the
      cache without allocating more memory.
      
      To avoid this, we make FS-Cache cancel some writes that aren't in the middle of
      actually being performed.  This means that some data won't make it into the
      cache this time.  To support this, a new FS-Cache function is added
      fscache_maybe_release_page() that replaces what the netfs releasepage()
      functions used to do with respect to the cache.
      
      The decisions fscache_maybe_release_page() makes are counted and displayed
      through /proc/fs/fscache/stats on a line labelled "VmScan".  There are four
      counters provided: "nos=N" - pages that weren't pending storage; "gon=N" -
      pages that were pending storage when we first looked, but weren't by the time
      we got the object lock; "bsy=N" - pages that we ignored as they were actively
      being written when we looked; and "can=N" - pages that we cancelled the storage
      of.
      
      What I'd really like to do is alter the behaviour of the cancellation
      heuristics, depending on how necessary it is to expel pages.  If there are
      plenty of other pages that aren't waiting to be written to the cache that
      could be ejected first, then it would be nice to hold up on immediate
      cancellation of cache writes - but I don't see a way of doing that.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      201a1542
    • D
      FS-Cache: Fix lock misorder in fscache_write_op() · 1bccf513
      David Howells 提交于
      FS-Cache has two structs internally for keeping track of the internal state of
      a cached file: the fscache_cookie struct, which represents the netfs's state,
      and fscache_object struct, which represents the cache's state.  Each has a
      pointer that points to the other (when both are in existence), and each has a
      spinlock for pointer maintenance.
      
      Since netfs operations approach these structures from the cookie side, they get
      the cookie lock first, then the object lock.  Cache operations, on the other
      hand, approach from the object side, and get the object lock first.  It is not
      then permitted for a cache operation to get the cookie lock whilst it is
      holding the object lock lest deadlock occur; instead, it must do one of two
      things:
      
       (1) increment the cookie usage counter, drop the object lock and then get both
           locks in order, or
      
       (2) simply hold the object lock as certain parts of the cookie may not be
           altered whilst the object lock is held.
      
      It is also not permitted to follow either pointer without holding the lock at
      the end you start with.  To break the pointers between the cookie and the
      object, both locks must be held.
      
      fscache_write_op(), however, violates the locking rules: It attempts to get the
      cookie lock without (a) checking that the cookie pointer is a valid pointer,
      and (b) holding the object lock to protect the cookie pointer whilst it follows
      it.  This is so that it can access the pending page store tree without
      interference from __fscache_write_page().
      
      This is fixed by splitting the cookie lock, such that the page store tracking
      tree is protected by its own lock, and checking that the cookie pointer is
      non-NULL before we attempt to follow it whilst holding the object lock.
      
      The new lock is subordinate to both the cookie lock and the object lock, and so
      should be taken after those.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      1bccf513
    • D
      FS-Cache: Allow the current state of all objects to be dumped · 4fbf4291
      David Howells 提交于
      Allow the current state of all fscache objects to be dumped by doing:
      
      	cat /proc/fs/fscache/objects
      
      By default, all objects and all fields will be shown.  This can be restricted
      by adding a suitable key to one of the caller's keyrings (such as the session
      keyring):
      
      	keyctl add user fscache:objlist "<restrictions>" @s
      
      The <restrictions> are:
      
      	K	Show hexdump of object key (don't show if not given)
      	A	Show hexdump of object aux data (don't show if not given)
      
      And paired restrictions:
      
      	C	Show objects that have a cookie
      	c	Show objects that don't have a cookie
      	B	Show objects that are busy
      	b	Show objects that aren't busy
      	W	Show objects that have pending writes
      	w	Show objects that don't have pending writes
      	R	Show objects that have outstanding reads
      	r	Show objects that don't have outstanding reads
      	S	Show objects that have slow work queued
      	s	Show objects that don't have slow work queued
      
      If neither side of a restriction pair is given, then both are implied.  For
      example:
      
      	keyctl add user fscache:objlist KB @s
      
      shows objects that are busy, and lists their object keys, but does not dump
      their auxiliary data.  It also implies "CcWwRrSs", but as 'B' is given, 'b' is
      not implied.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4fbf4291
    • D
      FS-Cache: Annotate slow-work runqueue proc lines for FS-Cache work items · 440f0aff
      David Howells 提交于
      Annotate slow-work runqueue proc lines for FS-Cache work items.  Objects
      include the object ID and the state.  Operations include the object ID, the
      operation ID and the operation type and state.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      440f0aff
    • D
      SLOW_WORK: Allow a requeueable work item to sleep till the thread is needed · 3bde31a4
      David Howells 提交于
      Add a function to allow a requeueable work item to sleep till the thread
      processing it is needed by the slow-work facility to perform other work.
      
      Sometimes a work item can't progress immediately, but must wait for the
      completion of another work item that's currently being processed by another
      slow-work thread.
      
      In some circumstances, the waiting item could instead - theoretically - put
      itself back on the queue and yield its thread back to the slow-work facility,
      thus waiting till it gets processing time again before attempting to progress.
      This would allow other work items processing time on that thread.
      
      However, this only works if there is something on the queue for it to queue
      behind - otherwise it will just get a thread again immediately, and will end
      up cycling between the queue and the thread, eating up valuable CPU time.
      
      So, slow_work_sleep_till_thread_needed() is provided such that an item can put
      itself on a wait queue that will wake it up when the event it is actually
      interested in occurs, then call this function in lieu of calling schedule().
      
      This function will then sleep until either the item's event occurs or another
      work item appears on the queue.  If another work item is queued, but the
      item's event hasn't occurred, then the work item should requeue itself and
      yield the thread back to the slow-work facility by returning.
      
      This can be used by CacheFiles for an object that is being created on one
      thread to wait for an object being deleted on another thread where there is
      nothing on the queue for the creation to go and wait behind.  As soon as an
      item appears on the queue that could be given thread time instead, CacheFiles
      can stick the creating object back on the queue and return to the slow-work
      facility - assuming the object deletion didn't also complete.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3bde31a4
    • D
      SLOW_WORK: Allow the owner of a work item to determine if it is queued or not · 31ba99d3
      David Howells 提交于
      Add a function (slow_work_is_queued()) to permit the owner of a work item to
      determine if the item is queued or not.
      
      The work item is counted as being queued if it is actually on the queue, not
      just if it is pending.  If it is executing and pending, then it is not on the
      queue, but will rather be put back on the queue when execution finishes.
      
      This permits a caller to quickly work out if it may be able to put another,
      dependent work item on the queue behind it, or whether it will have to wait
      till that is finished.
      
      This can be used by CacheFiles to work out whether the creation a new object
      can be immediately deferred when it has to wait for an old object to be
      deleted, or whether a wait must take place.  If a wait is necessary, then the
      slow-work thread can otherwise get blocked, preventing the deletion from
      taking place.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      31ba99d3
    • D
      SLOW_WORK: Allow the work items to be viewed through a /proc file · 8fba10a4
      David Howells 提交于
      Allow the executing and queued work items to be viewed through a /proc file
      for debugging purposes.  The contents look something like the following:
      
          THR PID   ITEM ADDR        FL MARK  DESC
          === ===== ================ == ===== ==========
            0  3005 ffff880023f52348  a 952ms FSC: OBJ17d3: LOOK
            1  3006 ffff880024e33668  2 160ms FSC: OBJ17e5 OP60d3b: Write1/Store fl=2
            2  3165 ffff8800296dd180  a 424ms FSC: OBJ17e4: LOOK
            3  4089 ffff8800262c8d78  a 212ms FSC: OBJ17ea: CRTN
            4  4090 ffff88002792bed8  2 388ms FSC: OBJ17e8 OP60d36: Write1/Store fl=2
            5  4092 ffff88002a0ef308  2 388ms FSC: OBJ17e7 OP60d2e: Write1/Store fl=2
            6  4094 ffff88002abaf4b8  2 132ms FSC: OBJ17e2 OP60d4e: Write1/Store fl=2
            7  4095 ffff88002bb188e0  a 388ms FSC: OBJ17e9: CRTN
          vsq     - ffff880023d99668  1 308ms FSC: OBJ17e0 OP60f91: Write1/EnQ fl=2
          vsq     - ffff8800295d1740  1 212ms FSC: OBJ16be OP4d4b6: Write1/EnQ fl=2
          vsq     - ffff880025ba3308  1 160ms FSC: OBJ179a OP58dec: Write1/EnQ fl=2
          vsq     - ffff880024ec83e0  1 160ms FSC: OBJ17ae OP599f2: Write1/EnQ fl=2
          vsq     - ffff880026618e00  1 160ms FSC: OBJ17e6 OP60d33: Write1/EnQ fl=2
          vsq     - ffff880025a2a4b8  1 132ms FSC: OBJ16a2 OP4d583: Write1/EnQ fl=2
          vsq     - ffff880023cbe6d8  9 212ms FSC: OBJ17eb: LOOK
          vsq     - ffff880024d37590  9 212ms FSC: OBJ17ec: LOOK
          vsq     - ffff880027746cb0  9 212ms FSC: OBJ17ed: LOOK
          vsq     - ffff880024d37ae8  9 212ms FSC: OBJ17ee: LOOK
          vsq     - ffff880024d37cb0  9 212ms FSC: OBJ17ef: LOOK
          vsq     - ffff880025036550  9 212ms FSC: OBJ17f0: LOOK
          vsq     - ffff8800250368e0  9 212ms FSC: OBJ17f1: LOOK
          vsq     - ffff880025036aa8  9 212ms FSC: OBJ17f2: LOOK
      
      In the 'THR' column, executing items show the thread they're occupying and
      queued threads indicate which queue they're on.  'PID' shows the process ID of
      a slow-work thread that's executing something.  'FL' shows the work item flags.
      'MARK' indicates how long since an item was queued or began executing.  Lastly,
      the 'DESC' column permits the owner of an item to give some information.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      8fba10a4
    • J
      SLOW_WORK: Add delayed_slow_work support · 6b8268b1
      Jens Axboe 提交于
      This adds support for starting slow work with a delay, similar
      to the functionality we have for workqueues.
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      6b8268b1
    • J
      SLOW_WORK: Add support for cancellation of slow work · 01609502
      Jens Axboe 提交于
      Add support for cancellation of queued slow work and delayed slow work items.
      The cancellation functions will wait for items that are pending or undergoing
      execution to be discarded by the slow work facility.
      
      Attempting to enqueue work that is in the process of being cancelled will
      result in ECANCELED.
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      01609502
    • D
      SLOW_WORK: Wait for outstanding work items belonging to a module to clear · 3d7a641e
      David Howells 提交于
      Wait for outstanding slow work items belonging to a module to clear when
      unregistering that module as a user of the facility.  This prevents the put_ref
      code of a work item from being taken away before it returns.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3d7a641e
  9. 18 11月, 2009 2 次提交
  10. 16 11月, 2009 1 次提交
  11. 15 11月, 2009 1 次提交
  12. 14 11月, 2009 1 次提交
    • V
      sctp: Set source addresses on the association before adding transports · 409b95af
      Vlad Yasevich 提交于
      Recent commit 8da645e1
      	sctp: Get rid of an extra routing lookup when adding a transport
      introduced a regression in the connection setup.  The behavior was
      
      different between IPv4 and IPv6.  IPv4 case ended up working because the
      route lookup routing returned a NULL route, which triggered another
      route lookup later in the output patch that succeeded.  In the IPv6 case,
      a valid route was returned for first call, but we could not find a valid
      source address at the time since the source addresses were not set on the
      association yet.  Thus resulted in a hung connection.
      
      The solution is to set the source addresses on the association prior to
      adding peers.
      Signed-off-by: NVlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      409b95af
  13. 12 11月, 2009 3 次提交
  14. 11 11月, 2009 2 次提交
  15. 07 11月, 2009 3 次提交
  16. 06 11月, 2009 1 次提交
    • J
      netfilter: nf_nat: fix NAT issue in 2.6.30.4+ · f9dd09c7
      Jozsef Kadlecsik 提交于
      Vitezslav Samel discovered that since 2.6.30.4+ active FTP can not work
      over NAT. The "cause" of the problem was a fix of unacknowledged data
      detection with NAT (commit a3a9f79e).
      However, actually, that fix uncovered a long standing bug in TCP conntrack:
      when NAT was enabled, we simply updated the max of the right edge of
      the segments we have seen (td_end), by the offset NAT produced with
      changing IP/port in the data. However, we did not update the other parameter
      (td_maxend) which is affected by the NAT offset. Thus that could drift
      away from the correct value and thus resulted breaking active FTP.
      
      The patch below fixes the issue by *not* updating the conntrack parameters
      from NAT, but instead taking into account the NAT offsets in conntrack in a
      consistent way. (Updating from NAT would be more harder and expensive because
      it'd need to re-calculate parameters we already calculated in conntrack.)
      Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f9dd09c7
  17. 03 11月, 2009 1 次提交
  18. 02 11月, 2009 1 次提交
    • E
      9p: fix readdir corner cases · 3e2796a9
      Eric Van Hensbergen 提交于
      The patch below also addresses a couple of other corner cases in readdir
      seen with a large (e.g. 64k) msize.  I'm not sure what people think of
      my co-opting of fid->aux here.  I'd be happy to rework if there's a better
      way.
      
      When the size of the user supplied buffer passed to readdir is smaller
      than the data returned in one go by the 9P read request, v9fs_dir_readdir()
      currently discards extra data so that, on the next call, a 9P read
      request will be issued with offset < previous offset + bytes returned,
      which voilates the constraint described in paragraph 3 of read(5) description.
      This patch preseves the leftover data in fid->aux for use in the next call.
      Signed-off-by: NJim Garlick <garlick@llnl.gov>
      Signed-off-by: NEric Van Hensbergen <ericvh@gmail.com>
      3e2796a9