1. 21 Jul 2011, 9 commits
    • fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers · 02c24a82
      Committed by Josef Bacik
      Btrfs needs to be able to control how filemap_write_and_wait_range() is called
      in fsync to make it less of a painful operation, so push taking i_mutex and the
      call to filemap_write_and_wait() down into the ->fsync() handlers.  Some file
      systems, such as ext3 and ocfs2, can apparently drop taking i_mutex altogether.
      For correctness' sake I just pushed everything down in all cases to keep the
      current behavior the same for everybody, and each individual fs maintainer can
      then make up their mind about what to do from there.
      Thanks,
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      02c24a82
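      A minimal sketch of what a ->fsync() handler ends up doing after this change,
      assuming the post-change prototype int (*fsync)(struct file *, loff_t start,
      loff_t end, int datasync); the writeback and locking shown here used to be done
      by the VFS caller.

          int example_fsync(struct file *file, loff_t start, loff_t end, int datasync)
          {
                  struct address_space *mapping = file->f_mapping;
                  struct inode *inode = mapping->host;
                  int ret;

                  /* Writeback the VFS used to issue before calling ->fsync(). */
                  ret = filemap_write_and_wait_range(mapping, start, end);
                  if (ret)
                          return ret;

                  /* Locking the VFS used to take around ->fsync(). */
                  mutex_lock(&inode->i_mutex);
                  /* ... flush metadata / force the journal for this inode ... */
                  mutex_unlock(&inode->i_mutex);

                  return ret;
          }
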
    • fs: add SEEK_HOLE and SEEK_DATA flags · 982d8165
      Committed by Josef Bacik
      This just gets us ready to support the SEEK_HOLE and SEEK_DATA flags.  It turns
      out that using fiemap in things like cp causes more problems than it solves, so
      let's try to give userspace an interface that doesn't suck.  We need to match
      Solaris here, and the definitions are:
      
      o  If whence is SEEK_HOLE, the offset of the start of the next hole greater
         than or equal to the supplied offset is returned.  The definition of a
         hole is provided near the end of the DESCRIPTION.

      o  If whence is SEEK_DATA, the file pointer is set to the start of the next
         non-hole file region greater than or equal to the supplied offset.
      
      So in the generic case the entire file is data and there is a virtual hole at
      the end.  That means we will just return i_size for SEEK_HOLE and return the
      supplied offset unchanged for SEEK_DATA.  This is how Solaris does it, so we
      have to do it the same way.
      
      Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      982d8165
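      A hedged userspace sketch of the semantics above, assuming SEEK_DATA and
      SEEK_HOLE are exposed by <unistd.h> when _GNU_SOURCE is defined; on a
      filesystem using the generic implementation the reported hole is the virtual
      one at i_size (error handling is elided).

          #define _GNU_SOURCE
          #include <fcntl.h>
          #include <stdio.h>
          #include <unistd.h>

          int main(int argc, char **argv)
          {
                  int fd = open(argv[1], O_RDONLY);
                  off_t data, hole;

                  if (fd < 0)
                          return 1;

                  data = lseek(fd, 0, SEEK_DATA);    /* first non-hole region at/after 0 */
                  hole = lseek(fd, data, SEEK_HOLE); /* next hole at/after that data */

                  printf("data at %lld, next hole at %lld\n",
                         (long long)data, (long long)hole);
                  close(fd);
                  return 0;
          }
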
    • fs: seq_file - add event counter to simplify poll() support · f1514638
      Committed by Kay Sievers
      Moving the event counter into the dynamically allocated 'struct seq_file'
      allows poll() support without the need for each user to allocate its own
      tracking structure.
      
      All current users are switched over to use the new counter.
      
      Requested-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: NeilBrown <neilb@suse.de>
      Tested-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      f1514638
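      A hedged sketch of the poll() pattern this enables, assuming the counter added
      here is the poll_event field of struct seq_file; my_waitq and my_event are
      hypothetical objects owned by the subsystem exporting the file.

          static unsigned int example_proc_poll(struct file *file, poll_table *wait)
          {
                  struct seq_file *m = file->private_data;
                  unsigned int res = POLLIN | POLLRDNORM;

                  poll_wait(file, &my_waitq, wait);

                  /* Content has changed since this reader last saw it. */
                  if (m->poll_event != atomic_read(&my_event))
                          res |= POLLERR | POLLPRI;

                  return res;
          }
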
    • fs: simplify the blockdev_direct_IO prototype · aacfc19c
      Committed by Christoph Hellwig
      Simple filesystems always pass inode->i_sb->s_bdev as the block device
      argument, and never need an end_io handler.  Let's simplify things for
      them and for my grepping activity by dropping these arguments.  The
      only thing not falling into that scheme is ext4, which passes an
      end_io handler without needing special flags (yet), but given how
      messy the direct I/O code there is, using __blockdev_direct_IO
      in one instead of two out of three cases isn't going to make a large
      difference anyway.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      aacfc19c
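      With the arguments dropped, the wrapper for simple filesystems roughly
      reduces to the sketch below; treat the exact flag set
      (DIO_LOCKING | DIO_SKIP_HOLES) as my assumption about the default locking
      behaviour.

          static inline ssize_t blockdev_direct_IO(int rw, struct kiocb *iocb,
                          struct inode *inode, const struct iovec *iov, loff_t offset,
                          unsigned long nr_segs, get_block_t get_block)
          {
                  /* Simple filesystems always use the inode's own block device
                   * and never need an end_io handler, so neither is passed in. */
                  return __blockdev_direct_IO(rw, iocb, inode, inode->i_sb->s_bdev,
                                              iov, offset, nr_segs, get_block,
                                              NULL, NULL,
                                              DIO_LOCKING | DIO_SKIP_HOLES);
          }
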
    • rw_semaphore: remove up/down_read_non_owner · 11b80f45
      Committed by Christoph Hellwig
      Now that the last user is gone, these can be removed.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      11b80f45
    • fs: kill i_alloc_sem · bd5fe6c5
      Committed by Christoph Hellwig
      i_alloc_sem is a rather special rw_semaphore.  It's the last one that may
      be released by a non-owner, and its write side is always mirrored by
      real exclusion.  Its intended use is to wait for all pending direct I/O
      requests to finish before starting a truncate.
      
      Replace it with a hand-grown construct:
      
       - exclusion for truncates is already guaranteed by i_mutex, so it can
         simply fall away
       - the reader side is replaced by an i_dio_count member in struct inode
         that counts the number of pending direct I/O requests.  Truncate can't
         proceed as long as it's non-zero
       - when i_dio_count reaches zero we wake up a pending truncate using
         wake_up_bit on a new bit in i_flags
       - new references to i_dio_count can't appear while we are waiting for
         it to reach zero, because starting a new direct I/O operation always
         requires i_mutex (or an equivalent like XFS's i_iolock)
      
      This scheme is much simpler, and saves the space of a spinlock_t and a
      struct list_head in struct inode (typically 160 bits on a non-debug 64-bit
      system).
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      bd5fe6c5
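      A minimal sketch of the counting scheme described above; the helper names and
      the EXAMPLE_DIO_WAKEUP bit are illustrative, not the exact symbols introduced
      by the patch.

          #define EXAMPLE_DIO_WAKEUP 9    /* illustrative bit number in i_flags */

          static inline void example_dio_begin(struct inode *inode)
          {
                  /* Callers hold i_mutex (or an equivalent), so no new direct
                   * I/O can start while a truncate is already waiting. */
                  atomic_inc(&inode->i_dio_count);
          }

          static inline void example_dio_end(struct inode *inode)
          {
                  /* Last pending request gone: wake a waiting truncate. */
                  if (atomic_dec_and_test(&inode->i_dio_count))
                          wake_up_bit(&inode->i_flags, EXAMPLE_DIO_WAKEUP);
          }

          static void example_dio_wait(struct inode *inode)
          {
                  /* Called with i_mutex held before truncating. */
                  wait_queue_head_t *wq = bit_waitqueue(&inode->i_flags,
                                                        EXAMPLE_DIO_WAKEUP);
                  DEFINE_WAIT_BIT(q, &inode->i_flags, EXAMPLE_DIO_WAKEUP);

                  do {
                          prepare_to_wait(wq, &q.wait, TASK_UNINTERRUPTIBLE);
                          if (atomic_read(&inode->i_dio_count))
                                  schedule();
                  } while (atomic_read(&inode->i_dio_count));
                  finish_wait(wq, &q.wait);
          }
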
    • anonfd: fix missing declaration · e46ebd27
      Committed by Tomasz Stanislawski
      The forward declaration of struct file_operations is
      added to avoid compilation warnings.
      Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      e46ebd27
    • superblock: add filesystem shrinker operations · 0e1fdafd
      Committed by Dave Chinner
      Now that we have a per-superblock shrinker implementation, we can add a
      filesystem-specific callout to it to allow filesystem internal
      caches to be shrunk by the superblock shrinker.
      
      Rather than perpetuate the multipurpose shrinker callback API (i.e.
      nr_to_scan == 0 meaning "tell me how many objects are freeable in the
      cache"), two operations will be added.  The first returns the
      number of objects that are freeable, the second is the actual
      shrinker call.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      0e1fdafd
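      A hedged sketch of the filesystem side, assuming the two callouts land in
      struct super_operations as nr_cached_objects() and free_cached_objects();
      myfs_count_freeable() and myfs_reclaim() are hypothetical internal helpers.

          static int myfs_nr_cached_objects(struct super_block *sb)
          {
                  /* First callout: report how many internal objects could be
                   * freed right now, without freeing anything. */
                  return myfs_count_freeable(sb->s_fs_info);
          }

          static void myfs_free_cached_objects(struct super_block *sb, int nr_to_scan)
          {
                  /* Second callout: the actual shrink of up to nr_to_scan objects. */
                  myfs_reclaim(sb->s_fs_info, nr_to_scan);
          }

          static const struct super_operations myfs_super_ops = {
                  /* ... the usual operations ... */
                  .nr_cached_objects      = myfs_nr_cached_objects,
                  .free_cached_objects    = myfs_free_cached_objects,
          };
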
    • superblock: introduce per-sb cache shrinker infrastructure · b0d40c92
      Committed by Dave Chinner
      With context based shrinkers, we can implement a per-superblock
      shrinker that shrinks the caches attached to the superblock. We
      currently have global shrinkers for the inode and dentry caches that
      split the work up into per-superblock operations via a coarse
      proportioning method that does not batch very well.  The global
      shrinkers also have a dependency - dentries pin inodes - so we have
      to be very careful about how we register the global shrinkers so
      that the implicit call order is always correct.
      
      With a per-sb shrinker callout, we can encode this dependency
      directly into the per-sb shrinker, hence avoiding the need for
      strictly ordering shrinker registrations. We also have no need for
      any proportioning code, because the shrinker subsystem already
      provides this functionality across all shrinkers. Allowing the
      shrinker to operate on a single superblock at a time means that we
      do fewer superblock list traversals and take less locking, and
      reclaim should batch more effectively. This should result in less
      CPU overhead for reclaim and potentially faster reclaim of items
      from each filesystem.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      b0d40c92
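      A hedged sketch of how a per-sb shrink callback can encode the
      dentries-pin-inodes dependency directly in its call order; the s_shrink
      field, the per-sb unused counters and the prune_*_sb() helpers are my reading
      of what this series introduces and should be treated as assumptions.

          static int example_prune_super(struct shrinker *shrink,
                                         struct shrink_control *sc)
          {
                  struct super_block *sb =
                          container_of(shrink, struct super_block, s_shrink);
                  int nr = sc->nr_to_scan;

                  if (nr) {
                          /* Dentries pin inodes, so prune the dcache before the
                           * icache; no registration ordering is needed. */
                          prune_dcache_sb(sb, nr / 2);
                          prune_icache_sb(sb, nr - nr / 2);
                  }

                  /* Report what is still freeable on this superblock. */
                  return sb->s_nr_dentry_unused + sb->s_nr_inodes_unused;
          }
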
  2. 20 Jul 2011, 23 commits
  3. 19 Jul 2011, 1 commit
  4. 15 Jul 2011, 1 commit
    • net: remove NETIF_F_ALL_TX_OFFLOADS · 62f2a3a4
      Committed by Michał Mirosław
      There is no software fallback implemented for SCTP or FCoE checksumming,
      so these features should not be passed on by software devices like bridge
      or bonding.
      
      For VLAN devices, this is different. First, the driver for the underlying
      device should be prepared to get offloaded packets even when the feature is
      disabled (especially if it advertises it in vlan_features). Second, devices
      under VLANs do not get replaced without tearing down the VLAN first.
      
      This fixes a mess I accidentally introduced while converting bonding to
      ndo_fix_features.
      
      NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they
      are unused as of commit 712ae51a.
      Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      62f2a3a4
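      A hedged sketch of the kind of masking a software device can do in its
      ndo_fix_features callback (3.0-era u32 feature words assumed);
      NETIF_F_SCTP_CSUM and NETIF_F_FCOE_CRC are the flags for the offloads named
      above.

          static u32 example_swdev_fix_features(struct net_device *dev, u32 features)
          {
                  /* No software fallback exists for these, so a bridge- or
                   * bonding-style device must never pass them through. */
                  features &= ~(NETIF_F_SCTP_CSUM | NETIF_F_FCOE_CRC);

                  return features;
          }
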
  5. 14 Jul 2011, 1 commit
  6. 12 Jul 2011, 1 commit
  7. 09 Jul 2011, 1 commit
    • w1: ds1wm: add a reset recovery parameter · f607e7fc
      Committed by Jean-François Dagenais
      This fixes a regression in 3.0 reported by Paul Parsons regarding the
      removal of the msleep(1) in the ds1wm_reset() function:
      
      : The linux-3.0-rc4 DS1WM 1-wire driver is logging "bus error, retrying"
      : error messages on an HP iPAQ hx4700 PDA (XScale-PXA270):
      :
      : <snip>
      : Driver for 1-wire Dallas network protocol.
      : DS1WM w1 busmaster driver - (c) 2004 Szabolcs Gyurko
      : 1-Wire driver for the DS2760 battery monitor  chip  - (c) 2004-2005, Szabolcs Gyurko
      : ds1wm ds1wm: pass: 1 bus error, retrying
      : ds1wm ds1wm: pass: 2 bus error, retrying
      : ds1wm ds1wm: pass: 3 bus error, retrying
      : ds1wm ds1wm: pass: 4 bus error, retrying
      : ds1wm ds1wm: pass: 5 bus error, retrying
      : ...
      :
      : The visible result is that the battery charging LED is erratic; sometimes
      : it works, mostly it doesn't.
      :
      : The linux-2.6.39 DS1WM 1-wire driver worked OK.  I haven't tried 3.0-rc1,
      : 3.0-rc2, or 3.0-rc3.
      
      This sleep should not be required on normal circuitry, provided the
      pull-ups on the bus are correctly adapted to the slaves.  Unfortunately,
      this is not always the case.  The sleep is therefore restored, but as a
      parameter passed to the probe function via the pdata.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Reported-by: Paul Parsons <lost.distance@yahoo.com>
      Tested-by: Paul Parsons <lost.distance@yahoo.com>
      Signed-off-by: Jean-François Dagenais <dagenaisj@sonatest.com>
      Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f607e7fc
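      A hedged sketch of the idea, with all names hypothetical: the recovery delay
      becomes a platform-data field that only boards with marginal pull-ups need to
      set, instead of an unconditional msleep(1) after every reset.

          struct example_w1_pdata {
                  unsigned int reset_recover_delay;   /* ms, 0 = no extra delay */
          };

          struct example_w1_master {
                  void __iomem *regs;
                  const struct example_w1_pdata *pdata;
          };

          static void example_w1_reset(struct example_w1_master *master)
          {
                  example_w1_issue_reset(master);     /* hypothetical hardware op */

                  /* Only boards that declare the need pay the extra latency. */
                  if (master->pdata->reset_recover_delay)
                          msleep(master->pdata->reset_recover_delay);
          }
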
  8. 08 Jul 2011, 2 commits
    • FS-Cache: Add a helper to bulk uncache pages on an inode · c902ce1b
      Committed by David Howells
      Add an FS-Cache helper to bulk uncache pages on an inode.  This will
      only work for the circumstance where the pages in the cache correspond
      1:1 with the pages attached to an inode's page cache.
      
      This is required for CIFS and NFS: when disabling the inode cookie, we were
      returning the cookie and setting cifsi->fscache to NULL but failing to
      invalidate any previously mapped pages.  This resulted in "Bad page
      state" errors and manifested as other kinds of errors when running
      fsstress.  Fix it by uncaching mapped pages when we disable the inode
      cookie.
      
      This patch should fix the following oops and "Bad page state" errors
      seen during fsstress testing.
      
        ------------[ cut here ]------------
        kernel BUG at fs/cachefiles/namei.c:201!
        invalid opcode: 0000 [#1] SMP
        Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
        RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
        RSP: 0018:ffff88002ce6dd00  EFLAGS: 00010282
        RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
        RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
        R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
        R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
        FS:  00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
        Stack:
         0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
         ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
         ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
        Call Trace:
         cachefiles_lookup_object+0x78/0xd4 [cachefiles]
         fscache_lookup_object+0x131/0x16d [fscache]
         fscache_object_work_func+0x1bc/0x669 [fscache]
         process_one_work+0x186/0x298
         worker_thread+0xda/0x15d
         kthread+0x84/0x8c
         kernel_thread_helper+0x4/0x10
        RIP  cachefiles_walk_to_object+0x436/0x745 [cachefiles]
        ---[ end trace 1d481c9af1804caa ]---
      
      I tested the uncaching by the following means:
      
       (1) Create a big file on my NFS server (104857600 bytes).
      
       (2) Read the file into the cache with md5sum on the NFS client.  Look in
           /proc/fs/fscache/stats:
      
      	Pages  : mrk=25601 unc=0
      
       (3) Open the file for read/write ("bash 5<>/warthog/bigfile").  Look in proc
           again:
      
      	Pages  : mrk=25601 unc=25601
      Reported-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c902ce1b
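      A hedged sketch of how a netfs can use the new helper when it drops an
      inode's cookie, assuming the helper added here is named
      fscache_uncache_all_inode_pages(); where the cookie pointer lives (e.g.
      cifsi->fscache) is filesystem specific.

          static void example_disable_inode_cookie(struct inode *inode,
                                                   struct fscache_cookie **cookiep)
          {
                  if (!*cookiep)
                          return;

                  /* Uncache every page first, so no stale fscache marks are
                   * left behind to trigger "Bad page state" later. */
                  fscache_uncache_all_inode_pages(*cookiep, inode);
                  fscache_relinquish_cookie(*cookiep, 1); /* retire backing object */
                  *cookiep = NULL;
          }
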
    • genirq: replace irq_gc_ack() with {set,clr}_bit variants (fwd) · 659fb32d
      Committed by Simon Guinot
      This fixes a regression introduced by e59347a1 "arm: orion:
      Use generic irq chip".
      
      Depending on the device, interrupt acknowledgement is done by setting
      or by clearing a dedicated register bit.  Replacing irq_gc_ack() with
      {set,clr}_bit variants allows both cases to be handled.
      
      Note that this patch affects the following SoCs: Davinci, Samsung and
      Orion.  Except for the last, the change is minor: irq_gc_ack() is just
      renamed to irq_gc_ack_set_bit().
      
      For the Orion SoCs, edge GPIO interrupt support is currently broken:
      irq_gc_ack() tries to acknowledge such an interrupt by setting the
      corresponding cause register bit, while the Orion GPIO device expects
      the opposite.  To fix this issue, the irq_gc_ack_clr_bit() variant is used.
      
      Tested on Network Space v2.
      Reported-by: Joey Oravec <joravec@drewtech.com>
      Signed-off-by: Simon Guinot <sguinot@lacie.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      659fb32d
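      A hedged sketch of choosing between the two variants when setting up a
      generic irq chip for an Orion-style edge GPIO bank; the register offsets are
      illustrative, and irq_gc_ack_set_bit would be used instead on devices that
      ack by setting the cause bit.

          #define EXAMPLE_EDGE_CAUSE_OFF  0x10    /* illustrative offsets */
          #define EXAMPLE_EDGE_MASK_OFF   0x14

          static void __init example_init_edge_gpio_irqs(unsigned int irq_base,
                                                         void __iomem *reg_base)
          {
                  struct irq_chip_generic *gc;
                  struct irq_chip_type *ct;

                  gc = irq_alloc_generic_chip("example_gpio", 1, irq_base,
                                              reg_base, handle_edge_irq);
                  if (!gc)
                          return;

                  ct = gc->chip_types;
                  ct->regs.ack = EXAMPLE_EDGE_CAUSE_OFF;
                  ct->regs.mask = EXAMPLE_EDGE_MASK_OFF;
                  /* Orion-style GPIOs ack an edge interrupt by *clearing* the
                   * cause bit, hence the clr_bit variant here. */
                  ct->chip.irq_ack = irq_gc_ack_clr_bit;
                  ct->chip.irq_mask = irq_gc_mask_clr_bit;
                  ct->chip.irq_unmask = irq_gc_mask_set_bit;

                  irq_setup_generic_chip(gc, IRQ_MSK(8), IRQ_GC_INIT_MASK_CACHE,
                                         IRQ_NOREQUEST, 0);
          }
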
  9. 05 Jul 2011, 1 commit