1. 23 Jun, 2010 4 commits
  2. 01 Jun, 2010 4 commits
  3. 30 May, 2010 10 commits
  4. 28 May, 2010 22 commits
    •
      remove detritus left by "mm: make read_cache_page synchronous" · 49837a80
      Al Viro committed
      Brings minix get_dir_page() in sync with its analogs. Back in 2007
      Nick switched read_cache_page() and friends to sync behaviour
      (i.e. they wait for the page to get unlocked, check whether it's
      uptodate, and return ERR_PTR(-EIO) if it isn't) and removed the
      duplicated logic from the callers.  In the case of fs/minix/dir.c he
      removed only half of it...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      49837a80
    •
      fix fs/sysv s_dirt handling · 4c9002de
      Al Viro committed
      It got broken by the ->sync_fs() conversion a year ago and nobody noticed...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      4c9002de
    •
      fat: convert to use the new truncate convention. · 459f6ed3
      npiggin@suse.de committed
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      459f6ed3
    •
      ext2: convert to use the new truncate convention. · 737f2e93
      npiggin@suse.de committed
      I have also commented on a possible bug in the existing ext2 code, marked with XXX.
      
      Cc: linux-ext4@vger.kernel.org
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      737f2e93
    •
      fs: convert simple fs to new truncate · 3322e79a
      Nick Piggin committed
      Convert simple filesystems (ramfs, configfs, sysfs, block_dev) to the new
      truncate sequence.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      3322e79a
    •
      kill spurious reference to vmtruncate · 15c6fd97
      npiggin@suse.de committed
      Lots of filesystems call vmtruncate despite not implementing the old
      ->truncate method.  Switch them to use simple_setsize and add some
      comments about the truncate code where it seems fitting.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      15c6fd97
    •
      fs: introduce new truncate sequence · 7bb46a67
      npiggin@suse.de committed
      Introduce a new truncate calling sequence into fs/mm subsystems. Rather than
      setattr > vmtruncate > truncate, have filesystems call their truncate sequence
      from ->setattr if filesystem specific operations are required. vmtruncate is
      deprecated, and truncate_pagecache and inode_newsize_ok helpers introduced
      previously should be used.
      
      simple_setattr is introduced for simple in-ram filesystems to implement
      the new truncate sequence. Eventually all filesystems should be converted
      to implement a setattr, and the default code in notify_change should go
      away.
      
      simple_setsize is also introduced to perform just the ATTR_SIZE portion
      of simple_setattr (i.e. changing i_size and trimming pagecache).
      
      To implement the new truncate sequence:
      - filesystem specific manipulations (e.g. freeing blocks) must be done in
        the setattr method rather than ->truncate.
      - vmtruncate can not be used by core code to trim blocks past i_size in
        the event of write failure after allocation, so this must be performed
        in the fs code.
      - convert usage of helpers block_write_begin, nobh_write_begin,
        cont_write_begin, and *blockdev_direct_IO* to use _newtrunc postfixed
        variants. These avoid calling vmtruncate to trim blocks (see previous).
      - inode_setattr should not be used. generic_setattr is a new function
        to be used to copy simple attributes into the generic inode.
      - make use of the better opportunity to handle errors with the new sequence.
      
      Big problem with the previous calling sequence: the filesystem is not called
      until i_size has already changed.  This means it is not allowed to fail the
      call, and also it does not know what the previous i_size was. Also, generic
      code calling vmtruncate to truncate allocated blocks in case of error had
      no good way to return a meaningful error (or, for example, atomically handle
      block deallocation).
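
The ordering argued for above (validate the new size first, keep the old size available, and only then change i_size and release blocks) can be sketched in userspace C. This is a toy model under stated assumptions, not kernel code: `toy_inode`, `toy_newsize_ok` and `toy_setsize` are hypothetical names, and the 1 KiB block accounting is invented for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Toy stand-in for an inode; NOT the kernel's struct inode. */
struct toy_inode {
    int64_t i_size;        /* current file size in bytes */
    int64_t max_bytes;     /* largest size this toy filesystem allows */
    int64_t blocks_in_use; /* 1 KiB blocks currently allocated */
};

/* Hypothetical analogue of inode_newsize_ok(): validate before any change. */
static int toy_newsize_ok(const struct toy_inode *inode, int64_t newsize)
{
    if (newsize < 0)
        return -EINVAL;
    if (newsize > inode->max_bytes)
        return -EFBIG;
    return 0;
}

/* Hypothetical size path of a ->setattr method in the new sequence. */
static int toy_setsize(struct toy_inode *inode, int64_t newsize)
{
    int64_t oldsize = inode->i_size;   /* the old size is still known here */
    int err = toy_newsize_ok(inode, newsize);

    if (err)
        return err;                    /* failing leaves the inode untouched */

    inode->i_size = newsize;
    if (newsize < oldsize)             /* shrink: release blocks past new EOF */
        inode->blocks_in_use = (newsize + 1023) / 1024;
    return 0;
}
```

Because the check runs before i_size is modified, the caller gets a meaningful error and the inode is unchanged on failure, which is the property the old setattr > vmtruncate > truncate ordering could not provide.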
      
      Cc: Christoph Hellwig <hch@lst.de>
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7bb46a67
    •
      fs/super: fix kernel-doc warning · 7000d3c4
      Randy Dunlap committed
      Fix fs/super.c kernel-doc warning and function notation:
      Warning(fs/super.c:957): No description found for parameter 'sb'
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7000d3c4
    •
      fs/minix: bugfix, number of indirect block ptrs per block depends on block size · 0ab7620a
      Erik van der Kouwe committed
      The MINIX filesystem driver used a constant number of indirect block
      pointers in an indirect block. This worked only for filesystems with 1 KB
      blocks, while the MINIX default block size is now 4 KB. As a consequence,
      large files were read incorrectly on such filesystems and writing a
      large file would cause the filesystem to become corrupted. This patch
      computes the number of indirect block pointers based on the block size,
      making the driver work for each block size.
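
The arithmetic behind the fix can be illustrated with a small sketch. It assumes 32-bit on-disk block numbers (as in MINIX V2/V3 layouts); the helper names are hypothetical and are not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of block pointers that fit in one indirect block.
 * Assumption: each on-disk pointer is 32 bits wide. */
static size_t ptrs_per_indirect_block(size_t block_size)
{
    return block_size / sizeof(uint32_t);
}

/* Bytes addressable through a single indirect block. */
static uint64_t single_indirect_span(size_t block_size)
{
    return (uint64_t)ptrs_per_indirect_block(block_size) * block_size;
}
```

Under that assumption a 1 KB block holds 256 pointers and a 4 KB block holds 1024, so a driver hard-coding the 1 KB figure would misinterpret indirect blocks on a 4 KB filesystem.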
      
      I would like to thank Feiran Zheng ('Fam') for pointing out the cause
      of the corruption.
      Signed-off-by: Erik van der Kouwe <vdkouwe@cs.vu.nl>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      0ab7620a
    •
      rename the generic fsync implementations · 1b061d92
      Christoph Hellwig committed
      We don't name our generic fsync implementations very well currently.
      The no-op implementation for in-memory filesystems is currently called
      simple_sync_file, which doesn't make too much sense to start with, and
      the generic one for simple filesystems is called simple_fsync,
      which can lead to some confusion.
      
      This patch renames the generic file fsync method to generic_file_fsync
      to match the other generic_file_* routines it is supposed to be used
      with, and the no-op implementation to noop_fsync to make it obvious
      what to expect.  In addition add some documentation for both methods.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      1b061d92
    •
      drop unused dentry argument to ->fsync · 7ea80859
      Christoph Hellwig committed
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7ea80859
    •
      fs: Add missing mutex_unlock · cc967be5
      Julia Lawall committed
      Add a mutex_unlock missing on the error path.  At other exits from the
      function that return an error flag, the mutex is unlocked, so do the same
      here.
      
      The semantic match that finds this problem is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression E1;
      @@
      
      * mutex_lock(E1,...);
        <+... when != E1
        if (...) {
          ... when != E1
      *   return ...;
        }
        ...+>
      * mutex_unlock(E1,...);
      // </smpl>
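
The bug class matched by the semantic match above (an early return between lock and unlock) is commonly avoided with a single unlock label. The following is a userspace sketch using POSIX mutexes rather than the kernel's mutex API; `table_store` and `table_lock` are hypothetical names.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int table[4];

/* Store value at idx; every exit path releases table_lock. */
static int table_store(int idx, int value)
{
    int err = 0;

    pthread_mutex_lock(&table_lock);
    if (idx < 0 || idx >= 4) {
        err = -EINVAL;
        goto out_unlock;   /* a bare "return -EINVAL;" here would leak the lock */
    }
    table[idx] = value;
out_unlock:
    pthread_mutex_unlock(&table_lock);
    return err;
}
```

Routing the error path through `out_unlock` keeps lock and unlock paired no matter which branch is taken.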
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      cc967be5
    •
      get rid of the magic around f_count in aio · d7065da0
      Al Viro committed
      __aio_put_req() plays sick games with file refcount.  What
      it wants is fput() from atomic context; it's almost always
      done with f_count > 1, so they only have to deal with delayed
      work in rare cases when their reference happens to be the
      last one.  Current code decrements f_count and if it hasn't
      hit 0, everything is fine.  Otherwise it keeps a pointer
      to struct file (with zero f_count!) around and has delayed
      work do __fput() on it.
      
      Better way to do it: use atomic_long_add_unless( , -1, 1)
      instead of !atomic_long_dec_and_test().  IOW, decrement it
      only if it's not the last reference, leave refcount alone
      if it was.  And use normal fput() in delayed work.
      
      I've made that atomic_long_add_unless call a new helper -
      fput_atomic().  Drops a reference to file if it's safe to
      do in atomic (i.e. if that's not the last one), tells if
      it had been able to do that.  aio.c converted to it, __fput()
      use is gone.  req->ki_file *always* contributes to refcount
      now.  And __fput() became static.
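
The "decrement only if it is not the last reference" idea can be modelled in userspace with C11 atomics. This is a sketch, not the kernel implementation: `add_unless` is a hand-written analogue of atomic_long_add_unless(), and `toy_fput_atomic` is a hypothetical stand-in for the new helper.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Add delta to *v unless *v currently equals `unless`.
 * Returns true if the add happened. Built on a compare-and-swap loop. */
static bool add_unless(atomic_long *v, long delta, long unless)
{
    long cur = atomic_load(v);

    while (cur != unless) {
        /* On failure, cur is refreshed with the value now stored in *v. */
        if (atomic_compare_exchange_weak(v, &cur, cur + delta))
            return true;
    }
    return false;
}

/* Drop one reference only if it is not the last one; report whether we did. */
static bool toy_fput_atomic(atomic_long *f_count)
{
    return add_unless(f_count, -1, 1);
}
```

When the call returns false the count is left at 1, so the caller still holds a real reference and can hand the final put to delayed work, instead of carrying around an object whose count is already zero.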
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      d7065da0
    •
      VFS: fix recent breakage of FS_REVAL_DOT · 176306f5
      Neil Brown committed
      Commit 1f36f774 broke FS_REVAL_DOT semantics.
      
      In particular, before this patch, the command
         ls -l
      in an NFS mounted directory would always check if the directory on the server
      had changed and if so would flush and refill the pagecache for the dir.
      After this patch, the same "ls -l" will repeatedly return stale data until
      the cached attributes for the directory time out.
      
      The following patch fixes this by ensuring the d_revalidate is called by
      do_last when "." is being looked-up.
      link_path_walk has already called d_revalidate, but in that case LOOKUP_OPEN
      is not set so nfs_lookup_verify_inode chooses not to do any validation.
      
      The following patch restores the original behaviour.
      
      Cc: stable@kernel.org
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      176306f5
    •
      Revert "anon_inode: set S_IFREG on the anon_inode" · 1eb2cbb6
      Al Viro committed
      This reverts commit a7cf4145.
      1eb2cbb6
    •
      quota: Convert quota statistics to generic percpu_counter · f32764bd
      Dmitry Monakhov committed
      The generic per-cpu counter has some memory overhead, but it is negligible
      on modern systems, and embedded systems are compiled without quota support.
      Code reuse is also a good thing. This patch should fix the complaint from
      preemptible kernels that was introduced by dde95888.
      
      [Jan Kara: Fixed patch to work on 32-bit archs as well]
      Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Jan Kara <jack@suse.cz>
      f32764bd
    •
      fs/: do not fallback to default_llseek() when readdir() uses BKL · ca572727
      Jan Blunck committed
      Do not use the fallback default_llseek() if the readdir operation of the
      filesystem still uses the big kernel lock.
      
      Since llseek() modifies file->f_pos of the directory directly, it may need
      locking so as not to confuse readdir, which usually uses file->f_pos
      directly as well.
      
      Since the special characteristics of the BKL (unlocked on schedule) are
      not necessary in this case, the inode mutex can be used for locking as
      provided by generic_file_llseek().  This is only possible since all
      filesystems, except reiserfs, either use a directory as a flat file or
      with disk address offsets.  Reiserfs on the other hand uses a 32bit hash
      of the filename as the offset so generic_file_llseek() can get used as
      well since the hash is always smaller than sb->s_maxbytes (= (512 << 32) -
      blocksize).
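
The bound quoted above is easy to check numerically. The sketch below simply evaluates the formula from the message, (512 << 32) - blocksize, in 64-bit arithmetic and compares it with the largest 32-bit value; the function names are hypothetical.

```c
#include <stdint.h>

/* s_maxbytes as given in the commit message: (512 << 32) - blocksize. */
static uint64_t maxbytes_for(uint32_t blocksize)
{
    return ((uint64_t)512 << 32) - blocksize;
}

/* Nonzero if every possible 32-bit directory offset is below s_maxbytes. */
static int hash_offsets_fit(uint32_t blocksize)
{
    return (uint64_t)UINT32_MAX < maxbytes_for(blocksize);
}
```

Since 512 << 32 is 2^41, any realistic block size leaves the limit far above 2^32 - 1, which is why a 32-bit hash used as a directory offset stays within range.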
      Signed-off-by: Jan Blunck <jblunck@suse.de>
      Acked-by: Jan Kara <jack@suse.cz>
      Acked-by: Anders Larsen <al@alarsen.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ca572727
    •
      vfs: introduce noop_llseek() · ae6afc3f
      Jan Blunck committed
      This is an implementation of ->llseek usable for the rare special case
      when userspace expects the seek to succeed but the (device) file is
      actually not able to perform the seek.  In this case you use noop_llseek()
      instead of falling back to the default implementation of ->llseek.
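
A seek that "succeeds" without moving anything amounts to returning the current position unchanged. The userspace model below illustrates that behaviour; `toy_file` and `toy_noop_llseek` are hypothetical names, not the kernel's types.

```c
#include <stdint.h>

/* Toy stand-in for an open file; only the position matters here. */
struct toy_file {
    int64_t f_pos;
};

/* Report success without moving the file position. */
static int64_t toy_noop_llseek(struct toy_file *file, int64_t offset, int whence)
{
    (void)offset;   /* requested offset is deliberately ignored */
    (void)whence;   /* so is the origin */
    return file->f_pos;
}
```

Userspace sees a non-negative return value, so the call does not fail, yet later reads and writes continue from wherever the file position already was.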
      Signed-off-by: Jan Blunck <jblunck@suse.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ae6afc3f
    •
      aio: fix the compat vectored operations · 9d85cba7
      Jeff Moyer committed
      The aio compat code was not converting the struct iovecs from 32bit to
      64bit pointers, causing either EINVAL to be returned from io_getevents, or
      EFAULT as the result of the I/O.  This patch passes a compat flag to
      io_submit to signal that pointer conversion is necessary for a given iocb
      array.
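
The conversion in question is a widening of each vector entry from the 32-bit layout to the native 64-bit one. The sketch below models it with plain integers in userspace; the struct and function names are hypothetical, and real kernel code would additionally copy from user memory and validate each entry.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit layout: pointer and length are both 32 bits wide. */
struct toy_compat_iovec {
    uint32_t iov_base;
    uint32_t iov_len;
};

/* Native 64-bit layout. */
struct toy_iovec {
    uint64_t iov_base;
    uint64_t iov_len;
};

/* Widen n entries from the 32-bit layout; returns the total byte count. */
static uint64_t widen_iovecs(const struct toy_compat_iovec *in,
                             struct toy_iovec *out, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        out[i].iov_base = in[i].iov_base;  /* zero-extend the 32-bit pointer */
        out[i].iov_len = in[i].iov_len;
        total += out[i].iov_len;
    }
    return total;
}
```

The two layouts differ in size (8 versus 16 bytes per entry here), so reading a 32-bit array as if it were native walks off the end of the list, which matches the EFAULT symptom described in the message.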
      
      A variant of this was tested by Michael Tokarev.  I have also updated the
      libaio test harness to exercise this code path with good success.
      Further, I grabbed a copy of ltp and ran the
      testcases/kernel/syscall/readv and writev tests there (compiled with -m32
      on my 64bit system).  All seems happy, but extra eyes on this would be
      welcome.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix CONFIG_COMPAT=n build]
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Reported-by: Michael Tokarev <mjt@tls.msk.ru>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: <stable@kernel.org>		[2.6.35.1]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9d85cba7
    •
      compat: factor out compat_rw_copy_check_uvector from compat_do_readv_writev · b8373363
      Jeff Moyer committed
      It was reported in http://lkml.org/lkml/2010/3/8/309 that 32 bit readv and
      writev AIO operations were not functioning properly.  It turns out that
      the code to convert the 32bit io vectors to 64 bits was never written.
      The results of that can be pretty bad, but in my testing, it mostly ended
      up in generating EFAULT as we walked off the list of I/O vectors provided.
      
      This patch set fixes the problem in my environment.  are greatly
      appreciated.
      
      This patch:
      
      Factor out code that will be used by both compat_do_readv_writev and the
      compat aio submission code paths.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Reported-by: Michael Tokarev <mjt@tls.msk.ru>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: <stable@kernel.org>		[2.6.35.1]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b8373363
    •
      fs/affs: use ERR_CAST · cccad8f9
      Julia Lawall committed
      Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)).  The former makes the
      purpose of the operation clearer; the latter otherwise looks like a
      no-op.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      type T;
      T x;
      identifier f;
      @@
      
      T f (...) { <+...
      - ERR_PTR(PTR_ERR(x))
      + x
       ...+> }
      
      @@
      expression x;
      @@
      
      - ERR_PTR(PTR_ERR(x))
      + ERR_CAST(x)
      // </smpl>
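
For readers unfamiliar with the error-pointer idiom, here is a minimal userspace model of the helpers involved. The definitions are simplified re-creations written for illustration and are not copied from the kernel headers; `MAX_ERRNO` is assumed to be 4095.

```c
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Pass an error pointer through under a different pointer type. */
static inline void *ERR_CAST(const void *ptr)
{
    return (void *)ptr;
}
```

In this model ERR_PTR(PTR_ERR(x)) and ERR_CAST(x) produce the same value; the second form just states the intent (changing only the pointer type) directly.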
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cccad8f9
    •
      kcore: add _text to KCORE_TEXT · 36e15263
      Wu Fengguang committed
      Extend KCORE_TEXT to cover the pages between _text and _stext, to allow
      examining some important page table pages.
      
      `readelf -a` output on x86_64 before and after patch:
      	  Type           Offset             VirtAddr           PhysAddr
      before    LOAD           0x00007fff8100c000 0xffffffff81009000 0x0000000000000000
      after     LOAD           0x00007fff81003000 0xffffffff81000000 0x0000000000000000
      
      The newly covered pages are:
      
      	0xffffffff81000000 <startup_64> etc.
      	0xffffffff81001000 <init_level4_pgt>
      	0xffffffff81002000 <level3_ident_pgt>
      	0xffffffff81003000 <level3_kernel_pgt>
      	0xffffffff81004000 <level2_fixmap_pgt>
      	0xffffffff81005000 <level1_fixmap_pgt>
      	0xffffffff81006000 <level2_ident_pgt>
      	0xffffffff81007000 <level2_kernel_pgt>
      	0xffffffff81008000 <level2_spare_pgt>
      
      Before patch, /proc/kcore shows outdated contents for the above page
      table pages, for example:
      
      	(gdb) p level3_ident_pgt
      	$1 = {<text variable, no debug info>} 0xffffffff81002000 <level3_ident_pgt>
      	(gdb) p/x *((pud_t *)&level3_ident_pgt)@512
      	$2 = {{pud = 0x1006063}, {pud = 0x0} <repeats 511 times>}
      
      while the real content is:
      
      	root@hp /home/wfg# hexdump -s 0x1002000 -n 4096 /dev/mem
      	1002000 6063 0100 0000 0000 8067 0000 0000 0000
      	1002010 0000 0000 0000 0000 0000 0000 0000 0000
      	*
      	1003000
      
      That is, on an x86_64 box with 2GB of memory, we see a first-1GB identity
      mapping before the patch and the full-2GB identity mapping after it:
      
      	(gdb) p/x *((pud_t *)&level3_ident_pgt)@512
      before  $1 = {{pud = 0x1006063}, {pud = 0x0} <repeats 511 times>}
      after   $1 = {{pud = 0x1006063}, {pud = 0x8067}, {pud = 0x0} <repeats 510 times>}
      
      Obviously the content before patch is wrong.
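
The before/after dumps can be read mechanically: on x86_64 each PUD entry covers 1GB, the low 12 bits of an entry are flags, and bit 0 marks it present. The sketch below decodes the values shown above under those assumptions; the macro and function names are hypothetical, and the address mask assumes physical address bits 12 through 51.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_PRESENT 0x1ULL
#define TOY_ADDR_MASK    0x000ffffffffff000ULL  /* assumed: bits 12..51 */

/* Physical address of the next-level table referenced by a PUD entry. */
static uint64_t pud_phys_addr(uint64_t pud)
{
    return pud & TOY_ADDR_MASK;
}

/* Count present entries; each one maps 1GB of the identity mapping. */
static size_t present_gib(const uint64_t *puds, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (puds[i] & TOY_PAGE_PRESENT)
            count++;
    return count;
}
```

Applied to the two dumps, the "before" table has one present entry and the "after" table has two, matching the 1GB versus 2GB reading given in the message.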
      Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      36e15263