1. 06 Jan, 2009 1 commit
  2. 05 Nov, 2008 1 commit
  3. 04 Jan, 2009 1 commit
  4. 17 Dec, 2008 1 commit
  5. 26 Nov, 2008 1 commit
    • jbd2: improve jbd2 fsync batching · e07f7183
      Josef Bacik committed
      This patch removes the static sleep time in favor of a more self
      optimizing approach where we measure the average amount of time it
      takes to commit a transaction to disk and the amount of time a
      transaction has been running.  If somebody does a sync write or an
      fsync() traditionally we would sleep for 1 jiffy, which depending on
      the value of HZ could be a significant amount of time compared to how
      long it takes to commit a transaction to the underlying storage.  With
      this patch instead of sleeping for a jiffy, we check to see if the
      amount of time this transaction has been running is less than the
      average commit time, and if it is we sleep for the delta using
      schedule_hrtimeout to give us a higher precision sleep time.  This
      greatly benefits high end storage where you could end up sleeping for
      longer than it takes to commit the transaction and therefore sitting
      idle instead of allowing the transaction to be committed by keeping
      the sleep time to a minimum so you are sure to always be doing
      something.
      Signed-off-by: Josef Bacik <jbacik@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
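The batching rule described in this commit can be sketched in user-space C as follows. This is a minimal illustration, not the jbd2 code: the function names `batch_sleep_ns` and `update_avg_commit_ns` are hypothetical, and the 3:1 weighting of the running average is an assumption chosen only to show one way of tracking an average commit time.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how long a sync writer should wait before the
 * commit is forced, in nanoseconds. If the transaction has already been
 * running at least as long as an average commit takes, do not sleep.
 */
uint64_t batch_sleep_ns(uint64_t running_ns, uint64_t avg_commit_ns)
{
    if (running_ns < avg_commit_ns)
        return avg_commit_ns - running_ns;  /* sleep only for the delta */
    return 0;
}

/*
 * Hypothetical running average of commit times. The 3:1 weighting is an
 * assumption for illustration; the first sample seeds the average.
 */
uint64_t update_avg_commit_ns(uint64_t avg_ns, uint64_t sample_ns)
{
    if (avg_ns == 0)
        return sample_ns;
    return (sample_ns + 3 * avg_ns) / 4;
}
```

In the kernel, the computed delta would be handed to a high-resolution sleep such as schedule_hrtimeout, as the commit message states; here the delta is simply returned to the caller.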
  6. 06 Jan, 2009 3 commits
    • ext4: Don't overwrite allocation_context ac_status · 032115fc
      Aneesh Kumar K.V committed
      We can call ext4_mb_check_limits even after successfully allocating
      the requested blocks.  In that case, make sure we don't overwrite
      ac_status if it already has the status AC_STATUS_FOUND.  This fixes
      the lockdep warning:
      
      =============================================
      [ INFO: possible recursive locking detected ]
      2.6.28-rc6-autokern1 #1
      ---------------------------------------------
      fsstress/11948 is trying to acquire lock:
       (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278
      .....
      
      stack backtrace:
      .....
       [<c04db974>] ext4_mb_regular_allocator+0xbb5/0xd44
      .....
      
      but task is already holding lock:
       (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
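The guard this commit describes can be sketched as below. Only AC_STATUS_FOUND is named in the commit message; the other enum values, the struct layout, and the function name `check_limits_sketch` are assumptions made for illustration and are not the mballoc code.

```c
/* Status values; only AC_STATUS_FOUND comes from the commit message. */
enum ac_status {
    AC_STATUS_CONTINUE = 1,  /* assumed: keep scanning */
    AC_STATUS_FOUND    = 2,  /* blocks already allocated */
    AC_STATUS_BREAK    = 3   /* assumed: stop scanning */
};

/* Hypothetical, simplified allocation context. */
struct alloc_ctx {
    enum ac_status status;
    int found;        /* extents examined so far */
    int max_to_scan;  /* scan limit */
};

/* Apply the scan limit, but never downgrade a successful allocation. */
void check_limits_sketch(struct alloc_ctx *ac)
{
    if (ac->status == AC_STATUS_FOUND)
        return;  /* keep the FOUND status untouched */
    if (ac->found > ac->max_to_scan)
        ac->status = AC_STATUS_BREAK;
}
```

Without the early return, a late limit check could replace FOUND with BREAK and send the allocator down a path that takes the group lock again.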
    • ext4: remove extraneous newlines from calls to ext4_error() and ext4_warning() · fde4d95a
      Theodore Ts'o committed
      This removes annoying blank syslog entries emitted by ext4_error() or
      ext4_warning(), since these functions add their own newline.
      Signed-off-by: Nick Warne <nick@ukfsn.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • jbd2: Add barrier not supported test to journal_wait_on_commit_record · fd98496f
      Theodore Ts'o committed
      Xen doesn't report that barriers are not supported until buffer I/O is
      reported as completed, instead of when the buffer I/O is submitted.
      Add a check and a fallback codepath to journal_wait_on_commit_record()
      to detect this case, so that attempts to mount ext4 filesystems on
      LVM/devicemapper devices on Xen guests don't blow up with an "Aborting
      journal on device XXX"; "Remounting filesystem read-only" error.
      
      Thanks to Andreas Sundstrom for reporting this issue.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
  7. 07 Jan, 2009 1 commit
    • ext4: Allow ext4 to run without a journal · 0390131b
      Frank Mayhar committed
      A few weeks ago I posted a patch for discussion that allowed ext4 to run
      without a journal.  Since that time I've integrated the excellent
      comments from Andreas and fixed several serious bugs.  We're currently
      running with this patch and generating some performance numbers against
      both ext2 (with backported reservations code) and ext4 with and without
      a journal.  It just so happens that running without a journal is
      slightly faster for most everything.
      
      We did
      	iozone -T -t 4 -s 2g -r 256k -T -I -i0 -i1 -i2
      
      which creates 4 threads, each of which creates and does reads and writes on
      a 2G file, with a buffer size of 256K, using O_DIRECT for all file opens
      to bypass the page cache.  Results:
      
                           ext2        ext4, default   ext4, no journal
        initial writes   13.0 MB/s        15.4 MB/s          15.7 MB/s
        rewrites         13.1 MB/s        15.6 MB/s          15.9 MB/s
        reads            15.2 MB/s        16.9 MB/s          17.2 MB/s
        re-reads         15.3 MB/s        16.9 MB/s          17.2 MB/s
        random readers    5.6 MB/s         5.6 MB/s           5.7 MB/s
        random writers    5.1 MB/s         5.3 MB/s           5.4 MB/s 
      
      So it seems that, so far, this was a useful exercise.
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  8. 17 Dec, 2008 1 commit
  9. 27 Nov, 2008 1 commit
  10. 26 Nov, 2008 2 commits
  11. 06 Jan, 2009 2 commits
  12. 05 Nov, 2008 1 commit
    • ext4: tone down ext4_da_writepages warnings · 2a21e37e
      Theodore Ts'o committed
      If the filesystem has errors, ext4_da_writepages() will return a *lot*
      of errors, including lots and lots of stack dumps.  While it's true
      that we are dropping user data on the floor, which is unfortunate, the
      stack dumps aren't helpful, and they tend to obscure the true original
      root cause of the problem.  So in the case where the filesystem has
      aborted, return an EROFS right away.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
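The early return described in this commit amounts to a check like the one below. This is a user-space sketch under stated assumptions: `struct fs_state`, its `aborted` field, and `writepages_sketch` are hypothetical stand-ins, not the ext4 data structures or the real ext4_da_writepages() signature.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the filesystem's "aborted" state. */
struct fs_state {
    bool aborted;
};

/*
 * Return -EROFS immediately when the filesystem has aborted, so that no
 * further work (and no further warnings or stack dumps) is generated.
 * Otherwise pretend that all requested pages were written.
 */
int writepages_sketch(const struct fs_state *fs, int nr_pages)
{
    if (fs->aborted)
        return -EROFS;
    return nr_pages;
}
```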
  13. 13 Dec, 2008 1 commit
    • ext4: remove do_blk_alloc() · 97df5d15
      Theodore Ts'o committed
      The convenience function do_blk_alloc() is a static function with only
      one caller, so fold it into ext4_new_meta_blocks() to simplify the
      code and to make it easier to understand.
      
      To save more stack space, if count is a null pointer in
      ext4_new_meta_blocks() assume that caller wanted a single block (and
      if there is an error, no blocks were allocated).
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
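The "null count means one block" convention from this commit can be sketched with a toy allocator. Everything here is hypothetical: the pool variables, the function name `alloc_meta_blocks_sketch`, and the choice of -ENOSPC as the error are assumptions for illustration and do not reflect the real ext4_new_meta_blocks() signature.

```c
#include <errno.h>
#include <stddef.h>

/* Toy block pool, for illustration only. */
static unsigned long free_blocks = 8;
static unsigned long next_block = 100;

/*
 * Allocate *count contiguous blocks, or exactly one block when count is
 * NULL. On error nothing is allocated, 0 is returned, and *count (if
 * provided) is set to 0.
 */
unsigned long alloc_meta_blocks_sketch(unsigned long *count, int *errp)
{
    unsigned long want = count ? *count : 1;
    unsigned long first;

    if (want == 0 || want > free_blocks) {
        *errp = -ENOSPC;
        if (count)
            *count = 0;
        return 0;
    }
    first = next_block;
    next_block += want;
    free_blocks -= want;
    *errp = 0;
    return first;
}
```

A caller that only ever needs one block can pass NULL and skip declaring a count variable, which is where the stack saving mentioned in the commit message comes from.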
  14. 08 Dec, 2008 1 commit
    • ext4: remove ext4_new_meta_block() · cfe82c85
      Theodore Ts'o committed
      There were only two callers of the function ext4_new_meta_block(),
      which is just a very simple wrapper function around
      ext4_new_meta_blocks().  Change those two functions to call
      ext4_new_meta_blocks() directly, to save code and stack space usage.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  15. 02 Jan, 2009 1 commit
  16. 07 Dec, 2008 1 commit
  17. 29 Oct, 2008 2 commits
  18. 30 Oct, 2008 1 commit
  19. 05 Jan, 2009 3 commits
    • fs: symlink write_begin allocation context fix · 54566b2c
      Nick Piggin committed
      With the write_begin/write_end aops, page_symlink was broken because it
      could no longer pass a GFP_NOFS type mask into the point where the
      allocations happened.  They are done in write_begin, which would always
      assume that the filesystem can be entered from reclaim.  This bug could
      cause filesystem deadlocks.
      
      The funny thing with having a gfp_t mask there is that it doesn't really
      allow the caller to arbitrarily tinker with the context in which it can be
      called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
      take the page lock.  The only thing any callers care about is __GFP_FS
      anyway, so turn that into a single flag.
      
      Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
      this flag in their write_begin function.  Change __grab_cache_page to
      accept a nofs argument as well, to honour that flag (while we're there,
      change the name to grab_cache_page_write_begin which is more instructive
      and does away with random leading underscores).
      
      This is really a more flexible way to go in the end anyway -- if a
      filesystem happens to want any extra allocations aside from the pagecache
      ones in its write_begin function, it may now use GFP_KERNEL (rather than
      GFP_NOFS) for common case allocations (eg.  ocfs2_alloc_write_ctxt, for a
      random example).
      
      [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
      [kosaki.motohiro@jp.fujitsu.com: fix fuse]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@kernel.org>		[2.6.28.x]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Cleaned up the calling convention: just pass in the AOP flags
        untouched to the grab_cache_page_write_begin() function.  That
        just simplifies everybody, and may even allow future expansion of the
        logic.   - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
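How a write_begin path might act on the new flag can be sketched as a mask computation. The constants below are illustrative values with an SK_ prefix, not the kernel's definitions of AOP_FLAG_NOFS, __GFP_FS, or GFP_KERNEL, and `gfp_for_write_begin` is a hypothetical helper name.

```c
/* Illustrative bit values only; the kernel's real values differ. */
#define SK_AOP_FLAG_NOFS  0x0004u
#define SK_GFP_WAIT       0x10u
#define SK_GFP_IO         0x40u
#define SK_GFP_FS         0x80u
#define SK_GFP_KERNEL     (SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS)

/*
 * Derive the allocation mask for a page cache allocation from the AOP
 * flags: when the caller asks for NOFS, clear the "may enter the
 * filesystem" bit so that reclaim cannot recurse into the filesystem.
 */
unsigned int gfp_for_write_begin(unsigned int base_gfp, unsigned int aop_flags)
{
    if (aop_flags & SK_AOP_FLAG_NOFS)
        return base_gfp & ~SK_GFP_FS;
    return base_gfp;
}
```

Passing the AOP flags through untouched, as the bracketed note from Linus describes, keeps this decision in one place instead of making every caller build its own mask.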
    • fs: introduce bgl_lock_ptr() · c644f0e4
      Pekka Enberg committed
      As suggested by Andreas Dilger, introduce a bgl_lock_ptr() helper in
      <linux/blockgroup_lock.h> and add separate sb_bgl_lock() helpers to
      filesystem specific header files to break the hidden dependency to
      struct ext[234]_sb_info.
      
      Also, while at it, convert the macros to static inlines to try to make up
      for all the times I broke Andrew Morton's tree.
      Acked-by: Andreas Dilger <adilger@sun.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sanitize audit_fd_pair() · 157cf649
      Al Viro committed
      * no allocations
      * return void
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  20. 04 Jan, 2009 3 commits
  21. 03 Jan, 2009 10 commits
    • CRED: Wrap task credential accesses in the devpts filesystem · d0eafc7d
      David Howells committed
      Wrap access to task credentials so that they can be separated more easily from
      the task_struct during the introduction of COW creds.
      
      Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().
      
      Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
      sense to use RCU directly rather than a convenient wrapper; these will be
      addressed by later patches.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • devpts: fix unused function warning · 8c056e5b
      Andrew Morton committed
      fs/devpts/inode.c:324: warning: 'compare_init_pts_sb' defined but not used
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • devpts: Coding style clean up · 835aa440
      Alan Cox committed
      Just nail the oddments now while this code is being touched
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Enable multiple instances of devpts · 2a1b2dc0
      Sukadev Bhattiprolu committed
      To support containers, allow multiple instances of devpts filesystem, such
      that indices of ptys allocated in one instance are independent of ptys
      allocated in other instances of devpts.
      
      But to preserve backward compatibility, enable this support for multiple
      instances only if:
      
      	- CONFIG_DEVPTS_MULTIPLE_INSTANCES is set to Y, and
      	- '-o newinstance' mount option is specified while mounting devpts
      
      To use multi-instance mount, a container startup script could:
      
      	$ ns_exec -cm /bin/bash
      	$ umount /dev/pts
      	$ mount -t devpts -o newinstance lxcpts /dev/pts
      	$ mount -o bind /dev/pts/ptmx /dev/ptmx
      	$ /usr/sbin/sshd -p 1234
      
      where 'ns_exec -cm /bin/bash' calls clone() with the CLONE_NEWNS flag and execs
      /bin/bash in the child process. A pty created by the sshd is not visible in
      the original mount of /dev/pts.
      
      USER-SPACE-IMPACT:
      	- See Documentation/fs/devpts.txt (included in next patch) for user-
      	  space impact in multi-instance and mixed-mode operation.
      TODO:
      	- Update mount(8), pts(4) man pages. Highlight impact of not
      	  redirecting /dev/ptmx to /dev/pts/ptmx after a multi-instance mount.
      
      Changelog[v6]:
      	- [Dave Hansen] Use new get_init_pts_sb() interface
      	- [Serge Hallyn] Don't bother displaying 'newinstance' in show_options
      	- [Serge Hallyn] Use macros (PARSE_REMOUNT/PARSE_MOUNT) instead of 0/1.
      	- [Serge Hallyn] Check error return from get_sb_single() (now
      	  get_init_pts_sb())
      	- devpts_pty_kill(): don't dput error dentries
      
      Changelog[v5]:
      	- Move get_sb_ref() definition to earlier patch
      	- Move usage info to Documentation/filesystems/devpts.txt (next patch)
      	- Make ptmx node even in init_pts_ns, now that default mode is 0000
      	  (defined in earlier patch, enabled here).
      	- Cache ptmx dentry and use to update mode during remount
      	  (defined in earlier patch, enabled here).
      	- Bugfix: explicitly ignore newinstance on remount (if newinstance was
      	  specified on remount of initial mount, it would be ignored but
      	  /proc/mounts would imply that the option was set)
      
      Changelog[v4]:
      
      	- Update patch description to address H. Peter Anvin's comments
      	- Consolidate multi-instance mode code under new config token,
      	  CONFIG_DEVPTS_MULTIPLE_INSTANCE.
      	- Move usage-details from patch description to
      	  Documentation/fs/devpts.txt
      
      Changelog[v3]:
      	- Rename new mount option to 'newinstance'
      	- Create ptmx nodes only in 'newinstance' mounts
      	- Bugfix: parse_mount_options() modifies @data but since we need to
      	  parse the @data twice (once in devpts_get_sb() and once during
      	  do_remount_sb()), parse a local copy of @data in devpts_get_sb().
      	  (restructured code in devpts_get_sb() to fix this)
      
      Changelog[v2]:
      	- Support both single-mount and multiple-mount semantics and
      	  provide '-onewmnt' option to select the semantics.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Define get_init_pts_sb() · d4076ac5
      Sukadev Bhattiprolu committed
      See comments in the function header for details. The new interface will
      be used in a follow-on patch.
      
      Changelog [v2]:
      	[Dave Hansen] Replace get_sb_ref() in fs/super.c with get_init_pts_sb()
      	and make the new interface private to devpts
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Define mknod_ptmx() · 1f8f1e29
      Sukadev Bhattiprolu committed
      /dev/ptmx is closely tied to the devpts filesystem. An open of /dev/ptmx,
      allocates the next pty index and the associated device shows up in the
      devpts fs as /dev/pts/n.
      
      With multiple instances of devpts filesystem, during an open of /dev/ptmx
      we would be unable to determine which instance of the devpts is being
      accessed.
      
      So we move the 'ptmx' node into /dev/pts and use the inode of the 'ptmx'
      node to identify the superblock and hence the devpts instance.  This patch
      adds ability for the kernel to internally create the [ptmx, c, 5:2] device
      when mounting devpts filesystem.  Since the ptmx node in devpts is new and
      may surprise some userspace scripts, the default permissions for the new
      node is 0000.  These permissions can be changed either using chmod or by
      remounting with the new '-o ptmxmode=0666' mount option.
      
      Changelog[v5]:
      	- [Serge Hallyn bugfix]: Letting new_inode() assign inode number to
      	  ptmx can collide with hand-assigning inode numbers to ptys. So,
      	  hand-assign specific inode number to ptmx node also.
      	- [Serge Hallyn]: Maybe safer to grab root dentry mutex while creating
      	  ptmx node
      	- [Bugfix with Serge Hallyn] Replace lookup_one_len() in mknod_ptmx()
      	  with d_alloc_name() (lookup during ->get_sb() locks up system). To
      	  simplify patchset, fold the ptmx_dentry patch into this.
      
      Changelog[v4]:
      	- Change default permissions of pts/ptmx node to 0000.
      	- Move code for ptmxmode under #ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES.
      
      Changelog[v3]:
      	- Rename ptmx_mode to ptmxmode (for consistency with 'newinstance')
      
      Changelog[v2]:
      	- [H. Peter Anvin] Remove mknod() system call support and create the
      	  ptmx node internally.
      
      Changelog[v1]:
      	- Earlier version of this patch enabled creating /dev/pts/tty as
      	  well. As pointed out by Al Viro and H. Peter Anvin, that is not
      	  really necessary.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Extract option parsing to new function · 53af8ee4
      Sukadev Bhattiprolu committed
      Move code to parse mount options into a separate function so it can
      (later) be shared between mount and remount operations.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Per-mount 'config' object · 31af0abb
      Sukadev Bhattiprolu committed
      With support for multiple mounts of devpts, the 'config' structure really
      represents per-mount options rather than config parameters. Rename 'config'
      structure to 'pts_mount_opts' and store it in the super-block.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Per-mount allocated_ptys · e76b7c01
      Sukadev Bhattiprolu committed
      To enable multiple mounts of devpts, 'allocated_ptys' must be a per-mount
      variable rather than a global variable.  Move 'allocated_ptys' into the
      super_block's s_fs_info.
      
      Changelog[v2]:
      	Define and use DEVPTS_SB() wrapper.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
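Moving a global into per-mount state reached through s_fs_info can be sketched as follows. The `_sk` types and functions are hypothetical, and a plain counter stands in for the real pty index allocator, which this sketch does not attempt to reproduce; only the names allocated_ptys, s_fs_info, and DEVPTS_SB() come from the commit message.

```c
/* Simplified stand-in for the kernel's super_block. */
struct super_block_sk {
    void *s_fs_info;  /* filesystem-private, per-mount data */
};

/* Per-mount devpts state; a counter replaces the real index allocator. */
struct pts_fs_info_sk {
    int allocated_ptys;
};

/* Wrapper in the spirit of DEVPTS_SB(): fetch the per-mount state. */
static inline struct pts_fs_info_sk *devpts_sb_sk(struct super_block_sk *sb)
{
    return (struct pts_fs_info_sk *)sb->s_fs_info;
}

/* Hand out the next pty index for this mount only. */
int pts_alloc_index_sk(struct super_block_sk *sb)
{
    return devpts_sb_sk(sb)->allocated_ptys++;
}
```

Because each mount carries its own state, two instances can both hand out index 0, which is the independence the multi-instance patches in this series depend on.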
    • Remove devpts_root global · 59e55e6c
      Sukadev Bhattiprolu committed
      Remove the 'devpts_root' global variable and find the root dentry using
      the super_block. The super-block can be found from the device inode, using
      the new wrapper, pts_sb_from_inode().
      
      Changelog: This patch is based on an earlier patchset from Serge Hallyn
      	   and Matt Helsley.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 01 Jan, 2009 1 commit