1. 01 1月, 2010 2 次提交
    • T
      ext4: Calculate metadata requirements more accurately · 9d0be502
      Theodore Ts'o 提交于
      In the past, ext4_calc_metadata_amount(), and its sub-functions
      ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
      badly over-estimated the number of metadata blocks that might be
      required for delayed allocation blocks.  This didn't matter as much
      when functions which managed the reserved metadata blocks were more
      aggressive about dropping reserved metadata blocks as delayed
      allocation blocks were written, but unfortunately they were too
      aggressive.  This was fixed in commit 0637c6f4, but as a result the
      over-estimation by ext4_calc_metadata_amount() would lead to reserving
      2-3 times the number of pending delayed allocation blocks as
      potentially required metadata blocks.  So if there are 1 megabytes of
      blocks which have been not yet been allocation, up to 3 megabytes of
      space would get reserved out of the user's quota and from the file
      system free space pool until all of the inode's data blocks have been
      allocated.
      
      This commit addresses this problem by much more accurately estimating
      the number of metadata blocks that will be required.  It will still
      somewhat over-estimate the number of blocks needed, since it must make
      a worst case estimate not knowing which physical blocks will be
      needed, but it is much more accurate than before.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      9d0be502
    • T
      ext4: Fix accounting of reserved metadata blocks · ee5f4d9c
      Theodore Ts'o 提交于
      Commit 0637c6f4 had a typo which caused the reserved metadata blocks to
      not be released correctly.   Fix this.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      ee5f4d9c
  2. 31 12月, 2009 1 次提交
    • T
      ext4: Patch up how we claim metadata blocks for quota purposes · 0637c6f4
      Theodore Ts'o 提交于
      As reported in Kernel Bugzilla #14936, commit d21cd8f1 triggered a BUG
      in the function ext4_da_update_reserve_space() found in
      fs/ext4/inode.c.  The root cause of this BUG() was caused by the fact
      that ext4_calc_metadata_amount() can severely over-estimate how many
      metadata blocks will be needed, especially when using direct
      block-mapped files.
      
      In addition, it can also badly *under* estimate how much space is
      needed, since ext4_calc_metadata_amount() assumes that the blocks are
      contiguous, and this is not always true.  If the application is
      writing blocks to a sparse file, the number of metadata blocks
      necessary can be severly underestimated by the functions
      ext4_da_reserve_space(), ext4_da_update_reserve_space() and
      ext4_da_release_space().  This was the cause of the dq_claim_space
      reports found on kerneloops.org.
      
      Unfortunately, doing this right means that we need to massively
      over-estimate the amount of free space needed.  So in some cases we
      may need to force the inode to be written to disk asynchronously in
      to avoid spurious quota failures.
      
      http://bugzilla.kernel.org/show_bug.cgi?id=14936Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      0637c6f4
  3. 30 12月, 2009 1 次提交
  4. 26 12月, 2009 1 次提交
  5. 23 12月, 2009 6 次提交
    • A
      jbd2: don't use __GFP_NOFAIL in journal_init_common() · 3ebfdf88
      Andrew Morton 提交于
      It triggers the warning in get_page_from_freelist(), and it isn't
      appropriate to use __GFP_NOFAIL here anyway.
      
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14843Reported-by: NChristian Casteyde <casteyde.christian@free.fr>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      3ebfdf88
    • E
      ext4: flush delalloc blocks when space is low · c8afb446
      Eric Sandeen 提交于
      Creating many small files in rapid succession on a small
      filesystem can lead to spurious ENOSPC; on a 104MB filesystem:
      
      for i in `seq 1 22500`; do
          echo -n > $SCRATCH_MNT/$i
          echo XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX > $SCRATCH_MNT/$i
      done
      
      leads to ENOSPC even though after a sync, 40% of the fs is free
      again.
      
      This is because we reserve worst-case metadata for delalloc writes,
      and when data is allocated that worst-case reservation is not
      usually needed.
      
      When freespace is low, kicking off an async writeback will start
      converting that worst-case space usage into something more realistic,
      almost always freeing up space to continue.
      
      This resolves the testcase for me, and survives all 4 generic
      ENOSPC tests in xfstests.
      
      We'll still need a hard synchronous sync to squeeze out the last bit,
      but this fixes things up to a large degree.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      c8afb446
    • E
      fs-writeback: Add helper function to start writeback if idle · 17bd55d0
      Eric Sandeen 提交于
      ext4, at least, would like to start pushing on writeback if it starts
      to get close to ENOSPC when reserving worst-case blocks for delalloc
      writes.  Writing out delalloc data will convert those worst-case
      predictions into usually smaller actual usage, freeing up space
      before we hit ENOSPC based on this speculation.
      
      Thanks to Jens for the suggestion for the helper function,
      & the naming help.
      
      I've made the helper return status on whether writeback was
      started even though I don't plan to use it in the ext4 patch;
      it seems like it would be potentially useful to test this
      in some cases.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Acked-by: NJan Kara <jack@suse.cz>
      17bd55d0
    • J
      ext4: Eliminate potential double free on error path · d3533d72
      Julia Lawall 提交于
      b_entry_name and buffer are initially NULL, are initialized within a loop
      to the result of calling kmalloc, and are freed at the bottom of this loop.
      The loop contains gotos to cleanup, which also frees b_entry_name and
      buffer.  Some of these gotos are before the reinitializations of
      b_entry_name and buffer.  To maintain the invariant that b_entry_name and
      buffer are NULL at the top of the loop, and thus acceptable arguments to
      kfree, these variables are now set to NULL after the kfrees.
      
      This seems to be the simplest solution.  A more complicated solution
      would be to introduce more labels in the error handling code at the end of
      the function.
      
      A simplified version of the semantic match that finds this problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @r@
      identifier E;
      expression E1;
      iterator I;
      statement S;
      @@
      
      *kfree(E);
      ... when != E = E1
          when != I(E,...) S
          when != &E
      *kfree(E);
      // </smpl>
      Signed-off-by: NJulia Lawall <julia@diku.dk>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      d3533d72
    • A
      ext4: fix unsigned long long printk warning in super.c · a6b43e38
      Andrew Morton 提交于
      sparc64 allmodconfig:
      
      fs/ext4/super.c: In function `lifetime_write_kbytes_show':
      fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)
      fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      a6b43e38
    • T
      ext4, jbd2: Add barriers for file systems with exernal journals · cc3e1bea
      Theodore Ts'o 提交于
      This is a bit complicated because we are trying to optimize when we
      send barriers to the fs data disk.  We could just throw in an extra
      barrier to the data disk whenever we send a barrier to the journal
      disk, but that's not always strictly necessary.
      
      We only need to send a barrier during a commit when there are data
      blocks which are must be written out due to an inode written in
      ordered mode, or if fsync() depends on the commit to force data blocks
      to disk.  Finally, before we drop transactions from the beginning of
      the journal during a checkpoint operation, we need to guarantee that
      any blocks that were flushed out to the data disk are firmly on the
      rust platter before we drop the transaction from the journal.
      
      Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      cc3e1bea
  6. 14 12月, 2009 1 次提交
  7. 21 12月, 2009 2 次提交
  8. 14 12月, 2009 1 次提交
  9. 24 12月, 2009 5 次提交
  10. 23 12月, 2009 20 次提交
    • J
      quota: Improve checking of quota file header · 869835df
      Jan Kara 提交于
      When we are asked for vfsv0 quota format and the file is in vfsv1
      format (or vice versa), refuse to use the quota file. Also return
      with error when we don't like the header of quota file.
      Signed-off-by: NJan Kara <jack@suse.cz>
      869835df
    • Y
      jbd: jbd-debug and jbd2-debug should be writable · 765f8361
      Yin Kangkai 提交于
      jbd-debug and jbd2-debug is currently read-only (S_IRUGO), which is not
      correct. Make it writable so that we can start debuging.
      Signed-off-by: NYin Kangkai <kangkai.yin@intel.com>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      765f8361
    • D
      ext4: fix sleep inside spinlock issue with quota and dealloc (#14739) · 39bc680a
      Dmitry Monakhov 提交于
      Unlock i_block_reservation_lock before vfs_dq_reserve_block().
      This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=14739
      
      CC: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      39bc680a
    • D
      ext4: Fix potential quota deadlock · d21cd8f1
      Dmitry Monakhov 提交于
      We have to delay vfs_dq_claim_space() until allocation context destruction.
      Currently we have following call-trace:
      ext4_mb_new_blocks()
        /* task is already holding ac->alloc_semp */
       ->ext4_mb_mark_diskspace_used
          ->vfs_dq_claim_space()  /*  acquire dqptr_sem here. Possible deadlock */
       ->ext4_mb_release_context() /* drop ac->alloc_semp here */
      
      Let's move quota claiming to ext4_da_update_reserve_space()
      
       =======================================================
       [ INFO: possible circular locking dependency detected ]
       2.6.32-rc7 #18
       -------------------------------------------------------
       write-truncate-/3465 is trying to acquire lock:
        (&s->s_dquot.dqptr_sem){++++..}, at: [<c025e73b>] dquot_claim_space+0x3b/0x1b0
      
       but task is already holding lock:
        (&meta_group_info[i]->alloc_sem){++++..}, at: [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #3 (&meta_group_info[i]->alloc_sem){++++..}:
              [<c017d04b>] __lock_acquire+0xd7b/0x1260
              [<c017d5ea>] lock_acquire+0xba/0xd0
              [<c0527191>] down_read+0x51/0x90
              [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370
              [<c02d0c1c>] ext4_mb_free_blocks+0x46c/0x870
              [<c029c9d3>] ext4_free_blocks+0x73/0x130
              [<c02c8cfc>] ext4_ext_truncate+0x76c/0x8d0
              [<c02a8087>] ext4_truncate+0x187/0x5e0
              [<c01e0f7b>] vmtruncate+0x6b/0x70
              [<c022ec02>] inode_setattr+0x62/0x190
              [<c02a2d7a>] ext4_setattr+0x25a/0x370
              [<c022ee81>] notify_change+0x151/0x340
              [<c021349d>] do_truncate+0x6d/0xa0
              [<c0221034>] may_open+0x1d4/0x200
              [<c022412b>] do_filp_open+0x1eb/0x910
              [<c021244d>] do_sys_open+0x6d/0x140
              [<c021258e>] sys_open+0x2e/0x40
              [<c0103100>] sysenter_do_call+0x12/0x32
      
       -> #2 (&ei->i_data_sem){++++..}:
              [<c017d04b>] __lock_acquire+0xd7b/0x1260
              [<c017d5ea>] lock_acquire+0xba/0xd0
              [<c0527191>] down_read+0x51/0x90
              [<c02a5787>] ext4_get_blocks+0x47/0x450
              [<c02a74c1>] ext4_getblk+0x61/0x1d0
              [<c02a7a7f>] ext4_bread+0x1f/0xa0
              [<c02bcddc>] ext4_quota_write+0x12c/0x310
              [<c0262d23>] qtree_write_dquot+0x93/0x120
              [<c0261708>] v2_write_dquot+0x28/0x30
              [<c025d3fb>] dquot_commit+0xab/0xf0
              [<c02be977>] ext4_write_dquot+0x77/0x90
              [<c02be9bf>] ext4_mark_dquot_dirty+0x2f/0x50
              [<c025e321>] dquot_alloc_inode+0x101/0x180
              [<c029fec2>] ext4_new_inode+0x602/0xf00
              [<c02ad789>] ext4_create+0x89/0x150
              [<c0221ff2>] vfs_create+0xa2/0xc0
              [<c02246e7>] do_filp_open+0x7a7/0x910
              [<c021244d>] do_sys_open+0x6d/0x140
              [<c021258e>] sys_open+0x2e/0x40
              [<c0103100>] sysenter_do_call+0x12/0x32
      
       -> #1 (&sb->s_type->i_mutex_key#7/4){+.+...}:
              [<c017d04b>] __lock_acquire+0xd7b/0x1260
              [<c017d5ea>] lock_acquire+0xba/0xd0
              [<c0526505>] mutex_lock_nested+0x65/0x2d0
              [<c0260c9d>] vfs_load_quota_inode+0x4bd/0x5a0
              [<c02610af>] vfs_quota_on_path+0x5f/0x70
              [<c02bc812>] ext4_quota_on+0x112/0x190
              [<c026345a>] sys_quotactl+0x44a/0x8a0
              [<c0103100>] sysenter_do_call+0x12/0x32
      
       -> #0 (&s->s_dquot.dqptr_sem){++++..}:
              [<c017d361>] __lock_acquire+0x1091/0x1260
              [<c017d5ea>] lock_acquire+0xba/0xd0
              [<c0527191>] down_read+0x51/0x90
              [<c025e73b>] dquot_claim_space+0x3b/0x1b0
              [<c02cb95f>] ext4_mb_mark_diskspace_used+0x36f/0x380
              [<c02d210a>] ext4_mb_new_blocks+0x34a/0x530
              [<c02c83fb>] ext4_ext_get_blocks+0x122b/0x13c0
              [<c02a5966>] ext4_get_blocks+0x226/0x450
              [<c02a5ff3>] mpage_da_map_blocks+0xc3/0xaa0
              [<c02a6ed6>] ext4_da_writepages+0x506/0x790
              [<c01de272>] do_writepages+0x22/0x50
              [<c01d766d>] __filemap_fdatawrite_range+0x6d/0x80
              [<c01d7b9b>] filemap_flush+0x2b/0x30
              [<c02a40ac>] ext4_alloc_da_blocks+0x5c/0x60
              [<c029e595>] ext4_release_file+0x75/0xb0
              [<c0216b59>] __fput+0xf9/0x210
              [<c0216c97>] fput+0x27/0x30
              [<c02122dc>] filp_close+0x4c/0x80
              [<c014510e>] put_files_struct+0x6e/0xd0
              [<c01451b7>] exit_files+0x47/0x60
              [<c0146a24>] do_exit+0x144/0x710
              [<c0147028>] do_group_exit+0x38/0xa0
              [<c0159abc>] get_signal_to_deliver+0x2ac/0x410
              [<c0102849>] do_notify_resume+0xb9/0x890
              [<c01032d2>] work_notifysig+0x13/0x21
      
       other info that might help us debug this:
      
       3 locks held by write-truncate-/3465:
        #0:  (jbd2_handle){+.+...}, at: [<c02e1f8f>] start_this_handle+0x38f/0x5c0
        #1:  (&ei->i_data_sem){++++..}, at: [<c02a57f6>] ext4_get_blocks+0xb6/0x450
        #2:  (&meta_group_info[i]->alloc_sem){++++..}, at: [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370
      
       stack backtrace:
       Pid: 3465, comm: write-truncate- Not tainted 2.6.32-rc7 #18
       Call Trace:
        [<c0524cb3>] ? printk+0x1d/0x22
        [<c017ac9a>] print_circular_bug+0xca/0xd0
        [<c017d361>] __lock_acquire+0x1091/0x1260
        [<c016bca2>] ? sched_clock_local+0xd2/0x170
        [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
        [<c017d5ea>] lock_acquire+0xba/0xd0
        [<c025e73b>] ? dquot_claim_space+0x3b/0x1b0
        [<c0527191>] down_read+0x51/0x90
        [<c025e73b>] ? dquot_claim_space+0x3b/0x1b0
        [<c025e73b>] dquot_claim_space+0x3b/0x1b0
        [<c02cb95f>] ext4_mb_mark_diskspace_used+0x36f/0x380
        [<c02d210a>] ext4_mb_new_blocks+0x34a/0x530
        [<c02c601d>] ? ext4_ext_find_extent+0x25d/0x280
        [<c02c83fb>] ext4_ext_get_blocks+0x122b/0x13c0
        [<c016bca2>] ? sched_clock_local+0xd2/0x170
        [<c016be60>] ? sched_clock_cpu+0x120/0x160
        [<c016beef>] ? cpu_clock+0x4f/0x60
        [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
        [<c052712c>] ? down_write+0x8c/0xa0
        [<c02a5966>] ext4_get_blocks+0x226/0x450
        [<c016be60>] ? sched_clock_cpu+0x120/0x160
        [<c016beef>] ? cpu_clock+0x4f/0x60
        [<c017908b>] ? trace_hardirqs_off+0xb/0x10
        [<c02a5ff3>] mpage_da_map_blocks+0xc3/0xaa0
        [<c01d69cc>] ? find_get_pages_tag+0x16c/0x180
        [<c01d6860>] ? find_get_pages_tag+0x0/0x180
        [<c02a73bd>] ? __mpage_da_writepage+0x16d/0x1a0
        [<c01dfc4e>] ? pagevec_lookup_tag+0x2e/0x40
        [<c01ddf1b>] ? write_cache_pages+0xdb/0x3d0
        [<c02a7250>] ? __mpage_da_writepage+0x0/0x1a0
        [<c02a6ed6>] ext4_da_writepages+0x506/0x790
        [<c016beef>] ? cpu_clock+0x4f/0x60
        [<c016bca2>] ? sched_clock_local+0xd2/0x170
        [<c016be60>] ? sched_clock_cpu+0x120/0x160
        [<c016be60>] ? sched_clock_cpu+0x120/0x160
        [<c02a69d0>] ? ext4_da_writepages+0x0/0x790
        [<c01de272>] do_writepages+0x22/0x50
        [<c01d766d>] __filemap_fdatawrite_range+0x6d/0x80
        [<c01d7b9b>] filemap_flush+0x2b/0x30
        [<c02a40ac>] ext4_alloc_da_blocks+0x5c/0x60
        [<c029e595>] ext4_release_file+0x75/0xb0
        [<c0216b59>] __fput+0xf9/0x210
        [<c0216c97>] fput+0x27/0x30
        [<c02122dc>] filp_close+0x4c/0x80
        [<c014510e>] put_files_struct+0x6e/0xd0
        [<c01451b7>] exit_files+0x47/0x60
        [<c0146a24>] do_exit+0x144/0x710
        [<c017b163>] ? lock_release_holdtime+0x33/0x210
        [<c0528137>] ? _spin_unlock_irq+0x27/0x30
        [<c0147028>] do_group_exit+0x38/0xa0
        [<c017babb>] ? trace_hardirqs_on+0xb/0x10
        [<c0159abc>] get_signal_to_deliver+0x2ac/0x410
        [<c0102849>] do_notify_resume+0xb9/0x890
        [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
        [<c017b163>] ? lock_release_holdtime+0x33/0x210
        [<c0165b50>] ? autoremove_wake_function+0x0/0x50
        [<c017ba54>] ? trace_hardirqs_on_caller+0x134/0x190
        [<c017babb>] ? trace_hardirqs_on+0xb/0x10
        [<c0300ba4>] ? security_file_permission+0x14/0x20
        [<c0215761>] ? vfs_write+0x131/0x190
        [<c0214f50>] ? do_sync_write+0x0/0x120
        [<c0103115>] ? sysenter_do_call+0x27/0x32
        [<c01032d2>] work_notifysig+0x13/0x21
      
      CC: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      d21cd8f1
    • J
      quota: Fix 64-bit limits setting on 32-bit archs · 82fdfa92
      Jan Kara 提交于
      Fix warnings:
      fs/quota/quota_v2.c: In function ‘v2_read_file_info’:
      fs/quota/quota_v2.c:123: warning: integer constant is too large for ‘long’ type
      fs/quota/quota_v2.c:124: warning: integer constant is too large for ‘long’ type
      Reported-by: NJerry Leo <jerryleo860202@gmail.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      82fdfa92
    • E
      ext3: Replace lock/unlock_super() with an explicit lock for resizing · 96d2a495
      Eric Sandeen 提交于
      Use a separate lock to protect s_groups_count and the other block
      group descriptors which get changed via an on-line resize operation,
      so we can stop overloading the use of lock_super().
      
      Port of ext4 commit 32ed5058 by
      Theodore Ts'o <tytso@mit.edu>.
      
      CC: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      96d2a495
    • E
      ext3: Replace lock/unlock_super() with an explicit lock for the orphan list · b8a052d0
      Eric Sandeen 提交于
      Use a separate lock to protect the orphan list, so we can stop
      overloading the use of lock_super().
      
      Port of ext4 commit 3b9d4ed2
      by Theodore Ts'o <tytso@mit.edu>.
      
      CC: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      b8a052d0
    • E
      ext3: ext3_mark_recovery_complete() doesn't need to use lock_super · 4854a5f0
      Eric Sandeen 提交于
      The function ext3_mark_recovery_complete() is called from two call
      paths: either (a) while mounting the filesystem, in which case there's
      no danger of any other CPU calling write_super() until the mount is
      completed, and (b) while remounting the filesystem read-write, in
      which case the fs core has already locked the superblock.  This also
      allows us to take out a very vile unlock_super()/lock_super() pair in
      ext3_remount().
      
      Port of ext4 commit a63c9eb2 by
      Theodore Ts'o <tytso@mit.edu>.
      
      CC: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      4854a5f0
    • E
      ext3: Remove outdated comment about lock_super() · ed505ee4
      Eric Sandeen 提交于
      ext3_fill_super() is no longer called by read_super(), and it is no
      longer called with the superblock locked.  The
      unlock_super()/lock_super() is no longer present, so this comment is
      entirely superfluous.
      
      Port of ext4 commit 32ed5058 by
      Theodore Ts'o <tytso@mit.edu>.
      
      CC: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      ed505ee4
    • D
      quota: Move duplicated code to separate functions · dc52dd3a
      Dmitry Monakhov 提交于
      - for(..) { mark_dquot_dirty(); } -> mark_all_dquot_dirty()
      - for(..) { dput(); }             -> dqput_all()
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      dc52dd3a
    • D
      ext4: Convert to generic reserved quota's space management. · a9e7f447
      Dmitry Monakhov 提交于
      This patch also fixes write vs chown race condition.
      Acked-by: N"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      a9e7f447
    • D
      quota: decouple fs reserved space from quota reservation · fd8fbfc1
      Dmitry Monakhov 提交于
      Currently inode_reservation is managed by fs itself and this
      reservation is transfered on dquot_transfer(). This means what
      inode_reservation must always be in sync with
      dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result
      in incorrect quota(WARN_ON in dquot_claim_reserved_space() will be
      triggered)
      This is not easy because of complex locking order issues
      for example http://bugzilla.kernel.org/show_bug.cgi?id=14739
      
      The patch introduce quota reservation field for each fs-inode
      (fs specific inode is used in order to prevent bloating generic
      vfs inode). This reservation is managed by quota code internally
      similar to i_blocks/i_bytes and may not be always in sync with
      internal fs reservation.
      
      Also perform some code rearrangement:
      - Unify dquot_reserve_space() and dquot_reserve_space()
      - Unify dquot_release_reserved_space() and dquot_free_space()
      - Also this patch add missing warning update to release_rsv()
        dquot_release_reserved_space() must call flush_warnings() as
        dquot_free_space() does.
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      fd8fbfc1
    • D
      Add unlocked version of inode_add_bytes() function · b462707e
      Dmitry Monakhov 提交于
      Quota code requires unlocked version of this function. Off course
      we can just copy-paste the code, but copy-pasting is always an evil.
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      b462707e
    • D
      ext3: quota macros cleanup [V2] · c459001f
      Dmitry Monakhov 提交于
      Currently all quota block reservation macros contains hardcoded "2"
      aka MAXQUOTAS value. This is no good because in some places it is not
      obvious to understand what does this digit represent. Let's introduce
      new macro with self descriptive name.
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      c459001f
    • A
      jfs: Fix 32bit build warning · 28ba0ec6
      Alan Cox 提交于
      loff_t is a type that isn't entirely dependant upon 32 v 64bit choice
      Signed-off-by: NAlan Cox <alan@linux.intel.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      28ba0ec6
    • A
      Sanitize f_flags helpers · 5300990c
      Al Viro 提交于
      * pull ACC_MODE to fs.h; we have several copies all over the place
      * nightmarish expression calculating f_mode by f_flags deserves a helper
      too (OPEN_FMODE(flags))
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      5300990c
    • A
      Fix f_flags/f_mode in case of lookup_instantiate_filp() from open(pathname, 3) · 482928d5
      Al Viro 提交于
      Just set f_flags when shoving struct file into nameidata; don't
      postpone that until __dentry_open().  do_filp_open() has correct
      value; lookup_instantiate_filp() doesn't - we lose the difference
      between O_RDWR and 3 by that point.
      
      We still set .intent.open.flags, so no fs code needs to be changed.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      482928d5
    • R
      anonfd: Allow making anon files read-only · 628ff7c1
      Roland Dreier 提交于
      It seems a couple places such as arch/ia64/kernel/perfmon.c and
      drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile()
      instead of a private pseudo-fs + alloc_file(), if only there were a way
      to get a read-only file.  So provide this by having anon_inode_getfile()
      create a read-only file if we pass O_RDONLY in flags.
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      628ff7c1
    • A
      fs/compat_ioctl.c: fix build error when !BLOCK · ed261758
      Arnd Bergmann 提交于
      No driver uses SG_SET_TRANSFORM any more in Linux, since the ide-scsi
      driver was removed in 2.6.29. The compat-ioctl cleanup series moved
      the handling for this around, which broke building without CONFIG_BLOCK.
      
      Just remove the code handling it for compat mode.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      ed261758
    • R
      alloc_file(): simplify handling of mnt_clone_write() errors · 385e3ed4
      Roland Dreier 提交于
      When alloc_file() and init_file() were combined, the error handling of
      mnt_clone_write() was taken into alloc_file() in a somewhat obfuscated
      way.  Since we don't use the error code for anything except warning,
      we might as well warn directly without an extra variable.
      Signed-off-by: NRoland Dreier <rolandd@cisco.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      385e3ed4