1. 14 7月, 2009 1 次提交
    • J
      jbd2: Fix a race between checkpointing code and journal_get_write_access() · f91d1d04
      Jan Kara 提交于
      The following race can happen:
      
       CPU1                          CPU2
                                     checkpointing code checks the buffer, adds
                                       it to an array for writeback
       do_get_write_access()
       ...
       lock_buffer()
       unlock_buffer()
                                     flush_batch() submits the buffer for IO
       __jbd2_journal_file_buffer()
      
      So a buffer under writeout is returned from
      do_get_write_access(). Since the filesystem code relies on the fact
      that journaled buffers cannot be written out, it does not take the
      buffer lock and so it can modify buffer while it is under
      writeout. That can lead to a filesystem corruption if we crash at the
      right moment.
      
      We fix the problem by clearing the buffer dirty bit under buffer_lock
      even if the buffer is on BJ_None list. Actually, we clear the dirty
      bit regardless the list the buffer is in and warn about the fact if
      the buffer is already journalled.
      
      Thanks for spotting the problem goes to dingdinghua <dingdinghua85@gmail.com>.
      Reported-by: Ndingdinghua <dingdinghua85@gmail.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f91d1d04
  2. 06 7月, 2009 1 次提交
  3. 13 7月, 2009 1 次提交
    • E
      ext4: naturally align struct ext4_allocation_request · 726447d8
      Eric Sandeen 提交于
      As Ted noted, the ext4_allocation_request isn't well aligned.  Looking
      at it with pahole we're wasting space on 64-bit arches:
      
      struct ext4_allocation_request {
              struct inode *             inode;              /*     0     8 */
              ext4_lblk_t                logical;            /*     8     4 */
      
              /* XXX 4 bytes hole, try to pack */
      
              ext4_fsblk_t               goal;               /*    16     8 */
              ext4_lblk_t                lleft;              /*    24     4 */
      
              /* XXX 4 bytes hole, try to pack */
      
              ext4_fsblk_t               pleft;              /*    32     8 */
              ext4_lblk_t                lright;             /*    40     4 */
      
              /* XXX 4 bytes hole, try to pack */
      
              ext4_fsblk_t               pright;             /*    48     8 */
              unsigned int               len;                /*    56     4 */
              unsigned int               flags;              /*    60     4 */
              /* --- cacheline 1 boundary (64 bytes) --- */
      
              /* size: 64, cachelines: 1, members: 9 */
              /* sum members: 52, holes: 3, sum holes: 12 */
      };
      
      Grouping 32-bit members together closes these holes and shrinks the
      structure by 12 bytes. which is important since ext4 can get on the
      hairy edge of stack overruns.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      726447d8
  4. 06 7月, 2009 2 次提交
    • E
      ext4: mark several more functions in mballoc.c as noinline · 089ceecc
      Eric Sandeen 提交于
      Ted noticed a stack-deep callchain through
      writepages->ext4_mb_regular_allocator->ext4_mb_init_cache->submit_bh ...
      
      With all the static functions in mballoc.c, gcc helpfully
      inlines for us, and we get something like this:
      
      ext4_mb_regular_allocator	(232 bytes stack)
      	ext4_mb_init_cache	(232 bytes stack)
      		submit_bh	(starts 464 deeper)
      
      the 2 ext4 functions here get several others inlined; by telling
      gcc not to inline them, we can save stack space for when we
      head off into submit_bh land and associated block layer callchains.
      The following noinlined functions are only called once, so this
      won't impact any other callchains:
      
      ext4_mb_regular_allocator 			(104) (was 232)
      	ext4_mb_find_by_goal			 (56) (noinlined)
      	ext4_mb_init_group			 (24) (noinlined)
      		ext4_mb_init_cache		(136) (was 232)
      			ext4_mb_generate_buddy	 (88) (noinlined)
      			ext4_mb_generate_from_pa (40) (noinlined)
      			submit_bh
      	ext4_mb_simple_scan_group		 (24) (noinlined)
      	ext4_mb_scan_aligned			 (56) (noinlined)
      	ext4_mb_complex_scan_group		 (40) (noinlined)
      	ext4_mb_try_best_found			 (24) (noinlined)
      
      now when we head off into submit_bh() we're only 264 bytes deeper
      in stack than when we entered ext4_mb_regular_allocator()
      (vs. 464 bytes before).  Every 200 bytes helps.  :)
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      089ceecc
    • T
      ext4: Fix potential reclaim deadlock when truncating partial block · f4a01017
      Theodore Ts'o 提交于
      The ext4_block_truncate_page() function previously called
      grab_cache_page(), which called find_or_create_page() with the
      __GFP_FS flag potentially set.  This could cause a deadlock if the
      system is low on memory and it attempts a memory reclaim, which could
      potentially call back into ext4.  So we need to call
      find_or_create_page() directly, and remove the __GFP_FP flag to avoid
      this potential deadlock.
      
      Thanks to Roland Dreier for reporting a lockdep warning which showed
      this problem.
      
      [20786.363249] =================================
      [20786.363257] [ INFO: inconsistent lock state ]
      [20786.363265] 2.6.31-2-generic #14~rbd4gitd960eea9
      [20786.363270] ---------------------------------
      [20786.363276] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
      [20786.363285] http/8397 [HC0[0]:SC0[0]:HE1:SE1] takes:
      [20786.363291]  (jbd2_handle){+.+.?.}, at: [<ffffffff812008bb>] jbd2_journal_start+0xdb/0x150
      [20786.363314] {IN-RECLAIM_FS-W} state was registered at:
      [20786.363320]   [<ffffffff8108bef6>] mark_irqflags+0xc6/0x1a0
      [20786.363334]   [<ffffffff8108d347>] __lock_acquire+0x287/0x430
      [20786.363345]   [<ffffffff8108d595>] lock_acquire+0xa5/0x150
      [20786.363355]   [<ffffffff812008da>] jbd2_journal_start+0xfa/0x150
      [20786.363365]   [<ffffffff811d98a8>] ext4_journal_start_sb+0x58/0x90
      [20786.363377]   [<ffffffff811cce85>] ext4_delete_inode+0xc5/0x2c0
      [20786.363389]   [<ffffffff81146fa3>] generic_delete_inode+0xd3/0x1a0
      [20786.363401]   [<ffffffff81147095>] generic_drop_inode+0x25/0x30
      [20786.363411]   [<ffffffff81145ce2>] iput+0x62/0x70
      [20786.363420]   [<ffffffff81142878>] dentry_iput+0x98/0x110
      [20786.363429]   [<ffffffff81142a00>] d_kill+0x50/0x80
      [20786.363438]   [<ffffffff811444c5>] dput+0x95/0x180
      [20786.363447]   [<ffffffff8120de4b>] ecryptfs_d_release+0x2b/0x70
      [20786.363459]   [<ffffffff81142978>] d_free+0x28/0x60
      [20786.363468]   [<ffffffff81142a18>] d_kill+0x68/0x80
      [20786.363477]   [<ffffffff81142ad3>] prune_one_dentry+0xa3/0xc0
      [20786.363487]   [<ffffffff81142d61>] __shrink_dcache_sb+0x271/0x290
      [20786.363497]   [<ffffffff81142e89>] prune_dcache+0x109/0x1b0
      [20786.363506]   [<ffffffff81142f6f>] shrink_dcache_memory+0x3f/0x50
      [20786.363516]   [<ffffffff810f6d3d>] shrink_slab+0x12d/0x190
      [20786.363527]   [<ffffffff810f97d7>] balance_pgdat+0x4d7/0x640
      [20786.363537]   [<ffffffff810f9a57>] kswapd+0x117/0x170
      [20786.363546]   [<ffffffff810773ce>] kthread+0x9e/0xb0
      [20786.363558]   [<ffffffff8101430a>] child_rip+0xa/0x20
      [20786.363569]   [<ffffffffffffffff>] 0xffffffffffffffff
      [20786.363598] irq event stamp: 15997
      [20786.363603] hardirqs last  enabled at (15997): [<ffffffff81125f9d>] kmem_cache_alloc+0xfd/0x1a0
      [20786.363617] hardirqs last disabled at (15996): [<ffffffff81125f01>] kmem_cache_alloc+0x61/0x1a0
      [20786.363628] softirqs last  enabled at (15966): [<ffffffff810631ea>] __do_softirq+0x14a/0x220
      [20786.363641] softirqs last disabled at (15861): [<ffffffff8101440c>] call_softirq+0x1c/0x30
      [20786.363651] 
      [20786.363653] other info that might help us debug this:
      [20786.363660] 3 locks held by http/8397:
      [20786.363665]  #0:  (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<ffffffff8112ed24>] do_truncate+0x64/0x90
      [20786.363685]  #1:  (&sb->s_type->i_alloc_sem_key#5){+++++.}, at: [<ffffffff81147f90>] notify_change+0x250/0x350
      [20786.363707]  #2:  (jbd2_handle){+.+.?.}, at: [<ffffffff812008bb>] jbd2_journal_start+0xdb/0x150
      [20786.363724] 
      [20786.363726] stack backtrace:
      [20786.363734] Pid: 8397, comm: http Tainted: G         C 2.6.31-2-generic #14~rbd4gitd960eea9
      [20786.363741] Call Trace:
      [20786.363752]  [<ffffffff8108ad7c>] print_usage_bug+0x18c/0x1a0
      [20786.363763]  [<ffffffff8108b0c0>] ? check_usage_backwards+0x0/0xb0
      [20786.363773]  [<ffffffff8108bad2>] mark_lock_irq+0xf2/0x280
      [20786.363783]  [<ffffffff8108bd97>] mark_lock+0x137/0x1d0
      [20786.363793]  [<ffffffff8108c03c>] mark_held_locks+0x6c/0xa0
      [20786.363803]  [<ffffffff8108c11f>] lockdep_trace_alloc+0xaf/0xe0
      [20786.363813]  [<ffffffff810efbac>] __alloc_pages_nodemask+0x7c/0x180
      [20786.363824]  [<ffffffff810e9411>] ? find_get_page+0x91/0xf0
      [20786.363835]  [<ffffffff8111d3b7>] alloc_pages_current+0x87/0xd0
      [20786.363845]  [<ffffffff810e9827>] __page_cache_alloc+0x67/0x70
      [20786.363856]  [<ffffffff810eb7df>] find_or_create_page+0x4f/0xb0
      [20786.363867]  [<ffffffff811cb3be>] ext4_block_truncate_page+0x3e/0x460
      [20786.363876]  [<ffffffff812008da>] ? jbd2_journal_start+0xfa/0x150
      [20786.363885]  [<ffffffff812008bb>] ? jbd2_journal_start+0xdb/0x150
      [20786.363895]  [<ffffffff811c6415>] ? ext4_meta_trans_blocks+0x75/0xf0
      [20786.363905]  [<ffffffff811e8d8b>] ext4_ext_truncate+0x1bb/0x1e0
      [20786.363916]  [<ffffffff811072c5>] ? unmap_mapping_range+0x75/0x290
      [20786.363926]  [<ffffffff811ccc28>] ext4_truncate+0x498/0x630
      [20786.363938]  [<ffffffff8129b4ce>] ? _raw_spin_unlock+0x5e/0xb0
      [20786.363947]  [<ffffffff81107306>] ? unmap_mapping_range+0xb6/0x290
      [20786.363957]  [<ffffffff8108c3ad>] ? trace_hardirqs_on+0xd/0x10
      [20786.363966]  [<ffffffff811ffe58>] ? jbd2_journal_stop+0x1f8/0x2e0
      [20786.363976]  [<ffffffff81107690>] vmtruncate+0xb0/0x110
      [20786.363986]  [<ffffffff81147c05>] inode_setattr+0x35/0x170
      [20786.363995]  [<ffffffff811c9906>] ext4_setattr+0x186/0x370
      [20786.364005]  [<ffffffff81147eab>] notify_change+0x16b/0x350
      [20786.364014]  [<ffffffff8112ed30>] do_truncate+0x70/0x90
      [20786.364021]  [<ffffffff8112f48b>] T.657+0xeb/0x110
      [20786.364021]  [<ffffffff8112f4be>] sys_ftruncate+0xe/0x10
      [20786.364021]  [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
      Reported-by: NRoland Dreier <roland@digitalvampire.org>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      f4a01017
  5. 21 6月, 2009 1 次提交
  6. 03 7月, 2009 8 次提交
  7. 02 7月, 2009 2 次提交
  8. 01 7月, 2009 10 次提交
  9. 29 6月, 2009 1 次提交
  10. 28 6月, 2009 1 次提交
  11. 26 6月, 2009 3 次提交
    • S
      [CIFS] remove unknown mount option warning message · 71a394fa
      Steve French 提交于
      Jeff's previous patch which removed the unneeded rw/ro
      parsing can cause a minor warning in dmesg (about the
      unknown rw or ro mount option) at mount time. This
      patch makes cifs ignore them in kernel to remove the warning
      (they are already handled in the mount helper and VFS).
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      71a394fa
    • S
      [CIFS] remove bkl usage from umount begin · ad8034f1
      Steve French 提交于
      The lock_kernel call moved into the fs for umount_begin
      is not needed.  This adds a check to make sure we don't
      call umount_begin twice on the same fs.
      
      umount_begin for cifs is probably not needed and
      may eventually be able to be removed, but in
      the meantime this smaller patch is safe and
      gets rid of the bkl from this path which provides
      some benefit.
      
      Acked-by: Jeff Layton <redhat.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      ad8034f1
    • S
      cifs: Fix incorrect return code being printed in cFYI messages · 0f3bc09e
      Suresh Jayaraman 提交于
      FreeXid() along with freeing Xid does add a cifsFYI debug message that
      prints rc (return code) as well. In some code paths where we set/return
      error code after calling FreeXid(), incorrect error code is being
      printed when cifsFYI is enabled.
      
      This could be misleading in few cases. For eg.
      In cifs_open() if cifs_fill_filedata() returns a valid pointer to
      cifsFileInfo, FreeXid() prints rc=-13 whereas 0 is actually being
      returned. Fix this by setting rc before calling FreeXid().
      
      Basically convert
      
      FreeXid(xid);			rc = -ERR;
      return -ERR;		=>	FreeXid(xid);
      				return rc;
      
      [Note that Christoph would like to replace the GetXid/FreeXid
      calls, which are primarily used for debugging.  This seems
      like a good longer term goal, but although there is an
      alternative tracing facility, there are no examples yet
      available that I know of that we can use (yet) to
      convert this cifs function entry/exit logging, and for
      creating an identifier that we can use to correlate
      all dmesg log entries for a particular vfs operation
      (ie identify all log entries for a particular vfs
      request to cifs: e.g. a particular close or read or write
      or byte range lock call ... and just using the thread id
      is harder).  Eventually when a replacement
      for this is available (e.g. when NFS switches over and various
      samples to look at in other file systems) we can remove the
      GetXid/FreeXid macro but in the meantime multiple people
      use this run time configurable logging all the time
      for debugging, and Suresh's patch fixes a problem
      which made it harder to notice some low
      memory problems in the log so it is worthwhile
      to fix this problem until a better logging
      approach is able to be used]
      Acked-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NSuresh Jayaraman <sjayaraman@suse.de>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      0f3bc09e
  12. 25 6月, 2009 8 次提交
  13. 24 6月, 2009 1 次提交
新手
引导
客服 返回
顶部