1. 12 1月, 2010 2 次提交
    • M
      smaps: fix wrong rss count · 7f53a09e
      Minchan Kim 提交于
      A long time ago we regarded zero page as file_rss and vm_normal_page
      doesn't return NULL.
      
      But now, we reinstated ZERO_PAGE and vm_normal_page's implementation can
      return NULL in case of zero page.  Also we don't count it with file_rss
      any more.
      
      Then, RSS and PSS can't be matched.  For consistency, Let's ignore zero
      page in smaps_pte_range.
      Signed-off-by: NMinchan Kim <minchan.kim@gmail.com>
      Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk>
      Acked-by: NMatt Mackall <mpm@selenic.com>
      Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7f53a09e
    • K
      proc: partially revert "procfs: provide stack information for threads" · 1306d603
      KOSAKI Motohiro 提交于
      Commit d899bf7b (procfs: provide stack information for threads) introduced
      to show stack information in /proc/{pid}/status.  But it cause large
      performance regression.  Unfortunately /proc/{pid}/status is used ps
      command too and ps is one of most important component.  Because both to
      take mmap_sem and page table walk are heavily operation.
      
      If many process run, the ps performance is,
      
      [before d899bf7b]
      
      % perf stat ps >/dev/null
      
       Performance counter stats for 'ps':
      
           4090.435806  task-clock-msecs         #      0.032 CPUs
                   229  context-switches         #      0.000 M/sec
                     0  CPU-migrations           #      0.000 M/sec
                   234  page-faults              #      0.000 M/sec
            8587565207  cycles                   #   2099.425 M/sec
            9866662403  instructions             #      1.149 IPC
            3789415411  cache-references         #    926.409 M/sec
              30419509  cache-misses             #      7.437 M/sec
      
         128.859521955  seconds time elapsed
      
      [after d899bf7b]
      
      % perf stat  ps  > /dev/null
      
       Performance counter stats for 'ps':
      
           4305.081146  task-clock-msecs         #      0.028 CPUs
                   480  context-switches         #      0.000 M/sec
                     2  CPU-migrations           #      0.000 M/sec
                   237  page-faults              #      0.000 M/sec
            9021211334  cycles                   #   2095.480 M/sec
           10605887536  instructions             #      1.176 IPC
            3612650999  cache-references         #    839.160 M/sec
              23917502  cache-misses             #      5.556 M/sec
      
         152.277819582  seconds time elapsed
      
      Thus, this patch revert it. Fortunately /proc/{pid}/task/{tid}/smaps
      provide almost same information. we can use it.
      
      Commit d899bf7b introduced two features:
      
       1) Add the annotattion of [thread stack: xxxx] mark to
          /proc/{pid}/task/{tid}/maps.
       2) Add StackUsage field to /proc/{pid}/status.
      
      I only revert (2), because I haven't seen (1) cause regression.
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Stefani Seibold <stefani@seibold.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1306d603
  2. 11 1月, 2010 1 次提交
    • J
      quota: Fix dquot_transfer for filesystems different from ext4 · 05b5d898
      Jan Kara 提交于
      Commit fd8fbfc1 modified the way we find amount of reserved space
      belonging to an inode. The amount of reserved space is checked
      from dquot_transfer and thus inode_reserved_space gets called
      even for filesystems that don't provide get_reserved_space callback
      which results in a BUG.
      
      Fix the problem by checking get_reserved_space callback and return 0 if
      the filesystem does not provide it.
      
      CC: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJan Kara <jack@suse.cz>
      05b5d898
  3. 09 1月, 2010 1 次提交
  4. 07 1月, 2010 6 次提交
  5. 05 1月, 2010 6 次提交
    • B
      exofs: simple_write_end does not mark_inode_dirty · efd124b9
      Boaz Harrosh 提交于
      exofs uses simple_write_end() for it's .write_end handler. But
      it is not enough because simple_write_end() does not call
      mark_inode_dirty() when it extends i_size. So even if we do
      call mark_inode_dirty at beginning of write out, with a very
      long IO and a saturated system we might get the .write_inode()
      called while still extend-writing to file and miss out on the last
      i_size updates.
      
      So override .write_end, call simple_write_end(), and afterwords if
      i_size was changed call mark_inode_dirty().
      
      It stands to logic that since simple_write_end() was the one extending
      i_size it should also call mark_inode_dirty(). But it looks like all
      users of simple_write_end() are memory-bound pseudo filesystems, who
      could careless about mark_inode_dirty(). I might submit a
      warning-comment patch to simple_write_end() in future.
      
      CC: Stable <stable@kernel.org>
      Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
      efd124b9
    • B
      exofs: fix pnfs_osd re-definitions in pre-pnfs trees · 89be5030
      Boaz Harrosh 提交于
      Some on disk exofs constants and types are defined in the pnfs_osd_xdr.h
      file. Since we needed these types before the pnfs-objects code was
      accepted to mainline we duplicated the minimal needed definitions into
      an exofs local header. The definitions where conditionally included
      depending on !CONFIG_PNFS defined. So if PNFS was present in the tree
      definitions are taken from there and if not they are defined locally.
      
      That was all good but, the CONFIG_PNFS is planed to be included upstream
      before the pnfs-objects is also included. (The first pnfs batch might be
      pnfs-files only)
      
      So condition exofs local definitions on the absence of pnfs_osd_xdr.h
      inclusion (__PNFS_OSD_XDR_H__ not defined). User code must make sure
      that in future pnfs_osd_xdr.h will be included before fs/exofs/pnfs.h,
      which happens to be so in current code.
      
      Once pnfs-objects hits mainline, exofs's local header will be removed.
      Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
      89be5030
    • F
      reiserfs: Relax lock on xattr removing · 4f3be1b5
      Frederic Weisbecker 提交于
      When we remove an xattr, we call lookup_and_delete_xattr()
      that takes some private xattr inodes mutexes. But we hold
      the reiserfs lock at this time, which leads to dependency
      inversions.
      
      We can safely call lookup_and_delete_xattr() without the
      reiserfs lock, where xattr inodes lookups only need the
      xattr inodes mutexes.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Christian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      4f3be1b5
    • F
      reiserfs: Relax the lock before truncating pages · 108d3943
      Frederic Weisbecker 提交于
      While truncating a file, reiserfs_setattr() calls inode_setattr()
      that will truncate the mapping for the given inode, but for that
      it needs the pages locks.
      
      In order to release these, the owners need the reiserfs lock to
      complete their jobs. But they can't, as we don't release it before
      calling inode_setattr().
      
      We need to do that to fix the following softlockups:
      
      INFO: task flush-8:0:2149 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      flush-8:0     D f51af998     0  2149      2 0x00000000
       f51af9ac 00000092 00000002 f51af998 c2803304 00000000 c1894ad0 010f3000
       f51af9cc c1462604 c189ef80 f51af974 c1710304 f715b450 f715b5ec c2807c40
       00000000 0005bb00 c2803320 c102c55b c1710304 c2807c50 c2803304 00000246
      Call Trace:
       [<c1462604>] ? schedule+0x434/0xb20
       [<c102c55b>] ? resched_task+0x4b/0x70
       [<c106fa22>] ? mark_held_locks+0x62/0x80
       [<c146414d>] ? mutex_lock_nested+0x1fd/0x350
       [<c14640b9>] mutex_lock_nested+0x169/0x350
       [<c1178cde>] ? reiserfs_write_lock+0x2e/0x40
       [<c1178cde>] reiserfs_write_lock+0x2e/0x40
       [<c11719a2>] do_journal_end+0xc2/0xe70
       [<c1172912>] journal_end+0xb2/0x120
       [<c11686b3>] ? pathrelse+0x33/0xb0
       [<c11729e4>] reiserfs_end_persistent_transaction+0x64/0x70
       [<c1153caa>] reiserfs_get_block+0x12ba/0x15f0
       [<c106fa22>] ? mark_held_locks+0x62/0x80
       [<c1154b24>] reiserfs_writepage+0xa74/0xe80
       [<c1465a27>] ? _raw_spin_unlock_irq+0x27/0x50
       [<c11f3d25>] ? radix_tree_gang_lookup_tag_slot+0x95/0xc0
       [<c10b5377>] ? find_get_pages_tag+0x127/0x1a0
       [<c106fa22>] ? mark_held_locks+0x62/0x80
       [<c106fcd4>] ? trace_hardirqs_on_caller+0x124/0x170
       [<c10bc1e0>] __writepage+0x10/0x40
       [<c10bc9ab>] write_cache_pages+0x16b/0x320
       [<c10bc1d0>] ? __writepage+0x0/0x40
       [<c10bcb88>] generic_writepages+0x28/0x40
       [<c10bcbd5>] do_writepages+0x35/0x40
       [<c11059f7>] writeback_single_inode+0xc7/0x330
       [<c11067b2>] writeback_inodes_wb+0x2c2/0x490
       [<c1106a86>] wb_writeback+0x106/0x1b0
       [<c1106cf6>] wb_do_writeback+0x106/0x1e0
       [<c1106c18>] ? wb_do_writeback+0x28/0x1e0
       [<c1106e0a>] bdi_writeback_task+0x3a/0xb0
       [<c10cbb13>] bdi_start_fn+0x63/0xc0
       [<c10cbab0>] ? bdi_start_fn+0x0/0xc0
       [<c105d1f4>] kthread+0x74/0x80
       [<c105d180>] ? kthread+0x0/0x80
       [<c100327a>] kernel_thread_helper+0x6/0x10
      3 locks held by flush-8:0/2149:
       #0:  (&type->s_umount_key#30){+++++.}, at: [<c110676f>] writeback_inodes_wb+0x27f/0x490
       #1:  (&journal->j_mutex){+.+...}, at: [<c117199a>] do_journal_end+0xba/0xe70
       #2:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1178cde>] reiserfs_write_lock+0x2e/0x40
      INFO: task fstest:3813 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      fstest        D 00000002     0  3813   3812 0x00000000
       f5103c94 00000082 f5103c40 00000002 f5ad5450 00000007 f5103c28 011f3000
       00000006 f5ad5450 c10bb005 00000480 c1710304 f5ad5450 f5ad55ec c2907c40
       00000001 f5ad5450 f5103c74 00000046 00000002 f5ad5450 00000007 f5103c6c
      Call Trace:
       [<c10bb005>] ? free_hot_cold_page+0x1d5/0x280
       [<c1462d64>] io_schedule+0x74/0xc0
       [<c10b5a45>] sync_page+0x35/0x60
       [<c146325a>] __wait_on_bit_lock+0x4a/0x90
       [<c10b5a10>] ? sync_page+0x0/0x60
       [<c10b59e5>] __lock_page+0x85/0x90
       [<c105d660>] ? wake_bit_function+0x0/0x60
       [<c10bf654>] truncate_inode_pages_range+0x1e4/0x2d0
       [<c10bf75f>] truncate_inode_pages+0x1f/0x30
       [<c10bf7cf>] truncate_pagecache+0x5f/0xa0
       [<c10bf86a>] vmtruncate+0x5a/0x70
       [<c10fdb7d>] inode_setattr+0x5d/0x190
       [<c1150117>] reiserfs_setattr+0x1f7/0x2f0
       [<c1464569>] ? down_write+0x49/0x70
       [<c10fde01>] notify_change+0x151/0x330
       [<c10e6f3d>] do_truncate+0x6d/0xa0
       [<c10f4ce2>] do_filp_open+0x9a2/0xcf0
       [<c1465aec>] ? _raw_spin_unlock+0x2c/0x50
       [<c10fec50>] ? alloc_fd+0xe0/0x100
       [<c10e602d>] do_sys_open+0x6d/0x130
       [<c1002cfb>] ? sysenter_exit+0xf/0x16
       [<c10e615e>] sys_open+0x2e/0x40
       [<c1002ccc>] sysenter_do_call+0x12/0x32
      3 locks held by fstest/3813:
       #0:  (&sb->s_type->i_mutex_key#4){+.+.+.}, at: [<c10e6f33>] do_truncate+0x63/0xa0
       #1:  (&sb->s_type->i_alloc_sem_key#3){+.+.+.}, at: [<c10fdf07>] notify_change+0x257/0x330
       #2:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1178c8e>] reiserfs_write_lock_once+0x2e/0x50
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Christian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      108d3943
    • F
      reiserfs: Fix recursive lock on lchown · 5fe1533f
      Frederic Weisbecker 提交于
      On chown, reiserfs will call reiserfs_setattr() to change the owner
      of the given inode, but it may also recursively call
      reiserfs_setattr() to propagate the owner change to the private xattr
      files for this inode.
      
      Hence, the reiserfs lock may be acquired twice which is not wanted
      as reiserfs_setattr() calls journal_begin() that is going to try to
      relax the lock in order to safely acquire the journal mutex.
      
      Using reiserfs_write_lock_once() from reiserfs_setattr() solves
      the problem.
      
      This fixes the following warning, that precedes a lockdep report.
      
      WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3f/0x50()
      Hardware name: MS-7418
      Unwanted recursive reiserfs lock!
      Pid: 4189, comm: fsstress Not tainted 2.6.33-rc2-tip-atom+ #195
      Call Trace:
       [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
       [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
       [<c103f7ac>] warn_slowpath_common+0x6c/0xc0
       [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
       [<c103f84b>] warn_slowpath_fmt+0x2b/0x30
       [<c1178bff>] reiserfs_lock_check_recursive+0x3f/0x50
       [<c1172ae3>] do_journal_begin_r+0x83/0x350
       [<c1172f2d>] journal_begin+0x7d/0x140
       [<c106509a>] ? in_group_p+0x2a/0x30
       [<c10fda71>] ? inode_change_ok+0x91/0x140
       [<c115007d>] reiserfs_setattr+0x15d/0x2e0
       [<c10f9bf3>] ? dput+0xe3/0x140
       [<c1465adc>] ? _raw_spin_unlock+0x2c/0x50
       [<c117831d>] chown_one_xattr+0xd/0x10
       [<c11780a3>] reiserfs_for_each_xattr+0x113/0x2c0
       [<c1178310>] ? chown_one_xattr+0x0/0x10
       [<c14641e9>] ? mutex_lock_nested+0x2a9/0x350
       [<c117826f>] reiserfs_chown_xattrs+0x1f/0x60
       [<c106509a>] ? in_group_p+0x2a/0x30
       [<c10fda71>] ? inode_change_ok+0x91/0x140
       [<c1150046>] reiserfs_setattr+0x126/0x2e0
       [<c1177c20>] ? reiserfs_getxattr+0x0/0x90
       [<c11b0d57>] ? cap_inode_need_killpriv+0x37/0x50
       [<c10fde01>] notify_change+0x151/0x330
       [<c10e659f>] chown_common+0x6f/0x90
       [<c10e67bd>] sys_lchown+0x6d/0x80
       [<c1002ccc>] sysenter_do_call+0x12/0x32
      ---[ end trace 7c2b77224c1442fc ]---
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Christian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      5fe1533f
    • E
      sysfs: Add lockdep annotations for the sysfs active reference · 846f9974
      Eric W. Biederman 提交于
      Holding locks over device_del -> kobject_del -> sysfs_deactivate can
      cause deadlocks if those same locks are grabbed in sysfs show or store
      methods.
      
      The I model s_active count + completion as a sleeping read/write lock.
      I describe to lockdep sysfs_get_active as a read_trylock,
      sysfs_put_active as a read_unlock, and sysfs_deactivate as a
      write_lock and write_unlock pair.  This seems to capture the essence
      for purposes of finding deadlocks, and in my testing gives finds real
      issues and ignores non-issues.
      
      This brings us back to holding locks over kobject_del is a problem
      that ideally we should find a way of addressing, but at least lockdep
      can tell us about the problems instead of requiring developers to debug
      rare strange system deadlocks, that happen when sysfs files are removed
      while being written to.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      846f9974
  6. 04 1月, 2010 1 次提交
  7. 03 1月, 2010 2 次提交
  8. 02 1月, 2010 9 次提交
    • F
      reiserfs: Safely acquire i_mutex from xattr_rmdir · 835d5247
      Frederic Weisbecker 提交于
      Relax the reiserfs lock before taking the inode mutex from
      xattr_rmdir() to avoid the usual reiserfs lock <-> inode mutex
      bad dependency.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NChristian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      835d5247
    • F
      reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr · 8b513f56
      Frederic Weisbecker 提交于
      Relax the reiserfs lock before taking the inode mutex from
      reiserfs_for_each_xattr() to avoid the usual bad dependencies:
      
      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.32-atom #179
      -------------------------------------------------------
      rm/3242 is trying to acquire lock:
       (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
      
      but task is already holding lock:
       (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
             [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401aab>] mutex_lock_nested+0x5b/0x340
             [<c1143339>] reiserfs_write_lock_once+0x29/0x50
             [<c1117022>] reiserfs_lookup+0x62/0x140
             [<c10bd85f>] __lookup_hash+0xef/0x110
             [<c10bf21d>] lookup_one_len+0x8d/0xc0
             [<c1141e3a>] open_xa_dir+0xea/0x1b0
             [<c1142720>] reiserfs_for_each_xattr+0x70/0x290
             [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
             [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
             [<c10c9c32>] generic_delete_inode+0xa2/0x170
             [<c10c9d4f>] generic_drop_inode+0x4f/0x70
             [<c10c8b07>] iput+0x47/0x50
             [<c10c0965>] do_unlinkat+0xd5/0x160
             [<c10c0b13>] sys_unlinkat+0x23/0x40
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      -> #0 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
             [<c105f176>] __lock_acquire+0x18f6/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401aab>] mutex_lock_nested+0x5b/0x340
             [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
             [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
             [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
             [<c10c9c32>] generic_delete_inode+0xa2/0x170
             [<c10c9d4f>] generic_drop_inode+0x4f/0x70
             [<c10c8b07>] iput+0x47/0x50
             [<c10c0965>] do_unlinkat+0xd5/0x160
             [<c10c0b13>] sys_unlinkat+0x23/0x40
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      other info that might help us debug this:
      
      1 lock held by rm/3242:
       #0:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40
      
      stack backtrace:
      Pid: 3242, comm: rm Not tainted 2.6.32-atom #179
      Call Trace:
       [<c13ffa13>] ? printk+0x18/0x1a
       [<c105d33a>] print_circular_bug+0xca/0xd0
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105c932>] ? mark_held_locks+0x62/0x80
       [<c105cc3b>] ? trace_hardirqs_on+0xb/0x10
       [<c1401098>] ? mutex_unlock+0x8/0x10
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
       [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
       [<c1401aab>] mutex_lock_nested+0x5b/0x340
       [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
       [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
       [<c1143180>] ? delete_one_xattr+0x0/0x100
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c1143339>] ? reiserfs_write_lock_once+0x29/0x50
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c11b0d4f>] ? _atomic_dec_and_lock+0x4f/0x70
       [<c111e990>] ? reiserfs_delete_inode+0x0/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c1401098>] ? mutex_unlock+0x8/0x10
       [<c10c3e0d>] ? vfs_readdir+0x7d/0xb0
       [<c10c3af0>] ? filldir64+0x0/0xf0
       [<c1002ef3>] ? sysenter_exit+0xf/0x16
       [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NChristian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      8b513f56
    • F
      reiserfs: Fix journal mutex <-> inode mutex lock inversion · 4dd85969
      Frederic Weisbecker 提交于
      We need to relax the reiserfs lock before locking the inode mutex
      from xattr_unlink(), otherwise we'll face the usual bad dependencies:
      
      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.32-atom #178
      -------------------------------------------------------
      rm/3202 is trying to acquire lock:
       (&journal->j_mutex){+.+...}, at: [<c113c234>] do_journal_begin_r+0x94/0x360
      
      but task is already holding lock:
       (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #2 (&sb->s_type->i_mutex_key#4/2){+.+...}:
             [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401a7b>] mutex_lock_nested+0x5b/0x340
             [<c1142a67>] xattr_unlink+0x57/0xb0
             [<c1143179>] delete_one_xattr+0x29/0x100
             [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
             [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
             [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
             [<c10c9c32>] generic_delete_inode+0xa2/0x170
             [<c10c9d4f>] generic_drop_inode+0x4f/0x70
             [<c10c8b07>] iput+0x47/0x50
             [<c10c0965>] do_unlinkat+0xd5/0x160
             [<c10c0b13>] sys_unlinkat+0x23/0x40
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
             [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401a7b>] mutex_lock_nested+0x5b/0x340
             [<c1143359>] reiserfs_write_lock+0x29/0x40
             [<c113c23c>] do_journal_begin_r+0x9c/0x360
             [<c113c680>] journal_begin+0x80/0x130
             [<c1127363>] reiserfs_remount+0x223/0x4e0
             [<c10b6dd6>] do_remount_sb+0xa6/0x140
             [<c10ce6a0>] do_mount+0x560/0x750
             [<c10ce914>] sys_mount+0x84/0xb0
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      -> #0 (&journal->j_mutex){+.+...}:
             [<c105f176>] __lock_acquire+0x18f6/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401a7b>] mutex_lock_nested+0x5b/0x340
             [<c113c234>] do_journal_begin_r+0x94/0x360
             [<c113c680>] journal_begin+0x80/0x130
             [<c1116d63>] reiserfs_unlink+0x83/0x2e0
             [<c1142a74>] xattr_unlink+0x64/0xb0
             [<c1143179>] delete_one_xattr+0x29/0x100
             [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
             [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
             [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
             [<c10c9c32>] generic_delete_inode+0xa2/0x170
             [<c10c9d4f>] generic_drop_inode+0x4f/0x70
             [<c10c8b07>] iput+0x47/0x50
             [<c10c0965>] do_unlinkat+0xd5/0x160
             [<c10c0b13>] sys_unlinkat+0x23/0x40
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      other info that might help us debug this:
      
      2 locks held by rm/3202:
       #0:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c114274b>] reiserfs_for_each_xattr+0x9b/0x290
       #1:  (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0
      
      stack backtrace:
      Pid: 3202, comm: rm Not tainted 2.6.32-atom #178
      Call Trace:
       [<c13ff9e3>] ? printk+0x18/0x1a
       [<c105d33a>] print_circular_bug+0xca/0xd0
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c1142a67>] ? xattr_unlink+0x57/0xb0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c113c234>] ? do_journal_begin_r+0x94/0x360
       [<c113c234>] ? do_journal_begin_r+0x94/0x360
       [<c1401a7b>] mutex_lock_nested+0x5b/0x340
       [<c113c234>] ? do_journal_begin_r+0x94/0x360
       [<c113c234>] do_journal_begin_r+0x94/0x360
       [<c10411b6>] ? run_timer_softirq+0x1a6/0x220
       [<c103cb00>] ? __do_softirq+0x50/0x140
       [<c113c680>] journal_begin+0x80/0x130
       [<c103cba2>] ? __do_softirq+0xf2/0x140
       [<c104f72f>] ? hrtimer_interrupt+0xdf/0x220
       [<c1116d63>] reiserfs_unlink+0x83/0x2e0
       [<c105c932>] ? mark_held_locks+0x62/0x80
       [<c11b8d08>] ? trace_hardirqs_on_thunk+0xc/0x10
       [<c1002fd8>] ? restore_all_notrace+0x0/0x18
       [<c1142a67>] ? xattr_unlink+0x57/0xb0
       [<c1142a74>] xattr_unlink+0x64/0xb0
       [<c1143179>] delete_one_xattr+0x29/0x100
       [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
       [<c1143150>] ? delete_one_xattr+0x0/0x100
       [<c1401cb9>] ? mutex_lock_nested+0x299/0x340
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c1143309>] ? reiserfs_write_lock_once+0x29/0x50
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c11b0d1f>] ? _atomic_dec_and_lock+0x4f/0x70
       [<c111e990>] ? reiserfs_delete_inode+0x0/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c1401068>] ? mutex_unlock+0x8/0x10
       [<c10c3e0d>] ? vfs_readdir+0x7d/0xb0
       [<c10c3af0>] ? filldir64+0x0/0xf0
       [<c1002ef3>] ? sysenter_exit+0xf/0x16
       [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NChristian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      4dd85969
    • F
      reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink() · c674905c
      Frederic Weisbecker 提交于
      reiserfs_unlink() may or may not be called under the reiserfs
      lock.
      But it also takes the reiserfs lock and can then acquire it
      recursively which leads to do_journal_begin_r() that fails to
      relax the reiserfs lock before grabbing the journal mutex,
      creating an unexpected lock inversion.
      
      We need to ensure reiserfs_unlink() won't get the reiserfs lock
      recursively using reiserfs_write_lock_once().
      
      This fixes the following warning that precedes a lock inversion
      report (reiserfs lock <-> journal mutex).
      
      ------------[ cut here ]------------
      WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3a/0x50()
      Hardware name: MS-7418
      Unwanted recursive reiserfs lock!
      Pid: 3208, comm: dbench Not tainted 2.6.32-atom #177
      Call Trace:
       [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
       [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
       [<c10373a7>] warn_slowpath_common+0x67/0xc0
       [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
       [<c1037446>] warn_slowpath_fmt+0x26/0x30
       [<c114327a>] reiserfs_lock_check_recursive+0x3a/0x50
       [<c113c213>] do_journal_begin_r+0x83/0x360
       [<c105eb16>] ? __lock_acquire+0x1296/0x19e0
       [<c1142a57>] ? xattr_unlink+0x57/0xb0
       [<c113c670>] journal_begin+0x80/0x130
       [<c1116d5d>] reiserfs_unlink+0x7d/0x2d0
       [<c1142a57>] ? xattr_unlink+0x57/0xb0
       [<c1142a57>] ? xattr_unlink+0x57/0xb0
       [<c1142a57>] ? xattr_unlink+0x57/0xb0
       [<c1142a64>] xattr_unlink+0x64/0xb0
       [<c1143169>] delete_one_xattr+0x29/0x100
       [<c11427ab>] reiserfs_for_each_xattr+0x10b/0x290
       [<c1143140>] ? delete_one_xattr+0x0/0x100
       [<c1401ca9>] ? mutex_lock_nested+0x299/0x340
       [<c11429aa>] reiserfs_delete_xattrs+0x1a/0x60
       [<c11432f9>] ? reiserfs_write_lock_once+0x29/0x50
       [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
       [<c11b0d0f>] ? _atomic_dec_and_lock+0x4f/0x70
       [<c111e980>] ? reiserfs_delete_inode+0x0/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10505c6>] ? up_read+0x16/0x30
       [<c1022ab7>] ? do_page_fault+0x187/0x330
       [<c1002fd8>] ? restore_all_notrace+0x0/0x18
       [<c1022930>] ? do_page_fault+0x0/0x330
       [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
       [<c10c0a00>] sys_unlink+0x10/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32
      ---[ end trace 2e35d71a6cc69d0c ]---
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NChristian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      c674905c
    • F
      reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle() · 3f14fea6
      Frederic Weisbecker 提交于
      We call xattr_lookup() from reiserfs_xattr_get(). We then hold
      the reiserfs lock when we grab the i_mutex. But later, we may
      relax the reiserfs lock, creating dependency inversion between
      both locks.
      
      The lookups and creation jobs ar already protected by the
      inode mutex, so we can safely relax the reiserfs lock, dropping
      the unwanted reiserfs lock -> i_mutex dependency, as shown
      in the following lockdep report:
      
      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.32-atom #173
      -------------------------------------------------------
      cp/3204 is trying to acquire lock:
       (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
      
      but task is already holding lock:
       (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
             [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401a2b>] mutex_lock_nested+0x5b/0x340
             [<c1141d83>] open_xa_dir+0x43/0x1b0
             [<c1142722>] reiserfs_for_each_xattr+0x62/0x260
             [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
             [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
             [<c10c9c32>] generic_delete_inode+0xa2/0x170
             [<c10c9d4f>] generic_drop_inode+0x4f/0x70
             [<c10c8b07>] iput+0x47/0x50
             [<c10c0965>] do_unlinkat+0xd5/0x160
             [<c10c0a00>] sys_unlink+0x10/0x20
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      -> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
             [<c105f176>] __lock_acquire+0x18f6/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401a2b>] mutex_lock_nested+0x5b/0x340
             [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
             [<c1117012>] reiserfs_lookup+0x62/0x140
             [<c10bd85f>] __lookup_hash+0xef/0x110
             [<c10bf21d>] lookup_one_len+0x8d/0xc0
             [<c1141e2a>] open_xa_dir+0xea/0x1b0
             [<c1141fe5>] xattr_lookup+0x15/0x160
             [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
             [<c1144042>] reiserfs_get_acl+0xa2/0x360
             [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
             [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
             [<c10bea96>] vfs_mkdir+0xd6/0x180
             [<c10c0c10>] sys_mkdirat+0xc0/0xd0
             [<c10c0c40>] sys_mkdir+0x20/0x30
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      other info that might help us debug this:
      
      2 locks held by cp/3204:
       #0:  (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
       #1:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0
      
      stack backtrace:
      Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
      Call Trace:
       [<c13ff993>] ? printk+0x18/0x1a
       [<c105d33a>] print_circular_bug+0xca/0xd0
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105d3aa>] ? check_usage+0x6a/0x460
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
       [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
       [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
       [<c1117012>] reiserfs_lookup+0x62/0x140
       [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
       [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e2a>] open_xa_dir+0xea/0x1b0
       [<c1141fe5>] xattr_lookup+0x15/0x160
       [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
       [<c1144042>] reiserfs_get_acl+0xa2/0x360
       [<c10ca2e7>] ? new_inode+0x27/0xa0
       [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
       [<c1402eb7>] ? _spin_unlock+0x27/0x40
       [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
       [<c10c7cb8>] ? __d_lookup+0x108/0x190
       [<c105c932>] ? mark_held_locks+0x62/0x80
       [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
       [<c10bd17a>] ? generic_permission+0x1a/0xa0
       [<c11788fe>] ? security_inode_permission+0x1e/0x20
       [<c10bea96>] vfs_mkdir+0xd6/0x180
       [<c10c0c10>] sys_mkdirat+0xc0/0xd0
       [<c10505c6>] ? up_read+0x16/0x30
       [<c1002fd8>] ? restore_all_notrace+0x0/0x18
       [<c10c0c40>] sys_mkdir+0x20/0x30
       [<c1002ec4>] sysenter_do_call+0x12/0x32
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NChristian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      3f14fea6
    • F
      reiserfs: Relax reiserfs lock while freeing the journal · 0523676d
      Frederic Weisbecker 提交于
      Keeping the reiserfs lock while freeing the journal on
      umount path triggers a lock inversion between bdev->bd_mutex
      and the reiserfs lock.
      
      We don't need the reiserfs lock at this stage. The filesystem
      is not usable anymore, and there are no more pending commits,
      everything got flushed (even this operation was done in parallel
      and didn't required the reiserfs lock from the current process).
      
      This fixes the following lockdep report:
      
      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.32-atom #172
      -------------------------------------------------------
      umount/3904 is trying to acquire lock:
       (&bdev->bd_mutex){+.+.+.}, at: [<c10de2c2>] __blkdev_put+0x22/0x160
      
      but task is already holding lock:
       (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #3 (&REISERFS_SB(s)->lock){+.+.+.}:
             [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c140199b>] mutex_lock_nested+0x5b/0x340
             [<c1143229>] reiserfs_write_lock_once+0x29/0x50
             [<c111c485>] reiserfs_get_block+0x85/0x1620
             [<c10e1040>] do_mpage_readpage+0x1f0/0x6d0
             [<c10e1640>] mpage_readpages+0xc0/0x100
             [<c1119b89>] reiserfs_readpages+0x19/0x20
             [<c108f1ec>] __do_page_cache_readahead+0x1bc/0x260
             [<c108f2b8>] ra_submit+0x28/0x40
             [<c1087e3e>] filemap_fault+0x40e/0x420
             [<c109b5fd>] __do_fault+0x3d/0x430
             [<c109d47e>] handle_mm_fault+0x12e/0x790
             [<c1022a65>] do_page_fault+0x135/0x330
             [<c1403663>] error_code+0x6b/0x70
             [<c10ef9ca>] load_elf_binary+0x82a/0x1a10
             [<c10ba130>] search_binary_handler+0x90/0x1d0
             [<c10bb70f>] do_execve+0x1df/0x250
             [<c1001746>] sys_execve+0x46/0x70
             [<c1002fa5>] syscall_call+0x7/0xb
      
      -> #2 (&mm->mmap_sem){++++++}:
             [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c109b1ab>] might_fault+0x8b/0xb0
             [<c11b8f52>] copy_to_user+0x32/0x70
             [<c10c3b94>] filldir64+0xa4/0xf0
             [<c1109116>] sysfs_readdir+0x116/0x210
             [<c10c3e1d>] vfs_readdir+0x8d/0xb0
             [<c10c3ea9>] sys_getdents64+0x69/0xb0
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      -> #1 (sysfs_mutex){+.+.+.}:
             [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c140199b>] mutex_lock_nested+0x5b/0x340
             [<c110951c>] sysfs_addrm_start+0x2c/0xb0
             [<c1109aa0>] create_dir+0x40/0x90
             [<c1109b1b>] sysfs_create_dir+0x2b/0x50
             [<c11b2352>] kobject_add_internal+0xc2/0x1b0
             [<c11b2531>] kobject_add_varg+0x31/0x50
             [<c11b25ac>] kobject_add+0x2c/0x60
             [<c1258294>] device_add+0x94/0x560
             [<c11036ea>] add_partition+0x18a/0x2a0
             [<c110418a>] rescan_partitions+0x33a/0x450
             [<c10de5bf>] __blkdev_get+0x12f/0x2d0
             [<c10de76a>] blkdev_get+0xa/0x10
             [<c11034b8>] register_disk+0x108/0x130
             [<c11a87a9>] add_disk+0xd9/0x130
             [<c12998e5>] sd_probe_async+0x105/0x1d0
             [<c10528af>] async_thread+0xcf/0x230
             [<c104bfd4>] kthread+0x74/0x80
             [<c1003aab>] kernel_thread_helper+0x7/0x3c
      
      -> #0 (&bdev->bd_mutex){+.+.+.}:
             [<c105f176>] __lock_acquire+0x18f6/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c140199b>] mutex_lock_nested+0x5b/0x340
             [<c10de2c2>] __blkdev_put+0x22/0x160
             [<c10de40a>] blkdev_put+0xa/0x10
             [<c113ce22>] free_journal_ram+0xd2/0x130
             [<c113ea18>] do_journal_release+0x98/0x190
             [<c113eb2a>] journal_release+0xa/0x10
             [<c1128eb6>] reiserfs_put_super+0x36/0x130
             [<c10b776f>] generic_shutdown_super+0x4f/0xe0
             [<c10b7825>] kill_block_super+0x25/0x40
             [<c11255df>] reiserfs_kill_sb+0x7f/0x90
             [<c10b7f4a>] deactivate_super+0x7a/0x90
             [<c10cccd8>] mntput_no_expire+0x98/0xd0
             [<c10ccfcc>] sys_umount+0x4c/0x310
             [<c10cd2a9>] sys_oldumount+0x19/0x20
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      other info that might help us debug this:
      
      2 locks held by umount/3904:
       #0:  (&type->s_umount_key#30){+++++.}, at: [<c10b7f45>] deactivate_super+0x75/0x90
       #1:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40
      
      stack backtrace:
      Pid: 3904, comm: umount Not tainted 2.6.32-atom #172
      Call Trace:
       [<c13ff903>] ? printk+0x18/0x1a
       [<c105d33a>] print_circular_bug+0xca/0xd0
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c108b66f>] ? free_pcppages_bulk+0x1f/0x250
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c10de2c2>] ? __blkdev_put+0x22/0x160
       [<c10de2c2>] ? __blkdev_put+0x22/0x160
       [<c140199b>] mutex_lock_nested+0x5b/0x340
       [<c10de2c2>] ? __blkdev_put+0x22/0x160
       [<c105c932>] ? mark_held_locks+0x62/0x80
       [<c10afe12>] ? kfree+0x92/0xd0
       [<c10de2c2>] __blkdev_put+0x22/0x160
       [<c105cc3b>] ? trace_hardirqs_on+0xb/0x10
       [<c10de40a>] blkdev_put+0xa/0x10
       [<c113ce22>] free_journal_ram+0xd2/0x130
       [<c113ea18>] do_journal_release+0x98/0x190
       [<c113eb2a>] journal_release+0xa/0x10
       [<c1128eb6>] reiserfs_put_super+0x36/0x130
       [<c1050596>] ? up_write+0x16/0x30
       [<c10b776f>] generic_shutdown_super+0x4f/0xe0
       [<c10b7825>] kill_block_super+0x25/0x40
       [<c10f41e0>] ? vfs_quota_off+0x0/0x20
       [<c11255df>] reiserfs_kill_sb+0x7f/0x90
       [<c10b7f4a>] deactivate_super+0x7a/0x90
       [<c10cccd8>] mntput_no_expire+0x98/0xd0
       [<c10ccfcc>] sys_umount+0x4c/0x310
       [<c10cd2a9>] sys_oldumount+0x19/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      0523676d
    • F
      reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr · 27026a05
      Frederic Weisbecker 提交于
      While deleting the xattrs of an inode, we hold the reiserfs lock
      and grab the inode->i_mutex of the targeted inode and the root
      private xattr directory.
      
      Later on, we may relax the reiserfs lock for various reasons, this
      creates inverted dependencies.
      
      We can remove the reiserfs lock -> i_mutex dependency by relaxing
      the former before calling open_xa_dir(). This is fine because the
      lookup and creation of xattr private directories done in
      open_xa_dir() are covered by the targeted inode mutexes. And deeper
      operations in the tree are still done under the write lock.
      
      This fixes the following lockdep report:
      
      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.32-atom #173
      -------------------------------------------------------
      cp/3204 is trying to acquire lock:
       (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
      
      but task is already holding lock:
       (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
             [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401a2b>] mutex_lock_nested+0x5b/0x340
             [<c1141d83>] open_xa_dir+0x43/0x1b0
             [<c1142722>] reiserfs_for_each_xattr+0x62/0x260
             [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
             [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
             [<c10c9c32>] generic_delete_inode+0xa2/0x170
             [<c10c9d4f>] generic_drop_inode+0x4f/0x70
             [<c10c8b07>] iput+0x47/0x50
             [<c10c0965>] do_unlinkat+0xd5/0x160
             [<c10c0a00>] sys_unlink+0x10/0x20
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      -> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
             [<c105f176>] __lock_acquire+0x18f6/0x19e0
             [<c105f2c8>] lock_acquire+0x68/0x90
             [<c1401a2b>] mutex_lock_nested+0x5b/0x340
             [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
             [<c1117012>] reiserfs_lookup+0x62/0x140
             [<c10bd85f>] __lookup_hash+0xef/0x110
             [<c10bf21d>] lookup_one_len+0x8d/0xc0
             [<c1141e2a>] open_xa_dir+0xea/0x1b0
             [<c1141fe5>] xattr_lookup+0x15/0x160
             [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
             [<c1144042>] reiserfs_get_acl+0xa2/0x360
             [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
             [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
             [<c10bea96>] vfs_mkdir+0xd6/0x180
             [<c10c0c10>] sys_mkdirat+0xc0/0xd0
             [<c10c0c40>] sys_mkdir+0x20/0x30
             [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      other info that might help us debug this:
      
      2 locks held by cp/3204:
       #0:  (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
       #1:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0
      
      stack backtrace:
      Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
      Call Trace:
       [<c13ff993>] ? printk+0x18/0x1a
       [<c105d33a>] print_circular_bug+0xca/0xd0
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105d3aa>] ? check_usage+0x6a/0x460
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
       [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
       [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
       [<c1117012>] reiserfs_lookup+0x62/0x140
       [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
       [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e2a>] open_xa_dir+0xea/0x1b0
       [<c1141fe5>] xattr_lookup+0x15/0x160
       [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
       [<c1144042>] reiserfs_get_acl+0xa2/0x360
       [<c10ca2e7>] ? new_inode+0x27/0xa0
       [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
       [<c1402eb7>] ? _spin_unlock+0x27/0x40
       [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
       [<c10c7cb8>] ? __d_lookup+0x108/0x190
       [<c105c932>] ? mark_held_locks+0x62/0x80
       [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
       [<c10bd17a>] ? generic_permission+0x1a/0xa0
       [<c11788fe>] ? security_inode_permission+0x1e/0x20
       [<c10bea96>] vfs_mkdir+0xd6/0x180
       [<c10c0c10>] sys_mkdirat+0xc0/0xd0
       [<c10505c6>] ? up_read+0x16/0x30
       [<c1002fd8>] ? restore_all_notrace+0x0/0x18
       [<c10c0c40>] sys_mkdir+0x20/0x30
       [<c1002ec4>] sysenter_do_call+0x12/0x32
      
      v2: Don't drop reiserfs_mutex_lock_nested_safe() as we'll still
          need it later
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Tested-by: NChristian Kujau <lists@nerdbynature.de>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      27026a05
    • F
      reiserfs: Warn on lock relax if taken recursively · c4a62ca3
      Frederic Weisbecker 提交于
      When we relax the reiserfs lock to avoid creating unwanted
      dependencies against others locks while grabbing these,
      we want to ensure it has not been taken recursively, otherwise
      the lock won't be really relaxed. Only its depth will be decreased.
      The unwanted dependency would then actually happen.
      
      To prevent from that, add a reiserfs_lock_check_recursive() call
      in the places that need it.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      c4a62ca3
    • F
      reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion · 0719d343
      Frederic Weisbecker 提交于
      i_xattr_sem depends on the reiserfs lock. But after we grab
      i_xattr_sem, we may relax/relock the reiserfs lock while waiting
      on a freezed filesystem, creating a dependency inversion between
      the two locks.
      
      In order to avoid the i_xattr_sem -> reiserfs lock dependency, let's
      create a reiserfs_down_read_safe() that acts like
      reiserfs_mutex_lock_safe(): relax the reiserfs lock while grabbing
      another lock to avoid undesired dependencies induced by the
      heivyweight reiserfs lock.
      
      This fixes the following warning:
      
      [  990.005931] =======================================================
      [  990.012373] [ INFO: possible circular locking dependency detected ]
      [  990.013233] 2.6.33-rc1 #1
      [  990.013233] -------------------------------------------------------
      [  990.013233] dbench/1891 is trying to acquire lock:
      [  990.013233]  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
      [  990.013233]
      [  990.013233] but task is already holding lock:
      [  990.013233]  (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]
      [  990.013233] which lock already depends on the new lock.
      [  990.013233]
      [  990.013233]
      [  990.013233] the existing dependency chain (in reverse order) is:
      [  990.013233]
      [  990.013233] -> #1 (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}:
      [  990.013233]        [<ffffffff81063afc>] __lock_acquire+0xf9c/0x1560
      [  990.013233]        [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
      [  990.013233]        [<ffffffff814ac194>] down_write+0x44/0x80
      [  990.013233]        [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]        [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
      [  990.013233]        [<ffffffff8115a6aa>] user_set+0x8a/0x90
      [  990.013233]        [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
      [  990.013233]        [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
      [  990.013233]        [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
      [  990.013233]        [<ffffffff810e2780>] setxattr+0xc0/0x150
      [  990.013233]        [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
      [  990.013233]        [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
      [  990.013233]
      [  990.013233] -> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
      [  990.013233]        [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
      [  990.013233]        [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
      [  990.013233]        [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
      [  990.013233]        [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
      [  990.013233]        [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
      [  990.013233]        [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
      [  990.013233]        [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
      [  990.013233]        [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
      [  990.013233]        [<ffffffff8115a6aa>] user_set+0x8a/0x90
      [  990.013233]        [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
      [  990.013233]        [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
      [  990.013233]        [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
      [  990.013233]        [<ffffffff810e2780>] setxattr+0xc0/0x150
      [  990.013233]        [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
      [  990.013233]        [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
      [  990.013233]
      [  990.013233] other info that might help us debug this:
      [  990.013233]
      [  990.013233] 2 locks held by dbench/1891:
      [  990.013233]  #0:  (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff810e2678>] vfs_setxattr+0x78/0xc0
      [  990.013233]  #1:  (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]
      [  990.013233] stack backtrace:
      [  990.013233] Pid: 1891, comm: dbench Not tainted 2.6.33-rc1 #1
      [  990.013233] Call Trace:
      [  990.013233]  [<ffffffff81061639>] print_circular_bug+0xe9/0xf0
      [  990.013233]  [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
      [  990.013233]  [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]  [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
      [  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
      [  990.013233]  [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]  [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
      [  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
      [  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
      [  990.013233]  [<ffffffff81062592>] ? mark_held_locks+0x72/0xa0
      [  990.013233]  [<ffffffff814ab81d>] ? __mutex_unlock_slowpath+0xbd/0x140
      [  990.013233]  [<ffffffff810628ad>] ? trace_hardirqs_on_caller+0x14d/0x1a0
      [  990.013233]  [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
      [  990.013233]  [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
      [  990.013233]  [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
      [  990.013233]  [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
      [  990.013233]  [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
      [  990.013233]  [<ffffffff814abcb4>] ? __mutex_lock_common+0x284/0x3b0
      [  990.013233]  [<ffffffff8115a6aa>] user_set+0x8a/0x90
      [  990.013233]  [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
      [  990.013233]  [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
      [  990.013233]  [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
      [  990.013233]  [<ffffffff810e2780>] setxattr+0xc0/0x150
      [  990.013233]  [<ffffffff81056018>] ? sched_clock_cpu+0xb8/0x100
      [  990.013233]  [<ffffffff8105eded>] ? trace_hardirqs_off+0xd/0x10
      [  990.013233]  [<ffffffff810560a3>] ? cpu_clock+0x43/0x50
      [  990.013233]  [<ffffffff810c6820>] ? fget+0xb0/0x110
      [  990.013233]  [<ffffffff810c6770>] ? fget+0x0/0x110
      [  990.013233]  [<ffffffff81002ddc>] ? sysret_check+0x27/0x62
      [  990.013233]  [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
      [  990.013233]  [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
      Reported-and-tested-by: NChristian Kujau <lists@nerdbynature.de>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      0719d343
  9. 01 1月, 2010 2 次提交
    • T
      ext4: Calculate metadata requirements more accurately · 9d0be502
      Theodore Ts'o 提交于
      In the past, ext4_calc_metadata_amount(), and its sub-functions
      ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
      badly over-estimated the number of metadata blocks that might be
      required for delayed allocation blocks.  This didn't matter as much
      when functions which managed the reserved metadata blocks were more
      aggressive about dropping reserved metadata blocks as delayed
      allocation blocks were written, but unfortunately they were too
      aggressive.  This was fixed in commit 0637c6f4, but as a result the
      over-estimation by ext4_calc_metadata_amount() would lead to reserving
      2-3 times the number of pending delayed allocation blocks as
      potentially required metadata blocks.  So if there are 1 megabytes of
      blocks which have been not yet been allocation, up to 3 megabytes of
      space would get reserved out of the user's quota and from the file
      system free space pool until all of the inode's data blocks have been
      allocated.
      
      This commit addresses this problem by much more accurately estimating
      the number of metadata blocks that will be required.  It will still
      somewhat over-estimate the number of blocks needed, since it must make
      a worst case estimate not knowing which physical blocks will be
      needed, but it is much more accurate than before.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      9d0be502
    • T
      ext4: Fix accounting of reserved metadata blocks · ee5f4d9c
      Theodore Ts'o 提交于
      Commit 0637c6f4 had a typo which caused the reserved metadata blocks to
      not be released correctly.   Fix this.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      ee5f4d9c
  10. 31 12月, 2009 3 次提交
  11. 30 12月, 2009 2 次提交
  12. 26 12月, 2009 1 次提交
  13. 25 12月, 2009 2 次提交
  14. 24 12月, 2009 2 次提交