1. 05 Oct 2010 (1 commit)
    • BKL: Explicitly add BKL around get_sb/fill_super · db719222
      Committed by Jan Blunck
      This patch is a preparation necessary to remove the BKL from do_new_mount().
      It explicitly adds calls to lock_kernel()/unlock_kernel() around
      get_sb/fill_super operations for filesystems that still use the BKL.
      
      I've read through all the code formerly covered by the BKL inside
      do_kern_mount() and have satisfied myself that it doesn't need the BKL
      any more.
      
      do_kern_mount() is already called without the BKL when mounting the rootfs
      and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
      from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
      through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
      afs_mntpt_follow_link(). Both of the latter functions are actually the
      filesystem's follow_link inode operation. vfs_kern_mount() calls the
      specified get_sb function and lets the filesystem do its job by calling
      the given fill_super function.
      
      Therefore I think it is safe to push down the BKL from the VFS to the
      low-level filesystems' get_sb/fill_super operations.
      
      [arnd: do not add the BKL to those file systems that already
             don't use it elsewhere]
      Signed-off-by: Jan Blunck <jblunck@infradead.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Christoph Hellwig <hch@infradead.org>
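
The push-down described in this commit can be illustrated with a minimal userspace sketch. Everything here is a stand-in invented for illustration: `lock_kernel_stub()` and `unlock_kernel_stub()` only count nesting and are not the kernel's lock_kernel()/unlock_kernel(), and `legacy_get_sb()` and `legacy_fill_super()` are hypothetical names, not functions from the patch. The point is only the shape: the filesystem's own get_sb path takes the lock around fill_super, so the generic caller no longer has to.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the BKL: a nesting counter, not a real lock. */
int bkl_depth;
int depth_seen_by_fill_super = -1;

void lock_kernel_stub(void)   { bkl_depth++; }
void unlock_kernel_stub(void) { assert(bkl_depth > 0); bkl_depth--; }

typedef int (*fill_super_fn)(void *sb);

/* Hypothetical fill_super of a filesystem that still relies on the BKL. */
int legacy_fill_super(void *sb)
{
    (void)sb;
    depth_seen_by_fill_super = bkl_depth;  /* record that the lock is held */
    return 0;
}

/* The filesystem's own get_sb takes the lock; the generic caller does not. */
int legacy_get_sb(void *sb, fill_super_fn fill_super)
{
    int ret;

    lock_kernel_stub();
    ret = fill_super(sb);
    unlock_kernel_stub();
    return ret;
}
```

A caller that holds no lock can invoke `legacy_get_sb(NULL, legacy_fill_super)`; the fill_super callback still runs with the stand-in lock held, and the lock is released again on return.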
  2. 24 Sep 2010 (5 commits)
  3. 18 Sep 2010 (1 commit)
  4. 11 Sep 2010 (4 commits)
    • Ocfs2: Handle empty list in lockres_seq_start() for dlmdebug.c · 228ac635
      Committed by Tristan Ye
      This patch tries to handle the case in which list 'dlm->tracking_list' is
      empty, to avoid accessing an invalid pointer. It fixes the following oops:
      
      http://oss.oracle.com/bugzilla/show_bug.cgi?id=1287

      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
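
The empty-list case this commit guards against can be sketched in userspace C. This is not the code from dlmdebug.c: `struct list_head` below is a minimal re-creation of the kernel's circular list, and `struct tracked_res` and `first_tracked_res()` are hypothetical names. On an empty circular list, `head->next` points back at the head itself, so converting it to the containing structure would yield an invalid pointer; the check returns NULL instead.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical tracked resource; the list node is embedded, kernel style. */
struct tracked_res {
    int id;
    struct list_head tracking;
};

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

int list_is_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Return the first tracked resource, or NULL when the list is empty. */
struct tracked_res *first_tracked_res(struct list_head *head)
{
    if (list_is_empty(head))
        return NULL;
    /* Step back from the embedded node to the containing structure. */
    return (struct tracked_res *)((char *)head->next -
                                  offsetof(struct tracked_res, tracking));
}
```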
    • Ocfs2: Re-access the journal after ocfs2_insert_extent() in dxdir codes. · 0f4da216
      Committed by Tristan Ye
      In ocfs2_dx_dir_rebalance(), we need to call journal_access on the blocks
      again after calling ocfs2_insert_extent(), since growing an extent tree may
      trigger ocfs2_extend_trans(), which makes the previous journal_access
      meaningless.

      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Fix lockdep warning in reflink. · 07eaac94
      Committed by Tao Ma
      This patch changes mutex_lock to use a new subclass and
      adds a new inode lock subclass for the target inode,
      which caused this lockdep warning.
      
      =============================================
      [ INFO: possible recursive locking detected ]
      2.6.35+ #5
      ---------------------------------------------
      reflink/11086 is trying to acquire lock:
       (Meta){+++++.}, at: [<ffffffffa06f9d65>] ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
      
      but task is already holding lock:
       (Meta){+++++.}, at: [<ffffffffa06f9aa0>] ocfs2_reflink_ioctl+0x5d3/0x1229 [ocfs2]
      
      other info that might help us debug this:
      6 locks held by reflink/11086:
       #0:  (&sb->s_type->i_mutex_key#15/1){+.+.+.}, at: [<ffffffff820e09ec>] lookup_create+0x26/0x97
       #1:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffffa06f99a0>] ocfs2_reflink_ioctl+0x4d3/0x1229 [ocfs2]
       #2:  (Meta){+++++.}, at: [<ffffffffa06f9aa0>] ocfs2_reflink_ioctl+0x5d3/0x1229 [ocfs2]
       #3:  (&oi->ip_xattr_sem){+.+.+.}, at: [<ffffffffa06f9b58>] ocfs2_reflink_ioctl+0x68b/0x1229 [ocfs2]
       #4:  (&oi->ip_alloc_sem){+.+.+.}, at: [<ffffffffa06f9b67>] ocfs2_reflink_ioctl+0x69a/0x1229 [ocfs2]
       #5:  (&sb->s_type->i_mutex_key#15/2){+.+...}, at: [<ffffffffa06f9d4f>] ocfs2_reflink_ioctl+0x882/0x1229 [ocfs2]
      
      stack backtrace:
      Pid: 11086, comm: reflink Not tainted 2.6.35+ #5
      Call Trace:
       [<ffffffff82063dd9>] validate_chain+0x56e/0xd68
       [<ffffffff82062275>] ? mark_held_locks+0x49/0x69
       [<ffffffff82064d6d>] __lock_acquire+0x79a/0x7f1
       [<ffffffff82065a81>] lock_acquire+0xc6/0xed
       [<ffffffffa06f9d65>] ? ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
       [<ffffffffa06c9ade>] __ocfs2_cluster_lock+0x975/0xa0d [ocfs2]
       [<ffffffffa06f9d65>] ? ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
       [<ffffffffa06e107b>] ? ocfs2_wait_for_recovery+0x15/0x8a [ocfs2]
       [<ffffffffa06cb6ea>] ocfs2_inode_lock_full_nested+0x1ac/0xdc5 [ocfs2]
       [<ffffffffa06f9d65>] ? ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
       [<ffffffff820623a0>] ? trace_hardirqs_on_caller+0x10b/0x12f
       [<ffffffff82060193>] ? debug_mutex_free_waiter+0x4f/0x53
       [<ffffffffa06f9d65>] ocfs2_reflink_ioctl+0x898/0x1229 [ocfs2]
       [<ffffffffa06ce24a>] ? ocfs2_file_lock_res_init+0x66/0x78 [ocfs2]
       [<ffffffff820bb2d2>] ? might_fault+0x40/0x8d
       [<ffffffffa06df9f6>] ocfs2_ioctl+0x61a/0x656 [ocfs2]
       [<ffffffff820ee5d3>] ? mntput_no_expire+0x1d/0xb0
       [<ffffffff820e07b3>] ? path_put+0x2c/0x31
       [<ffffffff820e53ac>] vfs_ioctl+0x2a/0x9d
       [<ffffffff820e5903>] do_vfs_ioctl+0x45d/0x4ae
       [<ffffffff8233a7f6>] ? _raw_spin_unlock+0x26/0x2a
       [<ffffffff8200299c>] ? sysret_check+0x27/0x62
       [<ffffffff820e59ab>] sys_ioctl+0x57/0x7a
       [<ffffffff8200296b>] system_call_fastpath+0x16/0x1b

      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2/lockdep: Move ip_xattr_sem out of ocfs2_xattr_get_nolock. · 5e64b0d9
      Committed by Tao Ma
      As the name shows, we shouldn't take any lock in
      ocfs2_xattr_get_nolock, so lift ip_xattr_sem to the caller.
      This should be safe for us since the only two callers are:
      1. ocfs2_xattr_get, which will lock the resources.
      2. ocfs2_mknod, which doesn't need this locking.
      
      This also resolves the following lockdep warning.
      
      =======================================================
      [ INFO: possible circular locking dependency detected ]
      2.6.35+ #5
      -------------------------------------------------------
      reflink/30027 is trying to acquire lock:
       (&oi->ip_alloc_sem){+.+.+.}, at: [<ffffffffa0673b67>] ocfs2_reflink_ioctl+0x69a/0x1226 [ocfs2]
      
      but task is already holding lock:
       (&oi->ip_xattr_sem){++++..}, at: [<ffffffffa0673b58>] ocfs2_reflink_ioctl+0x68b/0x1226 [ocfs2]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #3 (&oi->ip_xattr_sem){++++..}:
             [<ffffffff82064d6d>] __lock_acquire+0x79a/0x7f1
             [<ffffffff82065a81>] lock_acquire+0xc6/0xed
             [<ffffffff82339650>] down_read+0x34/0x47
             [<ffffffffa0691cb8>] ocfs2_xattr_get_nolock+0xa0/0x4e6 [ocfs2]
             [<ffffffffa069d64f>] ocfs2_get_acl_nolock+0x5c/0x132 [ocfs2]
             [<ffffffffa069d9c7>] ocfs2_init_acl+0x60/0x243 [ocfs2]
             [<ffffffffa066499d>] ocfs2_mknod+0xae8/0xfea [ocfs2]
             [<ffffffffa0665041>] ocfs2_create+0x9d/0x105 [ocfs2]
             [<ffffffff820e1c83>] vfs_create+0x9b/0xf4
             [<ffffffff820e20bb>] do_last+0x2fd/0x5be
             [<ffffffff820e31c0>] do_filp_open+0x1fb/0x572
             [<ffffffff820d6cf6>] do_sys_open+0x5a/0xe7
             [<ffffffff820d6dac>] sys_open+0x1b/0x1d
             [<ffffffff8200296b>] system_call_fastpath+0x16/0x1b
      
      -> #2 (jbd2_handle){+.+...}:
             [<ffffffff82064d6d>] __lock_acquire+0x79a/0x7f1
             [<ffffffff82065a81>] lock_acquire+0xc6/0xed
             [<ffffffffa0604ff8>] start_this_handle+0x4a3/0x4bc [jbd2]
             [<ffffffffa06051d6>] jbd2__journal_start+0xba/0xee [jbd2]
             [<ffffffffa0605218>] jbd2_journal_start+0xe/0x10 [jbd2]
             [<ffffffffa065ca34>] ocfs2_start_trans+0xb7/0x19b [ocfs2]
             [<ffffffffa06645f3>] ocfs2_mknod+0x73e/0xfea [ocfs2]
             [<ffffffffa0665041>] ocfs2_create+0x9d/0x105 [ocfs2]
             [<ffffffff820e1c83>] vfs_create+0x9b/0xf4
             [<ffffffff820e20bb>] do_last+0x2fd/0x5be
             [<ffffffff820e31c0>] do_filp_open+0x1fb/0x572
             [<ffffffff820d6cf6>] do_sys_open+0x5a/0xe7
             [<ffffffff820d6dac>] sys_open+0x1b/0x1d
             [<ffffffff8200296b>] system_call_fastpath+0x16/0x1b
      
      -> #1 (&journal->j_trans_barrier){.+.+..}:
             [<ffffffff82064d6d>] __lock_acquire+0x79a/0x7f1
             [<ffffffff82064fa9>] lock_release_non_nested+0x1e5/0x24b
             [<ffffffff82065999>] lock_release+0x158/0x17a
             [<ffffffff823389f6>] __mutex_unlock_slowpath+0xbf/0x11b
             [<ffffffff82338a5b>] mutex_unlock+0x9/0xb
             [<ffffffffa0679673>] ocfs2_free_ac_resource+0x31/0x67 [ocfs2]
             [<ffffffffa067c6bc>] ocfs2_free_alloc_context+0x11/0x1d [ocfs2]
             [<ffffffffa0633de0>] ocfs2_write_begin_nolock+0x141e/0x159b [ocfs2]
             [<ffffffffa0635523>] ocfs2_write_begin+0x11e/0x1e7 [ocfs2]
             [<ffffffff820a1297>] generic_file_buffered_write+0x10c/0x210
             [<ffffffffa0653624>] ocfs2_file_aio_write+0x4cc/0x6d3 [ocfs2]
             [<ffffffff820d822d>] do_sync_write+0xc2/0x106
             [<ffffffff820d897b>] vfs_write+0xae/0x131
             [<ffffffff820d8e55>] sys_write+0x47/0x6f
             [<ffffffff8200296b>] system_call_fastpath+0x16/0x1b
      
      -> #0 (&oi->ip_alloc_sem){+.+.+.}:
             [<ffffffff82063f92>] validate_chain+0x727/0xd68
             [<ffffffff82064d6d>] __lock_acquire+0x79a/0x7f1
             [<ffffffff82065a81>] lock_acquire+0xc6/0xed
             [<ffffffff82339694>] down_write+0x31/0x52
             [<ffffffffa0673b67>] ocfs2_reflink_ioctl+0x69a/0x1226 [ocfs2]
             [<ffffffffa06599f6>] ocfs2_ioctl+0x61a/0x656 [ocfs2]
             [<ffffffff820e53ac>] vfs_ioctl+0x2a/0x9d
             [<ffffffff820e5903>] do_vfs_ioctl+0x45d/0x4ae
             [<ffffffff820e59ab>] sys_ioctl+0x57/0x7a
             [<ffffffff8200296b>] system_call_fastpath+0x16/0x1b

      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  5. 08 Sep 2010 (13 commits)
  6. 10 Aug 2010 (6 commits)
  7. 08 Aug 2010 (8 commits)
    • O2net: Disallow o2net accept connection request from itself. · 415cf32c
      Committed by Tristan Ye
      Currently, o2net_accept_one() is allowed to accept a connection from the
      listening node itself. Such a fake connection is never successfully
      established because no handshake is detected afterwards, and it later ends
      up triggering the connecting worker in a loop.
      
      We fix this by treating such a connection request as 'invalid', since a
      node never needs to request a connection to itself in an OCFS2 cluster.
      
      The fix doesn't hurt a user's scan for the o2net listener; the scan always
      gets a successful connection from userspace.

      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2/dlm: remove potential deadlock -V3 · b11f1f1a
      Committed by Wengang Wang
      When we need to take both dlm_domain_lock and dlm->spinlock, we should take
      them in this order: dlm_domain_lock, then dlm->spinlock.
      
      Some paths disobey this order: dlm_run_purge_list calls dlm_lockres_put()
      with dlm->spinlock held. dlm_lockres_put() calls dlm_put() when the ref is
      released, and dlm_put() takes dlm_domain_lock.
      
      Fix:
      Don't grab/put the dlm when initialising/releasing a lockres.
      That grab is not required because we don't call dlm_unregister_domain()
      based on refcount.

      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
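
The ordering rule in this commit (dlm_domain_lock first, then dlm->spinlock) can be made concrete with a tiny order checker, loosely in the spirit of lockdep. This is a sketch under stated assumptions: it takes no real locks, and the rank values and the names `rank_acquire()` and `rank_release()` are hypothetical, chosen only to show how an out-of-order acquisition can be detected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks: a lock with a lower rank must be taken first. */
enum lock_rank {
    RANK_DOMAIN_LOCK = 1,   /* stands in for dlm_domain_lock */
    RANK_DLM_SPINLOCK = 2,  /* stands in for dlm->spinlock */
};

#define MAX_HELD 8

/* Stack of ranks currently held by one thread of execution. */
struct held_locks {
    enum lock_rank rank[MAX_HELD];
    int depth;
};

/* Record an acquisition; return false if it violates the rank order. */
bool rank_acquire(struct held_locks *h, enum lock_rank r)
{
    if (h->depth == MAX_HELD)
        return false;
    if (h->depth > 0 && h->rank[h->depth - 1] >= r)
        return false;  /* a lock of equal or higher rank is already held */
    h->rank[h->depth++] = r;
    return true;
}

/* Release the most recently acquired lock. */
void rank_release(struct held_locks *h)
{
    if (h->depth > 0)
        h->depth--;
}
```

Taking the domain lock and then the spinlock passes the check. Taking the spinlock first and then asking for the domain lock, which is the shape of the path the commit describes, is rejected.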
    • ocfs2/dlm: avoid incorrect bit set in refmap on recovery master · a524812b
      Committed by Wengang Wang
      In the following situation, an incorrect bit remains in the refmap on the
      recovery master. Eventually the recovery master fails to purge the lockres
      because of the incorrect bit in the refmap.
      
      1) Node A has no interest in lockres A any longer, so it is purging it.
      2) The owner of lockres A is node B, so node A is sending a de-ref message
      to node B.
      3) At this time, node B crashes. Node C becomes the recovery master. It recovers
      lockres A (because the master is the dead node B).
      4) Node A migrates lockres A to node C with a refbit there.
      5) Node A fails to send the de-ref message to node B because it crashed. The failure
      is ignored. No other action is taken for lockres A any more.
      
      Normally, re-sending the deref message to the recovery master would fix it. However,
      ignoring the failure of the deref to the original master and not recovering the lockres
      to the recovery master has the same effect, and the latter is simpler.

      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • Fix the nested PR lock calling issue in ACL · 845b6cf3
      Committed by Jiaju Zhang
      Hi,
      
      Thanks a lot for all the review and comments so far;) I'd like to send
      the improved (V4) version of this patch.
      
      This patch fixes a deadlock in OCFS2 ACL. We found this bug in a scenario
      that integrates OCFS2 with Samba; the symptom is that several smbd
      processes hang under heavy workload. Eventually we found that the nested
      PR lock calls lead to this deadlock:
      
       node1        node2
                    gr PR
                      |
                      V
       PR(EX)---> BAST:OCFS2_LOCK_BLOCKED
                      |
                      V
                    rq PR
                      |
                      V
                    wait=1
      
      After requesting the 2nd PR lock, the process "smbd" went into D
      state. It can only be woken up when the 1st PR lock's RO holder equals
      zero. There should be an ocfs2_inode_unlock in the calling path later
      on, which can decrement the RO holder. But since it has been in
      uninterruptible sleep, the unlock function has no chance to be called.
      
      The related stack trace is:
      smbd          D ffff8800013d0600     0  9522   5608 0x00000000
       ffff88002ca7fb18 0000000000000282 ffff88002f964500 ffff88002ca7fa98
       ffff8800013d0600 ffff88002ca7fae0 ffff88002f964340 ffff88002f964340
       ffff88002ca7ffd8 ffff88002ca7ffd8 ffff88002f964340 ffff88002f964340
      Call Trace:
      [<ffffffff80350425>] schedule_timeout+0x175/0x210
      [<ffffffff8034f580>] wait_for_common+0xf0/0x210
      [<ffffffffa03e12b9>] __ocfs2_cluster_lock+0x3b9/0xa90 [ocfs2]
      [<ffffffffa03e7665>] ocfs2_inode_lock_full_nested+0x255/0xdb0 [ocfs2]
      [<ffffffffa0446019>] ocfs2_get_acl+0x69/0x120 [ocfs2]
      [<ffffffffa0446368>] ocfs2_check_acl+0x28/0x80 [ocfs2]
      [<ffffffff800e3507>] acl_permission_check+0x57/0xb0
      [<ffffffff800e357d>] generic_permission+0x1d/0xc0
      [<ffffffffa03eecea>] ocfs2_permission+0x10a/0x1d0 [ocfs2]
      [<ffffffff800e3f65>] inode_permission+0x45/0x100
      [<ffffffff800d86b3>] sys_chdir+0x53/0x90
      [<ffffffff80007458>] system_call_fastpath+0x16/0x1b
      [<00007f34a4ef6927>] 0x7f34a4ef6927
      
      For details, please see:
      https://bugzilla.novell.com/show_bug.cgi?id=614332 and
      http://oss.oracle.com/bugzilla/show_bug.cgi?id=1278

      Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Count more refcount records in file system fragmentation. · 8a2e70c4
      Committed by Tao Ma
      The refcount record calculation in ocfs2_calc_refcount_meta_credits
      is too optimistic: it assumes that we can always allocate contiguous
      clusters and handle an existing refcount record as a whole. In fact,
      because of file system fragmentation, we may have to split
      a refcount record into 3 parts during the transaction. So consider
      the worst case in the record calculation.
      
      Cc: stable@kernel.org
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
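
The "3 parts" worst case mentioned in this commit follows from simple range arithmetic: carving a sub-range out of the middle of one record leaves a left remainder, the carved range, and a right remainder. The sketch below shows only that arithmetic. `struct rec_range` and `split_record()` are hypothetical names, and how ocfs2_calc_refcount_meta_credits converts the record count into journal credits is not shown in the commit message and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record covering clusters [start, start + len). */
struct rec_range {
    unsigned long start;  /* first cluster covered */
    unsigned long len;    /* number of clusters covered */
};

/*
 * Split rec around the sub-range [start, start + len), which must lie
 * inside rec. Writes the resulting pieces to out and returns how many
 * there are: 1, 2, or 3.
 */
size_t split_record(struct rec_range rec, unsigned long start,
                    unsigned long len, struct rec_range out[3])
{
    unsigned long rec_end = rec.start + rec.len;
    unsigned long end = start + len;
    size_t n = 0;

    assert(len > 0 && start >= rec.start && end <= rec_end);

    if (start > rec.start)   /* left remainder */
        out[n++] = (struct rec_range){ rec.start, start - rec.start };
    out[n++] = (struct rec_range){ start, len };  /* the carved range */
    if (end < rec_end)       /* right remainder */
        out[n++] = (struct rec_range){ end, rec_end - end };
    return n;
}
```

Only a sub-range strictly inside the record produces all three pieces, which is why a calculation that assumes each record is handled as a whole undercounts in the fragmented case.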
    • ocfs2 fix o2dlm dlm run purgelist (rev 3) · 7beaf243
      Committed by Srinivas Eeda
      This patch fixes two problems in dlm_run_purgelist:
      
      1. If a lockres is found to be in use, dlm_run_purgelist keeps trying to purge
      the same lockres instead of trying the next lockres.
      
      2. When a lockres is found unused, dlm_run_purgelist releases the lockres
      spinlock before setting DLM_LOCK_RES_DROPPING_REF and calls dlm_purge_lockres.
      The spinlock is reacquired, but in this window the lockres can get reused.
      This leads to a BUG.
      
      This patch modifies dlm_run_purgelist to skip a lockres if it's in use and purge
      the next lockres. It also sets DLM_LOCK_RES_DROPPING_REF before releasing the
      lockres spinlock, protecting it from getting reused.

      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
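
The two behaviours this commit describes, moving on past an in-use entry and marking an entry before its lock is dropped, can be sketched as a single pass over an array. This is a simplified model, not the o2dlm code: `struct res`, `RES_DROPPING_REF` and `run_purge_list()` are hypothetical stand-ins, and the places where the real code would take or drop the lockres spinlock appear only as comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RES_DROPPING_REF 0x1u  /* stand-in for DLM_LOCK_RES_DROPPING_REF */

/* Hypothetical purge candidate. */
struct res {
    bool in_use;
    unsigned int state;
    bool purged;
};

/* Walk the purge candidates once; return how many were purged. */
size_t run_purge_list(struct res *list, size_t n)
{
    size_t purged = 0;

    for (size_t i = 0; i < n; i++) {
        struct res *r = &list[i];

        /* (the per-resource spinlock would be taken here) */
        if (r->in_use) {
            /* (unlock) Move on instead of retrying the same entry. */
            continue;
        }
        /* Mark the entry while still locked so nobody can reuse it. */
        r->state |= RES_DROPPING_REF;
        /* (unlock; the real purge work would run here) */
        r->purged = true;
        purged++;
    }
    return purged;
}
```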
    • ocfs2/dlm: fix a dead lock · 6d98c3cc
      Committed by Wengang Wang
      When we have to take both dlm->master_lock and lockres->spinlock,
      take them in this order:
      
      lockres->spinlock and then dlm->master_lock.
      
      The patch fixes a violation of this rule.
      We can simply move taking dlm->master_lock to the point where we have
      dropped res->spinlock, since we don't need master_lock's protection when we
      access res->state and free the mle memory.

      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: do not overwrite error codes in ocfs2_init_acl · 6eda3dd3
      Committed by Tiger Yang
      Setting the acl while creating a new inode depends on
      the error codes of posix_acl_create_masq. This patch fixes
      an issue where those error codes were overwritten.

      Reported-by: Pawel Zawora <pzawora@gmail.com>
      Cc: <stable@kernel.org> [ .33, .34 ]
      Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
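
The general pattern behind this fix, keeping one call's return value intact when a later call also returns a status, can be sketched as follows. The control flow is an assumption made for illustration, since the commit message does not show the code: the first helper is assumed to return a negative errno on failure, 0 when nothing further needs storing, and a positive value when an ACL must be stored, and the second helper is assumed to return 0 or a negative errno. `init_acl_buggy()` and `init_acl_fixed()` are hypothetical names, and the helper results are passed in as plain integers.

```c
#include <assert.h>
#include <errno.h>

/*
 * Both functions return 1 when the ACL would be stored, 0 when it would
 * not, and a negative errno on failure.
 */

/* Buggy shape: the second status overwrites the first. */
int init_acl_buggy(int masq_ret, int set_mode_ret)
{
    int ret = masq_ret;

    if (ret < 0)
        return ret;
    ret = set_mode_ret;  /* clobbers the "must store an ACL" answer */
    if (ret > 0)         /* never true when set_mode_ret is 0 or negative */
        return 1;
    return ret;
}

/* Fixed shape: a second variable keeps the first result intact. */
int init_acl_fixed(int masq_ret, int set_mode_ret)
{
    int ret = masq_ret;
    int ret2;

    if (ret < 0)
        return ret;
    ret2 = set_mode_ret;
    if (ret2 < 0)
        return ret2;
    return ret > 0 ? 1 : 0;
}
```

With a first result of 1 and a successful second call, the fixed version reports that the ACL is stored, while the buggy version silently skips it.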
  8. 04 Aug 2010 (1 commit)
  9. 27 Jul 2010 (1 commit)
    • direct-io: move aio_complete into ->end_io · 552ef802
      Committed by Christoph Hellwig
      Filesystems with unwritten extent support must not complete an AIO request
      until the transaction to convert the extent has been committed.  That means
      the aio_complete call needs to be moved into the ->end_io callback so
      that the filesystem can control exactly when to call it.
      
      This makes a bit of a mess out of dio_complete and makes the ->end_io
      callback prototype even more complicated.

      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>