1. 27 Jul, 2016 1 commit
  2. 31 Mar, 2016 1 commit
    • posix_acl: Inode acl caching fixes · b8a7a3a6
      Andreas Gruenbacher committed
      When get_acl() is called for an inode whose ACL is not cached yet, the
      get_acl inode operation is called to fetch the ACL from the filesystem.
      The inode operation is responsible for updating the cached acl with
      set_cached_acl().  This is done without locking at the VFS level, so
      another task can call set_cached_acl() or forget_cached_acl() before the
      get_acl inode operation gets to calling set_cached_acl(), and then
      get_acl's call to set_cached_acl() results in caching an outdated ACL.
      
      Prevent this from happening by setting the cached ACL pointer to a
      task-specific sentinel value before calling the get_acl inode operation.
      Move the responsibility for updating the cached ACL from the get_acl
      inode operations to get_acl().  There, only set the cached ACL if the
      sentinel value hasn't changed.
      
      The sentinel values are chosen to have odd values.  Likewise, the value
      of ACL_NOT_CACHED is odd.  In contrast, ACL object pointers always have
      an even value (ACLs are aligned in memory).  This makes it possible to
      distinguish uncached ACL values from ACL objects.
      
      In addition, switch from guarding inode->i_acl and inode->i_default_acl
      updates by the inode->i_lock spinlock to using xchg() and cmpxchg().
      
      Filesystems that do not want ACLs returned from their get_acl inode
      operations to be cached must call forget_cached_acl() to prevent the VFS
      from doing so.
      
      (Patch written by Al Viro and Andreas Gruenbacher.)
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
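The sentinel scheme described in this commit can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: `struct acl`, `struct acl_slot`, `cached_get()`, `cache_forget()`, `demo_fetch()` and the caller token are hypothetical names, not the kernel's, and reference counting and error handling are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ACL object. Objects are aligned in memory,
 * so their addresses are always even. */
struct acl { int mode; };

/* Odd marker meaning "nothing cached" (value assumed for this sketch). */
#define NOT_CACHED ((uintptr_t)-1)

/* One cache slot: holds either an even object address or an odd marker. */
struct acl_slot { _Atomic uintptr_t val; };

int is_uncached(uintptr_t v) { return (v & 1) != 0; }

/* Per-caller sentinel: an aligned address plus one is odd and caller-specific. */
uintptr_t sentinel_for(const int *caller_token)
{
    return (uintptr_t)caller_token + 1;
}

/* Invalidate the slot unconditionally, like an xchg() to "not cached". */
void cache_forget(struct acl_slot *slot)
{
    atomic_exchange(&slot->val, NOT_CACHED);
}

typedef struct acl *(*fetch_fn)(void *ctx);

struct acl *cached_get(struct acl_slot *slot, const int *caller_token,
                       fetch_fn fetch, void *ctx)
{
    uintptr_t cur = atomic_load(&slot->val);
    if (!is_uncached(cur))
        return (struct acl *)cur;            /* cache hit */

    /* Claim the slot with our sentinel. If the claim fails, someone else
     * is already fetching; we still fetch but will not cache the result. */
    uintptr_t sentinel = sentinel_for(caller_token);
    uintptr_t expected = NOT_CACHED;
    atomic_compare_exchange_strong(&slot->val, &expected, sentinel);

    struct acl *fresh = fetch(ctx);          /* slow path, no lock held */

    /* Publish only if the slot still holds our sentinel, i.e. nobody
     * set or forgot the cached value while we were fetching. */
    expected = sentinel;
    atomic_compare_exchange_strong(&slot->val, &expected, (uintptr_t)fresh);
    return fresh;
}

/* Demo fetch: optionally simulates a concurrent invalidation mid-fetch. */
struct fetch_ctx { struct acl *result; struct acl_slot *invalidate; };

struct acl *demo_fetch(void *p)
{
    struct fetch_ctx *c = p;
    if (c->invalidate)
        cache_forget(c->invalidate);
    return c->result;
}
```

When `demo_fetch()` invalidates the slot during the fetch, the final compare-and-exchange fails and the possibly stale result is returned to the caller without being cached.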
  3. 22 Jan, 2016 1 commit
    • ocfs2: NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock · b1b1e15e
      Tariq Saeed committed
      NFS on a 2-node ocfs2 cluster, each node exporting a directory.  The lock causing
      the hang is the global bit map inode lock.  Node 1 is master, has the
      lock granted in PR mode; Node 2 is in the converting list (PR -> EX).
      There are no holders of the lock on the master node so it should
      downconvert to NL and grant EX to node 2 but that does not happen.
      BLOCKED + QUEUED in lock res are set and it is on osb blocked list.
      Threads are waiting in __ocfs2_cluster_lock on BLOCKED.  One thread
      wants EX, rest want PR.  So it is as though the downconvert thread needs
      to be kicked to complete the conv.
      
      The hang is caused by an EX req coming into __ocfs2_cluster_lock on the
      heels of a PR req after it sets BUSY (drops l_lock, releasing EX
      thread), forcing the incoming EX to wait on BUSY without doing anything.
      PR has called ocfs2_dlm_lock, which sets the node 1 lock from NL -> PR,
      queues ast.
      
      At this time, the upconvert (PR -> EX) arrives from node 2 and finds a
      conflict with the node 1 lock in PR, so the lock res is put on the dlm
      thread's dirty list.
      
      After returning from ocfs2_dlm_lock, the PR thread now waits behind EX on
      BUSY until awoken by the ast.
      
      Now it is dlm_thread that serially runs dlm_shuffle_lists, ast, bast, in
      that order.  dlm_shuffle_lists queues a bast on behalf of node 2 (which
      will be run by dlm_thread right after the ast).  ast does its part, sets
      UPCONVERT_FINISHING, clears BUSY and wakes its waiters.  Next,
      dlm_thread runs bast.  It sets BLOCKED and kicks dc thread.  dc thread
      runs ocfs2_unblock_lock, but since UPCONVERT_FINISHING is set, skips
      doing anything and requeues.
      
      Inside of __ocfs2_cluster_lock, since EX has been waiting on BUSY ahead
      of PR, it wakes up first, finds BLOCKED set and skips doing anything but
      clearing UPCONVERT_FINISHING (which was actually "meant" for the PR
      thread), and this time waits on BLOCKED.  Next, the PR thread comes out
      of wait but since UPCONVERT_FINISHING is not set, it skips updating the
      l_ro_holders and goes straight to wait on BLOCKED.  So there, we have a
      hang! Threads in __ocfs2_cluster_lock wait on BLOCKED, lock res in osb
      blocked list.  Only when dc thread is awoken, it will run
      ocfs2_unblock_lock and things will unhang.
      
      One way to fix this is to wake the dc thread on the flag after clearing
      UPCONVERT_FINISHING.
      
      Orabug: 20933419
      Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Eric Ren <zren@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 15 Jan, 2016 1 commit
  5. 06 Nov, 2015 1 commit
  6. 05 Sep, 2015 1 commit
  7. 07 Aug, 2015 1 commit
    • ocfs2: fix BUG in ocfs2_downconvert_thread_do_work() · 209f7512
      Joseph Qi committed
      The "BUG_ON(list_empty(&osb->blocked_lock_list))" in
      ocfs2_downconvert_thread_do_work can be triggered in the following case:
      
      ocfs2dc first saves osb->blocked_lock_count to the local variable
      processed, and then processes the dentry lockres.  During the dentry
      put, it calls iput and then deletes the rw, inode and open lockres from
      the blocked list in ocfs2_mark_lockres_freeing.  As a result, the
      variable `processed' no longer reflects the number of blocked lockres to
      be processed, which triggers the BUG.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
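The stale-snapshot problem in this commit can be illustrated with a small user-space sketch. The queue type and function names are hypothetical, and the loop guard shown is one possible way to tolerate entries disappearing during processing, not necessarily what the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue of pending work items (stand-in for a blocked list). */
struct work_queue { int items[8]; size_t count; };

/* Process the head item. As a side effect, an item with value v > 0 also
 * removes v further entries, mimicking processing that drops other
 * queued entries behind the worker's back. */
void process_head(struct work_queue *q)
{
    size_t drop = 1 + (size_t)q->items[0];
    if (drop > q->count)
        drop = q->count;
    for (size_t i = drop; i < q->count; i++)
        q->items[i - drop] = q->items[i];
    q->count -= drop;
}

/* Drain up to a snapshot of the queue length, but stop early if the
 * queue empties first. Trusting the snapshot alone would make the worker
 * touch an empty queue. Returns the number of items actually processed. */
size_t drain_snapshot(struct work_queue *q)
{
    size_t processed = q->count;   /* snapshot taken before any work */
    size_t done = 0;

    while (processed && q->count != 0) {
        process_head(q);
        processed--;
        done++;
    }
    return done;
}
```

With a queue of four entries whose first entry removes two others, the snapshot says four but only two iterations find work; the second condition in the loop is what keeps the worker from running on an empty queue.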
  8. 22 Apr, 2015 1 commit
    • Revert "ocfs2: incorrect check for debugfs returns" · 8f443e23
      Linus Torvalds committed
      This reverts commit e2ac55b6.
      
      Huang Ying reports that this causes a hang at boot with debugfs disabled.
      
      It is true that the debugfs error checks are kind of confusing, and this
      code certainly merits more cleanup and thinking about it, but there's
      something wrong with the trivial "check not just for NULL, but for error
      pointers too" patch.
      
      Yes, with debugfs disabled, we will end up setting the o2hb_debug_dir
      pointer variable to an error pointer (-ENODEV), and then continue as if
      everything was fine.  But since debugfs is disabled, all the _users_ of
      that pointer end up being compiled away, so even though the pointer can
      not be dereferenced, that's still fine.
      
      So it's confusing and somewhat questionable, but the "more correct"
      error checks end up causing more trouble than they fix.
      Reported-by: Huang Ying <ying.huang@intel.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Chengyu Song <csong84@gatech.edu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
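The difference between a NULL-only check and a check that also rejects error pointers can be sketched in user space. The helpers below are simplified re-creations of the idea behind the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(), written for illustration; `stub_create_dir()` is a hypothetical stand-in for a creation function that returns -ENODEV as an error pointer when the feature is compiled out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Error codes are encoded in the top 4095 values of the address space. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stub: returns a real object when the feature is enabled,
 * and an error pointer carrying -ENODEV when it is not. */
void *stub_create_dir(int feature_enabled)
{
    static int real_dir;
    return feature_enabled ? (void *)&real_dir : err_ptr(-ENODEV);
}

/* A NULL-only check lets the error pointer through as "success". */
int null_only_check_ok(const void *p) { return p != NULL; }

/* The stricter check treats the error pointer as a failure. */
int strict_check_ok(const void *p) { return p != NULL && !is_err(p); }
```

With the feature disabled, `null_only_check_ok()` accepts the pointer and initialization carries on, while `strict_check_ok()` reports a failure. That is the behavioral change the reverted patch introduced in the disabled configuration.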
  9. 15 Apr, 2015 2 commits
    • ocfs2: check if the ocfs2 lock resource has been initialized before calling ocfs2_dlm_lock · 2f2eca20
      alex chen committed
      If the ocfs2 lockres has not been initialized before calling ocfs2_dlm_lock,
      the lock won't be dropped, which then causes umount to hang.  The case is
      described below:
      
      ocfs2_mknod
          ocfs2_mknod_locked
              __ocfs2_mknod_locked
                  ocfs2_journal_access_di
                  Failed because of -ENOMEM or other reasons, the inode lockres
                  has not been initialized yet.
      
          iput(inode)
              ocfs2_evict_inode
                  ocfs2_delete_inode
                      ocfs2_inode_lock
                          ocfs2_inode_lock_full_nested
                              __ocfs2_cluster_lock
                              Succeeds and allocates a new dlm lockres.
                  ocfs2_clear_inode
                      ocfs2_open_unlock
                          ocfs2_drop_inode_locks
                              ocfs2_drop_lock
                              Since lockres has not been initialized, the lock
                              can't be dropped and the lockres can't be
                              migrated, thus umount will hang forever.
      Signed-off-by: Alex Chen <alex.chen@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: joyce.xue <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: incorrect check for debugfs returns · e2ac55b6
      Chengyu Song committed
      debugfs_create_dir and debugfs_create_file may return -ENODEV when debugfs
      is not configured, so the return value should be checked against
      ERROR_VALUE as well, otherwise the later dereference of the dentry pointer
      would crash the kernel.
      
      This patch tries to solve this problem by fixing certain checks. However,
      I have found that other call sites are protected by #ifdef CONFIG_DEBUG_FS.
      In the current implementation, if CONFIG_DEBUG_FS is defined, then the above
      two functions will never return any ERROR_VALUE. So another possibility
      to fix this is to surround all the buggy checks/functions with the same
      #ifdef CONFIG_DEBUG_FS. But I'm not sure if this would break any functionality,
      as only OCFS2_FS_STATS declares dependency on DEBUG_FS.
      Signed-off-by: Chengyu Song <csong84@gatech.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 11 Feb, 2015 1 commit
    • ocfs2: prune the dcache before deleting the dentry of directory · 10ab8811
      alex chen committed
      In ocfs2_dentry_convert_worker, we should prune the dcache before deleting
      the dentry of a directory; otherwise, in the following case the inode of
      the directory will remain in the orphan directory until the device is
      unmounted.
      
      Mount point: /mnt/ocfs2
      Node A                              Node B
      mkdir /mnt/ocfs2/testdir
        ocfs2_mkdir
        ->ocfs2_mknod
        ->ocfs2_dentry_attach_lock
        ->ocfs2_dentry_lock(dentry, 0)
        ... ...
      touch /mnt/ocfs2/testdir/testfile
                                          unlink /mnt/ocfs2/testdir/testfile
                                          rmdir /mnt/ocfs2/testdir
                                            ocfs2_unlink
                                            ->ocfs2_remote_dentry_delete
                                            ->ocfs2_dentry_lock(dentry, 1)
                                            ... ...
      ... ...
      ocfs2_downconvert_thread
      ->ocfs2_unblock_lock
      ->ocfs2_dentry_convert_worker
      ->ocfs2_find_local_alias
        ->dget_dlock
      ->d_delete
      Here the dentry cannot be
      released because the child
      dentry is negative but still exists.
      As a result, this inode remains
      in the orphan directory until its
      children are destroyed.
      
      So before deleting dentry of directory, we should prune the dcache to
      remove unused children of the parent dentry by shrink_dcache_parent().
      Signed-off-by: Alex Chen <alex.chen@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: joyce.xue <xuejiufei@huawei.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 11 Dec, 2014 1 commit
  12. 20 Nov, 2014 1 commit
  13. 10 Oct, 2014 1 commit
  14. 05 Jun, 2014 1 commit
  15. 04 Apr, 2014 1 commit
  16. 22 Jan, 2014 2 commits
  17. 15 Nov, 2013 1 commit
  18. 08 May, 2013 1 commit
    • aio: remove retry-based AIO · 41003a7b
      Zach Brown committed
      This removes the retry-based AIO infrastructure now that nothing in tree
      is using it.
      
      We want to remove retry-based AIO because it is fundamentally unsafe.
      It retries IO submission from a kernel thread that has only assumed the
      mm of the submitting task.  All other task_struct references in the IO
      submission path will see the kernel thread, not the submitting task.
      This design flaw means that nothing of any meaningful complexity can use
      retry-based AIO.
      
      This removes all the code and data associated with the retry machinery.
      The most significant benefit of this is the removal of the locking
      around the unused run list in the submission path.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Signed-off-by: Zach Brown <zab@redhat.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 22 Feb, 2013 1 commit
  20. 13 Feb, 2013 1 commit
  21. 04 Jul, 2012 2 commits
  22. 02 Nov, 2011 1 commit
  23. 01 Jun, 2011 1 commit
  24. 07 Mar, 2011 1 commit
    • ocfs2: Remove EXIT from masklog. · c1e8d35e
      Tao Ma committed
      mlog_exit is used to record the exit status of a function.
      But because it is added in so many functions, if we enable it,
      the system logs get filled up quickly and cause too much I/O.
      So in practice no one can enable it on a production system or even
      for a test.
      
      This patch removes it or changes it:
      1. if all the error paths already use mlog_errno, it is just removed.
         Otherwise, it will be replaced by mlog_errno.
      2. if it is used to print some return value, it is replaced with
         mlog(0,...).
      mlog_exit_ptr is changed to mlog(0,...).
      All those mlog(0,...) will be replaced with trace events later.
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
  25. 21 Feb, 2011 1 commit
    • ocfs2: Remove ENTRY from masklog. · ef6b689b
      Tao Ma committed
      ENTRY is used to record the entry of a function.
      But because it is added in so many functions, if we enable it,
      the system logs get filled up quickly and cause too much I/O.
      So in practice no one can enable it on a production system or even
      for a test.
      
      So for mlog_entry_void, we just remove it.
      For mlog_entry(...), we replace it with mlog(0,...), and these
      will be replaced by trace events later.
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
  26. 20 Feb, 2011 1 commit
    • ocfs2: Use hrtimer to track ocfs2 fs lock stats · 5bc970e8
      Sunil Mushran committed
      Patch makes use of the hrtimer to track times in ocfs2 lock stats.
      
      The patch is a bit involved to ensure no additional impact on the memory
      footprint. The size of ocfs2_inode_cache remains 1280 bytes on 32-bit systems.
      
      A related change was to modify the unit of the max wait time from nanoseconds
      to microseconds, allowing us to track max times larger than 4 seconds. This
      change necessitated bumping the output version in the debugfs file,
      locking_state, from 2 to 3.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
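The "larger than 4 secs" remark follows from simple arithmetic if the max wait time is held in a 32-bit field, which is an assumption of this sketch (the commit message does not state the field width). The function names and the saturating conversion are illustrative choices, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local unit constants for this sketch. */
#define NSEC_PER_USEC 1000ULL
#define USEC_PER_SEC  1000000ULL
#define NSEC_PER_SEC  1000000000ULL

/* Longest duration, in whole seconds, that fits in a 32-bit counter
 * when the counter ticks units_per_sec times per second. */
uint64_t max_seconds_u32(uint64_t units_per_sec)
{
    return UINT32_MAX / units_per_sec;
}

/* Convert a nanosecond duration to microseconds for a 32-bit field.
 * Saturating at UINT32_MAX is a design choice made for this sketch. */
uint32_t ns_to_u32_usec(uint64_t ns)
{
    uint64_t us = ns / NSEC_PER_USEC;
    return us > UINT32_MAX ? UINT32_MAX : (uint32_t)us;
}
```

Under that assumption, a nanosecond counter tops out at about 4.29 seconds (2^32 ns), while a microsecond counter reaches about 4294 seconds, or roughly 71 minutes.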
  27. 11 Sep, 2010 1 commit
    • Track negative entries v3 · 5e98d492
      Goldwyn Rodrigues committed
      Track negative dentries by recording the generation number of the parent
      directory in d_fsdata. The generation number for the parent directory is
      recorded in the inode_info, which increments every time the lock on the
      directory is dropped.
      
      If the generation numbers of the parent directory and the negative dentry
      match, there is no need to perform the revalidate; otherwise a revalidate
      is forced. This improves performance in situations where nodes look for
      the same non-existent file multiple times.
      
      Thanks Mark for explaining the DLM sequence.
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
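The generation-number check described in this commit can be sketched as follows. The struct and function names are hypothetical; in the commit the recorded value lives in d_fsdata and the counter in the inode_info, which this user-space illustration only mimics.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-directory state: a counter bumped every time the
 * lock on the directory is dropped. */
struct dir_info { unsigned long neg_gen; };

/* Hypothetical per-negative-entry state: the parent's generation at the
 * time the "file does not exist" result was cached. */
struct neg_entry { unsigned long parent_gen; };

/* Remember which directory generation the negative result belongs to. */
void record_negative(struct neg_entry *e, const struct dir_info *dir)
{
    e->parent_gen = dir->neg_gen;
}

/* Dropping the directory lock means other nodes may have changed it. */
void dir_lock_dropped(struct dir_info *dir)
{
    dir->neg_gen++;
}

/* A matching generation means the cached negative result can be trusted
 * without a revalidate; a mismatch forces one. */
bool negative_still_valid(const struct neg_entry *e, const struct dir_info *dir)
{
    return e->parent_gen == dir->neg_gen;
}
```

Repeated lookups of the same missing name stay cheap as long as the directory lock is held, and the first lookup after a drop pays for one revalidate.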
  28. 20 Jul, 2010 1 commit
  29. 22 May, 2010 1 commit
  30. 28 Feb, 2010 1 commit
  31. 27 Feb, 2010 3 commits
    • ocfs2: Pass the locking protocol into ocfs2_cluster_connect(). · 553b5eb9
      Joel Becker committed
      Inside the stackglue, the locking protocol structure is hanging off of
      the ocfs2_cluster_connection.  This takes it one further; the locking
      protocol is passed into ocfs2_cluster_connect().  Now different cluster
      connections can have different locking protocols with distinct asts.
      Note that all locking protocols have to keep their maximum protocol
      version in lock-step.
      
      With the protocol structure set in ocfs2_cluster_connect(), there is no
      need for the stackglue to have a static pointer to a specific protocol
      structure.  We can change initialization to only pass in the maximum
      protocol version.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Attach the connection to the lksb · c0e41338
      Joel Becker committed
      We're going to want it in the ast functions, so we convert union
      ocfs2_dlm_lksb to struct ocfs2_dlm_lksb and let it carry the connection.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Pass lksbs back from stackglue ast/bast functions. · a796d286
      Joel Becker committed
      The stackglue ast and bast functions tried to maintain the fiction that
      their arguments were void pointers.  In reality, stack_user.c had to
      know that the argument was an ocfs2_lock_res in order to get the status
      off of the lksb.  That's ugly.
      
      This changes stackglue to always pass the lksb as the argument to ast
      and bast functions.  The caller can always use container_of() to get the
      ocfs2_lock_res or user_dlm_lock_res.  The net effect to the caller is
      zero.  They still get back the lockres in their ast.  stackglue gets
      cleaner, and now can use the lksb itself.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
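The container_of() pattern mentioned in this commit can be sketched in user space. The macro below is a simplified version without the kernel's type checking, and the struct layouts and callback names are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member's address to the
 * start of the structure that embeds it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical lock status block handed to the callbacks. */
struct lksb { int status; };

/* Two different owners embed the same status block type. */
struct fs_lock_res   { int level;     struct lksb lksb; };
struct user_lock_res { char name[16]; struct lksb lksb; };

/* Each callback receives only the lksb and recovers its own container,
 * so the generic layer never needs to know the container type. */
int fs_ast_level(struct lksb *lksb)
{
    struct fs_lock_res *res = container_of(lksb, struct fs_lock_res, lksb);
    return res->level;
}

const char *user_ast_name(struct lksb *lksb)
{
    struct user_lock_res *res = container_of(lksb, struct user_lock_res, lksb);
    return res->name;
}
```

The generic layer passes a typed `struct lksb *` instead of a void pointer, and each caller recovers its own lock resource with one pointer adjustment.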
  32. 09 Feb, 2010 1 commit
  33. 04 Feb, 2010 1 commit
  34. 03 Feb, 2010 2 commits