1. 25 Jul 2011, 2 commits
  2. 26 May 2011, 2 commits
    • ocfs2/dlm: Do not migrate resource to a node that is leaving the domain · 66effd3c
      Authored by Sunil Mushran
      During dlm domain shutdown, o2dlm has to free all the lock resources. Resources
      that have no locks and no references are freed. Resources that have locks and/or
      references are migrated to another node.
      
      The first task in migration is finding a target. Currently we scan the lock
      resource and find one node that either has a lock or a reference. This is not
      very efficient in a parallel umount case as we might end up migrating the
      lock resource to a node which itself may have to migrate it to a third node.
      
      The patch scans the dlm->exit_domain_map to ensure the target node is not
      leaving the domain. If no valid target node is found, o2dlm does not migrate
      the resource but instead waits for the unlock and deref messages that will
      allow it to free the resource.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      66effd3c
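      The target-selection rule described above can be pictured with a short, hedged
      sketch in kernel-style C. The bitmap helpers (for_each_set_bit, test_bit) are
      real kernel APIs, but the helper function, its arguments and DEMO_MAX_NODES are
      illustrative stand-ins, not the actual o2dlm code.

      /* Illustrative only: pick the first node that holds a lock or a
       * reference on the resource and is NOT leaving the domain. */
      #include <linux/bitops.h>
      #include <linux/types.h>

      #define DEMO_MAX_NODES 255              /* stand-in for O2NM_MAX_NODES */

      static u8 demo_pick_migration_target(const unsigned long *interested_nodes,
                                           const unsigned long *exit_domain_map,
                                           u8 this_node)
      {
              unsigned int bit;

              for_each_set_bit(bit, interested_nodes, DEMO_MAX_NODES) {
                      if (bit == this_node)
                              continue;
                      if (test_bit(bit, exit_domain_map))
                              continue;       /* node is leaving, skip it */
                      return bit;             /* first usable target */
              }
              return DEMO_MAX_NODES;          /* none found: wait for unlock/deref */
      }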
    • ocfs2/dlm: Add new dlm message DLM_BEGIN_EXIT_DOMAIN_MSG · bddefdee
      Authored by Sunil Mushran
      This patch adds a new dlm message DLM_BEGIN_EXIT_DOMAIN_MSG and ups the dlm
      protocol to 1.2.
      
      o2dlm sends this new message in dlm_unregister_domain() to mark the beginning
      of the exit domain. This message is sent to all nodes in the domain.
      
      Currently o2dlm has no way of informing other nodes of its impending exit.
      This information is useful because the other nodes could then disregard the
      exiting node in certain operations, such as resource migration: if two or
      more nodes are umounting in parallel, it is more efficient for o2dlm to
      choose a non-exiting node, rather than an exiting one, to be the new master
      node.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      bddefdee
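      The receive side of such a message can be pictured as below: record that the
      sender is leaving so later migrations can avoid it. spin_lock, set_bit and
      BITS_TO_LONGS are standard kernel primitives; the structure and function
      names are simplified assumptions, not the actual o2dlm handler.

      #include <linux/spinlock.h>
      #include <linux/bitops.h>
      #include <linux/types.h>

      #define DEMO_MAX_NODES 255              /* stand-in for O2NM_MAX_NODES */

      struct demo_dlm_ctxt {
              spinlock_t spinlock;
              unsigned long exit_domain_map[BITS_TO_LONGS(DEMO_MAX_NODES)];
      };

      /* Called when a "begin exit domain" message arrives: remember that the
       * sender is on its way out so it is not chosen as a migration target. */
      static void demo_mark_node_exiting(struct demo_dlm_ctxt *dlm, u8 node)
      {
              spin_lock(&dlm->spinlock);
              set_bit(node, dlm->exit_domain_map);
              spin_unlock(&dlm->spinlock);
      }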
  3. 24 May 2011, 1 commit
  4. 14 May 2011, 2 commits
  5. 31 Mar 2011, 1 commit
  6. 17 Mar 2011, 1 commit
  7. 22 Feb 2011, 1 commit
  8. 07 Mar 2011, 1 commit
    • ocfs2: Remove EXIT from masklog. · c1e8d35e
      Authored by Tao Ma
      mlog_exit is used to record the exit status of a function. But because it
      appears in so many functions, enabling it fills the system logs quickly and
      causes too much I/O, so in practice nobody can turn it on for a production
      system, or even for a test.
      
      This patch removes or converts it:
      1. If all the error paths already use mlog_errno, mlog_exit is simply
         removed; otherwise it is replaced by mlog_errno.
      2. If it is used to print some return value, it is replaced with
         mlog(0, ...).
      mlog_exit_ptr is changed to mlog(0, ...).
      All of these mlog(0, ...) calls will be replaced with trace events later.
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      c1e8d35e
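      Rule 1 above, shown as a hedged before/after fragment. The function, its
      callee and the demo_ log macros are made up solely to make the shape of the
      conversion concrete; they are not the real masklog implementation.

      #include <linux/printk.h>
      #include <linux/errno.h>

      /* Stand-ins for the masklog macros, only so the example is self-contained. */
      #define demo_mlog_errno(st)     pr_err("error %d\n", (st))
      #define demo_mlog_exit(st)      pr_debug("exit status %d\n", (st))

      static int demo_do_work(int arg) { return arg < 0 ? -EINVAL : 0; }

      /* Before: every single call logs an exit status. */
      static int demo_foo_before(int arg)
      {
              int ret = demo_do_work(arg);
              if (ret < 0)
                      demo_mlog_errno(ret);
              demo_mlog_exit(ret);
              return ret;
      }

      /* After: the error path already reports through mlog_errno, so the
       * unconditional exit log is simply removed. */
      static int demo_foo_after(int arg)
      {
              int ret = demo_do_work(arg);
              if (ret < 0)
                      demo_mlog_errno(ret);
              return ret;
      }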
  9. 21 Feb 2011, 1 commit
    • ocfs2: Remove ENTRY from masklog. · ef6b689b
      Authored by Tao Ma
      ENTRY is used to record the entry of a function. But because it appears in
      so many functions, enabling it fills the system logs quickly and causes too
      much I/O, so in practice nobody can turn it on for a production system, or
      even for a test.
      
      So mlog_entry_void is simply removed, and mlog_entry(...) is replaced with
      mlog(0, ...); those calls will be replaced by trace events later.
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      ef6b689b
  10. 23 Dec 2010, 3 commits
  11. 16 Dec 2010, 3 commits
  12. 10 Dec 2010, 1 commit
    • ocfs2/dlm: Migrate lockres with no locks if it has a reference · 388c4bcb
      Authored by Sunil Mushran
      o2dlm was not migrating resources with zero locks because it assumed that the
      resource would get purged by dlm_thread. However, some usage patterns involve
      creating and dropping locks at a high rate, leading to the migrate thread seeing
      zero locks but the purge thread seeing an active reference. When this happens,
      dlm_thread cannot purge the resource and the migrate thread sees no reason to
      migrate it. The stalemate is only broken when the migrate thread happens to
      catch the resource with a lock.
      
      The fix is to make the migrate thread also consider the reference map.
      
      This usage pattern can be triggered by userspace on userdlm locks and flocks.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      388c4bcb
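      The decision the fix arrives at, roughly: a resource can only be left for
      purging when it has neither locks nor references; otherwise it must be
      migrated. The structure below is a simplified stand-in for the real
      dlm_lock_resource, and the helper is illustrative, not the patch itself.

      #include <linux/list.h>
      #include <linux/bitmap.h>

      #define DEMO_MAX_NODES 255              /* stand-in for O2NM_MAX_NODES */

      struct demo_lockres {
              struct list_head granted;
              struct list_head converting;
              struct list_head blocked;
              unsigned long refmap[BITS_TO_LONGS(DEMO_MAX_NODES)];
      };

      /* Before the fix only the lock lists were checked; afterwards an empty
       * resource that still has a remote reference is migrated instead of being
       * left for a purge that can never happen. */
      static int demo_must_migrate(const struct demo_lockres *res)
      {
              int has_locks = !list_empty(&res->granted) ||
                              !list_empty(&res->converting) ||
                              !list_empty(&res->blocked);
              int has_refs  = !bitmap_empty(res->refmap, DEMO_MAX_NODES);

              return has_locks || has_refs;
      }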
  13. 19 Nov 2010, 1 commit
    • fs/ocfs2/dlm: Use GFP_ATOMIC under spin_lock · a48a982a
      Authored by David Sterba
      coccinelle check scripts/coccinelle/locks/call_kern.cocci found that
      in fs/ocfs2/dlm/dlmdomain.c an allocation with GFP_KERNEL is done
      with locks held:
      
      dlm_query_region_handler
        spin_lock(dlm_domain_lock)
          dlm_match_regions
            kmalloc(GFP_KERNEL)
      
      Change it to GFP_ATOMIC.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      CC: Joel Becker <joel.becker@oracle.com>
      CC: Mark Fasheh <mfasheh@suse.com>
      CC: ocfs2-devel@oss.oracle.com
      
      --
      Exists in v2.6.37-rc1 and current linux-next.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      a48a982a
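      The rule behind the fix: code running under a spinlock must not sleep, and
      GFP_KERNEL allocations may sleep while reclaiming memory, so they have to
      become GFP_ATOMIC (which never sleeps but fails more readily). A minimal
      sketch with made-up names:

      #include <linux/slab.h>
      #include <linux/spinlock.h>
      #include <linux/errno.h>

      static DEFINE_SPINLOCK(demo_lock);

      static int demo_work_under_lock(size_t len)
      {
              char *buf;
              int ret = 0;

              spin_lock(&demo_lock);
              /* GFP_KERNEL here could sleep with the lock held; GFP_ATOMIC
               * cannot sleep, so it is the only safe choice in this context. */
              buf = kmalloc(len, GFP_ATOMIC);
              if (!buf)
                      ret = -ENOMEM;
              /* ... use buf for the comparison work ... */
              spin_unlock(&demo_lock);

              kfree(buf);             /* kfree(NULL) is a no-op */
              return ret;
      }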
  14. 10 Oct 2010, 1 commit
  15. 08 Oct 2010, 1 commit
    • ocfs2/dlm: Add message DLM_QUERY_NODEINFO · 18cfdf1b
      Authored by Sunil Mushran
      Adds new dlm message DLM_QUERY_NODEINFO that sends the attributes of all
      registered nodes. This message is sent if the negotiated dlm protocol is
      1.1 or higher. If the information of the joining node does not match
      that of any existing nodes, the join domain request is rejected.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      18cfdf1b
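      The match-or-reject rule can be pictured as a plain field-by-field
      comparison. The structure and fields below are assumptions for illustration,
      not the on-wire DLM_QUERY_NODEINFO format.

      #include <linux/types.h>

      struct demo_node_info {
              u8      node_num;
              __be32  ipv4_address;
              __be16  ipv4_port;
      };

      /* Returns non-zero when the joining node's view of a node differs from
       * the local view; any mismatch leads to the join request being rejected. */
      static int demo_nodeinfo_mismatch(const struct demo_node_info *local,
                                        const struct demo_node_info *remote)
      {
              return local->node_num     != remote->node_num ||
                     local->ipv4_address != remote->ipv4_address ||
                     local->ipv4_port    != remote->ipv4_port;
      }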
  16. 10 Oct 2010, 1 commit
    • ocfs2/dlm: Add message DLM_QUERY_REGION · ea203441
      Authored by Sunil Mushran
      Adds new dlm message DLM_QUERY_REGION that sends the names of all active
      heartbeat regions. This message is only sent in the global heartbeat
      mode. If the regions in the joining node do not fully match the ones in
      the active nodes, the join domain request is rejected.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      ea203441
  17. 07 Oct 2010, 1 commit
  18. 24 Sep 2010, 1 commit
  19. 16 Sep 2010, 1 commit
  20. 11 Sep 2010, 1 commit
  21. 08 Aug 2010, 4 commits
    • ocfs2/dlm: remove potential deadlock -V3 · b11f1f1a
      Authored by Wengang Wang
      When we need to take both dlm_domain_lock and dlm->spinlock, we should take
      them in this order: dlm_domain_lock first, then dlm->spinlock.
      
      There is a path that disobeys this order: dlm_run_purge_list calls
      dlm_lockres_put() with dlm->spinlock held, dlm_lockres_put() drops the last
      reference and ends up calling dlm_put(), and dlm_put() takes dlm_domain_lock.
      
      Fix:
      Don't grab/put the dlm when initialising/releasing a lockres. That grab is
      not required because we don't call dlm_unregister_domain() based on the
      refcount.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      b11f1f1a
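      The deadlock shape being removed, sketched with stand-in locks: with a
      documented order of dlm_domain_lock first, then dlm->spinlock, calling
      something that takes dlm_domain_lock (such as the final dlm_put()) while
      dlm->spinlock is held inverts the order and can deadlock against a CPU
      taking the locks the right way round.

      #include <linux/spinlock.h>

      static DEFINE_SPINLOCK(demo_domain_lock);       /* outer lock */
      static DEFINE_SPINLOCK(demo_dlm_spinlock);      /* inner lock */

      /* Correct order: outer lock, then inner lock. */
      static void demo_correct_path(void)
      {
              spin_lock(&demo_domain_lock);
              spin_lock(&demo_dlm_spinlock);
              /* ... */
              spin_unlock(&demo_dlm_spinlock);
              spin_unlock(&demo_domain_lock);
      }

      /* The buggy shape: with the inner lock held, a put that drops the last
       * reference ends up taking the outer lock, i.e. ABBA against the path
       * above. */
      static void demo_buggy_path(void)
      {
              spin_lock(&demo_dlm_spinlock);
              spin_lock(&demo_domain_lock);
              /* ... */
              spin_unlock(&demo_domain_lock);
              spin_unlock(&demo_dlm_spinlock);
      }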
    • ocfs2/dlm: avoid incorrect bit set in refmap on recovery master · a524812b
      Authored by Wengang Wang
      In the following situation, an incorrect bit remains set in the refmap on the
      recovery master. The recovery master then fails to purge the lockres because
      of that incorrect bit.
      
      1) Node A has no interest in lockres A any longer, so it is purging it.
      2) The owner of lockres A is node B, so node A is sending a deref message
      to node B.
      3) At this time, node B crashes. Node C becomes the recovery master and
      recovers lockres A (because its master was the dead node B).
      4) Node A migrates lockres A to node C with a refbit set there.
      5) Node A's deref message to node B fails because node B crashed. The failure
      is ignored and nothing more is done for lockres A.
      
      Normally, re-sending the deref message to the recovery master would fix this.
      However, ignoring the failure of the deref to the original master and not
      recovering the lockres to the recovery master has the same effect, and the
      latter is simpler.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      a524812b
    • ocfs2 fix o2dlm dlm run purgelist (rev 3) · 7beaf243
      Authored by Srinivas Eeda
      This patch fixes two problems in dlm_run_purgelist:
      
      1. If a lockres is found to be in use, dlm_run_purgelist keeps trying to purge
      the same lockres instead of trying the next one.
      
      2. When a lockres is found to be unused, dlm_run_purgelist releases the lockres
      spinlock before setting DLM_LOCK_RES_DROPPING_REF and calls dlm_purge_lockres.
      The spinlock is reacquired, but in this window the lockres can get reused,
      which leads to a BUG.
      
      This patch modifies dlm_run_purgelist to skip a lockres if it is in use and
      purge the next one instead. It also sets DLM_LOCK_RES_DROPPING_REF before
      releasing the lockres spinlock, protecting the lockres from getting reused.
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      7beaf243
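      The second fix follows a common pattern: publish the teardown state while the
      spinlock is still held, so the window between dropping the lock and finishing
      the teardown cannot be used to resurrect the resource. The structure and flag
      below are simplified stand-ins, not the real o2dlm definitions.

      #include <linux/spinlock.h>

      #define DEMO_RES_DROPPING_REF   0x01

      struct demo_purge_res {
              spinlock_t lock;
              unsigned int state;
      };

      static void demo_purge(struct demo_purge_res *res)
      {
              spin_lock(&res->lock);
              /* Mark the resource before dropping the lock: anyone who takes
               * the lock afterwards sees DROPPING_REF and backs off. */
              res->state |= DEMO_RES_DROPPING_REF;
              spin_unlock(&res->lock);

              /* ... send the deref message and tear the resource down ... */
      }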
    • ocfs2/dlm: fix a dead lock · 6d98c3cc
      Authored by Wengang Wang
      When we have to take both dlm->master_lock and lockres->spinlock, take them
      in this order: lockres->spinlock first, then dlm->master_lock.
      
      The patch fixes a violation of this rule. We can simply move taking
      dlm->master_lock to after res->spinlock has been dropped, since accessing
      res->state and freeing the mle memory do not need master_lock's protection.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      6d98c3cc
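      The shape of this fix, with stand-in locks: rather than nesting the two locks
      in the wrong order, the second lock is taken only after the first has been
      dropped, because the work done under each lock does not need the other's
      protection.

      #include <linux/spinlock.h>

      static DEFINE_SPINLOCK(demo_res_lock);          /* plays lockres->spinlock */
      static DEFINE_SPINLOCK(demo_master_lock);       /* plays dlm->master_lock */

      static void demo_fixed_path(void)
      {
              spin_lock(&demo_res_lock);
              /* ... work that only needs the lockres spinlock ... */
              spin_unlock(&demo_res_lock);

              /* Taken on its own, after the first lock is dropped, so the
               * documented order can never be violated from this path. */
              spin_lock(&demo_master_lock);
              /* ... work that only needs master_lock ... */
              spin_unlock(&demo_master_lock);
      }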
  22. 20 Jul 2010, 1 commit
  23. 16 Jul 2010, 1 commit
  24. 13 Jul 2010, 2 commits
    • ocfs2: tighten up strlen() checking · e372357b
      Authored by Dan Carpenter
      This function is only called from one place and it's like this:
      	dlm_register_domain(conn->cc_name, dlm_key, &fs_version);
      
      The "conn->cc_name" is 64 characters long.  If strlen(conn->cc_name)
      were equal to O2NM_MAX_NAME_LEN (64) that would be a bug because
      strlen() doesn't count the NULL character.
      
      In fact, if you look how O2NM_MAX_NAME_LEN is used, it mostly describes
      64 character buffers.  The only exception is nd_name from struct
      o2nm_node.
      
      Anyway I looked into it and in this case the domain string comes from
      osb->uuid_str in ocfs2_setup_osb_uuid().  That's 32 characters and NULL
      which easily fits into O2NM_MAX_NAME_LEN.  This patch doesn't change how
      the code works, but I think it makes the code a little cleaner.
      Signed-off-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      e372357b
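      The off-by-one in question, as a tiny sketch: a name whose strlen() equals the
      buffer size leaves no room for the terminating NUL, so the length check has to
      reject equality as well. DEMO_MAX_NAME_LEN and the helper are stand-ins, not
      the actual ocfs2 code.

      #include <linux/string.h>
      #include <linux/errno.h>

      #define DEMO_MAX_NAME_LEN 64    /* size of the destination buffer, in bytes */

      static int demo_check_domain_name(const char *name)
      {
              /* ">=" rather than ">": a 64-character name plus its NUL
               * terminator needs 65 bytes and cannot fit the buffer. */
              if (strlen(name) >= DEMO_MAX_NAME_LEN)
                      return -EINVAL;
              return 0;
      }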
    • ocfs2/dlm: don't access beyond bitmap size · f471c9df
      Authored by Wengang Wang
      dlm->recovery_map is defined as
      	unsigned long recovery_map[BITS_TO_LONGS(O2NM_MAX_NODES)];
      
      We should treat O2NM_MAX_NODES as the bitmap size in bits.
      This patch fixes a bit operation that passed O2NM_MAX_NODES + 1 as the bitmap size.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      f471c9df
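      The rule being fixed, sketched with the real find_next_bit() signature: bitmap
      helpers take the size in bits and valid indices run from 0 to nbits - 1, so
      passing O2NM_MAX_NODES + 1 scans one bit past the end of the map.
      DEMO_MAX_NODES and the helper are stand-ins for illustration.

      #include <linux/bitops.h>

      #define DEMO_MAX_NODES 255      /* stand-in for O2NM_MAX_NODES */

      static unsigned long demo_first_recovering_node(const unsigned long *recovery_map)
      {
              /* The map is DEMO_MAX_NODES bits wide, so that is the size to
               * pass; "+ 1" would read beyond the end of the bitmap. */
              return find_next_bit(recovery_map, DEMO_MAX_NODES, 0);
      }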
  25. 16 Jun 2010, 1 commit
  26. 19 May 2010, 3 commits
  27. 06 May 2010, 1 commit