1. 10 Sep, 2010 3 commits
    • ocfs2: Remove obscure error handling in direct_write. · 95fa859a
      Committed by Tao Ma
      In ocfs2, we never allow a direct write past i_size (see
      ocfs2_prepare_inode_for_write), so we don't need the bogus
      simple_setsize.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Add some trace log for orphan scan. · 3c3f20c9
      Committed by Tao Ma
      The orphan scan worker currently has no trace log, so it is
      very hard to tell whether it has finished or is blocked.
      Add two mlog trace messages so that we can tell whether
      the current orphan scan worker is blocked or not.
      This helped when I analyzed an orphan scan bug.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • Ocfs2: Add new OCFS2_IOC_INFO ioctl for ocfs2 v8. · ddee5cdb
      Committed by Tristan Ye
      This ioctl gives non-privileged end users a way to gather
      filesystem information.

      The new ioctl is invoked as OCFS2_IOC_INFO. Userspace passes the kernel
      a structure containing an array of request pointers and a request
      count, for example:
      
      * From userspace:
      
      struct ocfs2_info_blocksize oib = {
              .ib_req = {
                      .ir_magic = OCFS2_INFO_MAGIC,
                      .ir_code = OCFS2_INFO_BLOCKSIZE,
                      ...
              }
              ...
      };

      struct ocfs2_info_clustersize oic = {
              ...
      };

      uint64_t reqs[2] = {(unsigned long)&oib,
                          (unsigned long)&oic};

      struct ocfs2_info info = {
              .oi_requests = reqs,
              .oi_count = 2,
      };
      
      ret = ioctl(fd, OCFS2_IOC_INFO, &info);
      
      * In kernel:
      
      Get the request pointers from *info*, then handle each request one by one.

      The idea is to keep each separate request small enough to guarantee
      better backward and forward compatibility, since a small request is
      less likely to break if the filesystem's on-disk format changes.

      Currently, the following 7 requests are supported, as required by the
      userspace tool o2info, and I believe the list will grow over time :-)
      
              OCFS2_INFO_CLUSTERSIZE
              OCFS2_INFO_BLOCKSIZE
              OCFS2_INFO_MAXSLOTS
              OCFS2_INFO_LABEL
              OCFS2_INFO_UUID
              OCFS2_INFO_FS_FEATURES
              OCFS2_INFO_JOURNAL_SIZE
      
      This ioctl is specific to OCFS2.
      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
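The kernel-side flow described in this commit (walk the array of request pointers, check each header, handle the requests one by one) can be sketched in plain userspace C. This is only an illustration under stated assumptions: every `demo_`/`DEMO_` name, the magic value, the request codes, the "filled" flag and the returned sizes are hypothetical and are not the real ocfs2 definitions. A real ioctl handler would also copy each request from userspace instead of dereferencing the pointers directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the real OCFS2 constants. */
#define DEMO_INFO_MAGIC 0x4F32494EU
#define DEMO_FL_FILLED  0x1U

enum demo_code { DEMO_BLOCKSIZE = 1, DEMO_CLUSTERSIZE = 2 };

/* Common header that starts every request. */
struct demo_req {
	uint32_t ir_magic;
	uint32_t ir_code;
	uint32_t ir_size;   /* size of the whole request structure */
	uint32_t ir_flags;  /* DEMO_FL_FILLED is set once the request is handled */
};

struct demo_blocksize   { struct demo_req ib_req; uint32_t ib_blocksize; };
struct demo_clustersize { struct demo_req ic_req; uint32_t ic_clustersize; };

struct demo_info {
	uint64_t oi_requests;  /* address of an array of request addresses */
	uint32_t oi_count;     /* number of entries in that array */
};

/* Handle a single request; malformed or unknown requests stay unfilled. */
static void demo_handle_one(struct demo_req *req)
{
	if (req->ir_magic != DEMO_INFO_MAGIC)
		return;

	switch (req->ir_code) {
	case DEMO_BLOCKSIZE:
		if (req->ir_size != sizeof(struct demo_blocksize))
			return;
		((struct demo_blocksize *)req)->ib_blocksize = 4096;
		break;
	case DEMO_CLUSTERSIZE:
		if (req->ir_size != sizeof(struct demo_clustersize))
			return;
		((struct demo_clustersize *)req)->ic_clustersize = 1U << 20;
		break;
	default:
		return;
	}
	req->ir_flags |= DEMO_FL_FILLED;
}

/* Walk the request array and return how many requests were filled. */
static int demo_handle_info(const struct demo_info *info)
{
	const uint64_t *reqs;
	int filled = 0;

	assert(info != NULL);
	reqs = (const uint64_t *)(uintptr_t)info->oi_requests;

	for (uint32_t i = 0; i < info->oi_count; i++) {
		struct demo_req *req = (struct demo_req *)(uintptr_t)reqs[i];

		demo_handle_one(req);
		if (req->ir_flags & DEMO_FL_FILLED)
			filled++;
	}
	return filled;
}
```

Because each request carries its own magic, code and size, one bad or unknown request does not prevent the others from being answered, which is the compatibility argument the commit message makes for small, separate requests.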
  2. 08 Sep, 2010 13 commits
  3. 10 Aug, 2010 6 commits
  4. 08 Aug, 2010 8 commits
    • O2net: Disallow o2net accept connection request from itself. · 415cf32c
      Committed by Tristan Ye
      Currently, o2net_accept_one() is allowed to accept a connection from
      the listening node itself. Such a fake connection is never successfully
      established because no handshake follows, and it later ends up
      triggering the connecting worker in a loop.

      Fix this by treating such a connection request as 'invalid', since a
      node never requests a connection to itself in an OCFS2 cluster.

      The fix doesn't break a user's scan for the o2net listener; such a scan
      still gets a successful connection from userspace.
      Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
      Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2/dlm: remove potential deadlock -V3 · b11f1f1a
      Committed by Wengang Wang
      When we need to take both dlm_domain_lock and dlm->spinlock, we should take
      them in this order: dlm_domain_lock, then dlm->spinlock.

      Some paths disobey this order: dlm_run_purge_list calls dlm_lockres_put()
      with dlm->spinlock held. dlm_lockres_put() calls dlm_put() when the last
      ref is dropped, and dlm_put() takes dlm_domain_lock.

      Fix:
      Don't grab/put the dlm when initialising/releasing a lockres.
      That grab is not required because we don't call dlm_unregister_domain()
      based on refcount.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
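The lock-ordering rule in this commit (take dlm_domain_lock before dlm->spinlock, never the other way round) can be illustrated with a tiny single-threaded order checker. This is a hypothetical sketch, not o2dlm code: the `demo_` names and rank numbers are assumptions, and the "locks" only record which ranks are held so that an out-of-order acquisition is reported instead of deadlocking.

```c
#include <assert.h>

/* Hypothetical ranks: a lower rank must always be taken first. */
enum demo_rank { DEMO_RANK_DOMAIN_LOCK = 1, DEMO_RANK_DLM_SPINLOCK = 2 };

struct demo_lock { enum demo_rank rank; };

/* Bit N is set while a lock of rank N is held (single-threaded demo). */
static unsigned demo_held_mask;

/* Returns 0 on success, -1 if taking the lock would violate the ordering. */
static int demo_acquire(const struct demo_lock *l)
{
	unsigned bit;

	assert(l->rank > 0 && l->rank < 31);
	bit = 1U << l->rank;

	/* Any held lock of equal or higher rank makes the mask >= bit. */
	if (demo_held_mask >= bit)
		return -1;

	demo_held_mask |= bit;
	return 0;
}

static void demo_release(const struct demo_lock *l)
{
	unsigned bit = 1U << l->rank;

	assert(demo_held_mask & bit);
	demo_held_mask &= ~bit;
}
```

In this model, the problematic path from the commit message corresponds to calling `demo_acquire()` on the domain lock while the spinlock rank is still held, which returns -1. The actual fix avoids the situation by no longer taking the outer lock on that path at all.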
    • ocfs2/dlm: avoid incorrect bit set in refmap on recovery master · a524812b
      Committed by Wengang Wang
      In the following situation, an incorrect bit remains in the refmap on the
      recovery master. The recovery master then fails to purge the lockres
      because of that incorrect bit.

      1) Node A no longer has any interest in lockres A, so it is purging it.
      2) The owner of lockres A is node B, so node A sends a de-ref message
      to node B.
      3) At this time node B crashes and node C becomes the recovery master. It
      recovers lockres A (because its master is the dead node B).
      4) Node A migrates lockres A to node C with a refbit there.
      5) Node A fails to send the de-ref message to node B because B crashed. The
      failure is ignored and no further action is taken for lockres A.

      Normally, re-sending the deref message to the recovery master would fix it.
      However, ignoring the failure of the deref to the original master and not
      recovering the lockres to the recovery master has the same effect, and the
      latter is simpler.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • Fix the nested PR lock calling issue in ACL · 845b6cf3
      Committed by Jiaju Zhang
      Hi,
      
      Thanks a lot for all the review and comments so far;) I'd like to send
      the improved (V4) version of this patch.
      
      This patch fixes a deadlock in OCFS2 ACL. We found this bug in a
      scenario integrating OCFS2 with Samba; the symptom is that several smbd
      processes hang under heavy workload. We eventually found that nested
      PR lock calls lead to this deadlock:
      
       node1        node2
                    gr PR
                      |
                      V
       PR(EX)---> BAST:OCFS2_LOCK_BLOCKED
                      |
                      V
                    rq PR
                      |
                      V
                    wait=1
      
      After requesting the 2nd PR lock, the process "smbd" went into D
      state. It can only be woken up when the 1st PR lock's RO holder equals
      zero. There should be an ocfs2_inode_unlock in the calling path later
      on, which can decrement the RO holder. But since it has been in
      uninterruptible sleep, the unlock function has no chance to be called.
      
      The related stack trace is:
      smbd          D ffff8800013d0600     0  9522   5608 0x00000000
       ffff88002ca7fb18 0000000000000282 ffff88002f964500 ffff88002ca7fa98
       ffff8800013d0600 ffff88002ca7fae0 ffff88002f964340 ffff88002f964340
       ffff88002ca7ffd8 ffff88002ca7ffd8 ffff88002f964340 ffff88002f964340
      Call Trace:
      [<ffffffff80350425>] schedule_timeout+0x175/0x210
      [<ffffffff8034f580>] wait_for_common+0xf0/0x210
      [<ffffffffa03e12b9>] __ocfs2_cluster_lock+0x3b9/0xa90 [ocfs2]
      [<ffffffffa03e7665>] ocfs2_inode_lock_full_nested+0x255/0xdb0 [ocfs2]
      [<ffffffffa0446019>] ocfs2_get_acl+0x69/0x120 [ocfs2]
      [<ffffffffa0446368>] ocfs2_check_acl+0x28/0x80 [ocfs2]
      [<ffffffff800e3507>] acl_permission_check+0x57/0xb0
      [<ffffffff800e357d>] generic_permission+0x1d/0xc0
      [<ffffffffa03eecea>] ocfs2_permission+0x10a/0x1d0 [ocfs2]
      [<ffffffff800e3f65>] inode_permission+0x45/0x100
      [<ffffffff800d86b3>] sys_chdir+0x53/0x90
      [<ffffffff80007458>] system_call_fastpath+0x16/0x1b
      [<00007f34a4ef6927>] 0x7f34a4ef6927
      
      For details, please see:
      https://bugzilla.novell.com/show_bug.cgi?id=614332 and
      http://oss.oracle.com/bugzilla/show_bug.cgi?id=1278
      Signed-off-by: Jiaju Zhang <jjzhang@suse.de>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Count more refcount records in file system fragmentation. · 8a2e70c4
      Committed by Tao Ma
      The refcount record calculation in ocfs2_calc_refcount_meta_credits
      is too optimistic: it assumes that we can always allocate contiguous
      clusters and handle an existing refcount rec as a whole. In fact,
      because of file system fragmentation, we may have to split
      a refcount record into 3 parts during the transaction. So consider
      the worst case in the record calculation.

      Cc: stable@kernel.org
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
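The "3 parts" worst case mentioned in this commit follows from simple interval arithmetic: touching a range strictly inside an existing record leaves a left remainder, the touched middle, and a right remainder. The sketch below shows only that arithmetic. The `demo_` names and the record layout are hypothetical, and this is not the credit calculation ocfs2 actually performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record covering clusters [start, start + len). */
struct demo_rec {
	uint64_t start;
	uint64_t len;
};

/*
 * Split rec around the sub-range [start, start + len), which must lie
 * inside rec. Writes up to 3 pieces to out and returns how many.
 */
static int demo_split(struct demo_rec rec, uint64_t start, uint64_t len,
		      struct demo_rec out[3])
{
	uint64_t rec_end = rec.start + rec.len;
	uint64_t end = start + len;
	int n = 0;

	assert(len > 0 && start >= rec.start && end <= rec_end);

	if (start > rec.start)   /* left remainder */
		out[n++] = (struct demo_rec){ rec.start, start - rec.start };

	out[n++] = (struct demo_rec){ start, len };   /* touched middle */

	if (end < rec_end)       /* right remainder */
		out[n++] = (struct demo_rec){ end, rec_end - end };

	return n;
}
```

For a record covering clusters 100 to 109, touching clusters 103 and 104 yields three pieces (100..102, 103..104, 105..109), whereas touching the whole record yields one. A worst-case estimate therefore has to budget for three records per touched range rather than one.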
    • ocfs2 fix o2dlm dlm run purgelist (rev 3) · 7beaf243
      Committed by Srinivas Eeda
      This patch fixes two problems in dlm_run_purgelist:

      1. If a lockres is found to be in use, dlm_run_purgelist keeps trying to purge
      the same lockres instead of trying the next lockres.

      2. When a lockres is found unused, dlm_run_purgelist releases the lockres
      spinlock before setting DLM_LOCK_RES_DROPPING_REF and calls dlm_purge_lockres.
      The spinlock is reacquired, but in this window the lockres can get reused.
      This leads to a BUG.

      This patch modifies dlm_run_purgelist to skip a lockres if it's in use and
      purge the next lockres. It also sets DLM_LOCK_RES_DROPPING_REF before
      releasing the lockres spinlock, protecting the lockres from getting reused.
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
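The two behaviours this commit describes (skip an in-use entry and move on to the next one; mark an entry as being dropped before its lock is released) can be sketched as a single-threaded loop. This is a simplified illustration, not the o2dlm implementation: the `demo_` names, the flag value and the array-based list are hypothetical, and the locking is represented only by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag standing in for DLM_LOCK_RES_DROPPING_REF. */
#define DEMO_DROPPING_REF 0x1U

struct demo_lockres {
	bool in_use;
	unsigned flags;
	bool purged;
};

/* Purge every unused entry and return how many were purged. */
static size_t demo_run_purgelist(struct demo_lockres *list, size_t n)
{
	size_t purged = 0;

	assert(list != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct demo_lockres *res = &list[i];

		/* (the per-lockres lock would be taken here) */
		if (res->in_use) {
			/* Skip it and continue with the next entry
			 * instead of retrying the same one. */
			continue;
		}

		/* Mark the entry while it is still "locked", so nothing
		 * can start reusing it once the lock is dropped. */
		res->flags |= DEMO_DROPPING_REF;
		/* (the per-lockres lock would be released here) */

		res->purged = true;
		purged++;
	}
	return purged;
}
```

With three entries where only the middle one is in use, the loop purges the first and third and leaves the middle one untouched for a later pass.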
    • ocfs2/dlm: fix a dead lock · 6d98c3cc
      Committed by Wengang Wang
      When we have to take both dlm->master_lock and lockres->spinlock,
      take them in this order: lockres->spinlock, then dlm->master_lock.

      The patch fixes a violation of the rule.
      We can simply move taking dlm->master_lock to after we have dropped
      res->spinlock, since we don't need master_lock's protection when we access
      res->state and free mle memory.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: do not overwrite error codes in ocfs2_init_acl · 6eda3dd3
      Committed by Tiger Yang
      Setting the acl while creating a new inode depends on
      the error codes of posix_acl_create_masq. This patch fixes
      an issue where those error codes were overwritten.
      Reported-by: Pawel Zawora <pzawora@gmail.com>
      Cc: <stable@kernel.org> [ .33, .34 ]
      Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  5. 04 Aug, 2010 1 commit
  6. 27 Jul, 2010 2 commits
  7. 20 Jul, 2010 1 commit
  8. 17 Jul, 2010 1 commit
  9. 16 Jul, 2010 3 commits
    • jbd2/ocfs2: Fix block checksumming when a buffer is used in several transactions · 13ceef09
      Committed by Jan Kara
      OCFS2 uses t_commit trigger to compute and store checksum of the just
      committed blocks. When a buffer has b_frozen_data, checksum is computed
      for it instead of b_data but this can result in an old checksum being
      written to the filesystem in the following scenario:
      
      1) transaction1 is opened
      2) handle1 is opened
      3) journal_access(handle1, bh)
          - This sets jh->b_transaction to transaction1
      4) modify(bh)
      5) journal_dirty(handle1, bh)
      6) handle1 is closed
      7) start committing transaction1, opening transaction2
      8) handle2 is opened
      9) journal_access(handle2, bh)
          - This copies off b_frozen_data to make it safe for transaction1 to commit.
            jh->b_next_transaction is set to transaction2.
      10) jbd2_journal_write_metadata() checksums b_frozen_data
      11) the journal correctly writes b_frozen_data to the disk journal
      12) handle2 is closed
          - There was no dirty call for the bh on handle2, so it is never queued for
            any more journal operation
      13) Checkpointing finally happens, and it just spools the bh via normal buffer
      writeback.  This will write b_data, which was never triggered on and thus
      contains a wrong (old) checksum.
      
      This patch fixes the problem by calling the trigger at the moment data is
      frozen for journal commit - i.e., either when b_frozen_data is created by
      do_get_write_access or just before we write a buffer to the log if
      b_frozen_data does not exist. We also rename the trigger to t_frozen as
      that better describes when it is called.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2/dlm: Remove BUG_ON from migration in the rare case of a down node · a39953dd
      Committed by Wengang Wang
      For migration, we wait for the DLM_LOCK_RES_MIGRATING flag to be set
      before sending the DLM_MIG_LOCKRES_MSG message to the target. We use
      dlm_migration_can_proceed() for that purpose.  However, if the node is
      down, dlm_migration_can_proceed() will also return "go ahead".  In this
      rare case, the DLM_LOCK_RES_MIGRATING flag might not be set yet. Remove
      the BUG_ON() that trips over this condition.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Don't duplicate pages past i_size during CoW. · f5e27b6d
      Committed by Tao Ma
      During CoW, the pages after i_size don't contain valid data, so there's
      no need to read and duplicate them.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
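The idea in this commit amounts to clamping the range that gets duplicated to the file size. The helper below is a minimal sketch of that clamp under stated assumptions: `demo_cow_copy_end` and its byte-offset parameters are hypothetical and do not reproduce the ocfs2 code, which works on pages and clusters.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the end offset up to which data needs duplicating when
 * CoW-ing [offset, offset + len) in a file of size i_size. If the range
 * starts at or past i_size, the result equals offset (nothing to copy).
 */
static uint64_t demo_cow_copy_end(uint64_t offset, uint64_t len,
				  uint64_t i_size)
{
	uint64_t end = offset + len;

	assert(end >= offset);   /* reject ranges that overflow */

	if (end > i_size)
		end = i_size;    /* nothing valid lives past i_size */
	if (end < offset)
		end = offset;    /* whole range is past i_size */
	return end;
}
```

For a 5000-byte file, a CoW request covering bytes 0 to 8191 only needs the first 5000 bytes duplicated, and a request starting at offset 8192 needs none.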
  10. 13 Jul, 2010 2 commits
    • ocfs2: tighten up strlen() checking · e372357b
      Committed by Dan Carpenter
      This function is only called from one place, like this:
      	dlm_register_domain(conn->cc_name, dlm_key, &fs_version);

      The "conn->cc_name" buffer is 64 characters long.  If strlen(conn->cc_name)
      were equal to O2NM_MAX_NAME_LEN (64), that would be a bug because
      strlen() doesn't count the terminating NUL character.

      In fact, if you look at how O2NM_MAX_NAME_LEN is used, it mostly describes
      64 character buffers.  The only exception is nd_name from struct
      o2nm_node.

      Anyway I looked into it and in this case the domain string comes from
      osb->uuid_str in ocfs2_setup_osb_uuid().  That's 32 characters plus the NUL,
      which easily fits into O2NM_MAX_NAME_LEN.  This patch doesn't change how
      the code works, but I think it makes the code a little cleaner.
      Signed-off-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
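The off-by-one this commit guards against can be shown with a small standalone check: a string fits a 64-byte buffer only if its strlen() is strictly less than 64, because the terminator needs one byte too. `DEMO_MAX_NAME_LEN` and `demo_name_fits` are hypothetical names used for illustration, with the constant mirroring the 64 quoted in the commit message.

```c
#include <assert.h>
#include <string.h>

/* Buffer size including the terminating NUL (mirrors the 64 in the text). */
#define DEMO_MAX_NAME_LEN 64

/* Returns 1 if name, plus its terminating NUL, fits the buffer, else 0. */
static int demo_name_fits(const char *name)
{
	assert(name != NULL);

	/* strlen() excludes the NUL, so a length equal to the buffer
	 * size is already one byte too long. */
	return strlen(name) < DEMO_MAX_NAME_LEN;
}
```

A 63-character name passes the check, while a 64-character name is rejected even though its length "equals" the buffer size.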
    • ocfs2: Make xattr reflink work with new local alloc reservation. · 121a39bb
      Committed by Tao Ma
      The new reservation code in local alloc has added the limitation
      that the caller must handle the case where the local alloc
      doesn't give us enough contiguous clusters. This breaks the old
      xattr reflink code.

      So this patch updates the xattr reflink code so that it can
      handle the case where local alloc gives us one cluster at a time.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>