1. 02 Oct, 2012 · 2 commits
    • ceph: Fix oops when handling mdsmap that decreases max_mds · 3e8f43a0
      Committed by Yan, Zheng
      When i >= newmap->m_max_mds, ceph_mdsmap_get_addr(newmap, i) returns
      NULL. Passing NULL to memcmp() triggers an oops.
      Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
      Signed-off-by: Sage Weil <sage@inktank.com>
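The guard described in this commit can be sketched in userspace C as follows. This is only an illustration, not the kernel patch: struct addr, struct mdsmap, mdsmap_get_addr() and mds_addr_changed() are hypothetical stand-ins, and the only property taken from the commit message is that the lookup returns NULL once i >= m_max_mds, so NULL must be checked before memcmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's mdsmap structures. */
struct addr { unsigned char bytes[16]; };
struct mdsmap { int m_max_mds; struct addr *m_addrs; };

/* Returns NULL when i is out of range, as the commit message describes. */
static const struct addr *mdsmap_get_addr(const struct mdsmap *m, int i)
{
    assert(m != NULL);
    if (i < 0 || i >= m->m_max_mds)
        return NULL;
    return &m->m_addrs[i];
}

/* Nonzero if the address of MDS i changed or vanished between the maps. */
static int mds_addr_changed(const struct mdsmap *oldmap,
                            const struct mdsmap *newmap, int i)
{
    const struct addr *a = mdsmap_get_addr(oldmap, i);
    const struct addr *b = mdsmap_get_addr(newmap, i);

    /* Never hand NULL to memcmp(): treat a missing entry as a change. */
    if (a == NULL || b == NULL)
        return a != b;
    return memcmp(a, b, sizeof(*a)) != 0;
}
```

With an old map of two entries and a new map of one, index 1 is reported as changed without memcmp() ever seeing a NULL pointer.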
    • ceph: let path portion of mount "device" be optional · c98f533c
      Committed by Alex Elder
      A recent change to /sbin/mountall causes any trailing '/' character
      in the "device" (or fs_spec) field in /etc/fstab to be stripped.  As
      a result, an entry for a ceph mount that intends to mount the root
      of the name space ends up with no path portion, and the ceph mount
      option processing code rejects this.
      
      That is, an entry in /etc/fstab like:
          cephserver:port:/ /mnt ceph defaults 0 0
      provides to the ceph code just "cephserver:port:" as the "device,"
      and that gets rejected.
      
      Although this is a bug in /sbin/mountall, we can have the ceph mount
      code support an empty/nonexistent path, interpreting it to mean the
      root of the name space.
      
      RFC 5952 offers recommendations for how to express IPv6 addresses,
      and recommends the usage found in RFC 3986 (which specifies the
      format for URIs) for representing both IPv4 and IPv6 addresses that
      include port numbers.  (See in particular the definition of
      "authority" found in the Appendix of RFC 3986.)
      
      According to those standards, no host specification will ever
      contain a '/' character.  As a result, it is sufficient to scan a
      provided "device" from an /etc/fstab entry for the first '/'
      character, and if it's found, treat that as the beginning of the
      path.  If no '/' character is present, we can treat the entire
      string as the monitor host specification(s), and assume the path
      to be the root of the name space.  We'll still require a ':' to
      separate the host portion from the (possibly empty) path portion.
      
      This means that we can more formally define how ceph will interpret
      the "device" it's provided when processing a mount request:
      
          "device" will look like:
              <server_spec>[,<server_spec>...]:[<path>]
          where
              <server_spec> is <ip>[:<port>]
              <path> is optional, but if present must begin with '/'
      
      This addresses http://tracker.newdream.net/issues/2919
      
      Signed-off-by: Alex Elder <elder@inktank.com>
      Reviewed-by: Dan Mick <dan.mick@inktank.com>
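The "device" grammar defined in this commit can be sketched as a small userspace parser. This is a minimal sketch under stated assumptions, not the ceph mount code: split_device() is a hypothetical helper, and it only implements the two rules from the message (the first '/' starts the path, and a ':' must separate the host portion from the possibly empty path).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a mount "device" of the form
 *     <server_spec>[,<server_spec>...]:[<path>]
 * On success returns 0, stores the length of the server list (without
 * the separating ':') in *hosts_len, and points *path at the path,
 * which defaults to "/" when the device carries none.
 * Returns -1 if the ':' separator or the host portion is missing.
 */
static int split_device(const char *dev, size_t *hosts_len, const char **path)
{
    assert(dev != NULL && hosts_len != NULL && path != NULL);

    /* No host specification contains '/', so the first one starts the path. */
    const char *slash = strchr(dev, '/');
    const char *end = slash ? slash : dev + strlen(dev);

    /* Need at least one host character followed by the ':' separator. */
    if (end - dev < 2 || end[-1] != ':')
        return -1;

    *hosts_len = (size_t)(end - dev) - 1;
    *path = slash ? slash : "/";
    return 0;
}
```

Under this sketch, "cephserver:6789:/" and "cephserver:6789:" both yield the path "/", which is the behaviour the commit wants for an fstab entry whose trailing '/' was stripped.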
  2. 30 Sep, 2012 · 1 commit
    • vfs: dcache: fix deadlock in tree traversal · 8110e16d
      Committed by Miklos Szeredi
      IBM reported a deadlock in select_parent().  This was found to be caused
      by taking rename_lock again, while it was already held, when restarting
      the tree traversal.
      
      There are two cases when the traversal needs to be restarted:
      
       1) concurrent d_move(); this can only happen when not already locked,
          since taking rename_lock protects against concurrent d_move().
      
       2) racing with final d_put() on child just at the moment of ascending
          to parent; rename_lock doesn't protect against this rare race, so it
          can happen when already locked.
      
      Because of case 2, we need to be able to handle restarting the traversal
      when rename_lock is already held.  This patch fixes all three callers of
      try_to_ascend().
      
      IBM reported that the deadlock is gone with this patch.
      
      [ I rewrote the patch to be smaller and just do the "goto again" if the
        lock was already held, but credit goes to Miklos for the real work.
         - Linus ]
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 28 Sep, 2012 · 1 commit
  4. 23 Sep, 2012 · 2 commits
    • close the race in nlmsvc_free_block() · c5aa1e55
      Committed by Al Viro
      we need to grab mutex before the reference counter reaches 0
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • do_add_mount()/umount -l races · 156cacb1
      Committed by Al Viro
      normally we deal with lock_mount()/umount races by checking that
      the mountpoint-to-be is still in our namespace after lock_mount() has
      been done.  However, do_add_mount() skips that check when called
      with MNT_SHRINKABLE in flags (i.e. from finish_automount()).  The
      reason is that ->mnt_ns may be a temporary namespace created exactly
      to contain automounts a-la NFS4 referral handling.  It's not the
      namespace of the caller, though, so check_mnt() would fail here.
      We still need to check that ->mnt_ns is non-NULL in that case,
      though.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  5. 22 Sep, 2012 · 2 commits
    • debugfs: fix u32_array race in format_array_alloc · e05e279e
      Committed by Linus Torvalds
      The format_array_alloc() function is fundamentally racy, in that it
      prints the array twice: once to figure out how much space to allocate
      for the buffer, and the second time to actually print out the data.
      
      If any of the array contents changes in between, the allocation size may
      be wrong, and the end result may be truncated in odd ways.
      
      Just don't do it.  Allocate a maximum-sized array up-front, and just
      format the array contents once.  The only user of the u32_array
      interfaces is the Xen spinlock statistics code, and it has 31 entries in
      the arrays, so the maximum size really isn't that big, and the end
      result is much simpler code without the bug.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
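The "allocate a maximum-sized array up-front and format once" idea can be illustrated in userspace C. This is a sketch, not the debugfs code: format_u32_array() and CHARS_PER_U32 are hypothetical names, and the sizing rests on the assumption that a u32 prints as at most 10 decimal digits plus one separator character.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Worst case per element: 10 decimal digits plus one separator. */
#define CHARS_PER_U32 11

/*
 * Format the array in a single pass into a buffer sized for the worst
 * case, so the result cannot be truncated even if the values change
 * between calls.  The caller frees the returned string.
 */
static char *format_u32_array(const uint32_t *array, size_t n)
{
    assert(array != NULL || n == 0);

    size_t cap = n * CHARS_PER_U32 + 1;   /* +1 for the trailing NUL */
    char *buf = malloc(cap);
    size_t used = 0;

    if (buf == NULL)
        return NULL;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        char sep = (i + 1 == n) ? '\n' : ' ';
        int len = snprintf(buf + used, cap - used, "%" PRIu32 "%c",
                           array[i], sep);
        assert(len > 0 && (size_t)len < cap - used);
        used += (size_t)len;
    }
    return buf;
}
```

For the 31-entry arrays mentioned above, this worst-case buffer is 31 * 11 + 1 = 342 bytes, which is why sizing for the maximum is cheap.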
    • debugfs: fix race in u32_array_read and allocate array at open · 36048853
      Committed by David Rientjes
      u32_array_open() is racy when multiple threads read from a file with a
      seek position of zero, i.e. when two or more simultaneous reads are
      occurring after the non-seekable files are created.  It is possible that
      file->private_data is double-freed because the threads race between
      
      	kfree(file->private_data);
      
      and
      
      	file->private_data = NULL;
      
      The fix is to only do format_array_alloc() when the file is opened and
      free it when it is closed.
      
      Note that because the file has always been non-seekable, you can't open
      it and read it multiple times anyway, so the data has always been
      generated just once.  The difference is that now it is generated at open
      time rather than at the time of the first read, and that avoids the
      race.
      Reported-by: Dave Jones <davej@redhat.com>
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Tested-by: Raghavendra <raghavendra.kt@linux.vnet.ibm.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 19 Sep, 2012 · 3 commits
    • xfs: stop the sync worker before xfs_unmountfs · 0ba6e536
      Committed by Ben Myers
      Cancel work of the xfs_sync_worker before teardown of the log in
      xfs_unmountfs.  This prevents occasional crashes on unmount like so:
      
      PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
       #0 [c5377d28] crash_kexec at c0292c94
       #1 [c5377d80] oops_end at c07090c2
       #2 [c5377d98] no_context at c06f614e
       #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
       #4 [c5377df4] bad_area_nosemaphore at c06f629b
       #5 [c5377e00] do_page_fault at c070b0cb
       #6 [c5377e7c] error_code (via page_fault) at c070892c
          EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
          DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
          CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
       #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
       #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
       #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
      #10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
      #11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
      #12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
      #13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
      #14 [c5377f58] process_one_work at c024ee4c
      #15 [c5377f98] worker_thread at c024f43d
      #16 [c5377fbc] kthread at c025326b
      #17 [c5377fe8] kernel_thread_helper at c070e834
      
      PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
       #0 [cde0fda0] __schedule at c0706595
       #1 [cde0fe28] schedule at c0706b89
       #2 [cde0fe30] schedule_timeout at c0705600
       #3 [cde0fe94] __down_common at c0706098
       #4 [cde0fec8] __down at c0706122
       #5 [cde0fed0] down at c025936f
       #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
       #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
       #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
       #9 [cde0ff1c] generic_shutdown_super at c0333d7a
      #10 [cde0ff38] kill_block_super at c0333e0f
      #11 [cde0ff48] deactivate_locked_super at c0334218
      #12 [cde0ff58] deactivate_super at c033495d
      #13 [cde0ff68] mntput_no_expire at c034bc13
      #14 [cde0ff7c] sys_umount at c034cc69
      #15 [cde0ffa0] sys_oldumount at c034ccd4
      #16 [cde0ffb0] system_call at c0707e66
      
      commit 11159a05 added this to xfs_log_unmount and needs to be cleaned up
      at a later date.
      Signed-off-by: Ben Myers <bpm@sgi.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
    • cifs: fix return value in cifsConvertToUTF16 · c73f6939
      Committed by Jeff Layton
      This function returns the wrong value, which causes the callers to get
      the length of the resulting pathname wrong when it contains non-ASCII
      characters.
      
      This seems to fix https://bugzilla.samba.org/show_bug.cgi?id=6767
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Baldvin Kovacs <baldvin.kovacs@gmail.com>
      Reported-and-Tested-by: Nicolas Lefebvre <nico.lefebvre@gmail.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
    • vfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill() · b161dfa6
      Committed by Miklos Szeredi
      IBM reported a soft lockup after applying the fix for the rename_lock
      deadlock.  Commit c83ce989 ("VFS: Fix the nfs sillyrename regression
      in kernel 2.6.38") was found to be the culprit.
      
      The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the
      dentry was killed.  This flag can be set on non-killed dentries too,
      which results in infinite retries when trying to traverse the dentry
      tree.
      
      This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is
      only set in d_kill() and makes try_to_ascend() test only this flag.
      
      IBM reported successful test results with this patch.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 18 Sep, 2012 · 1 commit
    • fs/proc: fix potential unregister_sysctl_table hang · 6bf61045
      Committed by Francesco Ruggeri
      The unregister_sysctl_table() function hangs if all references to its
      ctl_table_header structure are not dropped.
      
      This can happen sometimes because of a leak in proc_sys_lookup():
      proc_sys_lookup() gets a reference to the table via lookup_entry(), but
      it does not release it when a subsequent call to sysctl_follow_link()
      fails.
      
      This patch fixes this leak by making sure the reference is always
      dropped on return.
      
      See also commit 076c3eed ("sysctl: Rewrite proc_sys_lookup
      introducing find_entry and lookup_entry") which reorganized this code in
      3.4.
      
      Tested in Linux 3.4.4.
      Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
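The leak pattern and its fix can be sketched with a single exit path that always drops the reference. This is an illustrative userspace model, not the proc_sysctl code: struct header, get_entry(), put_entry() and lookup() are hypothetical names, and follow_ok merely simulates whether the link-following step succeeds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted table header. */
struct header { int refs; };

/* Takes a reference, as the lookup step is described as doing. */
static struct header *get_entry(struct header *h)
{
    assert(h != NULL);
    h->refs++;
    return h;
}

/* Drops a reference taken by get_entry(). */
static void put_entry(struct header *h)
{
    assert(h != NULL && h->refs > 0);
    h->refs--;
}

/* follow_ok simulates success or failure of the link-following step. */
static int lookup(struct header *table, int follow_ok)
{
    struct header *h = get_entry(table);
    int err = 0;

    if (!follow_ok) {
        err = -1;
        goto out;       /* the failure path must not skip the put */
    }
    /* ... use the entry while the reference is held ... */
out:
    put_entry(h);       /* reference is dropped on every return path */
    return err;
}
```

Because both outcomes leave the count where it started, a later unregister that waits for the count to drain would not hang in this model.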
  8. 15 Sep, 2012 · 5 commits
  9. 13 Sep, 2012 · 3 commits
  10. 12 Sep, 2012 · 1 commit
  11. 07 Sep, 2012 · 3 commits
  12. 06 Sep, 2012 · 1 commit
  13. 05 Sep, 2012 · 5 commits
  14. 04 Sep, 2012 · 1 commit
  15. 03 Sep, 2012 · 1 commit
    • fuse: mark variables uninitialized · 381bf7ca
      Committed by Daniel Mack
      gcc 4.6.3 complains about uninitialized variables in fs/fuse/control.c:
      
        CC      fs/fuse/control.o
      fs/fuse/control.c: In function 'fuse_conn_congestion_threshold_write':
      fs/fuse/control.c:165:29: warning: 'val' may be used uninitialized in this function [-Wuninitialized]
      fs/fuse/control.c: In function 'fuse_conn_max_background_write':
      fs/fuse/control.c:128:23: warning: 'val' may be used uninitialized in this function [-Wuninitialized]
      
      fuse_conn_limit_write() will always return non-zero unless the &val
      is modified, so the warning is misleading. Let the compiler know
      about it by marking 'val' with 'uninitialized_var'.
      Signed-off-by: Daniel Mack <zonque@gmail.com>
      Cc: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  16. 31 Aug, 2012 · 2 commits
    • cuse: kill connection on initialization error · 8d39d801
      Committed by Miklos Szeredi
      Luca Risolia reported that a CUSE daemon will continue to run even if
      initialization of the emulated device fails for some reason (e.g. the device
      number is already registered by another driver).
      
      This patch disconnects the fuse device on error, which will make the userspace
      CUSE daemon exit, albeit without indication about what the problem was.
      Reported-by: Luca Risolia <luca.risolia@studio.unibo.it>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • cuse: fix fuse_conn_kill() · bbd99797
      Committed by Miklos Szeredi
      fuse_conn_kill() removed fc->entry, called fuse_ctl_remove_conn() and
      fuse_bdi_destroy(), none of which is appropriate for cuse cleanup.
      
      The fuse_ctl_remove_conn() decrements the nlink on the control filesystem, which
      is totally bogus.  The others are harmless but unnecessary.
      
      So move these out from fuse_conn_kill() to fuse_put_super() where they belong.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
  17. 30 Aug, 2012 · 1 commit
    • xfs: fix race while discarding buffers [V4] · 6fb8a90a
      Committed by Carlos Maiolino
      While xfs_buftarg_shrink() is freeing buffers from the dispose list (filled with
      buffers from lru list), there is a possibility to have xfs_buf_stale() racing
      with it, and removing buffers from dispose list before xfs_buftarg_shrink() does
      it.
      
      This happens because xfs_buftarg_shrink() handles the dispose list without
      locking and the test condition in xfs_buf_stale() checks for the buffer being in
      *any* list:
      
      if (!list_empty(&bp->b_lru))
      
      If the buffer happens to be on the dispose list, the buffer counter of the
      lru list (btp->bt_lru_nr) is decremented twice (once in xfs_buftarg_shrink()
      and again in xfs_buf_stale()), which leaves the lru list accounting wrong.
      
      This may cause xfs_buftarg_shrink() to return a wrong value to the memory
      shrinker shrink_slab(), and such an accounting error may also cause an
      underflowed value to be returned: since the counter is lower than the current
      number of items in the lru list, a decrement may happen when the counter is 0,
      causing an underflow on the counter.
      
      The fix uses a new flag field (and a new buffer flag) to serialize buffer
      handling during the shrink process. The new flag field has been designed to use
      btp->bt_lru_lock/unlock instead of xfs_buf_lock/unlock mechanism.
      
      dchinner, sandeen, aquini and aris also deserve credits for this.
      Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
  18. 29 Aug, 2012 · 5 commits