1. 14 Jul, 2012 16 commits
  2. 13 Jul, 2012 1 commit
    • J
      block: fix infinite loop in __getblk_slow · 91f68c89
      Jeff Moyer committed
      Commit 080399aa ("block: don't mark buffers beyond end of disk as
      mapped") exposed a bug in __getblk_slow that causes mount to hang as it
      loops infinitely waiting for a buffer that lies beyond the end of the
      disk to become uptodate.
      
      The problem was initially reported by Torsten Hilbrich here:
      
          https://lkml.org/lkml/2012/6/18/54
      
      and also reported independently here:
      
          http://www.sysresccd.org/forums/viewtopic.php?f=13&t=4511
      
      and then Richard W.M.  Jones and Marcos Mello noted a few separate
      bugzillas also associated with the same issue.  This patch has been
      confirmed to fix:
      
          https://bugzilla.redhat.com/show_bug.cgi?id=835019
      
      The main problem is here, in __getblk_slow:
      
              for (;;) {
                      struct buffer_head * bh;
                      int ret;
      
                      bh = __find_get_block(bdev, block, size);
                      if (bh)
                              return bh;
      
                      ret = grow_buffers(bdev, block, size);
                      if (ret < 0)
                              return NULL;
                      if (ret == 0)
                              free_more_memory();
              }
      
      __find_get_block does not find the block, since it will not be marked as
      mapped, and so grow_buffers is called to fill in the buffers for the
      associated page.  I believe the for (;;) loop is there primarily to
      retry in the case of memory pressure keeping grow_buffers from
      succeeding.  However, we also continue to loop for other cases, like the
      block lying beyond the end of the disk.  So, the fix I came up with is to
      only loop when grow_buffers fails due to memory allocation issues
      (return value of 0).
      
      The attached patch was tested by myself, Torsten, and Rich, and was
      found to resolve the problem in all cases.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Reported-and-Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
      Tested-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Josh Boyer <jwboyer@redhat.com>
      Cc: Stable <stable@vger.kernel.org>  # 3.0+
      [ Jens is on vacation, taking this directly  - Linus ]
      --
      Stable Notes: this patch requires backport to 3.0, 3.2 and 3.3.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      91f68c89
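The retry rule described in this commit message can be sketched in user space. This is only an illustration under stated assumptions: `find_block`, `grow_block`, `getblk_sketch`, the 8-block device and the simulated memory pressure are hypothetical stand-ins, not the kernel's `__find_get_block`, `grow_buffers` or `__getblk_slow`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel helpers; not the real code. */
enum { DISK_BLOCKS = 8 };           /* assumed device size, in blocks */

static int memory_pressure = 2;     /* simulated allocation failures left */
static int cached[DISK_BLOCKS];     /* 1 once a block has a buffer */
static int buffers[DISK_BLOCKS];    /* fake buffer storage */

/* Look up an existing buffer; NULL if absent or out of range. */
static int *find_block(long block)
{
    if (block < 0 || block >= DISK_BLOCKS || !cached[block])
        return NULL;
    return &buffers[block];
}

/* Returns <0 on a hard error, 0 when memory is short, 1 on success. */
static int grow_block(long block)
{
    if (block < 0 || block >= DISK_BLOCKS)
        return -1;                  /* beyond end of disk: retrying cannot help */
    if (memory_pressure > 0) {
        memory_pressure--;          /* pretend a later attempt may succeed */
        return 0;
    }
    cached[block] = 1;
    return 1;
}

/* Loop only while the failure is a memory shortage (return value 0). */
static int *getblk_sketch(long block)
{
    for (;;) {
        int *bh = find_block(block);
        int ret;

        if (bh)
            return bh;

        ret = grow_block(block);
        if (ret < 0)
            return NULL;            /* hard error: stop instead of spinning */
        /* ret == 0: the kernel would try to free memory here, then retry */
    }
}
```

In this sketch a lookup for an in-range block survives two simulated allocation failures, while a lookup beyond the assumed end of the device returns NULL on the first pass instead of looping forever.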
  3. 12 Jul, 2012 3 commits
  4. 11 Jul, 2012 1 commit
  5. 08 Jul, 2012 2 commits
  6. 07 Jul, 2012 1 commit
  7. 04 Jul, 2012 8 commits
    • J
      ocfs2: Fix bogus error message from ocfs2_global_read_info · a4564ead
      Jan Kara committed
      The 'status' variable in ocfs2_global_read_info() is always != 0 when leaving
      the function because it happens to contain the number of bytes read. Thus we
      always log an error message even though everything is OK. Since all error cases
      properly call mlog_errno() before jumping to out_err, there's no reason to call
      mlog_errno() on exit at all. This is a fallout of c1e8d35e (conversion of
      mlog_exit() calls).
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      a4564ead
    • J
      ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if... · 65622e64
      Jeff Liu committed
      ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if ocfs2_get_clusters_nocache() or ocfs2_inode_lock() call failed.
      
      Hello,
      
      Since ENXIO only means "offset beyond EOF" for SEEK_DATA/SEEK_HOLE, we should
      return the internal error unchanged, rather than ENXIO, if the
      ocfs2_inode_lock() or ocfs2_get_clusters_nocache() call fails.
      Otherwise, user applications will be misled when they try to understand the root cause.
      
      Thanks Dave for pointing this out.
      
      Thanks,
      -Jeff
      
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      65622e64
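The error-propagation rule from this commit message can be illustrated with a small user-space sketch. The function `seek_sketch` and its parameters are hypothetical; the lock and lookup results are passed in as plain integers (0 on success, negative errno on failure) rather than coming from the real ocfs2 helpers.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical sketch: lock_rc and lookup_rc model the results of the
 * locking and cluster-lookup steps. ENXIO is reserved for the single
 * case it is meant to describe: an offset at or beyond end of file.
 */
static long seek_sketch(long offset, long file_size, int lock_rc, int lookup_rc)
{
    if (lock_rc < 0)
        return lock_rc;         /* pass the internal error through unchanged */

    if (offset >= file_size)
        return -ENXIO;          /* the only "offset beyond EOF" case */

    if (lookup_rc < 0)
        return lookup_rc;       /* again, do not rewrite the error as ENXIO */

    return offset;              /* simplified: report the offset itself */
}
```

A caller that sees `-ENXIO` from this sketch can rely on it meaning "beyond EOF", while `-EIO` or `-ENOMEM` still point at the step that actually failed.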
    • S
      ocfs2: use spinlock irqsave for downconvert lock.patch · a75e9cca
      Srinivas Eeda committed
      When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft
      IRQ, it deadlocks itself trying to take the same spinlock in
      ocfs2_wake_downconvert_thread.

      The patch disables interrupts when acquiring the dc_task_lock spinlock.
      The stack snippet is below.
      
      	ocfs2_wake_downconvert_thread
      	ocfs2_rw_unlock
      	ocfs2_dio_end_io
      	dio_complete
      	.....
      	bio_endio
      	req_bio_endio
      	....
      	scsi_io_completion
      	blk_done_softirq
      	__do_softirq
      	do_softirq
      	irq_exit
      	do_IRQ
      	ocfs2_downconvert_thread
      	[kthread]
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      a75e9cca
    • R
      ocfs2: Misplaced parens in unlikley · 16865b7c
      roel committed
      Fix misplaced parentheses
      Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      16865b7c
    • J
      ocfs2: clear unaligned io flag when dio fails · 3e5d3c35
      Junxiao Bi committed
      The unaligned io flag is set in the kiocb when an unaligned
      dio is issued. It should be cleared even when the dio fails,
      or it may affect subsequent io that uses the same
      kiocb.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      3e5d3c35
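The idea of clearing a per-request flag on every exit path can be sketched as follows. The names `request_sketch`, `FLAG_UNALIGNED_SKETCH` and `submit_dio_sketch` are hypothetical and only model the pattern; they are not the ocfs2 or kiocb interfaces.

```c
#include <assert.h>

#define FLAG_UNALIGNED_SKETCH 0x1UL   /* hypothetical flag bit */

/* Hypothetical reusable request object, standing in for a kiocb. */
struct request_sketch {
    unsigned long flags;
    int completed;                    /* count of successful submissions */
};

/* io_rc models the result of the I/O: 0 on success, negative on failure. */
static int submit_dio_sketch(struct request_sketch *req, int aligned, int io_rc)
{
    if (!aligned)
        req->flags |= FLAG_UNALIGNED_SKETCH;

    if (io_rc < 0)
        goto out;                     /* failure path still reaches the cleanup */

    req->completed++;
out:
    /* Clear the flag on success and on failure alike. */
    req->flags &= ~FLAG_UNALIGNED_SKETCH;
    return io_rc;
}
```

Routing both outcomes through one cleanup label means a failed unaligned request cannot leave the flag behind for the next user of the same object.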
    • T
      eCryptfs: Fix lockdep warning in miscdev operations · 60d65f1f
      Tyler Hicks committed
      Don't grab the daemon mutex while holding the message context mutex.
      Addresses this lockdep warning:
      
       ecryptfsd/2141 is trying to acquire lock:
        (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}, at: [<ffffffffa029c213>] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs]
      
       but task is already holding lock:
        (&(*daemon)->mux){+.+...}, at: [<ffffffffa029c2ec>] ecryptfs_miscdev_read+0x21c/0x470 [ecryptfs]
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&(*daemon)->mux){+.+...}:
              [<ffffffff810a3b8d>] lock_acquire+0x9d/0x220
              [<ffffffff8151c6da>] __mutex_lock_common+0x5a/0x4b0
              [<ffffffff8151cc64>] mutex_lock_nested+0x44/0x50
              [<ffffffffa029c5d7>] ecryptfs_send_miscdev+0x97/0x120 [ecryptfs]
              [<ffffffffa029b744>] ecryptfs_send_message+0x134/0x1e0 [ecryptfs]
              [<ffffffffa029a24e>] ecryptfs_generate_key_packet_set+0x2fe/0xa80 [ecryptfs]
              [<ffffffffa02960f8>] ecryptfs_write_metadata+0x108/0x250 [ecryptfs]
              [<ffffffffa0290f80>] ecryptfs_create+0x130/0x250 [ecryptfs]
              [<ffffffff811963a4>] vfs_create+0xb4/0x120
              [<ffffffff81197865>] do_last+0x8c5/0xa10
              [<ffffffff811998f9>] path_openat+0xd9/0x460
              [<ffffffff81199da2>] do_filp_open+0x42/0xa0
              [<ffffffff81187998>] do_sys_open+0xf8/0x1d0
              [<ffffffff81187a91>] sys_open+0x21/0x30
              [<ffffffff81527d69>] system_call_fastpath+0x16/0x1b
      
       -> #0 (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}:
              [<ffffffff810a3418>] __lock_acquire+0x1bf8/0x1c50
              [<ffffffff810a3b8d>] lock_acquire+0x9d/0x220
              [<ffffffff8151c6da>] __mutex_lock_common+0x5a/0x4b0
              [<ffffffff8151cc64>] mutex_lock_nested+0x44/0x50
              [<ffffffffa029c213>] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs]
              [<ffffffff811887d3>] vfs_read+0xb3/0x180
              [<ffffffff811888ed>] sys_read+0x4d/0x90
              [<ffffffff81527d69>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      60d65f1f
    • T
      eCryptfs: Properly check for O_RDONLY flag before doing privileged open · 9fe79d76
      Tyler Hicks committed
      If the first attempt at opening the lower file read/write fails,
      eCryptfs will retry using a privileged kthread. However, the privileged
      retry should not happen if the lower file's inode is read-only because a
      read/write open will still be unsuccessful.
      
      The check for determining if the open should be retried was intended to
      be based on the access mode of the lower file's open flags being
      O_RDONLY, but the check was incorrectly performed. This would cause the
      open to be retried by the privileged kthread, resulting in a second
      failed open of the lower file. This patch corrects the check to
      determine if the open request should be handled by the privileged
      kthread.
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
      9fe79d76
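Checking an access mode is easy to get wrong because O_RDONLY is a value of the access-mode field, not a flag bit, and on Linux it is defined as 0, so a plain bitwise AND against it can never be true. A minimal sketch of the usual idiom follows; `is_read_only_open` and `should_retry_privileged` are hypothetical helpers for illustration, not eCryptfs functions, and the commit message does not show the exact form of the original check.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Isolate the access-mode field with O_ACCMODE, then compare. */
static bool is_read_only_open(int flags)
{
    return (flags & O_ACCMODE) == O_RDONLY;
}

/*
 * Hypothetical policy helper: retry through a privileged path only when
 * the first open failed and the request was not read-only.
 */
static bool should_retry_privileged(int first_open_rc, int flags)
{
    return first_open_rc != 0 && !is_read_only_open(flags);
}
```

With this shape, a failed read-only open is reported once, and only a failed read/write open is handed to the retry path.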
    • J
      cifs: when server doesn't set CAP_LARGE_READ_X, cap default rsize at MaxBufferSize · ec01d738
      Jeff Layton committed
      When the server doesn't advertise CAP_LARGE_READ_X, then MS-CIFS states
      that you must cap the size of the read at the client's MaxBufferSize.
      Unfortunately, testing with many older servers shows that they often
      can't service a read larger than their own MaxBufferSize.
      
      Since we can't assume what the server will do in this situation, we must
      be conservative here for the default. When the server can't do large
      reads, then assume that it can't satisfy any read larger than its
      MaxBufferSize either.
      
      Luckily almost all modern servers can do large reads, so this won't
      affect them. This is really just for older win9x and OS/2 era servers.
      Also, note that this patch just governs the default rsize. The admin can
      always override this if he so chooses.
      
      Cc: <stable@vger.kernel.org> # 3.2
      Reported-by: David H. Durgee <dhdurgee@acm.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steven French <sfrench@w500smf.(none)>
      ec01d738
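The default-selection rule in this commit message can be sketched as below. The function names, the parameter layout and the convention that 0 means "no admin override" are assumptions for illustration; the real cifs code also accounts for protocol header sizes, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: when the server does not advertise large reads,
 * be conservative and cap the default at the server's MaxBufferSize.
 */
static unsigned int default_rsize_sketch(bool server_large_read_x,
                                         unsigned int large_default,
                                         unsigned int server_max_buf)
{
    if (server_large_read_x)
        return large_default;
    return server_max_buf < large_default ? server_max_buf : large_default;
}

/* The cap only governs the default; an explicit admin value wins. */
static unsigned int effective_rsize_sketch(unsigned int admin_rsize,
                                           bool server_large_read_x,
                                           unsigned int large_default,
                                           unsigned int server_max_buf)
{
    if (admin_rsize != 0)
        return admin_rsize;
    return default_rsize_sketch(server_large_read_x, large_default,
                                server_max_buf);
}
```

The sizes used when calling these helpers are arbitrary; the point is that servers with large-read support are unaffected, while older servers get a read size they are likely to be able to service.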
  8. 03 Jul, 2012 8 commits
    • C
      Btrfs: run delayed directory updates during log replay · b6305567
      Chris Mason committed
      While we are resolving directory modifications in the
      tree log, we are triggering delayed metadata updates to
      the filesystem btrees.
      
      This commit forces the delayed updates to run so the
      replay code can find any modifications done.  It stops
      us from crashing because the directory deletion replay
      expects items to be removed immediately from the tree.
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
      cc: stable@kernel.org
      b6305567
    • J
      Btrfs: hold a ref on the inode during writepages · 7fd1a3f7
      Josef Bacik committed
      We can race with unlink and not actually be able to do our igrab in
      btrfs_add_ordered_extent.  This will result in all sorts of problems.
      Instead of doing the complicated work to try and handle returning an error
      properly from btrfs_add_ordered_extent, just hold a ref to the inode during
      writepages.  If we cannot grab a ref we know we're freeing this inode anyway
      and can just drop the dirty pages on the floor, because screw them we're
      going to invalidate them anyway.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      7fd1a3f7
    • J
      Btrfs: fix tree log remove space corner case · bdb7d303
      Josef Bacik committed
      The tree log stuff can have allocated space that we end up having split
      across a bitmap and a real extent.  The free space code does not deal with
      this, it assumes that if it finds an extent or bitmap entry that the entire
      range must fall within the entry it finds.  This isn't necessarily the case,
      so rework the remove function so it can handle this case properly.  This
      fixed two panics the user hit, first in the case where the space was
      initially in a bitmap and then in an extent entry, and then the reverse
      case.  Thanks,
      Reported-and-tested-by: Shaun Reich <sreich@kde.org>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      bdb7d303
    • L
      Btrfs: fix wrong check during log recovery · 6bf02314
      Liu Bo committed
      When we're evicting an inode during log recovery, we need to ensure that the inode
      is no longer in orphan state, which means the inode's runtime flags do _not_ include
      BTRFS_INODE_HAS_ORPHAN_ITEM.  The BUG_ON was triggered because the check on
      the flags was wrong.
      Reviewed-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      6bf02314
    • A
      Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS · d3a94048
      Alexander Block committed
      We used the wrong ioctl macro for the getflags ioctl before.
      As we don't have the set/getflags ioctls in the user space ioctl.h
      at the moment, it's safe to fix it now.
      Reviewed-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Alexander Block <ablock84@googlemail.com>
      d3a94048
    • I
      Btrfs: resume balance on rw (re)mounts properly · 2b6ba629
      Ilya Dryomov committed
      This introduces btrfs_resume_balance_async(), which, given that
      restriper state was recovered earlier by btrfs_recover_balance(),
      resumes balance in btrfs-balance kthread.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      2b6ba629
    • I
      Btrfs: restore restriper state on all mounts · 68310a5e
      Ilya Dryomov committed
      Fix a bug that triggered asserts in btrfs_balance() in both normal and
      resume modes -- restriper state was not properly restored on read-only
      mounts.  This factors out resuming code from btrfs_restore_balance(),
      which is now also called earlier in the mount sequence to avoid the
      problem of some early writes getting the old profile.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      68310a5e
    • J
      Btrfs: fix dio write vs buffered read race · c3473e83
      Josef Bacik committed
      Miao pointed out there's a problem with mixing dio writes and buffered
      reads.  If the read happens between us invalidating the page range and
      actually locking the extent we can bring in pages into page cache.  Then
      once the write finishes if somebody tries to read again it will just find
      uptodate pages and we'll read stale data.  So we need to lock the extent and
      check for uptodate bits in the range.  If there are uptodate bits we need to
      unlock and invalidate again.  This will keep this race from happening since
      we will hold the extent locked until we create the ordered extent, and then
      the read side always waits for ordered extents.  There was also a race in
      how we updated i_size, previously we were relying on the generic DIO stuff
      to adjust the i_size after the DIO had completed, but this happens outside
      of the extent lock which means reads could come in and not see the updated
      i_size.  So instead move this work into where we create the extents, and
      then this way the update ordered i_size stuff works properly in the endio
      handlers.  Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
      c3473e83