1. 26 Jul, 2012 2 commits
  2. 25 Jul, 2012 2 commits
  3. 20 Jul, 2012 6 commits
    • B
      pnfs-obj: Fix __r4w_get_page when offset is beyond i_size · c999ff68
      Boaz Harrosh committed
      It is very common for the end of the file to be unaligned to the
      stripe size. But since we know the offset is beyond the file's end,
      the XOR should be performed with all zeros.
      
      The old code used to just read zeros out of the OSD devices, which is a great
      waste. But what scares me more about this situation is that we now have
      pages attached to the file's mapping that are beyond i_size. I don't
      like the kind of bugs this calls for.
      
      Fix both problems by returning a global zero_page if the offset is beyond
      i_size.
      
      TODO:
      	Change the API to ->__r4w_get_page() so a NULL can be
      	returned without being considered as error, since XOR API
      	treats NULL entries as zero_pages.
      
      [Bug since 3.2. Should apply the same way to all Kernels since]
      CC: Stable Tree <stable@kernel.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      c999ff68
    • B
      pnfs-obj: don't leak objio_state if ore_write/read fails · 9909d45a
      Boaz Harrosh committed
      [Bug since 3.2 Kernel]
      CC: Stable Tree <stable@kernel.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      9909d45a
    • B
      ore: Unlock r4w pages in exact reverse order of locking · 537632e0
      Boaz Harrosh committed
      The read-4-write pages are locked in ascending address order,
      but were unlocked in whatever way was easiest to code. Fix that:
      locks should be released in the opposite order of locking, i.e.
      descending address order.
      
      I have not hit this deadlock. It was found by inspecting the
      debug print-outs. I suspect there is a higher-level lock at the caller that
      protects us, but fix it regardless.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      537632e0
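The ordering rule in the commit above can be illustrated with a minimal user-space sketch. `struct page_stub`, `lock_all()` and `unlock_all_reverse()` are hypothetical names invented for this example; they are not the ORE code and do not use the kernel's page-lock API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lockable page; not the kernel's struct page. */
struct page_stub {
    int locked;
};

/* Take the locks in ascending index (address) order. */
static void lock_all(struct page_stub *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(!pages[i].locked);
        pages[i].locked = 1;
    }
}

/*
 * Release the locks in exact reverse (descending) order.
 * The release order is recorded in unlock_order[] so a caller can inspect it.
 */
static void unlock_all_reverse(struct page_stub *pages, size_t n,
                               size_t *unlock_order)
{
    size_t k = 0;

    for (size_t i = n; i-- > 0; ) {
        assert(pages[i].locked);
        pages[i].locked = 0;
        unlock_order[k++] = i;
    }
}
```

With three pages, `lock_all()` takes them as 0, 1, 2 and `unlock_all_reverse()` releases them as 2, 1, 0.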
    • B
      ore: Remove support of partial IO request (NFS crash) · 62b62ad8
      Boaz Harrosh committed
      Due to OOM situations the ORE might fail to allocate all resources
      needed for IO of the full request. If some progress was possible
      it would proceed with a partial/short request, for the sake of
      forward progress.
      
      Since this crashes NFS-core, and exofs is just fine without it, just
      remove this contraption and fail.
      
      TODO:
      	Support real forward progress with some reserved allocations
      	of resources, such as mem pools and/or bio_sets
      
      [Bug since 3.2 Kernel]
      CC: Stable Tree <stable@kernel.org>
      CC: Benny Halevy <bhalevy@tonian.com>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      62b62ad8
    • B
      ore: Fix NFS crash by supporting any unaligned RAID IO · 9ff19309
      Boaz Harrosh committed
      In RAID_5/6 we used to not permit an IO whose end
      byte is not stripe_size aligned and that spans more than one stripe,
      i.e. the caller must check after submission whether the actual
      transferred byte count is shorter, and would need to resubmit
      a new IO with the remainder.
      
      Exofs supports this, and NFS was supposed to support this
      as well with its short write mechanism. But late testing has
      exposed a CRASH when this is used with non-RPC layout-drivers.
      
      The change at NFS is deep and risky; in its place the fix
      at ORE to lift the limitation is actually clean and simple.
      So here it is below.
      
      The principle here is that in the case of IO that is unaligned on
      both ends, beginning and end, we will send two read requests:
      one like the old code, before the calculation of the first stripe,
      and one at a new site, before the calculation of the last stripe.
      If any "boundary" is aligned, or the complete IO is within a single
      stripe, we do a single read like before.
      
      The code is kept clean and simple by splitting the old _read_4_write
      into 3 even parts:
      1. _read_4_write_first_stripe
      2. _read_4_write_last_stripe
      3. _read_4_write_execute
      
      And calling 1+3 at the same place as before, 2+3 before the last
      stripe, and in the case of all in a single stripe then 1+2+3
      is performed additively.
      
      Why did I not think of it before? Well, I had a stroke of
      genius: I have stared at this code for 2 years and did
      not find this simple solution until today. Not that I did not try.
      
      This solution is much better for NFS than the previous supposed
      solution, because the short write was dealt with out-of-band after
      IO_done, which would cause a seeky IO pattern, whereas here
      we execute in order. In both solutions we do 2 separate reads, only
      here we do it within a single IO request. (And actually combine two
      writes into a single submission.)
      
      NFS/exofs code need not change since the ORE API communicates the new
      shorter length on return; what will happen is that this case will not
      occur anymore.
      
      hurray!!
      
      [Stable this is an NFS bug since 3.2 Kernel should apply cleanly]
      CC: Stable Tree <stable@kernel.org>
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      9ff19309
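As a rough illustration of the case analysis in the commit above, the sketch below decides which read-4-write parts an IO would need and how many separate read submissions that implies. It is a simplified model built only from the commit message: `plan_r4w()`, `struct r4w_plan` and the `R4W_*` flags are hypothetical names, not the ORE implementation, and the assumption that a fully aligned IO needs no read at all is the sketch's own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags naming the two "prepare" parts from the commit message. */
enum { R4W_FIRST = 1, R4W_LAST = 2 };

struct r4w_plan {
    int parts;     /* bitmask of R4W_FIRST / R4W_LAST */
    int executes;  /* number of separate read submissions */
};

static struct r4w_plan plan_r4w(uint64_t offset, uint64_t length,
                                uint64_t stripe_size)
{
    struct r4w_plan plan = { 0, 0 };
    uint64_t end;

    assert(stripe_size > 0 && length > 0);
    end = offset + length;

    if (offset % stripe_size)
        plan.parts |= R4W_FIRST;   /* start is not stripe aligned */
    if (end % stripe_size)
        plan.parts |= R4W_LAST;    /* end is not stripe aligned */

    if (offset / stripe_size == (end - 1) / stripe_size) {
        /* Whole IO inside one stripe: parts are combined, one execute. */
        plan.executes = plan.parts ? 1 : 0;
    } else {
        /* One execute per unaligned boundary. */
        plan.executes = !!(plan.parts & R4W_FIRST) +
                        !!(plan.parts & R4W_LAST);
    }
    return plan;
}
```

For a 64 KiB stripe, an IO starting at 4 KiB and covering two full stripe sizes is unaligned on both ends and yields two reads, while an 8 KiB IO at offset 4 KiB stays inside one stripe and yields a single combined read.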
    • A
      UBIFS: fix a bug in empty space fix-up · c6727932
      Artem Bityutskiy committed
      UBIFS has a feature called "empty space fix-up" which is a quirk to work-around
      limitations of dumb flasher programs. Namely, of those flashers that are unable
      to skip NAND pages full of 0xFFs while flashing, resulting in empty space at
      the end of half-filled eraseblocks to be unusable for UBIFS. This feature is
      relatively new (introduced in v3.0).
      
      The fix-up routine (fixup_free_space()) is executed only once at the very first
      mount if the superblock has the 'space_fixup' flag set (can be done with -F
      option of mkfs.ubifs). It basically reads all the UBIFS data and metadata and
      writes it back to the same LEB. The routine assumes the image is pristine and
      does not have anything in the journal.
      
      There was a bug in 'fixup_free_space()' where it fixed up the log incorrectly.
      All but one LEB of the log of a pristine file-system are empty. And one
      contains just a commit start node. And 'fixup_free_space()' just unmapped this
      LEB, which resulted in wiping the commit start node. As a result, some users
      were unable to mount the file-system next time with the following symptom:
      
      UBIFS error (pid 1): replay_log_leb: first log node at LEB 3:0 is not CS node
      UBIFS error (pid 1): replay_log_leb: log error detected while replaying the log at LEB 3:0
      
      The root cause of this bug was that 'fixup_free_space()' wrongly assumed
      that the beginning of empty space in the log head (c->lhead_offs) was known
      on mount. However, that is not the case: it was always 0. UBIFS does not store
      it in the master node and finds it out by scanning the log on every mount.
      
      The fix is simple - just pass commit start node size instead of 0 to
      'fixup_leb()'.
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
      Cc: stable@vger.kernel.org [v3.0+]
      Reported-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>
      Tested-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>
      Reported-by: James Nute <newten82@gmail.com>
      c6727932
  4. 18 Jul, 2012 2 commits
  5. 17 Jul, 2012 4 commits
    • J
      cifs: always update the inode cache with the results from a FIND_* · cd60042c
      Jeff Layton committed
      When we get back a FIND_FIRST/NEXT result, we have some info about the
      dentry that we use to instantiate a new inode. We were ignoring and
      discarding that info when we had an existing dentry in the cache.
      
      Fix this by updating the inode in place when we find an existing dentry
      and the uniqueid is the same.
      
      Cc: <stable@vger.kernel.org> # .31.x
      Reported-and-Tested-by: Andrew Bartlett <abartlet@samba.org>
      Reported-by: Bill Robertson <bill_robertson@debortoli.com.au>
      Reported-by: Dion Edwards <dion_edwards@debortoli.com.au>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      cd60042c
    • J
      cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps · 3cf003c0
      Jeff Layton committed
      Jian found that when he ran fsx on a 32 bit arch with a large wsize the
      process and one of the bdi writeback kthreads would sometimes deadlock
      with a stack trace like this:
      
      crash> bt
      PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
       #0 [eed63cbc] schedule at c083c5b3
       #1 [eed63d80] kmap_high at c0500ec8
       #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
       #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
       #4 [eed63e50] do_writepages at c04f3e32
       #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
       #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
       #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
       #8 [eed63ecc] do_sync_write at c052d202
       #9 [eed63f74] vfs_write at c052d4ee
      #10 [eed63f94] sys_write at c052df4c
      #11 [eed63fb0] ia32_sysenter_target at c0409a98
          EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
          DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
          SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
          CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246
      
      Each task would kmap part of its address array before getting stuck, but
      not enough to actually issue the write.
      
      This patch fixes this by serializing the marshal_iov operations for
      async reads and writes. The idea here is to ensure that cifs
      aggressively tries to populate a request before attempting to fulfill
      another one. As soon as all of the pages are kmapped for a request, then
      we can unlock and allow another one to proceed.
      
      There's no need to do this serialization on non-CONFIG_HIGHMEM arches
      however, so optimize all of this out when CONFIG_HIGHMEM isn't set.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Jian Li <jiali@redhat.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      3cf003c0
    • J
      cifs: on CONFIG_HIGHMEM machines, limit the rsize/wsize to the kmap space · 3ae629d9
      Jeff Layton committed
      We currently rely on being able to kmap all of the pages in an async
      read or write request. If you're on a machine that has CONFIG_HIGHMEM
      set then that kmap space is limited, sometimes to as low as 512 slots.
      
      With 512 slots, we can only support up to a 2M r/wsize, and that's
      assuming that we can get our greedy little hands on all of them. There
      are other users however, so it's possible we'll end up stuck with a
      size that large.
      
      Since we can't handle a rsize or wsize larger than that currently, cap
      those options at the number of kmap slots we have. We could consider
      capping it even lower, but we currently default to a max of 1M. Might as
      well allow those luddites on 32 bit arches enough rope to hang
      themselves.
      
      A more robust fix would be to teach the send and receive routines how
      to contend with an array of pages so we don't need to marshal up a kvec
      array at all. That's a fairly significant overhaul though, so we'll need
      this limit in place until that's ready.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Jian Li <jiali@redhat.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      3ae629d9
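The "512 slots gives at most 2M" figure in the commit above follows from one page per kmap slot. A small sketch of that arithmetic, assuming 4 KiB pages (the page size is not stated in the commit message) and using a hypothetical helper name `cap_rwsize()`:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Cap a requested rsize/wsize at the number of bytes that can be mapped
 * at once: one page per kmap slot. Hypothetical helper, not the cifs code.
 */
static size_t cap_rwsize(size_t requested, size_t kmap_slots, size_t page_size)
{
    size_t limit;

    assert(page_size > 0);
    limit = kmap_slots * page_size;
    return requested < limit ? requested : limit;
}
```

With 512 slots and 4096-byte pages the limit is 512 * 4096 = 2097152 bytes (2 MiB), so a 16 MiB request is capped to 2 MiB while the 1 MiB default passes through unchanged.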
    • S
      Initialise mid_q_entry before putting it on the pending queue · ffc61ccb
      Sachin Prabhu committed
      A user reported a crash in cifs_demultiplex_thread() caused by an
      incorrectly set mid_q_entry->callback() function. It appears that the
      callback assignment made in cifs_call_async() was not flushed back to
      memory suggesting that a memory barrier was required here. Changing the
      code to make sure that the mid_q_entry structure was completely
      initialised before it was added to the pending queue fixes the problem.
      Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      ffc61ccb
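The ordering described in the commit above (finish initialising the entry, then publish it) can be sketched in plain C. This is a single-threaded illustration with hypothetical types `struct mid_stub` and `struct pending_queue`; the real cifs code protects its queue with a lock, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

struct mid_stub;
typedef void (*mid_callback_t)(struct mid_stub *);

/* Hypothetical, simplified request entry; not the kernel's mid_q_entry. */
struct mid_stub {
    mid_callback_t callback;
    void *callback_data;
    struct mid_stub *next;
};

struct pending_queue {
    struct mid_stub *head;
};

/* Trivial callback used only to have something to install. */
static void noop_callback(struct mid_stub *mid)
{
    (void)mid;
}

static void queue_mid(struct pending_queue *q, struct mid_stub *mid,
                      mid_callback_t cb, void *data)
{
    assert(cb != NULL);

    /* 1. Fill in every field a consumer of the queue will read. */
    mid->callback = cb;
    mid->callback_data = data;

    /* 2. Only then make the entry reachable from the queue. */
    mid->next = q->head;
    q->head = mid;
}
```

A consumer walking `q->head` can then never observe an entry whose `callback` is still unset, which is the property the fix restores.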
  6. 16 Jul, 2012 1 commit
    • A
      fifo: Do not restart open() if it already found a partner · 05d290d6
      Anders Kaseorg committed
      If a parent and child process open the two ends of a fifo, and the
      child immediately exits, the parent may receive a SIGCHLD before its
      open() returns.  In that case, we need to make sure that open() will
      return successfully after the SIGCHLD handler returns, instead of
      throwing EINTR or being restarted.  Otherwise, the restarted open()
      would incorrectly wait for a second partner on the other end.
      
      The following test demonstrates the EINTR that was wrongly thrown from
      the parent’s open().  Change .sa_flags = 0 to .sa_flags = SA_RESTART
      to see a deadlock instead, in which the restarted open() waits for a
      second reader that will never come.  (On my systems, this happens
      pretty reliably within about 5 to 500 iterations.  Others report that
      it manages to loop ~forever sometimes; YMMV.)
      
        #include <sys/stat.h>
        #include <sys/types.h>
        #include <sys/wait.h>
        #include <fcntl.h>
        #include <signal.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>
      
        #define CHECK(x) do if ((x) == -1) {perror(#x); abort();} while(0)
      
        void handler(int signum) {}
      
        int main()
        {
            struct sigaction act = {.sa_handler = handler, .sa_flags = 0};
            CHECK(sigaction(SIGCHLD, &act, NULL));
            CHECK(mknod("fifo", S_IFIFO | S_IRWXU, 0));
            for (;;) {
                int fd;
                pid_t pid;
                putc('.', stderr);
                CHECK(pid = fork());
                if (pid == 0) {
                    CHECK(fd = open("fifo", O_RDONLY));
                    _exit(0);
                }
                CHECK(fd = open("fifo", O_WRONLY));
                CHECK(close(fd));
                CHECK(waitpid(pid, NULL, 0));
            }
        }
      
      This is what I suspect was causing the Git test suite to fail in
      t9010-svn-fe.sh:
      
      	http://bugs.debian.org/678852
      
      Signed-off-by: Anders Kaseorg <andersk@mit.edu>
      Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      05d290d6
  7. 14 Jul, 2012 5 commits
  8. 13 Jul, 2012 1 commit
    • J
      block: fix infinite loop in __getblk_slow · 91f68c89
      Jeff Moyer committed
      Commit 080399aa ("block: don't mark buffers beyond end of disk as
      mapped") exposed a bug in __getblk_slow that causes mount to hang as it
      loops infinitely waiting for a buffer that lies beyond the end of the
      disk to become uptodate.
      
      The problem was initially reported by Torsten Hilbrich here:
      
          https://lkml.org/lkml/2012/6/18/54
      
      and also reported independently here:
      
          http://www.sysresccd.org/forums/viewtopic.php?f=13&t=4511
      
      and then Richard W.M.  Jones and Marcos Mello noted a few separate
      bugzillas also associated with the same issue.  This patch has been
      confirmed to fix:
      
          https://bugzilla.redhat.com/show_bug.cgi?id=835019
      
      The main problem is here, in __getblk_slow:
      
              for (;;) {
                      struct buffer_head * bh;
                      int ret;
      
                      bh = __find_get_block(bdev, block, size);
                      if (bh)
                              return bh;
      
                      ret = grow_buffers(bdev, block, size);
                      if (ret < 0)
                              return NULL;
                      if (ret == 0)
                              free_more_memory();
              }
      
      __find_get_block does not find the block, since it will not be marked as
      mapped, and so grow_buffers is called to fill in the buffers for the
      associated page.  I believe the for (;;) loop is there primarily to
      retry in the case of memory pressure keeping grow_buffers from
      succeeding.  However, we also continue to loop for other cases, like the
      block lying beyond the end of the disk.  So, the fix I came up with is to
      only loop when grow_buffers fails due to memory allocation issues
      (return value of 0).
      
      The attached patch was tested by myself, Torsten, and Rich, and was
      found to resolve the problem in all cases.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Reported-and-Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
      Tested-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Josh Boyer <jwboyer@redhat.com>
      Cc: Stable <stable@vger.kernel.org>  # 3.0+
      [ Jens is on vacation, taking this directly  - Linus ]
      --
      Stable Notes: this patch requires backport to 3.0, 3.2 and 3.3.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      91f68c89
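The retry rule from the commit above (loop only on a return value of 0, give up on a negative value) can be modelled in user space. Everything here is a hypothetical stand-in: `getblk_retry()`, `struct demo_ctx`, `demo_find()` and `demo_grow()` imitate the roles of `__getblk_slow`, `__find_get_block` and `grow_buffers` but share no code with them.

```c
#include <assert.h>
#include <stddef.h>

/* grow callback: <0 hard error, 0 memory pressure (retry), >0 buffers grown */
typedef int (*grow_fn)(void *ctx);
typedef void *(*find_fn)(void *ctx);

static void *getblk_retry(find_fn find, grow_fn grow, void *ctx)
{
    assert(find != NULL && grow != NULL);

    for (;;) {
        void *bh = find(ctx);
        int ret;

        if (bh)
            return bh;

        ret = grow(ctx);
        if (ret < 0)
            return NULL;   /* e.g. block beyond end of device: do not loop */
        /* ret == 0: allocation failed; the kernel would free memory and retry */
    }
}

/* Demo state so the loop can be exercised without a block device. */
struct demo_ctx {
    int pressure_left;  /* how many times grow reports memory pressure */
    int beyond_end;     /* non-zero: block lies past the end of the device */
    int grown;          /* set once the buffer exists */
    int buffer;         /* the "buffer" handed back on success */
};

static void *demo_find(void *p)
{
    struct demo_ctx *c = p;
    return c->grown ? (void *)&c->buffer : NULL;
}

static int demo_grow(void *p)
{
    struct demo_ctx *c = p;

    if (c->beyond_end)
        return -1;
    if (c->pressure_left > 0) {
        c->pressure_left--;
        return 0;
    }
    c->grown = 1;
    return 1;
}
```

A context with two rounds of memory pressure eventually returns the buffer, while a context marked `beyond_end` returns NULL on the first pass instead of spinning.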
  9. 12 Jul, 2012 3 commits
  10. 11 Jul, 2012 1 commit
  11. 08 Jul, 2012 2 commits
  12. 07 Jul, 2012 1 commit
  13. 04 Jul, 2012 8 commits
    • J
      ocfs2: Fix bogus error message from ocfs2_global_read_info · a4564ead
      Jan Kara committed
      The 'status' variable in ocfs2_global_read_info() is always != 0 when leaving the
      function because it happens to contain the number of bytes read. Thus we always log
      an error message although everything is OK. Since all error cases properly call
      mlog_errno() before jumping to out_err, there's no reason to call mlog_errno()
      on exit at all. This is fallout from c1e8d35e (conversion of mlog_exit()
      calls).
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      a4564ead
    • J
      ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if... · 65622e64
      Jeff Liu committed
      ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if ocfs2_get_clusters_nocache() or ocfs2_inode_lock() call failed.
      
      Hello,
      
      Since ENXIO only means "offset beyond EOF" for SEEK_DATA/SEEK_HOLE,
      we should return the internal error unchanged, rather than ENXIO, if the
      ocfs2_inode_lock() or ocfs2_get_clusters_nocache() call fails.
      Otherwise, it will confuse user applications when they are trying to understand the root cause.
      
      Thanks Dave for pointing this out.
      
      Thanks,
      -Jeff
      
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      65622e64
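The error-handling rule in the commit above can be sketched as a small pure function. `seek_result()` and its parameters are hypothetical and only model the decision: an internal failure is passed through as-is, and ENXIO is reserved for "offset beyond EOF".

```c
#include <assert.h>
#include <errno.h>

/*
 * internal_status: 0 on success, or a negative errno from an internal helper.
 * found_offset:    offset located by the data/hole search.
 * file_size:       current size of the file.
 * Returns the offset, or a negative errno.
 */
static long seek_result(int internal_status, long found_offset, long file_size)
{
    assert(file_size >= 0);

    if (internal_status < 0)
        return internal_status;   /* e.g. -EIO or -ENOMEM, unchanged */
    if (found_offset >= file_size)
        return -ENXIO;            /* the only meaning ENXIO should have here */
    return found_offset;
}
```

A caller that sees `-EIO` can then tell a genuine I/O failure apart from simply having seeked past the end of the file.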
    • S
      ocfs2: use spinlock irqsave for downconvert lock.patch · a75e9cca
      Srinivas Eeda committed
      When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft IRQ, it
      deadlocks itself trying to take the same spinlock in ocfs2_wake_downconvert_thread.
      Below is the stack snippet.
      
      The patch disables interrupts when acquiring dc_task_lock spinlock.
      
      	ocfs2_wake_downconvert_thread
      	ocfs2_rw_unlock
      	ocfs2_dio_end_io
      	dio_complete
      	.....
      	bio_endio
      	req_bio_endio
      	....
      	scsi_io_completion
      	blk_done_softirq
      	__do_softirq
      	do_softirq
      	irq_exit
      	do_IRQ
      	ocfs2_downconvert_thread
      	[kthread]
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      a75e9cca
    • R
      ocfs2: Misplaced parens in unlikley · 16865b7c
      roel committed
      Fix misplaced parentheses
      Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      16865b7c
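The commit message above does not show the affected expression, so the following is only a guess at the general shape of such a bug: a comparison placed outside `unlikely()` instead of inside it. The `unlikely()` definition mirrors the usual GCC/Clang idiom, and `misplaced_check()` and `intended_check()` are hypothetical names.

```c
#include <assert.h>

/* Branch-prediction hint in the usual GCC/Clang form. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Parenthesis closed too early: compares the 0/1 hint result with 0. */
static int misplaced_check(int ret)
{
    return unlikely(ret) < 0;   /* always false */
}

/* Comparison inside the hint, as presumably intended. */
static int intended_check(int ret)
{
    return unlikely(ret < 0);
}
```

Because `!!(x)` is always 0 or 1, the misplaced form can never be true, so an error path guarded by it would silently never run.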
    • J
      ocfs2: clear unaligned io flag when dio fails · 3e5d3c35
      Junxiao Bi committed
      The unaligned io flag is set in the kiocb when an unaligned
      dio is issued. It should be cleared even when the dio fails,
      or it may affect subsequent io using the same
      kiocb.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Joel Becker <jlbec@evilplan.org>
      3e5d3c35
    • T
      eCryptfs: Fix lockdep warning in miscdev operations · 60d65f1f
      Tyler Hicks committed
      Don't grab the daemon mutex while holding the message context mutex.
      Addresses this lockdep warning:
      
       ecryptfsd/2141 is trying to acquire lock:
        (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}, at: [<ffffffffa029c213>] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs]
      
       but task is already holding lock:
        (&(*daemon)->mux){+.+...}, at: [<ffffffffa029c2ec>] ecryptfs_miscdev_read+0x21c/0x470 [ecryptfs]
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&(*daemon)->mux){+.+...}:
              [<ffffffff810a3b8d>] lock_acquire+0x9d/0x220
              [<ffffffff8151c6da>] __mutex_lock_common+0x5a/0x4b0
              [<ffffffff8151cc64>] mutex_lock_nested+0x44/0x50
              [<ffffffffa029c5d7>] ecryptfs_send_miscdev+0x97/0x120 [ecryptfs]
              [<ffffffffa029b744>] ecryptfs_send_message+0x134/0x1e0 [ecryptfs]
              [<ffffffffa029a24e>] ecryptfs_generate_key_packet_set+0x2fe/0xa80 [ecryptfs]
              [<ffffffffa02960f8>] ecryptfs_write_metadata+0x108/0x250 [ecryptfs]
              [<ffffffffa0290f80>] ecryptfs_create+0x130/0x250 [ecryptfs]
              [<ffffffff811963a4>] vfs_create+0xb4/0x120
              [<ffffffff81197865>] do_last+0x8c5/0xa10
              [<ffffffff811998f9>] path_openat+0xd9/0x460
              [<ffffffff81199da2>] do_filp_open+0x42/0xa0
              [<ffffffff81187998>] do_sys_open+0xf8/0x1d0
              [<ffffffff81187a91>] sys_open+0x21/0x30
              [<ffffffff81527d69>] system_call_fastpath+0x16/0x1b
      
       -> #0 (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}:
              [<ffffffff810a3418>] __lock_acquire+0x1bf8/0x1c50
              [<ffffffff810a3b8d>] lock_acquire+0x9d/0x220
              [<ffffffff8151c6da>] __mutex_lock_common+0x5a/0x4b0
              [<ffffffff8151cc64>] mutex_lock_nested+0x44/0x50
              [<ffffffffa029c213>] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs]
              [<ffffffff811887d3>] vfs_read+0xb3/0x180
              [<ffffffff811888ed>] sys_read+0x4d/0x90
              [<ffffffff81527d69>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      60d65f1f
    • T
      eCryptfs: Properly check for O_RDONLY flag before doing privileged open · 9fe79d76
      Tyler Hicks committed
      If the first attempt at opening the lower file read/write fails,
      eCryptfs will retry using a privileged kthread. However, the privileged
      retry should not happen if the lower file's inode is read-only because a
      read/write open will still be unsuccessful.
      
      The check for determining if the open should be retried was intended to
      be based on the access mode of the lower file's open flags being
      O_RDONLY, but the check was incorrectly performed. This would cause the
      open to be retried by the privileged kthread, resulting in a second
      failed open of the lower file. This patch corrects the check to
      determine if the open request should be handled by the privileged
      kthread.
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
      9fe79d76
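The commit message above does not quote the faulty check, so this sketch shows one common way an O_RDONLY test goes wrong, as an assumption rather than the eCryptfs code. On Linux `O_RDONLY` is 0, so masking with it can never be non-zero; the access mode has to be extracted with `O_ACCMODE` first. Both function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>

/* Broken pattern: O_RDONLY is 0 on Linux, so this is always false. */
static int is_rdonly_buggy(int flags)
{
    return (flags & O_RDONLY) != 0;
}

/* Extract the access mode, then compare it with O_RDONLY. */
static int is_rdonly(int flags)
{
    return (flags & O_ACCMODE) == O_RDONLY;
}
```

With the corrected form, a read-only open is recognised even when other flags such as `O_NONBLOCK` are set, and a read/write open is correctly rejected.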
    • J
      cifs: when server doesn't set CAP_LARGE_READ_X, cap default rsize at MaxBufferSize · ec01d738
      Jeff Layton committed
      When the server doesn't advertise CAP_LARGE_READ_X, then MS-CIFS states
      that you must cap the size of the read at the client's MaxBufferSize.
      Unfortunately, testing with many older servers shows that they often
      can't service a read larger than their own MaxBufferSize.
      
      Since we can't assume what the server will do in this situation, we must
      be conservative here for the default. When the server can't do large
      reads, then assume that it can't satisfy any read larger than its
      MaxBufferSize either.
      
      Luckily almost all modern servers can do large reads, so this won't
      affect them. This is really just for older win9x and OS/2 era servers.
      Also, note that this patch just governs the default rsize. The admin can
      always override this if he so chooses.
      
      Cc: <stable@vger.kernel.org> # 3.2
      Reported-by: David H. Durgee <dhdurgee@acm.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steven French <sfrench@w500smf.(none)>
      ec01d738
  14. 03 Jul, 2012 2 commits
    • C
      Btrfs: run delayed directory updates during log replay · b6305567
      Chris Mason committed
      While we are resolving directory modifications in the
      tree log, we are triggering delayed metadata updates to
      the filesystem btrees.
      
      This commit forces the delayed updates to run so the
      replay code can find any modifications done.  It stops
      us from crashing because the directory deletion replay
      expects items to be removed immediately from the tree.
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
      cc: stable@kernel.org
      b6305567
    • J
      Btrfs: hold a ref on the inode during writepages · 7fd1a3f7
      Josef Bacik committed
      We can race with unlink and not actually be able to do our igrab in
      btrfs_add_ordered_extent.  This will result in all sorts of problems.
      Instead of doing the complicated work to try and handle returning an error
      properly from btrfs_add_ordered_extent, just hold a ref to the inode during
      writepages.  If we cannot grab a ref we know we're freeing this inode anyway
      and can just drop the dirty pages on the floor, because screw them we're
      going to invalidate them anyway.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      7fd1a3f7