Commits · 8ea05e3a4262b9e6871c349fa3486bcfc72ffd1a · bug2833 / cloud-kernel

26 Jul, 2012 2 commits

Btrfs: introduce subvol uuids and times · 8ea05e3a

Committed by Alexander Block on Jul 25, 2012

This patch introduces uuids for subvolumes. Each
subvolume has its own uuid. In case it was snapshotted,
it also contains parent_uuid. In case it was received,
it also contains received_uuid.

It also introduces subvolume ctime/otime/stime/rtime. The
first two are comparable to the times found in inodes. otime
is the origin/creation time and ctime is the change time.
stime/rtime are only valid on received subvolumes.
stime is the time of the subvolume when it was
sent. rtime is the time of the subvolume when it was
received.

In addition to the times, we have a transid for each
time. They are updated at the same place as the times.

btrfs receive uses stransid and rtransid to find out
if a received subvolume changed in the meantime.

If an older kernel mounts a filesystem with the
extended fields, all fields become invalid. The next
mount with a new kernel will detect this and reset the
fields.
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Reviewed-by: David Sterba <dave@jikos.cz>
Reviewed-by: Arne Jansen <sensille@gmx.net>
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Reviewed-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>

Btrfs: make iref_to_path non static · 91cb916c

Committed by Alexander Block on Jun 03, 2012

Make iref_to_path non static (needed in send) and rename
it to btrfs_iref_to_path
Signed-off-by: Alexander Block <ablock84@googlemail.com>

25 Jul, 2012 2 commits

Btrfs: add helper for tree enumeration · e6793769

Committed by Arne Jansen on Sep 13, 2011

Often no exact match is wanted but just the next lower or
higher item. There's a lot of duplicated code throughout
btrfs to deal with the corner cases. This patch adds a
helper function that can facilitate searching.
Signed-off-by: Arne Jansen <sensille@gmx.net>
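
A minimal sketch of the "next lower or higher item" lookup that the message above describes, using a sorted array in place of a btree. The function name, its parameters and its return convention are illustrative assumptions, not the helper added by this patch.

```c
#include <stddef.h>

/*
 * Illustrative only: search a sorted array of keys instead of a btree.
 * Returns the index of an exact match if there is one; otherwise the index
 * of the next higher key (find_higher != 0) or the next lower key
 * (find_higher == 0). Returns -1 when no such neighbour exists.
 */
long search_for_read_sketch(const unsigned long *keys, size_t n,
                            unsigned long key, int find_higher)
{
    size_t lo = 0, hi = n;

    /* Binary search for the first index whose key is >= the wanted key. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }

    if (lo < n && keys[lo] == key)
        return (long)lo;                 /* exact match */
    if (find_higher)
        return lo < n ? (long)lo : -1;   /* next higher, if any */
    return lo > 0 ? (long)lo - 1 : -1;   /* next lower, if any */
}
```

With the keys {10, 20, 30}, a lookup of 25 yields index 2 when a higher item is wanted and index 1 when a lower one is wanted, so the caller no longer has to handle those corner cases itself.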

btrfs: allow cross-subvolume file clone · 362a20c5

Committed by David Sterba on Aug 01, 2011

Lift the EXDEV condition and allow different root trees for files being
cloned, then pass source inode's root when searching for extents.
Cloning is not allowed to cross vfsmounts, ie. when two subvolumes from
one filesystem are mounted separately.
Signed-off-by: David Sterba <dsterba@suse.cz>

20 Jul, 2012 6 commits

pnfs-obj: Fix __r4w_get_page when offset is beyond i_size · c999ff68

Committed by Boaz Harrosh on Jun 08, 2012

It is very common for the end of the file to be unaligned on
stripe size. But since we know it's beyond the file's end,
the XOR should be performed with all zeros.

Old code used to just read zeros out of the OSD devices, which is a great
waste. But what scares me more about this situation is that we now have
pages attached to the file's mapping that are beyond i_size. I don't
like the kind of bugs this calls for.

Fix both problems by returning a global zero_page if the offset is beyond
i_size.

TODO:
	Change the API to ->__r4w_get_page() so a NULL can be
	returned without being considered as error, since XOR API
	treats NULL entries as zero_pages.

[Bug since 3.2. Should apply the same way to all Kernels since]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
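
The idea in the message above can be sketched in user-space C. `SKETCH_PAGE_SIZE`, `zero_page_sketch` and `r4w_get_page_sketch` are hypothetical names, and a flat byte buffer stands in for the page cache; the real callback works on kernel pages, so treat this only as an illustration of the offset check.

```c
#define SKETCH_PAGE_SIZE 4096UL

/* One shared, never-written page of zeros (stand-in for a global zero page). */
static const unsigned char zero_page_sketch[SKETCH_PAGE_SIZE];

/*
 * Return the page that contains 'offset'. 'file_data' is a flat buffer
 * standing in for the file's cached pages and 'i_size' is the file length.
 * Offsets at or past i_size get the shared zero page: XOR with zeros leaves
 * the parity unchanged, so no device read and no page beyond EOF is needed.
 */
const unsigned char *r4w_get_page_sketch(const unsigned char *file_data,
                                         unsigned long long i_size,
                                         unsigned long long offset)
{
    if (offset >= i_size)
        return zero_page_sketch;
    return file_data + (offset / SKETCH_PAGE_SIZE) * SKETCH_PAGE_SIZE;
}
```

Every request past the end of the file receives the same zero page, so nothing new is attached to the file's mapping.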

pnfs-obj: don't leak objio_state if ore_write/read fails · 9909d45a

Committed by Boaz Harrosh on Jun 08, 2012

[Bug since 3.2 Kernel]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>

ore: Unlock r4w pages in exact reverse order of locking · 537632e0

Committed by Boaz Harrosh on Jul 11, 2012

The read-4-write pages are locked in ascending address order,
but were unlocked in whatever order was easiest to code. Fix that:
locks should be released in the opposite order of locking, i.e.
descending address order.

I have not hit this deadlock. It was found by inspecting the
debug print-outs. I suspect there is a higher-level lock at the caller that
protects us, but fix it regardless.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>

ore: Remove support of partial IO request (NFS crash) · 62b62ad8

Committed by Boaz Harrosh on Jun 08, 2012

Due to OOM situations the ore might fail to allocate all resources
needed for IO of the full request. If some progress was possible
it would proceed with a partial/short request, for the sake of
forward progress.

Since this crashes NFS-core, and exofs is just fine without it, just
remove this contraption and fail.

TODO:
	Support real forward progress with some reserved allocations
	of resources, such as mem pools and/or bio_sets

[Bug since 3.2 Kernel]
CC: Stable Tree <stable@kernel.org>
CC: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>

ore: Fix NFS crash by supporting any unaligned RAID IO · 9ff19309

Committed by Boaz Harrosh on Jun 08, 2012

In RAID_5/6 we used to not permit an IO whose end
byte is not stripe_size aligned and which spans more than one stripe,
i.e. the caller had to check whether, after submission, the actual
number of transferred bytes was shorter, and would need to resubmit
a new IO with the remainder.

Exofs supports this, and NFS was supposed to support this
as well with its short write mechanism. But late testing has
exposed a CRASH when this is used with non-RPC layout drivers.

The change at NFS is deep and risky; in its place the fix
at ORE to lift the limitation is actually clean and simple.
So here it is below.

The principle here is that in the case of unaligned IO on
both ends, beginning and end, we will send two read requests:
one like the old code, before the calculation of the first stripe,
and one at a new site, before the calculation of the last stripe.
If any "boundary" is aligned, or the complete IO is within a single
stripe, we do a single read like before.

The code is clean and simple by splitting the old _read_4_write
into 3 even parts:
1. _read_4_write_first_stripe
2. _read_4_write_last_stripe
3. _read_4_write_execute

And calling 1+3 at the same place as before, 2+3 before the last
stripe, and in the case of all in a single stripe then 1+2+3
is performed additively.

Why did I not think of it before? Well, I had a stroke of
genius, because I have stared at this code for 2 years and did
not find this simple solution until today. Not that I did not try.

This solution is much better for NFS than the previous supposed
solution because the short write was dealt with out-of-band after
IO_done, which would cause a seeky IO pattern, whereas here
we execute in order. In both solutions we do 2 separate reads, only
here we do it within a single IO request. (And actually combine two
writes into a single submission.)

NFS/exofs code need not change since the ORE API communicates the new
shorter length on return, what will happen is that this case would not
occur anymore.

hurray!!

[Stable this is an NFS bug since 3.2 Kernel should apply cleanly]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
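
The decision described in this message can be sketched as a small planning function. This is a simplification under stated assumptions: `stripe_size` is taken to be the full stripe width in bytes, and `struct r4w_plan` and `plan_read_4_write` are hypothetical names rather than ORE code.

```c
/* Which read-4-write steps an IO needs, and how many reads get submitted. */
struct r4w_plan {
    int read_first;   /* start is unaligned: gather the head of the first stripe */
    int read_last;    /* end is unaligned: gather the tail of the last stripe */
    int submissions;  /* number of read requests actually issued */
};

struct r4w_plan plan_read_4_write(unsigned long long offset,
                                  unsigned long long length,
                                  unsigned long long stripe_size)
{
    struct r4w_plan plan = {0, 0, 0};
    unsigned long long end = offset + length;

    if (length == 0 || stripe_size == 0)
        return plan;

    plan.read_first = (offset % stripe_size) != 0;
    plan.read_last = (end % stripe_size) != 0;

    if (offset / stripe_size == (end - 1) / stripe_size)
        /* Whole IO inside one stripe: steps 1 and 2 gather, step 3 runs once. */
        plan.submissions = plan.read_first || plan.read_last;
    else
        /* Step 1+3 before the first stripe, step 2+3 before the last one. */
        plan.submissions = plan.read_first + plan.read_last;

    return plan;
}
```

For a 64 KiB stripe, an IO at offset 4096 with a length of 128 KiB is unaligned at both ends and needs two reads, while an 8 KiB IO at the same offset stays inside one stripe and needs a single read.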

UBIFS: fix a bug in empty space fix-up · c6727932

Committed by Artem Bityutskiy on Jul 14, 2012

UBIFS has a feature called "empty space fix-up" which is a quirk to work-around
limitations of dumb flasher programs. Namely, of those flashers that are unable
to skip NAND pages full of 0xFFs while flashing, resulting in empty space at
the end of half-filled eraseblocks to be unusable for UBIFS. This feature is
relatively new (introduced in v3.0).

The fix-up routine (fixup_free_space()) is executed only once at the very first
mount if the superblock has the 'space_fixup' flag set (can be done with -F
option of mkfs.ubifs). It basically reads all the UBIFS data and metadata and
writes it back to the same LEB. The routine assumes the image is pristine and
does not have anything in the journal.

There was a bug in 'fixup_free_space()' where it fixed up the log incorrectly.
All but one LEB of the log of a pristine file-system are empty. And one
contains just a commit start node. And 'fixup_free_space()' just unmapped this
LEB, which resulted in wiping the commit start node. As a result, some users
were unable to mount the file-system next time with the following symptom:

UBIFS error (pid 1): replay_log_leb: first log node at LEB 3:0 is not CS node
UBIFS error (pid 1): replay_log_leb: log error detected while replaying the log at LEB 3:0

The root cause of this bug was that 'fixup_free_space()' wrongly assumed
that the beginning of empty space in the log head (c->lhead_offs) was known
on mount. However, this is not the case - it was always 0. UBIFS does not store
it in the master node and finds it out by scanning the log on every mount.

The fix is simple - just pass commit start node size instead of 0 to
'fixup_leb()'.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
Cc: stable@vger.kernel.org [v3.0+]
Reported-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>
Tested-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>
Reported-by: James Nute <newten82@gmail.com>

18 Jul, 2012 2 commits

ext4: fix duplicated mnt_drop_write call in EXT4_IOC_MOVE_EXT · 331ae496

Committed by Al Viro on Jul 18, 2012

Caused, AFAICS, by mismerge in commit ff9cb1c4 ("Merge branch
'for_linus' into for_linus_merged")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org  # 3.3+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

PM: Rename CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND · d9914cf6

Committed by Michael Kerrisk on Jul 17, 2012

As discussed in
http://thread.gmane.org/gmane.linux.kernel/1249726/focus=1288990,
the capability introduced in 4d7e30d9
to govern EPOLLWAKEUP seems misnamed: this capability is about governing
the ability to suspend the system, not using a particular API flag
(EPOLLWAKEUP). We should make the name of the capability more general
to encourage reuse in related cases. (Whether or not this capability
should also be used to govern the use of /sys/power/wake_lock is a
question that needs to be separately resolved.)

This patch renames the capability to CAP_BLOCK_SUSPEND. In order to ensure
that the old capability name doesn't make it out into the wild, could you
please apply and push up the tree to ensure that it is incorporated
for the 3.5 release.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

17 Jul, 2012 4 commits

cifs: always update the inode cache with the results from a FIND_* · cd60042c

Committed by Jeff Layton on Jul 06, 2012

When we get back a FIND_FIRST/NEXT result, we have some info about the
dentry that we use to instantiate a new inode. We were ignoring and
discarding that info when we had an existing dentry in the cache.

Fix this by updating the inode in place when we find an existing dentry
and the uniqueid is the same.

Cc: <stable@vger.kernel.org> # .31.x
Reported-and-Tested-by: Andrew Bartlett <abartlet@samba.org>
Reported-by: Bill Robertson <bill_robertson@debortoli.com.au>
Reported-by: Dion Edwards <dion_edwards@debortoli.com.au>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>

cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps · 3cf003c0

Committed by Jeff Layton on Jul 11, 2012

Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 #8 [eed63ecc] do_sync_write at c052d202
 #9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Cc: <stable@vger.kernel.org>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>

cifs: on CONFIG_HIGHMEM machines, limit the rsize/wsize to the kmap space · 3ae629d9

Committed by Jeff Layton on Jul 11, 2012

We currently rely on being able to kmap all of the pages in an async
read or write request. If you're on a machine that has CONFIG_HIGHMEM
set then that kmap space is limited, sometimes to as low as 512 slots.

With 512 slots, we can only support up to a 2M r/wsize, and that's
assuming that we can get our greedy little hands on all of them. There
are other users however, so it's possible we'll end up stuck with a
size that large.

Since we can't handle a rsize or wsize larger than that currently, cap
those options at the number of kmap slots we have. We could consider
capping it even lower, but we currently default to a max of 1M. Might as
well allow those luddites on 32 bit arches enough rope to hang
themselves.

A more robust fix would be to teach the send and receive routines how
to contend with an array of pages so we don't need to marshal up a kvec
array at all. That's a fairly significant overhaul though, so we'll need
this limit in place until that's ready.

Cc: <stable@vger.kernel.org>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
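
The arithmetic behind the 2M figure in this message, as a sketch. A 4096-byte page is assumed, and `cap_to_kmap_space` is a hypothetical helper for illustration, not the cifs code.

```c
#define SKETCH_PAGE_SIZE 4096UL  /* assumed page size */

/*
 * Every page of a request must be kmapped at the same time, so the largest
 * usable rsize/wsize is (number of kmap slots) * (page size).
 */
unsigned long cap_to_kmap_space(unsigned long requested_size,
                                unsigned long kmap_slots)
{
    unsigned long limit = kmap_slots * SKETCH_PAGE_SIZE;

    return requested_size < limit ? requested_size : limit;
}
```

With 512 slots the limit is 512 * 4096 = 2097152 bytes (2M), so a requested 16M wsize would be cut down to 2M, while the 1M default passes through unchanged.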

Initialise mid_q_entry before putting it on the pending queue · ffc61ccb

Committed by Sachin Prabhu on Jul 11, 2012

A user reported a crash in cifs_demultiplex_thread() caused by an
incorrectly set mid_q_entry->callback() function. It appears that the
callback assignment made in cifs_call_async() was not flushed back to
memory suggesting that a memory barrier was required here. Changing the
code to make sure that the mid_q_entry structure was completely
initialised before it was added to the pending queue fixes the problem.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>

16 Jul, 2012 1 commit

fifo: Do not restart open() if it already found a partner · 05d290d6

Committed by Anders Kaseorg on Jul 15, 2012

If a parent and child process open the two ends of a fifo, and the
child immediately exits, the parent may receive a SIGCHLD before its
open() returns.  In that case, we need to make sure that open() will
return successfully after the SIGCHLD handler returns, instead of
throwing EINTR or being restarted.  Otherwise, the restarted open()
would incorrectly wait for a second partner on the other end.

The following test demonstrates the EINTR that was wrongly thrown from
the parent’s open().  Change .sa_flags = 0 to .sa_flags = SA_RESTART
to see a deadlock instead, in which the restarted open() waits for a
second reader that will never come.  (On my systems, this happens
pretty reliably within about 5 to 500 iterations.  Others report that
it manages to loop ~forever sometimes; YMMV.)

  #include <sys/stat.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <fcntl.h>
  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  #define CHECK(x) do if ((x) == -1) {perror(#x); abort();} while(0)

  void handler(int signum) {}

  int main()
  {
      struct sigaction act = {.sa_handler = handler, .sa_flags = 0};
      CHECK(sigaction(SIGCHLD, &act, NULL));
      CHECK(mknod("fifo", S_IFIFO | S_IRWXU, 0));
      for (;;) {
          int fd;
          pid_t pid;
          putc('.', stderr);
          CHECK(pid = fork());
          if (pid == 0) {
              CHECK(fd = open("fifo", O_RDONLY));
              _exit(0);
          }
          CHECK(fd = open("fifo", O_WRONLY));
          CHECK(close(fd));
          CHECK(waitpid(pid, NULL, 0));
      }
  }

This is what I suspect was causing the Git test suite to fail in
t9010-svn-fe.sh:

	http://bugs.debian.org/678852

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

14 Jul, 2012 5 commits

xfs: do not call xfs_bdstrat_cb in xfs_buf_iodone_callbacks · 1632dcc9

Committed by Christoph Hellwig on Jul 13, 2012

xfs_bdstrat_cb only adds a check for a shutdown filesystem over
xfs_buf_iorequest, but xfs_buf_iodone_callbacks just checked for a shut down
filesystem a little earlier.  In addition the shutdown handling in
xfs_bdstrat_cb is not very suitable for this caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

xfs: prevent recursion in xfs_buf_iorequest · 40a9b796

Committed by Christoph Hellwig on Jul 02, 2012

If the b_iodone handler is run in calling context in xfs_buf_iorequest we
can run into a recursion where xfs_buf_iodone_callbacks keeps calling back
into xfs_buf_iorequest because an I/O error happened, which keeps calling
back into xfs_buf_iorequest.  This chain will usually not take long
because the filesystem gets shut down because of log I/O errors, but even
over a short time it can cause stack overflows if run on the same context.

As a short term workaround make sure we always call the iodone handler in
workqueue context.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

xfs: don't defer metadata allocation to the workqueue · aa292847

Committed by Dave Chinner on Jul 12, 2012

Almost all metadata allocations come from shallow stack usage
situations. Avoid the overhead of switching the allocation to a
workqueue as we are not in danger of running out of stack when
making these allocations. Metadata allocations are already marked
through the args that are passed down, so this is trivial to do.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reported-by: Mel Gorman <mgorman@suse.de>
Tested-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near · e3a746f5

Committed by Dave Chinner on Jul 12, 2012

The current cursor is reallocated when retrying the allocation, so
the existing cursor needs to be destroyed in both the restart and
the failure cases.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

Remove easily user-triggerable BUG from generic_setlease · 8d657eb3

Committed by Dave Jones on Jul 13, 2012

This can be trivially triggered from userspace by passing in something unexpected.

    kernel BUG at fs/locks.c:1468!
    invalid opcode: 0000 [#1] SMP
    RIP: 0010:generic_setlease+0xc2/0x100
    Call Trace:
      __vfs_setlease+0x35/0x40
      fcntl_setlease+0x76/0x150
      sys_fcntl+0x1c6/0x810
      system_call_fastpath+0x1a/0x1f
Signed-off-by: Dave Jones <davej@redhat.com>
Cc: stable@kernel.org # 3.2+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

13 Jul, 2012 1 commit

block: fix infinite loop in __getblk_slow · 91f68c89

Committed by Jeff Moyer on Jul 12, 2012

Commit 080399aa ("block: don't mark buffers beyond end of disk as
mapped") exposed a bug in __getblk_slow that causes mount to hang as it
loops infinitely waiting for a buffer that lies beyond the end of the
disk to become uptodate.

The problem was initially reported by Torsten Hilbrich here:

    https://lkml.org/lkml/2012/6/18/54

and also reported independently here:

    http://www.sysresccd.org/forums/viewtopic.php?f=13&t=4511

and then Richard W.M.  Jones and Marcos Mello noted a few separate
bugzillas also associated with the same issue.  This patch has been
confirmed to fix:

    https://bugzilla.redhat.com/show_bug.cgi?id=835019

The main problem is here, in __getblk_slow:

        for (;;) {
                struct buffer_head * bh;
                int ret;

                bh = __find_get_block(bdev, block, size);
                if (bh)
                        return bh;

                ret = grow_buffers(bdev, block, size);
                if (ret < 0)
                        return NULL;
                if (ret == 0)
                        free_more_memory();
        }

__find_get_block does not find the block, since it will not be marked as
mapped, and so grow_buffers is called to fill in the buffers for the
associated page.  I believe the for (;;) loop is there primarily to
retry in the case of memory pressure keeping grow_buffers from
succeeding.  However, we also continue to loop for other cases, like the
block lying beyond the end of the disk.  So, the fix I came up with is to
only loop when grow_buffers fails due to memory allocation issues
(return value of 0).

The attached patch was tested by myself, Torsten, and Rich, and was
found to resolve the problem in all cases.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Reported-and-Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Josh Boyer <jwboyer@redhat.com>
Cc: Stable <stable@vger.kernel.org>  # 3.0+
[ Jens is on vacation, taking this directly  - Linus ]
--
Stable Notes: this patch requires backport to 3.0, 3.2 and 3.3.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
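
A user-space sketch of the return-value contract this fix relies on: retry only when the grow step reports an allocation failure (0), and give up on a real error (negative). The toy `struct dev_sketch`, both function names and the choice of `-ENXIO` for a block past the end of the device are assumptions made for illustration, not the kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Toy device: blocks [0, end_block) exist; allocation fails 'oom_left' times. */
struct dev_sketch {
    unsigned long end_block;
    int oom_left;
    int cached;   /* nonzero once the wanted block has a mapped buffer */
    int buffer;   /* stand-in for the buffer itself */
};

/* <0: hard error, 0: out of memory (worth retrying), 1: buffers were added. */
int grow_buffers_sketch(struct dev_sketch *dev, unsigned long block)
{
    if (block >= dev->end_block)
        return -ENXIO;            /* beyond the device: retrying cannot help */
    if (dev->oom_left > 0) {
        dev->oom_left--;
        return 0;
    }
    dev->cached = 1;
    return 1;
}

int *getblk_slow_sketch(struct dev_sketch *dev, unsigned long block)
{
    for (;;) {
        int ret;

        if (dev->cached)
            return &dev->buffer;  /* the lookup found the block */
        ret = grow_buffers_sketch(dev, block);
        if (ret < 0)
            return NULL;          /* hard error ends the loop */
        /* ret == 0: the kernel would free memory here before retrying */
    }
}
```

A block inside the device is returned after the transient allocation failures are used up; a block at or past `end_block` returns NULL immediately instead of looping forever.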

12 Jul, 2012 3 commits

fat: fix non-atomic NFS i_pos read · 5d8ecbbc

Committed by Steven J. Magnani on Jul 11, 2012

fat_encode_fh() can fetch an invalid i_pos value on systems where 64-bit
accesses are not atomic.  Make it use the same accessor as the rest of the
FAT code.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs: ramfs: file-nommu: add SetPageUptodate() · fea9f718

Committed by Bob Liu on Jul 11, 2012

There is a bug in the following scenario for !CONFIG_MMU:

 1. create a new file
 2. mmap the file and write to it
 3. read the file; the read does not return the value just written

Because

  sys_read() -> generic_file_aio_read() -> simple_readpage() -> clear_page()

which causes the page to be zeroed.

Add SetPageUptodate() to ramfs_nommu_expand_for_mapping() so that
generic_file_aio_read() does not call simple_readpage().
Signed-off-by: Bob Liu <lliubbo@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: fix NULL pointer dereference in __ocfs2_change_file_space() · a4e08d00

Committed by Luis Henriques on Jul 11, 2012

As ocfs2_fallocate() will invoke __ocfs2_change_file_space() with a NULL
as the first parameter (file), it may trigger a NULL pointer dereference
due to a missing check.

Addresses http://bugs.launchpad.net/bugs/1006012
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Reported-by: Bret Towe <magnade@gmail.com>
Tested-by: Bret Towe <magnade@gmail.com>
Cc: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 Jul, 2012 1 commit

NFSv4: Fix an NFSv4 mount regression · f1daf666

Committed by Trond Myklebust on Jul 10, 2012

The helper nfs_fs_mount() will always call nfs4_try_mount with the
mount_info->fill_super argument pointing to nfs_fill_super, which is
NFSv2/v3 only.
Fix is to have nfs4_try_mount replace it with nfs4_fill_super.

The regression was introduced by commit c40f8d1d (NFS: Create a common
fs_mount() function)
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

08 Jul, 2012 2 commits

NFS: Fix list manipulation snafus in fs/nfs/direct.c · 4035c248

Committed by Trond Myklebust on Jul 08, 2012

Fix 2 bugs in nfs_direct_write_reschedule:

 - The request needs to be removed from the 'reqs' list before it can
   be added to 'failed'.
 - Fix an infinite loop if the 'failed' list is non-empty.
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

vfs: make O_PATH file descriptors usable for 'fchdir()' · 332a2e12

Committed by Linus Torvalds on Jul 07, 2012

We already use them for openat() and friends, but fchdir() also wants to
be able to use O_PATH file descriptors.  This should make it comparable
to the O_SEARCH of Solaris.  In particular, O_PATH allows you to access
(not-quite-open) a directory you don't have read permission to, only
execute permission.

Noticed during development of multithread support for ksh93.
Reported-by: ольга крыжановская <olga.kryzhanovska@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org    # O_PATH introduced in 3.0+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
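
A user-space sketch of what this change permits: changing the working directory through a descriptor opened with O_PATH. `chdir_via_o_path` is a hypothetical helper. The fallback `O_PATH` value is the one used on most Linux architectures and is only an assumption for the case where the headers do not expose the flag.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000  /* assumed value; differs on a few architectures */
#endif

/*
 * Change the working directory through an O_PATH descriptor.
 * Only search (execute) permission on 'dir' is required, not read permission.
 * Returns 0 on success, -1 with errno set on failure.
 */
int chdir_via_o_path(const char *dir)
{
    int fd = open(dir, O_PATH | O_DIRECTORY);
    int ret, saved_errno;

    if (fd == -1)
        return -1;

    ret = fchdir(fd);
    saved_errno = errno;   /* close() must not hide an fchdir() error */
    close(fd);
    errno = saved_errno;
    return ret;
}
```

On kernels without this change the fchdir() call on such a descriptor is expected to fail, even though openat() already accepted it.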

07 Jul, 2012 1 commit

eCryptfs: Gracefully refuse miscdev file ops on inherited/passed files · 8dc67805

Committed by Tyler Hicks on Jun 11, 2012

File operations on /dev/ecryptfs would BUG() when the operations were
performed by processes other than the process that originally opened the
file. This could happen with open files inherited after fork() or file
descriptors passed through IPC mechanisms. Rather than calling BUG(), an
error code can be safely returned in most situations.

In ecryptfs_miscdev_release(), eCryptfs still needs to handle the
release even if the last file reference is being held by a process that
didn't originally open the file. ecryptfs_find_daemon_by_euid() will not
be successful, so a pointer to the daemon is stored in the file's
private_data. The private_data pointer is initialized when the miscdev
file is opened and only used when the file is released.

https://launchpad.net/bugs/994247
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>

04 Jul, 2012 8 commits

ocfs2: Fix bogus error message from ocfs2_global_read_info · a4564ead

Committed by Jan Kara on Feb 10, 2012

'status' variable in ocfs2_global_read_info() is always != 0 when leaving the
function because it happens to contain number of read bytes. Thus we always log
error message although everything is OK. Since all error cases properly call
mlog_errno() before jumping to out_err, there's no reason to call mlog_errno()
on exit at all. This is a fallout of c1e8d35e (conversion of mlog_exit()
calls).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
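
The pattern behind this bug, sketched in user-space C: a status variable that holds a byte count on success is nonzero on every path, so an unconditional "log if status != 0" at exit always fires. All names here are hypothetical stand-ins, and the `log_on_exit` switch exists only to compare the old and new behaviour side by side.

```c
#include <errno.h>
#include <string.h>

int errors_logged;  /* how often the stand-in for mlog_errno() fired */

static void log_errno_sketch(long status)
{
    (void)status;
    errors_logged++;
}

/* Stand-in for a quota read: bytes copied on success, negative errno on failure. */
static long read_bytes_sketch(const char *src, size_t avail,
                              char *dst, size_t want)
{
    if (avail < want)
        return -EIO;
    memcpy(dst, src, want);
    return (long)want;
}

long read_info_sketch(const char *src, size_t avail, char *dst, size_t want,
                      int log_on_exit)
{
    long status = read_bytes_sketch(src, avail, dst, want);

    if (status != (long)want)
        log_errno_sketch(status);   /* genuine error, logged where it happens */

    if (log_on_exit && status)
        log_errno_sketch(status);   /* old exit path: also fires on success */

    return status;
}
```

With `log_on_exit` set, a fully successful 8-byte read still produces one log entry; with it cleared, only the real failure is logged.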

ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if... · 65622e64

Committed by Jeff Liu on Feb 09, 2012

ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if ocfs2_get_clusters_nocache() or ocfs2_inode_lock() call failed.

Hello,

Since ENXIO only means "offset beyond EOF" for SEEK_DATA/SEEK_HOLE,
we should return the internal error unchanged, rather than ENXIO, if the
ocfs2_inode_lock() or ocfs2_get_clusters_nocache() call failed.
Otherwise, it will confuse user applications when they try to understand the root cause.

Thanks Dave for pointing this out.

Thanks,
-Jeff

Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>
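
A minimal sketch of the error handling described in this message, assuming a hypothetical `seek_data_sketch` in which `internal_err` stands for the result of the lock or extent lookup (0 on success, negative errno on failure). It is not the ocfs2 code.

```c
#include <errno.h>

/*
 * Returns the resulting offset, or a negative errno.
 * -ENXIO is reserved for "offset beyond EOF"; any internal failure is
 * passed through unchanged so the caller can see the real cause.
 */
long seek_data_sketch(long offset, long file_size, int internal_err)
{
    if (internal_err)
        return internal_err;   /* e.g. -EIO or -ENOMEM, not rewritten */
    if (offset >= file_size)
        return -ENXIO;         /* the only case ENXIO is meant for */
    return offset;
}
```

An application that receives -EIO can now tell an I/O problem apart from a seek past the end of the file.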

ocfs2: use spinlock irqsave for downconvert lock.patch · a75e9cca

Committed by Srinivas Eeda on Jan 30, 2012

When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft IRQ, it
deadlocks itself trying to get the same spinlock in ocfs2_wake_downconvert_thread.
Below is the stack snippet.

The patch disables interrupts when acquiring dc_task_lock spinlock.

	ocfs2_wake_downconvert_thread
	ocfs2_rw_unlock
	ocfs2_dio_end_io
	dio_complete
	.....
	bio_endio
	req_bio_endio
	....
	scsi_io_completion
	blk_done_softirq
	__do_softirq
	do_softirq
	irq_exit
	do_IRQ
	ocfs2_downconvert_thread
	[kthread]
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

ocfs2: Misplaced parens in unlikley · 16865b7c

Committed by roel on Dec 12, 2011

Fix misplaced parentheses
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

ocfs2: clear unaligned io flag when dio fails · 3e5d3c35

Committed by Junxiao Bi on Jun 27, 2012

The unaligned io flag is set in the kiocb when an unaligned
dio is issued. It should be cleared even when the dio fails,
or it may affect subsequent io that uses the same
kiocb.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joel Becker <jlbec@evilplan.org>

eCryptfs: Fix lockdep warning in miscdev operations · 60d65f1f

Committed by Tyler Hicks on Jun 11, 2012

Don't grab the daemon mutex while holding the message context mutex.
Addresses this lockdep warning:

 ecryptfsd/2141 is trying to acquire lock:
  (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}, at: [<ffffffffa029c213>] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs]

 but task is already holding lock:
  (&(*daemon)->mux){+.+...}, at: [<ffffffffa029c2ec>] ecryptfs_miscdev_read+0x21c/0x470 [ecryptfs]

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&(*daemon)->mux){+.+...}:
        [<ffffffff810a3b8d>] lock_acquire+0x9d/0x220
        [<ffffffff8151c6da>] __mutex_lock_common+0x5a/0x4b0
        [<ffffffff8151cc64>] mutex_lock_nested+0x44/0x50
        [<ffffffffa029c5d7>] ecryptfs_send_miscdev+0x97/0x120 [ecryptfs]
        [<ffffffffa029b744>] ecryptfs_send_message+0x134/0x1e0 [ecryptfs]
        [<ffffffffa029a24e>] ecryptfs_generate_key_packet_set+0x2fe/0xa80 [ecryptfs]
        [<ffffffffa02960f8>] ecryptfs_write_metadata+0x108/0x250 [ecryptfs]
        [<ffffffffa0290f80>] ecryptfs_create+0x130/0x250 [ecryptfs]
        [<ffffffff811963a4>] vfs_create+0xb4/0x120
        [<ffffffff81197865>] do_last+0x8c5/0xa10
        [<ffffffff811998f9>] path_openat+0xd9/0x460
        [<ffffffff81199da2>] do_filp_open+0x42/0xa0
        [<ffffffff81187998>] do_sys_open+0xf8/0x1d0
        [<ffffffff81187a91>] sys_open+0x21/0x30
        [<ffffffff81527d69>] system_call_fastpath+0x16/0x1b

 -> #0 (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}:
        [<ffffffff810a3418>] __lock_acquire+0x1bf8/0x1c50
        [<ffffffff810a3b8d>] lock_acquire+0x9d/0x220
        [<ffffffff8151c6da>] __mutex_lock_common+0x5a/0x4b0
        [<ffffffff8151cc64>] mutex_lock_nested+0x44/0x50
        [<ffffffffa029c213>] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs]
        [<ffffffff811887d3>] vfs_read+0xb3/0x180
        [<ffffffff811888ed>] sys_read+0x4d/0x90
        [<ffffffff81527d69>] system_call_fastpath+0x16/0x1b
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>

eCryptfs: Properly check for O_RDONLY flag before doing privileged open · 9fe79d76

Committed by Tyler Hicks on Jun 12, 2012

If the first attempt at opening the lower file read/write fails,
eCryptfs will retry using a privileged kthread. However, the privileged
retry should not happen if the lower file's inode is read-only because a
read/write open will still be unsuccessful.

The check for determining if the open should be retried was intended to
be based on the access mode of the lower file's open flags being
O_RDONLY, but the check was incorrectly performed. This would cause the
open to be retried by the privileged kthread, resulting in a second
failed open of the lower file. This patch corrects the check to
determine if the open request should be handled by the privileged
kthread.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
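
The commit message does not show the faulty check. One common way an O_RDONLY test goes wrong, assumed here purely for illustration, is to treat O_RDONLY as a bit that can be masked, although its value is 0 on Linux. The helper names below are hypothetical.

```c
#include <fcntl.h>

/* Faulty form: O_RDONLY is 0 on Linux, so the mask can never be nonzero. */
int is_rdonly_buggy(int flags)
{
    return (flags & O_RDONLY) != 0;
}

/* Correct form: isolate the access mode first, then compare it. */
int is_rdonly(int flags)
{
    return (flags & O_ACCMODE) == O_RDONLY;
}
```

`is_rdonly_buggy(O_RDONLY)` reports false, which is the kind of mistake that would let a pointless privileged retry go ahead; `is_rdonly()` reports true for read-only opens regardless of any other flags that are set.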

cifs: when server doesn't set CAP_LARGE_READ_X, cap default rsize at MaxBufferSize · ec01d738

Committed by Jeff Layton on Jul 02, 2012

When the server doesn't advertise CAP_LARGE_READ_X, then MS-CIFS states
that you must cap the size of the read at the client's MaxBufferSize.
Unfortunately, testing with many older servers shows that they often
can't service a read larger than their own MaxBufferSize.

Since we can't assume what the server will do in this situation, we must
be conservative here for the default. When the server can't do large
reads, then assume that it can't satisfy any read larger than its
MaxBufferSize either.

Luckily almost all modern servers can do large reads, so this won't
affect them. This is really just for older win9x and OS/2 era servers.
Also, note that this patch just governs the default rsize. The admin can
always override this if he so chooses.

Cc: <stable@vger.kernel.org> # 3.2
Reported-by: David H. Durgee <dhdurgee@acm.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steven French <sfrench@w500smf.(none)>
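
A sketch of the default selection this message describes. The flag constant, the function name and the parameters are hypothetical, and protocol header overhead is ignored, so this shows only the shape of the rule and not the cifs implementation.

```c
#define SKETCH_CAP_LARGE_READ 0x4000u  /* illustrative capability bit */

/*
 * Pick the rsize to use.
 * admin_rsize: value from a mount option, 0 if none was given.
 * preferred:   the client's usual default.
 */
unsigned int default_rsize_sketch(unsigned int server_caps,
                                  unsigned int server_max_buf,
                                  unsigned int preferred,
                                  unsigned int admin_rsize)
{
    if (admin_rsize)
        return admin_rsize;        /* an explicit setting overrides the default */

    /* No large-read support: assume the server cannot exceed its own buffer. */
    if (!(server_caps & SKETCH_CAP_LARGE_READ) && preferred > server_max_buf)
        return server_max_buf;

    return preferred;
}
```

A server that advertises large reads keeps the preferred default, an older server is limited to its own buffer size, and a value given by the admin is used as is.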

03 Jul, 2012 2 commits

Btrfs: run delayed directory updates during log replay · b6305567

Committed by Chris Mason on Jul 02, 2012

While we are resolving directory modifications in the
tree log, we are triggering delayed metadata updates to
the filesystem btrees.

This commit forces the delayed updates to run so the
replay code can find any modifications done.  It stops
us from crashing because the directory deletion replay
expects items to be removed immediately from the tree.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
cc: stable@kernel.org

Btrfs: hold a ref on the inode during writepages · 7fd1a3f7

Committed by Josef Bacik on Jun 27, 2012

We can race with unlink and not actually be able to do our igrab in
btrfs_add_ordered_extent. This will result in all sorts of problems.
Instead of doing the complicated work to try and handle returning an error
properly from btrfs_add_ordered_extent, just hold a ref to the inode during
writepages. If we cannot grab a ref we know we're freeing this inode anyway
and can just drop the dirty pages on the floor, because screw them we're
going to invalidate them anyway. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
