Commits · de8d4f5d758786a2cbcfa54a6a85ce747e5637e3 · openeuler / Kernel

15 Sep, 2010 1 commit

aio: check for multiplication overflow in do_io_submit · 75e1c70f

Authored by Jeff Moyer on Sep 10, 2010

Tavis Ormandy pointed out that do_io_submit does not do proper bounds
checking on the passed-in iocb array:

       if (unlikely(nr < 0))
               return -EINVAL;

       if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp)))))
               return -EFAULT;                      ^^^^^^^^^^^^^^^^^^

The attached patch checks for overflow, and if it is detected, the
number of iocbs submitted is scaled down to a number that will fit in
the long.  This is an ok thing to do, as sys_io_submit is documented as
returning the number of iocbs submitted, so callers should handle a
return value of less than the 'nr' argument passed in.
Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
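
The scaling-down described in this commit message can be sketched in user space as follows. This is only an illustration under stated assumptions, not the kernel patch itself: the helper name `clamp_count` and its return convention are hypothetical, and `elem_size` stands in for the size of one array element.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: clamp an element count so that
 * nr * elem_size can no longer overflow a long.
 * Returns the (possibly reduced) count, or -1 if nr is negative.
 */
static long clamp_count(long nr, size_t elem_size)
{
    assert(elem_size > 0);

    if (nr < 0)
        return -1;

    /* Compare against the quotient instead of multiplying first. */
    if ((unsigned long)nr > LONG_MAX / elem_size)
        nr = (long)(LONG_MAX / elem_size);

    return nr;
}
```

Because the caller is told how many entries were actually processed, a reduced count is visible to it as a short return value rather than as an error.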

13 Sep, 2010 9 commits

fs/9p: Don't use dotl version of mknod for dotu inode operations · 1d76e313

Authored by Aneesh Kumar K.V on Aug 30, 2010

We should not use the dotl version for the dotu inode operations.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

fs/9p: Use the correct dentry operations · 3c30750f

Authored by Aneesh Kumar K.V on Aug 30, 2010

We should use the cached dentry operation only if caching mode is enabled
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

9p: Check for NULL fid in v9fs_dir_release() · 62726a7a

Authored by jvrao on Aug 25, 2010

A NULL fid should be handled in cases where we end up calling
v9fs_dir_release() before we even instantiate the fid in filp.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

fs/9p: Fix error handling in v9fs_get_sb · 5c25f347

Authored by Aneesh Kumar K.V on Aug 24, 2010

This was introduced by 7cadb63d58a932041afa3f957d5cbb6ce69dcee5
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

fs/9p, net/9p: memory leak fixes · 62b2be59

Authored by Latchesar Ionkov on Aug 24, 2010

Four memory leak fixes in the 9P code.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies · 827e3457

Authored by Trond Myklebust on Sep 12, 2010

The NFSv4 client's callback server calls svc_gss_principal(), which
is defined in auth_rpcgss.ko.

The NFSv4 server has the same dependency, and in addition calls
svcauth_gss_flavor(), gss_mech_get_by_pseudoflavor(),
gss_pseudoflavor_to_service() and gss_mech_put() from the same module.

The module auth_rpcgss itself has no dependencies aside from sunrpc,
so we only need to select RPCSEC_GSS.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

statfs() gives ESTALE error · fbf3fdd2

Authored by Menyhart Zoltan on Sep 12, 2010

Hi,

An NFS client executes a statfs("file", &buff) call.
"file" exists / existed, the client has read / written it,
but it has already closed it.

user_path(pathname, &path) looks up "file" successfully in the
directory-cache and restarts the aging timer of the directory-entry.
This happens even if "file" has already been removed from the server,
because the lookupcache=positive option I use keeps the entries valid
for a while.

nfs_statfs() returns ESTALE if "file" has already been removed from the
server.

If the user application repeats the statfs("file", &buff) call, we
are stuck: "file" remains young forever in the directory-cache.
Signed-off-by: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org

NFS: Fix a typo in nfs_sockaddr_match_ipaddr6 · b20d37ca

Authored by Trond Myklebust on Sep 12, 2010

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org

Remove incorrect do_vfs_lock message · b1bde04c

Authored by Fabio Olive Leite on Sep 12, 2010

The do_vfs_lock function in fs/nfs/file.c is only called if NLM is
not being used, via the -o nolock mount option. Therefore it cannot
really be "out of sync with lock manager" when the local locking
function called returns an error, as there will be no corresponding
call to the NLM. For details, simply check the if/else in do_setlk
and do_unlk in fs/nfs/file.c.
Signed-off-by: Fabio Olive Leite <fleite@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

10 Sep, 2010 11 commits

xfs: log IO completion workqueue is a high priority queue · 51749e47

Authored by Dave Chinner on Sep 08, 2010

The workqueue implementation in 2.6.36-rcX has changed, resulting
in the workqueues no longer having dedicated threads for work
processing. This has caused severe livelocks under heavy parallel
create workloads because the log IO completions have been getting
held up behind metadata IO completions.  Hence log commits would
stall, memory allocation would stall because pages could not be
cleaned, and lock contention on the AIL during inode IO completion
processing was being seen to slow everything down even further.

By making the log IO completion workqueue a high priority workqueue,
log IO completions are queued ahead of all data/metadata IO completions
and processed before them. Hence the log never gets stalled, and
operations needed to clean memory can continue as quickly as possible.
This avoids the livelock conditions and allows the system to keep
running under heavy load as per normal.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

execve: make responsive to SIGKILL with large arguments · 9aea5a65

Authored by Roland McGrath on Sep 07, 2010

An execve with a very large total of argument/environment strings
can take a really long time in the execve system call. It runs
uninterruptibly to count and copy all the strings. This change
makes it abort the exec quickly if sent a SIGKILL.

Note that this is the conservative change, to interrupt only for
SIGKILL, by using fatal_signal_pending(). It would be perfectly
correct semantics to let any signal interrupt the string-copying in
execve, i.e. use signal_pending() instead of fatal_signal_pending().
We'll save that change for later, since it could have user-visible
consequences, such as having a timer set too quickly make it so that
an execve can never complete, though it always happened to work before.
Signed-off-by: Roland McGrath <roland@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

execve: improve interactivity with large arguments · 7993bc1f

Authored by Roland McGrath on Sep 07, 2010

This adds a preemption point during the copying of the argument and
environment strings for execve, in copy_strings().  There is already
a preemption point in the count() loop, so this doesn't add any new
points in the abstract sense.

When the total argument+environment strings are very large, the time
spent copying them can be much more than a normal user time slice.
So this change improves the interactivity of the rest of the system
when one process is doing an execve with very large arguments.
Signed-off-by: Roland McGrath <roland@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

setup_arg_pages: diagnose excessive argument size · 1b528181

Authored by Roland McGrath on Sep 07, 2010

The CONFIG_STACK_GROWSDOWN variant of setup_arg_pages() does not
check the size of the argument/environment area on the stack.
When it is unworkably large, shift_arg_pages() hits its BUG_ON.
This is exploitable with a very large RLIMIT_STACK limit, to
create a crash pretty easily.

Check that the initial stack is not too large to make it possible
to map in any executable.  We're not checking that the actual
executable (or interpreter, for binfmt_elf) will fit.  So those
mappings might clobber part of the initial stack mapping.  But
that is just userland lossage that userland made happen, not a
kernel problem.
Signed-off-by: Roland McGrath <roland@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

xfs: prevent reading uninitialized stack memory · a122eb2f

Authored by Dan Rosenberg on Sep 06, 2010

The XFS_IOC_FSGETXATTR ioctl allows unprivileged users to read 12
bytes of uninitialized stack memory, because the fsxattr struct
declared on the stack in xfs_ioc_fsgetxattr() does not alter (or zero)
the 12-byte fsx_pad member before copying it back to the user.  This
patch takes care of it.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
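
The class of leak described here (a stack struct with a pad member copied out without being initialized) can be illustrated in user space. The sketch below is not the XFS code: `struct demo_attr` and `fill_attr` are hypothetical stand-ins, and zeroing the whole struct with `memset` is one common way to close such a leak, assumed here for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a struct that ends in a 12-byte pad member. */
struct demo_attr {
    unsigned int flags;
    unsigned int extsize;
    unsigned char pad[12];
};

/*
 * Fill the struct for the caller. Zeroing it first guarantees that
 * pad (and any compiler-inserted padding) carries no stale stack bytes
 * when the struct is later copied out wholesale.
 */
static void fill_attr(struct demo_attr *out, unsigned int flags,
                      unsigned int extsize)
{
    assert(out != NULL);

    memset(out, 0, sizeof(*out));
    out->flags = flags;
    out->extsize = extsize;
}
```

Setting only the named fields would leave `pad` holding whatever was previously on the stack, which is exactly what a whole-struct copy would then expose.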

minix: fix regression in minix_mkdir() · eee743fd

Authored by Jorge Boncompte [DTI2] on Sep 09, 2010

Commit 9eed1fb7 ("minix: replace inode uid,gid,mode init with helper")
broke directory creation on minix filesystems.

Fix it by passing the needed mode flag to inode init helper.
Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>		[2.6.35.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vfs: take O_NONBLOCK out of the O_* uniqueness test · 3ab04d5c

Authored by James Bottomley on Sep 09, 2010

O_NONBLOCK on parisc has a dual value:

#define O_NONBLOCK	000200004 /* HPUX has separate NDELAY & NONBLOCK */

It is caught by the O_* bits uniqueness check and leads to a parisc
compile error.  The fix would be to take O_NONBLOCK out.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Cc: Jamie Lokier <jamie@shareable.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
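
To see why a two-bit flag value trips a bit-uniqueness test, consider the user-space sketch below. It is an assumption about the general shape of such a check (OR all flags together and compare the number of set bits with the number of flags), not a copy of the kernel's test; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count the set bits in v by clearing the lowest set bit each round. */
static unsigned int count_set_bits(unsigned int v)
{
    unsigned int n = 0;

    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/*
 * Returns 1 when every flag is a distinct single bit, i.e. the OR of
 * all flags has exactly as many set bits as there are flags.
 */
static int flags_are_unique_bits(const unsigned int *flags, size_t n)
{
    unsigned int all = 0;

    assert(flags != NULL);
    for (size_t i = 0; i < n; i++)
        all |= flags[i];

    return count_set_bits(all) == n;
}
```

With the parisc value 000200004 (octal) quoted above, one flag contributes two bits, so the count no longer matches the number of flags and the check fails.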

binfmt_misc: fix binfmt_misc priority · ee3aebdd

Authored by Jan Sembera on Sep 09, 2010

Commit 74641f58 ("alpha: binfmt_aout fix") (May 2009) introduced a
regression - binfmt_misc is now consulted after binfmt_elf, which will
unfortunately break ia32el.  ia32 ELF binaries on ia64 used to be matched
using binfmt_misc and executed using wrapper.  As 32bit binaries are now
matched by binfmt_elf before binfmt_misc kicks in, the wrapper is ignored.

The fix increases precedence of binfmt_misc to the original state.
Signed-off-by: Jan Sembera <jsembera@suse.cz>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Richard Henderson <rth@twiddle.net>
Cc: <stable@kernel.org>		[2.6.everything.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: export uncached bit properly in /proc/kpageflags · ed430fec

Authored by Takashi Iwai on Sep 09, 2010

Fix the left-over old ifdef for PG_uncached in /proc/kpageflags.  Now it's
used by x86, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

O_DIRECT: fix the splitting up of contiguous I/O · 7a801ac6

Authored by Jeff Moyer on Sep 09, 2010

commit c2c6ca41 (direct-io: do not merge logically non-contiguous requests)
introduced a bug whereby all O_DIRECT I/Os were submitted a page at a time
to the block layer.  The problem is that the code expected
dio->block_in_file to correspond to the current page in the dio.  In fact,
it corresponds to the previous page submitted via submit_page_section.
This was purely an oversight, as the dio->cur_page_fs_offset field was
introduced for just this purpose.  This patch simply uses the correct
variable when calculating whether there is a mismatch between contiguous
logical blocks and contiguous physical blocks (as described in the
comments).

I also switched the if conditional following this check to an else if, to
ensure that we never call dio_bio_submit twice for the same dio (in
theory, this should not happen, anyway).

I've tested this by running blktrace and verifying that a 64KB I/O was
submitted as a single I/O.  I also ran the patched kernel through
xfstests' aio tests using xfs, ext4 (with 1k and 4k block sizes) and btrfs
and verified that there were no regressions as compared to an unpatched
kernel.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Josef Bacik <jbacik@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: <stable@kernel.org>		[2.6.35.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: Move vma_stack_continue into mm.h · 39aa3cb3

Authored by Stefan Bader on Aug 31, 2010

So it can be used by all that need to check for that.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

09 Sep, 2010 9 commits

cifs: prevent possible memory corruption in cifs_demultiplex_thread · 32670396

Authored by Jeff Layton on Sep 03, 2010

cifs_demultiplex_thread sets the addr.sockAddr.sin_port without any
regard for the socket family. While it may be that the error in question
here never occurs on an IPv6 socket, it's probably best to be safe and
set the port properly if it ever does.

Break the port setting code out of cifs_fill_sockaddr and into a new
function, and call that from cifs_demultiplex_thread.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
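
Setting a port with regard to the socket family, as this message describes, might look like the following user-space sketch using the POSIX socket structures. The function name `set_sockaddr_port` and its return convention are hypothetical; the actual helper added by the patch is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: store a port (given in host byte order) in the
 * field that matches the address family.
 * Returns 0 on success, -1 for a family it does not handle.
 */
static int set_sockaddr_port(struct sockaddr *sa, unsigned short port)
{
    assert(sa != NULL);

    switch (sa->sa_family) {
    case AF_INET:
        ((struct sockaddr_in *)sa)->sin_port = htons(port);
        return 0;
    case AF_INET6:
        ((struct sockaddr_in6 *)sa)->sin6_port = htons(port);
        return 0;
    default:
        return -1;
    }
}
```

Dispatching on `sa_family` means the code never writes through a `sockaddr_in` view of what is really a `sockaddr_in6`, which is the hazard the message warns about.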

cifs: eliminate some more premature cifsd exits · 7332f2a6

Authored by Jeff Layton on Sep 03, 2010

If the tcpStatus is still CifsNew, the main cifs_demultiplex_loop can
break out prematurely in some cases. This is wrong as we will almost
always have other structures with pointers to the TCP_Server_Info. If
the main loop breaks under any condition other than tcpStatus ==
CifsExiting, then it'll face a use-after-free situation.

I don't see any reason to treat a CifsNew tcpStatus differently than
CifsGood. I believe we'll still want to attempt to reconnect in either
case. What should happen in those situations is that the MIDs get marked
as MID_RETRY_NEEDED. This will make CIFSSMBNegotiate return -EAGAIN, and
then the caller can retry the whole thing on a newly reconnected socket.
If that fails again in the same way, the caller of cifs_get_smb_ses
should tear down the TCP_Server_Info struct.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

cifs: prevent cifsd from exiting prematurely · 522bbe65

Authored by Jeff Layton on Sep 03, 2010

When cifs_demultiplex_thread exits, it does a number of cleanup tasks
including freeing the TCP_Server_Info struct. Much of the existing code
in cifs assumes that when there is a cifsSesInfo struct, that it holds a
reference to a valid TCP_Server_Info struct.

We can never allow cifsd to exit when a cifsSesInfo struct is still
holding a reference to the server. The server pointers will then point
to freed memory.

This patch eliminates a couple of questionable conditions where it does
this.  The idea here is to make an -EINTR return from kernel_recvmsg
behave the same way as -ERESTARTSYS or -EAGAIN. If the task was
signalled from cifs_put_tcp_session, then tcpStatus will be CifsExiting,
and the kernel_recvmsg call will return quickly.

There's another condition where this can also occur -- if the
tcpStatus is still in CifsNew, then it will also exit if the server
closes the socket prematurely.  I think we'll probably also need to fix
that situation, but that requires a bit more consideration.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

[CIFS] ntlmv2/ntlmssp remove-unused-function CalcNTLMv2_partial_mac_key · 4266d911

Authored by Steve French on Sep 08, 2010

This function is not used, so remove the definition and declaration.
Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

cifs: eliminate redundant xdev check in cifs_rename · 639e7a91

Authored by Jeff Layton on Sep 03, 2010

The VFS always checks that the source and target of a rename are on the
same vfsmount, and hence have the same superblock. So, this check is
redundant. Remove it and simplify the error handling.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

Revert "[CIFS] Fix ntlmv2 auth with ntlmssp" · c8e56f1f

Authored by Steve French on Sep 08, 2010

This reverts commit 9fbc5908.

The change to kernel crypto and the fixes to the ntlmv2 and ntlmssp
series introduced a regression.  Deferring this patch series
to 2.6.37 after Shirish fixes it.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
CC: Shirish Pargaonkar <shirishp@us.ibm.com>

Revert "missing changes during ntlmv2/ntlmssp auth and sign" · 745e507a

Authored by Steve French on Sep 08, 2010

This reverts commit 3ec6bbcd.

    The change to kernel crypto and the fixes to the ntlmv2 and ntlmssp
    series introduced a regression.  Deferring this patch series
    to 2.6.37 after Shirish fixes it.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
CC: Shirish Pargaonkar <shirishp@us.ibm.com>

Revert "Eliminate sparse warning - bad constant expression" · 56234e27

Authored by Steve French on Sep 08, 2010

This reverts commit 2d20ca83.

    The change to kernel crypto and the fixes to the ntlmv2 and ntlmssp
    series introduced a regression.  Deferring this patch series
    to 2.6.37 after Shirish fixes it.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
CC: Shirish Pargaonkar <shirishp@us.ibm.com>

Revert "[CIFS] Eliminate unused variable warning" · 7100ae97

Authored by Steve French on Sep 08, 2010

The change to kernel crypto and the fixes to the ntlmv2 and ntlmssp
series introduced a regression.  Deferring this patch series
to 2.6.37 after Shirish fixes it.

This reverts commit c89e5198.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
CC: Shirish Pargaonkar <shirishp@us.ibm.com>

08 Sep, 2010 10 commits

ocfs2: Fix orphan add in ocfs2_create_inode_in_orphan · 97b8f4a9

Authored by Mark Fasheh on Aug 13, 2010

ocfs2_create_inode_in_orphan() is used by reflink to create the newly
reflinked inode simultaneously in the orphan dir. This allows us to easily
handle partially-reflinked files during recovery cleanup.

We have a problem though - the orphan dir stringifies inode # to determine
a unique name under which the orphan entry dirent can be created. Since
ocfs2_create_inode_in_orphan() needs the space allocated in the orphan dir
before it can allocate the inode, we currently call into the orphan code:

       /*
        * We give the orphan dir the root blkno to fake an orphan name,
        * and allocate enough space for our insertion.
        */
       status = ocfs2_prepare_orphan_dir(osb, &orphan_dir,
                                         osb->root_blkno,
                                         orphan_name, &orphan_insert);

Using osb->root_blkno might work fine on unindexed directories, but the
orphan dir can have an index.  When it has that index, the above code fails
to allocate the proper index entry.  Later, when we try to remove the file
from the orphan dir (using the actual inode #), the reflink operation will
fail.

To fix this, I created a function ocfs2_alloc_orphaned_file() which uses the
newly split out orphan and inode alloc code to figure out what the inode
block number will be (once allocated) and then prepare the orphan dir from
that data.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>
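
The reason a placeholder block number breaks an indexed orphan directory can be seen from how a name is derived from the inode number. The sketch below is an illustration only: the helper `blkno_to_name` is hypothetical, and the 16-hex-digit format is an assumption, so the real ocfs2 name format may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAMELEN 16 /* assumed name length: 16 hex digits */

/*
 * Hypothetical helper: turn a block number into a fixed-width name.
 * Returns 0 on success, -1 if the formatted name has an unexpected length.
 */
static int blkno_to_name(unsigned long long blkno,
                         char name[DEMO_NAMELEN + 1])
{
    int n;

    assert(name != NULL);
    n = snprintf(name, DEMO_NAMELEN + 1, "%016llx", blkno);
    return n == DEMO_NAMELEN ? 0 : -1;
}
```

Two different block numbers yield two different names, so an index entry prepared from a placeholder number will not match a later lookup that uses the real inode number.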

ocfs2: split out ocfs2_prepare_orphan_dir() into locking and prep functions · dd43bcde

Authored by Mark Fasheh on Aug 13, 2010

We do this because ocfs2_create_inode_in_orphan() wants to order locking of
the orphan dir with respect to locking of the inode allocator *before*
making any changes to the directory.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>

ocfs2: allow return of new inode block location before allocation of the inode · e49e2767

Authored by Mark Fasheh on Aug 13, 2010

This helps code which needs to know the eventual block number of an inode
but can't allocate it yet due to transaction or lock ordering. For example,
ocfs2_create_inode_in_orphan() currently gives a junk blkno for preparation
of the orphan dir because it can't yet know where the actual inode is placed
- that code is actually in ocfs2_mknod_locked. This is a problem when the
orphan dirs are indexed as the junk inode number will create an index entry
which goes unused (and fails the later removal from the orphan dir).  Now
with these interfaces, ocfs2_create_inode_in_orphan() can run the block
group search (and get back the inode block number) *before* any actual
allocation occurs.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>

ocfs2: use ocfs2_alloc_dinode_update_counts() instead of open coding · d5134982

Authored by Mark Fasheh on Aug 13, 2010

ocfs2_search_chain() makes the same updates as
ocfs2_alloc_dinode_update_counts to the alloc inode. Instead of open coding
the bitmap update, use our helper function.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>

ocfs2: split out inode alloc code from ocfs2_mknod_locked · 021960ca

Authored by Mark Fasheh on Aug 13, 2010

Do this by splitting the bulk of the function away from the inode allocation
code at the very top of ocfs2_mknod_locked(). Existing callers don't need to
change and won't see any difference. The new function created,
__ocfs2_mknod_locked(), will be used shortly.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>

Ocfs2: Fix a regression bug from mainline commit(). · 81c8c82b

Authored by Tristan Ye on Aug 19, 2010

This patch fixes a regression introduced by commit 6b933c8e... ('ocfs2:
Avoid direct write if we fall back to buffered I/O'):

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1285

Commit 6b933c8e changed __generic_file_aio_write to
generic_file_buffered_write, which does not call filemap_{write,wait}_range
to flush the page cache when O_DIRECT writes fall back to buffered ones.
This hurt the O_DIRECT semantics in extended O_DIRECT writes.

This patch tries to guarantee that O_DIRECT writes which fall back to
buffered I/O are correctly flushed.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>

ocfs2: Fix deadlock when allocating page · 9b4c0ff3

Authored by Jan Kara on Aug 24, 2010

We cannot call grab_cache_page() when holding filesystem locks or with
a transaction started as grab_cache_page() calls page allocation with
GFP_KERNEL flag and thus page reclaim can recurse back into the filesystem
causing deadlocks or various assertion failures. We have to use
find_or_create_page() instead and pass it GFP_NOFS as we do with other
allocations.
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Tao Ma <tao.ma@oracle.com>

ocfs2: properly set and use inode group alloc hint · b2b6ebf5

Authored by Mark Fasheh on Aug 26, 2010

We were setting ac->ac_last_group in ocfs2_claim_suballoc_bits from
res->sr_bg_blkno.  Unfortunately, res->sr_bg_blkno is going to be zero under
normal (non-fragmented) circumstances. The discontig block group patches
effectively turned off that feature. Fix this by correctly calculating what
the next group hint should be.
Acked-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
Signed-off-by: Tao Ma <tao.ma@oracle.com>

ocfs2: Use the right group in nfs sync check. · 889f004a

Authored by Tao Ma on Sep 02, 2010

We have now added discontig block groups, so an inode
can be allocated in a discontig block group. So get
it in ocfs2_get_suballoc_slot_bit.

The old ocfs2_test_suballoc_bit gets group block no
from the allocation inode which is wrong. Fix it by
passing the right group.
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>

ocfs2: Flush drive's caches on fdatasync · 04eda1a1

Authored by Jan Kara on Aug 05, 2010

When 'barrier' mount option is specified, we have to issue a cache flush
during fdatasync(2). We have to do this even if inode doesn't have
I_DIRTY_DATASYNC set because we still have to get written *data* to disk so
that they are not lost in case of crash.
Acked-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Tao Ma <tao.ma@oracle.com>
