Commits · b5a7e97039a80fae673ccc115ce595d5b88fb4ee · openanolis / cloud-kernel

12 Dec, 2011 1 commit

ext4: fix ext4_end_io_dio() racing against fsync() · b5a7e970

Committed by Theodore Ts'o on Dec 12, 2011

We need to make sure iocb->private is cleared *before* we put the
io_end structure on i_completed_io_list.  Otherwise fsync() could
potentially run on another CPU and free the iocb structure out from
under us.
Reported-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

b5a7e970
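
The ordering rule in the commit above can be sketched in userspace C: finish every access to the iocb before the io_end becomes visible to anyone else. The struct layouts, `end_io_dio()`, and the `record_publish()` callback are hypothetical stand-ins for illustration, not the ext4 code, and the sketch ignores the locking that the real list insertion relies on.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the commit. */
struct iocb   { void *private; };
struct io_end { struct iocb *iocb; };

static int seen_private_cleared;

/* Models putting io_end on a completed list that another CPU may drain.
 * It records whether iocb->private was already cleared at that moment. */
static void record_publish(struct io_end *io_end)
{
    seen_private_cleared = (io_end->iocb->private == NULL);
}

static void end_io_dio(struct io_end *io_end, void (*publish)(struct io_end *))
{
    struct iocb *iocb = io_end->iocb;

    iocb->private = NULL;   /* last access to the iocb ... */
    publish(io_end);        /* ... happens before io_end becomes visible */
    /* the iocb must not be dereferenced past this point */
}
```

Swapping the two statements in `end_io_dio()` would reintroduce the window in which the consumer can free the iocb while it is still being written.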

09 Dec, 2011 6 commits

procfs: do not overflow get_{idle,iowait}_time for nohz · 2a95ea6c

Committed by Michal Hocko on Dec 08, 2011

Since commit a25cac51 ("proc: Consider NO_HZ when printing idle and
iowait times") we are reporting idle/io_wait time also while a CPU is
tickless.  We rely on get_{idle,iowait}_time functions to retrieve
proper data.

These functions, however, use usecs_to_cputime to translate micro
seconds time to cputime64_t.  This is just an alias to usecs_to_jiffies
which reduces the data type from u64 to unsigned int and also checks
whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
and returns MAX_JIFFY_OFFSET in that case.

When we overflow depends on CONFIG_HZ but especially for CONFIG_HZ_300
it is quite low (1431649781) so we are getting MAX_JIFFY_OFFSET for
>3000s! until we overflow unsigned int.  Just for reference
CONFIG_HZ_100 has an overflow window around 20s, CONFIG_HZ_250 ~8s and
CONFIG_HZ_1000 ~2s.

This results in a bug when people saw [h]top going mad reporting 100%
CPU usage even though there was basically no CPU load.  The reason was
simply that /proc/stat stopped reporting idle/io_wait changes (and
reported MAX_JIFFY_OFFSET) and so the only change happening was for user
system time.

Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
to 32b type and it is much more appropriate for cumulative time values
(unlike usecs_to_jiffies which is intended for timeout calculations).
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Artem S. Tashkinov <t.artem@mailcity.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2a95ea6c
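
The precision loss described in the procfs commit above can be reproduced outside the kernel. This is a minimal sketch, assuming a tick rate of 100 and using two hypothetical helpers. It models only the 64-bit to 32-bit truncation, not the MAX_JIFFY_OFFSET saturation that the real usecs_to_jiffies also performs.

```c
#include <stdint.h>

#define HZ 100u  /* assumed tick rate for the example */

/* Models the lossy path: the 64-bit microsecond count is squeezed
 * through a 32-bit value before being converted to ticks. */
static uint64_t usecs_to_ticks_narrow(uint64_t usecs)
{
    uint32_t truncated = (uint32_t)usecs;   /* high bits are lost here */
    return truncated / (1000000u / HZ);
}

/* Models the fixed path: the value stays 64 bits wide throughout. */
static uint64_t nsecs_to_ticks_wide(uint64_t nsecs)
{
    return nsecs / (1000000000ull / HZ);
}
```

For 5000 seconds of accumulated idle time the narrow helper reports far fewer ticks than the wide one, which is the kind of discrepancy that makes a cumulative counter appear to stall or jump.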

fs/proc/meminfo.c: fix compilation error · b53fc7c2

Committed by Claudio Scordino on Dec 08, 2011

Fix the error message "directives may not be used inside a macro argument"
which appears when the kernel is compiled for the cris architecture.
Signed-off-by: Claudio Scordino <claudio@evidence.eu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b53fc7c2

cifs: check for NULL last_entry before calling cifs_save_resume_key · 7023676f

Committed by Jeff Layton on Dec 01, 2011

Prior to commit eaf35b1e, cifs_save_resume_key had some NULL pointer
checks at the top. It turns out that at least one of those NULL
pointer checks is needed after all.

When the LastNameOffset in a FIND reply appears to be beyond the end of
the buffer, CIFSFindFirst and CIFSFindNext will set srch_inf.last_entry
to NULL. Since eaf35b1e, the code will now oops in this situation.

Fix this by having the callers check for a NULL last entry pointer
before calling cifs_save_resume_key. No change is needed for the
call site in cifs_readdir as it's not reachable with a NULL
current_entry pointer.

This should fix:

    https://bugzilla.redhat.com/show_bug.cgi?id=750247

Cc: stable@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Reported-by: Adam G. Metzler <adamgmetzler@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>

7023676f

cifs: attempt to freeze while looping on a receive attempt · 95edcff4

Committed by Jeff Layton on Dec 01, 2011

In the recent overhaul of the demultiplex thread receive path, I
neglected to ensure that we attempt to freeze on each pass through the
receive loop.
Reported-and-Tested-by: Woody Suwalski <terraluna977@gmail.com>
Reported-and-Tested-by: Adam Williamson <awilliam@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>

95edcff4

cifs: Fix sparse warning when calling cifs_strtoUCS · 59edb63a

Committed by Steve French on Nov 10, 2011

Fix sparse endian check warning while calling cifs_strtoUCS

CHECK   fs/cifs/smbencrypt.c
fs/cifs/smbencrypt.c:216:37: warning: incorrect type in argument 1
(different base types)
fs/cifs/smbencrypt.c:216:37:    expected restricted __le16 [usertype] *<noident>
fs/cifs/smbencrypt.c:216:37:    got unsigned short *<noident>
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>

59edb63a

CIFS: Add descriptions to the brlock cache functions · 9a5101c8

Committed by Pavel Shilovsky on Nov 07, 2011

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>

9a5101c8

08 Dec, 2011 4 commits

Btrfs: drop spin lock when memory alloc fails · 1cf4ffdb

Committed by Liu Bo on Dec 07, 2011

Drop the spin lock in convert_extent_bit() when memory allocation fails;
otherwise it will deadlock.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1cf4ffdb
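
The pattern behind the fix above is that every exit path, including the allocation-failure path, must release the lock. The following is a userspace sketch under stated assumptions: a toy spin lock built on C11 `atomic_flag`, and a hypothetical `convert_range()` whose `fail_alloc` parameter simulates the allocation failing. It is not the Btrfs code.

```c
#include <stdatomic.h>
#include <stdlib.h>

static atomic_flag tree_lock = ATOMIC_FLAG_INIT;   /* toy spin lock */

static void tree_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&tree_lock)) { /* spin */ }
}

static void tree_lock_release(void)
{
    atomic_flag_clear(&tree_lock);
}

/* Returns 0 on success, -1 if the allocation fails.
 * fail_alloc is a hypothetical knob that simulates the failure. */
static int convert_range(int fail_alloc)
{
    tree_lock_acquire();

    void *state = fail_alloc ? NULL : malloc(64);
    if (!state) {
        tree_lock_release();   /* never return with the lock held */
        return -1;
    }

    /* ... work on state under the lock ... */
    free(state);
    tree_lock_release();
    return 0;
}
```

If the release on the failure path were missing, the next call to `convert_range()` would spin forever, which is the deadlock the commit message refers to.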

Btrfs: check if the to-be-added device is writable · a5d16333

Committed by Li Zefan on Dec 07, 2011

If we call ioctl(BTRFS_IOC_ADD_DEV) directly, we'll succeed in adding
a readonly device to a btrfs filesystem, and btrfs will write to
that device, emitting kernel errors:

[ 3109.833692] lost page write due to I/O error on loop2
[ 3109.833720] lost page write due to I/O error on loop2
...
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

a5d16333

Btrfs: try cluster but don't advance in search list · 274bd4fb

Committed by Alexandre Oliva on Dec 07, 2011

When we find an existing cluster, we switch to its block group as the
current block group, possibly skipping multiple blocks in the process.
Furthermore, under heavy contention, multiple threads may fail to
allocate from a cluster and then release just-created clusters just to
proceed to create new ones in a different block group.

This patch tries to allocate from an existing cluster regardless of its
block group, and doesn't switch to that group, instead proceeding to
try to allocate a cluster from the group it was iterating before the
attempt.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

274bd4fb

Btrfs: try to allocate from cluster even at LOOP_NO_EMPTY_SIZE · 062c05c4

Committed by Alexandre Oliva on Dec 07, 2011

If we reach LOOP_NO_EMPTY_SIZE, we won't even try to use a cluster that
others might have set up.  Odds are that there won't be one, but if
someone else succeeded in setting it up, we might as well use it, even
if we don't try to set up a cluster again.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

062c05c4

07 Dec, 2011 3 commits

fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API · 02125a82

Committed by Al Viro on Dec 05, 2011

__d_path() API is asking for trouble and in case of apparmor d_namespace_path()
getting just that.  The root cause is that when __d_path() misses the root
it had been told to look for, it stores the location of the most remote ancestor
in *root.  Without grabbing references.  Sure, at the moment of call it had
been pinned down by what we have in *path.  And if we raced with umount -l, we
could have very well stopped at vfsmount/dentry that got freed as soon as
prepend_path() dropped vfsmount_lock.

It is safe to compare these pointers with pre-existing (and known to be still
alive) vfsmount and dentry, as long as all we are asking is "is it the same
address?".  Dereferencing is not safe and apparmor ended up stepping into
that.  d_namespace_path() really wants to examine the place where we stopped,
even if it's not connected to our namespace.  As the result, it looked
at ->d_sb->s_magic of a dentry that might've been already freed by that point.
All other callers had been careful enough to avoid that, but it's really
a bad interface - it invites that kind of trouble.

The fix is fairly straightforward, even though it's bigger than I'd like:
	* prepend_path() root argument becomes const.
	* __d_path() is never called with NULL/NULL root.  It was a kludge
to start with.  Instead, we have an explicit function - d_absolute_root().
Same as __d_path(), except that it doesn't get root passed and stops where
it stops.  apparmor and tomoyo are using it.
	* __d_path() returns NULL on path outside of root.  The main
caller is show_mountinfo() and that's precisely what we pass root for - to
skip those outside chroot jail.  Those who don't want that can (and do)
use d_path().
	* __d_path() root argument becomes const.  Everyone agrees, I hope.
	* apparmor does *NOT* try to use __d_path() or any of its variants
when it sees that path->mnt is an internal vfsmount.  In that case it's
definitely not mounted anywhere and dentry_path() is exactly what we want
there.  Handling of sysctl()-triggered weirdness is moved to that place.
	* if apparmor is asked to do pathname relative to chroot jail
and __d_path() tells it it's not in that jail, the sucker just calls
d_absolute_path() instead.  That's the other remaining caller of __d_path(),
BTW.
        * seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway -
the normal seq_file logics will take care of growing the buffer and redoing
the call of ->show() just fine).  However, if it gets path not reachable
from root, it returns SEQ_SKIP.  The only caller adjusted (i.e. stopped
ignoring the return value as it used to do).
Reviewed-by: John Johansen <john.johansen@canonical.com>
ACKed-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org

02125a82

xfs: fix the logspace waiting algorithm · 9f9c19ec

Committed by Christoph Hellwig on Nov 28, 2011

Apply the scheme used in log_regrant_write_log_space to wake up any other
threads waiting for log space before the newly added one to
log_regrant_write_log_space as well, and factor the code into readable
helpers.  For each of the queues we add two helpers:

 - one to try to wake up all waiting threads.  This helper will also be
   usable by xfs_log_move_tail once we remove the current opportunistic
   wakeups in it.
 - one to sleep on t_wait until enough log space is available, loosely
   modelled after Linux waitqueues.
 
And use them to reimplement the guts of log_regrant_write_log_space and
log_regrant_write_log_space.  These two functions now use one and the same
algorithm for waiting on log space instead of subtly different ones before,
with an option to completely unify them in the near future.

Also move the filesystem shutdown handling to the common caller given
that we had to touch it anyway.

Based on hard debugging and an earlier patch from
Chandra Seetharaman <sekharan@us.ibm.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandra Seetharaman <sekharan@us.ibm.com>
Tested-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

9f9c19ec

xfs: fix nfs export of 64-bit inodes numbers on 32-bit kernels · c29f7d45

Committed by Christoph Hellwig on Nov 30, 2011

The i_ino field in the VFS inode is of type unsigned long and thus can't
hold the full 64-bit inode number on 32-bit kernels.  We have the full
inode number in the XFS inode, so use that one for nfs exports.  Note
that I've also switched the 32-bit file handles types to it, just to make
the code more consistent and copy & paste errors less likely to happen.
Reported-by: Guoquan Yang <ygq51@hotmail.com>
Reported-by: Hank Peng <pengxihan@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

c29f7d45

03 Dec, 2011 1 commit

xfs: fix allocation length overflow in xfs_bmapi_write() · a99ebf43

Committed by Dave Chinner on Dec 01, 2011

When testing the new xfstests --large-fs option that does very large
file preallocations, this assert was tripped deep in
xfs_alloc_vextent():

XFS: Assertion failed: args->minlen <= args->maxlen, file: fs/xfs/xfs_alloc.c, line: 2239

The allocation was trying to allocate a zero length extent because
the lower 32 bits of the allocation length was zero. The remaining
length of the allocation to be done was an exact multiple of 2^32 -
the first case I saw was at 496TB remaining to be allocated.

This turns out to be an overflow when converting the allocation
length (a 64 bit quantity) into the extent length to allocate (a 32
bit quantity), and it requires the length to be allocated to be an exact
multiple of 2^32 blocks to trip the assert.

Fix it by limiting the extent length to allocate to MAXEXTLEN.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

a99ebf43
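
The overflow in the xfs_bmapi_write() commit above comes down to narrowing a 64-bit length into a 32-bit one. This sketch contrasts the truncating conversion with a clamped one. `MAX_EXTENT_LEN` is an assumed 21-bit limit chosen for the example and both helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed limit for the sketch: a 21-bit extent length field. */
#define MAX_EXTENT_LEN ((uint32_t)0x001fffff)

/* Plain narrowing: only the low 32 bits survive. */
static uint32_t extent_len_truncated(uint64_t remaining)
{
    return (uint32_t)remaining;
}

/* Clamp first, then narrow: the result is never zero for a non-zero input. */
static uint32_t extent_len_clamped(uint64_t remaining)
{
    return remaining > MAX_EXTENT_LEN ? MAX_EXTENT_LEN : (uint32_t)remaining;
}
```

With a remaining length of exactly 2^32 blocks the truncating helper yields zero, matching the zero-length extent in the report, while the clamped helper yields the maximum extent length and lets the caller loop for the rest.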

02 Dec, 2011 1 commit

ocfs2: avoid unaligned access to dqc_bitmap · 93925579

Committed by Akinobu Mita on Nov 15, 2011

The dqc_bitmap field of struct ocfs2_local_disk_chunk is 32-bit aligned,
but not 64-bit aligned.  The dqc_bitmap is accessed by ocfs2_set_bit(),
ocfs2_clear_bit(), ocfs2_test_bit(), or ocfs2_find_next_zero_bit().  These
are wrapper macros for ext2_*_bit() which need to take an unsigned long
aligned address (though some architectures are able to handle unaligned
addresses correctly).

So some 64bit architectures may not be able to access the dqc_bitmap
correctly.

This avoids such unaligned access by using other wrapper functions for
ext2_*_bit().  The code is taken from fs/ext4/mballoc.c which also needs to
handle unaligned bitmap access.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

93925579
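
One way to avoid the unaligned access described in the ocfs2 commit above is to round the bitmap address down to an unsigned long boundary and shift the bit number to compensate. The sketch below shows that idea in userspace. All function names are hypothetical, and `le_test_bit()` is a byte-wise stand-in for the kernel's little-endian bit helpers, so the sketch shows the arithmetic, not a real alignment fault.

```c
#include <stdint.h>

/* Little-endian bit test on a byte array, standing in for ext2_test_bit(). */
static int le_test_bit(int bit, const void *addr)
{
    const unsigned char *p = addr;
    return (p[bit >> 3] >> (bit & 7)) & 1;
}

/* Round addr down to unsigned long alignment and increase *bit so that
 * (aligned address, new bit) names the same bit as (addr, old bit). */
static const void *correct_addr_and_bit(int *bit, const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    uintptr_t misalign = a & (sizeof(unsigned long) - 1);

    *bit += (int)(misalign << 3);          /* 8 bits per skipped byte */
    return (const void *)(a - misalign);
}

static int test_bit_unaligned(int bit, const void *addr)
{
    addr = correct_addr_and_bit(&bit, addr);
    return le_test_bit(bit, addr);
}
```

The adjustment only works because the bit numbering is little-endian: moving the base back by n bytes moves every bit index forward by 8n.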

01 Dec, 2011 10 commits

Btrfs: fix meta data raid-repair merge problem · f4a8e656

Committed by Jan Schmidt on Dec 01, 2011

Commit 4a54c8c1 introduced raid-repair, killing the individual
readpage_io_failed_hook entries from inode.c and disk-io.c. Commit
4bb31e92 introduced new readahead code, adding a readpage_io_failed_hook to
disk-io.c.

The raid-repair commit had logic to disable raid-repair, if
readpage_io_failed_hook is set. Thus, the readahead commit effectively
disabled raid-repair for meta data.

This commit changes the logic to always attempt raid-repair when needed and
call the readpage_io_failed_hook in case raid-repair fails. This is much
more straight forward and should have been like that from the beginning.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Reported-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f4a8e656

Btrfs: skip allocation attempt from empty cluster · be064d11

Committed by Alexandre Oliva on Nov 30, 2011

If we don't have a cluster, don't bother trying to allocate from it,
jumping right away to the attempt to allocate a new cluster.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

be064d11

Btrfs: skip block groups without enough space for a cluster · 425d8315

Committed by Alexandre Oliva on Nov 30, 2011

We test whether a block group has enough free space to hold the
requested block, but when we're doing clustered allocation, we can
save some cycles by testing whether it has enough room for the cluster
upfront, otherwise we end up attempting to set up a cluster and
failing. Only in the NO_EMPTY_SIZE loop do we attempt an unclustered
allocation, and by then we'll have zeroed the cluster size, so this
patch won't stop us from using the block group as a last resort.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

425d8315

Btrfs: start search for new cluster at the beginning · 1b22bad7

Committed by Alexandre Oliva on Nov 30, 2011

Instead of starting at zero (offset is always zero), request a cluster
starting at search_start, that denotes the beginning of the current
block group.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

1b22bad7

Btrfs: reset cluster's max_size when creating bitmap · b78d09bc

Committed by Alexandre Oliva on Nov 30, 2011

The field that indicates the size of the largest contiguous chunk of
free space in the cluster is not initialized when setting up bitmaps,
it's only increased when we find a larger contiguous chunk.  We end up
retaining a larger value than appropriate for highly-fragmented
clusters, which may cause pointless searches for large contiguous
groups, and even cause clusters that do not meet the density
requirements to be set up.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b78d09bc
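
The max_size problem in the commit above is a running maximum that is only ever raised. A minimal sketch, assuming a hypothetical one-field `struct cluster` and a bitmap passed as an array of 0/1 flags, shows why the value has to be reset before each scan.

```c
#include <stddef.h>

struct cluster { size_t max_size; };   /* hypothetical, one field only */

/* Scan free_bits (1 = free) and record the longest run of free entries. */
static void cluster_scan_bitmap(struct cluster *c,
                                const unsigned char *free_bits, size_t nbits)
{
    size_t run = 0;

    c->max_size = 0;    /* reset; otherwise a stale larger value survives */
    for (size_t i = 0; i < nbits; i++) {
        run = free_bits[i] ? run + 1 : 0;
        if (run > c->max_size)
            c->max_size = run;
    }
}
```

Without the reset, a cluster that previously recorded a large contiguous chunk would keep advertising it after being rebuilt from a fragmented bitmap.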

Btrfs: initialize new bitmaps' list · f2d0f676

Committed by Alexandre Oliva on Nov 28, 2011

We're failing to create clusters with bitmaps because
setup_cluster_no_bitmap checks that the list is empty before inserting
the bitmap entry in the list for setup_cluster_bitmap, but the list
field is only initialized when it is restored from the on-disk free
space cache, or when it is written out to disk.

Besides a potential race condition due to the multiple use of the list
field, filesystem performance severely degrades over time: as we use
up all non-bitmap free extents, the try-to-set-up-cluster dance is
done at every metadata block allocation. For every block group, we
fail to set up a cluster, and after failing on them all up to twice,
we fall back to the much slower unclustered allocation.

To make matters worse, before the unclustered allocation, we try to
create new block groups until we reach the 1% threshold, which
introduces additional bitmaps and thus block groups that we'll iterate
over at each metadata block request.

f2d0f676
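
The root cause in the commit above is an emptiness check on a list head that was never initialized. This userspace sketch uses a small circular list in the style of the kernel's, with hypothetical names, to show that zero-filled memory does not look like an empty list.

```c
struct list_head { struct list_head *next, *prev; };

/* An empty circular list points at itself in both directions. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static int list_empty(const struct list_head *h)
{
    return h->next == h;
}

struct bitmap_entry { struct list_head list; /* other fields omitted */ };

static void bitmap_entry_init(struct bitmap_entry *e)
{
    init_list_head(&e->list);   /* without this, list_empty() is meaningless */
}
```

A zeroed entry has `next == NULL`, so `list_empty()` reports it as non-empty, and a caller that insists on an empty list before inserting will refuse the entry every time.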

Btrfs: fix oops when calling statfs on readonly device · b772a86e

Committed by Li Zefan on Nov 28, 2011

To reproduce this bug:

  # dd if=/dev/zero of=img bs=1M count=256
  # mkfs.btrfs img
  # losetup -r /dev/loop1 img
  # mount /dev/loop1 /mnt
  OOPS!!

It triggered BUG_ON(!nr_devices) in btrfs_calc_avail_data_space().

To fix this, instead of checking write-only devices, we check all open
devices:

  # df -h /dev/loop1
  Filesystem            Size  Used Avail Use% Mounted on
  /dev/loop1            250M   28K  238M   1% /mnt
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

b772a86e

Btrfs: Don't error on resizing FS to same size · ece7d20e

Committed by Mike Fleetwood on Nov 18, 2011

It seems overly harsh to fail a resize of a btrfs file system to the
same size when a shrink or grow would succeed. User app GParted trips
over this error. Allow it by bypassing the shrink or grow operation.
Signed-off-by: Mike Fleetwood <mike.fleetwood@googlemail.com>

ece7d20e

Btrfs: fix deadlock on metadata reservation when evicting a inode · aa38a711

Committed by Miao Xie on Nov 18, 2011

When I ran the xfstests, I found the test tasks were blocked on meta-data
reservation.

By debugging, I found the reason of this bug:
   start transaction
        |
	v
   reserve meta-data space
	|
	v
   flush delay allocation -> iput inode -> evict inode
	^					|
	|					v
   wait for delay allocation flush <- reserve meta-data space

And besides that, the flush on evicting inode will block the thread, which
is reclaiming the memory, and make oom happen easily.

Fix this bug by skipping the flush step when evicting inode.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

aa38a711

btrfs scrub: handle -ENOMEM from init_ipath() · 26bdef54

Committed by Dan Carpenter on Nov 16, 2011

init_ipath() can return an ERR_PTR(-ENOMEM).
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

26bdef54

30 Nov, 2011 2 commits

xfs: fix attr2 vs large data fork assert · 4c393a60

Committed by Christoph Hellwig on Nov 19, 2011

With Dmitry's fsstress updates I've seen very reproducible crashes in
xfs_attr_shortform_remove because xfs_attr_shortform_bytesfit claims that
the attributes would not fit inline into the inode after removing an
attribute.  It turns out that we were operating on an inode with lots
of delalloc extents, and thus an if_bytes value for the data fork that
is larger than the biggest possible on-disk storage for it, which utterly
confuses the code near the end of xfs_attr_shortform_bytesfit.

Fix this by always allowing the current attribute fork, like we already
do for the attr1 format, given that delalloc conversion will take care
for moving either the data or attribute area out of line if it doesn't
fit at that point - or making the point moot by merging extents at this
point.

Also document the function better, and clean up some loose bits.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

4c393a60

xfs: force buffer writeback before blocking on the ilock in inode reclaim · 4dd2cb4a

Committed by Christoph Hellwig on Nov 29, 2011

If we are doing synchronous inode reclaim we block the VM from making
progress in memory reclaim.  So if we encounter a flush locked inode,
promote it in the delwri list and wake up xfsbufd to write it out now.
Without this we can get hangs of up to 30 seconds during workloads hitting
synchronous inode reclaim.

The scheme is copied from what we do for dquot reclaims.
Reported-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Ben Myers <bpm@sgi.com>

4dd2cb4a

29 Nov, 2011 1 commit

xfs: validate acl count · fa8b18ed

Committed by Christoph Hellwig on Nov 20, 2011

This prevents in-memory corruption and possible panics if the on-disk
ACL is badly corrupted.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>

fa8b18ed
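
The kind of check the commit above describes can be sketched as follows: an entry count read from disk is validated before it is used as a loop bound. The structure layout, the limit of 25 entries, and `acl_from_disk()` are assumptions made for the example. The real on-disk format also needs endianness conversion, which is omitted here.

```c
#include <stdint.h>

#define ACL_MAX_ENTRIES 25   /* assumed limit for the sketch */

struct acl_entry { uint32_t tag; uint32_t id; uint16_t perm; };

struct acl {
    uint32_t count;
    struct acl_entry entries[ACL_MAX_ENTRIES];
};

/* Copy an ACL read from disk into memory.
 * Returns 0 on success, -1 if the on-disk count is out of range. */
static int acl_from_disk(struct acl *dst, const struct acl *disk)
{
    if (disk->count > ACL_MAX_ENTRIES)
        return -1;               /* corrupt: would overrun entries[] */

    dst->count = disk->count;
    for (uint32_t i = 0; i < disk->count; i++)
        dst->entries[i] = disk->entries[i];
    return 0;
}
```

Rejecting the bad count up front keeps the copy loop inside the fixed-size array, which is what prevents the in-memory corruption.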

25 Nov, 2011 1 commit

ext4: fix racy use-after-free in ext4_end_io_dio() · 4c81f045

Committed by Tejun Heo on Nov 24, 2011

ext4_end_io_dio() queues io_end->work and then clears iocb->private;
however, io_end->work calls aio_complete() which frees the iocb
object.  If that slab object gets reallocated, then ext4_end_io_dio()
can end up clearing someone else's iocb->private; this use-after-free
can cause a leak of a struct ext4_io_end_t structure.

Detected and tested with slab poisoning.

[ Note: Can also reproduce using 12 fio's against 12 file systems with the
  following configuration file:

  [global]
  direct=1
  ioengine=libaio
  iodepth=1
  bs=4k
  ba=4k
  size=128m

  [create]
  filename=${TESTDIR}
  rw=write

  -- tytso ]

Google-Bug-Id: 5354697
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Kent Overstreet <koverstreet@google.com>
Tested-by: Kent Overstreet <koverstreet@google.com>
Cc: stable@kernel.org

4c81f045

24 Nov, 2011 3 commits

eCryptfs: Extend array bounds for all filename chars · 0f751e64

Committed by Tyler Hicks on Nov 23, 2011

From mhalcrow's original commit message:

    Characters with ASCII values greater than the size of
    filename_rev_map[] are valid filename characters.
    ecryptfs_decode_from_filename() will access kernel memory beyond
    that array, and ecryptfs_parse_tag_70_packet() will then decrypt
    those characters. The attacker, using the FNEK of the crafted file,
    can then re-encrypt the characters to reveal the kernel memory past
    the end of the filename_rev_map[] array. I expect low security
    impact since this array is statically allocated in the text area,
    and the amount of memory past the array that is accessible is
    limited by the largest possible ASCII filename character.

This patch solves the issue reported by mhalcrow but with an
implementation suggested by Linus to simply extend the length of
filename_rev_map[] to 256. Characters greater than 0x7A are mapped to
0x00, which is how invalid characters less than 0x7A were previously
being handled.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Michael Halcrow <mhalcrow@google.com>
Cc: stable@kernel.org

0f751e64
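
The approach in the eCryptfs commit above, sizing the lookup table so that every possible byte value is a valid index, can be illustrated with a small table. The map contents below are made-up values for the example, not the real filename_rev_map[], and `decode_char()` is a hypothetical helper.

```c
/* Hypothetical reverse map with 256 entries, so any unsigned char is a
 * valid index. Entries not listed are zero-initialized (treated as invalid). */
static const unsigned char rev_map[256] = {
    ['A'] = 1,
    ['B'] = 2,
    ['z'] = 3,
};

/* No bounds check is needed: c can only be 0..255. */
static unsigned char decode_char(unsigned char c)
{
    return rev_map[c];
}
```

Extending the array trades a few hundred bytes of static data for the removal of a per-character range check, and bytes above the last meaningful entry simply decode to 0x00.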

eCryptfs: Flush file in vma close · 32001d6f

Committed by Tyler Hicks on Nov 21, 2011

Dirty pages weren't being written back when an mmap'ed eCryptfs file was
closed before the mapping was unmapped. Since f_ops->flush() is not
called by the munmap() path, the lower file was simply being released.
This patch flushes the eCryptfs file in the vm_ops->close() path.

https://launchpad.net/bugs/870326

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: stable@kernel.org [2.6.39+]

32001d6f

eCryptfs: Prevent file create race condition · b59db43a

Committed by Tyler Hicks on Nov 21, 2011

The file creation path prematurely called d_instantiate() and
unlock_new_inode() before the eCryptfs inode info was fully
allocated and initialized and before the eCryptfs metadata was written
to the lower file.

This could result in race conditions in subsequent file and inode
operations leading to unexpected error conditions or a null pointer
dereference while attempting to use the unallocated memory.

https://launchpad.net/bugs/813146

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: stable@kernel.org

b59db43a

23 Nov, 2011 1 commit

mount_subtree() pointless use-after-free · d31da0f0

Committed by Al Viro on Nov 22, 2011

d'oh... we'd carefully pinned mnt->mnt_sb down, dropped mnt and attempted
to grab s_umount on mnt->mnt_sb.  The trouble is, *mnt might've been
overwritten by now...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d31da0f0

22 Nov, 2011 2 commits

Btrfs: remove free-space-cache.c WARN during log replay · 24a70313

Committed by Chris Mason on Nov 21, 2011

The log replay code only partially loads block groups, since
the block group caching code is able to detect and deal with
extents the logging code has pinned down.

While the logging code is pinning down block groups, there is
a bogus WARN_ON we're hitting if the code wasn't able to find
an extent in the cache.  This commit removes the warning because
it can happen any time there isn't a valid free space cache
for that block group.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

24a70313

ext4: fix up a undefined error in ext4_free_blocks in debugging code · 6e58ad69

Committed by Yongqiang Yang on Nov 21, 2011

sbi is not defined, so let ext4_free_blocks use EXT4_SB(sb) instead
when EXT4FS_DEBUG is defined.
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>

6e58ad69

21 Nov, 2011 1 commit

VFS: Log the fact that we've given ELOOP rather than creating a loop · dd179946

Committed by David Howells on Aug 16, 2011

To prevent an NFS server from being used to create a directory loop in an NFS
superblock on the client, the following patch was committed:

commit 18367501
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Tue Jul 12 21:42:24 2011 -0400
Subject: fix loop checks in d_materialise_unique()

This causes ELOOP to be reported to anyone trying to access the dentry that
would otherwise cause the kernel to complete the loop.

However, no indication is given to the caller as to why an operation that ought
to work doesn't. The fault is with the kernel, which doesn't want to try and
solve the problem as it gets horrendously messy if there's another mountpoint
somewhere in the trees being spliced that can't be moved[*].

[*] The real problem is that we don't handle the excision of a subtree that
gets moved _out_ of what we can see. This can happen on the server where a
directory is merely moved between two other dirs on the same filesystem, but
where destination dir is not accessible by the client.

So, given the choice to return ELOOP rather than trying to reconfigure the
dentry tree, we should give the caller some indication of why they aren't being
allowed to make what should be a legitimate request and log a message.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dd179946

20 Nov, 2011 3 commits

Btrfs: sectorsize align offsets in fiemap · 4d479cf0

Committed by Josef Bacik on Nov 17, 2011

We've been hitting BUG()'s in btrfs_cont_expand and btrfs_fallocate and anywhere
else that calls btrfs_get_extent while running xfstests 13 in a loop. This is
because fiemap is calling btrfs_get_extent with non-sectorsize aligned offsets,
which will end up adding mappings that are not sectorsize aligned, which will
cause problems in some cases for subsequent calls to btrfs_get_extent for
similar areas that are sectorsize aligned. With this patch I ran xfstests 13 in
a loop for a couple of hours and didn't hit the problem that I could previously
hit in at most 20 minutes. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

4d479cf0
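
Sector alignment of the kind discussed in the fiemap commit above is usually done with power-of-two masking. The two helpers below are a generic sketch with hypothetical names. The commit message does not say in which direction the patch rounds each offset, so this shows only the arithmetic.

```c
#include <stdint.h>

/* Both helpers assume sectorsize is a power of two. */

/* Largest multiple of sectorsize that is <= x. */
static uint64_t align_down(uint64_t x, uint64_t sectorsize)
{
    return x & ~(sectorsize - 1);
}

/* Smallest multiple of sectorsize that is >= x. */
static uint64_t align_up(uint64_t x, uint64_t sectorsize)
{
    return (x + sectorsize - 1) & ~(sectorsize - 1);
}
```

With a 4096-byte sector, a request covering bytes 5000 to 9000 can be widened to the aligned range 4096 to 12288 by rounding the start down and the end up.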

Btrfs: clear pages dirty for io and set them extent mapped · f7d61dcd

Committed by Josef Bacik on Nov 15, 2011

When doing the io_ctl helpers to clean up the free space cache stuff I stopped
using our normal prepare_pages stuff, which means I of course forgot to do
things like set the pages extent mapped, which will cause us all sorts of
wonderful problems.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

f7d61dcd

Btrfs: wait on caching if we're loading the free space cache · 291c7d2f

Committed by Josef Bacik on Nov 14, 2011

We've been hitting panics when running xfstest 13 in a loop for long periods of
time. And actually this problem has always existed so we've been hitting these
things randomly for a while. Basically what happens is we get a thread coming
into the allocator and reading the space cache off of disk and adding the
entries to the free space cache as we go. Then we get another thread that comes
in and tries to allocate from that block group. Since block_group->cached !=
BTRFS_CACHE_NO it goes ahead and tries to do the allocation. We do this because
if we're doing the old slow way of caching we don't want to hold people up and
wait for everything to finish. The problem with this is we could end up
discarding the space cache at some arbitrary point in the future, which means we
could very well end up allocating space that is either bad, or when the real
caching happens it could end up thinking the space isn't in use when it really
is and cause all sorts of other problems.

The solution is to add a new flag to indicate we are loading the free space
cache from disk, and always try to cache the block group if cache->cached !=
BTRFS_CACHE_FINISHED. That way if we are loading the space cache anybody else
who tries to allocate from the block group will have to wait until it's finished
to make sure it completes successfully. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

291c7d2f

openanolis / cloud-kernel synced successfully about 1 year ago
