Commits · 45f1a9c3f63db3d4562c16062a51740801fbd88c · gsplhtlxg / clone-Linux

31 Aug, 2014 1 commit
- ext4: use ext4_update_i_disksize instead of opencoded ones · ee124d27
  Authored by Dmitry Monakhov on Aug 30, 2014
```
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
```
30 Aug, 2014 2 commits
- ext4: convert ext4_bread() to use the ERR_PTR convention · 1c215028
  Authored by Theodore Ts'o on Aug 29, 2014
```
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
```
- ext4: convert ext4_getblk() to use the ERR_PTR convention · 10560082
  Authored by Theodore Ts'o on Aug 29, 2014
```
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
```
29 Aug, 2014 1 commit

ext4: update i_disksize coherently with block allocation on error path · 6603120e

Authored by Dmitry Monakhov on Aug 27, 2014

With delalloc blocks, i_disksize may be less than i_size, so we
have to update i_disksize each time we allocate and submit some
blocks beyond i_disksize.  We weren't doing this on the error paths,
so fix this.

testcase: xfstest generic/019
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

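The rule described in this message can be modelled in a few lines. This is a minimal sketch, not the kernel code: the function name `grow_disksize` and its parameters are hypothetical, and it assumes only that the on-disk size may grow but never shrink, and that it should not exceed the in-memory i_size.

```python
def grow_disksize(i_disksize, i_size, submitted_end):
    """Return the new i_disksize after blocks up to submitted_end were submitted.

    Toy model (assumption): the on-disk size only ever grows, and it is
    clamped so that it never exceeds the in-memory i_size.
    """
    candidate = min(submitted_end, i_size)
    return max(i_disksize, candidate)


# Blocks were submitted up to byte 8192 while i_disksize was still 4096.
print(grow_disksize(4096, 16384, 8192))
```

Applying the same update on the error path keeps i_disksize in step with whatever blocks were already allocated and submitted before the failure.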

24 Aug, 2014 1 commit

ext4: move i_size,i_disksize update routines to helper function · 4631dbf6

Authored by Dmitry Monakhov on Aug 23, 2014

Cc: stable@vger.kernel.org # needed for bug fix patches
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

15 Jul, 2014 2 commits

ext4: fix punch hole on files with indirect mapping · 4f579ae7

Authored by Lukas Czerner on Jul 15, 2014

Currently punch hole code on files with direct/indirect mapping has some
problems which may lead to data loss. For example (from Jan Kara):

fallocate -n -p 10240000 4096

will punch the range 10240000 - 12632064 instead of the range 10240000 -
10244096.

Also the code is a bit weird and it's not using the infrastructure provided
by indirect.c, but rather creating its own way.

This patch fixes the issues and also makes the operation run 4
times faster in my testing (punching out a 60GB file). It uses an approach
similar to the one in ext4_ind_truncate(), which takes advantage of the
ext4_free_branches() function.

Also rename ext4_free_hole_blocks() to something more sensible, like
the equivalent we have for extent mapped files. Call it
ext4_ind_remove_space().

This has been tested mostly with fsx and some xfstests which test
punch hole but do not require unwritten extents, which are not
supported with direct/indirect mapping. No problems showed up even with
1024k block size.

CC: stable@vger.kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

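The intended range in Jan Kara's example follows from plain arithmetic on the offset and length. A minimal sketch, assuming a 4096-byte block size and a hypothetical helper name `punch_range` (neither is taken from the patch), that computes the byte range and the whole blocks it covers:

```python
def punch_range(offset, length, block_size=4096):
    """Return (byte_start, byte_end, first_block, stop_block) for a hole.

    byte_end and stop_block are exclusive. Only blocks that lie entirely
    inside the byte range can be freed; partial blocks at either edge
    would have to be zeroed instead.
    """
    byte_end = offset + length
    # Round the start up and the end down to block boundaries.
    first_block = -(-offset // block_size)
    stop_block = byte_end // block_size
    return offset, byte_end, first_block, max(first_block, stop_block)


start, end, first, stop = punch_range(10240000, 4096)
print(start, end, first, stop)
```

Under these assumptions the request covers exactly one block, which is why punching a range that ends at 12632064 frees far more data than was asked for.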

ext4: remove metadata reservation checks · 71d4f7d0

Authored by Theodore Ts'o on Jul 15, 2014

Commit 27dd4385 ("ext4: introduce reserved space") reserves 2% of
the file system space to make sure metadata allocations will always
succeed.  Given that, tracking the reservation of metadata blocks is
no longer necessary.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

02 Jun, 2014 1 commit

ext4: handle symlink properly with inline_data · bd9db175

Authored by Zheng Liu on Jun 02, 2014

This commit fixes a bug where a symlink cannot be read properly with the
inline data feature when the length of the symlink is greater than 60 bytes
but less than the extra space.

The key issue is that ext4_inode_is_fast_symlink() doesn't check
whether or not an inode has inline data.  When the user creates a new
symlink, an inode is allocated with the MAY_INLINE_DATA flag.  The
symlink is then stored in ->i_block and the extended attribute space, and
the inode carries the inline data flag.  After a remount,
ext4_inode_is_fast_symlink() thinks that this inode is a
fast symlink, so the data in ->i_block is copied to the user and
the data in the extra space is trimmed.  In fact this inode should be
treated as a normal symlink.

The following script can hit this bug.

  #!/bin/bash

  cd ${MNT}
  filename=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
  rm -rf test
  mkdir test
  cd test
  echo "hello" >$filename
  ln -s $filename symlinkfile
  cd
  sudo umount /mnt/sda1
  sudo mount -t ext4 /dev/sda1 /mnt/sda1
  readlink /mnt/sda1/test/symlinkfile

Applying this patch breaks an assumption in e2fsck, because the
original implementation did not intend to support symlinks
with inline data.

Reported-by: "Darrick J. Wong" <darrick.wong@oracle.com>
Reported-by: Ian Nartowicz <claws@nartowicz.co.uk>
Cc: Ian Nartowicz <claws@nartowicz.co.uk>
Cc: Tao Ma <tm@tao.ma>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

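The decision described in this message can be written as a small predicate. This is a toy model only: the function name, its parameters and the block-count test are assumptions made for illustration, not the kernel's actual signature.

```python
def is_fast_symlink(is_symlink, has_inline_data, i_blocks, ea_blocks=0):
    """Toy model of the fast-symlink test described above.

    A fast symlink keeps its whole target in ->i_block, so it owns no
    data blocks apart from an optional extended-attribute block. An inode
    flagged as inline data must never be treated as one, because part of
    its target may live in the extended attribute space.
    """
    if has_inline_data:
        return False
    return is_symlink and (i_blocks - ea_blocks) == 0


# A 62-character target stored as inline data is not a fast symlink.
print(is_fast_symlink(True, True, 0))
```

Checking the inline data flag first is the point of the fix: without it, only the part of the target held in ->i_block would be returned to the user.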

13 May, 2014 2 commits

ext4: add missing BUFFER_TRACE before ext4_journal_get_write_access · 5d601255

Authored by liang xie on May 12, 2014

Make them more consistent.

Signed-off-by: xieliang <xieliang@xiaomi.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove unnecessary double parentheses · c8b459f4

Authored by Lukas Czerner on May 12, 2014

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12 May, 2014 2 commits

ext4: make local functions static · c197855e

Authored by Stephen Hemminger on May 12, 2014

I have been running make namespacecheck to look for unneeded globals, and
found these in ext4.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix data integrity sync in ordered mode · 1c8349a1

Authored by Namjae Jeon on May 12, 2014

When we perform a data integrity sync we tag all the dirty pages with
PAGECACHE_TAG_TOWRITE at the start of ext4_da_writepages.  Later we check
for this tag in write_cache_pages_da, create a struct
mpage_da_data containing contiguously indexed pages tagged with this
tag, and sync these pages with a call to mpage_da_map_and_submit.  This
process is repeated in a while loop until all the PAGECACHE_TAG_TOWRITE
pages are synced. We also do a journal start and stop in each iteration.
journal_stop could initiate a journal commit, which would call
ext4_writepage, which in turn will call ext4_bio_write_page even for
delayed OR unwritten buffers. When ext4_bio_write_page is called for
such buffers, it does not sync them, yet it clears the
PAGECACHE_TAG_TOWRITE of the corresponding page, and hence these pages
are also not synced by the currently running data integrity sync. We
will end up with dirty pages although the sync is completed.

This could cause a potential data loss when the sync call is followed
by a truncate_pagecache call, which is exactly the case in
collapse_range.  (It will cause generic/127 failure in xfstests)

To avoid this issue, we can use set_page_writeback_keepwrite instead of
set_page_writeback, which doesn't clear TOWRITE tag.

Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>

07 May, 2014 4 commits
- switch {__,}blockdev_direct_IO() to iov_iter · 31b14039
  Authored by Al Viro on Mar 05, 2014
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
- get rid of pointless iov_length() in ->direct_IO() · a6cbcd4a
  Authored by Al Viro on Mar 04, 2014
```
all callers have iov_length(iter->iov, iter->nr_segs) == iov_iter_count(iter)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
- ext4: switch the guts of ->direct_IO() to iov_iter · 16b1f05d
  Authored by Al Viro on Mar 04, 2014
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
- pass iov_iter to ->direct_IO() · d8d3d94b
  Authored by Al Viro on Mar 04, 2014
```
unmodified, for now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
22 Apr, 2014 1 commit

ext4: add a new spinlock i_raw_lock to protect the ext4's raw inode · 202ee5df

Authored by Theodore Ts'o on Apr 21, 2014

To avoid potential data races, use a spinlock which protects the raw
(on-disk) inode.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>

21 Apr, 2014 2 commits

ext4: rename uninitialized extents to unwritten · 556615dc

Authored by Lukas Czerner on Apr 20, 2014

Currently in ext4 there is quite a mess when it comes to naming
unwritten extents. Sometimes we call it uninitialized and sometimes we
refer to it as unwritten.

The right name for the extent which has been allocated but does not
contain any written data is _unwritten_. Other file systems are
using this name consistently, even the buffer head state refers to it as
unwritten. We need to fix this confusion in ext4.

This commit changes every reference to an uninitialized extent (meaning
allocated but unwritten) to unwritten extent. This includes comments,
function names and variable names. It even covers abbreviation of the
word uninitialized (such as uninit) and some misspellings.

This commit does not change any of the code paths at all. This has been
confirmed by comparing md5sums of the assembly code of each object file
after all the function names were stripped from it.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: get rid of EXT4_MAP_UNINIT flag · 090f32ee

Authored by Lukas Czerner on Apr 20, 2014

Currently EXT4_MAP_UNINIT is used in dioread_nolock case to mark the
cases where we're using dioread_nolock and we're writing into either
unallocated, or unwritten extent, because we need to make sure that
any DIO write into that inode will wait for the extent conversion.

However, EXT4_MAP_UNINIT is not only an entirely misleading name but also
unnecessary, because we can check for EXT4_MAP_UNWRITTEN in the
dioread_nolock case instead.

This commit removes the EXT4_MAP_UNINIT flag.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

18 Apr, 2014 1 commit

ext4: discard preallocations after removing space · ef24f6c2

Authored by Lukas Czerner on Apr 18, 2014

Currently in ext4_collapse_range() and ext4_punch_hole() we're
discarding preallocation twice. Once before we attempt to do any changes
and second time after we're done with the changes.

While the second call to ext4_discard_preallocations() in
ext4_punch_hole() case is not needed, we need to discard preallocation
right after ext4_ext_remove_space() in collapse range case because in
the case we had to restart a transaction in the middle of removing space
we might have new preallocations created.

Remove the unneeded ext4_discard_preallocations() call from
ext4_punch_hole() and move it to a better place in ext4_collapse_range().

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12 Apr, 2014 2 commits

fs: disallow all fallocate operation on active swapfile · 0790b31b

Authored by Lukas Czerner on Apr 12, 2014

Currently some file systems have an IS_SWAPFILE check in their fallocate
implementations and some do not. However we should really prevent any
fallocate operation on a swapfile, so move the check to vfs and remove the
redundant checks from the file systems' fallocate implementations.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove unnecessary check for APPEND and IMMUTABLE · 9ef06cec

Authored by Lukas Czerner on Apr 12, 2014

All the checks IS_APPEND and IS_IMMUTABLE for the fallocate operation on
the inode are done in vfs. No need to do this again in ext4. Remove it.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

11 Apr, 2014 1 commit

ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent() · 622cad13

Authored by Theodore Ts'o on Apr 11, 2014

The function ext4_update_i_disksize() is used in only one place, in
the function mpage_map_and_submit_extent().  Move its code to simplify
the code paths, and also move the call to ext4_mark_inode_dirty() into
the i_data_sem's critical region, to be consistent with all of the
other places where we update i_disksize.  That way, we also keep the
raw_inode's i_disksize protected, to avoid the following race:

      CPU #1                                 CPU #2

   down_write(&i_data_sem)
   Modify i_disk_size
   up_write(&i_data_sem)
                                        down_write(&i_data_sem)
                                        Modify i_disk_size
                                        Copy i_disk_size to on-disk inode
                                        up_write(&i_data_sem)
   Copy i_disk_size to on-disk inode

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org

08 Apr, 2014 1 commit

ext4: update PF_MEMALLOC handling in ext4_write_inode() · 87f7e416

Authored by Theodore Ts'o on Apr 08, 2014

The special handling of PF_MEMALLOC callers in ext4_write_inode()
shouldn't be necessary as there shouldn't be any. Warn about it. Also
update comment before the function as it seems somewhat outdated.

(Changes modeled on an ext3 patch posted by Jan Kara to the linux-ext4
mailing list on February 28, 2014, which apparently never went into
the ext3 tree.)

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>

07 Apr, 2014 1 commit

ext4: FIBMAP ioctl causes BUG_ON due to handle EXT_MAX_BLOCKS · 4adb6ab3

Authored by Kazuya Mio on Apr 07, 2014

When we try to get block 2^32-1 of a file which has the extent
(ee_block=2^32-2, ee_len=1) with the FIBMAP ioctl, it causes a BUG_ON
in ext4_ext_put_gap_in_cache().

To avoid the problem, ext4_map_blocks() needs to check the file logical block
number. ext4_ext_put_gap_in_cache() called via ext4_map_blocks() cannot
handle 2^32-1 because the maximum file logical block number is 2^32-2.

Note that ext4_ind_map_blocks() returns -EIO when the block number is invalid.
So ext4_map_blocks() should also return the same errno.

Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

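The bounds check this message calls for can be sketched as follows. The value given to `EXT_MAX_BLOCKS` (2^32-1, one past the largest valid logical block 2^32-2) and the helper name `check_logical_block` are assumptions for illustration, and the sketch models only the range test, not the mapping itself.

```python
import errno

# Assumed value: 2**32 - 1, one past the last valid logical block (2**32 - 2).
EXT_MAX_BLOCKS = 0xFFFFFFFF


def check_logical_block(lblk):
    """Return 0 if lblk is a valid file logical block, else -EIO."""
    if lblk >= EXT_MAX_BLOCKS:
        return -errno.EIO
    return 0


print(check_logical_block(2**32 - 2), check_logical_block(2**32 - 1))
```

Rejecting the out-of-range block up front means the gap-caching code is never asked to handle a block number it cannot represent.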

04 Apr, 2014 1 commit

mm + fs: store shadow entries in page cache · 91b0abe3

Authored by Johannes Weiner on Apr 03, 2014

Reclaim will be leaving shadow entries in the page cache radix tree upon
evicting the real page.  As those pages are found from the LRU, an
iput() can lead to the inode being freed concurrently.  At this point,
reclaim must no longer install shadow pages because the inode freeing
code needs to ensure the page tree is really empty.

Add an address_space flag, AS_EXITING, that the inode freeing code sets
under the tree lock before doing the final truncate.  Reclaim will check
for this flag before installing shadow pages.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

31 Mar, 2014 1 commit

ext4: atomically set inode->i_flags in ext4_set_inode_flags() · 00a1a053

Authored by Theodore Ts'o on Mar 30, 2014

Use cmpxchg() to atomically set i_flags instead of clearing out the
S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race
where an immutable file has the immutable flag cleared for a brief
window of time.

Reported-by: John Sullivan <jsrhbz@kanargh.force9.co.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

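The compare-and-swap pattern named in this message can be simulated in user space. The sketch below is not the kernel implementation: `AtomicWord` is a hypothetical stand-in for a machine word with a cmpxchg primitive, and the flag bit values are illustrative assumptions.

```python
import threading

# Illustrative bit values for the flags being managed (assumed here).
S_SYNC, S_NOATIME, S_APPEND, S_IMMUTABLE, S_DIRSYNC = 1, 2, 4, 8, 64
MANAGED = S_SYNC | S_NOATIME | S_APPEND | S_IMMUTABLE | S_DIRSYNC


class AtomicWord:
    """Tiny stand-in for a machine word with a compare-and-swap primitive."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        return self._value

    def cmpxchg(self, expected, new):
        """Store new only if the word still equals expected; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old


def set_inode_flags(word, new_managed_bits):
    """Replace only the managed bits in one atomic step, retrying on contention."""
    while True:
        old = word.load()
        new = (old & ~MANAGED) | (new_managed_bits & MANAGED)
        if word.cmpxchg(old, new) == old:
            return new


flags = AtomicWord(S_IMMUTABLE | 16)  # bit 16 is an unrelated flag
print(set_inode_flags(flags, S_IMMUTABLE | S_APPEND))
```

Because the new word is computed in full before it is published, no observer sees an intermediate state in which the immutable bit is cleared.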

25 Mar, 2014 4 commits

ext4: fix comment typo · e04027e8

Authored by Matthew Wilcox on Mar 24, 2014

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: make ext4_block_zero_page_range static · 94350ab5

Authored by Matthew Wilcox on Mar 24, 2014

It's only called within inode.c, so make it static, remove its prototype
from ext4.h and move it above all of its callers so it doesn't need a
prototype within inode.c.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: atomically set inode->i_flags in ext4_set_inode_flags() · 5f16f322

Authored by Theodore Ts'o on Mar 24, 2014

Use cmpxchg() to atomically set i_flags instead of clearing out the
S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race
where an immutable file has the immutable flag cleared for a brief
window of time.

Reported-by: John Sullivan <jsrhbz@kanargh.force9.co.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

ext4: optimize Hurd tests when reading/writing inodes · ed3654eb

Authored by Theodore Ts'o on Mar 24, 2014

Set an in-memory superblock flag to indicate whether the file system is
designed to support the Hurd.

Also, add a sanity check to make sure the 64-bit feature is not set
for Hurd file systems, since i_file_acl_high conflicts with a
Hurd-specific field.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

20 Mar, 2014 1 commit

ext4: kill i_version support for Hurd-castrated file systems · c4f65706

Authored by Theodore Ts'o on Mar 20, 2014

The Hurd file system uses the inode field which is now used for
i_version for its translator block.  This means that ext2 file systems
that are formatted for GNU Hurd can't be used to support NFSv4.  Given
that Hurd file systems don't support extents, and a huge number of
modern file system features, this is no great loss.

If we don't do this, the attempt to update the i_version field will
stomp over the translator block field, which will cause file system
corruption for Hurd file systems.  This can be replicated via:

mke2fs -t ext2 -o hurd /dev/vdc
mount -t ext4 /dev/vdc /vdc
touch /vdc/bug0000
umount /dev/vdc
e2fsck -f /dev/vdc

Addresses-Debian-Bug: #738758
Reported-By: Gabriele Giacone <1o5g4r8o@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

19 Mar, 2014 1 commit

ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate · b8a86845

Authored by Lukas Czerner on Mar 18, 2014

Introduce new FALLOC_FL_ZERO_RANGE flag for fallocate. This has the same
functionality as xfs ioctl XFS_IOC_ZERO_RANGE.

It can be used to convert a range of a file to zeros, preferably without
issuing data IO. Blocks should be preallocated for the regions that span
holes in the file, and the entire range is preferably converted to
unwritten extents.

This can also be used to preallocate blocks past EOF in the same way as
with fallocate. The FALLOC_FL_KEEP_SIZE flag should cause the inode
size to remain the same.

Also add appropriate tracepoints.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

04 Mar, 2014 1 commit

ext4: Speedup WB_SYNC_ALL pass called from sync(2) · 10542c22

Authored by Jan Kara on Mar 04, 2014

When doing filesystem wide sync, there's no need to force transaction
commit (or synchronously write inode buffer) separately for each inode
because ext4_sync_fs() takes care of forcing commit at the end (VFS
takes care of flushing buffer cache, respectively). Most of the time
this slowness doesn't manifest because previous WB_SYNC_NONE writeback
doesn't leave much to write but when there are processes aggressively
creating new files and several filesystems to sync, the sync slowness
can be noticeable. In the following test script sync(1) takes around 6
minutes when there are two ext4 filesystems mounted on a standard SATA
drive. After this patch sync takes a couple of seconds so we have about
two orders of magnitude improvement.

      function run_writers
      {
        for (( i = 0; i < 10; i++ )); do
          mkdir $1/dir$i
          for (( j = 0; j < 40000; j++ )); do
            dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
          done &
        done
      }

      for dir in "$@"; do
        run_writers $dir
      done

      sleep 40
      time sync

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

21 Feb, 2014 2 commits

ext4: avoid exposure of stale data in ext4_punch_hole() · e251f9bc

Authored by Maxim Patlasov on Feb 20, 2014

While handling punch-hole fallocate, it's useless to truncate page cache
before removing the range from extent tree (or block map in indirect case)
because page cache can be re-populated (by read-ahead or read(2) or mmap-ed
read) immediately after truncating page cache, but before updating extent
tree (or block map). In that case the user will see stale data even after
fallocate is completed.

Until the problem of data corruption resulting from pages backed by
already freed blocks is fully resolved, the simple thing we can do now
is to add another truncation of pagecache after punch hole is done.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>

ext4: avoid possible overflow in ext4_map_blocks() · e861b5e9

Authored by Theodore Ts'o on Feb 20, 2014

The ext4_map_blocks() function returns the number of blocks which
satisfy the caller's request. The number of blocks requested by
the caller is specified by an unsigned integer, but the return value
of ext4_map_blocks() is a signed integer (to accommodate error codes
per the kernel's standard error signalling convention).

Historically, overflows could never happen since mballoc() will refuse
to allocate more than 2048 blocks at a time (which is something we
should fix), and if the blocks were already allocated, the fact that
there would be some number of intervening metadata blocks pretty much
guaranteed that there could never be a contiguous region of data
blocks that was greater than 2**31 blocks.

However, this is now possible if there is a file system which is a bit
bigger than 8TB, and is created using the new mke2fs hugeblock
feature, which can create a perfectly contiguous file. In that case,
if a userspace program attempted to call fallocate() on this already
fully allocated file, it's possible that ext4_map_blocks() could
return a number large enough that it would overflow a signed integer,
resulting in ext4 thinking that the ext4_map_blocks() call had
failed with some strange error code.

Since ext4_map_blocks() is always free to return a smaller number of
blocks than what was requested by the caller, fix this by capping the
number of blocks that ext4_map_blocks() will ever try to map to 2**31
- 1. In practice this should never get hit, except by someone
deliberately trying to provoke the above-described bug.

Thanks to the PaX team for asking whether this could possibly happen
in some off-line discussions about using some static code checking
technology they are developing to find bugs in kernel code.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

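The overflow and the cap can be checked with a little arithmetic. This sketch assumes a 4 KiB block size and 32-bit two's-complement integers; the helper names `as_signed_int` and `capped_map_len` are hypothetical and only model the numbers involved.

```python
INT_MAX = 2**31 - 1


def as_signed_int(value):
    """Interpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def capped_map_len(requested_blocks):
    """Cap a request so the mapped count always fits in a signed int."""
    return min(requested_blocks, INT_MAX)


# A contiguous 8 TiB file with 4 KiB blocks (assumed) spans 2**31 blocks.
blocks = (8 * 2**40) // 4096
print(as_signed_int(blocks), as_signed_int(capped_map_len(blocks)))
```

Without the cap the block count reads back as a negative number, which the caller would take for an error code; with the cap the caller simply gets a shorter mapping and can ask again for the rest.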

26 Jan, 2014 1 commit

ext2/3/4: use generic posix ACL infrastructure · 64e178a7

Authored by Christoph Hellwig on Dec 20, 2013

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

08 Jan, 2014 1 commit

ext4: don't pass freed handle to ext4_walk_page_buffers · 8c9367fd

Authored by Theodore Ts'o on Jan 07, 2014

This is harmless, since ext4_walk_page_buffers only passes the handle
onto the callback function, and in this call site the function in
question, bput_one(), doesn't actually use the handle.  But there's no
point passing in an invalid handle, and it creates a Coverity warning,
so let's just clean it up.

Addresses-Coverity-Id: #1091168
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

07 Jan, 2014 2 commits

ext4: ext4_inode_is_fast_symlink should use EXT4_CLUSTER_SIZE · 65eddb56

Authored by Yongqiang Yang on Jan 06, 2014

Can be reproduced by xfstests 62 with bigalloc and 128bit size inode.

Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

ext4: enable punch hole for bigalloc · 9cb00419

Authored by Zheng Liu on Jan 06, 2014

Since commit d23142c6 was applied, ext4 has supported punch hole for
file systems with the bigalloc feature.  But we forgot to enable it.  This
commit fixes it.

Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>