提交 · 3cfb6772d4cf3875d199f5a5a547cadaef7c1382 · openanolis / cloud-kernel

13 7月, 2018 1 次提交

ext4: check for allocation block validity with block group locked · 8d5a803c

由 Theodore Ts'o 提交于 7月 12, 2018

With commit 044e6e3d: "ext4: don't update checksum of new
initialized bitmaps" the buffer valid bit will get set without
actually setting up the checksum for the allocation bitmap, since the
checksum will get calculated once we actually allocate an inode or
block.

If we are doing this, then we need to (re-)check the verified bit
after we take the block group lock.  Otherwise, we could race with
another process reading and verifying the bitmap, which would then
complain about the checksum being invalid.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1780137Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

8d5a803c

14 6月, 2018 2 次提交

ext4: only look at the bg_flags field if it is valid · 8844618d

由 Theodore Ts'o 提交于 6月 14, 2018

The bg_flags field in the block group descripts is only valid if the
uninit_bg or metadata_csum feature is enabled.  We were not
consistently looking at this field; fix this.

Also block group #0 must never have uninitialized allocation bitmaps,
or need to be zeroed, since that's where the root inode, and other
special inodes are set up.  Check for these conditions and mark the
file system as corrupted if they are detected.

This addresses CVE-2018-10876.

https://bugzilla.kernel.org/show_bug.cgi?id=199403Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

8844618d

ext4: always check block group bounds in ext4_init_block_bitmap() · 819b23f1

由 Theodore Ts'o 提交于 6月 13, 2018

Regardless of whether the flex_bg feature is set, we should always
check to make sure the bits we are setting in the block bitmap are
within the block group bounds.

https://bugzilla.kernel.org/show_bug.cgi?id=199865Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org

819b23f1

13 5月, 2018 1 次提交

ext4: mark block bitmap corrupted when found · 736dedbb

由 Wang Shilong 提交于 5月 12, 2018

There are still some cases that we missed to set
block bitmaps corrupted bit properly:

1) block bitmap number is wrong.
2) failed to read block bitmap due to disk errors.
3) double free block bitmaps..
4) some mismatch check with bitmaps vs buddy information.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Signed-off-by: NLiu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: NWang Shilong <wshilong@ddn.com>
Reviewed-by: NLiu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>

736dedbb

12 5月, 2018 1 次提交

ext4: add new ext4_mark_group_bitmap_corrupted() helper · db79e6d1

由 Wang Shilong 提交于 5月 12, 2018

Since there are many places to set inode/block bitmap
corrupt bit, add a new helper for it, which will make
codes more clear.
Signed-off-by: NWang Shilong <wshilong@ddn.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>

db79e6d1

24 4月, 2018 1 次提交

ext4: fix bitmap position validation · 22be37ac

由 Lukas Czerner 提交于 4月 24, 2018

Currently in ext4_valid_block_bitmap() we expect the bitmap to be
positioned anywhere between 0 and s_blocksize clusters, but that's
wrong because the bitmap can be placed anywhere in the block group. This
causes false positives when validating bitmaps on perfectly valid file
system layouts. Fix it by checking whether the bitmap is within the group
boundary.

The problem can be reproduced using the following

mkfs -t ext3 -E stride=256 /dev/vdb1
mount /dev/vdb1 /mnt/test
cd /mnt/test
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.3.tar.xz
tar xf linux-4.16.3.tar.xz

This will result in the warnings in the logs

EXT4-fs error (device vdb1): ext4_validate_block_bitmap:399: comm tar: bg 84: block 2774529: invalid block bitmap

[ Changed slightly for clarity and to not drop a overflow test -- TYT ]
Signed-off-by: NLukas Czerner <lczerner@redhat.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reported-by: NIlya Dryomov <idryomov@gmail.com>
Fixes: 7dac4a17 ("ext4: add validity checks for bitmap block numbers")
Cc: stable@vger.kernel.org

22be37ac

27 3月, 2018 1 次提交

ext4: add validity checks for bitmap block numbers · 7dac4a17

由 Theodore Ts'o 提交于 3月 26, 2018

An privileged attacker can cause a crash by mounting a crafted ext4
image which triggers a out-of-bounds read in the function
ext4_valid_block_bitmap() in fs/ext4/balloc.c.

This issue has been assigned CVE-2018-1093.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199181
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1560782Reported-by: NWen Xu <wen.xu@gatech.edu>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

7dac4a17

20 2月, 2018 1 次提交

ext4: don't update checksum of new initialized bitmaps · 044e6e3d

由 Theodore Ts'o 提交于 2月 19, 2018

When reading the inode or block allocation bitmap, if the bitmap needs
to be initialized, do not update the checksum in the block group
descriptor.  That's because we're not set up to journal those changes.
Instead, just set the verified bit on the bitmap block, so that it's
not necessary to validate the checksum.

When a block or inode allocation actually happens, at that point the
checksum will be calculated, and update of the bg descriptor block
will be properly journalled.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

044e6e3d

12 1月, 2018 1 次提交

ext4: use 'sbi' instead of 'EXT4_SB(sb)' · 49598e04

由 Jun Piao 提交于 1月 11, 2018

We could use 'sbi' instead of 'EXT4_SB(sb)' to make code more elegant.
Signed-off-by: NJun Piao <piaojun@huawei.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

49598e04

02 11月, 2017 1 次提交

License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318

由 Greg Kroah-Hartman 提交于 11月 01, 2017

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

b2441318

02 10月, 2017 1 次提交

ext4: retry allocations conservatively · 68fd9750

由 Theodore Ts'o 提交于 10月 01, 2017

Now that we no longer try to reserve metadata blocks for delayed
allocations (which tended to overestimate the required number of
blocks significantly), we really don't need retry allocations when the
disk is very full as aggressively any more.

The only time when it makes sense to retry an allocation is if we have
freshly deleted blocks that will only become available after a
transaction commit.  And if we lose that race, it's not worth it to
try more than once.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

68fd9750

06 7月, 2016 1 次提交

ext4: validate s_reserved_gdt_blocks on mount · 5b9554dc

由 Theodore Ts'o 提交于 7月 05, 2016

If s_reserved_gdt_blocks is extremely large, it's possible for
ext4_init_block_bitmap(), which is called when ext4 sets up an
uninitialized block bitmap, to corrupt random kernel memory.  Add the
same checks which e2fsck has --- it must never be larger than
blocksize / sizeof(__u32) --- and then add a backup check in
ext4_init_block_bitmap() in case the superblock gets modified after
the file system is mounted.
Reported-by: NVegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

5b9554dc

27 6月, 2016 1 次提交

ext4: optimize ext4_should_retry_alloc() to improve ENOSPC performance · d08854f5

由 Theodore Ts'o 提交于 6月 26, 2016

If there are no pending blocks to be released after a commit, forcing
a journal commit has no hope of helping.  It's possible that a commit
had just completed, so if there are now free blocks available for
allocation, it's worth retrying the commit.
Reported-by: NChao Yu <yuchao0@huawei.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

d08854f5

08 6月, 2016 1 次提交

fs: have submit_bh users pass in op and flags separately · 2a222ca9

由 Mike Christie 提交于 6月 05, 2016

This has submit_bh users pass in the operation and flags separately,
so submit_bh_wbc can setup the bio op and bi_rw flags on the bio that
is submitted.
Signed-off-by: NMike Christie <mchristi@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

2a222ca9

13 5月, 2016 1 次提交

ext4: fix race in transient ENOSPC detection · dbc427ce

由 Jan Kara 提交于 5月 13, 2016

When there are blocks to free in the running transaction, block
allocator can return ENOSPC although the filesystem has some blocks to
free. We use ext4_should_retry_alloc() to force commit of the current
transaction and return whether anything was committed so that it makes
sense to retry the allocation. However the transaction may get committed
after block allocation fails but before we call
ext4_should_retry_alloc(). So ext4_should_retry_alloc() returns false
because there is nothing to commit and we wrongly return ENOSPC.

Fix the race by unconditionally returning 1 from ext4_should_retry_alloc()
when we tried to commit a transaction. This should not add any
unnecessary retries since we had a transaction running a while ago when
trying to allocate blocks and we want to retry the allocation once that
transaction has committed anyway.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

dbc427ce

12 2月, 2016 1 次提交

ext4: fix scheduling in atomic on group checksum failure · 05145bd7

由 Jan Kara 提交于 2月 11, 2016

When block group checksum is wrong, we call ext4_error() while holding
group spinlock from ext4_init_block_bitmap() or
ext4_init_inode_bitmap() which results in scheduling while in atomic.
Fix the issue by calling ext4_error() later after dropping the spinlock.

CC: stable@vger.kernel.org
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>

05145bd7

18 10月, 2015 3 次提交

ext4: make the bitmap read routines return real error codes · 9008a58e

由 Darrick J. Wong 提交于 10月 17, 2015

Make the bitmap reaading routines return real error codes (EIO,
EFSCORRUPTED, EFSBADCRC) which can then be reflected back to
userspace for more precise diagnosis work.

In particular, this means that mballoc no longer claims that we're out
of memory if the block bitmaps become corrupt.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

9008a58e

ext4: clean up feature test macros with predicate functions · e2b911c5

由 Darrick J. Wong 提交于 10月 17, 2015

Create separate predicate functions to test/set/clear feature flags,
thereby replacing the wordy old macros.  Furthermore, clean out the
places where we open-coded feature tests.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

e2b911c5

ext4: call out CRC and corruption errors with specific error codes · 6a797d27

由 Darrick J. Wong 提交于 10月 17, 2015

Instead of overloading EIO for CRC errors and corrupt structures,
return the same error codes that XFS returns for the same issues.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

6a797d27

08 6月, 2015 1 次提交

ext4: verify block bitmap even after fresh initialization · 41e5b7ed

由 Lukas Czerner 提交于 6月 08, 2015

If we want to rely on the buffer_verified() flag of the block bitmap
buffer, we have to set it consistently. However currently if we're
initializing uninitialized block bitmap in
ext4_read_block_bitmap_nowait() we're not going to set buffer verified
at all.

We can do this by simply setting the flag on the buffer, but I think
it's actually better to run ext4_validate_block_bitmap() to make sure
that what we did in the ext4_init_block_bitmap() is right.

So run ext4_validate_block_bitmap() even after the block bitmap
initialization. Also bail out early from ext4_validate_block_bitmap() if
we see corrupt bitmap, since we already know it's corrupt and we do not
need to verify that.
Signed-off-by: NLukas Czerner <lczerner@redhat.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

41e5b7ed

03 4月, 2015 2 次提交

ext4: remove unnecessary lock/unlock of i_block_reservation_lock · 5a4f3145

由 Maurizio Lombardi 提交于 4月 03, 2015

This is a leftover of commit 71d4f7d0Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NLukas Czerner <lczerner@redhat.com>

5a4f3145

ext4: remove unused header files · 72b8e0f9

由 Sheng Yong 提交于 4月 02, 2015

Remove unused header files and header files which are included in
ext4.h.
Signed-off-by: NSheng Yong <shengyong1@huawei.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

72b8e0f9

13 10月, 2014 1 次提交

ext4: move error report out of atomic context in ext4_init_block_bitmap() · aef4885a

由 Dmitry Monakhov 提交于 10月 13, 2014

Error report likely result in IO so it is bad idea to do it from
atomic context.

This patch should fix following issue:

BUG: sleeping function called from invalid context at include/linux/buffer_head.h:349
in_atomic(): 1, irqs_disabled(): 0, pid: 137, name: kworker/u128:1
5 locks held by kworker/u128:1/137:
 #0:  ("writeback"){......}, at: [<ffffffff81085618>] process_one_work+0x228/0x4d0
 #1:  ((&(&wb->dwork)->work)){......}, at: [<ffffffff81085618>] process_one_work+0x228/0x4d0
 #2:  (jbd2_handle){......}, at: [<ffffffff81242622>] start_this_handle+0x712/0x7b0
 #3:  (&ei->i_data_sem){......}, at: [<ffffffff811fa387>] ext4_map_blocks+0x297/0x430
 #4:  (&(&bgl->locks[i].lock)->rlock){......}, at: [<ffffffff811f3180>] ext4_read_block_bitmap_nowait+0x5d0/0x630
CPU: 3 PID: 137 Comm: kworker/u128:1 Not tainted 3.17.0-rc2-00184-g82752e4 #165
Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
Workqueue: writeback bdi_writeback_workfn (flush-1:0)
 0000000000000411 ffff880813777288 ffffffff815c7fdc ffff880813777288
 ffff880813a8bba0 ffff8808137772a8 ffffffff8108fb30 ffff880803e01e38
 ffff880803e01e38 ffff8808137772c8 ffffffff811a8d53 ffff88080ecc6000
Call Trace:
 [<ffffffff815c7fdc>] dump_stack+0x51/0x6d
 [<ffffffff8108fb30>] __might_sleep+0xf0/0x100
 [<ffffffff811a8d53>] __sync_dirty_buffer+0x43/0xe0
 [<ffffffff811a8e03>] sync_dirty_buffer+0x13/0x20
 [<ffffffff8120f581>] ext4_commit_super+0x1d1/0x230
 [<ffffffff8120fa03>] save_error_info+0x23/0x30
 [<ffffffff8120fd06>] __ext4_error+0xb6/0xd0
 [<ffffffff8120f260>] ? ext4_group_desc_csum+0x140/0x190
 [<ffffffff811f2d8c>] ext4_read_block_bitmap_nowait+0x1dc/0x630
 [<ffffffff8122e23a>] ext4_mb_init_cache+0x21a/0x8f0
 [<ffffffff8113ae95>] ? lru_cache_add+0x55/0x60
 [<ffffffff8112e16c>] ? add_to_page_cache_lru+0x6c/0x80
 [<ffffffff8122eaa0>] ext4_mb_init_group+0x190/0x280
 [<ffffffff8122ec51>] ext4_mb_good_group+0xc1/0x190
 [<ffffffff8123309a>] ext4_mb_regular_allocator+0x17a/0x410
 [<ffffffff8122c821>] ? ext4_mb_use_preallocated+0x31/0x380
 [<ffffffff81233535>] ? ext4_mb_new_blocks+0x205/0x8e0
 [<ffffffff8116ed5c>] ? kmem_cache_alloc+0xfc/0x180
 [<ffffffff812335b0>] ext4_mb_new_blocks+0x280/0x8e0
 [<ffffffff8116f2c4>] ? __kmalloc+0x144/0x1c0
 [<ffffffff81221797>] ? ext4_find_extent+0x97/0x320
 [<ffffffff812257f4>] ext4_ext_map_blocks+0xbc4/0x1050
 [<ffffffff811fa387>] ? ext4_map_blocks+0x297/0x430
 [<ffffffff811fa3ab>] ext4_map_blocks+0x2bb/0x430
 [<ffffffff81200e43>] ? ext4_init_io_end+0x23/0x50
 [<ffffffff811feb44>] ext4_writepages+0x564/0xaf0
 [<ffffffff815cde3b>] ? _raw_spin_unlock+0x2b/0x40
 [<ffffffff810ac7bd>] ? lock_release_non_nested+0x2fd/0x3c0
 [<ffffffff811a009e>] ? writeback_sb_inodes+0x10e/0x490
 [<ffffffff811a009e>] ? writeback_sb_inodes+0x10e/0x490
 [<ffffffff811377e3>] do_writepages+0x23/0x40
 [<ffffffff8119c8ce>] __writeback_single_inode+0x9e/0x280
 [<ffffffff811a026b>] writeback_sb_inodes+0x2db/0x490
 [<ffffffff811a0664>] wb_writeback+0x174/0x2d0
 [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
 [<ffffffff811a0863>] wb_do_writeback+0xa3/0x200
 [<ffffffff811a0a40>] bdi_writeback_workfn+0x80/0x230
 [<ffffffff81085618>] ? process_one_work+0x228/0x4d0
 [<ffffffff810856cd>] process_one_work+0x2dd/0x4d0
 [<ffffffff81085618>] ? process_one_work+0x228/0x4d0
 [<ffffffff81085c1d>] worker_thread+0x35d/0x460
 [<ffffffff810858c0>] ? process_one_work+0x4d0/0x4d0
 [<ffffffff810858c0>] ? process_one_work+0x4d0/0x4d0
 [<ffffffff8108a885>] kthread+0xf5/0x100
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff8108a790>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff815ce2ac>] ret_from_fork+0x7c/0xb0
 [<ffffffff8108a790>] ? __init_kthread_work
Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

aef4885a

05 9月, 2014 1 次提交

ext4: prepare to drop EXT4_STATE_DELALLOC_RESERVED · e3cf5d5d

由 Theodore Ts'o 提交于 9月 04, 2014

The EXT4_STATE_DELALLOC_RESERVED flag was originally implemented
because it was too hard to make sure the mballoc and get_block flags
could be reliably passed down through all of the codepaths that end up
calling ext4_mb_new_blocks().

Since then, we have mb_flags passed down through most of the code
paths, so getting rid of EXT4_STATE_DELALLOC_RESERVED isn't as tricky
as it used to.

This commit plumbs in the last of what is required, and then adds a
WARN_ON check to make sure we haven't missed anything.  If this passes
a full regression test run, we can then drop
EXT4_STATE_DELALLOC_RESERVED.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

e3cf5d5d

15 7月, 2014 1 次提交

ext4: remove metadata reservation checks · 71d4f7d0

由 Theodore Ts'o 提交于 7月 15, 2014

Commit 27dd4385 ("ext4: introduce reserved space") reserves 2% of
the file system space to make sure metadata allocations will always
succeed.  Given that, tracking the reservation of metadata blocks is
no longer necessary.
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

71d4f7d0

26 6月, 2014 1 次提交

ext4: decrement free clusters/inodes counters when block group declared bad · e43bb4e6

由 Namjae Jeon 提交于 6月 26, 2014

We should decrement free clusters counter when block bitmap is marked
as corrupt and free inodes counter when the allocation bitmap is
marked as corrupt to avoid misunderstanding due to incorrect available
size in statfs result.  User can get immediately ENOSPC error from
write begin without reaching for the writepages.

Cc: Darrick J. Wong<darrick.wong@oracle.com>
Reported-by: NAmit Sahrawat <amit.sahrawat83@gmail.com>
Signed-off-by: NNamjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: NAshish Sangwan <a.sangwan@samsung.com>

e43bb4e6

12 5月, 2014 3 次提交

ext4: make local functions static · c197855e

由 Stephen Hemminger 提交于 5月 12, 2014

I have been running make namespacecheck to look for unneeded globals, and
found these in ext4.
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

c197855e

ext4: fix block bitmap validation when bigalloc, ^flex_bg · e674e5cb

由 Darrick J. Wong 提交于 5月 12, 2014

On a bigalloc,^flex_bg filesystem, the ext4_valid_block_bitmap
function fails to convert from blocks to clusters when spot-checking
the validity of the bitmap block that we've just read from disk. This
causes ext4 to think that the bitmap is garbage, which results in the
block group being taken offline when it's not necessary. Add in the
necessary EXT4_B2C() calls to perform the conversions.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

e674e5cb

ext4: fix block bitmap initialization under sparse_super2 · 1beeef1b

由 Darrick J. Wong 提交于 5月 12, 2014

The ext4_bg_has_super() function doesn't know about the new rules for
where backup superblocks go on a sparse_super2 filesystem.  Therefore,
block bitmap initialization doesn't know that it shouldn't reserve
space for backups in groups that are never going to contain backups.
The result of this is e2fsck complaining about the block bitmap being
incorrect (fortunately not in a way that results in cross-linked
files), so fix the whole thing.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

1beeef1b

15 4月, 2014 1 次提交

ext4: fix ext4_count_free_clusters() with EXT4FS_DEBUG and bigalloc enabled · 036acea2

由 Azat Khuzhin 提交于 4月 14, 2014

With bigalloc enabled we must use EXT4_CLUSTERS_PER_GROUP() instead of
EXT4_BLOCKS_PER_GROUP() otherwise we will go beyond the allocated buffer.

$ mount -t ext4 /dev/vde /vde
[   70.573993] EXT4-fs DEBUG (fs/ext4/mballoc.c, 2346): ext4_mb_alloc_groupinfo:
[   70.575174] allocated s_groupinfo array for 1 meta_bg's
[   70.576172] EXT4-fs DEBUG (fs/ext4/super.c, 2092): ext4_check_descriptors:
[   70.576972] Checking group descriptorsBUG: unable to handle kernel paging request at ffff88006ab56000
[   72.463686] IP: [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f
[   72.464168] PGD 295e067 PUD 2961067 PMD 7fa8e067 PTE 800000006ab56060
[   72.464738] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[   72.465139] Modules linked in:
[   72.465402] CPU: 1 PID: 3560 Comm: mount Tainted: G        W    3.14.0-rc2-00069-ge57bce1 #60
[   72.466079] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   72.466505] task: ffff88007ce6c8a0 ti: ffff88006b7f0000 task.ti: ffff88006b7f0000
[   72.466505] RIP: 0010:[<ffffffff81394eb9>]  [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f
[   72.466505] RSP: 0018:ffff88006b7f1c00  EFLAGS: 00010206
[   72.466505] RAX: 0000000000000000 RBX: 000000000000050a RCX: 0000000000000040
[   72.466505] RDX: 0000000000000000 RSI: 0000000000080000 RDI: 0000000000000000
[   72.466505] RBP: ffff88006b7f1c28 R08: 0000000000000002 R09: 0000000000000000
[   72.466505] R10: 000000000000babe R11: 0000000000000400 R12: 0000000000080000
[   72.466505] R13: 0000000000000200 R14: 0000000000002000 R15: ffff88006ab55000
[   72.466505] FS:  00007f43ba1fa840(0000) GS:ffff88007f800000(0000) knlGS:0000000000000000
[   72.466505] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   72.466505] CR2: ffff88006ab56000 CR3: 000000006b7e6000 CR4: 00000000000006e0
[   72.466505] Stack:
[   72.466505]  ffff88006ab65000 0000000000000000 0000000000000000 0000000000010000
[   72.466505]  ffff88006ab6f400 ffff88006b7f1c58 ffffffff81396bb8 0000000000010000
[   72.466505]  0000000000000000 ffff88007b869a90 ffff88006a48a000 ffff88006b7f1c70
[   72.466505] Call Trace:
[   72.466505]  [<ffffffff81396bb8>] memweight+0x5f/0x8a
[   72.466505]  [<ffffffff811c3b19>] ext4_count_free+0x13/0x21
[   72.466505]  [<ffffffff811c396c>] ext4_count_free_clusters+0xdb/0x171
[   72.466505]  [<ffffffff811e3bdd>] ext4_fill_super+0x117c/0x28ef
[   72.466505]  [<ffffffff81391569>] ? vsnprintf+0x1c7/0x3f7
[   72.466505]  [<ffffffff8114d8dc>] mount_bdev+0x145/0x19c
[   72.466505]  [<ffffffff811e2a61>] ? ext4_calculate_overhead+0x2a1/0x2a1
[   72.466505]  [<ffffffff811dab1d>] ext4_mount+0x15/0x17
[   72.466505]  [<ffffffff8114e3aa>] mount_fs+0x67/0x150
[   72.466505]  [<ffffffff811637ea>] vfs_kern_mount+0x64/0xde
[   72.466505]  [<ffffffff81165d19>] do_mount+0x6fe/0x7f5
[   72.466505]  [<ffffffff81126cc8>] ? strndup_user+0x3a/0xd9
[   72.466505]  [<ffffffff8116604b>] SyS_mount+0x85/0xbe
[   72.466505]  [<ffffffff81619e90>] tracesys+0xdd/0xe2
[   72.466505] Code: c3 89 f0 b9 40 00 00 00 55 99 48 89 e5 41 57 f7 f9 41 56 49 89 ff 41 55 45 31 ed 41 54 41 89 f4 53 31 db 41 89 c6 45 39 ee 7e 10 <4b> 8b 3c ef 49 ff c5 e8 bf ff ff ff 01 c3 eb eb 31 c0 45 85 f6
[   72.466505] RIP  [<ffffffff81394eb9>] __bitmap_weight+0x2a/0x7f
[   72.466505]  RSP <ffff88006b7f1c00>
[   72.466505] CR2: ffff88006ab56000
[   72.466505] ---[ end trace 7d051a08ae138573 ]---
Killed
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

036acea2

31 10月, 2013 1 次提交

ext4: don't count free clusters from a corrupt block group · 2746f7a1

由 Darrick J. Wong 提交于 10月 31, 2013

A bg that's been flagged "corrupt" by definition has no free blocks,
so that the allocator won't be tempted to use the damaged bg.
Therefore, we shouldn't count the clusters in the damaged group when
calculating free counts.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>

2746f7a1

29 8月, 2013 4 次提交

ext4: mark group corrupt on group descriptor checksum · bdfb6ff4

由 Darrick J. Wong 提交于 8月 28, 2013

If the group descriptor fails validation, mark the whole blockgroup
corrupt so that the inode/block allocators skip this group.  The
previous approach takes the risk of writing to a damaged group
descriptor; hopefully it was never the case that the [ib]bitmap fields
pointed to another valid block and got dirtied, since the memset would
fill the page with 1s.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

bdfb6ff4

ext4: mark block group as corrupt on block bitmap error · 163a203d

由 Darrick J. Wong 提交于 8月 28, 2013

When we notice a block-bitmap corruption (because of device failure or
something else), we should mark this group as corrupt and prevent
further block allocations/deallocations from it. Currently, we end up
generating one error message for every block in the bitmap. This
potentially could make the system unstable as noticed in some
bugs. With this patch, the error will be printed only the first time
and mark the entire block group as corrupted. This prevents future
access allocations/deallocations from it.

Also tested by corrupting the block
bitmap and forcefully introducing the mb_free_blocks error:
(1) create a largefile (2Gb)
$ dd if=/dev/zero of=largefile oflag=direct bs=10485760 count=200
(2) umount filesystem. use dumpe2fs to see which block-bitmaps
are in use by largefile and note their block numbers
(3) use dd to zero-out the used block bitmaps
$ dd if=/dev/zero of=/dev/hdc4 bs=4096 seek=14 count=8 oflag=direct
(4) mount the FS and delete the largefile.
(5) recreate the largefile. verify that the new largefile does not
get any blocks from the groups marked as bad.
Without the patch, we will see mb_free_blocks error for each bit in
each zero'ed out bitmap at (4). With the patch, we only see the error
once per blockgroup:
[  309.706803] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 15: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
[  309.720824] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 14: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
[  309.732858] EXT4-fs error (device sdb4) in ext4_free_blocks:4802: IO failure
[  309.748321] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 13: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
[  309.760331] EXT4-fs error (device sdb4) in ext4_free_blocks:4802: IO failure
[  309.769695] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 12: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
[  309.781721] EXT4-fs error (device sdb4) in ext4_free_blocks:4802: IO failure
[  309.798166] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 11: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.
[  309.810184] EXT4-fs error (device sdb4) in ext4_free_blocks:4802: IO failure
[  309.819532] EXT4-fs error (device sdb4): ext4_mb_generate_buddy:735: group 10: 32768 clusters in bitmap, 0 in gd. blk grp corrupted.

Google-Bug-Id: 7258357

[darrick.wong@oracle.com]
Further modifications (by Darrick) to make more obvious that this corruption
bit applies to blocks only.  Set the corruption flag if the block group bitmap
verification fails.

Original-author: Aditya Kali <adityakali@google.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

163a203d

ext4: fix type declaration of ext4_validate_block_bitmap · dbde0abe

由 Darrick J. Wong 提交于 8月 28, 2013

The block_group parameter to ext4_validate_block_bitmap is both used
as a ext4_group_t inside the function and the same type is passed in
by all callers.  We might as well use the typedef consistently instead
of open-coding the 'unsigned int'.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

dbde0abe

ext4: error out if verifying the block bitmap fails · 48d9eb97

由 Darrick J. Wong 提交于 8月 28, 2013

The block bitmap verification code assumes that calling ext4_error()
either panics the system or makes the fs readonly.  However, this is
not always true: when 'errors=continue' is specified, an error is
printed but we don't return any indication of error to the caller,
which is (probably) the block allocator, which pretends that the crud
we read in off the disk is a usable bitmap.  Yuck.

A block bitmap that fails the check should at least return no bitmap
to the caller.  The block allocator should be told to go look in a
different group, but that's a separate issue.

The easiest way to reproduce this is to modify bg_block_bitmap (on a
^flex_bg fs) to point to a block outside the block group; or you can
create a metadata_csum filesystem and zero out the block bitmaps.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

48d9eb97

06 7月, 2013 1 次提交

ext4: fix ext4_get_group_number() · 960fd856

由 Theodore Ts'o 提交于 7月 05, 2013

The function ext4_get_group_number() was introduced as an optimization
in commit bd86298e.  Unfortunately, this commit incorrectly
calculate the group number for file systems with a 1k block size (when
s_first_data_block is 1 instead of zero).  This could cause the
following kernel BUG:

[  568.877799] ------------[ cut here ]------------
[  568.877833] kernel BUG at fs/ext4/mballoc.c:3728!
[  568.877840] Oops: Exception in kernel mode, sig: 5 [#1]
[  568.877845] SMP NR_CPUS=32 NUMA pSeries
[  568.877852] Modules linked in: binfmt_misc
[  568.877861] CPU: 1 PID: 3516 Comm: fs_mark Not tainted 3.10.0-03216-g7c6809ff-dirty #1
[  568.877867] task: c0000001fb0b8000 ti: c0000001fa954000 task.ti: c0000001fa954000
[  568.877873] NIP: c0000000002f42a4 LR: c0000000002f4274 CTR: c000000000317ef8
[  568.877879] REGS: c0000001fa956ed0 TRAP: 0700   Not tainted  (3.10.0-03216-g7c6809ff-dirty)
[  568.877884] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 24000428  XER: 00000000
[  568.877902] SOFTE: 1
[  568.877905] CFAR: c0000000002b5464
[  568.877908]
GPR00: 0000000000000001 c0000001fa957150 c000000000c6a408 c0000001fb588000
GPR04: 0000000000003fff c0000001fa9571c0 c0000001fa9571c4 000138098c50625f
GPR08: 1301200000000000 0000000000000002 0000000000000001 0000000000000000
GPR12: 0000000024000422 c00000000f33a300 0000000000008000 c0000001fa9577f0
GPR16: c0000001fb7d0100 c000000000c29190 c0000000007f46e8 c000000000a14672
GPR20: 0000000000000001 0000000000000008 ffffffffffffffff 0000000000000000
GPR24: 0000000000000100 c0000001fa957278 c0000001fdb2bc78 c0000001fa957288
GPR28: 0000000000100100 c0000001fa957288 c0000001fb588000 c0000001fdb2bd10
[  568.877993] NIP [c0000000002f42a4] .ext4_mb_release_group_pa+0xec/0x1c0
[  568.877999] LR [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0
[  568.878004] Call Trace:
[  568.878008] [c0000001fa957150] [c0000000002f4274] .ext4_mb_release_group_pa+0xbc/0x1c0 (unreliable)
[  568.878017] [c0000001fa957200] [c0000000002fb070] .ext4_mb_discard_lg_preallocations+0x394/0x444
[  568.878025] [c0000001fa957340] [c0000000002fb45c] .ext4_mb_release_context+0x33c/0x734
[  568.878032] [c0000001fa957440] [c0000000002fbcf8] .ext4_mb_new_blocks+0x4a4/0x5f4
[  568.878039] [c0000001fa957510] [c0000000002ef56c] .ext4_ext_map_blocks+0xc28/0x1178
[  568.878047] [c0000001fa957640] [c0000000002c1a94] .ext4_map_blocks+0x2c8/0x490
[  568.878054] [c0000001fa957730] [c0000000002c536c] .ext4_writepages+0x738/0xc60
[  568.878062] [c0000001fa957950] [c000000000168a78] .do_writepages+0x5c/0x80
[  568.878069] [c0000001fa9579d0] [c00000000015d1c4] .__filemap_fdatawrite_range+0x88/0xb0
[  568.878078] [c0000001fa957aa0] [c00000000015d23c] .filemap_write_and_wait_range+0x50/0xfc
[  568.878085] [c0000001fa957b30] [c0000000002b8edc] .ext4_sync_file+0x220/0x3c4
[  568.878092] [c0000001fa957be0] [c0000000001f849c] .vfs_fsync_range+0x64/0x80
[  568.878098] [c0000001fa957c70] [c0000000001f84f0] .vfs_fsync+0x38/0x4c
[  568.878105] [c0000001fa957d00] [c0000000001f87f4] .do_fsync+0x54/0x90
[  568.878111] [c0000001fa957db0] [c0000000001f8894] .SyS_fsync+0x28/0x3c
[  568.878120] [c0000001fa957e30] [c000000000009c88] syscall_exit+0x0/0x7c
[  568.878125] Instruction dump:
[  568.878130] 60000000 813d0034 81610070 38000000 7f8b4800 419e001c 813f007c 7d2bfe70
[  568.878144] 7d604a78 7c005850 54000ffe 7c0007b4 <0b000000> e8a10076 e87f0090 7fa4eb78
[  568.878160] ---[ end trace 594d911d9654770b ]---

In addition fix the STD_GROUP optimization so that it works for
bigalloc file systems as well.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reported-by: NLi Zhong <lizhongfs@gmail.com>
Reviewed-by: NLukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org  # 3.10

960fd856

06 6月, 2013 1 次提交

ext4: optimize test_root() · f4afb4f4

由 Theodore Ts'o 提交于 6月 06, 2013

The test_root() function could potentially loop forever due to
overflow issues.  So rewrite test_root() to avoid this issue; as a
bonus, it is 38% faster when benchmarked via a test loop:

int main(int argc, char **argv)
{
	int  i;

	for (i = 0; i < 1 << 24; i++) {
		if (test_root(i, 7))
			printf("%d\n", i);
	}
}
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f4afb4f4

21 4月, 2013 1 次提交

ext4: mark all metadata I/O with REQ_META · 9f203507

由 Theodore Ts'o 提交于 4月 20, 2013

As Dave Chinner pointed out at the 2013 LSF/MM workshop, it's
important that metadata I/O requests are marked as such to avoid
priority inversions caused by I/O bandwidth throttling.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

9f203507

10 4月, 2013 1 次提交

ext4: introduce reserved space · 27dd4385

由 Lukas Czerner 提交于 4月 09, 2013

Currently in ENOSPC condition when writing into unwritten space, or
punching a hole, we might need to split the extent and grow extent tree.
However since we can not allocate any new metadata blocks we'll have to
zero out unwritten part of extent or punched out part of extent, or in
the worst case return ENOSPC even though use actually does not allocate
any space.

Also in delalloc path we do reserve metadata and data blocks for the
time we're going to write out, however metadata block reservation is
very tricky especially since we expect that logical connectivity implies
physical connectivity, however that might not be the case and hence we
might end up allocating more metadata blocks than previously reserved.
So in future, metadata reservation checks should be removed since we can
not assure that we do not under reserve.

And this is where reserved space comes into the picture. When mounting
the file system we slice off a little bit of the file system space (2%
or 4096 clusters, whichever is smaller) which can be then used for the
cases mentioned above to prevent costly zeroout, or unexpected ENOSPC.

The number of reserved clusters can be set via sysfs, however it can
never be bigger than number of free clusters in the file system.

Note that this patch fixes the failure of xfstest 274 as expected.
Signed-off-by: NLukas Czerner <lczerner@redhat.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>

27dd4385

04 4月, 2013 1 次提交

ext4: introduce ext4_get_group_number() · bd86298e

由 Lukas Czerner 提交于 4月 03, 2013

Currently on many places in ext4 we're using
ext4_get_group_no_and_offset() even though we're only interested in
knowing the block group of the particular block, not the offset within
the block group so we can use more efficient way to compute block
group.

This patch introduces ext4_get_group_number() which computes block
group for a given block much more efficiently. Use this function
instead of ext4_get_group_no_and_offset() everywhere where we're only
interested in knowing the block group.
Signed-off-by: NLukas Czerner <lczerner@redhat.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

bd86298e

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功