Commits · a34eb503742fd25155fd6cff6163daacead9fbc3 · openeuler / Kernel

27 Jul, 2013 1 commit

ext4: make sure group number is bumped after a inode allocation race · a34eb503

Committed by Theodore Ts'o on Jul 26, 2013

When we try to allocate an inode, and there is a race between two
CPUs trying to grab the same inode, _and_ this inode is the last free
inode in the block group, make sure the group number is bumped before
we continue searching the rest of the block groups.  Otherwise, we end
up searching the current block group twice, and we end up skipping
searching the last block group.  So in the unlikely situation where
almost all of the inodes are allocated, it's possible that we will
return ENOSPC even though there might be free inodes in that last
block group.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

a34eb503
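The effect of the missing bump can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not the kernel code: `scan_groups`, its parameters, and the per-group free counters are hypothetical names invented here. The model makes one pass per block group and lets another CPU win the race for the last free inode in `race_group`.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the block-group scan (hypothetical, not ext4 code).
 * free_inodes[g] is the number of free inodes in group g. In race_group,
 * another CPU wins the race for the last free inode. Returns the group
 * that satisfied the allocation, or -1 as a stand-in for ENOSPC.
 */
int scan_groups(int *free_inodes, int ngroups, int start,
                int race_group, bool bump_after_race)
{
    int group = start;
    bool race_pending = true;

    assert(ngroups > 0 && start >= 0 && start < ngroups);

    for (int i = 0; i < ngroups; i++) {
        if (race_pending && group == race_group && free_inodes[group] == 1) {
            free_inodes[group] = 0;      /* the other CPU took the inode */
            race_pending = false;
            if (bump_after_race)         /* behaviour described by the fix */
                group = (group + 1) % ngroups;
            continue;                    /* this pass is used up either way */
        }
        if (free_inodes[group] > 0) {
            free_inodes[group]--;
            return group;
        }
        group = (group + 1) % ngroups;
    }
    return -1;
}
```

With three groups holding `{0, 1, 1}` free inodes and the race in group 1, the model without the bump visits group 1 twice and never reaches group 2, so it reports no space; with the bump it allocates from group 2.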

05 Jun, 2013 1 commit

ext4: provide wrappers for transaction reservation calls · 5fe2fe89

Committed by Jan Kara on Jun 04, 2013

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5fe2fe89

21 Apr, 2013 1 commit

ext4: mark all metadata I/O with REQ_META · 9f203507

Committed by Theodore Ts'o on Apr 20, 2013

As Dave Chinner pointed out at the 2013 LSF/MM workshop, it's
important that metadata I/O requests are marked as such to avoid
priority inversions caused by I/O bandwidth throttling.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9f203507

20 Apr, 2013 1 commit

ext4: move quota initialization out of inode allocation transaction · eb9cc7e1

Committed by Jan Kara on Apr 19, 2013

Inode allocation transaction is pretty heavy (246 credits with quotas
and extents before previous patch, still around 200 after it).  This is
mostly due to credits required for allocation of quota structures
(credits there are heavily overestimated but it's difficult to make
better estimates if we don't want to wire non-trivial assumptions about
quota format into filesystem).

So move quota initialization out of allocation transaction. That way
transaction for quota structure allocation will be started only if we
need to look up quota structure on disk (rare) and furthermore it will
be started for each quota type separately, not for all of them at once.
This reduces maximum transaction size to 34 in most cases and to 73 in
the worst case.

[ Modified by tytso to clean up the cleanup paths for error handling.
  Also use a separate call to ext4_std_error() for each failure so it
  is easier for someone who is debugging a problem in this function to
  determine which function call failed. ]
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

eb9cc7e1

10 Apr, 2013 1 commit

ext4: fix usless declarations · 8c8e0ca6

Committed by Dmitry Monakhov on Apr 09, 2013

This patch should fix sparse complaints about shadow declarations.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8c8e0ca6

12 Mar, 2013 1 commit

ext4: use atomic64_t for the per-flexbg free_clusters count · 90ba983f

Committed by Theodore Ts'o on Mar 11, 2013

A user who was using an 8TB+ file system and with a very large flexbg
size (> 65536) could cause the atomic_t used in the struct flex_groups
to overflow.  This was detected by PaX security patchset:

http://forums.grsecurity.net/viewtopic.php?f=3&t=3289&p=12551#p12551

This bug was introduced in commit 9f24e420, so it's been around
since 2.6.30.  :-(

Fix this by using an atomic64_t for struct orlov_stats's
free_clusters.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org

90ba983f
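The overflow threshold can be checked with a little arithmetic. The sketch below assumes 4 KiB blocks without bigalloc, so one block group covers 32768 clusters (one bitmap block of 4096 bytes times 8 bits), and it assumes every cluster in the flex group is free. The helper names are hypothetical and are not part of ext4.

```c
#include <assert.h>
#include <stdint.h>

/* Clusters covered by one flex group, assuming every group is full-sized. */
int64_t flex_group_clusters(int64_t groups_per_flex, int64_t clusters_per_group)
{
    assert(groups_per_flex > 0 && clusters_per_group > 0);
    return groups_per_flex * clusters_per_group;
}

/* Nonzero if the count no longer fits a signed 32-bit counter such as atomic_t. */
int overflows_int32(int64_t count)
{
    return count > INT32_MAX;
}
```

Under these assumptions, 65536 groups of 32768 clusters give 2^31 clusters, which is one more than a signed 32-bit counter can hold. That also matches the 8TB figure in the message: 2^31 blocks of 4 KiB is 8 TiB.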

15 Feb, 2013 1 commit

ext4: use KERN_WARNING for warning messages · 8de5c325

Committed by Theodore Ts'o on Feb 14, 2013

Some messages printed related to a WARN_ON(1) were printed using
KERN_NOTICE.  Use KERN_WARNING or ext4_warning() instead so that
context related to the WARN_ON() is printed at the same printk warning
level (and log files, etc.).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8de5c325

10 Feb, 2013 1 commit

ext4: start handle at the last possible moment when creating inodes · 1139575a

Committed by Theodore Ts'o on Feb 09, 2013

In ext4_{create,mknod,mkdir,symlink}(), don't start the journal handle
until the inode has been successfully allocated. In order to do this,
we need to start the handle in ext4_new_inode(). So create a new
variant of this function, ext4_new_inode_start_handle(), so the handle
can be created at the last possible minute, before we need to modify
the inode allocation bitmap block.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1139575a

09 Feb, 2013 1 commit

ext4: pass context information to jbd2__journal_start() · 9924a92a

Committed by Theodore Ts'o on Feb 08, 2013

So we can better understand what bits of ext4 are responsible for
long-running jbd2 handles, use jbd2__journal_start() so we can pass
context information for logging purposes.

The recommended way for finding the longer-running handles is:

   T=/sys/kernel/debug/tracing
   EVENT=$T/events/jbd2/jbd2_handle_stats
   echo "interval > 5" > $EVENT/filter
   echo 1 > $EVENT/enable

   ./run-my-fs-benchmark

   cat $T/trace > /tmp/problem-handles

This will list handles that were active for longer than 20ms.  Having
longer-running handles is bad, because a commit started at the wrong
time could stall for those 20+ milliseconds, which could delay an
fsync() or an O_SYNC operation.  Here is an example line from the
trace file describing a handle which lived on for 311 jiffies, or over
1.2 seconds:

postmark-2917  [000] ....   196.435786: jbd2_handle_stats: dev 254,32 
   tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
   dirtied_blocks 0
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9924a92a
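The figures in this message (a filter of 5 jiffies described as 20ms, and 311 jiffies described as over 1.2 seconds) are consistent with a tick rate of HZ=250, that is, 4 ms per jiffy. That tick rate is an inference from the numbers, not something the message states. A minimal conversion sketch, with a hypothetical helper name:

```c
#include <assert.h>

/* Convert an interval in jiffies to milliseconds for a given tick rate (HZ). */
long jiffies_to_millis(long jiffies, long hz)
{
    assert(hz > 0 && jiffies >= 0);
    return jiffies * 1000 / hz;
}
```

With `hz` set to 250, an interval of 5 jiffies converts to 20 ms and 311 jiffies converts to 1244 ms. On a kernel built with a different HZ, the filter threshold would need to be scaled accordingly.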

11 Dec, 2012 1 commit

ext4: enable ext4 inline support · f08225d1

Committed by Tao Ma on Dec 10, 2012

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f08225d1

30 Nov, 2012 1 commit

ext4: fix possible use after free with metadata csum · aeb1e5d6

Committed by Theodore Ts'o on Nov 29, 2012

Commit fa77dcfa introduces block bitmap checksum calculation into
ext4_new_inode() in the case that block group was uninitialized.
However we brelse() the bitmap buffer before we attempt to checksum it
so we have no guarantee that the buffer is still there.

Fix this by releasing the buffer after the possible checksum
computation.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: stable@vger.kernel.org

aeb1e5d6

29 Oct, 2012 1 commit

ext4: fix unjournaled inode bitmap modification · ffb5387e

Committed by Eric Sandeen on Oct 28, 2012

commit 119c0d44 changed
ext4_new_inode() such that the inode bitmap was being modified
outside a transaction, which could lead to corruption, and was
discovered when journal_checksum found a bad checksum in the
journal during log replay.

Nix ran into this when using the journal_async_commit mount
option, which enables journal checksumming.  The ensuing
journal replay failures due to the bad checksums led to
filesystem corruption reported as the now infamous
"Apparent serious progressive ext4 data corruption bug".

[ Changed by tytso to call ext4_journal_get_write_access() only
  when we're fairly certain that we're going to allocate the inode. ]

I've tested this by mounting with journal_checksum and
running fsstress then dropping power; I've also tested by
hacking DM to create snapshots w/o first quiescing, which
allows me to test journal replay repeatedly w/o actually
power-cycling the box.  Without the patch I hit a journal
checksum error every time.  With this fix it survives
many iterations.
Reported-by: Nix <nix@esperi.org.uk>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

ffb5387e

22 Oct, 2012 1 commit

ext4: Checksum the block bitmap properly with bigalloc enabled · 79f1ba49

Committed by Tao Ma on Oct 22, 2012

In mke2fs, we only checksum the whole bitmap block, which is correct.
In the kernel, however, we use EXT4_BLOCKS_PER_GROUP to indicate the
size of the checksummed bitmap, which is wrong when bigalloc is enabled.
The right size is EXT4_CLUSTERS_PER_GROUP, and this patch fixes
it.

Also, as every caller of ext4_block_bitmap_csum_set and
ext4_block_bitmap_csum_verify passes in EXT4_BLOCKS_PER_GROUP(sb)/8,
we had better remove this parameter and set it in the function itself.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org

79f1ba49
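The size difference can be made concrete with a small calculation. This is a sketch with illustrative numbers, not the kernel helpers: `bitmap_csum_bytes` is a hypothetical name, and the example figures (a cluster ratio of 16 and 524288 blocks per group) are assumptions chosen so that one group has 32768 clusters.

```c
#include <assert.h>

/*
 * Bytes of the block bitmap that carry allocation bits when each bit
 * tracks one cluster of cluster_ratio blocks.
 */
unsigned int bitmap_csum_bytes(unsigned int blocks_per_group,
                               unsigned int cluster_ratio)
{
    assert(cluster_ratio > 0 && blocks_per_group % cluster_ratio == 0);
    unsigned int clusters_per_group = blocks_per_group / cluster_ratio;
    return clusters_per_group / 8;
}
```

With those numbers the cluster-based size is 4096 bytes, which fits one 4 KiB bitmap block, whereas dividing the block count by 8 would give 65536 bytes. Without bigalloc the ratio is 1 and the two sizes coincide.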

24 Sep, 2012 1 commit

ext4: check free inode count before allocating an inode · f2a09af6

Committed by Yongqiang Yang on Sep 23, 2012

Recently, I encountered some corrupted filesystems in which some
groups' free inode counts were 65535; it seemed that the free inode
count had overflowed.  This patch teaches ext4 to check the free inode
count before allocating an inode.
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f2a09af6

23 Jul, 2012 1 commit

ext4: remove useless marking of superblock dirty · 97a74068

Committed by Jan Kara on Jul 22, 2012

Commit a0375156 properly notes that the superblock doesn't need to be marked
as dirty when only the number of free inodes / blocks / number of directories
changes, since that is recomputed on each mount anyway. However, that commit
leaves some unnecessary markings as dirty in place. Remove these.

Artem: tested using xfstests for both journalled and non-journalled ext4.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

97a74068

01 Jul, 2012 1 commit

ext4: pass a char * to ext4_count_free() instead of a buffer_head ptr · f6fb99ca

Committed by Theodore Ts'o on Jun 30, 2012

Make it possible for ext4_count_free to operate on buffers and not
just data in buffer_heads.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

f6fb99ca

29 May, 2012 2 commits

ext4: protect group inode free counting with group lock · 6f2e9f0e

Committed by Tao Ma on May 28, 2012

Now when we set the group inode free count, we don't hold a proper
group lock, so multiple threads may decrease the inode free
count at the same time. e2fsck will then complain with something like:

Free inodes count wrong for group #1 (1, counted=0).
Fix? no

Free inodes count wrong for group #2 (3, counted=0).
Fix? no

Directories count wrong for group #2 (780, counted=779).
Fix? no

Free inodes count wrong for group #3 (2272, counted=2273).
Fix? no

So this patch tries to protect it with ext4_lock_group.

By the way, this was found by xfstests test case 269 on a volume
created by mkfs with the parameter
"-O ^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,ext_attr".
I have run it 100 times and the error in e2fsck doesn't
show up again.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6f2e9f0e

ext4: fix potential NULL dereference in ext4_free_inodes_counts() · bb3d132a

Committed by Dan Carpenter on May 28, 2012

The ext4_get_group_desc() function returns NULL on error, and
ext4_free_inodes_count() function dereferences it without checking.
There is a check on the next line, but it's too late.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

bb3d132a

16 May, 2012 1 commit

userns: Convert ext4 to user kuid/kgid where appropriate · 08cefc7a

Committed by Eric W. Biederman on Feb 07, 2012

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

08cefc7a

30 Apr, 2012 4 commits

ext4: make block group checksums use metadata_csum algorithm · feb0ab32

Committed by Darrick J. Wong on Apr 29, 2012

metadata_csum supersedes uninit_bg.  Convert the ROCOMPAT uninit_bg
flag check to a helper function that covers both, and make the
checksum calculation algorithm use either crc16 or the metadata_csum
chosen algorithm depending on which flag is set.  Print a warning if
we try to mount a filesystem with both feature flags set.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

feb0ab32

ext4: calculate and verify block bitmap checksum · fa77dcfa

Committed by Darrick J. Wong on Apr 29, 2012

Compute and verify the checksum of the block bitmap; this checksum is
stored in the block group descriptor.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fa77dcfa

ext4: calculate and verify checksums for inode bitmaps · 41a246d1

Committed by Darrick J. Wong on Apr 29, 2012

Compute and verify the checksum of the inode bitmap; the checksum is
stored in the block group descriptor.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

41a246d1

ext4: calculate and verify inode checksums · 814525f4

Committed by Darrick J. Wong on Apr 29, 2012

This patch introduces to ext4 the ability to calculate and verify
inode checksums.  This requires the use of a new ro compatibility flag
and some accompanying e2fsprogs patches to provide the relevant
features in tune2fs and e2fsck.  The inode generation changes have
been integrated into this patch.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

814525f4

20 Mar, 2012 2 commits

ext4: change some printk() calls to use ext4_msg() instead · 92b97816

Committed by Theodore Ts'o on Mar 19, 2012

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

92b97816

ext4: remove trailing newlines from ext4_msg() and ext4_error() messages · 1084f252

Committed by Theodore Ts'o on Mar 19, 2012

The functions ext4_msg() and ext4_error() already tack on a trailing
newline, so remove the unnecessary extra newline.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1084f252

21 Feb, 2012 1 commit

ext4: fix race when setting bitmap_uptodate flag · 813e5727

Committed by Theodore Ts'o on Feb 20, 2012

In ext4_read_{inode,block}_bitmap() we were setting bitmap_uptodate()
before submitting the buffer for read.  This is bad, since we check
bitmap_uptodate() without locking the buffer, and so if another
process is racing with us, it's possible that they will think the
bitmap is uptodate even though the read has not completed yet,
resulting in inodes and blocks potentially getting allocated more than
once if we get really unlucky.

Addresses-Google-Bug: 2828254
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

813e5727

07 Feb, 2012 1 commit

ext4: fold ext4_claim_inode into ext4_new_inode · 119c0d44

Committed by Theodore Ts'o on Feb 06, 2012

The function ext4_claim_inode() is only called by one function,
ext4_new_inode(), and by folding the functionality into
ext4_new_inode(), we can remove almost 50 lines of code, and put all
of the logic of allocating a new inode into a single place.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

119c0d44

04 Jan, 2012 1 commit

ext4: propagate umode_t · dcca3fec

Committed by Al Viro on Jul 26, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcca3fec

29 Dec, 2011 2 commits

ext4: use proper little-endian bitops · 597d508c

Committed by Akinobu Mita on Dec 28, 2011

ext4_{set,clear}_bit() is defined as __test_and_{set,clear}_bit_le() for
ext4.  Only two ext4_{set,clear}_bit() calls check the return value.  The
rest of calls ignore the return value and they can be replaced with
__{set,clear}_bit_le().

This changes ext4_{set,clear}_bit() from __test_and_{set,clear}_bit_le()
to __{set,clear}_bit_le() and introduces ext4_test_and_{set,clear}_bit()
for the two places where old bit needs to be returned.

This ext4_{set,clear}_bit() change is considered safe, because if someone
uses these macros without noticing the change, the new ext4_{set,clear}_bit()
don't have a return value and cause compiler errors where the return value
is used.

This also removes unused ext4_find_first_zero_bit().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

597d508c
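The distinction between the two kinds of helpers can be sketched in user space. The functions below are hypothetical stand-ins written for illustration, not the kernel's `__set_bit_le()` or `__test_and_set_bit_le()`; they assume a byte-addressed bitmap in which bit `nr` lives in byte `nr / 8` at position `nr % 8`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Set bit nr in a little-endian bitmap; deliberately returns nothing. */
void bitmap_set_bit_le(unsigned char *map, size_t nr)
{
    assert(map != NULL);
    map[nr / 8] |= (unsigned char)(1u << (nr % 8));
}

/* Set bit nr and report whether it was already set beforehand. */
bool bitmap_test_and_set_bit_le(unsigned char *map, size_t nr)
{
    assert(map != NULL);
    unsigned char mask = (unsigned char)(1u << (nr % 8));
    bool was_set = (map[nr / 8] & mask) != 0;

    map[nr / 8] |= mask;
    return was_set;
}
```

Because the plain setter returns `void`, a caller that still tries to use its result fails to compile, which is the safety argument the message makes for the rename.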

ext4: avoid counting the number of free inodes twice in find_group_orlov() · 14c83c9f

Committed by Theodore Ts'o on Dec 28, 2011

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

14c83c9f

19 Dec, 2011 1 commit

ext4: fix error handling on inode bitmap corruption · acd6ad83

Committed by Jan Kara on Dec 18, 2011

When insert_inode_locked() fails in ext4_new_inode(), it most likely means the
inode bitmap got corrupted and we allocated an inode which is already in use. Also,
doing unlock_new_inode() during error recovery is wrong, since the inode does
not have I_NEW set. Fix the problem by jumping to fail: (instead of fail_drop:),
which declares a filesystem error and does not call unlock_new_inode().
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

acd6ad83

02 Nov, 2011 1 commit

filesystems: add missing nlink wrappers · 6d6b77f1

Committed by Miklos Szeredi on Oct 28, 2011

Replace direct i_nlink updates with the respective updater function
(inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count).
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

6d6b77f1

01 Nov, 2011 1 commit

ext4: remove comments about extent mount option in ext4_new_inode() · 4af83508

Committed by Eryu Guan on Oct 31, 2011

Remove comments about the 'extent' mount option in ext4_new_inode(), since
it no longer exists.
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

4af83508

29 Oct, 2011 1 commit

ext4: fix quota accounting during migration · 5cb81dab

Committed by Dmitry Monakhov on Oct 29, 2011

The tmp_inode should have the same uid/gid as the original inode.
Otherwise new metadata blocks will be accounted to the wrong quota-id,
which will result in a quota leak after the inode migration is
completed.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5cb81dab

18 Oct, 2011 1 commit

ext4: functions should not be declared extern · e0cbee3e

Committed by H Hartley Sweeten on Oct 18, 2011

The function declarations in ext4.h are already marked extern, so it's
not necessary to do so in the .c files.

This quiets the sparse noise:

warning: function 'ext4_flush_completed_IO' with external linkage has definition
warning: function 'ext4_init_inode_table' with external linkage has definition
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e0cbee3e

09 Oct, 2011 1 commit

ext4: remove deprecated oldalloc · 4113c4ca

Committed by Lukas Czerner on Oct 08, 2011

For a long time now orlov has been the default block allocator in
ext4. It performs better than the old one, and no one seems to claim
otherwise, so we can safely drop the old allocator and deprecate the
oldalloc and orlov mount options.

This is part of the effort to reduce the number of ext4 options and hence the
test matrix.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

4113c4ca

10 Sep, 2011 4 commits

ext4: rename ext4_free_blocks_after_init() to ext4_free_clusters_after_init() · cff1dfd7

Committed by Theodore Ts'o on Sep 09, 2011

This function really returns the number of clusters after an
uninitialized block bitmap has been initialized.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

cff1dfd7

ext4: Rename ext4_free_blks_{count,set}() to refer to clusters · 021b65bb

Committed by Theodore Ts'o on Sep 09, 2011

The field bg_free_blocks_count_{lo,high} in the block group
descriptor has been repurposed to hold the number of free clusters for
bigalloc functions.  So rename the functions to make it easier to
read and audit the block allocation and block freeing code.

Note: at this point in bigalloc development we don't support
online resize, so this also makes it really obvious all of the places
we need to fix up to add support for online resize.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

021b65bb

ext4: convert the free_blocks field in s_flex_groups to be free_clusters · 24aaa8ef

Committed by Theodore Ts'o on Sep 09, 2011

Convert the free_blocks to be free_clusters to make the final revised
bigalloc changes easier to read/understand.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

24aaa8ef

ext4: convert s_{dirty,free}blocks_counter to s_{dirty,free}clusters_counter · 57042651

Committed by Theodore Ts'o on Sep 09, 2011

Convert the percpu counters s_dirtyblocks_counter and
s_freeblocks_counter in struct ext4_sb_info to be
s_dirtyclusters_counter and s_freeclusters_counter.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

57042651

openeuler / Kernel synced successfully almost 2 years ago