Commits · e1290b3e62c496ade19939ce036f35bb69306820 · openeuler / Kernel

21 May, 2011 2 commits

ext4: Remove unnecessary wait_event ext4_run_lazyinit_thread() · e1290b3e

Authored by Lukas Czerner on May 20, 2011

For some reason we have been waiting for the lazyinit thread to start in
ext4_run_lazyinit_thread(), but this is not needed since it was just
unnecessary complexity, so get rid of it. We can also remove li_task and
li_wait_task since they are not used anymore.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

ext4: Use schedule_timeout_interruptible() for waiting in lazyinit thread · 4ed5c033

Authored by Lukas Czerner on May 20, 2011

In order to make lazyinit eat approx. 10% of io bandwidth at max, we
are sleeping between zeroing each single inode table. For that purpose
we are using timer which wakes up thread when it expires. It is set
via add_timer() and this may cause troubles in the case that thread
has been woken up earlier and in next iteration we call add_timer() on
still running timer hence hitting BUG_ON in add_timer(). We could fix
that by using mod_timer() instead however we can use
schedule_timeout_interruptible() for waiting and hence simplifying
things a lot.

This commit replaces the old "waiting mechanism" with a simple
schedule_timeout_interruptible(), setting the time to sleep. Hence we
no longer need the li_wait_daemon wait queue and related fields, so get
rid of them.

Addresses-Red-Hat-Bugzilla: #699708
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
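
The throttling idea in this message can be illustrated with a small user-space sketch. If every unit of work is followed by a sleep that lasts a fixed multiple of the time the work took, the worker is busy for roughly 1/(1 + multiplier) of the wall-clock time. The helper names and the multiplier value of 10 used in the usage note are assumptions made for illustration; this is not the kernel code.

```c
/* Hypothetical helper: how long to sleep after a work item that took
 * elapsed_ms, given a wait multiplier (an assumed tunable). */
unsigned long throttle_sleep_ms(unsigned long elapsed_ms, unsigned int mult)
{
	return elapsed_ms * mult;
}

/* Fraction of wall-clock time spent working when each work item is
 * followed by a sleep of mult times its duration. */
double busy_fraction(unsigned int mult)
{
	return 1.0 / (1.0 + (double)mult);
}
```

With an assumed multiplier of 10, a 20 ms work item would be followed by a 200 ms sleep, and the busy fraction is 1/11, a little over 9%, which is in line with the "approx. 10%" figure quoted above.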

19 May, 2011 1 commit

ext4: don't warn about mnt_count if it has been disabled · ed3ce80a

Authored by Tao Ma on May 18, 2011

Currently, if we mkfs a new ext4 volume with s_max_mnt_count set to
zero, and mount it for the first time, we will get the warning:

	maximal mount count reached, running e2fsck is recommended

It is really misleading. So change the check so that it won't warn in
that case.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
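
A minimal sketch of the kind of check described here, written as a stand-alone C function. The function name is hypothetical, and treating a negative limit as "disabled" is an assumption; the commit message itself only mentions a limit of zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch: warn only when a positive mount-count limit is configured and
 * has been reached. A limit of zero means the check is disabled; treating
 * negative limits the same way is an assumption of this example. */
bool should_warn_mount_count(int16_t max_mnt_count, uint16_t mnt_count)
{
	return max_mnt_count > 0 && mnt_count >= (uint16_t)max_mnt_count;
}
```

With this shape of check, a freshly created volume whose limit is zero no longer triggers the warning on its first mount, while a volume with a limit of 20 still warns on the 20th mount.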

16 May, 2011 1 commit

ext4: fix oops in ext4_quota_off() · 0b268590

Authored by Amir Goldstein on May 16, 2011

If quota is not enabled when ext4_quota_off() is called, we must not
dereference quota file inode since it is NULL.  Check properly for
this.

This fixes a bug in commit 21f97697 (ext4: remove unnecessary
[cm]time update of quota file), which was merged for 2.6.39-rc3.

Reported-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 May, 2011 2 commits

ext4: remove redundant #ifdef in super.c · 66bb8279

Authored by Amerigo Wang on May 09, 2011

There is already an #ifdef CONFIG_QUOTA some lines above,
so this one is totally useless.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove redundant check for first_not_zeroed in ext4_register_li_request · 55ff3840

Authored by Tao Ma on May 09, 2011

We have checked first_not_zeroed == ngroups already above, so remove
this redundant check.

sbi->s_li_request = NULL above is also removed since it is NULL
already.

Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

19 Apr, 2011 1 commit

ext4: check for ext[23] file system features when mounting as ext[23] · 2035e776

Authored by Theodore Ts'o on Apr 18, 2011

Provide better emulation for ext[23] mode by enforcing that the file
system does not have any unsupported file system features as defined
by ext[23] when emulating the ext[23] file system driver when
CONFIG_EXT4_USE_FOR_EXT23 is defined.

This causes the file system type information in /proc/mounts to be
correct for the automatically mounted root file system.  This also
means that "mount -t ext2 /dev/sda /mnt" will fail if /dev/sda
contains an ext3 or ext4 file system, just as one would expect if the
original ext2 file system driver were in use.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
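
The enforcement described above boils down to a bitmask test: a mount is refused when the file system carries any feature bit outside the set that the emulated driver understands. The sketch below shows that test in isolation. All feature bit values and the two "supported" sets are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_FILETYPE  0x0002u
#define FEAT_JOURNAL   0x0004u
#define FEAT_EXTENTS   0x0040u

/* Hypothetical sets of features each emulated driver understands. */
#define EXT2_SUPPORTED (FEAT_FILETYPE)
#define EXT3_SUPPORTED (FEAT_FILETYPE | FEAT_JOURNAL)

/* Mounting is allowed only if no feature outside the supported set is present. */
bool features_ok(uint32_t fs_features, uint32_t supported)
{
	return (fs_features & ~supported) == 0;
}
```

Under these assumed bit assignments, a file system with the journal bit set passes the ext3 check but fails the ext2 check, which mirrors the "mount -t ext2" failure described in the message.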

11 Apr, 2011 1 commit

ext4: allow an active handle to be started when freezing · be4f27d3

Authored by Yongqiang Yang on Apr 10, 2011

ext4_journal_start_sb() should not prevent an active handle from being
started due to s_frozen.  Otherwise a deadlock can easily occur; one
such scenario is shown below.

================================================
     freeze         |       truncate
================================================
                    |  ext4_ext_truncate()
    freeze_super()  |   starts a handle
    sets s_frozen   |
                    |  ext4_ext_truncate()
                    |  holds i_data_sem
  ext4_freeze()     |
  waits for updates |
                    |  ext4_free_blocks()
                    |  calls dquot_free_block()
                    |
                    |  dquot_free_blocks()
                    |  calls ext4_dirty_inode()
                    |
                    |  ext4_dirty_inode()
                    |  tries to start an active
                    |  handle
                    |
                    |  block due to s_frozen
================================================

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Amir Goldstein <amir73il@users.sf.net>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>

06 Apr, 2011 1 commit

ext4: init timer earlier to avoid a kernel panic in __save_error_info · 04496411

Authored by Tao Ma on Apr 05, 2011

During mount, when we fail to open the journal inode or the root inode,
__save_error_info will call mod_timer. But s_err_report isn't
initialized yet at that point, so the kernel oopses. The detailed
information can be found at https://bugzilla.kernel.org/show_bug.cgi?id=32082.

The best way is to check whether the timer s_err_report is initialized
or not. But it seems that in include/linux/timer.h, we can't find a
good function to check the status of this timer, so this patch just
moves the initialization of s_err_report earlier so that we can avoid
the kernel panic. The corresponding del_timer is also added in the
error path.

Reported-by: Sami Liedes <sliedes@cc.hut.fi>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

05 Apr, 2011 2 commits

ext4: fix a double free in ext4_register_li_request · 46e4690b

Authored by Tao Ma on Apr 04, 2011

In ext4_register_li_request, we allocate an ext4_li_request and
insert it into ext4_li_info->li_request_list. In case of any
error later, we free it at the end.  But if we hit an error
in ext4_run_lazyinit_thread, the whole li_request_list will be
dropped and freed there. So we will double free this ext4_li_request.

This patch just sets elr to NULL after it is inserted into the list
so that the later kfree won't double free it.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
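
The pattern behind this fix can be shown in a user-space sketch: once a pointer has been handed to a list that may free it, the local copy is set to NULL so that the common cleanup at the end of the function becomes a no-op. All type and function names below are hypothetical stand-ins, and `start_fails` merely simulates a later error; this is not the kernel code.

```c
#include <stdlib.h>

struct request { struct request *next; };
struct request_list { struct request *head; };

/* Hypothetical: push onto the list; the list now owns the request. */
void list_add_request(struct request_list *l, struct request *r)
{
	r->next = l->head;
	l->head = r;
}

/* Hypothetical: free every request on the list, as an error path might. */
void list_drop_all(struct request_list *l)
{
	while (l->head) {
		struct request *r = l->head;
		l->head = r->next;
		free(r);
	}
}

/* Returns 0 on success, -1 on failure. start_fails simulates a later error. */
int register_request(struct request_list *l, int start_fails)
{
	int ret = 0;
	struct request *elr = malloc(sizeof(*elr));

	if (!elr)
		return -1;

	list_add_request(l, elr);
	elr = NULL;	/* ownership moved to the list */

	if (start_fails) {
		list_drop_all(l);	/* frees the request exactly once */
		ret = -1;
	}

	free(elr);	/* free(NULL) is a no-op, so no double free */
	return ret;
}
```

Without the `elr = NULL` assignment, the failure path would free the request once in `list_drop_all()` and a second time in the final `free(elr)`.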

ext4: remove unnecessary [cm]time update of quota file · 21f97697

Authored by Jan Kara on Apr 04, 2011

It is not necessary to update [cm]time of quota file on each quota
file write and it wastes journal space and IO throughput with inode
writes. So just remove the updating from ext4_quota_write() and only
update times when quotas are being turned off. Userspace cannot get
anything reliable from quota files while they are used by the kernel
anyway.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

31 Mar, 2011 1 commit

Fix common misspellings · 25985edc

Authored by Lucas De Marchi on Mar 30, 2011

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>

22 Mar, 2011 1 commit

ext4: add missing space in printk's in __ext4_grp_locked_error() · 21149d61

Authored by Robin Dong on Mar 21, 2011

When we did performance testing on an ext4 filesystem, we observed a
warning like this:

EXT4-fs error (device sda7): ext4_mb_generate_buddy:718: group 259825901 blocks in bitmap, 26057 in gd

Instead, it should be

"group 2598, 25901 blocks in bitmap, 26057 in gd"

Reviewed-by: Coly Li <bosong.ly@taobao.com>
Cc: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Robin Dong <sanbai@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
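
To see why the separator matters, the sketch below formats the same three numbers in user space with `snprintf`. The function name and signature are hypothetical; the point is only that the group number and the block count need a ", " between them, otherwise "2598" and "25901" run together as "259825901".

```c
#include <stdio.h>

/* Hypothetical formatter: builds the "group ..." part of the message.
 * The ", " after the group number keeps it from running into the count. */
int format_group_error(char *buf, size_t len, unsigned int group,
		       int in_bitmap, int in_gd)
{
	return snprintf(buf, len, "group %u, %d blocks in bitmap, %d in gd",
			group, in_bitmap, in_gd);
}
```

Calling it with group 2598, 25901 blocks in the bitmap and 26057 in the group descriptor yields the corrected text quoted in the commit message.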

15 Mar, 2011 1 commit

ext4: Copy fs UUID to superblock · f2fa2ffc

Authored by Aneesh Kumar K.V on Jan 29, 2011

The file system UUID is made available to applications
via /proc/<pid>/mountinfo.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

06 Mar, 2011 1 commit

ext4: Use single thread to perform DIO unwritten convertion · 198868f3

Authored by Mingming Cao on Mar 05, 2011

While running ext4 tests on a multi-core machine, we found that there are
per-cpu ext4-dio-unwritten threads processing the conversion from unwritten
extents to written for IOs completed from the async direct IO path.  One
thread per filesystem is enough; we don't need per-cpu threads to work on
conversion.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>

28 Feb, 2011 2 commits

ext4: skip orphan cleanup if fs has unknown ROCOMPAT features · d39195c3

Authored by Amir Goldstein on Feb 28, 2011

Orphan cleanup is currently executed even if the file system has some
number of unknown ROCOMPAT features, which deletes inodes and frees
blocks, which could be very bad for some RO_COMPAT features,
especially the SNAPSHOT feature.

This patch skips the orphan cleanup if it contains readonly compatible
features not known by this ext4 implementation, which would prevent
the fs from being mounted (or remounted) readwrite.

Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix missing iput of root inode for some mount error paths · 32a9bb57

Authored by Manish Katiyar on Feb 27, 2011

This ensures that the root inode is not leaked, and that sb->s_root is
NULL, which will prevent generic_shutdown_super() from doing extra
work, including calling sync_filesystem(), which ultimately results in
ext4_sync_fs() getting called with an uninitialized struct super,
which is the cause of the crash noted in Kernel Bugzilla #26752.

https://bugzilla.kernel.org/show_bug.cgi?id=26752

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

27 Feb, 2011 1 commit

ext4: enable mblk_io_submit by default · 6fd7a467

Authored by Theodore Ts'o on Feb 26, 2011

Now that we've fixed the file corruption bug in commit d50bdd5a,
it's time to enable mblk_io_submit by default.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

24 Feb, 2011 2 commits

ext4: enable acls and user_xattr by default · ea663336

Authored by Eric Sandeen on Feb 23, 2011

There's no good reason to require the extra step of providing
a mount option for acl or user_xattr once the feature is configured
on; no other filesystem that I know of requires this.

Userspace patches have set these options in default mount options,
and this patch makes them default in the kernel.  At some point
we can start to deprecate the options, perhaps.

For now I've removed default mount option checks in show_options()
to be explicit about what's set, since it's changing the default,
but I'm open to alternatives if desired.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: mark file-local functions and variables as static · 0b75a840

Authored by Lukas Czerner on Feb 23, 2011

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

22 Feb, 2011 2 commits

ext4: allow inode_readahead_blks=0 (linux-2.6.37) · 5dbd571d

Authored by Alexander V. Lukyanov on Feb 21, 2011

I cannot disable inode-read-ahead feature of ext4 (on 2.6.37):

# echo 0 > /sys/fs/ext4/sda2/inode_readahead_blks 
bash: echo: write error: Invalid argument

On a server with lots of small files and random access this read-ahead makes
performance worse, and I'd like to disable it. I work around this problem
by using value of 1, but it still reads an extra block.

This patch fixes the problem by checking for zero explicitly.

Signed-off-by: Alexander V. Lukyanov <lav@netis.ru>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
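
A sketch of what "checking for zero explicitly" could look like in a validation routine. The commit message does not say why 0 was rejected; this example assumes the value is otherwise required to be a power of two, which is an assumption made for illustration. The function names are hypothetical.

```c
#include <stdbool.h>

/* True if v is a nonzero power of two. */
static bool is_pow2(unsigned long v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Sketch of the validation: zero is accepted explicitly (read-ahead
 * disabled); any other value must be a power of two (assumed rule).
 * Returns 0 if the value is acceptable, -1 otherwise. */
int validate_readahead_blks(unsigned long t)
{
	if (t != 0 && !is_pow2(t))
		return -1;
	return 0;
}
```

Under this assumed rule, 0, 1 and 32 are accepted while 3 is rejected. A power-of-two test on its own would reject 0, which is the behavior the reporter ran into.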

ext4: Fix sparse warning: Using plain integer as NULL pointer · 7dc57615

Authored by Peter Huewe on Feb 21, 2011

This patch fixes the warning "Using plain integer as NULL pointer",
generated by sparse, by replacing the offending 0s with NULL.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12 Feb, 2011 1 commit

ext4: serialize unaligned asynchronous DIO · e9e3bcec

Authored by Eric Sandeen on Feb 12, 2011

ext4 has a data corruption case when doing non-block-aligned
asynchronous direct IO into a sparse file, as demonstrated
by xfstest 240.

The root cause is that while ext4 preallocates space in the
hole, mappings of that space still look "new" and 
dio_zero_block() will zero out the unwritten portions.  When
more than one AIO thread is going, they both find this "new"
block and race to zero out their portion; this is uncoordinated
and causes data corruption.

Dave Chinner fixed this for xfs by simply serializing all
unaligned asynchronous direct IO.  I've done the same here.
The difference is that we only wait on conversions, not all IO.
This is a very big hammer, and I'm not very pleased with
stuffing this into ext4_file_write().  But since ext4 is
DIO_LOCKING, we need to serialize it at this high level.

I tried to move this into ext4_ext_direct_IO, but by then
we have the i_mutex already, and we will wait on the
work queue to do conversions - which must also take the
i_mutex.  So that won't work.

This was originally exposed by qemu-kvm installing to
a raw disk image with a normal sector-63 alignment.  I've
tested a backport of this patch with qemu, and it does
avoid the corruption.  It is also quite a lot slower
(14 min for package installs, vs. 8 min for well-aligned)
but I'll take slow correctness over fast corruption any day.

Mingming suggested that we can track outstanding
conversions, and wait on those so that non-sparse
files won't be affected, and I've implemented that here;
unaligned AIO to nonsparse files won't take a perf hit.

[tytso@mit.edu: Keep the mutex as a hashed array instead
 of bloating the ext4 inode]

[tytso@mit.edu: Fix up namespace issues so that global
 variables are protected with an "ext4_" prefix.]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
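
Deciding which writes need to be serialized requires a test for "non-block-aligned". The sketch below shows one way such a test could be written in user space: a write counts as unaligned if either its start offset or its end offset is not a multiple of the block size. The function name is hypothetical, and the block size is assumed to be a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch: a write is "unaligned" if its start or its end does not fall
 * on a block boundary. block_size is assumed to be a power of two. */
bool is_unaligned_write(uint64_t pos, uint64_t count, uint32_t block_size)
{
	uint64_t mask = (uint64_t)block_size - 1;

	return ((pos & mask) != 0) || (((pos + count) & mask) != 0);
}
```

For example, a 4096-byte write starting at sector 63 (byte offset 63 * 512 = 32256) is unaligned with respect to 4096-byte blocks, which matches the qemu-kvm case mentioned in the message.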

04 Feb, 2011 3 commits

ext4: fix up ext4 error handling · dd68314c

Authored by Theodore Ts'o on Feb 03, 2011

Make sure the correct cleanup happens if we die while trying to
load the ext4 file system.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: unregister features interface on module unload · 8f021222

Authored by Lukas Czerner on Feb 03, 2011

Ext4 features interface was not properly unregistered which led to
problems while unloading/reloading ext4 module. This commit fixes that by
adding proper kobject unregistration code into ext4_exit_fs() as well as
the failure path of ext4_init_fs().

Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

ext4: fix panic on module unload when stopping lazyinit thread · 8f1f7453

Authored by Eric Sandeen on Feb 03, 2011

https://bugzilla.kernel.org/show_bug.cgi?id=27652

If the lazyinit thread is running, the teardown function
ext4_destroy_lazyinit_thread() has problems:

        ext4_clear_request_list();
        while (ext4_li_info->li_task) {
                wake_up(&ext4_li_info->li_wait_daemon);
                wait_event(ext4_li_info->li_wait_task,
                           ext4_li_info->li_task == NULL);
        }

Clearing the request list will cause the thread to exit and free
ext4_li_info, so then we're waiting on something which is getting
freed.

Fix this up by making the thread respond to kthread_stop, and exit,
without the need to wait for that exit in some other homegrown way.

Cc: stable@kernel.org
Reported-and-Tested-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

01 Feb, 2011 1 commit

ext4: convert to alloc_workqueue() · fd89d5f2

Authored by Tejun Heo on Feb 01, 2011

Convert create_workqueue() to alloc_workqueue().  This is an identity
conversion.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: linux-ext4@vger.kernel.org

13 Jan, 2011 1 commit

quota: Fix deadlock during path resolution · f00c9e44

Authored by Jan Kara on Sep 15, 2010

As Al Viro pointed out, path resolution during Q_QUOTAON calls to quotactl
is prone to deadlocks. We hold the s_umount semaphore for reading during the
path resolution, and the resolution itself may need to acquire the semaphore
for writing when e.g. an autofs mountpoint is passed.

Solve the problem by performing the resolution before we get hold of the
superblock (and thus the s_umount semaphore). The whole thing is complicated
by the fact that some filesystems (OCFS2) ignore the path argument. So to
distinguish between filesystems which want the path and those which do not,
we introduce a new .quota_on_meta callback which does not get the path. OCFS2
then uses this callback instead of the old .quota_on.

CC: Al Viro <viro@ZenIV.linux.org.uk>
CC: Christoph Hellwig <hch@lst.de>
CC: Ted Ts'o <tytso@mit.edu>
CC: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>

11 Jan, 2011 5 commits

ext4: fix uninitialized variable in ext4_register_li_request · 6c5a6cb9

Authored by Andrew Morton on Jan 10, 2011

fs/ext4/super.c: In function 'ext4_register_li_request':
fs/ext4/super.c:2936: warning: 'ret' may be used uninitialized in this function

It looks buggy to me, too.

Cc: Lukas Czerner <lczerner@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: dynamically allocate the jbd2_inode in ext4_inode_info as necessary · 8aefcd55

Authored by Theodore Ts'o on Jan 10, 2011

Replace the jbd2_inode structure (which is 48 bytes) with a pointer
and only allocate the jbd2_inode when it is needed --- that is, when
the file system has a journal present and the inode has been opened
for writing.  This allows us to further slim down the ext4_inode_info
structure.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
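
The space saving comes from a common pattern: embed a pointer instead of the structure and allocate the structure only when the conditions that need it are met. A user-space sketch of that pattern follows. The type and function names are hypothetical stand-ins, the 48-byte payload only mirrors the size quoted above, and this is not the kernel implementation.

```c
#include <stdlib.h>

/* Stand-in for the 48-byte per-inode journal structure. */
struct journal_inode { char pad[48]; };

/* The per-inode info holds only a pointer rather than the whole structure. */
struct inode_info { struct journal_inode *jinode; };

/* Allocate the journal structure on first need: only when a journal is
 * present and the inode is opened for writing. Returns 0 on success and
 * -1 if the allocation fails. */
int inode_attach_jinode(struct inode_info *ei, int has_journal, int open_for_write)
{
	if (!has_journal || !open_for_write || ei->jinode)
		return 0;

	ei->jinode = calloc(1, sizeof(*ei->jinode));
	return ei->jinode ? 0 : -1;
}

/* Release the journal structure when the inode info goes away. */
void inode_release(struct inode_info *ei)
{
	free(ei->jinode);
	ei->jinode = NULL;
}
```

Inodes that are only ever read, or that live on a file system without a journal, then pay for one pointer instead of the full structure.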

ext4: replace i_delalloc_reserved_flag with EXT4_STATE_DELALLOC_RESERVED · f2321097

Authored by Theodore Ts'o on Jan 10, 2011

Remove the short element i_delalloc_reserved_flag from the
ext4_inode_info structure and replace it with a new bit in i_state_flags.
Since we have an ext4_inode_info for every ext4 inode cached in the
inode cache, any savings we can produce here are a very good thing from
a memory utilization perspective.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
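
Folding a dedicated flag field into an existing flags word can be sketched as below. The bit numbers, structure layout and helper names are hypothetical, and the helpers are plain non-atomic bit operations intended only to illustrate the idea.

```c
#include <stdbool.h>

/* Hypothetical bit numbers within a shared flags word. */
enum { STATE_NEW = 0, STATE_DELALLOC_RESERVED = 1 };

/* One flags word replaces several separate boolean-like fields. */
struct inode_info { unsigned long state_flags; };

void set_state(struct inode_info *ei, int bit)
{
	ei->state_flags |= 1UL << bit;
}

void clear_state(struct inode_info *ei, int bit)
{
	ei->state_flags &= ~(1UL << bit);
}

bool test_state(const struct inode_info *ei, int bit)
{
	return ((ei->state_flags >> bit) & 1UL) != 0;
}
```

Each additional flag then costs a single bit in a word that already exists, rather than a separate field in every cached inode.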

ext4: Use ext4_error_file() to print the pathname to the corrupted inode · f7c21177

Authored by Theodore Ts'o on Jan 10, 2011

Where the file pointer is available, use ext4_error_file() instead of
ext4_error_inode().

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: use IS_ERR() to check for errors in ext4_error_file · f9a62d09

Authored by Dan Carpenter on Jan 10, 2011

d_path() returns an ERR_PTR on failure; it doesn't return NULL.  This is in
ext4_error_file() and no one actually calls ext4_error_file().

Signed-off-by: Dan Carpenter <error27@gmail.com>
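
The error-pointer convention referred to here encodes a small negative error number in the pointer value itself, so a NULL check never catches a failure. The following is a user-space imitation of that convention, written for illustration only; the lowercase helper names and `lookup_path()` are hypothetical and are not the kernel's definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup that reports failure as an encoded error, never NULL. */
const char *lookup_path(int ok)
{
	return ok ? "/mnt/file" : err_ptr(-ENAMETOOLONG);
}
```

A caller that writes `if (!path)` after `lookup_path(0)` would treat the failure as success, whereas `if (is_err(path))` detects it and `ptr_err(path)` recovers the error code.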

07 Jan, 2011 1 commit

fs: icache RCU free inodes · fa0d7e3d

Authored by Nick Piggin on Jan 07, 2011

RCU free the struct inode. This will allow:

- Subsequent store-free path walking patch. The inode must be consulted for
  permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
  to take i_lock no longer need to take sb_inode_list_lock to walk the list in
  the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
  page lock to follow page->mapping.

The downside of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.

In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.

The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

20 Dec, 2010 2 commits

ext4: Use printf extension %pV · 0ff2ea7d

Authored by Joe Perches on Dec 19, 2010

Using %pV reduces the number of printk calls and eliminates any
possible message interleaving from other printk calls.

In function __ext4_grp_locked_error also added KERN_CONT to some
printks.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Use vzalloc in ext4_fill_flex_info() · 94de56ab

Authored by Joe Perches on Dec 19, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

16 Dec, 2010 3 commits

ext4: Add second mount options field since the s_mount_opt is full up · a2595b8a

Authored by Theodore Ts'o on Dec 15, 2010

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Move struct ext4_mount_options from ext4.h to super.c · 673c6100

Authored by Theodore Ts'o on Dec 15, 2010

Move the ext4_mount_options structure definition from ext4.h, since it
is only used in super.c.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Simplify the usage of clear_opt() and set_opt() macros · fd8c37ec

Authored by Theodore Ts'o on Dec 15, 2010

Change clear_opt() and set_opt() to take a superblock pointer instead
of a pointer to EXT4_SB(sb)->s_mount_opt.  This makes it easier for us
to support a second mount option field.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
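
Why taking the superblock pointer helps can be seen in a small stand-alone sketch: when the macros receive the superblock rather than a pointer to one particular field, a sibling macro for a second field can be added without touching any call site. Everything below (structure layout, option names, option values, `SB_INFO`, `set_opt2`, and the `s_mount_opt2` field name) is a hypothetical stand-in used for illustration.

```c
/* Hypothetical stand-ins for the superblock and its private info. */
struct sb_info {
	unsigned int s_mount_opt;
	unsigned int s_mount_opt2;
};
struct super_block { struct sb_info *s_fs_info; };

#define SB_INFO(sb) ((sb)->s_fs_info)

/* Hypothetical option bits. */
#define MOUNT_ACL     0x0001u
#define MOUNT2_EXTRA  0x0001u

/* The macros take the superblock, so callers never name the field. */
#define set_opt(sb, opt)   (SB_INFO(sb)->s_mount_opt |= MOUNT_##opt)
#define clear_opt(sb, opt) (SB_INFO(sb)->s_mount_opt &= ~MOUNT_##opt)
#define test_opt(sb, opt)  (SB_INFO(sb)->s_mount_opt & MOUNT_##opt)

/* A second field only needs a parallel macro; existing callers are unchanged. */
#define set_opt2(sb, opt)  (SB_INFO(sb)->s_mount_opt2 |= MOUNT2_##opt)
```

A caller writes `set_opt(sb, ACL)` in both the old and the new layout, and only new options that live in the second field use `set_opt2(sb, EXTRA)`.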

15 Dec, 2010 1 commit

ext4: Turn off multiple page-io submission by default · 1449032b

Authored by Theodore Ts'o on Dec 14, 2010

Jon Nelson has found a test case which causes postgresql to fail with
the error:

psql:t.sql:4: ERROR: invalid page header in block 38269 of relation base/16384/16581

Under memory pressure, it looks like part of a file can end up getting
replaced by zeros.  Until we can figure out the cause, we'll roll
back the change and use block_write_full_page() instead of
ext4_bio_write_page().  The new, more efficient writing function can
be used via the mount option mblk_io_submit, so we can test and fix
the new page I/O code.

To reproduce the problem, install postgres 8.4 or 9.0, and pin enough
memory such that the system is just on the verge of triggering writeback
before running the following sql script:

begin;
create temporary table foo as select x as a, ARRAY[x] as b FROM
generate_series(1, 10000000 ) AS x;
create index foo_a_idx on foo (a);
create index foo_b_idx on foo USING GIN (b);
rollback;

If the temporary table is created on a hard drive partition which is
encrypted using dm_crypt, then under memory pressure, approximately
30-40% of the time, pgsql will issue the above failure.

This patch should fix this problem, and the problem will come back if
the file system is mounted with the mblk_io_submit mount option.

Reported-by: Jon Nelson <jnelson@jamponi.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
