Commits · e8134b27e351e813414da3b95aa8eac6d3908088 · openeuler / Kernel

06 Jan, 2009 2 commits

ext4: Fix race between read_block_bitmap() and mark_diskspace_used() · e8134b27

Committed by Aneesh Kumar K.V on Jan 05, 2009

We need to make sure we update the block bitmap and clear
EXT4_BG_BLOCK_UNINIT flag with sb_bgl_lock held, since
ext4_read_block_bitmap() looks at EXT4_BG_BLOCK_UNINIT to decide
whether to initialize the block bitmap each time it is called
(introduced by commit c806e68f), and this can race with block
allocations in ext4_mb_mark_diskspace_used().

ext4_read_block_bitmap does:

spin_lock(sb_bgl_lock(EXT4_SB(sb), block_group));
if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
	ext4_init_block_bitmap(sb, bh, block_group, desc);

Now on the block allocation side we do

mb_set_bits(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group), bitmap_bh->b_data,
			ac->ac_b_ex.fe_start, ac->ac_b_ex.fe_len);
....
spin_lock(sb_bgl_lock(sbi, ac->ac_b_ex.fe_group));
if (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
	gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);

That is, on allocation we update the bitmap first, and only then take
sb_bgl_lock and clear the EXT4_BG_BLOCK_UNINIT flag. A parallel
ext4_read_block_bitmap() can therefore zero out the bitmap between the
mb_set_bits() call and spin_lock(sb_bgl_lock(..)) above.

The race results in the following user-visible errors:

EXT4-fs error (device sdb1): ext4_mb_release_inode_pa: free 100, pa_free 105
EXT4-fs error (device sdb1): mb_free_blocks: double-free of inode 0's block ..

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
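
The locking rule described in this commit can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct group`, `BG_BLOCK_UNINIT`, and the pthread mutex are hypothetical stand-ins for the block group descriptor, EXT4_BG_BLOCK_UNINIT, and sb_bgl_lock. It only shows that the bitmap update and the flag clear sit inside one critical section, so a reader that checks the flag under the same lock cannot reinitialize a bitmap that already holds allocations.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define BG_BLOCK_UNINIT 0x0001u  /* stand-in for EXT4_BG_BLOCK_UNINIT */
#define BITMAP_BYTES    16

/* Hypothetical, simplified model of one block group. */
struct group {
    pthread_mutex_t lock;        /* plays the role of sb_bgl_lock */
    uint16_t flags;
    uint8_t bitmap[BITMAP_BYTES];
};

static void group_init(struct group *g)
{
    pthread_mutex_init(&g->lock, NULL);
    g->flags = BG_BLOCK_UNINIT;
    memset(g->bitmap, 0, sizeof(g->bitmap));
}

/* Reader side: (re)initialize the bitmap only while UNINIT is still set. */
static void read_block_bitmap(struct group *g)
{
    pthread_mutex_lock(&g->lock);
    if (g->flags & BG_BLOCK_UNINIT)
        memset(g->bitmap, 0, sizeof(g->bitmap));
    pthread_mutex_unlock(&g->lock);
}

/* Allocator side: set the bits AND clear the flag under the same lock. */
static void mark_diskspace_used(struct group *g, unsigned start, unsigned len)
{
    pthread_mutex_lock(&g->lock);
    for (unsigned i = start; i < start + len; i++)
        g->bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
    g->flags &= (uint16_t)~BG_BLOCK_UNINIT;
    pthread_mutex_unlock(&g->lock);
}

static int bit_is_set(const struct group *g, unsigned bit)
{
    return (g->bitmap[bit / 8] >> (bit % 8)) & 1;
}
```

Calling read_block_bitmap() again after mark_diskspace_used() leaves the allocated bits intact in this model, because the flag is already clear by the time the lock is released.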

ext4: fix BUG when calling ext4_error with locked block group · 5d1b1b3f

Committed by Aneesh Kumar K.V on Jan 05, 2009

The mballoc code likes to call ext4_error while it is holding locked
block groups. This can cause a "scheduling in atomic context" BUG. We
can't just unlock the block group and relock it after/if ext4_error
returns, since that might result in race conditions in the case where
the filesystem is set to continue after finding errors.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

24 Nov, 2008 1 commit

ext4: Fix lockdep recursive locking warning · b7be019e

Committed by Aneesh Kumar K.V on Nov 23, 2008

In ext4_mb_init_group(), if the filesystem block size is less than
PAGE_SIZE/2, the code tries to grab alloc_sem for multiple block
groups in a loop.  We need to allow for this by using
down_write_nested() and passing in the loop index as a lock subclass
number.  This works because no other code path needs to take multiple
alloc_sem's.  Note that lockdep will fail for filesystem blocksize
smaller than PAGE_SIZE/16 (e.g., a 1k filesystem blocksize with
a 32k page size, or a 2k filesystem blocksize with a 64k page size,
etc.).

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

06 Jan, 2009 1 commit

ext4: don't use blocks freed but not yet committed in buddy cache init · 7a2fcbf7

Committed by Aneesh Kumar K.V on Jan 05, 2009

When we generate the buddy cache (especially during resize) we need to
make sure we don't use blocks that were freed but not yet committed.  This
makes sure we have the right free block count in the group
info and also in the bitmap.  This also ensures ordered mode
consistency.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

07 Nov, 2008 1 commit

jbd2: Call journal commit callback without holding j_list_lock · fb68407b

Committed by Aneesh Kumar K.V on Nov 06, 2008

Avoid freeing the transaction in __jbd2_journal_drop_transaction() so
the journal commit callback can run without holding j_list_lock, to
avoid lock contention on this spinlock.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

26 Nov, 2008 1 commit

ext4: cleanup mballoc header files · c3a326a6

Committed by Aneesh Kumar K.V on Nov 25, 2008

Move some of the forward declarations of the static functions
to mballoc.c, where they are used. This enables us to include
mballoc.h in other .c files. Also correct the buddy cache
documentation.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

06 Jan, 2009 2 commits

ext4: Use EXT4_GROUP_INFO_NEED_INIT_BIT during resize · 920313a7

Committed by Aneesh Kumar K.V on Jan 05, 2009

The new groups added during resize are flagged as
need_init groups. Make sure we properly initialize these
groups. When we have block size < page size and we are adding
new groups, the page may still be marked uptodate even though
we haven't initialized the group. While forcing the init
of the buddy cache we need to make sure that other groups sharing the
same buddy cache page are not using the cache.
group_info->alloc_sem is added to ensure this.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

ext4: Add blocks added during resize to bitmap · e21675d4

Committed by Aneesh Kumar K.V on Jan 05, 2009

With this change, new blocks added during resize
are marked as free in the block bitmap and the
group is flagged with the EXT4_GROUP_INFO_NEED_INIT_BIT
flag.  This makes sure that when mballoc tries to allocate
blocks from the new group, we reload the
buddy information using the bitmap present on the disk.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

23 Nov, 2008 1 commit

ext4: sparse fixes · 3a06d778

Committed by Aneesh Kumar K.V on Nov 22, 2008

* Change EXT4_HAS_*_FEATURE to return a boolean
* Add a function prototype for ext4_fiemap() in ext4.h
* Make ext4_ext_fiemap_cb() and ext4_xattr_fiemap() be static functions
* Add lock annotations to mb_free_blocks()

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
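
The first bullet, making a feature test return a boolean instead of the raw masked bits, might look roughly like the sketch below. The function name and the feature bit value are hypothetical and are not taken from ext4.h. The idea is only that comparing the masked value against zero yields exactly 0 or 1, which keeps static checkers from flagging a bit pattern used as a truth value.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FEATURE_X 0x0040u  /* hypothetical feature bit */

/* Returns true or false, never the raw masked bits. */
static bool has_feature(uint32_t feature_word, uint32_t mask)
{
    return (feature_word & mask) != 0;
}
```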

05 Nov, 2008 2 commits

jbd2: Remove a large array of bh's from the stack of the checkpoint routine · 1a0d3786

Committed by Theodore Ts'o on Nov 05, 2008

jbd2_log_do_checkpoint() is one of the kernel's largest stack users.
Move the array of buffer heads from the stack of jbd2_log_do_checkpoint()
to the in-core journal structure.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Change unsigned long to unsigned int · 498e5f24

Committed by Theodore Ts'o on Nov 05, 2008

Convert the unsigned longs that are most responsible for bloating the
stack usage on 64-bit systems.

Nearly every place in the ext3/4 code that uses "unsigned long" is
probably a bug, since on 32-bit systems a ulong is 32 bits, which means
we are wasting stack space on 64-bit systems.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
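
To make the stack-space argument concrete, here is a small sketch comparing two hypothetical structs that differ only in the integer type. On LP64 targets such as 64-bit Linux, unsigned long is 8 bytes and unsigned int is 4, so each converted variable saves 4 bytes. The struct names are illustrative and do not come from the ext4 sources.

```c
#include <stddef.h>

/* Four counters declared the "old" way and the "new" way. */
struct with_ulong { unsigned long a, b, c, d; };
struct with_uint  { unsigned int  a, b, c, d; };

/* Bytes saved by the conversion on the platform this is compiled for. */
static size_t bytes_saved(void)
{
    return sizeof(struct with_ulong) - sizeof(struct with_uint);
}
```

On a 32-bit target both types are typically the same size and bytes_saved() returns 0, which is the commit's point: values that already fit in 32 bits there gain nothing from the wider type on 64-bit systems.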

06 Jan, 2009 1 commit

ext4: Make ext4_group_t be an unsigned int · a9df9a49

Committed by Theodore Ts'o on Jan 05, 2009

Nearly every place in the ext3/4 code that uses "unsigned long" is
probably a bug, since on 32-bit systems a ulong is 32 bits, which means
we are wasting stack space on 64-bit systems.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

05 Nov, 2008 1 commit

ext4: Remove i_ext_generation from ext4_inode_info structure · cde64360

Committed by Theodore Ts'o on Nov 04, 2008

The i_ext_generation was incremented, but never used.  Remove it to
slim down the ext4_inode_info structure.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

04 Jan, 2009 1 commit

ext4: add fsync batch tuning knobs · 30773840

Committed by Theodore Ts'o on Jan 03, 2009

Add new mount options, min_batch_time and max_batch_time, which
control how long the jbd2 layer should wait for additional filesystem
operations to get batched with a synchronous write transaction.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

17 Dec, 2008 1 commit

ext4: display average commit time · d7cfa468

Committed by Theodore Ts'o on Dec 17, 2008

Display the average commit time (which is used by the ext4 fsync
batching patch) in /proc/fs/jbd2/*/info for performance tuning
purposes.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

26 Nov, 2008 1 commit

jbd2: improve jbd2 fsync batching · e07f7183

Committed by Josef Bacik on Nov 26, 2008

This patch removes the static sleep time in favor of a more
self-optimizing approach where we measure the average amount of time it
takes to commit a transaction to disk and the amount of time a
transaction has been running. If somebody does a sync write or an
fsync(), traditionally we would sleep for 1 jiffy, which depending on
the value of HZ could be a significant amount of time compared to how
long it takes to commit a transaction to the underlying storage. With
this patch, instead of sleeping for a jiffy, we check to see if the
amount of time this transaction has been running is less than the
average commit time, and if it is we sleep for the delta using
schedule_hrtimeout to give us a higher precision sleep time. This
greatly benefits high-end storage, where you could otherwise end up
sleeping for longer than it takes to commit the transaction and
therefore sitting idle instead of allowing the transaction to be
committed. Keeping the sleep time to a minimum ensures you are always
doing something.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
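
The sleep decision described above reduces to a small calculation. The sketch below is a userspace illustration only: the function name and the nanosecond units are assumptions, and it computes just the duration, whereas the commit performs the actual sleep with schedule_hrtimeout.

```c
#include <stdint.h>

/*
 * How long to wait for more operations to join the running transaction:
 * the remaining part of the average commit time, or nothing if the
 * transaction has already been running at least that long.
 */
static uint64_t batch_sleep_ns(uint64_t avg_commit_ns, uint64_t trans_age_ns)
{
    if (trans_age_ns < avg_commit_ns)
        return avg_commit_ns - trans_age_ns;
    return 0;
}
```

For example, with an average commit time of 1000 ns and a transaction that has been running for 400 ns, the sketch yields a 600 ns wait. A transaction older than the average commit time is not delayed at all.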

06 Jan, 2009 3 commits

ext4: Don't overwrite allocation_context ac_status · 032115fc

Committed by Aneesh Kumar K.V on Jan 05, 2009

We can call ext4_mb_check_limits even after successfully allocating
the requested blocks.  In that case, make sure we don't overwrite
ac_status if it already has the status AC_STATUS_FOUND.  This fixes
the lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
2.6.28-rc6-autokern1 #1
---------------------------------------------
fsstress/11948 is trying to acquire lock:
 (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278
.....

stack backtrace:
.....
 [<c04db974>] ext4_mb_regular_allocator+0xbb5/0xd44
.....

but task is already holding lock:
 (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

ext4: remove extraneous newlines from calls to ext4_error() and ext4_warning() · fde4d95a

Committed by Theodore Ts'o on Jan 05, 2009

This removes annoying blank syslog entries emitted by ext4_error() or
ext4_warning(), since these functions add their own newline.

Signed-off-by: Nick Warne <nick@ukfsn.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

jbd2: Add barrier not supported test to journal_wait_on_commit_record · fd98496f

Committed by Theodore Ts'o on Jan 05, 2009

Xen doesn't report that barriers are not supported until buffer I/O is
reported as completed, instead of when the buffer I/O is submitted.
Add a check and a fallback codepath to journal_wait_on_commit_record()
to detect this case, so that attempts to mount ext4 filesystems on
LVM/devicemapper devices on Xen guests don't blow up with an "Aborting
journal on device XXX"; "Remounting filesystem read-only" error.

Thanks to Andreas Sundstrom for reporting this issue.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

07 Jan, 2009 1 commit

ext4: Allow ext4 to run without a journal · 0390131b

Committed by Frank Mayhar on Jan 07, 2009

A few weeks ago I posted a patch for discussion that allowed ext4 to run
without a journal.  Since that time I've integrated the excellent
comments from Andreas and fixed several serious bugs.  We're currently
running with this patch and generating some performance numbers against
both ext2 (with backported reservations code) and ext4 with and without
a journal.  It just so happens that running without a journal is
slightly faster for most everything.

We did
	iozone -T -t 4 -s 2g -r 256k -T -I -i0 -i1 -i2

which creates 4 threads, each of which create and do reads and writes on
a 2G file, with a buffer size of 256K, using O_DIRECT for all file opens
to bypass the page cache.  Results:

                     ext2        ext4, default   ext4, no journal
  initial writes   13.0 MB/s        15.4 MB/s          15.7 MB/s
  rewrites         13.1 MB/s        15.6 MB/s          15.9 MB/s
  reads            15.2 MB/s        16.9 MB/s          17.2 MB/s
  re-reads         15.3 MB/s        16.9 MB/s          17.2 MB/s
  random readers    5.6 MB/s         5.6 MB/s           5.7 MB/s
  random writers    5.1 MB/s         5.3 MB/s           5.4 MB/s 

So it seems that, so far, this was a useful exercise.

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

17 Dec, 2008 1 commit

ext4: Widen type of ext4_sb_info.s_mb_maxs[] · ff7ef329

Committed by Yasunori Goto on Dec 17, 2008

I chased down the cause of the following ext4 oops report, which was
seen in testing on an ia64 box.

http://bugzilla.kernel.org/show_bug.cgi?id=12018

The cause is the element type of the s_mb_maxs array, which is defined
as "unsigned short" in the ext4_sb_info structure.  If the file system's
block size is 8k or greater, an unsigned short is not wide enough to
contain the value fs->blocksize << 3.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: stable@kernel.org
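
The overflow can be reproduced with plain integer arithmetic. In the sketch below, uint16_t makes the 16-bit width of "unsigned short" explicit, which is an assumption that holds on the usual Linux targets. The helper names are hypothetical and only mirror the expression blocksize << 3 from the commit message.

```c
#include <stdint.h>

/* The value that has to be stored: number of bits in one block. */
static uint32_t bits_per_block(uint32_t blocksize)
{
    return blocksize << 3;
}

/* What survives when that value is squeezed into a 16-bit slot. */
static uint16_t stored_in_u16(uint32_t blocksize)
{
    return (uint16_t)bits_per_block(blocksize);
}
```

With a 4k block size the value 32768 still fits. With an 8k block size the value is 65536, which truncates to 0 in 16 bits.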

27 Nov, 2008 1 commit

ext4: When resizing set the EXT4_BG_INODE_ZEROED flag for new block groups · 93c0d863

Committed by Solofo.Ramangalahy@bull.net on Nov 26, 2008

The inode table has been zeroed in setup_new_group_blocks().  Mark it as
such in ext4_group_add().  Since we are currently clearing inode table
for the new block group, we should set the EXT4_BG_INODE_ZEROED flag.
If at some point in the future we don't immediately zero out the inode
table as part of the resize operation, then obviously we shouldn't do
this.

Signed-off-by: Solofo.Ramangalahy@bull.net
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

26 Nov, 2008 3 commits

ext4: Use simple_strtol() instead of simple_strtoul() in ext4_ui_proc_open · 23475e26

Committed by Roel Kluin on Nov 26, 2008

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

jbd2: Add BH_JBDPrivateStart · 171bbfbe

Committed by Mark Fasheh on Nov 25, 2008

Add this so that file systems using JBD2 can safely allocate unused b_state
bits.

In this case, we add it so that Ocfs2 can define a single bit for tracking
the validation state of a buffer.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix build warning · 25f1ee3a

Committed by Wu Fengguang on Nov 25, 2008

Replace `if' with `goto' to assure gcc that ix has been initialized.

Signed-off-by: Wu Fengguang <wfg@linux.intel.com>

06 Jan, 2009 2 commits

ext4: avoid ext4_error when mounting a fs with a single bg · 565a9617

Committed by Aneesh Kumar K.V on Jan 05, 2009

Remove some completely unneeded code which caused an ext4_error
to be generated when mounting a file system with only a single block
group.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

ext4: Fix the delalloc writepages to allocate blocks at the right offset. · 791b7f08

Committed by Aneesh Kumar K.V on Jan 05, 2009

When iterating through the pages which have mapped buffer_heads, we
failed to update the b_state value. This results in allocating blocks
at logical offset 0.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

05 Nov, 2008 1 commit

ext4: tone down ext4_da_writepages warnings · 2a21e37e

Committed by Theodore Ts'o on Nov 05, 2008

If the filesystem has errors, ext4_da_writepages() will return a *lot*
of errors, including lots and lots of stack dumps. While it's true
that we are dropping user data on the floor, which is unfortunate, the
stack dumps aren't helpful, and they tend to obscure the true original
root cause of the problem. So in the case where the filesystem has
aborted, return an EROFS right away.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

13 Dec, 2008 1 commit

ext4: remove do_blk_alloc() · 97df5d15

Committed by Theodore Ts'o on Dec 12, 2008

The convenience function do_blk_alloc() is a static function with only
one caller, so fold it into ext4_new_meta_blocks() to simplify the
code and to make it easier to understand.

To save more stack space, if count is a null pointer in
ext4_new_meta_blocks(), assume that the caller wanted a single block (and
if there is an error, no blocks were allocated).

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

08 Dec, 2008 1 commit

ext4: remove ext4_new_meta_block() · cfe82c85

Committed by Theodore Ts'o on Dec 07, 2008

There were only two callers of the function ext4_new_meta_block(),
which is just a very simple wrapper function around
ext4_new_meta_blocks().  Change those two functions to call
ext4_new_meta_blocks() directly, to save code and stack space usage.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

02 Jan, 2009 1 commit

ext4: remove ext4_new_blocks() and call ext4_mb_new_blocks() directly · 815a1130

Committed by Theodore Ts'o on Jan 01, 2009

There was only one caller of the compatibility function
ext4_new_blocks(), in balloc.c's ext4_alloc_blocks().  Change it to
call ext4_mb_new_blocks() directly, and remove ext4_new_blocks()
altogether.  This cleans up the code, by removing two extra functions
from the call chain, and hopefully saving some stack usage.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

07 Jan, 2009 1 commit

Update Documentation/filesystems/ext4.txt · 8e1a4857

Committed by Theodore Ts'o on Jan 06, 2009

Fix paragraph with recommendations on how to tune ext4 for benchmarks.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

07 Dec, 2008 1 commit

ext3/4: Fix loop index in do_split() so it is signed · 59e315b4

Committed by Theodore Ts'o on Dec 06, 2008

This fixes a gcc warning, but it doesn't appear able to result in a
failure, since the primary way the loop is exited is the first
conditional in the for loop, and at least for a consistent filesystem,
the signed/unsigned difference should in practice never be exposed.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
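
Why the signedness of a loop index matters can be shown with a generic count-down loop. This is a hypothetical illustration and not the loop from do_split(), which is not quoted in the commit message. It contrasts a signed index, which can become negative and end the loop, with an unsigned index, which wraps around to a huge value instead.

```c
#include <limits.h>

/* Walks indices count-1 .. 0; the "i >= 0" test works because i is signed. */
static int count_down_signed(int count)
{
    int visited = 0;
    for (int i = count - 1; i >= 0; i--)
        visited++;
    return visited;
}

/* An unsigned index decremented past zero wraps instead of going negative. */
static unsigned int decrement_unsigned_zero(void)
{
    unsigned int i = 0;
    i--;
    return i;
}
```

With an unsigned index the test `i >= 0` is always true, so such a loop can only end through some other condition. That matches the commit's remark that the loop is primarily exited by the first conditional.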

29 Oct, 2008 2 commits

ext4: Add support for non-native signed/unsigned htree hash algorithms · f99b2589

Committed by Theodore Ts'o on Oct 28, 2008

The original ext3 hash algorithms assumed that variables of type char
were signed, as God and K&R intended. Unfortunately, this assumption
is not true on some architectures. Userspace support for marking
filesystems with non-native signed/unsigned chars was added two years
ago, but the kernel-side support was never added (until now).

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext3: Add support for non-native signed/unsigned htree hash algorithms · 5e1f8c9e

Committed by Theodore Ts'o on Oct 28, 2008

The original ext3 hash algorithms assumed that variables of type char
were signed, as God and K&R intended.  Unfortunately, this assumption
is not true on some architectures.  Userspace support for marking
filesystems with non-native signed/unsigned chars was added two years
ago, but the kernel-side support was never added (until now).

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org
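
The effect of char signedness on a hash can be seen with a toy function. The hash below is deliberately NOT the ext3/ext4 htree hash; it is a hypothetical multiply-and-add hash used only to show that bytes with the high bit set contribute different values depending on whether they are read as signed or unsigned.

```c
#include <stdint.h>

/* Toy hash reading each byte as a signed char (sign-extended). */
static uint32_t toy_hash_signed(const char *s)
{
    uint32_t h = 0;
    for (; *s; s++)
        h = h * 31u + (uint32_t)(int)(signed char)*s;
    return h;
}

/* Same toy hash reading each byte as an unsigned char (zero-extended). */
static uint32_t toy_hash_unsigned(const char *s)
{
    uint32_t h = 0;
    for (; *s; s++)
        h = h * 31u + (uint32_t)(unsigned char)*s;
    return h;
}
```

For pure ASCII names both variants agree. They diverge as soon as a name contains a byte of 0x80 or above, which is why a filesystem has to record which variant was used when it was created.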

30 Oct, 2008 1 commit

ext4: fix printk format warning · 8f72fbdf

Committed by Alexander Beregalov on Oct 29, 2008

fs/ext4/balloc.c:607: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64'
fs/ext4/inode.c:1822: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64'
fs/ext4/inode.c:1824: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64'

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
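
A common way to silence this class of warning is to cast the argument to the type the format specifier expects. The commit message does not say how the fix was made, so the sketch below is an assumption about the general technique. It is shown in userspace with snprintf and an int64_t typedef standing in for the kernel's s64.

```c
#include <stdint.h>
#include <stdio.h>

typedef int64_t s64;  /* userspace stand-in for the kernel's s64 */

/* Formats an s64 with %lld; the cast makes the argument type match. */
static int format_s64(char *buf, size_t len, s64 value)
{
    return snprintf(buf, len, "%lld", (long long)value);
}
```

The cast is needed because s64 is not `long long` on every architecture, even where both are 64 bits wide.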

05 Jan, 2009 4 commits

Merge branch 'audit.b61' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current · fe0bdec6

Committed by Linus Torvalds on Jan 04, 2009

* 'audit.b61' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  audit: validate comparison operations, store them in sane form
  clean up audit_rule_{add,del} a bit
  make sure that filterkey of task,always rules is reported
  audit rules ordering, part 2
  fixing audit rule ordering mess, part 1
  audit_update_lsm_rules() misses the audit_inode_hash[] ones
  sanitize audit_log_capset()
  sanitize audit_fd_pair()
  sanitize audit_mq_open()
  sanitize AUDIT_MQ_SENDRECV
  sanitize audit_mq_notify()
  sanitize audit_mq_getsetattr()
  sanitize audit_ipc_set_perm()
  sanitize audit_ipc_obj()
  sanitize audit_socketcall
  don't reallocate buffer in every audit_sockaddr()


rtc: add alarm/update irq interfaces · 099e6576

Committed by Alessandro Zummo on Jan 04, 2009

Add standard interfaces for enabling alarm/update irqs.  Drivers are no
longer required to implement equivalent ioctl code, as rtc-dev will provide
it.

UIE emulation should now be handled correctly and will work even for those
RTC drivers that cannot be configured to do both UIE and AIE.

Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs: symlink write_begin allocation context fix · 54566b2c

Committed by Nick Piggin on Jan 04, 2009

With the write_begin/write_end aops, page_symlink was broken because it
could no longer pass a GFP_NOFS type mask into the point where the
allocations happened.  They are done in write_begin, which would always
assume that the filesystem can be entered from reclaim.  This bug could
cause filesystem deadlocks.

The funny thing with having a gfp_t mask there is that it doesn't really
allow the caller to arbitrarily tinker with the context in which it can be
called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
take the page lock.  The only thing any callers care about is __GFP_FS
anyway, so turn that into a single flag.

Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
this flag in their write_begin function.  Change __grab_cache_page to
accept a nofs argument as well, to honour that flag (while we're there,
change the name to grab_cache_page_write_begin which is more instructive
and does away with random leading underscores).

This is really a more flexible way to go in the end anyway -- if a
filesystem happens to want any extra allocations aside from the pagecache
ones in its write_begin function, it may now use GFP_KERNEL (rather than
GFP_NOFS) for common case allocations (e.g. ocfs2_alloc_write_ctxt, for a
random example).

[kosaki.motohiro@jp.fujitsu.com: fix ubifs]
[kosaki.motohiro@jp.fujitsu.com: fix fuse]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org>		[2.6.28.x]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Cleaned up the calling convention: just pass in the AOP flags
  untouched to the grab_cache_page_write_begin() function.  That
  just simplifies everybody, and may even allow future expansion of the
  logic.   - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
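
The idea of turning "may the allocation enter the filesystem?" into a single flag can be sketched as a mask operation. All constants and the function name below are hypothetical stand-ins with made-up values; they are not the kernel's definitions of __GFP_FS or AOP_FLAG_NOFS. The sketch only shows the shape of the decision a write_begin path would make.

```c
/* Hypothetical stand-in bit values, chosen only for this sketch. */
#define SKETCH_GFP_FS        0x1u  /* allocation may recurse into the fs */
#define SKETCH_GFP_WAIT      0x2u  /* allocation may sleep */
#define SKETCH_AOP_FLAG_NOFS 0x4u  /* caller asks for a NOFS allocation */

/* Drop the "may enter filesystem" bit when the caller passed the NOFS flag. */
static unsigned int write_begin_gfp(unsigned int base_gfp,
                                    unsigned int aop_flags)
{
    if (aop_flags & SKETCH_AOP_FLAG_NOFS)
        return base_gfp & ~SKETCH_GFP_FS;
    return base_gfp;
}
```

A caller such as a symlink path would pass the NOFS flag, while ordinary write paths would pass no flag and keep the full mask.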

viafb: fix crashes due to 4k stack overflow · e687d691

Committed by Bruno Prémont on Jan 04, 2009

The function viafb_cursor() uses 2 stack-variables of CURSOR_SIZE bits;
CURSOR_SIZE is defined as (8 * 1024).  Using up twice 1k on stack is too
much for 4k-stack (though it works with 8k-stacks).  Make those two
variables kzalloc'ed to preserve stack space.

Also merge the whole lot of local structs in viafb_ioctl into a union so
the stack usage gets minimized here as well.  (The structs are only accessed
in their individual IOCTL case.)  This second part is only compile-tested as
I know of no userspace app using the IOCTLs.

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Cc: <JosephChan@via.com.tw>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
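
The first change, moving a zeroed 1k buffer from the stack to the heap, has a direct userspace analogue. In the sketch below calloc stands in for kzalloc and the function name is hypothetical. CURSOR_SIZE is in bits as the commit message states, so the buffer is CURSOR_SIZE / 8 = 1024 bytes.

```c
#include <stdlib.h>

#define CURSOR_SIZE (8 * 1024)  /* in bits, as stated in the commit message */

/*
 * Returns a zero-filled buffer of CURSOR_SIZE bits from the heap instead
 * of declaring it as a local array. The caller must free() it and must
 * handle a NULL return.
 */
static unsigned char *alloc_cursor_buf(void)
{
    return calloc(CURSOR_SIZE / 8, 1);
}
```

Two such buffers as local arrays would take 2k, half of a 4k kernel stack, which is the overflow the commit describes.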

openeuler / Kernel synced successfully about 1 year ago
