- 26 11月, 2008 1 次提交
-
-
由 Josef Bacik 提交于
This patch removes the static sleep time in favor of a more self optimizing approach where we measure the average amount of time it takes to commit a transaction to disk and the ammount of time a transaction has been running. If somebody does a sync write or an fsync() traditionally we would sleep for 1 jiffies, which depending on the value of HZ could be a significant amount of time compared to how long it takes to commit a transaction to the underlying storage. With this patch instead of sleeping for a jiffie, we check to see if the amount of time this transaction has been running is less than the average commit time, and if it is we sleep for the delta using schedule_hrtimeout to give us a higher precision sleep time. This greatly benefits high end storage where you could end up sleeping for longer than it takes to commit the transaction and therefore sitting idle instead of allowing the transaction to be committed by keeping the sleep time to a minimum so you are sure to always be doing something. Signed-off-by: NJosef Bacik <jbacik@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 06 1月, 2009 3 次提交
-
-
由 Aneesh Kumar K.V 提交于
We can call ext4_mb_check_limits even after successfully allocating the requested blocks. In that case, make sure we don't overwrite ac_status if it already has the status AC_STATUS_FOUND. This fixes the lockdep warning: ============================================= [ INFO: possible recursive locking detected ] 2.6.28-rc6-autokern1 #1 --------------------------------------------- fsstress/11948 is trying to acquire lock: (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278 ..... stack backtrace: ..... [<c04db974>] ext4_mb_regular_allocator+0xbb5/0xd44 ..... but task is already holding lock: (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278 Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
由 Theodore Ts'o 提交于
This removes annoying blank syslog entries emitted by ext4_error() or ext4_warning(), since these functions add their own newline. Signed-off-by: NNick Warne <nick@ukfsn.org> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Xen doesn't report that barriers are not supported until buffer I/O is reported as completed, instead of when the buffer I/O is submitted. Add a check and a fallback codepath to journal_wait_on_commit_record() to detect this case, so that attempts to mount ext4 filesystems on LVM/devicemapper devices on Xen guests don't blow up with an "Aborting journal on device XXX"; "Remounting filesystem read-only" error. Thanks to Andreas Sundstrom for reporting this issue. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
- 07 1月, 2009 1 次提交
-
-
由 Frank Mayhar 提交于
A few weeks ago I posted a patch for discussion that allowed ext4 to run without a journal. Since that time I've integrated the excellent comments from Andreas and fixed several serious bugs. We're currently running with this patch and generating some performance numbers against both ext2 (with backported reservations code) and ext4 with and without a journal. It just so happens that running without a journal is slightly faster for most everything. We did iozone -T -t 4 s 2g -r 256k -T -I -i0 -i1 -i2 which creates 4 threads, each of which create and do reads and writes on a 2G file, with a buffer size of 256K, using O_DIRECT for all file opens to bypass the page cache. Results: ext2 ext4, default ext4, no journal initial writes 13.0 MB/s 15.4 MB/s 15.7 MB/s rewrites 13.1 MB/s 15.6 MB/s 15.9 MB/s reads 15.2 MB/s 16.9 MB/s 17.2 MB/s re-reads 15.3 MB/s 16.9 MB/s 17.2 MB/s random readers 5.6 MB/s 5.6 MB/s 5.7 MB/s random writers 5.1 MB/s 5.3 MB/s 5.4 MB/s So it seems that, so far, this was a useful exercise. Signed-off-by: NFrank Mayhar <fmayhar@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 12月, 2008 1 次提交
-
-
由 Yasunori Goto 提交于
I chased the cause of following ext4 oops report which is tested on ia64 box. http://bugzilla.kernel.org/show_bug.cgi?id=12018 The cause is the size of s_mb_maxs array that is defined as "unsigned short" in ext4_sb_info structure. If the file system's block size is 8k or greater, an unsigned short is not wide enough to contain the value fs->blocksize << 3. Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: stable@kernel.org
-
- 27 11月, 2008 1 次提交
-
-
The inode table has been zeroed in setup_new_group_blocks(). Mark it as such in ext4_group_add(). Since we are currently clearing inode table for the new block group, we should set the EXT4_BG_INODE_ZEROED flag. If at some point in the future we don't immediately zero out the inode table as part of the resize operation, then obviously we shouldn't do this. Signed-off-by: Solofo.Ramangalahy@bull.net Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 11月, 2008 3 次提交
-
-
由 Roel Kluin 提交于
Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Mark Fasheh 提交于
Add this so that file systems using JBD2 can safely allocate unused b_state bits. In this case, we add it so that Ocfs2 can define a single bit for tracking the validation state of a buffer. Signed-off-by: NMark Fasheh <mfasheh@suse.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Wu Fengguang 提交于
Replace `if' with `goto' to assure gcc that ix has been initialized. Signed-off-by: NWu Fengguang <wfg@linux.intel.com>
-
- 06 1月, 2009 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
Remove some completely unneeded code which which caused an ext4_error to be generated when mounting a file system with only a single block group. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
由 Aneesh Kumar K.V 提交于
When iterating through the pages which have mapped buffer_heads, we failed to update the b_state value. This results in allocating blocks at logical offset 0. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
- 05 11月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
If the filesystem has errors, ext4_da_writepages() will return a *lot* of errors, including lots and lots of stack dumps. While it's true that we are dropping user data on the floor, which is unfortunate, the stack dumps aren't helpful, and they tend to obscure the true original root cause of the problem. So in the case where the filesystem has aborted, return an EROFS right away. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 13 12月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
The convenience function do_blk_alloc() is a static function with only one caller, so fold it into ext4_new_meta_blocks() to simplify the code and to make it easier to understand. To save more stack space, if count is a null pointer in ext4_new_meta_blocks() assume that caller wanted a single block (and if there is an error, no blocks were allocated). Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 08 12月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
There were only two one callers of the function ext4_new_meta_block(), which just a very simpler wrapper function around ext4_new_meta_blocks(). Change those two functions to call ext4_new_meta_blocks() directly, to save code and stack space usage. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 02 1月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
There was only one caller of the compatibility function ext4_new_blocks(), in balloc.c's ext4_alloc_blocks(). Change it to call ext4_mb_new_blocks() directly, and remove ext4_new_blocks() altogether. This cleans up the code, by removing two extra functions from the call chain, and hopefully saving some stack usage. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 07 1月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
Fix paragraph with recommendations on how to tune ext4 for benchmarks. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 07 12月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
This fixes a gcc warning but it doesn't appear able to result in a failure, since the primary way the loop is exited is the first conditional in the for loop, and at least for a consistent filesystem, the signed/unsigned should in practice never be exposed. Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 29 10月, 2008 2 次提交
-
-
由 Theodore Ts'o 提交于
The original ext3 hash algorithms assumed that variables of type char were signed, as God and K&R intended. Unfortunately, this assumption is not true on some architectures. Userspace support for marking filesystems with non-native signed/unsigned chars was added two years ago, but the kernel-side support was never added (until now). Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
The original ext3 hash algorithms assumed that variables of type char were signed, as God and K&R intended. Unfortunately, this assumption is not true on some architectures. Userspace support for marking filesystems with non-native signed/unsigned chars was added two years ago, but the kernel-side support was never added (until now). Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org
-
- 30 10月, 2008 1 次提交
-
-
由 Alexander Beregalov 提交于
fs/ext4/balloc.c:607: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64' fs/ext4/inode.c:1822: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64' fs/ext4/inode.c:1824: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64' Signed-off-by: NAlexander Beregalov <a.beregalov@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 05 1月, 2009 19 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current由 Linus Torvalds 提交于
* 'audit.b61' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: audit: validate comparison operations, store them in sane form clean up audit_rule_{add,del} a bit make sure that filterkey of task,always rules is reported audit rules ordering, part 2 fixing audit rule ordering mess, part 1 audit_update_lsm_rules() misses the audit_inode_hash[] ones sanitize audit_log_capset() sanitize audit_fd_pair() sanitize audit_mq_open() sanitize AUDIT_MQ_SENDRECV sanitize audit_mq_notify() sanitize audit_mq_getsetattr() sanitize audit_ipc_set_perm() sanitize audit_ipc_obj() sanitize audit_socketcall don't reallocate buffer in every audit_sockaddr()
-
由 Alessandro Zummo 提交于
Add standard interfaces for alarm/update irqs enabling. Drivers are no more required to implement equivalent ioctl code as rtc-dev will provide it. UIE emulation should now be handled correctly and will work even for those RTC drivers who cannot be configured to do both UIE and AIE. Signed-off-by: NAlessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Piggin 提交于
With the write_begin/write_end aops, page_symlink was broken because it could no longer pass a GFP_NOFS type mask into the point where the allocations happened. They are done in write_begin, which would always assume that the filesystem can be entered from reclaim. This bug could cause filesystem deadlocks. The funny thing with having a gfp_t mask there is that it doesn't really allow the caller to arbitrarily tinker with the context in which it can be called. It couldn't ever be GFP_ATOMIC, for example, because it needs to take the page lock. The only thing any callers care about is __GFP_FS anyway, so turn that into a single flag. Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on this flag in their write_begin function. Change __grab_cache_page to accept a nofs argument as well, to honour that flag (while we're there, change the name to grab_cache_page_write_begin which is more instructive and does away with random leading underscores). This is really a more flexible way to go in the end anyway -- if a filesystem happens to want any extra allocations aside from the pagecache ones in ints write_begin function, it may now use GFP_KERNEL (rather than GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a random example). [kosaki.motohiro@jp.fujitsu.com: fix ubifs] [kosaki.motohiro@jp.fujitsu.com: fix fuse] Signed-off-by: NNick Piggin <npiggin@suse.de> Reviewed-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> [ Cleaned up the calling convention: just pass in the AOP flags untouched to the grab_cache_page_write_begin() function. That just simplifies everybody, and may even allow future expansion of the logic. - Linus ] Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bruno Prémont 提交于
The function viafb_cursor() uses 2 stack-variables of CURSOR_SIZE bits; CURSOR_SIZE is defined as (8 * 1024). Using up twice 1k on stack is too much for 4k-stack (though it works with 8k-stacks). Make those two variables kzalloc'ed to preserve stack space. Also merge the whole lot of local struct's in viafb_ioctl into a union so the stack usage gets minimized here as well. (struct's are only accessed in their indicidual IOCTL case) This second part is only compile-tested as I know of no userspace app using the IOCTLs. Signed-off-by: NBruno Prémont <bonbons@linux-vserver.org> Cc: <JosephChan@via.com.tw> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pekka Enberg 提交于
As suggested by Andreas Dilger, introduce a bgl_lock_ptr() helper in <linux/blockgroup_lock.h> and add separate sb_bgl_lock() helpers to filesystem specific header files to break the hidden dependency to struct ext[234]_sb_info. Also, while at it, convert the macros to static inlines to try make up for all the times I broke Andrew Morton's tree. Acked-by: NAndreas Dilger <adilger@sun.com> Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Randy Dunlap 提交于
Include header files as used/needed: In file included from drivers/leds/leds-dac124s085.c:16: include/linux/spi/spi.h:66: error: field 'dev' has incomplete type include/linux/spi/spi.h: In function 'to_spi_device': include/linux/spi/spi.h:100: warning: type defaults to 'int' in declaration of '__mptr' ... Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adam Lackorzynski 提交于
The flush_cache_vmap in vmap_page_range() is called with the end of the range twice. The following patch fixes this for me. Signed-off-by: NAdam Lackorzynski <adam@os.inf.tu-dresden.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Li Zefan 提交于
The race is calling cgroup_clone() while umounting the ns cgroup subsys, and thus cgroup_clone() might access invalid cgroup_fs, or kill_sb() is called after cgroup_clone() created a new dir in it. The BUG I triggered is BUG_ON(root->number_of_cgroups != 1); ------------[ cut here ]------------ kernel BUG at kernel/cgroup.c:1093! invalid opcode: 0000 [#1] SMP ... Process umount (pid: 5177, ti=e411e000 task=e40c4670 task.ti=e411e000) ... Call Trace: [<c0493df7>] ? deactivate_super+0x3f/0x51 [<c04a3600>] ? mntput_no_expire+0xb3/0xdd [<c04a3ab2>] ? sys_umount+0x265/0x2ac [<c04a3b06>] ? sys_oldumount+0xd/0xf [<c0403911>] ? sysenter_do_call+0x12/0x31 ... EIP: [<c0456e76>] cgroup_kill_sb+0x23/0xe0 SS:ESP 0068:e411ef2c ---[ end trace c766c1be3bf944ac ]--- Cc: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Al Viro 提交于
Don't store the field->op in the messy (and very inconvenient for e.g. audit_comparator()) form; translate to dense set of values and do full validation of userland-submitted value while we are at it. ->audit_init_rule() and ->audit_match_rule() get new values now; in-tree instances updated. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Fix the actual rule listing; add per-type lists _not_ used for matching, with all exit,... sitting on one such list. Simplifies "do something for all rules" logics, while we are at it... Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Problem: ordering between the rules on exit chain is currently lost; all watch and inode rules are listed after everything else _and_ exit,never on one kind doesn't stop exit,always on another from being matched. Solution: assign priorities to rules, keep track of the current highest-priority matching rule and its result (always/never). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
* no allocations * return void * don't duplicate checked for dummy context Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
* no allocations * return void Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
* don't bother with allocations * don't do double copy_from_user() * don't duplicate parts of check for audit_dummy_context() Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
* logging the original value of *msg_prio in mq_timedreceive(2) is insane - the argument is write-only (i.e. syscall always ignores the original value and only overwrites it). * merge __audit_mq_timed{send,receive} * don't do copy_from_user() twice * don't mess with allocations in auditsc part * ... and don't bother checking !audit_enabled and !context in there - we'd already checked for audit_dummy_context(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
* don't copy_from_user() twice * don't bother with allocations * don't duplicate parts of audit_dummy_context() * make it return void Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-