- 17 12月, 2010 2 次提交
-
-
由 Tejun Heo 提交于
Currently, media presence polling for removeable block devices is done from userland. There are several issues with this. * Polling is done by periodically opening the device. For SCSI devices, the command sequence generated by such action involves a few different commands including TEST_UNIT_READY. This behavior, while perfectly legal, is different from Windows which only issues single command, GET_EVENT_STATUS_NOTIFICATION. Unfortunately, some ATAPI devices lock up after being periodically queried such command sequences. * There is no reliable and unintrusive way for a userland program to tell whether the target device is safe for media presence polling. For example, polling for media presence during an on-going burning session can make it fail. The polling program can avoid this by opening the device with O_EXCL but then it risks making a valid exclusive user of the device fail w/ -EBUSY. * Userland polling is unnecessarily heavy and in-kernel implementation is lighter and better coordinated (workqueue, timer slack). This patch implements framework for in-kernel disk event handling, which includes media presence polling. * bdops->check_events() is added, which supercedes ->media_changed(). It should check whether there's any pending event and return if so. Currently, two events are defined - DISK_EVENT_MEDIA_CHANGE and DISK_EVENT_EJECT_REQUEST. ->check_events() is guaranteed not to be called parallelly. * gendisk->events and ->async_events are added. These should be initialized by block driver before passing the device to add_disk(). The former contains the mask of all supported events and the latter the mask of all events which the device can report without polling. /sys/block/*/events[_async] export these to userland. * Kernel parameter block.events_dfl_poll_msecs controls the system polling interval (default is 0 which means disable) and /sys/block/*/events_poll_msecs control polling intervals for individual devices (default is -1 meaning use system setting). Note that if a device can report all supported events asynchronously and its polling interval isn't explicitly set, the device won't be polled regardless of the system polling interval. * If a device is opened exclusively with write access, event checking is automatically disabled until all write exclusive accesses are released. * There are event 'clearing' events. For example, both of currently defined events are cleared after the device has been successfully opened. This information is passed to ->check_events() callback using @clearing argument as a hint. * Event checking is always performed from system_nrt_wq and timer slack is set to 25% for polling. * Nothing changes for drivers which implement ->media_changed() but not ->check_events(). Going forward, all drivers will be converted to ->check_events() and ->media_change() will be dropped. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Tejun Heo 提交于
There's no reason for register_disk() and del_gendisk() to be in fs/partitions/check.c. Move both to genhd.c. While at it, collapse unlink_gendisk(), which was artificially in a separate function due to genhd.c / check.c split, into del_gendisk(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 15 11月, 2010 1 次提交
-
-
由 Steven Whitehouse 提交于
This area of the code has always been a bit delicate due to the subtleties of lock ordering. The problem is that for "normal" alloc/dealloc, we always grab the inode locks first and the rgrp lock later. In order to ensure no races in looking up the unlinked, but still allocated inodes, we need to hold the rgrp lock when we do the lookup, which means that we can't take the inode glock. The solution is to borrow the technique already used by NFS to solve what is essentially the same problem (given an inode number, look up the inode carefully, checking that it really is in the expected state). We cannot do that directly from the allocation code (lock ordering again) so we give the job to the pre-existing delete workqueue and carry on with the allocation as normal. If we find there is no space, we do a journal flush (required anyway if space from a deallocation is to be released) which should block against the pending deallocations, so we should always get the space back. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 13 11月, 2010 7 次提交
-
-
由 Tao Ma 提交于
Commit 83fd9c7f changes l_level, l_requested and l_blocking of ocfs2_lock_res from int to unsigned char. But actually it is initially as -1(ocfs2_lock_res_init_common) which correspoding to 255 for unsigned char. So the whole dlm lock mechanism doesn't work now which means a disaster to ocfs2. Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Signed-off-by: NTao Ma <tao.ma@oracle.com> Signed-off-by: NJoel Becker <joel.becker@oracle.com>
-
由 Tejun Heo 提交于
After recent blkdev_get() modifications, open_by_devnum() and open_bdev_exclusive() are simple wrappers around blkdev_get(). Replace them with blkdev_get_by_dev() and blkdev_get_by_path(). blkdev_get_by_dev() is identical to open_by_devnum(). blkdev_get_by_path() is slightly different in that it doesn't automatically add %FMODE_EXCL to @mode. All users are converted. Most conversions are mechanical and don't introduce any behavior difference. There are several exceptions. * btrfs now sets FMODE_EXCL in btrfs_device->mode, so there's no reason to OR it explicitly on blkdev_put(). * gfs2, nilfs2 and the generic mount_bdev() now set FMODE_EXCL in sb->s_mode. * With the above changes, sb->s_mode now always should contain FMODE_EXCL. WARN_ON_ONCE() added to kill_block_super() to detect errors. The new blkdev_get_*() functions are with proper docbook comments. While at it, add function description to blkdev_get() too. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Neil Brown <neilb@suse.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Joern Engel <joern@lazybastard.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: Jan Kara <jack@suse.cz> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: reiserfs-devel@vger.kernel.org Cc: xfs-masters@oss.sgi.com Cc: Alexander Viro <viro@zeniv.linux.org.uk>
-
由 Tejun Heo 提交于
bdev read-only status can be queried using bdev_read_only() and may change while the device is being opened. Enforce it by checking it from blkdev_get() after open succeeds. This makes bdev_read_only() check in open_bdev_exclusive() and fsg_lun_open() unnecessary. Drop them. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: linux-usb@vger.kernel.org
-
由 Tejun Heo 提交于
With claim/release rolled into blkdev_get/put(), there's no reason to keep bd_abort/finish_claim(), __bd_claim() and bd_release() as separate functions. It only makes the code difficult to follow. Collapse them into blkdev_get/put(). This will ease future changes around claim/release. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Over time, block layer has accumulated a set of APIs dealing with bdev open, close, claim and release. * blkdev_get/put() are the primary open and close functions. * bd_claim/release() deal with exclusive open. * open/close_bdev_exclusive() are combination of open and claim and the other way around, respectively. * bd_link/unlink_disk_holder() to create and remove holder/slave symlinks. * open_by_devnum() wraps bdget() + blkdev_get(). The interface is a bit confusing and the decoupling of open and claim makes it impossible to properly guarantee exclusive access as in-kernel open + claim sequence can disturb the existing exclusive open even before the block layer knows the current open if for another exclusive access. Reorganize the interface such that, * blkdev_get() is extended to include exclusive access management. @holder argument is added and, if is @FMODE_EXCL specified, it will gain exclusive access atomically w.r.t. other exclusive accesses. * blkdev_put() is similarly extended. It now takes @mode argument and if @FMODE_EXCL is set, it releases an exclusive access. Also, when the last exclusive claim is released, the holder/slave symlinks are removed automatically. * bd_claim/release() and close_bdev_exclusive() are no longer necessary and either made static or removed. * bd_link_disk_holder() remains the same but bd_unlink_disk_holder() is no longer necessary and removed. * open_bdev_exclusive() becomes a simple wrapper around lookup_bdev() and blkdev_get(). It also has an unexpected extra bdev_read_only() test which probably should be moved into blkdev_get(). * open_by_devnum() is modified to take @holder argument and pass it to blkdev_get(). Most of bdev open/close operations are unified into blkdev_get/put() and most exclusive accesses are tested atomically at the open time (as it should). This cleans up code and removes some, both valid and invalid, but unnecessary all the same, corner cases. open_bdev_exclusive() and open_by_devnum() can use further cleanup - rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop special features. Well, let's leave them for another day. Most conversions are straight-forward. drbd conversion is a bit more involved as there was some reordering, but the logic should stay the same. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Brown <neilb@suse.de> Acked-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: NMike Snitzer <snitzer@redhat.com> Acked-by: NPhilipp Reisner <philipp.reisner@linbit.com> Cc: Peter Osterlund <petero2@telia.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: dm-devel@redhat.com Cc: drbd-dev@lists.linbit.com Cc: Leo Chen <leochen@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Joern Engel <joern@logfs.org> Cc: reiserfs-devel@vger.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk>
-
由 Tejun Heo 提交于
Code to manage symlinks in /sys/block/*/{holders|slaves} are overly complex with multiple holder considerations, redundant extra references to all involved kobjects, unused generic kobject holder support and unnecessary mixup with bd_claim/release functionalities. Strip it down to what's necessary (single gendisk holder) and make it use a separate interface. This is a step for cleaning up bd_claim/release. This patch makes dm-table slightly more complex but it will be simplified again with further changes. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Brown <neilb@suse.de> Acked-by: NMike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com
-
由 Tejun Heo 提交于
In the failure path of __btrfs_open_devices(), close_bdev_exclusive() is called with @flags which doesn't match the one used during open_bdev_exclusive(). Fix it. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Chris Mason <chris.mason@oracle.com>
-
- 12 11月, 2010 1 次提交
-
-
由 Dave Jones 提交于
WARN_ONCE is a bit strong for a deprecation warning, given that it spews a huge backtrace. Signed-off-by: NDave Jones <davej@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 11月, 2010 11 次提交
-
-
由 Kay Sievers 提交于
We already export 'ro' for the disk. This adds the same attribute for partitions. Cc: Karel Zak <kzak@redhat.com> Signed-off-by: NKay Sievers <kay.sievers@vrfy.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Christoph Hellwig 提交于
In commit 20cb52eb, titled "xfs: simplify xfs_vm_writepage" I added an assert that any !mapped and uptodate buffers are not dirty. That asserts turns out to trigger a lot when running fsx on filesystems with small block sizes. The reason for that is that the assert is simply incorrect. !mapped and uptodate just mean this buffer covers a hole, and whenever we do a set_page_dirty we mark all blocks in the page dirty, no matter if they have data or not. So remove the assert, and update the comment above the condition to match reality. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
由 J. Bruce Fields 提交于
A minor oversight from f7347ce4, "fasync: re-organize fasync entry insertion to allow it under a spinlock": this cleanup-on-error was only needed to handle -ENOMEM. Now that we're preallocating it's unneeded. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
We must also free the passed-in lease in the case it wasn't used because an existing lease was upgrade/downgraded or already existed. Note the nfsd caller doesn't care because it's fl_change callback returns an error in those cases. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Christoph Hellwig 提交于
XFS does not need it's inodes to actuall be hashed in the VFS inode cache, but we require the inode to be marked hashed for the writeback code to work. Insted of using insert_inode_hash, which requires a second inode_lock roundtrip after the partial merge of the inode scalability patches in 2.6.37-rc simply use the new hlist_add_fake helper to mark it hashed without requiring a lock or touching a global cache line. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
由 Christoph Hellwig 提交于
Andi Kleen reported that gcc-4.5 gives lots of warnings for him inside the XFS code. It turned out most of them are due to the quota stubs beeing macros, and gcc now complaining about macros evaluating to 0 that are not assigned to variables. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
由 Christoph Hellwig 提交于
The filestreams code may take the iolock on the parent inode while holding it on a child. This is the only place in XFS where we take both the child and parent iolock, so just telling lockdep about it is enough. The lock flag required for that was already added as part of the ilock lockdep annotations and unused so far. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
由 Dave Chinner 提交于
The delayed write buffer split trace currently issues a trace for every buffer it scans. These buffers are not necessarily queued for delayed write. Indeed, when buffers are pinned, there can be thousands of traces of buffers that aren't actually queued for delayed write and the ones that are are lost in the noise. Move the trace point to record only buffers that are split out for IO to be issued on. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
由 Dave Chinner 提交于
The walk fails to decrement the per-ag reference count when the non-blocking walk fails to obtain the per-ag reclaim lock, leading to an assert failure on debug kernels when unmounting a filesystem. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
由 Kulikov Vasiliy 提交于
al_hreq is copied from userland. If al_hreq.buflen is not properly aligned then xfs_attr_list will ignore the last bytes of kbuf. These bytes are unitialized. It leads to leaking of contents of kernel stack memory. Signed-off-by: NVasiliy Kulikov <segooon@gmail.com> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
由 Christoph Hellwig 提交于
We promised to do this for 2.6.37, and the code looks stable enough to keep that promise. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
- 10 11月, 2010 4 次提交
-
-
由 Sergey Senozhatsky 提交于
Commit 4221a991 "Add RCU check for find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns= Assertion failed in sys_ioprio_get. The patch is fixing assertion failure in ioprio_set as well. kernel/pid.c:419 invoked rcu_dereference_check() without protection! stack backtrace: Pid: 4254, comm: iotop Not tainted Call Trace: [<ffffffff810656f2>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff81053c67>] find_task_by_pid_ns+0x4f/0x68 [<ffffffff81053c9d>] find_task_by_vpid+0x1d/0x1f [<ffffffff811104e2>] sys_ioprio_get+0x50/0x2da [<ffffffff81002182>] system_call_fastpath+0x16/0x1b V2: rcu critical section expanded according to comment by Paul E. McKenney Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Daniel J Blueman 提交于
With 2.6.37-rc1, I observe sys_ioprio_set not taking the RCU lock [1] across access to the task credentials. Inspecting the code in fs/ioprio.c, the tasklist_lock is held for read across the __task_cred call, which is presumably sufficient to prevent the task credentials becoming stale. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- kernel/pid.c:419 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 1 lock held by start-stop-daem/2246: #0: (tasklist_lock){.?.?..}, at: [<ffffffff811a2dfa>] sys_ioprio_set+0x8a/0x400 stack backtrace: Pid: 2246, comm: start-stop-daem Not tainted 2.6.37-rc1-330cd+ #2 Call Trace: [<ffffffff8109f5f4>] lockdep_rcu_dereference+0xa4/0xc0 [<ffffffff81085651>] find_task_by_pid_ns+0x81/0x90 [<ffffffff8108567d>] find_task_by_vpid+0x1d/0x20 [<ffffffff811a3160>] sys_ioprio_set+0x3f0/0x400 [<ffffffff816efa79>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81003482>] system_call_fastpath+0x16/0x1b Take the RCU lock for read across acquiring the pointer to the task credentials and dereferencing it. Signed-off-by: NDaniel J Blueman <daniel.blueman@gmail.com> Fixed up by Jens to fix missing rcu_read_unlock() on mismatches. Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Jens Axboe 提交于
If the iovec is being set up in a way that causes uaddr + PAGE_SIZE to overflow, we could end up attempting to map a huge number of pages. Check for this invalid input type. Reported-by: NDan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Jens Axboe 提交于
Reported-by: NDan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 09 11月, 2010 8 次提交
-
-
由 Suresh Jayaraman 提交于
Andrew Hendry reported a kmemleak warning in 2.6.37-rc1 while editing a text file with gedit over cifs. unreferenced object 0xffff88022ee08b40 (size 32): comm "gedit", pid 2524, jiffies 4300160388 (age 2633.655s) hex dump (first 32 bytes): 5c 2e 67 6f 75 74 70 75 74 73 74 72 65 61 6d 2d \.goutputstream- 35 42 41 53 4c 56 00 de 09 00 00 00 2c 26 78 ee 5BASLV......,&x. backtrace: [<ffffffff81504a4d>] kmemleak_alloc+0x2d/0x60 [<ffffffff81136e13>] __kmalloc+0xe3/0x1d0 [<ffffffffa0313db0>] build_path_from_dentry+0xf0/0x230 [cifs] [<ffffffffa031ae1e>] cifs_setattr+0x9e/0x770 [cifs] [<ffffffff8115fe90>] notify_change+0x170/0x2e0 [<ffffffff81145ceb>] sys_fchmod+0x10b/0x140 [<ffffffff8100c172>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff The commit 1025774c that removed inode_setattr() seems to have introduced this memleak by returning early without freeing 'full_path'. Reported-by: NAndrew Hendry <andrew.hendry@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NSuresh Jayaraman <sjayaraman@suse.de> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
由 Meelis Roos 提交于
Fix openpromfs compilation by adding a missing semicolon in fs/openpromfs/inode.c openprom_mount(). Signed-off-by: NMeelis Roos <mroos@linux.ee> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jeff Layton 提交于
Commit 13cfb733 made cifs_ioctl use the tlink attached to the cifsFileInfo for a filp. This ignores the case of an open directory however, which in CIFS can have a NULL private_data until a readdir is done on it. This patch re-adds the NULL pointer checks that were removed in commit 50ae28f0 and moves the setting of tcon and "caps" variables lower. Long term, a better fix would be to establish a f_op->open routine for directories that populates that field at open time, but that requires some other changes to how readdir calls are handled. Reported-by: NKjell Rune Skaaraas <kjella79@yahoo.no> Reviewed-and-Tested-by: NSuresh Jayaraman <sjayaraman@suse.de> Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
由 Theodore Ts'o 提交于
Add ext4_evict_inode, ext4_drop_inode, ext4_mark_inode_dirty, and ext4_begin_ordered_truncate() Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Commit 5c521830 (ext4: Support discard requests when running in no-journal mode) attempts to add sb_issue_discard() for data blocks (in data=writeback mode) and in no-journal mode. Unfortunately, this no longer works, because in commit dd3932ed (block: remove BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous interface, and there are times when we call ext4_free_blocks() when we are are holding a spinlock, or are otherwise in an atomic context. For now, I've removed the call to sb_issue_discard() to prevent a deadlock or (if spinlock debugging is enabled) failures like this: BUG: scheduling while atomic: rc.sysinit/1376/0x00000002 Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1 Call Trace: [<ffffffff810397ce>] __schedule_bug+0x5e/0x70 [<ffffffff81403110>] schedule+0x950/0xa70 [<ffffffff81060bad>] ? insert_work+0x7d/0x90 [<ffffffff81060fbd>] ? queue_work_on+0x1d/0x30 [<ffffffff81061127>] ? queue_work+0x37/0x60 [<ffffffff8140377d>] schedule_timeout+0x21d/0x360 [<ffffffff812031c3>] ? generic_make_request+0x2c3/0x540 [<ffffffff81402680>] wait_for_common+0xc0/0x150 [<ffffffff81041490>] ? default_wake_function+0x0/0x10 [<ffffffff812034bc>] ? submit_bio+0x7c/0x100 [<ffffffff810680a0>] ? wake_bit_function+0x0/0x40 [<ffffffff814027b8>] wait_for_completion+0x18/0x20 [<ffffffff8120a969>] blkdev_issue_discard+0x1b9/0x210 [<ffffffff811ba03e>] ext4_free_blocks+0x68e/0xb60 [<ffffffff811b1650>] ? __ext4_handle_dirty_metadata+0x110/0x120 [<ffffffff811b098c>] ext4_ext_truncate+0x8cc/0xa70 [<ffffffff810d713e>] ? pagevec_lookup+0x1e/0x30 [<ffffffff81191618>] ext4_truncate+0x178/0x5d0 [<ffffffff810eacbb>] ? unmap_mapping_range+0xab/0x280 [<ffffffff810d8976>] vmtruncate+0x56/0x70 [<ffffffff811925cb>] ext4_setattr+0x14b/0x460 [<ffffffff811319e4>] notify_change+0x194/0x380 [<ffffffff81117f80>] do_truncate+0x60/0x90 [<ffffffff811e08fa>] ? security_inode_permission+0x1a/0x20 [<ffffffff811eaec1>] ? tomoyo_path_truncate+0x11/0x20 [<ffffffff81127539>] do_last+0x5d9/0x770 [<ffffffff811278bd>] do_filp_open+0x1ed/0x680 [<ffffffff8140644f>] ? page_fault+0x1f/0x30 [<ffffffff81132bfc>] ? alloc_fd+0xec/0x140 [<ffffffff81118db1>] do_sys_open+0x61/0x120 [<ffffffff81118e8b>] sys_open+0x1b/0x20 [<ffffffff81002e6b>] system_call_fastpath+0x16/0x1b https://bugzilla.kernel.org/show_bug.cgi?id=22302Reported-by: NMathias Burén <mathias.buren@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: jiayingz@google.com
-
由 Dmitry Monakhov 提交于
It's not needed to sync the filesystem, and it fixes a lock_dep complaint. Signed-off-by: NDmitry Monakhov <dmonakhov@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Theodore Ts'o 提交于
Use an atomic_t and make sure we don't free the structure while we might still be submitting I/O for that page. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
The following BUG can occur when an inode which is getting freed when it still has dirty pages outstanding, and it gets deleted (in this because it was the target of a rename). In ordered mode, we need to make sure the data pages are written just in case we crash before the rename (or unlink) is committed. If the inode is being freed then when we try to igrab the inode, we end up tripping the BUG_ON at fs/ext4/page-io.c:146. To solve this problem, we need to keep track of the number of io callbacks which are pending, and avoid destroying the inode until they have all been completed. That way we don't have to bump the inode count to keep the inode from being destroyed; an approach which doesn't work because the count could have already been dropped down to zero before the inode writeback has started (at which point we're not allowed to bump the count back up to 1, since it's already started getting freed). Thanks to Dave Chinner for suggesting this approach, which is also used by XFS. kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146! Call Trace: [<ffffffff811075b1>] ext4_bio_write_page+0x172/0x307 [<ffffffff811033a7>] mpage_da_submit_io+0x2f9/0x37b [<ffffffff811068d7>] mpage_da_map_and_submit+0x2cc/0x2e2 [<ffffffff811069b3>] mpage_add_bh_to_extent+0xc6/0xd5 [<ffffffff81106c66>] write_cache_pages_da+0x2a4/0x3ac [<ffffffff81107044>] ext4_da_writepages+0x2d6/0x44d [<ffffffff81087910>] do_writepages+0x1c/0x25 [<ffffffff810810a4>] __filemap_fdatawrite_range+0x4b/0x4d [<ffffffff810815f5>] filemap_fdatawrite_range+0xe/0x10 [<ffffffff81122a2e>] jbd2_journal_begin_ordered_truncate+0x7b/0xa2 [<ffffffff8110615d>] ext4_evict_inode+0x57/0x24c [<ffffffff810c14a3>] evict+0x22/0x92 [<ffffffff810c1a3d>] iput+0x212/0x249 [<ffffffff810bdf16>] dentry_iput+0xa1/0xb9 [<ffffffff810bdf6b>] d_kill+0x3d/0x5d [<ffffffff810be613>] dput+0x13a/0x147 [<ffffffff810b990d>] sys_renameat+0x1b5/0x258 [<ffffffff81145f71>] ? _atomic_dec_and_lock+0x2d/0x4c [<ffffffff810b2950>] ? cp_new_stat+0xde/0xea [<ffffffff810b29c1>] ? sys_newlstat+0x2d/0x38 [<ffffffff810b99c6>] sys_rename+0x16/0x18 [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b Reported-by: NNick Bowler <nbowler@elliptictech.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Tested-by: NNick Bowler <nbowler@elliptictech.com>
-
- 06 11月, 2010 1 次提交
-
-
由 Pavel Shilovsky 提交于
All the callers already have a pointer to struct cifsInodeInfo. Use it. Signed-off-by: NSuresh Jayaraman <sjayaraman@suse.de> Signed-off-by: NPavel Shilovsky <piastryyy@gmail.com> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
- 05 11月, 2010 2 次提交
-
-
由 Jeff Layton 提交于
This patch is based on Dan's original patch. His original description is below: Smatch complained about a couple checking for NULL after dereferencing bugs. I'm not super familiar with the code so I did the conservative thing and move the dereferences after the checks. The dereferences in cifs_lock() and cifs_fsync() were added in ba00ba64 "cifs: make various routines use the cifsFileInfo->tcon pointer". The dereference in find_writable_file() was added in 6508d904 "cifs: have find_readable/writable_file filter by fsuid". The comments there say it's possible to trigger the NULL dereference under stress. Signed-off-by: NDan Carpenter <error27@gmail.com> Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
由 Suresh Jayaraman 提交于
Noticed while reviewing (late) the rbtree conversion patchset (which has been merged already). Cc: Jeff Layton <jlayton@redhat.com> Signed-off-by: NSuresh Jayaraman <sjayaraman@suse.de> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
- 04 11月, 2010 1 次提交
-
-
由 Theodore Ts'o 提交于
We now initialize the percpu counters before replaying the journal, but after the journal, we recalculate the global counters, to deal with the possibility of the per-blockgroup counts getting updated by the journal replay. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 03 11月, 2010 2 次提交
-
-
由 J. Bruce Fields 提交于
If a connection is closed just after a sequence or create_session is sent over it, we could end up trying to register a callback that will never get called since the xprt is already marked dead. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Steve French 提交于
Signed-off-by: NSteve French <sfrench@us.ibm.com>
-