- 24 7月, 2011 3 次提交
-
-
由 Dan Williams 提交于
Some systems benefit from completions always being steered to the strict requester cpu rather than the looser "per-socket" steering that blk_cpu_to_group() attempts by default. This is because the first CPU in the group mask ends up being completely overloaded with work, while the others (including the original submitter) has power left to spare. Allow the strict mode to be set by writing '2' to the sysfs control file. This is identical to the scheme used for the nomerges file, where '2' is a more aggressive setting than just being turned on. echo 2 > /sys/block/<bdev>/queue/rq_affinity Cc: Christoph Hellwig <hch@infradead.org> Cc: Roland Dreier <roland@purestorage.com> Tested-by: NDave Jiang <dave.jiang@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Mikulas Patocka 提交于
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu synchronize_rcu sleeps several timer ticks. synchronize_rcu_expedited is much faster. With 100Hz timer frequency, when we remove 10000 block devices with "dmsetup remove_all" command, it takes 27 minutes. With this patch, removing 10000 block devices takes only 15 seconds. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Jens Axboe 提交于
A '!' snuck in before the unlikely, rendering it useless. Reported-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 14 7月, 2011 1 次提交
-
-
由 Richard Kennedy 提交于
Reorder request_queue to remove 16 bytes of alignment padding in 64 bit builds. On my config this shrinks the size of this structure from 1608 to 1592 bytes and therefore needs one fewer cachelines. Also trivially move the open bracket { to be on the same line as the structure name to make it easier to grep. Signed-off-by: NRichard Kennedy <richard@rsk.demon.co.uk> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 12 7月, 2011 4 次提交
-
-
由 Shaohua Li 提交于
Currently when the last queue of a group has no request, we don't expire the queue to hope request from the group comes soon, so the group doesn't miss its share. But if the think time is big, the assumption isn't correct and we just waste bandwidth. In such case, we don't do idle. [global] runtime=30 direct=1 [test1] cgroup=test1 cgroup_weight=1000 rw=randread ioengine=libaio size=500m runtime=30 directory=/mnt filename=file1 thinktime=9000 [test2] cgroup=test2 cgroup_weight=1000 rw=randread ioengine=libaio size=500m runtime=30 directory=/mnt filename=file2 patched base test1 64k 39k test2 548k 540k total 604k 578k group1 gets much better throughput because it waits less time. To check if the patch changes behavior of queue without think time. I also tried to give test1 2ms think time or no think time. The test result is stable. The thoughput doesn't change with/without the patch. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Shaohua Li 提交于
Currently when the last queue of a service tree has no request, we don't expire the queue to hope request from the service tree comes soon, so the service tree doesn't miss its share. But if the think time is big, the assumption isn't correct and we just waste bandwidth. In such case, we don't do idle. [global] runtime=10 direct=1 [test1] rw=randread ioengine=libaio size=500m directory=/mnt filename=file1 thinktime=9000 [test2] rw=read ioengine=libaio size=1G directory=/mnt filename=file2 patched base test1 41k/s 33k/s test2 15868k/s 15789k/s total 15902k/s 15817k/s A slightly better To check if the patch changes behavior of queue without think time. I also tried to give test1 2ms think time or no think time. The test has variation even without the patch, but the average throughput doesn't change with/without the patch. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Shaohua Li 提交于
Move the variables to do think time check to a sepatate struct. This is to prepare adding think time check for service tree and group. No functional change. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Justin TerAvest 提交于
fs_excl is a poor man's priority inheritance for filesystems to hint to the block layer that an operation is important. It was never clearly specified, not widely adopted, and will not prevent starvation in many cases (like across cgroups). fs_excl was introduced with the time sliced CFQ IO scheduler, to indicate when a process held FS exclusive resources and thus needed a boost. It doesn't cover all file systems, and it was never fully complete. Lets kill it. Signed-off-by: NJustin TerAvest <teravest@google.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 11 7月, 2011 1 次提交
-
-
由 Justin TerAvest 提交于
There is no consistency among filesystems from what bios (or requests) are marked as being metadata. It's interesting to expose this in traces, but we shouldn't schedule the requests differently based on whether or not they're marked as being metadata. Signed-off-by: NJustin TerAvest <teravest@google.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 08 7月, 2011 2 次提交
-
-
由 Shaohua Li 提交于
I'm often confused why not disable preempt when changing blk_plug list. It would be better to add comments here in case others have the similar concerns. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Shaohua Li 提交于
When I test fio script with big I/O depth, I found the total throughput drops compared to some relative small I/O depth. The reason is the thread accumulates big requests in its plug list and causes some delays (surely this depends on CPU speed). I thought we'd better have a threshold for requests. When a threshold reaches, this means there is no request merge and queue lock contention isn't severe when pushing per-task requests to queue, so the main advantages of blk plug don't exist. We can force a plug list flush in this case. With this, my test throughput actually increases and almost equals to small I/O depth. Another side effect is irq off time decreases in blk_flush_plug_list() for big I/O depth. The BLK_MAX_REQUEST_COUNT is choosen arbitarily, but 16 is efficiently to reduce lock contention to me. But I'm open here, 32 is ok in my test too. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 07 7月, 2011 2 次提交
-
-
由 Johannes Stezenbach 提交于
Fix headers_check error introduced by 390192b3: include/linux/fd.h:6: included file 'linux/compat.h' is not exported Signed-off-by: NJohannes Stezenbach <js@sig21.net> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Mike Snitzer 提交于
Due to the recently identified overflow in read_capacity_16() it was possible for max_discard_sectors to be zero but still have discards enabled on the associated device's queue. Eliminate the possibility for blkdev_issue_discard to infinitely loop. Interestingly this issue wasn't identified until a device, whose discard_granularity was 0 due to read_capacity_16 overflow, was consumed by blk_stack_limits() to construct limits for a higher-level DM multipath device. The multipath device's resulting limits never had the discard limits stacked because blk_stack_limits() will only do so if the bottom device's discard_granularity != 0. This resulted in the multipath device's limits.max_discard_sectors being 0. Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 02 7月, 2011 1 次提交
-
-
由 Johannes Stezenbach 提交于
On Linux x86_64 host with 32bit userspace, running qemu or even just "qemu-img create -f qcow2 some.img 1G" causes a kernel warning: ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(00005326){t:'S';sz:0} arg(7fffffff) on some.img ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(801c0204){t:02;sz:28} arg(fff77350) on some.img ioctl 00005326 is CDROM_DRIVE_STATUS, ioctl 801c0204 is FDGETPRM. The warning appears because the Linux compat-ioctl handler for these ioctls only applies to block devices, while qemu also uses the ioctls on plain files. Signed-off-by: NJohannes Stezenbach <js@sig21.net> Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 01 7月, 2011 2 次提交
-
-
由 Tejun Heo 提交于
Currently, only open(2) is defined as the 'clearing' point. It has two roles - first, it's an acknowledgement from userland indicating that the event has been received and kernel can clear pending states and proceed to generate more events. Secondly, it's passed on to device drivers as a hint indicating that a synchronization point has been reached and it might want to take a deeper look at the device. The latter currently is only used by sr which uses two different mechanisms - GET_EVENT_MEDIA_STATUS_NOTIFICATION and TEST_UNIT_READY to discover events, where the former is lighter weight and safe to be used repeatedly but may not provide full coverage. Among other things, GET_EVENT can't detect media removal while TUR can. This patch makes close(2) - blkdev_put() - indicate clearing hint for MEDIA_CHANGE to drivers. disk_check_events() is renamed to disk_flush_events() and updated to take @mask for events to flush which is or'd to ev->clearing and will be passed to the driver on the next ->check_events() invocation. This change makes sr generate MEDIA_CHANGE when media is ejected from userland - e.g. with eject(1). Note: Given the current usage, it seems @clearing hint is needlessly complex. disk_clear_events() can simply clear all events and the hint can be boolean @flush. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Jens Axboe 提交于
Conflicts: block/blk-throttle.c block/cfq-iosched.c Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 30 6月, 2011 8 次提交
-
-
-
由 Lars Ellenberg 提交于
We used to write these with BIO_RW_BARRIER aka REQ_HARDBARRIER (unless disabled in the configuration). The correct semantic now would be to write with FLUSH/FUA. For example, with activity log transactions, FUA alone is not enough, we need the corresponding bitmap update (and all related application updates) on stable storage as well. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
If we have an asymetrically congested network, we may send P_PING, but due to congestion, the corresponding P_PING_ACK would time out, and we would drop a (congested, but otherwise) healthy connection ("PingAck did not arrive in time.") Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
If we have a good resync rate, we will frequently update the on-disk bitmap, which, if not accounted for as resync io, may let an otherwise idle device appear to be "busy", and cause us to throttle resync. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
The last commit, drbd: add missing spinlock to bitmap receive, introduced a cond_resched_lock(), where the lock in question is taken with irqs disabled. As we must not schedule with IRQs disabled, and cond_resched_lock_irq() does not exist, yet, we re-aquire the spin_lock_irq() for each bitmap page processed in turn. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
During bitmap exchange, when using the RLE bitmap compression scheme, we have a code path that can set the whole bitmap at once. To avoid holding spin_lock_irq() for too long, we used to lock out other bitmap modifications during bitmap exchange by other means, and then, knowing we have exclusive access to the bitmap, modify it without the spinlock, and with IRQs enabled. Since we now allow local IO to continue, potentially setting additional bits during the bitmap receive phase, this is no longer true, and we get uncoordinated updates of bitmap members, causing bm_set to no longer accurately reflect the total number of set bits. To actually see this, you'd need to have a large bitmap, use RLE bitmap compression, and have busy IO during sync handshake and bitmap exchange. Fix this by taking the spin_lock_irq() in this code path as well, but calling cond_resched_lock() after each page worth of bits processed. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 27 6月, 2011 4 次提交
-
-
由 Shaohua Li 提交于
ioc->ioc_data is rcu protectd, so uses correct API to access it. This doesn't change any behavior, but just make code consistent. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: stable@kernel.org # after ab4bd22dSigned-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Shaohua Li 提交于
I got a rcu warnning at boot. the ioc->ioc_data is rcu_deferenced, but doesn't hold rcu_read_lock. Signed-off-by: NShaohua Li <shaohua.li@intel.com> Cc: stable@kernel.org # after ab4bd22dSigned-off-by: NJens Axboe <jaxboe@fusionio.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: mark CONFIG_CIFS_NFSD_EXPORT as BROKEN cifs: free blkcipher in smbhash
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: cifs: propagate errors from cifs_get_root() to mount(2) cifs: tidy cifs_do_mount() up a bit cifs: more breakage on mount failures cifs: close sget() races cifs: pull freeing mountdata/dropping nls/freeing cifs_sb into cifs_umount() cifs: move cifs_umount() call into ->kill_sb() cifs: pull cifs_mount() call up sanitize cifs_umount() prototype cifs: initialize ->tlink_tree in cifs_setup_cifs_sb() cifs: allocate mountdata earlier cifs: leak on mount if we share superblock cifs: don't pass superblock to cifs_mount() cifs: don't leak nls on mount failure cifs: double free on mount failure take bdi setup/destruction into cifs_mount/cifs_umount Acked-by: NSteve French <smfrench@gmail.com>
-
- 26 6月, 2011 1 次提交
-
-
由 Steve French 提交于
-
- 25 6月, 2011 11 次提交
-
-
由 Linus Torvalds 提交于
Merge branch 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rtc: vt8500: Fix build error & cleanup rtc_class_ops->update_irq_enable() alarmtimers: Return -ENOTSUPP if no RTC device is present alarmtimers: Handle late rtc module loading
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: Remove unneeded version.h includes from sound/ ASoC: pxa-ssp: Correct check for stream presence ASoC: imx: add missing module informations ASoC: imx: Remove unused Kconfig SND_MXC_SOC_SSI entry ALSA: HDA: Pinfix quirk for HP Z200 Workstation ALSA: VIA HDA: Create a master amplifier control for VT1718S. ALSA: VIA HDA: Mute/unmute mixer conncted to Headphone for VT1718S. ALSA: VIA HDA: Modify initial verbs list for VT1718S. ALSA: hda - Remove ALC268 model override for CPR2000 ALSA: HDA: Remove quirk for an HP device ASoC: Remove unused and about to be broken SND_SOC_CUSTOM I/O bus
-
由 Thomas Gleixner 提交于
Merge branch 'fortglx/3.0/tip/timers/rtc' of git://git.linaro.org/people/jstultz/linux into timers/urgent * rtc: vt8500: Fix build error & cleanup rtc_class_ops->update_irq_enable()
-
git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6由 Linus Torvalds 提交于
* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6: drm/i915: save/resume forcewake lock fixes Revert "drm/i915: Kill GTT mappings when moving from GTT domain" drm/i915: Apply HWSTAM workaround for BSD ring on SandyBridge drm/i915: Call intel_enable_plane from i9xx_crtc_mode_set (again)
-
由 Al Viro 提交于
... instead of just failing with -EINVAL Acked-by: NPavel Shilovsky <piastryyy@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Acked-by: NPavel Shilovsky <piastryyy@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
if cifs_get_root() fails, we end up with ->mount() returning NULL, which is not what callers expect. Moreover, in case of superblock reuse we end up leaking a superblock reference... Acked-by: NPavel Shilovsky <piastryyy@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
have ->s_fs_info set by the set() callback passed to sget() Acked-by: NPavel Shilovsky <piastryyy@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
all callers of cifs_umount() proceed to do the same thing; pull it into cifs_umount() itself. Acked-by: NPavel Shilovsky <piastryyy@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
instead of calling it manually in case if cifs_read_super() fails to set ->s_root, just call it from ->kill_sb(). cifs_put_super() is gone now *and* we have cifs_sb shutdown and destruction done after the superblock is gone from ->s_instances. Acked-by: NPavel Shilovsky <piastryyy@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... to the point prior to sget(). Now we have cifs_sb set up early enough. Acked-by: NPavel Shilovsky <piastryyy@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-