- 30 5月, 2011 1 次提交
-
-
由 Jens Axboe 提交于
It was not a good idea to start dereferencing disk->queue from the fs sysfs strategy for displaying discard alignment. We ran into first a NULL pointer deref, and after fixing that we sometimes see unvalid disk->queue pointer values. Since discard is the only one of the bunch actually looking into the queue, just revert the change. This reverts commit 23ceb5b7. Conflicts: fs/partitions/check.c
-
- 27 5月, 2011 1 次提交
-
-
由 Jens Axboe 提交于
Eric Dumazet reports: ---- At boot, I have a crash in part_discard_alignment_show+0x1b/0x50 CR2 : 000006ac fault in : mov 0x2c(%rcx),%edx I suspect commit 23ceb5b7 (block: Remove extra discard_alignment from hd_struct) being in fault ---- Not quite known how ->queue can be NULL while the sysfs entry exists, but lets play it safe and check for a NULL queue. The rest of the sysfs show strategies in check.c do not dereference disk->queue. Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 09 5月, 2011 1 次提交
-
-
由 Jens Axboe 提交于
Stephen reports: ----- After merging the block tree, today's linux-next build (x86_64 allmodconfig) produced this warning: fs/partitions/check.c: In function 'part_discard_alignment_show': fs/partitions/check.c:263: warning: format '%u' expects type 'unsigned int', but argument 3 has type 'long long unsigned int' Introduced by commit ("block: Remove extra discard_alignment from hd_struct") ----- Fix it up by just removing the cast, we return an int already. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 07 5月, 2011 1 次提交
-
-
由 Tao Ma 提交于
Currently, hd_struct.discard_alignment is only used when we show /sys/block/sdx/sdx/discard_alignment. So remove it and calculate when it is asked to show. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 31 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 22 3月, 2011 1 次提交
-
-
由 Shaohua Li 提交于
After the stack plugging introduction, these are called lockless. Ensure that the counters are updated atomically. Signed-off-by: Shaohua Li<shaohua.li@intel.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 07 1月, 2011 1 次提交
-
-
由 Jens Axboe 提交于
We can't use krefs since it's apparently restricted to very basic reference counting. This reverts commit e4a683c8. Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 05 1月, 2011 1 次提交
-
-
由 Jerome Marchand 提交于
/proc/diskstats would display a strange output as follows. $ cat /proc/diskstats |grep sda 8 0 sda 90524 7579 102154 20464 0 0 0 0 0 14096 20089 8 1 sda1 19085 1352 21841 4209 0 0 0 0 4294967064 15689 4293424691 ~~~~~~~~~~ 8 2 sda2 71252 3624 74891 15950 0 0 0 0 232 23995 1562390 8 3 sda3 54 487 2188 92 0 0 0 0 0 88 92 8 4 sda4 4 0 8 0 0 0 0 0 0 0 0 8 5 sda5 81 2027 2130 138 0 0 0 0 0 87 137 Its reason is the wrong way of accounting hd_struct->in_flight. When a bio is merged into a request belongs to different partition by ELEVATOR_FRONT_MERGE. The detailed root cause is as follows. Assuming that there are two partition, sda1 and sda2. 1. A request for sda2 is in request_queue. Hence sda1's hd_struct->in_flight is 0 and sda2's one is 1. | hd_struct->in_flight --------------------------- sda1 | 0 sda2 | 1 --------------------------- 2. A bio belongs to sda1 is issued and is merged into the request mentioned on step1 by ELEVATOR_BACK_MERGE. The first sector of the request is changed from sda2 region to sda1 region. However the two partition's hd_struct->in_flight are not changed. | hd_struct->in_flight --------------------------- sda1 | 0 sda2 | 1 --------------------------- 3. The request is finished and blk_account_io_done() is called. In this case, sda2's hd_struct->in_flight, not a sda1's one, is decremented. | hd_struct->in_flight --------------------------- sda1 | -1 sda2 | 1 --------------------------- The patch fixes the problem by caching the partition lookup inside the request structure, hence making sure that the increment and decrement will always happen on the same partition struct. This also speeds up IO with accounting enabled, since it cuts down on the number of lookups we have to do. Also add a refcount to struct hd_struct to keep the partition in memory as long as users exist. We use kref_test_and_get() to ensure we don't add a reference to a partition which is going away. Signed-off-by: NJerome Marchand <jmarchan@redhat.com> Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: stable@kernel.org Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 17 12月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
There's no reason for register_disk() and del_gendisk() to be in fs/partitions/check.c. Move both to genhd.c. While at it, collapse unlink_gendisk(), which was artificially in a separate function due to genhd.c / check.c split, into del_gendisk(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 13 11月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
Over time, block layer has accumulated a set of APIs dealing with bdev open, close, claim and release. * blkdev_get/put() are the primary open and close functions. * bd_claim/release() deal with exclusive open. * open/close_bdev_exclusive() are combination of open and claim and the other way around, respectively. * bd_link/unlink_disk_holder() to create and remove holder/slave symlinks. * open_by_devnum() wraps bdget() + blkdev_get(). The interface is a bit confusing and the decoupling of open and claim makes it impossible to properly guarantee exclusive access as in-kernel open + claim sequence can disturb the existing exclusive open even before the block layer knows the current open if for another exclusive access. Reorganize the interface such that, * blkdev_get() is extended to include exclusive access management. @holder argument is added and, if is @FMODE_EXCL specified, it will gain exclusive access atomically w.r.t. other exclusive accesses. * blkdev_put() is similarly extended. It now takes @mode argument and if @FMODE_EXCL is set, it releases an exclusive access. Also, when the last exclusive claim is released, the holder/slave symlinks are removed automatically. * bd_claim/release() and close_bdev_exclusive() are no longer necessary and either made static or removed. * bd_link_disk_holder() remains the same but bd_unlink_disk_holder() is no longer necessary and removed. * open_bdev_exclusive() becomes a simple wrapper around lookup_bdev() and blkdev_get(). It also has an unexpected extra bdev_read_only() test which probably should be moved into blkdev_get(). * open_by_devnum() is modified to take @holder argument and pass it to blkdev_get(). Most of bdev open/close operations are unified into blkdev_get/put() and most exclusive accesses are tested atomically at the open time (as it should). This cleans up code and removes some, both valid and invalid, but unnecessary all the same, corner cases. open_bdev_exclusive() and open_by_devnum() can use further cleanup - rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop special features. Well, let's leave them for another day. Most conversions are straight-forward. drbd conversion is a bit more involved as there was some reordering, but the logic should stay the same. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Brown <neilb@suse.de> Acked-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: NMike Snitzer <snitzer@redhat.com> Acked-by: NPhilipp Reisner <philipp.reisner@linbit.com> Cc: Peter Osterlund <petero2@telia.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: dm-devel@redhat.com Cc: drbd-dev@lists.linbit.com Cc: Leo Chen <leochen@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Joern Engel <joern@logfs.org> Cc: reiserfs-devel@vger.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk>
-
- 11 11月, 2010 1 次提交
-
-
由 Kay Sievers 提交于
We already export 'ro' for the disk. This adds the same attribute for partitions. Cc: Karel Zak <kzak@redhat.com> Signed-off-by: NKay Sievers <kay.sievers@vrfy.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 25 10月, 2010 1 次提交
-
-
由 Jens Axboe 提交于
This reverts commit 7681bfee. Conflicts: include/linux/genhd.h It has numerous issues with the cleanup path and non-elevator devices. Revert it for now so we can come up with a clean version without rushing things. Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 23 10月, 2010 1 次提交
-
-
由 Andi Kleen 提交于
I have some systems which need legacy sysfs due to old tools that are making assumptions that a directory can never be a symlink to another directory, and it's a big hazzle to compile separate kernels for them. This patch turns CONFIG_SYSFS_DEPRECATED into a run time option that can be switched on/off the kernel command line. This way the same binary can be used in both cases with just a option on the command line. The old CONFIG_SYSFS_DEPRECATED_V2 option is still there to set the default. I kept the weird name to not break existing config files. Also the compat code can be still completely disabled by undefining CONFIG_SYSFS_DEPRECATED_SWITCH -- just the optimizer takes care of this now instead of lots of ifdefs. This makes the code look nicer. v2: This is an updated version on top of Kay's patch to only handle the block devices. I tested it on my old systems and that seems to work. Cc: axboe@kernel.dk Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 19 10月, 2010 1 次提交
-
-
由 Yasuaki Ishimatsu 提交于
/proc/diskstats would display a strange output as follows. $ cat /proc/diskstats |grep sda 8 0 sda 90524 7579 102154 20464 0 0 0 0 0 14096 20089 8 1 sda1 19085 1352 21841 4209 0 0 0 0 4294967064 15689 4293424691 ~~~~~~~~~~ 8 2 sda2 71252 3624 74891 15950 0 0 0 0 232 23995 1562390 8 3 sda3 54 487 2188 92 0 0 0 0 0 88 92 8 4 sda4 4 0 8 0 0 0 0 0 0 0 0 8 5 sda5 81 2027 2130 138 0 0 0 0 0 87 137 Its reason is the wrong way of accounting hd_struct->in_flight. When a bio is merged into a request belongs to different partition by ELEVATOR_FRONT_MERGE. The detailed root cause is as follows. Assuming that there are two partition, sda1 and sda2. 1. A request for sda2 is in request_queue. Hence sda1's hd_struct->in_flight is 0 and sda2's one is 1. | hd_struct->in_flight --------------------------- sda1 | 0 sda2 | 1 --------------------------- 2. A bio belongs to sda1 is issued and is merged into the request mentioned on step1 by ELEVATOR_BACK_MERGE. The first sector of the request is changed from sda2 region to sda1 region. However the two partition's hd_struct->in_flight are not changed. | hd_struct->in_flight --------------------------- sda1 | 0 sda2 | 1 --------------------------- 3. The request is finished and blk_account_io_done() is called. In this case, sda2's hd_struct->in_flight, not a sda1's one, is decremented. | hd_struct->in_flight --------------------------- sda1 | -1 sda2 | 1 --------------------------- The patch fixes the problem by caching the partition lookup inside the request structure, hence making sure that the increment and decrement will always happen on the same partition struct. This also speeds up IO with accounting enabled, since it cuts down on the number of lookups we have to do. When reloading partition tables, quiesce IO to ensure that no request references to the partition struct exists. When it is safe to free the partition table, the IO for that device is restarted again. Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: stable@kernel.org Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 15 9月, 2010 1 次提交
-
-
由 Will Drewry 提交于
I'm reposting this patch series as v4 since there have been no additional comments, and I cleaned up one extra bit of unneeded code (in 3/3). The patches are against Linus's tree: 2bfc96a1 (2.6.36-rc3). Would this patchset be suitable for inclusion in an mm branch? This changes adds a partition_meta_info struct which itself contains a union of structures that provide partition table specific metadata. This change leaves the union empty. The subsequent patch includes an implementation for CONFIG_EFI_PARTITION-based metadata. Signed-off-by: NWill Drewry <wad@chromium.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 11 8月, 2010 1 次提交
-
-
由 Alexey Dobriyan 提交于
Fix this garbage happening quite often: ==> sda: scsi 3:0:0:0: CD-ROM TOSHIBA ==> sda1 sda2 sda3 sda4 <sr0: scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray ^^^ Uniform CD-ROM driver Revision: 3.20 sr 3:0:0:0: Attached scsi CD-ROM sr0 ==> sda5 sda6 sda7 > Make "sda: sda1 ..." lines actually lines. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 6月, 2010 1 次提交
-
-
由 Paul E. McKenney 提交于
Remove all rcu head inits. We don't care about the RCU head state before passing it to call_rcu() anyway. Only leave the "on_stack" variants so debugobjects can keep track of objects on stack. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andries Brouwer <aeb@cwi.nl>
-
- 22 5月, 2010 4 次提交
-
-
由 Tejun Heo 提交于
Currently, native capacity unlocking is initiated only when a recognized partition extends beyond the end of the disk. However, there are several other unhandled cases where truncated capacity can lead to misdetection of partitions. * Partition table is fully beyond EOD. * Partition table is partially beyond EOD (daisy chained ones). * Recognized partition starts beyond EOD. This patch updates generic partition check code such that all the above three cases are handled too. For the first two, @state tracks whether low level partition check code tried to read beyond EOD during partition scan and triggers native capacity unlocking accordingly. The third is now handled similarly to the original unlocking case. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ben Hutchings <ben@decadent.org.uk> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Make the following changes to partition check code. * Add ->bdev to struct parsed_partitions. * Introduce read_part_sector() which is a simple wrapper around read_dev_sector() which takes struct parsed_partitions *state instead of @bdev. * For functions which used to take @state and @bdev, drop @bdev. For functions which used to take @bdev, replace it with @state. * While updating, drop superflous checks on NULL state/bdev in ldm.c. This cleans up the API a bit and enables better handling of IO errors during partition check as the generic partition check code now has much better visibility into what went wrong in the low level code paths. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ben Hutchings <ben@decadent.org.uk> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
bdops->set_capacity() is unnecessarily generic. All that's required is a simple one way notification to lower level driver telling it to try to unlock native capacity. There's no reason to pass in target capacity or return the new capacity. The former is always the inherent native capacity and the latter can be handled via the usual device resize / revalidation path. In fact, the current API is always used that way. Replace ->set_capacity() with ->unlock_native_capacity() which take only @disk and doesn't return anything. IDE which is the only current user of the API is converted accordingly. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Device resize via ->set_capacity() can reveal new partitions (e.g. in chained partition table formats such as dos extended parts). Restart partition scan from the beginning after resizing a device. This change also makes libata always revalidate the disk after resize which makes lower layer native capacity unlocking implementation simpler and more robust as resize can be handled in the usual path. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NBen Hutchings <ben@decadent.org.uk> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 30 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: NTejun Heo <tj@kernel.org> Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 11 1月, 2010 1 次提交
-
-
由 Martin K. Petersen 提交于
All callers of the stacking functions use 512-byte sector units rather than byte offsets. Simplify the code so the stacking functions take sectors when specifying data offsets. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 10 11月, 2009 1 次提交
-
-
由 Martin K. Petersen 提交于
While SSDs track block usage on a per-sector basis, RAID arrays often have allocation blocks that are bigger. Allow the discard granularity and alignment to be set and teach the topology stacking logic how to handle them. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 07 10月, 2009 1 次提交
-
-
由 Nikanth Karthikesan 提交于
Commit a9327cac added seperate read and write statistics of in_flight requests. And exported the number of read and write requests in progress seperately through sysfs. But Corrado Zoccolo <czoccolo@gmail.com> reported getting strange output from "iostat -kx 2". Global values for service time and utilization were garbage. For interval values, utilization was always 100%, and service time is higher than normal. So this was reverted by commit 0f78ab98 The problem was in part_round_stats_single(), I missed the following: if (now == part->stamp) return; - if (part->in_flight) { + if (part_in_flight(part)) { __part_stat_add(cpu, part, time_in_queue, part_in_flight(part) * (now - part->stamp)); __part_stat_add(cpu, part, io_ticks, (now - part->stamp)); With this chunk included, the reported regression gets fixed. Signed-off-by: NNikanth Karthikesan <knikanth@suse.de> -- Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 05 10月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
This reverts commit a9327cac. Corrado Zoccolo <czoccolo@gmail.com> reports: "with 2.6.32-rc1 I started getting the following strange output from "iostat -kx 2": Linux 2.6.31bisect (et2) 04/10/2009 _i686_ (2 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 10,70 0,00 3,16 15,75 0,00 70,38 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 18,22 0,00 0,67 0,01 14,77 0,02 43,94 0,01 10,53 39043915,03 2629219,87 sdb 60,89 9,68 50,79 3,04 1724,43 50,52 65,95 0,70 13,06 488437,47 2629219,87 avg-cpu: %user %nice %system %iowait %steal %idle 2,72 0,00 0,74 0,00 0,00 96,53 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 sdb 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 avg-cpu: %user %nice %system %iowait %steal %idle 6,68 0,00 0,99 0,00 0,00 92,33 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 sdb 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 avg-cpu: %user %nice %system %iowait %steal %idle 4,40 0,00 0,73 1,47 0,00 93,40 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 100,00 sdb 0,00 4,00 0,00 3,00 0,00 28,00 18,67 0,06 19,50 333,33 100,00 Global values for service time and utilization are garbage. For interval values, utilization is always 100%, and service time is higher than normal. I bisected it down to: [a9327cac] Seperate read and write statistics of in_flight requests and verified that reverting just that commit indeed solves the issue on 2.6.32-rc1." So until this is debugged, revert the bad commit. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 22 9月, 2009 1 次提交
-
-
由 Alexey Dobriyan 提交于
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 9月, 2009 1 次提交
-
-
由 David Brownell 提交于
Let attribute group vectors be declared "const". We'd like to let most attribute metadata live in read-only sections... this is a start. Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 14 9月, 2009 1 次提交
-
-
由 Nikanth Karthikesan 提交于
Currently, there is a single in_flight counter measuring the number of requests in the request_queue. But some monitoring tools would like to know how many read requests and write requests are in progress. Split the current in_flight counter into two seperate counters for read and write. This information is exported as a sysfs attribute, as changing the currently available stat files would break the existing tools. Signed-off-by: NNikanth Karthikesan <knikanth@suse.de> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 13 7月, 2009 1 次提交
-
-
由 Heiko Carstens 提交于
git commit f67f129e "Driver core: implement uevent suppress in kobject" contains this chunk for fs/partitions/check.c: /* suppress uevent if the disk supresses it */ - if (!ddev->uevent_suppress) + if (!dev_get_uevent_suppress(pdev)) kobject_uevent(&pdev->kobj, KOBJ_ADD); However that should have been - if (!ddev->uevent_suppress) + if (!dev_get_uevent_suppress(ddev)) Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Acked-by: NMing Lei <tom.leiming@gmail.com> Cc: stable <stable@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 07 6月, 2009 2 次提交
-
-
* Add ->set_capacity block device method and use it in rescan_partitions() to attempt enabling native capacity of the device upon detecting the partition which exceeds device capacity. * Add GENHD_FL_NATIVE_CAPACITY flag to try limit attempts of enabling native capacity during partition scan. Together with the consecutive patch implementing ->set_capacity method in ide-gd device driver this allows automatic disabling of Host Protected Area (HPA) if any partitions overlapping HPA are detected. Cc: Robert Hancock <hancockrwd@gmail.com> Cc: Frans Pop <elendil@planet.nl> Cc: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Emphatically-Acked-by: NAlan Cox <alan@linux.intel.com> Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
The current warning message says only about the kernel's action taken without mentioning the underlying reason behind it. Noticed-by: NRobert Hancock <hancockrwd@gmail.com> Cc: Frans Pop <elendil@planet.nl> Cc: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl> Cc: Al Viro <viro@zeniv.linux.org.uk> Emphatically-Acked-by: NAlan Cox <alan@linux.intel.com> Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
-
- 23 5月, 2009 1 次提交
-
-
由 Martin K. Petersen 提交于
To support devices with physical block sizes bigger than 512 bytes we need to ensure proper alignment. This patch adds support for exposing I/O topology characteristics as devices are stacked. logical_block_size is the smallest unit the device can address. physical_block_size indicates the smallest I/O the device can write without incurring a read-modify-write penalty. The io_min parameter is the smallest preferred I/O size reported by the device. In many cases this is the same as the physical block size. However, the io_min parameter can be scaled up when stacking (RAID5 chunk size > physical block size). The io_opt characteristic indicates the optimal I/O size reported by the device. This is usually the stripe width for arrays. The alignment_offset parameter indicates the number of bytes the start of the device/partition is offset from the device's natural alignment. Partition tools and MD/DM utilities can use this to pad their offsets so filesystems start on proper boundaries. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 25 3月, 2009 1 次提交
-
-
由 Ming Lei 提交于
This patch implements uevent suppress in kobject and removes it from struct device, based on the following ideas: 1,Uevent sending should be one attribute of kobject, so suppressing it in kobject layer is more natural than in device layer. By this way, we can do it for other objects embedded with kobject. 2,It may save several bytes for each instance of struct device.(On my omap3(32bit ARM) based box, can save 8bytes per device object) This patch also introduces dev_set|get_uevent_suppress() helpers to set and query uevent_suppress attribute in case to help kobject as private part of struct device in future. [This version is against the latest driver-core patch set of Greg,please ignore the last version.] Signed-off-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 27 1月, 2009 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Also make sure sparse (make C=2 block/blktrace.o) is happy too. Reported-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 1月, 2009 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Impact: New way of using the blktrace infrastructure This drops the requirement of userspace utilities to use the blktrace facility. Configuration is done thru sysfs, adding a "trace" directory to the partition directory where blktrace can be enabled for the associated request_queue. The same filters present in the IOCTL interface are present as sysfs device attributes. The /sys/block/sdX/sdXN/trace/enable file allows tracing without any filters. The other files in this directory: pid, act_mask, start_lba and end_lba can be used with the same meaning as with the IOCTL interface. Using the sysfs interface will only setup the request_queue->blk_trace fields, tracing will only take place when the "blk" tracer is selected via the ftrace interface, as in the following example: To see the trace, one can use the /d/tracing/trace file or the /d/tracign/trace_pipe file, with semantics defined in the ftrace documentation in Documentation/ftrace.txt. [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3046.491224: 8,1 A WBS 6367 + 8 <- (8,1) 6304 kjournald-305 [000] 3046.491227: 8,1 Q R 6367 + 8 [kjournald] kjournald-305 [000] 3046.491236: 8,1 G RB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491239: 8,1 P NS [kjournald] kjournald-305 [000] 3046.491242: 8,1 I RBS 6367 + 8 [kjournald] kjournald-305 [000] 3046.491251: 8,1 D WB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491610: 8,1 U WS [kjournald] 1 <idle>-0 [000] 3046.511914: 8,1 C RS 6367 + 8 [6367] [root@f10-1 ~]# The default line context (prefix) format is the one described in the ftrace documentation, with the blktrace specific bits using its existing format, described in blkparse(8). If one wants to have the classic blktrace formatting, this is possible by using: [root@f10-1 ~]# echo blk_classic > /t/trace_options [root@f10-1 ~]# cat /t/trace 8,1 0 3046.491224 305 A WBS 6367 + 8 <- (8,1) 6304 8,1 0 3046.491227 305 Q R 6367 + 8 [kjournald] 8,1 0 3046.491236 305 G RB 6367 + 8 [kjournald] 8,1 0 3046.491239 305 P NS [kjournald] 8,1 0 3046.491242 305 I RBS 6367 + 8 [kjournald] 8,1 0 3046.491251 305 D WB 6367 + 8 [kjournald] 8,1 0 3046.491610 305 U WS [kjournald] 1 8,1 0 3046.511914 0 C RS 6367 + 8 [6367] [root@f10-1 ~]# Using the ftrace standard format allows more flexibility, such as the ability of asking for backtraces via trace_options: [root@f10-1 ~]# echo noblk_classic > /t/trace_options [root@f10-1 ~]# echo stacktrace > /t/trace_options [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3318.826779: 8,1 A WBS 6375 + 8 <- (8,1) 6312 kjournald-305 [000] 3318.826782: <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread <= child_rip kjournald-305 [000] 3318.826836: 8,1 Q R 6375 + 8 [kjournald] kjournald-305 [000] 3318.826837: <= generic_make_request <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread Please read the ftrace documentation to use aditional, standardized tracing filters such as /d/tracing/trace_cpumask, etc. See also /d/tracing/trace_mark to add comments in the trace stream, that is equivalent to the /d/block/sdaN/msg interface. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 1月, 2009 1 次提交
-
-
由 Neil Brown 提交于
Neil writes: Hi Jens, I've found a little bug for you. It was introduced by a6f23657 block: add one-hit cache for disk partition lookup and has the effect of killing my machine whenever I try to assemble an md array :-( One of the devices in the array has partitions, and mdadm always deletes partitions before putting a whole-device in an array (as it can cause confusion). The next IO to that device locks the machine. I don't really understand exactly why it locks up, but it happens in disk_map_sector_rcu(). This patch fixes it. Which is due to a missing clear of the (now) stale partition lookup data. So clear that when we delete a partition. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 07 1月, 2009 1 次提交
-
-
由 Kay Sievers 提交于
Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: NKay Sievers <kay.sievers@vrfy.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 18 11月, 2008 2 次提交
-
-
由 Tejun Heo 提交于
Block ext devt conversion missed md_autodetect_dev() call in rescan_partitions() leaving md autodetect unable to see partitions. Fix it. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Make add_partition() return pointer to the new hd_struct on success and ERR_PTR() value on failure. This change will be used to fix md autodetection bug. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-