- 23 9月, 2014 10 次提交
-
-
由 Christoph Hellwig 提交于
Duplicate the (small) timeout handler in blk-mq so that we can pass arguments more easily to the driver timeout handler. This enables the next patch. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Don't do a kmalloc from timer to handle timeouts, chances are we could be under heavy load or similar and thus just miss out on the timeouts. Fortunately it is very easy to just iterate over all in use tags, and doing this properly actually cleans up the blk_mq_busy_iter API as well, and prepares us for the next patch by passing a reserved argument to the iterator. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Now that we've changed the driver API on the submission side use the opportunity to fix up the name on the completion side to fit into the general scheme. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
When we call blk_mq_start_request from the core blk-mq code before calling into ->queue_rq there is a racy window where the timeout handler can hit before we've fully set up the driver specific part of the command. Move the call to blk_mq_start_request into the driver so the driver can start the request only once it is fully set up. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Pass an explicit parameter for the last request in a batch to ->queue_rq instead of using a request flag. Besides being a cleaner and non-stateful interface this is also required for the next patch, which fixes the blk-mq I/O submission code to not start a time too early. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
When requests are retried due to hw or sw resource shortages, we often stop the associated hardware queue. So ensure that we restart the queues when running the requeue work, otherwise the queue run will be a no-op. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
__blk_mq_alloc_rq_maps() can be invoked multiple times, if we scale back the queue depth if we are low on memory. So don't clear set->tags when we fail, this is handled directly in the parent function, blk_mq_alloc_tag_set(). Reported-by: NRobert Elliott <Elliott@hp.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
We should not insert requests into the flush state machine from blk_mq_insert_request. All incoming flush requests come through blk_{m,s}q_make_request and are handled there, while blk_execute_rq_nowait should only be called for BLOCK_PC requests. All other callers deal with requests that already went through the flush statemchine and shouldn't be reinserted into it. Reported-by: NRobert Elliott <Elliott@hp.com> Debugged-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 David Hildenbrand 提交于
This patch should fix the bug reported in https://lkml.org/lkml/2014/9/11/249. We have to initialize at least the atomic_flags and the cmd_flags when allocating storage for the requests. Otherwise blk_mq_timeout_check() might dereference uninitialized pointers when racing with the creation of a request. Also move the reset of cmd_flags for the initializing code to the point where a request is freed. So we will never end up with pending flush request indicators that might trigger dereferences of invalid pointers in blk_mq_timeout_check(). Cc: stable@vger.kernel.org Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Reported-by: NPaulo De Rezende Pinatti <ppinatti@linux.vnet.ibm.com> Tested-by: NPaulo De Rezende Pinatti <ppinatti@linux.vnet.ibm.com> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
When we start the request, we set the deadline and flip the bits marking the request as started and non-complete. However, it's important that the deadline store is ordered before flipping the bits, otherwise we could have a small window where the request is marked started but with an invalid deadline. This can confuse the timeout handling. Suggested-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 10 9月, 2014 2 次提交
-
-
由 Jens Axboe 提交于
If we are running in a kdump environment, resources are scarce. For some SCSI setups with a huge set of shared tags, we run out of memory allocating what the drivers is asking for. So implement a scale back logic to reduce the tag depth for those cases, allowing the driver to successfully load. We should extend this to detect low memory situations, and implement a sane fallback for those (1 queue, 64 tags, or something like that). Tested-by: NRobert Elliott <elliott@hp.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Alan Stern 提交于
When a queue is registered, the block layer turns off the bypass setting (because bypass is enabled when the queue is created). This doesn't work well for queues that are unregistered and then registered again; we get a WARNING because of the unbalanced calls to blk_queue_bypass_end(). This patch fixes the problem by making blk_register_queue() call blk_queue_bypass_end() only the first time the queue is registered. Signed-off-by: NAlan Stern <stern@rowland.harvard.edu> Acked-by: NTejun Heo <tj@kernel.org> CC: James Bottomley <James.Bottomley@HansenPartnership.com> CC: Jens Axboe <axboe@kernel.dk> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 09 9月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
bdev_get_queue() returns the request_queue associated with the specified block_device. blk_get_backing_dev_info() makes use of bdev_get_queue() to determine the associated bdi given a block_device. All the callers of bdev_get_queue() including blk_get_backing_dev_info() assume that bdev_get_queue() may return NULL and implement NULL handling; however, bdev_get_queue() requires the passed in block_device is opened and attached to its gendisk. Because an active gendisk always has a valid request_queue associated with it, bdev_get_queue() can never return NULL and neither can blk_get_backing_dev_info(). Make it clear that neither of the two functions can return NULL and remove NULL handling from all the callers. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 08 9月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
blkcg->id is a unique id given to each blkcg; however, the cgroup_subsys_state which each blkcg embeds already has ->serial_nr which can be used for the same purpose. Drop blkcg->id and replace its uses with blkcg->css.serial_nr. Rename cfq_cgroup->blkcg_id to ->blkcg_serial_nr and @id in check_blkcg_changed() to @serial_nr for consistency. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 04 9月, 2014 2 次提交
-
-
由 Keith Busch 提交于
Releases the dev_t minor when all references are closed to prevent another device from acquiring the same major/minor. Since the partition's release may be invoked from call_rcu's soft-irq context, the ext_dev_idr's mutex had to be replaced with a spinlock so as not so sleep. Signed-off-by: NKeith Busch <keith.busch@intel.com> Cc: stable@kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Robert Elliott 提交于
In blk-mq.c blk_mq_alloc_tag_set, if: set->tags = kmalloc_node() succeeds, but one of the blk_mq_init_rq_map() calls fails, goto out_unwind; needs to free set->tags so the caller is not obligated to do so. None of the current callers (null_blk, virtio_blk, virtio_blk, or the forthcoming scsi-mq) do so. set->tags needs to be set to NULL after doing so, so other tag cleanup logic doesn't try to free a stale pointer later. Also set it to NULL in blk_mq_free_tag_set. Tested with error injection on the forthcoming scsi-mq + hpsa combination. Signed-off-by: NRobert Elliott <elliott@hp.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 03 9月, 2014 1 次提交
-
-
由 Ming Lei 提交于
QUEUE_FLAG_NO_SG_MERGE is set at default for blk-mq devices, so bio->bi_phys_segment computed may be bigger than queue_max_segments(q) for blk-mq devices, then drivers will fail to handle the case, for example, BUG_ON() in virtio_queue_rq() can be triggerd for virtio-blk: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1359146 This patch fixes the issue by ignoring the QUEUE_FLAG_NO_SG_MERGE flag if the computed bio->bi_phys_segment is bigger than queue_max_segments(q), and the regression is caused by commit 05f1dd53(block: add queue flag for disabling SG merging). Reported-by: NKick In <pierre-andre.morey@canonical.com> Tested-by: NChris J Arges <chris.j.arges@canonical.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 29 8月, 2014 2 次提交
-
-
由 Jens Axboe 提交于
Dan writes: block/bsg.c:327 bsg_map_hdr() error: 'next_rq' dereferencing possible ERR_PTR(). Fix this by setting next_rq to NULL, for the case where it can be != NULL but an error pointer. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Joe Lawrence 提交于
The blk_get_request function may fail in low-memory conditions or during device removal (even if __GFP_WAIT is set). To distinguish between these errors, modify the blk_get_request call stack to return the appropriate ERR_PTR. Verify that all callers check the return status and consider IS_ERR instead of a simple NULL pointer check. For consistency, make a similar change to the blk_mq_alloc_request leg of blk_get_request. It may fail if the queue is dead, or the caller was unwilling to wait. Signed-off-by: NJoe Lawrence <joe.lawrence@stratus.com> Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd] Acked-by: Boaz Harrosh <bharrosh@panasas.com> [for osd] Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 28 8月, 2014 1 次提交
-
-
由 Toshiaki Makita 提交于
Explain that weight has to be updated on activation. This complements previous fix e15693ef ("cfq-iosched: Fix wrong children_weight calculation"). Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 27 8月, 2014 2 次提交
-
-
由 Joe Lawrence 提交于
The blk-core dead queue checks introduce an error scenario to blk_get_request that returns NULL if the request queue has been shutdown. This affects the behavior for __GFP_WAIT callers, who should verify the return value before dereferencing. Signed-off-by: NJoe Lawrence <joe.lawrence@stratus.com> Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd] Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Toshiaki Makita 提交于
cfq_group_service_tree_add() is applying new_weight at the beginning of the function via cfq_update_group_weight(). This actually allows weight to change between adding it to and subtracting it from children_weight, and triggers WARN_ON_ONCE() in cfq_group_service_tree_del(), or even causes oops by divide error during vfr calculation in cfq_group_service_tree_add(). The detailed scenario is as follows: 1. Create blkio cgroups X and Y as a child of X. Set X's weight to 500 and perform some I/O to apply new_weight. This X's I/O completes before starting Y's I/O. 2. Y starts I/O and cfq_group_service_tree_add() is called with Y. 3. cfq_group_service_tree_add() walks up the tree during children_weight calculation and adds parent X's weight (500) to children_weight of root. children_weight becomes 500. 4. Set X's weight to 1000. 5. X starts I/O and cfq_group_service_tree_add() is called with X. 6. cfq_group_service_tree_add() applies its new_weight (1000). 7. I/O of Y completes and cfq_group_service_tree_del() is called with Y. 8. I/O of X completes and cfq_group_service_tree_del() is called with X. 9. cfq_group_service_tree_del() subtracts X's weight (1000) from children_weight of root. children_weight becomes -500. This triggers WARN_ON_ONCE(). 10. Set X's weight to 500. 11. X starts I/O and cfq_group_service_tree_add() is called with X. 12. cfq_group_service_tree_add() applies its new_weight (500) and adds it to children_weight of root. children_weight becomes 0. Calcularion of vfr triggers oops by divide error. weight should be updated right before adding it to children_weight. Reported-by: NRuki Sekiya <sekiya.ruki@lab.ntt.co.jp> Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: NTejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 26 8月, 2014 1 次提交
-
-
由 Sabrina Dubroca 提交于
Before commit 2cada584 ("block: cleanup error handling in sg_io"), we had ret = 0 before entering the last big if block of sg_io. Since 2cada584, ret = -EFAULT, which breaks hdparm: /dev/sda: setting Advanced Power Management level to 0xc8 (200) HDIO_DRIVE_CMD failed: Bad address APM_level = 128 Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Fixes: 2cada584 ("block: cleanup error handling in sg_io") Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 23 8月, 2014 2 次提交
-
-
由 Tony Battersby 提交于
blk_rq_set_block_pc() memsets rq->cmd to 0, so it should come immediately after blk_get_request() to avoid overwriting the user-supplied CDB. Also check for failure to allocate rq. Fixes: f27b087b ("block: add blk_rq_set_block_pc()") Cc: <stable@vger.kernel.org> # 3.16.x Signed-off-by: NTony Battersby <tonyb@cybernetics.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tony Battersby 提交于
This patch fixes code such as the following with scsi-mq enabled: rq = blk_get_request(...); blk_rq_set_block_pc(rq); rq->cmd = my_cmd_buffer; /* separate CDB buffer */ blk_execute_rq_nowait(...); Code like this appears in e.g. sg_start_req() in drivers/scsi/sg.c (for large CDBs only). Without this patch, scsi_mq_prep_fn() will set rq->cmd back to rq->__cmd, causing the wrong CDB to be sent to the device. Signed-off-by: NTony Battersby <tonyb@cybernetics.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 22 8月, 2014 6 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBoaz Harrosh <boaz@plexistor.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Make sure we always clean up through the out label and just have a single place to put the request. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
While converting to percpu_ref for freezing, add703fd ("blk-mq: use percpu_ref for mq usage count") incorrectly made blk_mq_freeze_queue() misbehave when freezing is nested due to percpu_ref_kill() being invoked on an already killed ref. Fix it by making blk_mq_freeze_queue() kill and kick the queue only for the outermost freeze attempt. All the nested ones can simply wait for the ref to reach zero. While at it, remove unnecessary @wake initialization from blk_mq_unfreeze_queue(). Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Just grammar or spelling errors, nothing major. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
When getting a pi error we get to bio_integrity_end_io with bi_remaining already decremented to 0 where we will eventually need to call bio_endio with restored original bio completion handler. Calling bio_endio invokes a BUG_ON(). We should call bio_endio_nodec instead, like what is done in bio_integrity_verify_fn. Signed-off-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
blk-mq uses BLK_MQ_F_SHOULD_MERGE, as set by the driver at init time, to determine whether it should merge IO or not. However, this could also be disabled by the admin, if merging is switched off through sysfs. So check the general queue state as well before attempting to merge IO. Reported-by: NRob Elliott <Elliott@hp.com> Tested-by: NRob Elliott <Elliott@hp.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 16 8月, 2014 1 次提交
-
-
由 Ming Lei 提交于
Before doing queue release, the queue has been freezed already by blk_cleanup_queue(), so needn't to freeze queue for deleting tag set. This patch fixes the WARNING of "percpu_ref_kill() called more than once!" which is triggered during unloading block driver. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 06 8月, 2014 1 次提交
-
-
由 Dan Carpenter 提交于
The lvip[] array has "state->limit" elements so the condition here should be >= instead of >. Fixes: 6ceea22b ('partitions: add aix lvm partition support files') Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NPhilippe De Muyter <phdm@macqel.be> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 02 8月, 2014 1 次提交
-
-
由 Mikulas Patocka 提交于
Various subsystems can ask the bio subsystem to create a bio slab cache with some free space before the bio. This free space can be used for any purpose. Device mapper uses this per-bio-data feature to place some target-specific and device-mapper specific data before the bio, so that the target-specific data doesn't have to be allocated separately. This per-bio-data mechanism is used in place of kmalloc, so we need the allocated slab to have the same memory alignment as memory allocated with kmalloc. Change bio_find_or_create_slab() so that it uses ARCH_KMALLOC_MINALIGN alignment when creating the slab cache. This is needed so that dm-crypt can use per-bio-data for encryption - the crypto subsystem assumes this data will have the same alignment as kmalloc'ed memory. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Acked-by: NJens Axboe <axboe@fb.com>
-
- 16 7月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
While a queue is being destroyed, all the blkgs are destroyed and its ->root_blkg pointer is set to NULL. If someone else starts to drain while the queue is in this state, the following oops happens. NULL pointer dereference at 0000000000000028 IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 PGD e4a1067 PUD b773067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched] CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000 RIP: 0010:[<ffffffff8144e944>] [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 RSP: 0018:ffff88000efd7bf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001 R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450 R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28 FS: 00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0 Stack: ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80 ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58 ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450 Call Trace: [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60 [<ffffffff81427641>] __blk_drain_queue+0x71/0x180 [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0 [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120 [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50 [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40 [<ffffffff8142d476>] blk_release_queue+0x26/0xd0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff81427505>] blk_put_queue+0x15/0x20 [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0 [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0 [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20 [<ffffffff817930e2>] device_release+0x32/0xa0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff817934d7>] put_device+0x17/0x20 [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0 [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40 [<ffffffff817d1257>] sdev_store_delete+0x27/0x30 [<ffffffff81792ca8>] dev_attr_store+0x18/0x30 [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50 [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170 [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0 [<ffffffff811f69bd>] SyS_write+0x4d/0xc0 [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b 776687bc ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero") made it easier to trigger this bug by making blk_queue_bypass_start() drain even when it loses the first bypass test to blk_cleanup_queue(); however, the bug has always been there even before the commit as blk_queue_bypass_start() could race against queue destruction, win the initial bypass test but perform the actual draining after blk_cleanup_queue() already destroyed all blkgs. Fix it by skippping calling into policy draining if all the blkgs are already gone. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NShirish Pargaonkar <spargaonkar@suse.com> Reported-by: NSasha Levin <sasha.levin@oracle.com> Reported-by: NJet Chen <jet.chen@intel.com> Cc: stable@vger.kernel.org Tested-by: NShirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 15 7月, 2014 3 次提交
-
-
由 Tejun Heo 提交于
Currently, cftypes added by cgroup_add_cftypes() are used for both the unified default hierarchy and legacy ones and subsystems can mark each file with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear only on one of them. This is quite hairy and error-prone. Also, we may end up exposing interface files to the default hierarchy without thinking it through. cgroup_subsys will grow two separate cftype addition functions and apply each only on the hierarchies of the matching type. This will allow organizing cftypes in a lot clearer way and encourage subsystems to scrutinize the interface which is being exposed in the new default hierarchy. In preparation, this patch adds cgroup_add_legacy_cftypes() which currently is a simple wrapper around cgroup_add_cftypes() and replaces all cgroup_add_cftypes() usages with it. While at it, this patch drops a completely spurious return from __hugetlb_cgroup_file_init(). This patch doesn't introduce any functional differences. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Tejun Heo 提交于
Currently, cgroup_subsys->base_cftypes is used for both the unified default hierarchy and legacy ones and subsystems can mark each file with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear only on one of them. This is quite hairy and error-prone. Also, we may end up exposing interface files to the default hierarchy without thinking it through. cgroup_subsys will grow two separate cftype arrays and apply each only on the hierarchies of the matching type. This will allow organizing cftypes in a lot clearer way and encourage subsystems to scrutinize the interface which is being exposed in the new default hierarchy. In preparation, this patch renames cgroup_subsys->base_cftypes to cgroup_subsys->legacy_cftypes. This patch is pure rename. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
由 Jens Axboe 提交于
This reverts commit 254c4407. It causes crashes with cryptsetup, even after a few iterations and updates. Drop it for now.
-
- 14 7月, 2014 1 次提交
-
-
由 Mikulas Patocka 提交于
This patch provides the compat BLKZEROOUT ioctl. The argument is a pointer to two uint64_t values, so there is no need to translate it. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # 3.7+ Acked-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 12 7月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
While a queue is being destroyed, all the blkgs are destroyed and its ->root_blkg pointer is set to NULL. If someone else starts to drain while the queue is in this state, the following oops happens. NULL pointer dereference at 0000000000000028 IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 PGD e4a1067 PUD b773067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched] CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000 RIP: 0010:[<ffffffff8144e944>] [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230 RSP: 0018:ffff88000efd7bf0 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001 R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450 R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28 FS: 00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0 Stack: ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80 ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58 ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450 Call Trace: [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60 [<ffffffff81427641>] __blk_drain_queue+0x71/0x180 [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0 [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120 [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50 [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40 [<ffffffff8142d476>] blk_release_queue+0x26/0xd0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff81427505>] blk_put_queue+0x15/0x20 [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0 [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0 [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20 [<ffffffff817930e2>] device_release+0x32/0xa0 [<ffffffff81454968>] kobject_cleanup+0x38/0x70 [<ffffffff81454848>] kobject_put+0x28/0x60 [<ffffffff817934d7>] put_device+0x17/0x20 [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0 [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40 [<ffffffff817d1257>] sdev_store_delete+0x27/0x30 [<ffffffff81792ca8>] dev_attr_store+0x18/0x30 [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50 [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170 [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0 [<ffffffff811f69bd>] SyS_write+0x4d/0xc0 [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b 776687bc ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero") made it easier to trigger this bug by making blk_queue_bypass_start() drain even when it loses the first bypass test to blk_cleanup_queue(); however, the bug has always been there even before the commit as blk_queue_bypass_start() could race against queue destruction, win the initial bypass test but perform the actual draining after blk_cleanup_queue() already destroyed all blkgs. Fix it by skippping calling into policy draining if all the blkgs are already gone. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NShirish Pargaonkar <spargaonkar@suse.com> Reported-by: NSasha Levin <sasha.levin@oracle.com> Reported-by: NJet Chen <jet.chen@intel.com> Cc: stable@vger.kernel.org Tested-by: NShirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-