- 07 3月, 2012 9 次提交
-
-
由 Tejun Heo 提交于
IO scheduling and cgroup are tied to the issuing task via io_context and cgroup of %current. Unfortunately, there are cases where IOs need to be routed via a different task which makes scheduling and cgroup limit enforcement applied completely incorrectly. For example, all bios delayed by blk-throttle end up being issued by a delayed work item and get assigned the io_context of the worker task which happens to serve the work item and dumped to the default block cgroup. This is double confusing as bios which aren't delayed end up in the correct cgroup and makes using blk-throttle and cfq propio together impossible. Any code which punts IO issuing to another task is affected which is getting more and more common (e.g. btrfs). As both io_context and cgroup are firmly tied to task including userland visible APIs to manipulate them, it makes a lot of sense to match up tasks to bios. This patch implements bio_associate_current() which associates the specified bio with %current. The bio will record the associated ioc and blkcg at that point and block layer will use the recorded ones regardless of which task actually ends up issuing the bio. bio release puts the associated ioc and blkcg. It grabs and remembers ioc and blkcg instead of the task itself because task may already be dead by the time the bio is issued making ioc and blkcg inaccessible and those are all block layer cares about. elevator_set_req_fn() is updated such that the bio elvdata is being allocated for is available to the elevator. This doesn't update block cgroup policies yet. Further patches will implement the support. -v2: #ifdef CONFIG_BLK_CGROUP added around bio->bi_ioc dereference in rq_ioc() to fix build breakage. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Kent Overstreet <koverstreet@google.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
Currently ioc->nr_tasks is used to decide two things - whether an ioc is done issuing IOs and whether it's shared by multiple tasks. This patch separate out the first into ioc->active_ref, which is acquired and released using {get|put}_io_context_active() respectively. This will be used to associate bio's with a given task. This patch doesn't introduce any visible behavior change. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
ioc_task_link() is used to share %current's ioc on clone. If %current->io_context is set, %current is guaranteed to have refcount on the ioc and, thus, ioc_task_link() can't fail. Replace error checking in ioc_task_link() with WARN_ON_ONCE() and make it just increment refcount and nr_tasks. -v2: Description typo fix (Vivek). Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
Now that blkg additions / removals are always done under both q and blkcg locks, the only places RCU locking is necessary are blkg_lookup[_create]() for lookup w/o blkcg lock. This patch drops unncessary RCU locking replacing it with plain blkcg locking as necessary. * blkiocg_pre_destroy() already perform proper locking and don't need RCU. Dropped. * blkio_read_blkg_stats() now uses blkcg->lock instead of RCU read lock. This isn't a hot path. * Now unnecessary synchronize_rcu() from queue exit paths removed. This makes q->nr_blkgs unnecessary. Dropped. * RCU annotation on blkg->q removed. -v2: Vivek pointed out that blkg_lookup_create() still needs to be called under rcu_read_lock(). Updated. -v3: After the update, stats_lock locking in blkio_read_blkg_stats() shouldn't be using _irq variant as it otherwise ends up enabling irq while blkcg->lock is locked. Fixed. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
With the previous patch to move blkg list heads and counters to request_queue and blkg, logic to manage them in both policies are almost identical and can be moved to blkcg core. This patch moves blkg link logic into blkg_lookup_create(), implements common blkg unlink code in blkg_destroy(), and updates blkg_destory_all() so that it's policy specific and can skip root group. The updated blkg_destroy_all() is now used to both clear queue for bypassing and elv switching, and release all blkgs on q exit. This patch introduces a race window where policy [de]registration may race against queue blkg clearing. This can only be a problem on cfq unload and shouldn't be a real problem in practice (and we have many other places where this race already exists). Future patches will remove these unlikely races. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
Currently, specific policy implementations are responsible for maintaining list and number of blkgs. This duplicates code unnecessarily, and hinders factoring common code and providing blkcg API with better defined semantics. After this patch, request_queue hosts list heads and counters and blkg has list nodes for both policies. This patch only relocates the necessary fields and the next patch will actually move management code into blkcg core. Note that request_queue->blkg_list[] and ->nr_blkgs[] are hardcoded to have 2 elements. This is to avoid include dependency and will be removed by the next patch. This patch doesn't introduce any behavior change. -v2: Now unnecessary conditional on CONFIG_BLK_CGROUP_MODULE removed as pointed out by Vivek. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
Keep track of all request_queues which have blkcg initialized and turn on bypass and invoke blkcg_clear_queue() on all before making changes to blkcg policies. This is to prepare for moving blkg management into blkcg core. Note that this uses more brute force than necessary. Finer grained shoot down will be implemented later and given that policy [un]registration almost never happens on running systems (blk-throtl can't be built as a module and cfq usually is the builtin default iosched), this shouldn't be a problem for the time being. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
Rename and extend elv_queisce_start/end() to blk_queue_bypass_start/end() which are exported and supports nesting via @q->bypass_depth. Also add blk_queue_bypass() to test bypass state. This will be further extended and used for blkio_group management. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
elevator_ops->elevator_init_fn() has a weird return value. It returns a void * which the caller should assign to q->elevator->elevator_data and %NULL return denotes init failure. Update such that it returns integer 0/-errno and sets elevator_data directly as necessary. This makes the interface more conventional and eases further cleanup. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 3月, 2012 2 次提交
-
-
由 Alan Stern 提交于
This patch (as1519) fixes a bug in the block layer's disk-events polling. The polling is done by a work routine queued on the system_nrt_wq workqueue. Since that workqueue isn't freezable, the polling continues even in the middle of a system sleep transition. Obviously, polling a suspended drive for media changes and such isn't a good thing to do; in the case of USB mass-storage devices it can lead to real problems requiring device resets and even re-enumeration. The patch fixes things by creating a new system-wide, non-reentrant, freezable workqueue and using it for disk-events polling. Signed-off-by: NAlan Stern <stern@rowland.harvard.edu> CC: <stable@kernel.org> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NRafael J. Wysocki <rjw@sisk.pl> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jun'ichi Nomura 提交于
Since 2.6.39 (1196f8b8), when a driver returns -ENOMEDIUM for open(), __blkdev_get() calls rescan_partitions() to remove in-kernel partition structures and raise KOBJ_CHANGE uevent. However it ends up calling driver's revalidate_disk without open and could cause oops. In the case of SCSI: process A process B ---------------------------------------------- sys_open __blkdev_get sd_open returns -ENOMEDIUM scsi_remove_device <scsi_device torn down> rescan_partitions sd_revalidate_disk <oops> Oopses are reported here: http://marc.info/?l=linux-scsi&m=132388619710052 This patch separates the partition invalidation from rescan_partitions() and use it for -ENOMEDIUM case. Reported-by: NHuajun Li <huajun.li.lee@gmail.com> Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: stable@kernel.org Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 2月, 2012 2 次提交
-
-
由 Tejun Heo 提交于
While updating locking, b2efa052 "block, cfq: unlink cfq_io_context's immediately" moved elevator_exit_icq_fn() invocation from exit_io_context() to the final ioc put. While this doesn't cause catastrophic failure, it effectively removes task exit notification to elevator and cause noticeable IO performance degradation with CFQ. On task exit, CFQ used to immediately expire the slice if it was being used by the exiting task as no more IO would be issued by the task; however, after b2efa052, the notification is lost and disk could sit idle needlessly, leading to noticeable IO performance degradation for certain workloads. This patch renames ioc_exit_icq() to ioc_destroy_icq(), separates elevator_exit_icq_fn() invocation into ioc_exit_icq() and invokes it from exit_io_context(). ICQ_EXITED flag is added to avoid invoking the callback more than once for the same icq. Walking icq_list from ioc side and invoking elevator callback requires reverse double locking. This may be better implemented using RCU; unfortunately, using RCU isn't trivial. e.g. RCU protection would need to cover request_queue and queue_lock switch on cleanup makes grabbing queue_lock from RCU unsafe. Reverse double locking should do, at least for now. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-and-bisected-by: NShaohua Li <shli@kernel.org> LKML-Reference: <CANejiEVzs=pUhQSTvUppkDcc2TNZyfohBRLygW5zFmXyk5A-xQ@mail.gmail.com> Tested-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
icq->changed was used for ICQ_*_CHANGED bits. Rename it to flags and access it under ioc->lock instead of using atomic bitops. ioc_get_changed() is added so that the changed part can be fetched and cleared as before. icq->flags will be used to carry other flags. Signed-off-by: NTejun Heo <tj@kernel.org> Tested-by: NShaohua Li <shaohua.li@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 14 2月, 2012 4 次提交
-
-
由 Seungwon Jeon 提交于
Current PIO mode makes a kernel crash with CONFIG_HIGHMEM. Highmem pages have a NULL from sg_virt(sg). This patch fixes the following problem. Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c0004000 [00000000] *pgd=00000000 Internal error: Oops: 817 [#1] PREEMPT SMP Modules linked in: CPU: 0 Not tainted (3.0.15-01423-gdbf465f #589) PC is at dw_mci_pull_data32+0x4c/0x9c LR is at dw_mci_read_data_pio+0x54/0x1f0 pc : [<c0358824>] lr : [<c035988c>] psr: 20000193 sp : c0619d48 ip : c0619d70 fp : c0619d6c r10: 00000000 r9 : 00000002 r8 : 00001000 r7 : 00000200 r6 : 00000000 r5 : e1dd3100 r4 : 00000000 r3 : 65622023 r2 : 0000007f r1 : eeb96000 r0 : e1dd3100 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment xkernel Control: 10c5387d Table: 61e2004a DAC: 00000015 Process swapper (pid: 0, stack limit = 0xc06182f0) Stack: (0xc0619d48 to 0xc061a000) 9d40: e1dd3100 e1a4f000 00000000 e1dd3100 e1a4f000 00000200 9d60: c0619da4 c0619d70 c035988c c03587e4 c0619d9c e18158f4 e1dd3100 e1dd3100 9d80: 00000020 00000000 00000000 00000020 c06e8a84 00000000 c0619e04 c0619da8 9da0: c0359b24 c0359844 e18158f4 e1dd3164 e1dd3168 e1dd3150 3d02fc79 e1dd3154 9dc0: e1dd3178 00000000 00000020 00000000 e1dd3150 00000000 c10dd7e8 e1a84900 9de0: c061e7cc 00000000 00000000 0000008d c06e8a84 c061e780 c0619e4c c0619e08 9e00: c00c4738 c0359a34 3d02fc79 00000000 c0619e4c c05a1698 c05a1670 c05a165c 9e20: c04de8b0 c061e780 c061e7cc e1a84900 ffffed68 0000008d c0618000 00000000 9e40: c0619e6c c0619e50 c00c48b4 c00c46c8 c061e780 c00423ac c061e7cc ffffed68 9e60: c0619e8c c0619e70 c00c7358 c00c487c 0000008d ffffee38 c0618000 ffffed68 9e80: c0619ea4 c0619e90 c00c4258 c00c72b0 c00423ac ffffee38 c0619ecc c0619ea8 9ea0: c004241c c00c4234 ffffffff f8810000 0000006d 00000002 00000001 7fffffff 9ec0: c0619f44 c0619ed0 c0048bc0 c00423c4 220ae7a9 00000000 386f0d30 0005d3a4 9ee0: c00423ac c10dd0b8 c06f2cd8 c0618000 c0594778 c003a674 7fffffff c0619f44 9f00: 386f0d30 c0619f18 c00a6f94 c005be3c 80000013 ffffffff 386f0d30 0005d3a4 9f20: 386f0d30 0005d2d1 c10dd0a8 c10dd0b8 c06f2cd8 c0618000 c0619f74 c0619f48 9f40: c0345858 c005be00 c00a2440 c0618000 c0618000 c00410d8 c06c1944 c00410fc 9f60: c0594778 c003a674 c0619f9c c0619f78 c004a7e8 c03457b4 c0618000 c06c18f8 9f80: 00000000 c0039c70 c06c18d4 c003a674 c0619fb4 c0619fa0 c04ceafc c004a714 9fa0: c06287b4 c06c18f8 c0619ff4 c0619fb8 c0008b68 c04cea68 c0008578 00000000 9fc0: 00000000 c003a674 00000000 10c5387d c0628658 c003aa78 c062f1c4 4000406a 9fe0: 413fc090 00000000 00000000 c0619ff8 40008044 c0008858 00000000 00000000 Backtrace: [<c03587d8>] (dw_mci_pull_data32+0x0/0x9c) from [<c035988c>] (dw_mci_read_data_pio+0x54/0x1f0) r6:00000200 r5:e1a4f000 r4:e1dd3100 [<c0359838>] (dw_mci_read_data_pio+0x0/0x1f0) from [<c0359b24>] (dw_mci_interrupt+0xfc/0x4a4) [<c0359a28>] (dw_mci_interrupt+0x0/0x4a4) from [<c00c4738>] (handle_irq_event_percpu+0x7c/0x1b4) [<c00c46bc>] (handle_irq_event_percpu+0x0/0x1b4) from [<c00c48b4>] (handle_irq_event+0x44/0x64) [<c00c4870>] (handle_irq_event+0x0/0x64) from [<c00c7358>] (handle_fasteoi_irq+0xb4/0x124) r7:ffffed68 r6:c061e7cc r5:c00423ac r4:c061e780 [<c00c72a4>] (handle_fasteoi_irq+0x0/0x124) from [<c00c4258>] (generic_handle_irq+0x30/0x38) r7:ffffed68 r6:c0618000 r5:ffffee38 r4:0000008d [<c00c4228>] (generic_handle_irq+0x0/0x38) from [<c004241c>] (asm_do_IRQ+0x64/0xe0) r5:ffffee38 r4:c00423ac [<c00423b8>] (asm_do_IRQ+0x0/0xe0) from [<c0048bc0>] (__irq_svc+0x80/0x14c) Exception stack(0xc0619ed0 to 0xc0619f18) Signed-off-by: NSeungwon Jeon <tgih.jun@samsung.com> Acked-by: NWill Newton <will.newton@imgtec.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: NChris Ball <cjb@laptop.org>
-
由 Girish K S 提交于
Modified the mmc_poweroff to resume before sending the poweroff notification command. In sleep mode only AWAKE and RESET commands are allowed, so before sending the poweroff notification command resume from sleep mode and then send the notification command. PowerOff Notify is tested on a Synopsis Designware Host Controller (eMMC 4.5). The suspend to RAM and resume works fine. Signed-off-by: NGirish K S <girish.shivananjappa@linaro.org> Tested-by: NGirish K S <girish.shivananjappa@linaro.org> Reviewed-by: NSaugata Das <saugata.das@linaro.org> Signed-off-by: NChris Ball <cjb@laptop.org>
-
由 Jaehoon Chung 提交于
There is an understood mismatch between the voltage the host controller is set to and the voltage supplied to the card by a fixed voltage regulator. Teaching the driver to accept the mismatch is overly complicated. Instead just accept the regulator's voltage. This patch adds MMC_CAP2_BROKEN_VOLTAGE. If the voltage didn't satisfy between min_uV and max_uV, try to change the voltage in core.c. When changing the voltage, maybe use regulator_set_voltage(). In regulator_set_voltage(), check the below condition. /* sanity check */ if (!rdev->desc->ops->set_voltage && !rdev->desc->ops->set_voltage_sel) { ret = -EINVAL; goto out; } If some board should use the fixed-regulator, always return -EINVAL. Then, eMMC didn't initialize always. So if use a fixed-regulator, we need to add the MMC_CAP2_BROKEN_VOLTAGE. Signed-off-by: NJaehoon Chung <jh80.chung@samsung.com> Signed-off-by: NKyungmin Park <kyungmin.park@samsung.com> Acked-by: NAdrian Hunter <adrian.hunter@intel.com> Signed-off-by: NChris Ball <cjb@laptop.org>
-
由 Sujit Reddy Thumma 提交于
Ensure clocks are always enabled before any interaction with the host controller driver. This makes sure that there is no race between host execution and the core layer turning off clocks in different context with clock gating framework. Signed-off-by: NSujit Reddy Thumma <sthumma@codeaurora.org> Acked-by: NLinus Walleij <linus.walleij@linaro.org> Acked-by: NPer Forlin <per.forlin@stericsson.com> Signed-off-by: NChris Ball <cjb@laptop.org>
-
- 09 2月, 2012 1 次提交
-
-
由 Paolo Bonzini 提交于
The keeplocked variable in the cdrom driver is shared across multiple drives, but set in per-device ioctls. Move it to the per-device struct, avoiding that the setting on one drive affects the driver's behavior when closing another. [ Impact: limit udev's confusion to one drive when a CD burning program unlocks the CD door at the end of burning. ] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 08 2月, 2012 2 次提交
-
-
由 Tejun Heo 提交于
Plug merge calls two elevator callbacks outside queue lock - elevator_allow_merge_fn() and elevator_bio_merged_fn(). Although attempt_plug_merge() suggests that elevator is guaranteed to be there through the existing request on the plug list, nothing prevents plug merge from calling into dying or initializing elevator. For regular merges, bypass ensures elvpriv count to reach zero, which in turn prevents merges as all !ELVPRIV requests get REQ_SOFTBARRIER from forced back insertion. Plug merge doesn't check ELVPRIV, and, as the requests haven't gone through elevator insertion yet, it doesn't have SOFTBARRIER set allowing merges on a bypassed queue. This, for example, leads to the following crash during elevator switch. BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffff813b34e9>] cfq_allow_merge+0x49/0xa0 PGD 112cbc067 PUD 115d5c067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU 1 Modules linked in: deadline_iosched Pid: 819, comm: dd Not tainted 3.3.0-rc2-work+ #76 Bochs Bochs RIP: 0010:[<ffffffff813b34e9>] [<ffffffff813b34e9>] cfq_allow_merge+0x49/0xa0 RSP: 0018:ffff8801143a38f8 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff88011817ce28 RCX: ffff880116eb6cc0 RDX: 0000000000000000 RSI: ffff880118056e20 RDI: ffff8801199512f8 RBP: ffff8801143a3908 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff880118195708 R13: ffff880118052aa0 R14: ffff8801143a3d50 R15: ffff880118195708 FS: 00007f19f82cb700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000008 CR3: 0000000112c6a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dd (pid: 819, threadinfo ffff8801143a2000, task ffff880116eb6cc0) Stack: ffff88011817ce28 ffff880118195708 ffff8801143a3928 ffffffff81391bba ffff88011817ce28 ffff880118195708 ffff8801143a3948 ffffffff81391bf1 ffff88011817ce28 0000000000000000 ffff8801143a39a8 ffffffff81398e3e Call Trace: [<ffffffff81391bba>] elv_rq_merge_ok+0x4a/0x60 [<ffffffff81391bf1>] elv_try_merge+0x21/0x40 [<ffffffff81398e3e>] blk_queue_bio+0x8e/0x390 [<ffffffff81396a5a>] generic_make_request+0xca/0x100 [<ffffffff81396b04>] submit_bio+0x74/0x100 [<ffffffff811d45c2>] __blockdev_direct_IO+0x1ce2/0x3450 [<ffffffff811d0dc7>] blkdev_direct_IO+0x57/0x60 [<ffffffff811460b5>] generic_file_aio_read+0x6d5/0x760 [<ffffffff811986b2>] do_sync_read+0xe2/0x120 [<ffffffff81199345>] vfs_read+0xc5/0x180 [<ffffffff81199501>] sys_read+0x51/0x90 [<ffffffff81aeac12>] system_call_fastpath+0x16/0x1b There are multiple ways to fix this including making plug merge check ELVPRIV; however, * Calling into elevator outside queue lock is confusing and error-prone. * Requests on plug list aren't known to the elevator. They aren't on the elevator yet, so there's no elevator specific state to update. * Given the nature of plug merges - collecting bio's for the same purpose from the same issuer - elevator specific restrictions aren't applicable. So, simply don't call into elevator methods from plug merge by moving elv_bio_merged() from bio_attempt_*_merge() to blk_queue_bio(), and using blk_try_merge() in attempt_plug_merge(). This is based on Jens' patch to skip elevator_allow_merge_fn() from plug merge. Note that this makes per-cgroup merged stats skip plug merging. Signed-off-by: NTejun Heo <tj@kernel.org> LKML-Reference: <4F16F3CA.90904@kernel.dk> Original-patch-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tejun Heo 提交于
blk_rq_merge_ok() is the elevator-neutral part of merge eligibility test. blk_try_merge() determines merge direction and expects the caller to have tested elv_rq_merge_ok() previously. elv_rq_merge_ok() now wraps blk_rq_merge_ok() and then calls elv_iosched_allow_merge(). elv_try_merge() is removed and the two callers are updated to call elv_rq_merge_ok() explicitly followed by blk_try_merge(). While at it, make rq_merge_ok() functions return bool. This is to prepare for plug merge update and doesn't introduce any behavior change. This is based on Jens' patch to skip elevator_allow_merge_fn() from plug merge. Signed-off-by: NTejun Heo <tj@kernel.org> LKML-Reference: <4F16F3CA.90904@kernel.dk> Original-patch-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 07 2月, 2012 2 次提交
-
-
由 Tejun Heo 提交于
put_io_context() performed a complex trylock dancing to avoid deferring ioc release to workqueue. It was also broken on UP because trylock was always assumed to succeed which resulted in unbalanced preemption count. While there are ways to fix the UP breakage, even the most pathological microbench (forced ioc allocation and tight fork/exit loop) fails to show any appreciable performance benefit of the optimization. Strip it out. If there turns out to be workloads which are affected by this change, simpler optimization from the discussion thread can be applied later. Signed-off-by: NTejun Heo <tj@kernel.org> LKML-Reference: <1328514611.21268.66.camel@sli10-conroe> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Heiko Carstens 提交于
Setting the task name is done within setup_new_exec() by accessing bprm->filename. However this happens after flush_old_exec(). This may result in a use after free bug, flush_old_exec() may "complete" vfork_done, which will wake up the parent which in turn may free the passed in filename. To fix this add a new tcomm field in struct linux_binprm which contains the now early generated task name until it is used. Fixes this bug on s390: Unable to handle kernel pointer dereference at virtual kernel address 0000000039768000 Process kworker/u:3 (pid: 245, task: 000000003a3dc840, ksp: 0000000039453818) Krnl PSW : 0704000180000000 0000000000282e94 (setup_new_exec+0xa0/0x374) Call Trace: ([<0000000000282e2c>] setup_new_exec+0x38/0x374) [<00000000002dd12e>] load_elf_binary+0x402/0x1bf4 [<0000000000280a42>] search_binary_handler+0x38e/0x5bc [<0000000000282b6c>] do_execve_common+0x410/0x514 [<0000000000282cb6>] do_execve+0x46/0x58 [<00000000005bce58>] kernel_execve+0x28/0x70 [<000000000014ba2e>] ____call_usermodehelper+0x102/0x140 [<00000000005bc8da>] kernel_thread_starter+0x6/0xc [<00000000005bc8d4>] kernel_thread_starter+0x0/0xc Last Breaking-Event-Address: [<00000000002830f0>] setup_new_exec+0x2fc/0x374 Kernel panic - not syncing: Fatal exception: panic_on_oops Reported-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 2月, 2012 1 次提交
-
-
由 Venkatesh Pallipadi 提交于
Looks like change "PM QoS: Move and rename the implementation files" merged during the 3.2 development cycle made PM QoS depend on CONFIG_PM which depends on (PM_SLEEP || PM_RUNTIME). That breaks CPU C-states with kernels not having these CONFIGs, causing CPUs to spend time in Polling loop idle instead of going into deep C-states, consuming way way more power. This is with either acpi idle or intel idle enabled. Either CONFIG_PM should be enabled with any pm_qos users or the !CONFIG_PM pm_qos_request() should return sane defaults not to break the existing users. Here's is the patch for the latter option. [rjw: Modified the changelog slightly.] Signed-off-by: NVenkatesh Pallipadi <venki@google.com> Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Cc: stable@vger.kernel.org
-
- 04 2月, 2012 1 次提交
-
-
由 Peter Ujfalusi 提交于
Store the last used mclk configuration for the PLL. Signed-off-by: NPeter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: NSamuel Ortiz <sameo@linux.intel.com>
-
- 03 2月, 2012 4 次提交
-
-
由 Kuninori Morimoto 提交于
The usb/ch9.h will be installed to /usr/include/linux, and be used from user space. But le16_to_cpu() is only defined for kernel code. Without this patch, user space compile will be broken. Special thanks to Stefan Becker Cc: stable@vger.kernel.org # 3.2 Reported-by: NStefan Becker <chemobejk@gmail.com> Signed-off-by: NKuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: NFelipe Balbi <balbi@ti.com>
-
由 Josh Triplett 提交于
Signed-off-by: NJosh Triplett <josh@joshtriplett.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christopher Yeoh 提交于
This fixes the race in process_vm_core found by Oleg (see http://article.gmane.org/gmane.linux.kernel/1235667/ for details). This has been updated since I last sent it as the creation of the new mm_access() function did almost exactly the same thing as parts of the previous version of this patch did. In order to use mm_access() even when /proc isn't enabled, we move it to kernel/fork.c where other related process mm access functions already are. Signed-off-by: NChris Yeoh <yeohc@au1.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Haiyang Zhang 提交于
There is a possible data corruption if an RNDIS message goes beyond page boundary in the sending code path. This patch fixes the problem. Signed-off-by: NHaiyang Zhang <haiyangz@microsoft.com> Signed-off-by: NK. Y. Srinivasan <kys@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 2月, 2012 5 次提交
-
-
由 Kuninori Morimoto 提交于
The usb/ch9.h will be installed to /usr/include/linux, and be used from user space. But le16_to_cpu() is only defined for kernel code. Without this patch, user space compile will be broken. Special thanks to Stefan Becker Reported-by: NStefan Becker <chemobejk@gmail.com> Signed-off-by: NKuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Artem Bityutskiy 提交于
This patch fixes merge conflict resolution breakage introduced by merge d3712b9d ("Merge tag 'for-linus' of git://github.com/prasad-joshi/logfs_upstream"). The commit changed 'mtd_can_have_bb()' function and made it always return zero, which is incorrect. Instead, we need it to return whether the underlying flash device can have bad eraseblocks or not. UBI needs this information because it affects how it handles the underlying flash. E.g., if the underlying flash is NOR, it cannot have bad blocks and any write or erase error is fatal, and all we can do is to switch to R/O mode. We do not need to reserve a pool of good eraseblocks for bad eraseblocks handling, and so on. This patch also removes 'mtd_can_have_bb()' invocations from Logfs to ensure correct Logfs behavior. I've tested that with this patch UBI works on top of NOR and NAND flashes emulated by mtdram and nandsim correspondingly. This patch is based on patch from Linus Torvalds. Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com> Acked-by: NJörn Engel <joern@logfs.org> Acked-by: NPrasad Joshi <prasadjoshi.linux@gmail.com> Acked-by: NBrian Norris <computersforpeace@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kim, Milo 提交于
The 'x(execute)' permission is removed. (chmod from 0755 to 0644) Signed-off-by: NMilo(Woogyom) Kim <milo.kim@ti.com> Signed-off-by: NAnton Vorontsov <cbouatmailru@gmail.com>
-
由 Heiko Stübner 提交于
A struct device parameter is used in the enable and disable callbacks to distinguish between different gpio_keys devices. Platforms that don't use these callbacks may not include struct device at all, as seen on arch/arm/mach-s3c2410/mach-n30.c Signed-off-by: NHeiko Stuebner <heiko@sntech.de> Signed-off-by: NDmitry Torokhov <dtor@mail.ru>
-
由 Guennadi Liakhovetski 提交于
Add a flag to allow platforms to specify, whether a DMAC instance supports the MEMCPY operation. To avoid regressions, preserve the current default. Signed-off-by: NGuennadi Liakhovetski <g.liakhovetski@gmx.de> Acked-by: NPaul Mundt <lethal@linux-sh.org> Signed-off-by: NVinod Koul <vinod.koul@linux.intel.com>
-
- 01 2月, 2012 2 次提交
-
-
由 Dmitry Kasatkin 提交于
MPI_NULL is replaced with normal NULL. Signed-off-by: NDmitry Kasatkin <dmitry.kasatkin@intel.com> Reviewed-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NJames Morris <jmorris@namei.org>
-
由 Wu Fengguang 提交于
PROP_MAX_SHIFT should be set to <=32 on 64-bit box. This fixes two bugs in the below lines of bdi_dirty_limit(): bdi_dirty *= numerator; do_div(bdi_dirty, denominator); 1) divide error: do_div() only uses the lower 32 bit of the denominator, which may trimmed to be 0 when PROP_MAX_SHIFT > 32. 2) overflow: (bdi_dirty * numerator) could easily overflow if numerator used up to 48 bits, leaving only 16 bits to bdi_dirty Cc: <stable@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Reported-by: NIlya Tumaykin <librarian_rus@yahoo.com> Tested-by: NIlya Tumaykin <librarian_rus@yahoo.com> Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
-
- 30 1月, 2012 2 次提交
-
-
由 Artem Bityutskiy 提交于
Commits 3fe4bae8 and 079c985e broke MTD suspend in 2 ways: 1. When the '->suspend' method is not present, we return -EOPNOTSUPP, but the callers of 'mtd_suspend()' expects 0 instead. 2. Checking of the 'mtd' parameter against NULL has been incorrectly removed in 'mtd_cls_suspend()'. This patch fixes the breakages. This has been found, analyzed, reported and tested by Rafael J. Wysocki <rjw@sisk.pl>. Note, this patch is not needed in the stable tree because it causes a regression introduced during the v3.3 merge window. Reported-by: NRafael J. Wysocki <rjw@sisk.pl> Tested-by: NRafael J. Wysocki <rjw@sisk.pl> Tested-by: NRussell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
-
由 Rafael J. Wysocki 提交于
Commit 2aede851 PM / Hibernate: Freeze kernel threads after preallocating memory introduced a mechanism by which kernel threads were frozen after the preallocation of hibernate image memory to avoid problems with frozen kernel threads not responding to memory freeing requests. However, it overlooked the s2disk code path in which the SNAPSHOT_CREATE_IMAGE ioctl was run directly after SNAPSHOT_FREE, which caused freeze_workqueues_begin() to BUG(), because it saw that worqueues had been already frozen. Although in principle this issue might be addressed by removing the relevant BUG_ON() from freeze_workqueues_begin(), that would reintroduce the very problem that commit 2aede851 attempted to avoid into that particular code path. For this reason, to fix the issue at hand, introduce thaw_kernel_threads() and make the SNAPSHOT_FREE ioctl execute it. Special thanks to Srivatsa S. Bhat for detailed analysis of the problem. Reported-and-tested-by: NJiri Slaby <jslaby@suse.cz> Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Acked-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: stable@kernel.org
-
- 27 1月, 2012 1 次提交
-
-
由 Stephane Eranian 提交于
This patch fixes the sampling interrupt throttling mechanism. It was broken in v3.2. Events were not being unthrottled. The unthrottling mechanism required that events be checked at each timer tick. This patch solves this problem and also separates: - unthrottling - multiplexing - frequency-mode period adjustments Not all of them need to be executed at each timer tick. This third version of the patch is based on my original patch + PeterZ proposal (https://lkml.org/lkml/2012/1/7/87). At each timer tick, for each context: - if the current CPU has throttled events, we unthrottle events - if context has frequency-based events, we adjust sampling periods - if we have reached the jiffies interval, we multiplex (rotate) We decoupled rotation (multiplexing) from frequency-mode sampling period adjustments. They should not necessarily happen at the same rate. Multiplexing is subject to jiffies_interval (currently at 1 but could be higher once the tunable is exposed via sysfs). We have grouped frequency-mode adjustment and unthrottling into the same routine to minimize code duplication. When throttled while in frequency mode, we scan the events only once. We have fixed the threshold enforcement code in __perf_event_overflow(). There was a bug whereby it would allow more than the authorized rate because an increment of hwc->interrupts was not executed at the right place. The patch was tested with low sampling limit (2000) and fixed periods, frequency mode, overcommitted PMU. On a 2.1GHz AMD CPU: $ cat /proc/sys/kernel/perf_event_max_sample_rate 2000 We set a rate of 3000 samples/sec (2.1GHz/3000 = 700000): $ perf record -e cycles,cycles -c 700000 noploop 10 $ perf report -D | tail -21 Aggregated stats: TOTAL events: 80086 MMAP events: 88 COMM events: 2 EXIT events: 4 THROTTLE events: 19996 UNTHROTTLE events: 19996 SAMPLE events: 40000 cycles stats: TOTAL events: 40006 MMAP events: 5 COMM events: 1 EXIT events: 4 THROTTLE events: 9998 UNTHROTTLE events: 9998 SAMPLE events: 20000 cycles stats: TOTAL events: 39996 THROTTLE events: 9998 UNTHROTTLE events: 9998 SAMPLE events: 20000 For 10s, the cap is 2x2000x10 = 40000 samples. We get exactly that: 20000 samples/event. Signed-off-by: NStephane Eranian <eranian@google.com> Cc: <stable@kernel.org> # v3.2+ Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120126160319.GA5655@quadSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 25 1月, 2012 2 次提交
-
-
由 Jiri Pirko 提交于
This patch changes event message behaviour to send only updated records instead of whole list. This fixes bug on which userspace receives non-actual data in case multiple events occur in row. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Randy Dunlap 提交于
Fix new kernel-doc warning: Warning(include/linux/usb.h:1251): No description found for parameter 'num_mapped_sgs' Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-