1. 19 8月, 2015 18 次提交
    • T
      blkcg: remove unnecessary request_list->blkg NULL test in blk_put_rl() · 401efbf8
      Tejun Heo 提交于
      Since ec13b1d6 ("blkcg: always create the blkcg_gq for the root
      blkcg"), a request_list always has its blkg associated.  Drop
      unnecessary rl->blkg NULL test from blk_put_rl().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      401efbf8
    • T
      cfq-iosched: charge async IOs to the appropriate blkcg's instead of the root · 60a83707
      Tejun Heo 提交于
      Up until now, all async IOs were queued to async queues which are
      shared across the whole request_queue, which means that blkcg resource
      control is completely void on async IOs including all writeback IOs.
      It was done this way because writeback didn't support writeback and
      there was no way of telling which writeback IO belonged to which
      cgroup; however, writeback recently became cgroup aware and writeback
      bio's are sent down properly tagged with the blkcg's to charge them
      against.
      
      This patch makes async cfq_queues per-cfq_cgroup instead of
      per-cfq_data so that each async IO is charged to the blkcg that it was
      tagged for instead of unconditionally attributing it to root.
      
      * cfq_data->async_cfqq and ->async_idle_cfqq are moved to cfq_group
        and alloc / destroy paths are updated accordingly.
      
      * cfq_link_cfqq_cfqg() no longer overrides @cfqg to root for async
        queues.
      
      * check_blkcg_changed() now also invalidates async queues as they no
        longer stay the same across cgroups.
      
      After this patch, cfq's proportional IO control through blkio.weight
      works correctly when cgroup writeback is in use.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      60a83707
    • T
      cfq-iosched: fold cfq_find_alloc_queue() into cfq_get_queue() · d4aad7ff
      Tejun Heo 提交于
      cfq_find_alloc_queue() checks whether a queue actually needs to be
      allocated, which is unnecessary as its sole caller, cfq_get_queue(),
      only calls it if so.  Also, the oom queue fallback logic is scattered
      between cfq_get_queue() and cfq_find_alloc_queue().  There really
      isn't much going on in the latter and things can be made simpler by
      folding it into cfq_get_queue().
      
      This patch collapses cfq_find_alloc_queue() into cfq_get_queue().  The
      change is fairly straight-forward with one exception - async_cfqq is
      now initialized to NULL and the "!is_sync" test in the last if
      conditional is replaced with "async_cfqq" test.  This is because gcc
      (5.1.1) gets confused for some reason and warns that async_cfqq may be
      used uninitialized otherwise.  Oh well, the code isn't necessarily
      worse this way.
      
      This patch doesn't cause any functional difference.
      
      v2: Updated to reflect GFP_ATOMIC -> GPF_NOWAIT.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      d4aad7ff
    • T
      cfq-iosched: move cfq_group determination from cfq_find_alloc_queue() to cfq_get_queue() · 322731ed
      Tejun Heo 提交于
      This is necessary for making async cfq_cgroups per-cfq_group instead
      of per-cfq_data.  While this change makes cfq_get_queue() perform RCU
      locking and look up cfq_group even when it reuses async queue, the
      extra overhead is extremely unlikely to be noticeable given that this
      is already sitting behind cic->cfqq[] cache and the overall cost of
      cfq operation.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      322731ed
    • T
      cfq-iosched: remove @gfp_mask from cfq_find_alloc_queue() · 2da8de0b
      Tejun Heo 提交于
      Even when allocations fail, cfq_find_alloc_queue() always returns a
      valid cfq_queue by falling back to the oom cfq_queue.  As such, there
      isn't much point in taking @gfp_mask and trying "harder" if __GFP_WAIT
      is set.  GFP_NOWAIT allocations don't fail often and even when they do
      the degraded behavior is acceptable and temporary.
      
      After all, the only reason get_request(), which ultimately determines
      the gfp_mask, cares about __GFP_WAIT is to guarantee request
      allocation, assuming IO forward progress, for callers which are
      willing to wait.  There's no reason for cfq_find_alloc_queue() to
      behave differently on __GFP_WAIT when it already has a fallback
      mechanism.
      
      Remove @gfp_mask from cfq_find_alloc_queue() and propagate the changes
      to its callers.  This simplifies the function quite a bit and will
      help making async queues per-cfq_group.
      
      v2: Updated to reflect GFP_ATOMIC -> GPF_NOWAIT.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      2da8de0b
    • T
      blkcg, cfq-iosched: use GFP_NOWAIT instead of GFP_ATOMIC for non-critical allocations · d93a11f1
      Tejun Heo 提交于
      blkcg performs several allocations to track IOs per cgroup and enforce
      resource control.  Most of these allocations are performed lazily on
      demand in the IO path and thus can't involve reclaim path.  Currently,
      these allocations use GFP_ATOMIC; however, blkcg can gracefully deal
      with occassional failures of these allocations by punting IOs to the
      root cgroup and there's no reason to reach into the emergency reserve.
      
      This patch replaces GFP_ATOMIC with GFP_NOWAIT for the following
      allocations.
      
      * bdi_writeback_congested and blkcg_gq allocations in blkg_create().
      
      * radix tree node allocations for blkcg->blkg_tree.
      
      * cfq_queue allocation on ioprio changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Suggested-and-Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Suggested-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      d93a11f1
    • T
      cfq-iosched: minor cleanups · 563180a4
      Tejun Heo 提交于
      * Some were accessing cic->cfqq[] directly.  Always use cic_to_cfqq()
        and cic_set_cfqq().
      
      * check_ioprio_changed() doesn't need to verify cfq_get_queue()'s
        return for NULL.  It's always non-NULL.  Simplify accordingly.
      
      This patch doesn't cause any functional changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      563180a4
    • T
      cfq-iosched: fix oom cfq_queue ref leak in cfq_set_request() · bce6133b
      Tejun Heo 提交于
      If the cfq_queue cached in cfq_io_cq is the oom one, cfq_set_request()
      replaces it by invoking cfq_get_queue() again without putting the oom
      queue leaking the reference it was holding.  While oom queues are not
      released through reference counting, they're still reference counted
      and this can theoretically lead to the reference count overflowing and
      incorrectly invoke the usual release path on it.
      
      Fix it by making cfq_set_request() put the ref it was holding.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      bce6133b
    • T
      cfq-iosched: fix async oom queue handling · 95e5d6f6
      Tejun Heo 提交于
      Async cfqq's (cfq_queue's) are shared across cfq_data.  When
      cfq_get_queue() obtains a new queue from cfq_find_alloc_queue(), it
      stashes the pointer in cfq_data and reuses it from then on; however,
      the function doesn't consider that cfq_find_alloc_queue() may return
      the oom_cfqq under memory pressure and installs the returned queue
      unconditionally.
      
      If the oom_cfqq is installed as an async cfqq, cfq_set_request() will
      continue calling cfq_get_queue() hoping to replace it with a proper
      queue; however, cfq_get_queue() will keep returning the cached queue
      for the slot - the oom_cfqq.
      
      Fix it by skipping caching if the queue is the oom one.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      95e5d6f6
    • T
      cfq-iosched: simplify control flow in cfq_get_queue() · 4ebc1c61
      Tejun Heo 提交于
      cfq_get_queue()'s control flow looks like the following.
      
      	async_cfqq = NULL;
      	cfqq = NULL;
      
      	if (!is_sync) {
      		...
      		async_cfqq = ...;
      		cfqq = *async_cfqq;
      	}
      
      	if (!cfqq)
      		cfqq = ...;
      
      	if (!is_sync && !(*async_cfqq))
      		...;
      
      The only thing the local variable init, the second if, and the
      async_cfqq test in the third if achieves is to skip cfqq creation and
      installation if *async_cfqq was already non-NULL.  This is needlessly
      complicated with different tests examining the same condition.
      Simplify it to the following.
      
      	if (!is_sync) {
      		...
      		async_cfqq = ...;
      		cfqq = *async_cfqq;
      		if (cfqq)
      			goto out;
      	}
      
      	cfqq = ...;
      
      	if (!is_sync)
      		...;
       out:
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      4ebc1c61
    • T
      writeback: update writeback tracepoints to report cgroup · 5634cc2a
      Tejun Heo 提交于
      The following tracepoints are updated to report the cgroup used during
      cgroup writeback.
      
      * writeback_write_inode[_start]
      * writeback_queue
      * writeback_exec
      * writeback_start
      * writeback_written
      * writeback_wait
      * writeback_nowork
      * writeback_wake_background
      * wbc_writepage
      * writeback_queue_io
      * bdi_dirty_ratelimit
      * balance_dirty_pages
      * writeback_sb_inodes_requeue
      * writeback_single_inode[_start]
      
      Note that writeback_bdi_register is separated out from writeback_class
      as reporting cgroup doesn't make sense to it.  Tracepoints which take
      bdi are updated to take bdi_writeback instead.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Suggested-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      5634cc2a
    • T
      kernfs: implement kernfs_path_len() · 9acee9c5
      Tejun Heo 提交于
      Add a function to determine the path length of a kernfs node.  This
      for now will be used by writeback tracepoint updates.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      9acee9c5
    • T
      60292bcc
    • T
      writeback: remove wb_writeback_work->single_wait/done · 8a1270cd
      Tejun Heo 提交于
      wb_writeback_work->single_wait/done are used for the wait mechanism
      for synchronous wb_work (wb_writeback_work) items which are issued
      when bdi_split_work_to_wbs() fails to allocate memory for asynchronous
      wb_work items; however, there's no reason to use a separate wait
      mechanism for this.  bdi_split_work_to_wbs() can simply use on-stack
      fallback wb_work item and separate wb_completion to wait for it.
      
      This patch removes wb_work->single_wait/done and the related code and
      make bdi_split_work_to_wbs() use on-stack fallback wb_work and
      wb_completion instead.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Suggested-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      8a1270cd
    • T
      writeback: bdi_for_each_wb() iteration is memcg ID based not blkcg · 1ed8d48c
      Tejun Heo 提交于
      wb's (bdi_writeback's) are currently keyed by memcg ID; however, in an
      earlier implementation, wb's were keyed by blkcg ID.
      bdi_for_each_wb() walks bdi->cgwb_tree in the ascending ID order and
      allows iterations to start from an arbitrary ID which is used to
      interrupt and resume iterations.
      
      Unfortunately, while changing wb to be keyed by memcg ID instead of
      blkcg, bdi_for_each_wb() was missed and is still assuming that wb's
      are keyed by blkcg ID.  This doesn't affect iterations which don't get
      interrupted but bdi_split_work_to_wbs() makes use of iteration
      resuming on allocation failures and thus may incorrectly skip or
      repeat wb's.
      
      Fix it by changing bdi_for_each_wb() to take memcg IDs instead of
      blkcg IDs and updating bdi_split_work_to_wbs() accordingly.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      1ed8d48c
    • J
      Merge branch 'for-4.3-unified-base' of... · 11743ee0
      Jens Axboe 提交于
      Merge branch 'for-4.3-unified-base' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup into for-4.3/blkcg
      11743ee0
    • T
      cgroup: introduce cgroup_subsys->legacy_name · 3e1d2eed
      Tejun Heo 提交于
      This allows cgroup subsystems to use a different name on the unified
      hierarchy.  cgroup_subsys->name is used on the unified hierarchy,
      ->legacy_name elsewhere.  If ->legacy_name is not explicitly set, it's
      automatically set to ->name and the userland visible behavior remains
      unchanged.
      
      v2: Make parse_cgroupfs_options() only consider ->legacy_name as mount
          options are used only on legacy hierarchies.  Suggested by Li
          Zefan.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: cgroups@vger.kernel.org
      3e1d2eed
    • T
      cgroup: don't print subsystems for the default hierarchy · d98817d4
      Tejun Heo 提交于
      It doesn't make sense to print subsystems on mount option or
      /proc/PID/cgroup for the default hierarchy.
      
      * cgroup.controllers file at the root of the default hierarchy lists
        the currently attached controllers.
      
      * The default hierarchy is catch-all for unmounted subsystems.
      
      * The default hierarchy doesn't accept any mount options.
      
      Suppress subsystem printing on mount options and /proc/PID/cgroup for
      the default hierarchy.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NLi Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: cgroups@vger.kernel.org
      d98817d4
  2. 12 8月, 2015 1 次提交
  3. 06 8月, 2015 1 次提交
  4. 05 8月, 2015 1 次提交
    • T
      cgroup: define controller file conventions · 6abc8ca1
      Tejun Heo 提交于
      Traditionally, each cgroup controller implemented whatever interface
      it wanted leading to interfaces which are widely inconsistent.
      Examining the requirements of the controllers readily yield that there
      are only a few control schemes shared among all.
      
      Two major controllers already had to implement new interface for the
      unified hierarchy due to significant structural changes.  Let's take
      the chance to establish common conventions throughout all controllers.
      
      This patch defines CGROUP_WEIGHT_MIN/DFL/MAX to be used on all weight
      based control knobs and documents the conventions that controllers
      should follow on the unified hierarchy.  Except for io.weight knob,
      all existing unified hierarchy knobs are already compliant.  A
      follow-up patch will update io.weight.
      
      v2: Added descriptions of min, low and high knobs.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      6abc8ca1
  5. 28 7月, 2015 1 次提交
  6. 27 7月, 2015 3 次提交
    • L
      Linux 4.2-rc4 · cbfe8fa6
      Linus Torvalds 提交于
      cbfe8fa6
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2579d019
      Linus Torvalds 提交于
      Pull perf fix from Thomas Gleixner:
       "A single fix for the intel cqm perf facility to prevent IPIs from
        interrupt context"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/cqm: Return cached counter value from IRQ context
      2579d019
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 28003486
      Linus Torvalds 提交于
      Pull x86 fixes from Thomas Gleixner:
       "This update contains:
      
         - the manual revert of the SYSCALL32 changes which caused a
           regression
      
         - a fix for the MPX vma handling
      
         - three fixes for the ioremap 'is ram' checks.
      
         - PAT warning fixes
      
         - a trivial fix for the size calculation of TLB tracepoints
      
         - handle old EFI structures gracefully
      
        This also contains a PAT fix from Jan plus a revert thereof.  Toshi
        explained why the code is correct"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm/pat: Revert 'Adjust default caching mode translation tables'
        x86/asm/entry/32: Revert 'Do not use R9 in SYSCALL32' commit
        x86/mm: Fix newly introduced printk format warnings
        mm: Fix bugs in region_is_ram()
        x86/mm: Remove region_is_ram() call from ioremap
        x86/mm: Move warning from __ioremap_check_ram() to the call site
        x86/mm/pat, drivers/media/ivtv: Move the PAT warning and replace WARN() with pr_warn()
        x86/mm/pat, drivers/infiniband/ipath: Replace WARN() with pr_warn()
        x86/mm/pat: Adjust default caching mode translation tables
        x86/fpu: Disable dependent CPU features on "noxsave"
        x86/mpx: Do not set ->vm_ops on MPX VMAs
        x86/mm: Add parenthesis for TLB tracepoint size calculation
        efi: Handle memory error structures produced based on old versions of standard
      28003486
  7. 26 7月, 2015 12 次提交
    • T
      x86/mm/pat: Revert 'Adjust default caching mode translation tables' · 1a4e8795
      Thomas Gleixner 提交于
      Toshi explains:
      
      "No, the default values need to be set to the fallback types,
       i.e. minimal supported mode.  For WC and WT, UC is the fallback type.
      
       When PAT is disabled, pat_init() does update the tables below to
       enable WT per the default BIOS setup.  However, when PAT is enabled,
       but CPU has PAT -errata, WT falls back to UC per the default values."
      
      Revert: ca1fec58 'x86/mm/pat: Adjust default caching mode translation tables'
      Requested-by: NToshi Kani <toshi.kani@hp.com>
      Cc: Jan Beulich <jbeulich@suse.de>
      Link: http://lkml.kernel.org/r/1437577776.3214.252.camel@hp.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      1a4e8795
    • M
      perf/x86/intel/cqm: Return cached counter value from IRQ context · 2c534c0d
      Matt Fleming 提交于
      Peter reported the following potential crash which I was able to
      reproduce with his test program,
      
      [  148.765788] ------------[ cut here ]------------
      [  148.765796] WARNING: CPU: 34 PID: 2840 at kernel/smp.c:417 smp_call_function_many+0xb6/0x260()
      [  148.765797] Modules linked in:
      [  148.765800] CPU: 34 PID: 2840 Comm: perf Not tainted 4.2.0-rc1+ #4
      [  148.765803]  ffffffff81cdc398 ffff88085f105950 ffffffff818bdfd5 0000000000000007
      [  148.765805]  0000000000000000 ffff88085f105990 ffffffff810e413a 0000000000000000
      [  148.765807]  ffffffff82301080 0000000000000022 ffffffff8107f640 ffffffff8107f640
      [  148.765809] Call Trace:
      [  148.765810]  <NMI>  [<ffffffff818bdfd5>] dump_stack+0x45/0x57
      [  148.765818]  [<ffffffff810e413a>] warn_slowpath_common+0x8a/0xc0
      [  148.765822]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
      [  148.765824]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
      [  148.765825]  [<ffffffff810e422a>] warn_slowpath_null+0x1a/0x20
      [  148.765827]  [<ffffffff811613f6>] smp_call_function_many+0xb6/0x260
      [  148.765829]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
      [  148.765831]  [<ffffffff81161748>] on_each_cpu_mask+0x28/0x60
      [  148.765832]  [<ffffffff8107f6ef>] intel_cqm_event_count+0x7f/0xe0
      [  148.765836]  [<ffffffff811cdd35>] perf_output_read+0x2a5/0x400
      [  148.765839]  [<ffffffff811d2e5a>] perf_output_sample+0x31a/0x590
      [  148.765840]  [<ffffffff811d333d>] ? perf_prepare_sample+0x26d/0x380
      [  148.765841]  [<ffffffff811d3497>] perf_event_output+0x47/0x60
      [  148.765843]  [<ffffffff811d36c5>] __perf_event_overflow+0x215/0x240
      [  148.765844]  [<ffffffff811d4124>] perf_event_overflow+0x14/0x20
      [  148.765847]  [<ffffffff8107e7f4>] intel_pmu_handle_irq+0x1d4/0x440
      [  148.765849]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
      [  148.765853]  [<ffffffff81219bad>] ? vunmap_page_range+0x19d/0x2f0
      [  148.765854]  [<ffffffff81219d11>] ? unmap_kernel_range_noflush+0x11/0x20
      [  148.765859]  [<ffffffff814ce6fe>] ? ghes_copy_tofrom_phys+0x11e/0x2a0
      [  148.765863]  [<ffffffff8109e5db>] ? native_apic_msr_write+0x2b/0x30
      [  148.765865]  [<ffffffff8109e44d>] ? x2apic_send_IPI_self+0x1d/0x20
      [  148.765869]  [<ffffffff81065135>] ? arch_irq_work_raise+0x35/0x40
      [  148.765872]  [<ffffffff811c8d86>] ? irq_work_queue+0x66/0x80
      [  148.765875]  [<ffffffff81075306>] perf_event_nmi_handler+0x26/0x40
      [  148.765877]  [<ffffffff81063ed9>] nmi_handle+0x79/0x100
      [  148.765879]  [<ffffffff81064422>] default_do_nmi+0x42/0x100
      [  148.765880]  [<ffffffff81064563>] do_nmi+0x83/0xb0
      [  148.765884]  [<ffffffff818c7c0f>] end_repeat_nmi+0x1e/0x2e
      [  148.765886]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
      [  148.765888]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
      [  148.765890]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
      [  148.765891]  <<EOE>>  [<ffffffff8110ab66>] finish_task_switch+0x156/0x210
      [  148.765898]  [<ffffffff818c1671>] __schedule+0x341/0x920
      [  148.765899]  [<ffffffff818c1c87>] schedule+0x37/0x80
      [  148.765903]  [<ffffffff810ae1af>] ? do_page_fault+0x2f/0x80
      [  148.765905]  [<ffffffff818c1f4a>] schedule_user+0x1a/0x50
      [  148.765907]  [<ffffffff818c666c>] retint_careful+0x14/0x32
      [  148.765908] ---[ end trace e33ff2be78e14901 ]---
      
      The CQM task events are not safe to be called from within interrupt
      context because they require performing an IPI to read the counter value
      on all sockets. And performing IPIs from within IRQ context is a
      "no-no".
      
      Make do with the last read counter value currently event in
      event->count when we're invoked in this context.
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vikas Shivappa <vikas.shivappa@intel.com>
      Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
      Cc: Will Auld <will.auld@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      2c534c0d
    • L
      Merge tag 'usb-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 26ae19a3
      Linus Torvalds 提交于
      Pull USB fixes from Greg KH:
       "Here's a few USB and PHY fixes for 4.2-rc4.
      
        Nothing major, the shortlog has the full details.
      
        All of these have been in linux-next successfully"
      
      * tag 'usb-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits)
        USB: OHCI: fix bad #define in ohci-tmio.c
        cdc-acm: Destroy acm_minors IDR on module exit
        usb-storage: Add ignore-device quirk for gm12u320 based usb mini projectors
        usb-storage: ignore ZTE MF 823 card reader in mode 0x1225
        USB: OHCI: Fix race between ED unlink and URB submission
        usb: core: lpm: set lpm_capable for root hub device
        xhci: do not report PLC when link is in internal resume state
        xhci: prevent bus_suspend if SS port resuming in phase 1
        xhci: report U3 when link is in resume state
        xhci: Calculate old endpoints correctly on device reset
        usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function
        xhci: Workaround to get D3 working in Intel xHCI
        xhci: call BIOS workaround to enable runtime suspend on Intel Braswell
        usb: dwc3: Reset the transfer resource index on SET_INTERFACE
        usb: gadget: udc: core: Fix argument of dma_map_single for IOMMU
        usb: gadget: mv_udc_core: fix phy_regs I/O memory leak
        usb: ulpi: ulpi_init should be executed in subsys_initcall
        phy: berlin-usb: fix divider for BG2
        phy: berlin-usb: fix divider for BG2CD
        phy/pxa: add HAS_IOMEM dependency
        ...
      26ae19a3
    • L
      Merge tag 'tty-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 82b35f37
      Linus Torvalds 提交于
      Pull tty/serial driver fixes from Greg KH:
       "Here are a number of small serial and tty fixes for reported issues.
      
        All have been in linux-next successfully"
      
      * tag 'tty-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty: vt: Fix !TASK_RUNNING diagnostic warning from paste_selection()
        serial: core: Fix crashes while echoing when closing
        m32r: Add ioreadXX/iowriteXX big-endian mmio accessors
        Revert "serial: imx: initialized DMA w/o HW flow enabled"
        sc16is7xx: fix FIFO address of secondary UART
        sc16is7xx: fix Kconfig dependencies
        serial: etraxfs-uart: Fix release etraxfs_uart_ports
        tty/vt: Fix the memory leak in visual_init
        serial: amba-pl011: Fix devm_ioremap_resource return value check
        n_tty: signal and flush atomically
      82b35f37
    • L
      Merge tag 'staging-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · b0de415a
      Linus Torvalds 提交于
      Pull staging driver fixes from Greg KH:
       "Here are a number of iio and staging driver fixes for reported issues
        for 4.2-rc4.
      
        All have been in linux-next for a while with no problems"
      
      * tag 'staging-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (34 commits)
        iio:light:stk3310: make endianness independent of host
        iio:light:stk3310: move device register to end of probe
        iio: mma8452: use iio event type IIO_EV_TYPE_MAG
        iio: mcp320x: Fix NULL pointer dereference
        iio: adc: vf610: fix the adc register read fail issue
        iio: mlx96014: Replace offset sign
        iio: magnetometer: mmc35240: fix SET/RESET sequence
        iio: magnetometer: mmc35240: Fix SET/RESET mask
        iio: magnetometer: mmc35240: Fix crash in pm suspend
        iio:magnetometer:bmc150_magn: output intended variable
        iio:magnetometer:bmc150_magn: add regmap dependency
        staging: vt6656: check ieee80211_bss_conf bssid not NULL
        staging: vt6655: check ieee80211_bss_conf bssid not NULL
        iio: tmp006: Check channel info on write
        iio: sx9500: Add missing init in sx9500_buffer_pre{en,dis}able()
        iio:light:ltr501: fix regmap dependency
        iio:light:ltr501: fix variable in ltr501_init
        iio: sx9500: fix bug in compensation code
        iio: sx9500: rework error handling of raw readings
        iio: magnetometer: mmc35240: fix available sampling frequencies
        ...
      b0de415a
    • L
      Merge tag 'char-misc-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · e433b656
      Linus Torvalds 提交于
      Pull char/misc driver fixes from Greg KH:
       "Here are some char and misc driver fixes for reported issues.
      
        One parport patch is reverted as it was incorrect, thanks to testing
        by the 0-day bot"
      
      * tag 'char-misc-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        parport: Revert "parport: fix memory leak"
        mei: prevent unloading mei hw modules while the device is opened.
        misc: mic: scif bug fix for vmalloc_to_page crash
        parport: fix freeing freed memory
        parport: fix memory leak
        parport: fix error handling
      e433b656
    • S
      parport: Revert "parport: fix memory leak" · 4e5a74f1
      Sudip Mukherjee 提交于
      This reverts commit 23c40591 ("parport: fix memory leak")
      
      par_dev->state was already being removed in parport_unregister_device().
      Reported-by: NYing Huang <ying.huang@intel.com>
      Signed-off-by: NSudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e5a74f1
    • L
      Merge tag 'trace-v4.2-rc2-fix3' of... · 763e326c
      Linus Torvalds 提交于
      Merge tag 'trace-v4.2-rc2-fix3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull ftrace fix from Steven Rostedt:
       "Back in 3.16 the ftrace code was redesigned and cleaned up to remove
        the double iteration list (one for registered ftrace ops, and one for
        registered "global" ops), to just use one list.  That simplified the
        code but also broke the function tracing filtering on pid.
      
        This updates the code to handle the filtering again with the new
        logic"
      
      * tag 'trace-v4.2-rc2-fix3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix breakage of set_ftrace_pid
      763e326c
    • L
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm · 45200838
      Linus Torvalds 提交于
      Pull libnvdimm fix from Dan Williams:
       "A minor fix for the libnvdimm subsystem.
      
        This is not critical.  The problem can be worked around in userspace
        by putting the namespace temporarily into raw mode
        (ndctl_namespace_set_raw_mode() from libndctl), but that is awkward
        for management utilities.
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm:
        libnvdimm: fix namespace seed creation
      45200838
    • L
      Merge tag 'md/4.2-fixes' of git://neil.brown.name/md · aca105a6
      Linus Torvalds 提交于
      Pull md fixes from Neil Brown:
       "Some md fixes for 4.2
      
        Several are tagged for -stable.
        A few aren't because they are not very, serious or because they are in
        the 'experimental' cluster code"
      
      * tag 'md/4.2-fixes' of git://neil.brown.name/md:
        md/raid5: clear R5_NeedReplace when no longer needed.
        Fix read-balancing during node failure
        md-cluster: fix bitmap sub-offset in bitmap_read_sb
        md: Return error if request_module fails and returns positive value
        md: Skip cluster setup in case of error while reading bitmap
        md/raid1: fix test for 'was read error from last working device'.
        md: Skip cluster setup for dm-raid
        md: flush ->event_work before stopping array.
        md/raid10: always set reshape_safe when initializing reshape_position.
        md/raid5: avoid races when changing cache size.
      aca105a6
    • L
      Merge tag 'for-linus-20150724' of git://git.infradead.org/linux-mtd · 32fd3d4a
      Linus Torvalds 提交于
      Pull MTD fixes from Brian Norris:
       "Two trivial updates.  I meant to send these much earlier, but I've
        been preoccupied.
      
         - Add MAINTAINERS entry for diskonchip g3 driver
      
         - Fix an overlooked conflict in bitfield value assignments
      
        The latter update is a bit overdue, but there's no reason to wait any
        longer"
      
      * tag 'for-linus-20150724' of git://git.infradead.org/linux-mtd:
        mtd: nand: Fix NAND_USE_BOUNCE_BUFFER flag conflict
        MAINTAINERS: mtd: docg3: add docg3 maintainer
      32fd3d4a
    • D
      libnvdimm: fix namespace seed creation · 8ca24353
      Dan Williams 提交于
      A new BLK namespace "seed" device is created whenever the current seed
      is successfully probed.  However, if that namespace is assigned to a BTT
      it may never directly experience a successful probe as it is a
      subordinate device to a BTT configuration.
      
      The effect of the current code is that no new namespaces can be
      instantiated, after the seed namespace, to consume available BLK DPA
      capacity.  Fix this by treating a successful BTT probe event as a
      successful probe event for the backing namespace.
      Reported-by: NNicholas Moulin <nicholas.w.moulin@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      8ca24353
  8. 25 7月, 2015 3 次提交
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 33b40178
      Linus Torvalds 提交于
      Pull block fixes from Jens Axboe:
       "Four smaller fixes for the current series.  This contains:
      
         - A fix for clones of discard bio's, that can cause data corruption.
           From Martin.
      
         - A fix for null_blk, where in certain queue modes it could access a
           request after it had been freed.  From Mike Krinkin.
      
         - An error handling leak fix for blkcg, from Tejun.
      
         - Also from Tejun, export of the functions that a file system needs
           to implement cgroup writeback support"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        block: Do a full clone when splitting discard bios
        block: export bio_associate_*() and wbc_account_io()
        blkcg: fix gendisk reference leak in blkg_conf_prep()
        null_blk: fix use-after-free problem
      33b40178
    • L
      Merge branch 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 9fbf075c
      Linus Torvalds 提交于
      Pull libata fixes from Tejun Heo:
       "A couple important fixes.
      
         - A block layer change which removed restriction on max transfer size
           led to silent data corruption on some devices.  A new quirk is
           added to restore the old size limit for the reported device.  If it
           gets reported on more devices, we might have to consider restoring
           the restriction for ATA devices by default.
      
         - There finally is a SSD which is confirmed to cause data corruption
           on TRIM regardless of which flavor is used.  A new quirk is added
           and the device is blacklisted
      
         - Other device-specific workarounds"
      
      * 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        libata: Do not blacklist M510DC
        libata: increase the timeout when setting transfer mode
        libata: add ATA_HORKAGE_MAX_SEC_1024 to revert back to previous max_sectors limit
        libata: force disable trim for SuperSSpeed S238
        libata: add ATA_HORKAGE_NOTRIM
        libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk VB0250EAVER
        ata: pmp: add quirk for Marvell 4140 SATA PMP
      9fbf075c
    • L
      Merge tag 'mmc-4.2-rc3' of git://git.linaro.org/people/ulf.hansson/mmc · 1e63dca7
      Linus Torvalds 提交于
      Pull MMC fixes from Ulf Hansson:
       "Here are some mmc fixes intended for v4.2 rc4.
      
        Note, most of the changes are for the sdhci-esdhc-imx controller,
        which also required us to modify some related DTS files.  Those
        changes have been acked by the SoC maintainer.
      
        MMC core:
         - Fix a reference inbalance issue for power_ro_lock_show() sysfs handler
      
        MMC host:
         - omap_hsmmc: Fix IRQ errorhandling for CD, DTO, and CRC
         - sdhci: Prevent a kernel panic while using DMA
         - mtk-sd: Let it depend on HAS_DMA to prevent build errors
         - sdhci-esdhc: Make 8BIT bus work
         - sdhci-esdhc-imx: Fix some regressions for DT based platforms
         - sdhci-pxav3: Fix a regression for DT based platforms"
      
      * tag 'mmc-4.2-rc3' of git://git.linaro.org/people/ulf.hansson/mmc:
        mmc: sdhci-pxav3: fix platform_data is not initialized
        dts: mmc: fsl-imx-esdhc: remove fsl,cd-controller support
        mmc: sdhci-esdhc-imx: clear f_max in boarddata
        mmc: sdhci-esdhc-imx: remove duplicated dts parsing
        mmc: sdhci: make max-frequency property in device tree work
        mmc: sdhci-esdhc-imx: move all non dt probe code into one function
        mmc: sdhci-esdhc-imx: fix cd regression for dt platform
        dts: imx7: fix sd card gpio polarity specified in device tree
        dts: imx25: fix sd card gpio polarity specified in device tree
        dts: imx6: fix sd card gpio polarity specified in device tree
        dts: imx53: fix sd card gpio polarity specified in device tree
        dts: imx51: fix sd card gpio polarity specified in device tree
        mmc: sdhci-esdhc: Make 8BIT bus work
        mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()
        mmc: MMC_MTK should depend on HAS_DMA
        mmc: sdhci check parameters before call dma_free_coherent
        mmc: omap_hsmmc: Handle BADA, DEB and CEB interrupts
        mmc: omap_hsmmc: Fix DTO and DCRC handling
      1e63dca7