1. 07 6月, 2015 2 次提交
  2. 20 5月, 2015 1 次提交
  3. 19 5月, 2015 5 次提交
    • R
      sched/numa: Reduce conflict between fbq_classify_rq() and migration · c1ceac62
      Rik van Riel 提交于
      It is possible for fbq_classify_rq() to indicate that a CPU has tasks that
      should be moved to another NUMA node, but for migrate_improves_locality
      and migrate_degrades_locality to not identify those tasks.
      
      This patch always gives preference to preferred node evaluations, and
      only checks the number of faults when evaluating moves between two
      non-preferred nodes on a larger NUMA system.
      
      On a two node system, the number of faults is never evaluated. Either
      a task is about to be pulled off its preferred node, or migrated onto
      it.
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: mgorman@suse.de
      Link: http://lkml.kernel.org/r/20150514225936.35b91717@annuminas.surriel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c1ceac62
    • D
      sched/preempt, mm/fault: Count pagefault_disable() levels in pagefault_disabled · 8bcbde54
      David Hildenbrand 提交于
      Until now, pagefault_disable()/pagefault_enabled() used the preempt
      count to track whether in an environment with pagefaults disabled (can
      be queried via in_atomic()).
      
      This patch introduces a separate counter in task_struct to count the
      level of pagefault_disable() calls. We'll keep manipulating the preempt
      count to retain compatibility to existing pagefault handlers.
      
      It is now possible to verify whether in a pagefault_disable() envionment
      by calling pagefault_disabled(). In contrast to in_atomic() it will not
      be influenced by preempt_enable()/preempt_disable().
      
      This patch is based on a patch from Ingo Molnar.
      Reviewed-and-tested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: David.Laight@ACULAB.COM
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: airlied@linux.ie
      Cc: akpm@linux-foundation.org
      Cc: benh@kernel.crashing.org
      Cc: bigeasy@linutronix.de
      Cc: borntraeger@de.ibm.com
      Cc: daniel.vetter@intel.com
      Cc: heiko.carstens@de.ibm.com
      Cc: herbert@gondor.apana.org.au
      Cc: hocko@suse.cz
      Cc: hughd@google.com
      Cc: mst@redhat.com
      Cc: paulus@samba.org
      Cc: ralf@linux-mips.org
      Cc: schwidefsky@de.ibm.com
      Cc: yang.shi@windriver.com
      Link: http://lkml.kernel.org/r/1431359540-32227-2-git-send-email-dahi@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8bcbde54
    • F
      sched/preempt: Optimize preemption operations on __schedule() callers · b30f0e3f
      Frederic Weisbecker 提交于
      __schedule() disables preemption and some of its callers
      (the preempt_schedule*() family) also set PREEMPT_ACTIVE.
      
      So we have two preempt_count() modifications that could be performed
      at once.
      
      Lets remove the preemption disablement from __schedule() and pull
      this responsibility to its callers in order to optimize preempt_count()
      operations in a single place.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1431441711-29753-5-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b30f0e3f
    • S
      sched: always use blk_schedule_flush_plug in io_schedule_out · 10d784ea
      Shaohua Li 提交于
      block plug callback could sleep, so we introduce a parameter
      'from_schedule' and corresponding drivers can use it to destinguish a
      schedule plug flush or a plug finish. Unfortunately io_schedule_out
      still uses blk_flush_plug(). This causes below output (Note, I added a
      might_sleep() in raid1_unplug to make it trigger faster, but the whole
      thing doesn't matter if I add might_sleep). In raid1/10, this can cause
      deadlock.
      
      This patch makes io_schedule_out always uses blk_schedule_flush_plug.
      This should only impact drivers (as far as I know, raid 1/10) which are
      sensitive to the 'from_schedule' parameter.
      
      [  370.817949] ------------[ cut here ]------------
      [  370.817960] WARNING: CPU: 7 PID: 145 at ../kernel/sched/core.c:7306 __might_sleep+0x7f/0x90()
      [  370.817969] do not call blocking ops when !TASK_RUNNING; state=2 set at [<ffffffff81092fcf>] prepare_to_wait+0x2f/0x90
      [  370.817971] Modules linked in: raid1
      [  370.817976] CPU: 7 PID: 145 Comm: kworker/u16:9 Tainted: G        W       4.0.0+ #361
      [  370.817977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
      [  370.817983] Workqueue: writeback bdi_writeback_workfn (flush-9:1)
      [  370.817985]  ffffffff81cd83be ffff8800ba8cb298 ffffffff819dd7af 0000000000000001
      [  370.817988]  ffff8800ba8cb2e8 ffff8800ba8cb2d8 ffffffff81051afc ffff8800ba8cb2c8
      [  370.817990]  ffffffffa00061a8 000000000000041e 0000000000000000 ffff8800ba8cba28
      [  370.817993] Call Trace:
      [  370.817999]  [<ffffffff819dd7af>] dump_stack+0x4f/0x7b
      [  370.818002]  [<ffffffff81051afc>] warn_slowpath_common+0x8c/0xd0
      [  370.818004]  [<ffffffff81051b86>] warn_slowpath_fmt+0x46/0x50
      [  370.818006]  [<ffffffff81092fcf>] ? prepare_to_wait+0x2f/0x90
      [  370.818008]  [<ffffffff81092fcf>] ? prepare_to_wait+0x2f/0x90
      [  370.818010]  [<ffffffff810776ef>] __might_sleep+0x7f/0x90
      [  370.818014]  [<ffffffffa0000c03>] raid1_unplug+0xd3/0x170 [raid1]
      [  370.818024]  [<ffffffff81421d9a>] blk_flush_plug_list+0x8a/0x1e0
      [  370.818028]  [<ffffffff819e3550>] ? bit_wait+0x50/0x50
      [  370.818031]  [<ffffffff819e21b0>] io_schedule_timeout+0x130/0x140
      [  370.818033]  [<ffffffff819e3586>] bit_wait_io+0x36/0x50
      [  370.818034]  [<ffffffff819e31b5>] __wait_on_bit+0x65/0x90
      [  370.818041]  [<ffffffff8125b67c>] ? ext4_read_block_bitmap_nowait+0xbc/0x630
      [  370.818043]  [<ffffffff819e3550>] ? bit_wait+0x50/0x50
      [  370.818045]  [<ffffffff819e3302>] out_of_line_wait_on_bit+0x72/0x80
      [  370.818047]  [<ffffffff810935e0>] ? autoremove_wake_function+0x40/0x40
      [  370.818050]  [<ffffffff811de744>] __wait_on_buffer+0x44/0x50
      [  370.818053]  [<ffffffff8125ae80>] ext4_wait_block_bitmap+0xe0/0xf0
      [  370.818058]  [<ffffffff812975d6>] ext4_mb_init_cache+0x206/0x790
      [  370.818062]  [<ffffffff8114bc6c>] ? lru_cache_add+0x1c/0x50
      [  370.818064]  [<ffffffff81297c7e>] ext4_mb_init_group+0x11e/0x200
      [  370.818066]  [<ffffffff81298231>] ext4_mb_load_buddy+0x341/0x360
      [  370.818068]  [<ffffffff8129a1a3>] ext4_mb_find_by_goal+0x93/0x2f0
      [  370.818070]  [<ffffffff81295b54>] ? ext4_mb_normalize_request+0x1e4/0x5b0
      [  370.818072]  [<ffffffff8129ab67>] ext4_mb_regular_allocator+0x67/0x460
      [  370.818074]  [<ffffffff81295b54>] ? ext4_mb_normalize_request+0x1e4/0x5b0
      [  370.818076]  [<ffffffff8129ca4b>] ext4_mb_new_blocks+0x4cb/0x620
      [  370.818079]  [<ffffffff81290956>] ext4_ext_map_blocks+0x4c6/0x14d0
      [  370.818081]  [<ffffffff812a4d4e>] ? ext4_es_lookup_extent+0x4e/0x290
      [  370.818085]  [<ffffffff8126399d>] ext4_map_blocks+0x14d/0x4f0
      [  370.818088]  [<ffffffff81266fbd>] ext4_writepages+0x76d/0xe50
      [  370.818094]  [<ffffffff81149691>] do_writepages+0x21/0x50
      [  370.818097]  [<ffffffff811d5c00>] __writeback_single_inode+0x60/0x490
      [  370.818099]  [<ffffffff811d630a>] writeback_sb_inodes+0x2da/0x590
      [  370.818103]  [<ffffffff811abf4b>] ? trylock_super+0x1b/0x50
      [  370.818105]  [<ffffffff811abf4b>] ? trylock_super+0x1b/0x50
      [  370.818107]  [<ffffffff811d665f>] __writeback_inodes_wb+0x9f/0xd0
      [  370.818109]  [<ffffffff811d69db>] wb_writeback+0x34b/0x3c0
      [  370.818111]  [<ffffffff811d70df>] bdi_writeback_workfn+0x23f/0x550
      [  370.818116]  [<ffffffff8106bbd8>] process_one_work+0x1c8/0x570
      [  370.818117]  [<ffffffff8106bb5b>] ? process_one_work+0x14b/0x570
      [  370.818119]  [<ffffffff8106c09b>] worker_thread+0x11b/0x470
      [  370.818121]  [<ffffffff8106bf80>] ? process_one_work+0x570/0x570
      [  370.818124]  [<ffffffff81071868>] kthread+0xf8/0x110
      [  370.818126]  [<ffffffff81071770>] ? kthread_create_on_node+0x210/0x210
      [  370.818129]  [<ffffffff819e9322>] ret_from_fork+0x42/0x70
      [  370.818131]  [<ffffffff81071770>] ? kthread_create_on_node+0x210/0x210
      [  370.818132] ---[ end trace 7b4deb71e68b6605 ]---
      
      V2: don't change ->in_iowait
      
      Cc: NeilBrown <neilb@suse.de>
      Signed-off-by: NShaohua Li <shli@fb.com>
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      10d784ea
    • P
      watchdog: Fix merge 'conflict' · ab992dc3
      Peter Zijlstra 提交于
      Two watchdog changes that came through different trees had a non
      conflicting conflict, that is, one changed the semantics of a variable
      but no actual code conflict happened. So the merge appeared fine, but
      the resulting code did not behave as expected.
      
      Commit 195daf66 ("watchdog: enable the new user interface of the
      watchdog mechanism") changes the semantics of watchdog_user_enabled,
      which thereafter is only used by the functions introduced by
      b3738d29 ("watchdog: Add watchdog enable/disable all functions").
      
      There further appears to be a distinct lack of serialization between
      setting and using watchdog_enabled, so perhaps we should wrap the
      {en,dis}able_all() things in watchdog_proc_mutex.
      
      This patch fixes a s2r failure reported by Michal; which I cannot
      readily explain. But this does make the code internally consistent
      again.
      Reported-and-tested-by: NMichal Hocko <mhocko@suse.cz>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ab992dc3
  4. 17 5月, 2015 1 次提交
    • N
      sched: Fix function declaration return type mismatch · 58ac93e4
      Nicholas Mc Guire 提交于
      static code checking was unhappy with:
      
        ./kernel/sched/fair.c:162 WARNING: return of wrong type
                      int != unsigned int
      
      get_update_sysctl_factor() is declared to return int but is
      currently  returning an unsigned int. The first few preprocessed
      lines are:
      
       static int get_update_sysctl_factor(void)
       {
       unsigned int cpus = ({ int __min1 = (cpumask_weight(cpu_online_mask));
       int __min2 = (8); __min1 < __min2 ? __min1: __min2; });
       unsigned int factor;
      
      The type used by min_t() should be 'unsigned int' and the return type
      of get_update_sysctl_factor() should also be 'unsigned int' as its
      call-site update_sysctl() is expecting 'unsigned int' and the values
      utilizing:
      
        'factor'
        'sysctl_sched_min_granularity'
        'sched_nr_latency'
        'sysctl_sched_wakeup_granularity'
      
      ... are also all 'unsigned int', plus cpumask_weight() is also
      returning 'unsigned int'.
      
      So the natural type to use around here is 'unsigned int'.
      
      ( Patch was compile tested with x86_64_defconfig +
        CONFIG_SCHED_DEBUG=y and the changed sections in
        kernel/sched/fair.i were reviewed. )
      Signed-off-by: NNicholas Mc Guire <hofrat@osadl.org>
      [ Improved the changelog a bit. ]
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1431716742-11077-1-git-send-email-hofrat@osadl.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      58ac93e4
  5. 13 5月, 2015 1 次提交
  6. 09 5月, 2015 1 次提交
  7. 08 5月, 2015 13 次提交
  8. 07 5月, 2015 1 次提交
  9. 01 5月, 2015 1 次提交
  10. 29 4月, 2015 1 次提交
  11. 28 4月, 2015 1 次提交
  12. 27 4月, 2015 1 次提交
  13. 25 4月, 2015 2 次提交
  14. 23 4月, 2015 1 次提交
    • M
      kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP · 7e01b5ac
      Martin Schwidefsky 提交于
      Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code
      to override the gfp flags of the allocation for the kexec control
      page. The loop in kimage_alloc_normal_control_pages allocates pages
      with GFP_KERNEL until a page is found that happens to have an
      address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems
      with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT
      the loop will keep allocating memory until the oom killer steps in.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      7e01b5ac
  15. 20 4月, 2015 1 次提交
  16. 17 4月, 2015 7 次提交