1. 11 12月, 2016 5 次提交
    • V
      sched/core: Use load_avg for selecting idlest group · 6b94780e
      Vincent Guittot 提交于
      find_idlest_group() only compares the runnable_load_avg when looking
      for the least loaded group. But on fork intensive use case like
      hackbench where tasks blocked quickly after the fork, this can lead to
      selecting the same CPU instead of other CPUs, which have similar
      runnable load but a lower load_avg.
      
      When the runnable_load_avg of 2 CPUs are close, we now take into
      account the amount of blocked load as a 2nd selection factor. There is
      now 3 zones for the runnable_load of the rq:
      
       - [0 .. (runnable_load - imbalance)]:
      	Select the new rq which has significantly less runnable_load
      
       - [(runnable_load - imbalance) .. (runnable_load + imbalance)]:
      	The runnable loads are close so we use load_avg to chose
      	between the 2 rq
      
       - [(runnable_load + imbalance) .. ULONG_MAX]:
      	Keep the current rq which has significantly less runnable_load
      
      The scale factor that is currently used for comparing runnable_load,
      doesn't work well with small value. As an example, the use of a
      scaling factor fails as soon as this_runnable_load == 0 because we
      always select local rq even if min_runnable_load is only 1, which
      doesn't really make sense because they are just the same. So instead
      of scaling factor, we use an absolute margin for runnable_load to
      detect CPUs with similar runnable_load and we keep using scaling
      factor for blocked load.
      
      For use case like hackbench, this enable the scheduler to select
      different CPUs during the fork sequence and to spread tasks across the
      system.
      
      Tests have been done on a Hikey board (ARM based octo cores) for
      several kernel. The result below gives min, max, avg and stdev values
      of 18 runs with each configuration.
      
      The patches depend on the "no missing update_rq_clock()" work.
      
      hackbench -P -g 1
      
               ea86cb4b  7dc603c9  v4.8        v4.8+patches
        min    0.049         0.050         0.051       0,048
        avg    0.057         0.057(0%)     0.057(0%)   0,055(+5%)
        max    0.066         0.068         0.070       0,063
        stdev  +/-9%         +/-9%         +/-8%       +/-9%
      
      More performance numbers here:
      
        https://lkml.kernel.org/r/20161203214707.GI20785@codeblueprint.co.ukTested-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Signed-off-by: NVincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dietmar.eggemann@arm.com
      Cc: kernellwp@gmail.com
      Cc: umgwanakikbuti@gmail.com
      Cc: yuyang.du@intel.comc
      Link: http://lkml.kernel.org/r/1481216215-24651-3-git-send-email-vincent.guittot@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6b94780e
    • V
      sched/core: Fix find_idlest_group() for fork · f519a3f1
      Vincent Guittot 提交于
      During fork, the utilization of a task is init once the rq has been
      selected because the current utilization level of the rq is used to
      set the utilization of the fork task. As the task's utilization is
      still 0 at this step of the fork sequence, it doesn't make sense to
      look for some spare capacity that can fit the task's utilization.
      Furthermore, I can see perf regressions for the test:
      
         hackbench -P -g 1
      
      because the least loaded policy is always bypassed and tasks are not
      spread during fork.
      
      With this patch and the fix below, we are back to same performances as
      for v4.8. The fix below is only a temporary one used for the test
      until a smarter solution is found because we can't simply remove the
      test which is useful for others benchmarks
      
      | @@ -5708,13 +5708,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
      |
      |	avg_cost = this_sd->avg_scan_cost;
      |
      | -	/*
      | -	 * Due to large variance we need a large fuzz factor; hackbench in
      | -	 * particularly is sensitive here.
      | -	 */
      | -	if ((avg_idle / 512) < avg_cost)
      | -		return -1;
      | -
      |	time = local_clock();
      |
      |	for_each_cpu_wrap(cpu, sched_domain_span(sd), target, wrap) {
      Tested-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Signed-off-by: NVincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Acked-by: NMorten Rasmussen <morten.rasmussen@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dietmar.eggemann@arm.com
      Cc: kernellwp@gmail.com
      Cc: umgwanakikbuti@gmail.com
      Cc: yuyang.du@intel.comc
      Link: http://lkml.kernel.org/r/1481216215-24651-2-git-send-email-vincent.guittot@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f519a3f1
    • I
      Merge branch 'linus' into sched/core, to pick up fixes · 6643aab3
      Ingo Molnar 提交于
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6643aab3
    • L
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 04516981
      Linus Torvalds 提交于
      Pull crypto fixes from Herbert Xu:
       "This fixes the following issues:
      
         - Fix pointer size when caam is used with AArch64 boot loader on
           AArch32 kernel.
      
         - Fix ahash state corruption in marvell driver.
      
         - Fix buggy algif_aed tag handling.
      
         - Prevent mcryptd from being used with incompatible algorithms which
           can cause crashes"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: algif_aead - fix uninitialized variable warning
        crypto: mcryptd - Check mcryptd algorithm compatibility
        crypto: algif_aead - fix AEAD tag memory handling
        crypto: caam - fix pointer size for AArch64 boot loader, AArch32 kernel
        crypto: marvell - Don't corrupt state of an STD req for re-stepped ahash
        crypto: marvell - Don't copy hash operation twice into the SRAM
      04516981
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · cd662895
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Limit the number of can filters to avoid > MAX_ORDER allocations.
          Fix from Marc Kleine-Budde.
      
       2) Limit GSO max size in netvsc driver to avoid problems with NVGRE
          configurations. From Stephen Hemminger.
      
       3) Return proper error when memory allocation fails in
          ser_gigaset_init(), from Dan Carpenter.
      
       4) Missing linkage undo in error paths of ipvlan_link_new(), from Gao
          Feng.
      
       5) Missing necessayr SET_NETDEV_DEV in lantiq and cpmac drivers, from
          Florian Fainelli.
      
       6) Handle probe deferral properly in smsc911x driver.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: mlx5: Fix Kconfig help text
        net: smsc911x: back out silently on probe deferrals
        ibmveth: set correct gso_size and gso_type
        net: ethernet: cpmac: Call SET_NETDEV_DEV()
        net: ethernet: lantiq_etop: Call SET_NETDEV_DEV()
        vhost-vsock: fix orphan connection reset
        cxgb4/cxgb4vf: Assign netdev->dev_port with port ID
        driver: ipvlan: Unlink the upper dev when ipvlan_link_new failed
        ser_gigaset: return -ENOMEM on error instead of success
        NET: usb: cdc_mbim: add quirk for supporting Telit LE922A
        can: peak: fix bad memory access and free sequence
        phy: Don't increment MDIO bus refcount unless it's a different owner
        netvsc: reduce maximum GSO size
        drivers: net: cpsw-phy-sel: Clear RGMII_IDMODE on "rgmii" links
        can: raw: raw_setsockopt: limit number of can_filter that can be set
      cd662895
  2. 10 12月, 2016 10 次提交
  3. 09 12月, 2016 13 次提交
  4. 08 12月, 2016 12 次提交