1. 10 Sep, 2018 5 commits
    • sched/fair: Fix vruntime_normalized() for remote non-migration wakeup · d0cdb3ce
      Steve Muckle committed
      When a task which previously ran on a given CPU is remotely queued to
      wake up on that same CPU, there is a period where the task's state is
      TASK_WAKING and its vruntime is not normalized. This is not accounted
      for in vruntime_normalized() which will cause an error in the task's
      vruntime if it is switched from the fair class during this time.
      
      For example if it is boosted to RT priority via rt_mutex_setprio(),
      rq->min_vruntime will not be subtracted from the task's vruntime but
      it will be added again when the task returns to the fair class. The
      task's vruntime will have been erroneously doubled and the effective
      priority of the task will be reduced.
      
      Note this will also lead to inflation of all vruntimes since the doubled
      vruntime value will become the rq's min_vruntime when other tasks leave
      the rq. This leads to repeated doubling of the vruntime and priority
      penalty.
      
      Fix this by recognizing a WAKING task's vruntime as normalized only if
      sched_remote_wakeup is true. This indicates a migration, in which case
      the vruntime would have been normalized in migrate_task_rq_fair().
      
      Based on a similar patch from John Dias <joaodias@google.com>.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Signed-off-by: Steve Muckle <smuckle@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Chris Redpath <Chris.Redpath@arm.com>
      Cc: John Dias <joaodias@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miguel de Dios <migueldedios@google.com>
      Cc: Morten Rasmussen <Morten.Rasmussen@arm.com>
      Cc: Patrick Bellasi <Patrick.Bellasi@arm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Quentin Perret <quentin.perret@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Todd Kjos <tkjos@google.com>
      Cc: kernel-team@android.com
      Fixes: b5179ac7 ("sched/fair: Prepare to fix fairness problems on migration")
      Link: http://lkml.kernel.org/r/20180831224217.169476-1-smuckle@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
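The doubling described in this commit message can be reproduced with a toy model. The following is a minimal sketch, not the kernel's code: the `toy_*` types and helpers are hypothetical, and only the subtract-on-leave and add-on-return behavior stated in the message is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical names): only the fields the message mentions. */
struct toy_rq { uint64_t min_vruntime; };

struct toy_task {
    uint64_t vruntime;
    bool waking;               /* task state is TASK_WAKING */
    bool sched_remote_wakeup;  /* per the message, true indicates a migration */
};

/* Before the fix: every WAKING task is treated as already normalized. */
static bool toy_normalized_before_fix(const struct toy_task *p)
{
    return p->waking;
}

/* After the fix: a WAKING task is normalized only if it is migrating. */
static bool toy_normalized_after_fix(const struct toy_task *p)
{
    return p->waking && p->sched_remote_wakeup;
}

/* Leaving the fair class: subtract min_vruntime unless already normalized. */
static void toy_switch_from_fair(struct toy_task *p, const struct toy_rq *rq,
                                 bool normalized)
{
    if (!normalized)
        p->vruntime -= rq->min_vruntime;
}

/* Returning to the fair class: min_vruntime is added back. */
static void toy_switch_to_fair(struct toy_task *p, const struct toy_rq *rq)
{
    p->vruntime += rq->min_vruntime;
}

/* Remote wakeup on the same CPU (no migration), boosted to RT and back. */
static uint64_t toy_round_trip(uint64_t vruntime, uint64_t min_vruntime,
                               bool fixed)
{
    struct toy_rq rq = { .min_vruntime = min_vruntime };
    struct toy_task p = {
        .vruntime = vruntime,
        .waking = true,
        .sched_remote_wakeup = false,
    };
    bool normalized;

    assert(vruntime >= min_vruntime);  /* keep the toy subtraction in range */

    normalized = fixed ? toy_normalized_after_fix(&p)
                       : toy_normalized_before_fix(&p);
    toy_switch_from_fair(&p, &rq, normalized);
    toy_switch_to_fair(&p, &rq);
    return p.vruntime;
}
```

With a task vruntime of 1000 and an rq min_vruntime of 900, `toy_round_trip(1000, 900, false)` yields 1900 (the skipped subtraction followed by the addition), while `toy_round_trip(1000, 900, true)` returns the task at 1000.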
    • sched/pelt: Fix update_blocked_averages() for RT and DL classes · 12b04875
      Vincent Guittot committed
      update_blocked_averages() is called to periodically decay the stalled load
      of idle CPUs and to sync all loads before running load balance.
      
      When the cfs rq is idle, it triggers a load balance during pick_next_task_fair()
      in order to potentially pull tasks and to use this newly idle CPU. This
      load balance happens while the prev task from another class has not been put
      and its utilization has not been updated yet. This may lead to running time
      being wrongly accounted as idle time for the RT or DL classes.
      
      Test that no RT or DL task is running when updating their utilization in
      update_blocked_averages().
      
      We still update RT and DL utilization instead of simply skipping them to
      make sure that all metrics are synced when used during load balance.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 371bf427 ("sched/rt: Add rt_rq utilization tracking")
      Fixes: 3727e0e1 ("sched/dl: Add dl_rq utilization tracking")
      Link: http://lkml.kernel.org/r/1535728975-22799-1-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
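The accounting problem in this message can be sketched as follows. This is a toy model under stated assumptions, not the PELT code: the `toy_*` names are hypothetical, and utilization is reduced to two counters so that the effect of passing a "running" flag is visible.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_class { TOY_FAIR, TOY_RT, TOY_DL };

/* Hypothetical stand-in for a per-class utilization signal. */
struct toy_util {
    unsigned long running_ns;
    unsigned long idle_ns;
};

/* Account delta_ns as running or idle time for one class. */
static void toy_account(struct toy_util *u, unsigned long delta_ns, bool running)
{
    if (running)
        u->running_ns += delta_ns;
    else
        u->idle_ns += delta_ns;
}

/*
 * Both signals are always updated so they stay in sync. With check_curr
 * set, a class counts as running only if the current task belongs to it;
 * without it, the elapsed time is always booked as idle.
 */
static void toy_update_blocked(struct toy_util *rt, struct toy_util *dl,
                               enum toy_class curr, unsigned long delta_ns,
                               bool check_curr)
{
    toy_account(rt, delta_ns, check_curr && curr == TOY_RT);
    toy_account(dl, delta_ns, check_curr && curr == TOY_DL);
}

/* Helper: RT running time after one update starting from zero. */
static unsigned long toy_rt_running_ns(enum toy_class curr,
                                       unsigned long delta_ns, bool check_curr)
{
    struct toy_util rt = { 0, 0 }, dl = { 0, 0 };

    toy_update_blocked(&rt, &dl, curr, delta_ns, check_curr);
    /* Every nanosecond is accounted exactly once per class. */
    assert(rt.running_ns + rt.idle_ns == delta_ns);
    return rt.running_ns;
}
```

In this model, an update of 500 ns while an RT task is still current is booked as idle time unless the current task's class is checked.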
    • sched/topology: Set correct NUMA topology type · e5e96faf
      Srikar Dronamraju committed
      With the following commit:
      
        051f3ca0 ("sched/topology: Introduce NUMA identity node sched domain")
      
      the scheduler introduced a new NUMA level. However, this causes the NUMA topology
      on 2-node systems to no longer be marked as NUMA_DIRECT.
      
      After that commit, it gets reported as NUMA_BACKPLANE, because
      sched_domains_numa_level is now 2 on 2-node systems.
      
      Fix this by allowing systems that have up to 2 NUMA levels to be set as
      NUMA_DIRECT.
      
      While here remove code that assumes that level can be 0.
      Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andre Wild <wild@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
      Fixes: 051f3ca0 "Introduce NUMA identity node sched domain"
      Link: http://lkml.kernel.org/r/1533920419-17410-1-git-send-email-srikar@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
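The threshold change in this message can be sketched as a simple classification. This is a hypothetical illustration, not the scheduler's code: the `toy_*` names are invented, every non-direct topology is collapsed into one value, and the earlier limit of one level is an assumption that the message implies but does not state.

```c
#include <assert.h>

/* Hypothetical, simplified topology types. */
enum toy_numa_type { TOY_NUMA_DIRECT, TOY_NUMA_NOT_DIRECT };

/*
 * Classify a system by its number of NUMA levels. max_direct_levels is
 * the largest level count still treated as directly connected:
 * assumed to be 1 before the fix and 2 after it.
 */
static enum toy_numa_type toy_classify(int numa_levels, int max_direct_levels)
{
    assert(numa_levels >= 1);  /* this sketch assumes at least one level */
    return numa_levels <= max_direct_levels ? TOY_NUMA_DIRECT
                                            : TOY_NUMA_NOT_DIRECT;
}
```

A 2-node system with two levels is classified as `TOY_NUMA_NOT_DIRECT` under the assumed old limit and as `TOY_NUMA_DIRECT` once two levels are allowed.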
    • sched/debug: Fix potential deadlock when writing to sched_features · e73e8197
      Jiada Wang committed
      The following lockdep report can be triggered by writing to /sys/kernel/debug/sched_features:
      
        ======================================================
        WARNING: possible circular locking dependency detected
        4.18.0-rc6-00152-gcd3f77d7-dirty #18 Not tainted
        ------------------------------------------------------
        sh/3358 is trying to acquire lock:
        000000004ad3989d (cpu_hotplug_lock.rw_sem){++++}, at: static_key_enable+0x14/0x30
        but task is already holding lock:
        00000000c1b31a88 (&sb->s_type->i_mutex_key#3){+.+.}, at: sched_feat_write+0x160/0x428
        which lock already depends on the new lock.
        the existing dependency chain (in reverse order) is:
        -> #3 (&sb->s_type->i_mutex_key#3){+.+.}:
               lock_acquire+0xb8/0x148
               down_write+0xac/0x140
               start_creating+0x5c/0x168
               debugfs_create_dir+0x18/0x220
               opp_debug_register+0x8c/0x120
               _add_opp_dev+0x104/0x1f8
               dev_pm_opp_get_opp_table+0x174/0x340
               _of_add_opp_table_v2+0x110/0x760
               dev_pm_opp_of_add_table+0x5c/0x240
               dev_pm_opp_of_cpumask_add_table+0x5c/0x100
               cpufreq_init+0x160/0x430
               cpufreq_online+0x1cc/0xe30
               cpufreq_add_dev+0x78/0x198
               subsys_interface_register+0x168/0x270
               cpufreq_register_driver+0x1c8/0x278
               dt_cpufreq_probe+0xdc/0x1b8
               platform_drv_probe+0xb4/0x168
               driver_probe_device+0x318/0x4b0
               __device_attach_driver+0xfc/0x1f0
               bus_for_each_drv+0xf8/0x180
               __device_attach+0x164/0x200
               device_initial_probe+0x10/0x18
               bus_probe_device+0x110/0x178
               device_add+0x6d8/0x908
               platform_device_add+0x138/0x3d8
               platform_device_register_full+0x1cc/0x1f8
               cpufreq_dt_platdev_init+0x174/0x1bc
               do_one_initcall+0xb8/0x310
               kernel_init_freeable+0x4b8/0x56c
               kernel_init+0x10/0x138
               ret_from_fork+0x10/0x18
        -> #2 (opp_table_lock){+.+.}:
               lock_acquire+0xb8/0x148
               __mutex_lock+0x104/0xf50
               mutex_lock_nested+0x1c/0x28
               _of_add_opp_table_v2+0xb4/0x760
               dev_pm_opp_of_add_table+0x5c/0x240
               dev_pm_opp_of_cpumask_add_table+0x5c/0x100
               cpufreq_init+0x160/0x430
               cpufreq_online+0x1cc/0xe30
               cpufreq_add_dev+0x78/0x198
               subsys_interface_register+0x168/0x270
               cpufreq_register_driver+0x1c8/0x278
               dt_cpufreq_probe+0xdc/0x1b8
               platform_drv_probe+0xb4/0x168
               driver_probe_device+0x318/0x4b0
               __device_attach_driver+0xfc/0x1f0
               bus_for_each_drv+0xf8/0x180
               __device_attach+0x164/0x200
               device_initial_probe+0x10/0x18
               bus_probe_device+0x110/0x178
               device_add+0x6d8/0x908
               platform_device_add+0x138/0x3d8
               platform_device_register_full+0x1cc/0x1f8
               cpufreq_dt_platdev_init+0x174/0x1bc
               do_one_initcall+0xb8/0x310
               kernel_init_freeable+0x4b8/0x56c
               kernel_init+0x10/0x138
               ret_from_fork+0x10/0x18
        -> #1 (subsys mutex#6){+.+.}:
               lock_acquire+0xb8/0x148
               __mutex_lock+0x104/0xf50
               mutex_lock_nested+0x1c/0x28
               subsys_interface_register+0xd8/0x270
               cpufreq_register_driver+0x1c8/0x278
               dt_cpufreq_probe+0xdc/0x1b8
               platform_drv_probe+0xb4/0x168
               driver_probe_device+0x318/0x4b0
               __device_attach_driver+0xfc/0x1f0
               bus_for_each_drv+0xf8/0x180
               __device_attach+0x164/0x200
               device_initial_probe+0x10/0x18
               bus_probe_device+0x110/0x178
               device_add+0x6d8/0x908
               platform_device_add+0x138/0x3d8
               platform_device_register_full+0x1cc/0x1f8
               cpufreq_dt_platdev_init+0x174/0x1bc
               do_one_initcall+0xb8/0x310
               kernel_init_freeable+0x4b8/0x56c
               kernel_init+0x10/0x138
               ret_from_fork+0x10/0x18
        -> #0 (cpu_hotplug_lock.rw_sem){++++}:
               __lock_acquire+0x203c/0x21d0
               lock_acquire+0xb8/0x148
               cpus_read_lock+0x58/0x1c8
               static_key_enable+0x14/0x30
               sched_feat_write+0x314/0x428
               full_proxy_write+0xa0/0x138
               __vfs_write+0xd8/0x388
               vfs_write+0xdc/0x318
               ksys_write+0xb4/0x138
               sys_write+0xc/0x18
               __sys_trace_return+0x0/0x4
        other info that might help us debug this:
        Chain exists of:
          cpu_hotplug_lock.rw_sem --> opp_table_lock --> &sb->s_type->i_mutex_key#3
         Possible unsafe locking scenario:
               CPU0                    CPU1
               ----                    ----
          lock(&sb->s_type->i_mutex_key#3);
                                       lock(opp_table_lock);
                                       lock(&sb->s_type->i_mutex_key#3);
          lock(cpu_hotplug_lock.rw_sem);
         *** DEADLOCK ***
        2 locks held by sh/3358:
         #0: 00000000a8c4b363 (sb_writers#10){.+.+}, at: vfs_write+0x238/0x318
         #1: 00000000c1b31a88 (&sb->s_type->i_mutex_key#3){+.+.}, at: sched_feat_write+0x160/0x428
        stack backtrace:
        CPU: 5 PID: 3358 Comm: sh Not tainted 4.18.0-rc6-00152-gcd3f77d7-dirty #18
        Hardware name: Renesas H3ULCB Kingfisher board based on r8a7795 ES2.0+ (DT)
        Call trace:
         dump_backtrace+0x0/0x288
         show_stack+0x14/0x20
         dump_stack+0x13c/0x1ac
         print_circular_bug.isra.10+0x270/0x438
         check_prev_add.constprop.16+0x4dc/0xb98
         __lock_acquire+0x203c/0x21d0
         lock_acquire+0xb8/0x148
         cpus_read_lock+0x58/0x1c8
         static_key_enable+0x14/0x30
         sched_feat_write+0x314/0x428
         full_proxy_write+0xa0/0x138
         __vfs_write+0xd8/0x388
         vfs_write+0xdc/0x318
         ksys_write+0xb4/0x138
         sys_write+0xc/0x18
         __sys_trace_return+0x0/0x4
      
      This is because when loading the cpufreq_dt module we first acquire
      cpu_hotplug_lock.rw_sem lock, then in cpufreq_init(), we are taking
      the &sb->s_type->i_mutex_key lock.
      
      But when writing to /sys/kernel/debug/sched_features, the
      cpu_hotplug_lock.rw_sem lock depends on the &sb->s_type->i_mutex_key lock.
      
      To fix this bug, reverse the lock acquisition order when writing to
      sched_features. This way, cpu_hotplug_lock.rw_sem no longer depends on
      &sb->s_type->i_mutex_key.
      Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
      Cc: George G. Davis <george_davis@mentor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180731121222.26195-1-jiada_wang@mentor.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
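The cycle in the report above can be illustrated with a small dependency-graph check, in the spirit of what lockdep does. This is a toy sketch under stated assumptions: the `toy_*` names and the three-lock graph are hypothetical simplifications of the chain quoted in the report, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The three locks named in the "Chain exists of" line of the report. */
enum { TOY_HOTPLUG, TOY_OPP_TABLE, TOY_INODE, TOY_LOCK_COUNT };

/* edge[a][b] means "b was acquired while a was held". */
struct toy_lockdep {
    bool edge[TOY_LOCK_COUNT][TOY_LOCK_COUNT];
};

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool toy_reachable(const struct toy_lockdep *g, int from, int to,
                          bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < TOY_LOCK_COUNT; next++)
        if (g->edge[from][next] && !seen[next] &&
            toy_reachable(g, next, to, seen))
            return true;
    return false;
}

/*
 * Record that 'acquired' is taken while 'held' is held. Returns false,
 * without recording, if the new edge would close a cycle.
 */
static bool toy_acquire_while_holding(struct toy_lockdep *g, int held,
                                      int acquired)
{
    bool seen[TOY_LOCK_COUNT] = { false };

    assert(held != acquired);
    if (toy_reachable(g, acquired, held, seen))
        return false;
    g->edge[held][acquired] = true;
    return true;
}

/* Replay the existing chain, then the sched_features write path. */
static bool toy_sched_feat_write_ok(bool reversed_order)
{
    struct toy_lockdep g;

    memset(&g, 0, sizeof(g));
    toy_acquire_while_holding(&g, TOY_HOTPLUG, TOY_OPP_TABLE);
    toy_acquire_while_holding(&g, TOY_OPP_TABLE, TOY_INODE);

    return reversed_order
        ? toy_acquire_while_holding(&g, TOY_HOTPLUG, TOY_INODE)
        : toy_acquire_while_holding(&g, TOY_INODE, TOY_HOTPLUG);
}
```

Taking the hotplug lock while holding the inode lock closes the cycle, so `toy_sched_feat_write_ok(false)` fails. Taking them in the reversed order adds an edge that agrees with the existing chain, so `toy_sched_feat_write_ok(true)` succeeds.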
    • Linux 4.19-rc3 · 11da3a7f
      Linus Torvalds committed
  2. 09 Sep, 2018 10 commits
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9a568276
      Linus Torvalds committed
      Pull x86 fixes from Thomas Gleixner:
       "A set of fixes for x86:
      
         - Prevent multiplication result truncation on 32bit. Introduced with
           the early timestamp rework.
      
         - Ensure microcode revision storage to be consistent under all
           circumstances
      
         - Prevent write tearing of PTEs
      
         - Prevent confusion of user and kernel registers when dumping fatal
           signals verbosely
      
         - Make an error return value in a failure path of the vector
           allocation negative. Returning EINVAL might make the caller assume
           success and cause further wreckage.
      
         - A trivial kernel doc warning fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm: Use WRITE_ONCE() when setting PTEs
        x86/apic/vector: Make error return value negative
        x86/process: Don't mix user/kernel regs in 64bit __show_regs()
        x86/tsc: Prevent result truncation on 32bit
        x86: Fix kernel-doc atomic.h warnings
        x86/microcode: Update the new microcode revision unconditionally
        x86/microcode: Make sure boot_cpu_data.microcode is up-to-date
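The first item in this pull request, multiplication result truncation on 32bit, follows a common C pattern that can be sketched as below. This is a generic illustration, not the x86 code: the `toy_*` helpers and the kHz-to-ticks calculation are hypothetical, and `uint32_t` stands in for a 32-bit `unsigned long`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Buggy pattern: both operands are 32 bits wide, so the product wraps
 * modulo 2^32 before it is widened to the 64-bit return type.
 */
static uint64_t toy_cycles_per_tick_truncated(uint32_t khz, uint32_t hz)
{
    assert(hz != 0);
    uint32_t product = khz * UINT32_C(1000);  /* may wrap */
    return product / hz;
}

/* Fixed pattern: widen one operand first so the multiply is 64-bit. */
static uint64_t toy_cycles_per_tick_widened(uint32_t khz, uint32_t hz)
{
    assert(hz != 0);
    uint64_t product = (uint64_t)khz * 1000;
    return product / hz;
}
```

For a 5 GHz clock (`khz = 5000000`) and `hz = 1000`, the widened version returns 5000000, while the truncated version returns 705032 because the intermediate product 5000000000 does not fit in 32 bits.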
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3567994a
      Linus Torvalds committed
      Pull timekeeping fixes from Thomas Gleixner:
       "Two fixes for timekeeping:
      
         - Revert to the previous kthread based update, which is unfortunately
           required due to lock ordering issues. The removal caused boot
           failures on old Core2 machines. Add a proper comment why the thread
           needs to stay to prevent accidental removal in the future.
      
         - Fix a silly typo in a function declaration"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource: Revert "Remove kthread"
        timekeeping: Fix declaration of read_persistent_wall_and_boot_offset()
    • Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 225ad3cf
      Linus Torvalds committed
      Pull irqchip fix from Thomas Gleixner:
       "A single fix to prevent allocating excessive memory in the GIC/ITS
        driver.
      
        While the subject of the patch might suggest otherwise this is a real
        fix as some SoCs exceed the memory allocation limits and fail to boot"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3-its: Cap lpi_id_bits to reduce memory footprint
    • Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e0a0d058
      Linus Torvalds committed
      Pull cpu hotplug fixes from Thomas Gleixner:
       "Two fixes for the hotplug state machine code:
      
         - Move the misplaced smb() in the hotplug thread function to the
           proper place, otherwise a half-updated control struct could be
           observed
      
         - Prevent state corruption on error rollback, which causes the state
           to advance by one and as a consequence skip it in the bringup
           sequence"
      
      * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/hotplug: Prevent state corruption on error rollback
        cpu/hotplug: Adjust misplaced smb() in cpuhp_thread_fun()
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random · 3243a89d
      Linus Torvalds committed
      Pull random driver fix from Ted Ts'o:
       "Fix things so the choice of whether or not to trust RDRAND to
        initialize the CRNG is configurable via the boot option
        random.trust_cpu={on,off}"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
        random: make CPU trust a boot parameter
    • Merge tag 'kbuild-fixes-v4.19' of... · 1d225777
      Linus Torvalds committed
      Merge tag 'kbuild-fixes-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - make setlocalversion more robust about -dirty check
      
       - loosen the pkg-config requirement for Kconfig
      
       - change missing depmod to a warning from an error
      
       - warn modules_install when System.map is missing
      
      * tag 'kbuild-fixes-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: modules_install: warn when missing System.map file
        kbuild: make missing $DEPMOD a Warning instead of an Error
        kconfig: do not require pkg-config on make {menu,n}config
        kconfig: remove a spurious self-assignment
        scripts/setlocalversion: git: Make -dirty check more robust
    • kbuild: modules_install: warn when missing System.map file · f0b0d88a
      Randy Dunlap committed
      If there is no System.map file for "make modules_install",
      scripts/depmod.sh will silently exit with success, having done
      nothing.  Since this is an unexpected situation, change it to
      report a Warning for the missing file.  The behavior is not
      changed except for the Warning message.
      
      The (previous) silent success and new Warning can be reproduced
      by:
      $ make mrproper; make defconfig
      $ make modules; make modules_install
      
      and since System.map is produced by "make vmlinux", the steps
      above omit producing the System.map file.
      Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f8f65382
      Linus Torvalds committed
      Pull KVM fixes from Radim Krčmář:
       "ARM:
         - Fix a VFP corruption in 32-bit guest
         - Add missing cache invalidation for CoW pages
         - Two small cleanups
      
        s390:
         - Fallout from the hugetlbfs support: pfmf interpretation and locking
         - VSIE: fix keywrapping for nested guests
      
        PPC:
         - Fix a bug where pages might not get marked dirty, causing guest
           memory corruption on migration
         - Fix a bug causing reads from guest memory to use the wrong guest
           real address for very large HPT guests (>256G of memory), leading
           to failures in instruction emulation.
      
        x86:
         - Fix out of bound access from malicious pv ipi hypercalls
           (introduced in rc1)
         - Fix delivery of pending interrupts when entering a nested guest,
           preventing arbitrarily late injection
         - Sanitize kvm_stat output after destroying a guest
         - Fix infinite loop when emulating a nested guest page fault and
           improve the surrounding emulation code
         - Two minor cleanups"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
        KVM: LAPIC: Fix pv ipis out-of-bounds access
        KVM: nVMX: Fix loss of pending IRQ/NMI before entering L2
        arm64: KVM: Remove pgd_lock
        KVM: Remove obsolete kvm_unmap_hva notifier backend
        arm64: KVM: Only force FPEXC32_EL2.EN if trapping FPSIMD
        KVM: arm/arm64: Clean dcache to PoC when changing PTE due to CoW
        KVM: s390: Properly lock mm context allow_gmap_hpage_1m setting
        KVM: s390: vsie: copy wrapping keys to right place
        KVM: s390: Fix pfmf and conditional skey emulation
        tools/kvm_stat: re-animate display of dead guests
        tools/kvm_stat: indicate dead guests as such
        tools/kvm_stat: handle guest removals more gracefully
        tools/kvm_stat: don't reset stats when setting PID filter for debugfs
        tools/kvm_stat: fix updates for dead guests
        tools/kvm_stat: fix handling of invalid paths in debugfs provider
        tools/kvm_stat: fix python3 issues
        KVM: x86: Unexport x86_emulate_instruction()
        KVM: x86: Rename emulate_instruction() to kvm_emulate_instruction()
        KVM: x86: Do not re-{try,execute} after failed emulation in L2
        KVM: x86: Default to not allowing emulation retry in kvm_mmu_page_fault
        ...
    • Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 0f3aa48a
      Linus Torvalds committed
      Pull ARM SoC fixes from Olof Johansson:
       "A few more fixes that have trickled in:
      
         - MMC bus width fixup for some Allwinner platforms
      
         - Fix for NULL deref in ti-aemif when no platform data is passed in
      
         - Fix div by 0 in SCMI code
      
         - Add a missing module alias in a new RPi driver"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        memory: ti-aemif: fix a potential NULL-pointer dereference
        firmware: arm_scmi: fix divide by zero when sustained_perf_level is zero
        hwmon: rpi: add module alias to raspberrypi-hwmon
        arm64: allwinner: dts: h6: fix Pine H64 MMC bus width
    • Merge tag 'sunxi-fixes-for-4.19' of... · a132bb90
      Olof Johansson committed
      Merge tag 'sunxi-fixes-for-4.19' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
      
      Allwinner fixes for 4.19
      
      Just one fix for H6 mmc on the Pine H64: the mmc bus width was missing
      from the device tree. This was added in 4.19-rc1.
      
      * tag 'sunxi-fixes-for-4.19' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
        arm64: allwinner: dts: h6: fix Pine H64 MMC bus width
      Signed-off-by: Olof Johansson <olof@lixom.net>
  3. 08 Sep, 2018 15 commits
  4. 07 Sep, 2018 10 commits