1. 25 7月, 2018 5 次提交
    • I
      4765096f
    • H
      sched/rt: Restore rt_runtime after disabling RT_RUNTIME_SHARE · f3d133ee
      Hailong Liu 提交于
      NO_RT_RUNTIME_SHARE feature is used to prevent a CPU borrow enough
      runtime with a spin-rt-task.
      
      However, if RT_RUNTIME_SHARE feature is enabled and rt_rq has borrowd
      enough rt_runtime at the beginning, rt_runtime can't be restored to
      its initial bandwidth rt_runtime after we disable RT_RUNTIME_SHARE.
      
      E.g. on my PC with 4 cores, procedure to reproduce:
      1) Make sure  RT_RUNTIME_SHARE is enabled
       cat /sys/kernel/debug/sched_features
        GENTLE_FAIR_SLEEPERS START_DEBIT NO_NEXT_BUDDY LAST_BUDDY
        CACHE_HOT_BUDDY WAKEUP_PREEMPTION NO_HRTICK NO_DOUBLE_TICK
        LB_BIAS NONTASK_CAPACITY TTWU_QUEUE NO_SIS_AVG_CPU SIS_PROP
        NO_WARN_DOUBLE_CLOCK RT_PUSH_IPI RT_RUNTIME_SHARE NO_LB_MIN
        ATTACH_AGE_LOAD WA_IDLE WA_WEIGHT WA_BIAS
      2) Start a spin-rt-task
       ./loop_rr &
      3) set affinity to the last cpu
       taskset -p 8 $pid_of_loop_rr
      4) Observe that last cpu have borrowed enough runtime.
       cat /proc/sched_debug | grep rt_runtime
        .rt_runtime                    : 950.000000
        .rt_runtime                    : 900.000000
        .rt_runtime                    : 950.000000
        .rt_runtime                    : 1000.000000
      5) Disable RT_RUNTIME_SHARE
       echo NO_RT_RUNTIME_SHARE > /sys/kernel/debug/sched_features
      6) Observe that rt_runtime can not been restored
       cat /proc/sched_debug | grep rt_runtime
        .rt_runtime                    : 950.000000
        .rt_runtime                    : 900.000000
        .rt_runtime                    : 950.000000
        .rt_runtime                    : 1000.000000
      
      This patch help to restore rt_runtime after we disable
      RT_RUNTIME_SHARE.
      Signed-off-by: NHailong Liu <liu.hailong6@zte.com.cn>
      Signed-off-by: NJiang Biao <jiang.biao2@zte.com.cn>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: zhong.weidong@zte.com.cn
      Link: http://lkml.kernel.org/r/1531874815-39357-1-git-send-email-liu.hailong6@zte.com.cnSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f3d133ee
    • D
      sched/deadline: Update rq_clock of later_rq when pushing a task · 840d7196
      Daniel Bristot de Oliveira 提交于
      Daniel Casini got this warn while running a DL task here at RetisLab:
      
        [  461.137582] ------------[ cut here ]------------
        [  461.137583] rq->clock_update_flags < RQCF_ACT_SKIP
        [  461.137599] WARNING: CPU: 4 PID: 2354 at kernel/sched/sched.h:967 assert_clock_updated.isra.32.part.33+0x17/0x20
            [a ton of modules]
        [  461.137646] CPU: 4 PID: 2354 Comm: label_image Not tainted 4.18.0-rc4+ #3
        [  461.137647] Hardware name: ASUS All Series/Z87-K, BIOS 0801 09/02/2013
        [  461.137649] RIP: 0010:assert_clock_updated.isra.32.part.33+0x17/0x20
        [  461.137649] Code: ff 48 89 83 08 09 00 00 eb c6 66 0f 1f 84 00 00 00 00 00 55 48 c7 c7 98 7a 6c a5 c6 05 bc 0d 54 01 01 48 89 e5 e8 a9 84 fb ff <0f> 0b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 83 7e 60 01 74 0a 48 3b
        [  461.137673] RSP: 0018:ffffa77e08cafc68 EFLAGS: 00010082
        [  461.137674] RAX: 0000000000000000 RBX: ffff8b3fc1702d80 RCX: 0000000000000006
        [  461.137674] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8b3fded164b0
        [  461.137675] RBP: ffffa77e08cafc68 R08: 0000000000000026 R09: 0000000000000339
        [  461.137676] R10: ffff8b3fd060d410 R11: 0000000000000026 R12: ffffffffa4e14e20
        [  461.137677] R13: ffff8b3fdec22940 R14: ffff8b3fc1702da0 R15: ffff8b3fdec22940
        [  461.137678] FS:  00007efe43ee5700(0000) GS:ffff8b3fded00000(0000) knlGS:0000000000000000
        [  461.137679] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [  461.137680] CR2: 00007efe30000010 CR3: 0000000301744003 CR4: 00000000001606e0
        [  461.137680] Call Trace:
        [  461.137684]  push_dl_task.part.46+0x3bc/0x460
        [  461.137686]  task_woken_dl+0x60/0x80
        [  461.137689]  ttwu_do_wakeup+0x4f/0x150
        [  461.137690]  ttwu_do_activate+0x77/0x80
        [  461.137692]  try_to_wake_up+0x1d6/0x4c0
        [  461.137693]  wake_up_q+0x32/0x70
        [  461.137696]  do_futex+0x7e7/0xb50
        [  461.137698]  __x64_sys_futex+0x8b/0x180
        [  461.137701]  do_syscall_64+0x5a/0x110
        [  461.137703]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
        [  461.137705] RIP: 0033:0x7efe4918ca26
        [  461.137705] Code: 00 00 00 74 17 49 8b 48 20 44 8b 59 10 41 83 e3 30 41 83 fb 20 74 1e be 85 00 00 00 41 ba 01 00 00 00 41 b9 01 00 00 04 0f 05 <48> 3d 01 f0 ff ff 73 1f 31 c0 c3 be 8c 00 00 00 49 89 c8 4d 31 d2
        [  461.137738] RSP: 002b:00007efe43ee4928 EFLAGS: 00000283 ORIG_RAX: 00000000000000ca
        [  461.137739] RAX: ffffffffffffffda RBX: 0000000005094df0 RCX: 00007efe4918ca26
        [  461.137740] RDX: 0000000000000001 RSI: 0000000000000085 RDI: 0000000005094e24
        [  461.137741] RBP: 00007efe43ee49c0 R08: 0000000005094e20 R09: 0000000004000001
        [  461.137741] R10: 0000000000000001 R11: 0000000000000283 R12: 0000000000000000
        [  461.137742] R13: 0000000005094df8 R14: 0000000000000001 R15: 0000000000448a10
        [  461.137743] ---[ end trace 187df4cad2bf7649 ]---
      
      This warning happened in the push_dl_task(), because
      __add_running_bw()->cpufreq_update_util() is getting the rq_clock of
      the later_rq before its update, which takes place at activate_task().
      The fix then is to update the rq_clock before calling add_running_bw().
      
      To avoid double rq_clock_update() call, we set ENQUEUE_NOCLOCK flag to
      activate_task().
      Reported-by: NDaniel Casini <daniel.casini@santannapisa.it>
      Signed-off-by: NDaniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NJuri Lelli <juri.lelli@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luca Abeni <luca.abeni@santannapisa.it>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tommaso Cucinotta <tommaso.cucinotta@santannapisa.it>
      Fixes: e0367b12 sched/deadline: Move CPU frequency selection triggering points
      Link: http://lkml.kernel.org/r/ca31d073a4788acf0684a8b255f14fea775ccf20.1532077269.git.bristot@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      840d7196
    • I
      stop_machine: Disable preemption after queueing stopper threads · 2610e889
      Isaac J. Manjarres 提交于
      This commit:
      
        9fb8d5dc ("stop_machine, Disable preemption when waking two stopper threads")
      
      does not fully address the race condition that can occur
      as follows:
      
      On one CPU, call it CPU 3, thread 1 invokes
      cpu_stop_queue_two_works(2, 3,...), and the execution is such
      that thread 1 queues the works for migration/2 and migration/3,
      and is preempted after releasing the locks for migration/2 and
      migration/3, but before waking the threads.
      
      Then, On CPU 2, a kworker, call it thread 2, is running,
      and it invokes cpu_stop_queue_two_works(1, 2,...), such that
      thread 2 queues the works for migration/1 and migration/2.
      Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
      migration/2 and migration/3. This means that when CPU 2
      releases the locks for migration/1 and migration/2, but before
      it wakes those threads, it can be preempted by migration/2.
      
      If thread 2 is preempted by migration/2, then migration/2 will
      execute the first work item successfully, since migration/3
      was woken up by CPU 3, but when it goes to execute the second
      work item, it disables preemption, calls multi_cpu_stop(),
      and thus, CPU 2 will wait forever for migration/1, which should
      have been woken up by thread 2. However migration/1 cannot be
      woken up by thread 2, since it is a kworker, so it is affine to
      CPU 2, but CPU 2 is running migration/2 with preemption
      disabled, so thread 2 will never run.
      
      Disable preemption after queueing works for stopper threads
      to ensure that the operation of queueing the works and waking
      the stopper threads is atomic.
      Co-Developed-by: NPrasad Sodagudi <psodagud@codeaurora.org>
      Co-Developed-by: NPavankumar Kondeti <pkondeti@codeaurora.org>
      Signed-off-by: NIsaac J. Manjarres <isaacm@codeaurora.org>
      Signed-off-by: NPrasad Sodagudi <psodagud@codeaurora.org>
      Signed-off-by: NPavankumar Kondeti <pkondeti@codeaurora.org>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bigeasy@linutronix.de
      Cc: gregkh@linuxfoundation.org
      Cc: matt@codeblueprint.co.uk
      Fixes: 9fb8d5dc ("stop_machine, Disable preemption when waking two stopper threads")
      Link: http://lkml.kernel.org/r/1531856129-9871-1-git-send-email-isaacm@codeaurora.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2610e889
    • Y
      sched/topology: Check variable group before dereferencing it · 6cd0c583
      Yi Wang 提交于
      The 'group' variable in sched_domain_debug_one() is not checked
      when firstly used in cpumask_test_cpu(cpu, sched_group_span(group)),
      but it might be NULL (it is checked later in the following while loop)
      and may cause NULL pointer dereference.
      
      We need to check it before using to avoid NULL dereference.
      Signed-off-by: NYi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NJiang Biao <jiang.biao2@zte.com.cn>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: zhong.weidong@zte.com.cn
      Link: http://lkml.kernel.org/r/1532319547-33335-1-git-send-email-wang.yi59@zte.com.cnSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6cd0c583
  2. 23 7月, 2018 4 次提交
    • L
      Linux 4.18-rc6 · d72e90f3
      Linus Torvalds 提交于
      d72e90f3
    • L
      Merge tag 'nvme-for-4.18' of git://git.infradead.org/nvme · 74413084
      Linus Torvalds 提交于
      Pull NVMe fixes from Christoph Hellwig:
      
       - fix a regression in 4.18 that causes a memory leak on probe failure
         (Keith Bush)
      
       - fix a deadlock in the passthrough ioctl code (Scott Bauer)
      
       - don't enable AENs if not supported (Weiping Zhang)
      
       - fix an old regression in metadata handling in the passthrough ioctl
         code (Roland Dreier)
      
      * tag 'nvme-for-4.18' of git://git.infradead.org/nvme:
        nvme: fix handling of metadata_len for NVME_IOCTL_IO_CMD
        nvme: don't enable AEN if not supported
        nvme: ensure forward progress during Admin passthru
        nvme-pci: fix memory leak on probe failure
      74413084
    • L
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 165ea0d1
      Linus Torvalds 提交于
      Pull vfs fixes from Al Viro:
       "Fix several places that screw up cleanups after failures halfway
        through opening a file (one open-coding filp_clone_open() and getting
        it wrong, two misusing alloc_file()). That part is -stable fodder from
        the 'work.open' branch.
      
        And Christoph's regression fix for uapi breakage in aio series;
        include/uapi/linux/aio_abi.h shouldn't be pulling in the kernel
        definition of sigset_t, the reason for doing so in the first place had
        been bogus - there's no need to expose struct __aio_sigset in
        aio_abi.h at all"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        aio: don't expose __aio_sigset in uapi
        ocxlflash_getfile(): fix double-iput() on alloc_file() failures
        cxl_getfile(): fix double-iput() on alloc_file() failures
        drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()
      165ea0d1
    • A
      alpha: fix osf_wait4() breakage · f88a333b
      Al Viro 提交于
      kernel_wait4() expects a userland address for status - it's only
      rusage that goes as a kernel one (and needs a copyout afterwards)
      
      [ Also, fix the prototype of kernel_wait4() to have that __user
        annotation   - Linus ]
      
      Fixes: 92ebce5a ("osf_wait4: switch to kernel_wait4()")
      Cc: stable@kernel.org # v4.13+
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f88a333b
  3. 22 7月, 2018 16 次提交
  4. 21 7月, 2018 10 次提交
  5. 20 7月, 2018 5 次提交