    sched/deadline: Fix priority inheritance with multiple scheduling classes · 2279f540
    Juri Lelli authored
    Glenn reported that "an application [he developed produces] a BUG in
    deadline.c when a SCHED_DEADLINE task contends with CFS tasks on nested
    PTHREAD_PRIO_INHERIT mutexes.  I believe the bug is triggered when a CFS
    task that was boosted by a SCHED_DEADLINE task boosts another CFS task
    (nested priority inheritance).
    
     ------------[ cut here ]------------
     kernel BUG at kernel/sched/deadline.c:1462!
     invalid opcode: 0000 [#1] PREEMPT SMP
     CPU: 12 PID: 19171 Comm: dl_boost_bug Tainted: ...
     Hardware name: ...
     RIP: 0010:enqueue_task_dl+0x335/0x910
     Code: ...
     RSP: 0018:ffffc9000c2bbc68 EFLAGS: 00010002
     RAX: 0000000000000009 RBX: ffff888c0af94c00 RCX: ffffffff81e12500
     RDX: 000000000000002e RSI: ffff888c0af94c00 RDI: ffff888c10b22600
     RBP: ffffc9000c2bbd08 R08: 0000000000000009 R09: 0000000000000078
     R10: ffffffff81e12440 R11: ffffffff81e1236c R12: ffff888bc8932600
     R13: ffff888c0af94eb8 R14: ffff888c10b22600 R15: ffff888bc8932600
     FS:  00007fa58ac55700(0000) GS:ffff888c10b00000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007fa58b523230 CR3: 0000000bf44ab003 CR4: 00000000007606e0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     PKRU: 55555554
     Call Trace:
      ? intel_pstate_update_util_hwp+0x13/0x170
      rt_mutex_setprio+0x1cc/0x4b0
      task_blocks_on_rt_mutex+0x225/0x260
      rt_spin_lock_slowlock_locked+0xab/0x2d0
      rt_spin_lock_slowlock+0x50/0x80
      hrtimer_grab_expiry_lock+0x20/0x30
      hrtimer_cancel+0x13/0x30
      do_nanosleep+0xa0/0x150
      hrtimer_nanosleep+0xe1/0x230
      ? __hrtimer_init_sleeper+0x60/0x60
      __x64_sys_nanosleep+0x8d/0xa0
      do_syscall_64+0x4a/0x100
      entry_SYSCALL_64_after_hwframe+0x49/0xbe
     RIP: 0033:0x7fa58b52330d
     ...
     ---[ end trace 0000000000000002 ]---
    
    He also provided a simple reproducer creating the situation below:
    
     The execution order of the locking steps is the following
     (N1 and N2 are non-deadline tasks. D1 is a deadline task. M1 and M2
     are mutexes with priority inheritance enabled.)
    
     Time moves forward as this timeline goes down:
    
     N1              N2               D1
     |               |                |
     |               |                |
     Lock(M1)        |                |
     |               |                |
     |             Lock(M2)           |
     |               |                |
     |               |              Lock(M2)
     |               |                |
     |             Lock(M1)           |
     |             (!!bug triggered!) |
    
    Daniel reported a similar situation as well, by just letting ksoftirqd
    run with DEADLINE (and eventually block on a mutex).
    
    The problem is that boosted entities (Priority Inheritance) use the
    static DEADLINE parameters of the top priority waiter. However, the
    top waiter could be a non-DEADLINE entity that is currently boosted
    by a DEADLINE entity from a different lock chain (i.e., nested
    priority chains involving entities of non-DEADLINE classes). In this
    case, the top waiter's static DEADLINE parameters could be null
    (initialized to 0 at fork()) and replenish_dl_entity() would hit a
    BUG().
    
    Fix this by keeping track of the original donor and using its parameters
    when a task is boosted.
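
The idea of tracking the original donor can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not the kernel implementation: struct toy_entity and all toy_* functions are hypothetical names, and the real scheduler keeps considerably more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduling entity; not the kernel's struct. */
struct toy_entity {
    unsigned long long runtime;   /* static runtime; 0 for non-DEADLINE tasks */
    unsigned long long deadline;  /* static relative deadline; 0 likewise */
    struct toy_entity *donor;     /* whose parameters to use; self if not boosted */
};

/* A fresh entity is its own donor. */
void toy_init(struct toy_entity *e, unsigned long long runtime,
              unsigned long long deadline)
{
    e->runtime = runtime;
    e->deadline = deadline;
    e->donor = e;
}

/*
 * Boost the lock owner on behalf of its top waiter. The owner inherits
 * the waiter's donor rather than pointing at the waiter itself, so a
 * chain of boosts always ends at the original DEADLINE entity.
 */
void toy_boost(struct toy_entity *owner, const struct toy_entity *waiter)
{
    assert(waiter->donor != NULL);
    owner->donor = waiter->donor;
}

/* Drop the boost: the entity becomes its own donor again. */
void toy_deboost(struct toy_entity *e)
{
    e->donor = e;
}

/* Runtime used for replenishment comes from the donor, never a boosted waiter. */
unsigned long long toy_replenish_runtime(const struct toy_entity *e)
{
    return e->donor->runtime;
}
```

With D1 initialized to nonzero parameters and N1, N2 initialized to zero, boosting N2 from D1 and then N1 from N2 leaves N1 with D1 as donor, so the replenishment runtime is D1's value. Reading N2's own static runtime instead would yield 0, which is the condition the BUG() above guards against.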
    
    Reported-by: Glenn Elliott <glenn@aurora.tech>
    Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
    Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Tested-by: Daniel Bristot de Oliveira <bristot@redhat.com>
    Link: https://lkml.kernel.org/r/20201117061432.517340-1-juri.lelli@redhat.com
sched.h 57.4 KB