1. 02 Dec 2020 (2 commits)
  2. 27 Nov 2020 (12 commits)
  3. 26 Nov 2020 (2 commits)
    • lockdep: Introduce in_softirq lockdep assert · 8b5536ad
      Committed by Yunsheng Lin
      The current semantic for napi_consume_skb() is that the caller needs
      to provide a non-zero budget when calling from NAPI context; breaking
      this semantic causes hard-to-debug problems, because
      _kfree_skb_defer() needs to run in atomic context in order to push
      the skb onto the particular CPU's napi_alloc_cache atomically.
      
      So add lockdep_assert_in_softirq() to assert when the running
      context is not in_softirq. Here in_softirq means softirq is being
      served or BH is disabled, which has ambiguous semantics due to the
      BH-disabled case, so add a comment to emphasize that.
      
      Since softirq context can be interrupted by hard-IRQ or NMI
      context, lockdep_assert_in_softirq() needs to assert about hard-IRQ
      and NMI context too.
      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8b5536ad
    • net: phy: remove the .did_interrupt() and .ack_interrupt() callback · 6527b938
      Committed by Ioana Ciornei
      Now that all the PHY drivers have been migrated to directly implement
      the generic .handle_interrupt() callback for seamless support of
      shared IRQs, and all the .config_intr() implementations clear any
      pending interrupts, we can safely remove the two callbacks.
      
      With this patch, phylib has proper support for shared IRQs (and not
      just for multi-PHY devices). A PHY driver must implement both the
      .handle_interrupt() and .config_intr() callbacks for the IRQs to be
      actually used.
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6527b938
  4. 24 Nov 2020 (5 commits)
  5. 23 Nov 2020 (3 commits)
  6. 21 Nov 2020 (2 commits)
  7. 20 Nov 2020 (6 commits)
  8. 19 Nov 2020 (2 commits)
    • atm: nicstar: Replace in_interrupt() usage · f2bcc2fa
      Committed by Sebastian Andrzej Siewior
      push_scqe() uses in_interrupt() to figure out if it is allowed to sleep.
      
      The usage of in_interrupt() in drivers is phased out and Linus clearly
      requested that code which changes behaviour depending on context should
      either be separated or the context be conveyed in an argument passed by the
      caller, which usually knows the context.
      
      Aside from that, in_interrupt() is not correct as it does not catch
      preempt-disabled regions, which cannot sleep either.
      
      ns_send() (the only caller of push_scqe()) has the following callers:
      
      - vcc_sendmsg(), used as proto_ops::sendmsg, is expected to be invoked
        in preemptible context.
        -> vcc->dev->ops->send() (ns_send())
      
      - atm_vcc::send via atmdev_ops::send either directly (pointer copied by
        atm_init_aal34() or atm_init_aal5()) or via atm_send_aal0().
        This is invoked by drivers (like br2684, clip, pppoatm, ...) which are
        called from net_device_ops::ndo_start_xmit with BH disabled.
      
      Add atmdev_ops::send_bh, which is used by callers from BH context
      (atm_send_aal*()); if this callback is missing, ::send is used
      instead.
      Implement this callback in nicstar and use it to replace in_interrupt().
      
      Cc: Chas Williams <3chas3@gmail.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f2bcc2fa
    • ptp: document struct ptp_clock_request members · d04a53b1
      Committed by Ahmad Fatoum
      Arguably, most people interested in configuring a PPS signal
      want it as external output, not as kernel input. PTP_CLK_REQ_PPS
      is for input, though. Add documentation to nudge readers in
      the correct direction.
      Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Link: https://lore.kernel.org/r/20201117213826.18235-1-a.fatoum@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d04a53b1
  9. 18 Nov 2020 (4 commits)
  10. 17 Nov 2020 (2 commits)
    • sched/deadline: Fix priority inheritance with multiple scheduling classes · 2279f540
      Committed by Juri Lelli
      Glenn reported that "an application [he developed] produces a BUG in
      deadline.c when a SCHED_DEADLINE task contends with CFS tasks on nested
      PTHREAD_PRIO_INHERIT mutexes. I believe the bug is triggered when a CFS
      task that was boosted by a SCHED_DEADLINE task boosts another CFS task
      (nested priority inheritance)."
      
       ------------[ cut here ]------------
       kernel BUG at kernel/sched/deadline.c:1462!
       invalid opcode: 0000 [#1] PREEMPT SMP
       CPU: 12 PID: 19171 Comm: dl_boost_bug Tainted: ...
       Hardware name: ...
       RIP: 0010:enqueue_task_dl+0x335/0x910
       Code: ...
       RSP: 0018:ffffc9000c2bbc68 EFLAGS: 00010002
       RAX: 0000000000000009 RBX: ffff888c0af94c00 RCX: ffffffff81e12500
       RDX: 000000000000002e RSI: ffff888c0af94c00 RDI: ffff888c10b22600
       RBP: ffffc9000c2bbd08 R08: 0000000000000009 R09: 0000000000000078
       R10: ffffffff81e12440 R11: ffffffff81e1236c R12: ffff888bc8932600
       R13: ffff888c0af94eb8 R14: ffff888c10b22600 R15: ffff888bc8932600
       FS:  00007fa58ac55700(0000) GS:ffff888c10b00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007fa58b523230 CR3: 0000000bf44ab003 CR4: 00000000007606e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       PKRU: 55555554
       Call Trace:
        ? intel_pstate_update_util_hwp+0x13/0x170
        rt_mutex_setprio+0x1cc/0x4b0
        task_blocks_on_rt_mutex+0x225/0x260
        rt_spin_lock_slowlock_locked+0xab/0x2d0
        rt_spin_lock_slowlock+0x50/0x80
        hrtimer_grab_expiry_lock+0x20/0x30
        hrtimer_cancel+0x13/0x30
        do_nanosleep+0xa0/0x150
        hrtimer_nanosleep+0xe1/0x230
        ? __hrtimer_init_sleeper+0x60/0x60
        __x64_sys_nanosleep+0x8d/0xa0
        do_syscall_64+0x4a/0x100
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
       RIP: 0033:0x7fa58b52330d
       ...
        ---[ end trace 0000000000000002 ]---
      
      He also provided a simple reproducer creating the situation below:
      
       So the execution order of locking steps is the following
       (N1 and N2 are non-deadline tasks. D1 is a deadline task. M1 and M2
       are mutexes that are enabled with priority inheritance.)
      
       Time moves forward as this timeline goes down:
      
       N1              N2               D1
       |               |                |
       |               |                |
       Lock(M1)        |                |
       |               |                |
       |             Lock(M2)           |
       |               |                |
       |               |              Lock(M2)
       |               |                |
       |             Lock(M1)           |
       |             (!!bug triggered!) |
      
      Daniel reported a similar situation as well, by just letting ksoftirqd
      run with DEADLINE (and eventually block on a mutex).
      
      Problem is that boosted entities (Priority Inheritance) use static
      DEADLINE parameters of the top priority waiter. However, there might be
      cases where top waiter could be a non-DEADLINE entity that is currently
      boosted by a DEADLINE entity from a different lock chain (i.e., nested
      priority chains involving entities of non-DEADLINE classes). In this
      case, top waiter static DEADLINE parameters could be null (initialized
      to 0 at fork()) and replenish_dl_entity() would hit a BUG().
      
      Fix this by keeping track of the original donor and using its parameters
      when a task is boosted.
      Reported-by: Glenn Elliott <glenn@aurora.tech>
      Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Link: https://lkml.kernel.org/r/20201117061432.517340-1-juri.lelli@redhat.com
      2279f540
    • sched: Fix data-race in wakeup · f97bb527
      Committed by Peter Zijlstra
      Mel reported that on some ARM64 platforms loadavg goes bananas and
      Will tracked it down to the following race:
      
        CPU0					CPU1
      
        schedule()
          prev->sched_contributes_to_load = X;
          deactivate_task(prev);
      
      					try_to_wake_up()
      					  if (p->on_rq && ...) // false
      					  if (smp_load_acquire(&p->on_cpu) && // true
      					      ttwu_queue_wakelist())
      					        p->sched_remote_wakeup = Y;
      
          smp_store_release(prev->on_cpu, 0);
      
      where both p->sched_contributes_to_load and p->sched_remote_wakeup are
      in the same word, and thus the stores X and Y race (and can clobber
      one another's data).
      
      Prior to commit c6e7bd7a ("sched/core: Optimize ttwu()
      spinning on p->on_cpu"), the p->on_cpu handoff serialized access to
      p->sched_remote_wakeup (just as it still does with
      p->sched_contributes_to_load); that commit broke this by calling
      ttwu_queue_wakelist() with p->on_cpu != 0.
      
      However, due to
      
        p->XXX = X			ttwu()
        schedule()			  if (p->on_rq && ...) // false
          smp_mb__after_spinlock()	  if (smp_load_acquire(&p->on_cpu) &&
          deactivate_task()		      ttwu_queue_wakelist())
            p->on_rq = 0;		        p->sched_remote_wakeup = Y;
      
      we can be sure any 'current' store is complete and 'current' is
      guaranteed asleep. Therefore we can move p->sched_remote_wakeup into
      the current flags word.
      
      Note: while the observed failure was loadavg accounting gone wrong due
      to ttwu() clobbering p->sched_contributes_to_load, the reverse problem
      is also possible, where schedule() clobbers p->sched_remote_wakeup;
      this could result in enqueue_entity() wrecking ->vruntime and causing
      scheduling artifacts.
      
      Fixes: c6e7bd7a ("sched/core: Optimize ttwu() spinning on p->on_cpu")
      Reported-by: Mel Gorman <mgorman@techsingularity.net>
      Debugged-by: Will Deacon <will@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20201117083016.GK3121392@hirez.programming.kicks-ass.net
      f97bb527