1. 28 6月, 2020 2 次提交
  2. 23 6月, 2020 2 次提交
  3. 20 6月, 2020 1 次提交
  4. 19 6月, 2020 4 次提交
    • W
      i2c: remove deprecated i2c_new_device API · 390fd047
      Wolfram Sang 提交于
      All in-tree users have been converted to the new i2c_new_client_device
      function, so remove this deprecated one.
      Signed-off-by: NWolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: NWolfram Sang <wsa@kernel.org>
      390fd047
    • L
      maccess: make get_kernel_nofault() check for minimal type compatibility · 0c389d89
      Linus Torvalds 提交于
      Now that we've renamed probe_kernel_address() to get_kernel_nofault()
      and made it look and behave more in line with get_user(), some of the
      subtle type behavior differences end up being more obvious and possibly
      dangerous.
      
      When you do
      
              get_user(val, user_ptr);
      
      the type of the access comes from the "user_ptr" part, and the above
      basically acts as
      
              val = *user_ptr;
      
      by design (except, of course, for the fact that the actual dereference
      is done with a user access).
      
      Note how in the above case, the type of the end result comes from the
      pointer argument, and then the value is cast to the type of 'val' as
      part of the assignment.
      
      So the type of the pointer is ultimately the more important type both
      for the access itself.
      
      But 'get_kernel_nofault()' may now _look_ similar, but it behaves very
      differently.  When you do
      
              get_kernel_nofault(val, kernel_ptr);
      
      it behaves like
      
              val = *(typeof(val) *)kernel_ptr;
      
      except, of course, for the fact that the actual dereference is done with
      exception handling so that a faulting access is suppressed and returned
      as the error code.
      
      But note how different the casting behavior of the two superficially
      similar accesses are: one does the actual access in the size of the type
      the pointer points to, while the other does the access in the size of
      the target, and ignores the pointer type entirely.
      
      Actually changing get_kernel_nofault() to act like get_user() is almost
      certainly the right thing to do eventually, but in the meantime this
      patch adds logit to at least verify that the pointer type is compatible
      with the type of the result.
      
      In many cases, this involves just casting the pointer to 'void *' to
      make it obvious that the type of the pointer is not the important part.
      It's not how 'get_user()' acts, but at least the behavioral difference
      is now obvious and explicit.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0c389d89
    • C
      maccess: rename probe_kernel_address to get_kernel_nofault · 25f12ae4
      Christoph Hellwig 提交于
      Better describe what this helper does, and match the naming of
      copy_from_kernel_nofault.
      
      Also switch the argument order around, so that it acts and looks
      like get_user().
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      25f12ae4
    • L
      sparse: use identifiers to define address spaces · 670d0a4b
      Luc Van Oostenryck 提交于
      Currently, address spaces in warnings are displayed as '<asn:X>' with
      'X' being the address space's arbitrary number.
      
      But since sparse v0.6.0-rc1 (late December 2018), sparse allows you to
      define the address spaces using an identifier instead of a number.  This
      identifier is then directly used in the warnings.
      
      So, use the identifiers '__user', '__iomem', '__percpu' & '__rcu' for
      the corresponding address spaces.  The default address space, __kernel,
      being not displayed in warnings, stays defined as '0'.
      
      With this change, warnings that used to be displayed as:
      
      	cast removes address space '<asn:1>' of expression
      	... void [noderef] <asn:2> *
      
      will now be displayed as:
      
      	cast removes address space '__user' of expression
      	... void [noderef] __iomem *
      
      This also moves the __kernel annotation to be the first one, since it is
      quite different from the others because it's the default one, and so:
      
       - it's never displayed
      
       - it's normally not needed, nor in type annotations, nor in cast
         between address spaces. The only time it's needed is when it's
         combined with a typeof to express "the same type as this one but
         without the address space"
      
       - it can't be defined with a name, '0' must be used.
      
      So, it seemed strange to me to have it in the middle of the other
      ones.
      Signed-off-by: NLuc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Acked-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      670d0a4b
  5. 18 6月, 2020 4 次提交
  6. 17 6月, 2020 2 次提交
    • G
      overflow.h: Add flex_array_size() helper · b19d57d0
      Gustavo A. R. Silva 提交于
      Add flex_array_size() helper for the calculation of the size, in bytes,
      of a flexible array member contained within an enclosing structure.
      
      Example of usage:
      
      struct something {
      	size_t count;
      	struct foo items[];
      };
      
      struct something *instance;
      
      instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
      instance->count = count;
      memcpy(instance->items, src, flex_array_size(instance, items, instance->count));
      
      The helper returns SIZE_MAX on overflow instead of wrapping around.
      
      Additionally replaces parameter "n" with "count" in struct_size() helper
      for greater clarity and unification.
      Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/r/20200609012233.GA3371@embeddedorSigned-off-by: NKees Cook <keescook@chromium.org>
      b19d57d0
    • J
      kretprobe: Prevent triggering kretprobe from within kprobe_flush_task · 9b38cc70
      Jiri Olsa 提交于
      Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
      My test was also able to trigger lockdep output:
      
       ============================================
       WARNING: possible recursive locking detected
       5.6.0-rc6+ #6 Not tainted
       --------------------------------------------
       sched-messaging/2767 is trying to acquire lock:
       ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0
      
       but task is already holding lock:
       ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&(kretprobe_table_locks[i].lock));
         lock(&(kretprobe_table_locks[i].lock));
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by sched-messaging/2767:
        #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
      
       stack backtrace:
       CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
       Call Trace:
        dump_stack+0x96/0xe0
        __lock_acquire.cold.57+0x173/0x2b7
        ? native_queued_spin_lock_slowpath+0x42b/0x9e0
        ? lockdep_hardirqs_on+0x590/0x590
        ? __lock_acquire+0xf63/0x4030
        lock_acquire+0x15a/0x3d0
        ? kretprobe_hash_lock+0x52/0xa0
        _raw_spin_lock_irqsave+0x36/0x70
        ? kretprobe_hash_lock+0x52/0xa0
        kretprobe_hash_lock+0x52/0xa0
        trampoline_handler+0xf8/0x940
        ? kprobe_fault_handler+0x380/0x380
        ? find_held_lock+0x3a/0x1c0
        kretprobe_trampoline+0x25/0x50
        ? lock_acquired+0x392/0xbc0
        ? _raw_spin_lock_irqsave+0x50/0x70
        ? __get_valid_kprobe+0x1f0/0x1f0
        ? _raw_spin_unlock_irqrestore+0x3b/0x40
        ? finish_task_switch+0x4b9/0x6d0
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
      
      The code within the kretprobe handler checks for probe reentrancy,
      so we won't trigger any _raw_spin_lock_irqsave probe in there.
      
      The problem is in outside kprobe_flush_task, where we call:
      
        kprobe_flush_task
          kretprobe_table_lock
            raw_spin_lock_irqsave
              _raw_spin_lock_irqsave
      
      where _raw_spin_lock_irqsave triggers the kretprobe and installs
      kretprobe_trampoline handler on _raw_spin_lock_irqsave return.
      
      The kretprobe_trampoline handler is then executed with already
      locked kretprobe_table_locks, and first thing it does is to
      lock kretprobe_table_locks ;-) the whole lockup path like:
      
        kprobe_flush_task
          kretprobe_table_lock
            raw_spin_lock_irqsave
              _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed
      
              ---> kretprobe_table_locks locked
      
              kretprobe_trampoline
                trampoline_handler
                  kretprobe_hash_lock(current, &head, &flags);  <--- deadlock
      
      Adding kprobe_busy_begin/end helpers that mark code with fake
      probe installed to prevent triggering of another kprobe within
      this code.
      
      Using these helpers in kprobe_flush_task, so the probe recursion
      protection check is hit and the probe is never set to prevent
      above lockup.
      
      Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2
      
      Fixes: ef53d9c5 ("kprobes: improve kretprobe scalability with hashed locking")
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Reported-by: N"Ziqian SUN (Zamir)" <zsun@redhat.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      9b38cc70
  7. 16 6月, 2020 12 次提交
  8. 15 6月, 2020 2 次提交
  9. 13 6月, 2020 1 次提交
  10. 12 6月, 2020 8 次提交
  11. 11 6月, 2020 2 次提交
    • T
      x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned · 17fae129
      Tony Luck 提交于
      An interesting thing happened when a guest Linux instance took a machine
      check. The VMM unmapped the bad page from guest physical space and
      passed the machine check to the guest.
      
      Linux took all the normal actions to offline the page from the process
      that was using it. But then guest Linux crashed because it said there
      was a second machine check inside the kernel with this stack trace:
      
      do_memory_failure
          set_mce_nospec
               set_memory_uc
                    _set_memory_uc
                         change_page_attr_set_clr
                              cpa_flush
                                   clflush_cache_range_opt
      
      This was odd, because a CLFLUSH instruction shouldn't raise a machine
      check (it isn't consuming the data). Further investigation showed that
      the VMM had passed in another machine check because is appeared that the
      guest was accessing the bad page.
      
      Fix is to check the scope of the poison by checking the MCi_MISC register.
      If the entire page is affected, then unmap the page. If only part of the
      page is affected, then mark the page as uncacheable.
      
      This assumes that VMMs will do the logical thing and pass in the "whole
      page scope" via the MCi_MISC register (since they unmapped the entire
      page).
      
        [ bp: Adjust to x86/entry changes. ]
      
      Fixes: 284ce401 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
      Reported-by: NJue Wang <juew@google.com>
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NJue Wang <juew@google.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20200520163546.GA7977@agluck-desk2.amr.corp.intel.com
      
      
      17fae129
    • T
      x86/entry: Unbreak __irqentry_text_start/end magic · f0178fc0
      Thomas Gleixner 提交于
      The entry rework moved interrupt entry code from the irqentry to the
      noinstr section which made the irqentry section empty.
      
      This breaks boundary checks which rely on the __irqentry_text_start/end
      markers to find out whether a function in a stack trace is
      interrupt/exception entry code. This affects the function graph tracer and
      filter_irq_stacks().
      
      As the IDT entry points are all sequentialy emitted this is rather simple
      to unbreak by injecting __irqentry_text_start/end as global labels.
      
      To make this work correctly:
      
        - Remove the IRQENTRY_TEXT section from the x86 linker script
        - Define __irqentry so it breaks the build if it's used
        - Adjust the entry mirroring in PTI
        - Remove the redundant kprobes and unwinder bound checks
      Reported-by: NQian Cai <cai@lca.pw>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      f0178fc0