1. 09 10月, 2018 1 次提交
    • M
      init: add arch_call_rest_init to allow stack switching · 53c99bd6
      Martin Schwidefsky 提交于
      With CONFIG_VMAP_STACK=y the kernel stack of all tasks should be
      allocated in the vmalloc space. The initial stack used for all
      the early init code is in the init_thread_union. To be able to
      switch from this early stack to a properly allocated stack
      from vmalloc the architecture needs a switch-over point.
      
      Introduce the arch_call_rest_init() function with a weak definition
      in init/main.c with the only purpose to call rest_init() from the
      end of start_kernel(). The architecture override can then do the
      necessary magic to switch to the new vmalloc'ed stack.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      53c99bd6
  2. 27 9月, 2018 1 次提交
    • A
      jump_label: Annotate entries that operate on __init code earlier · 19483677
      Ard Biesheuvel 提交于
      Jump table entries are mostly read-only, with the exception of the
      init and module loader code that defuses entries that point into init
      code when the code being referred to is freed.
      
      For robustness, it would be better to move these entries into the
      ro_after_init section, but clearing the 'code' member of each jump
      table entry referring to init code at module load time races with the
      module_enable_ro() call that remaps the ro_after_init section read
      only, so we'd like to do it earlier.
      
      So given that whether such an entry refers to init code can be decided
      much earlier, we can pull this check forward. Since we may still need
      the code entry at this point, let's switch to setting a low bit in the
      'key' member just like we do to annotate the default state of a jump
      table entry.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-s390@vger.kernel.org
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Jessica Yu <jeyu@kernel.org>
      Link: https://lkml.kernel.org/r/20180919065144.25010-8-ard.biesheuvel@linaro.org
      19483677
  3. 23 8月, 2018 2 次提交
  4. 13 8月, 2018 1 次提交
    • L
      init: rename and re-order boot_cpu_state_init() · b5b1404d
      Linus Torvalds 提交于
      This is purely a preparatory patch for upcoming changes during the 4.19
      merge window.
      
      We have a function called "boot_cpu_state_init()" that isn't really
      about the bootup cpu state: that is done much earlier by the similarly
      named "boot_cpu_init()" (note lack of "state" in name).
      
      This function initializes some hotplug CPU state, and needs to run after
      the percpu data has been properly initialized.  It even has a comment to
      that effect.
      
      Except it _doesn't_ actually run after the percpu data has been properly
      initialized.  On x86 it happens to do that, but on at least arm and
      arm64, the percpu base pointers are initialized by the arch-specific
      'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init().
      
      This had some unexpected results, and in particular we have a patch
      pending for the merge window that did the obvious cleanup of using
      'this_cpu_write()' in the cpu hotplug init code:
      
        -       per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
        +       this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
      
      which is obviously the right thing to do.  Except because of the
      ordering issue, it actually failed miserably and unexpectedly on arm64.
      
      So this just fixes the ordering, and changes the name of the function to
      be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu
      hotplug state, because the core CPU state was supposed to have already
      been done earlier.
      
      Marked for stable, since the (not yet merged) patch that will show this
      problem is marked for stable.
      Reported-by: NVlastimil Babka <vbabka@suse.cz>
      Reported-by: NMian Yousaf Kaukab <yousaf.kaukab@suse.com>
      Suggested-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b5b1404d
  5. 11 8月, 2018 1 次提交
  6. 31 7月, 2018 1 次提交
    • J
      tracing: Centralize preemptirq tracepoints and unify their usage · c3bc8fd6
      Joel Fernandes (Google) 提交于
      This patch detaches the preemptirq tracepoints from the tracers and
      keeps it separate.
      
      Advantages:
      * Lockdep and irqsoff event can now run in parallel since they no longer
      have their own calls.
      
      * This unifies the usecase of adding hooks to an irqsoff and irqson
      event, and a preemptoff and preempton event.
        3 users of the events exist:
        - Lockdep
        - irqsoff and preemptoff tracers
        - irqs and preempt trace events
      
      The unification cleans up several ifdefs and makes the code in preempt
      tracer and irqsoff tracers simpler. It gets rid of all the horrific
      ifdeferry around PROVE_LOCKING and makes configuration of the different
      users of the tracepoints more easy and understandable. It also gets rid
      of the time_* function calls from the lockdep hooks used to call into
      the preemptirq tracer which is not needed anymore. The negative delta in
      lines of code in this patch is quite large too.
      
      In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS
      as a single point for registering probes onto the tracepoints. With
      this,
      the web of config options for preempt/irq toggle tracepoints and its
      users becomes:
      
       PREEMPT_TRACER   PREEMPTIRQ_EVENTS  IRQSOFF_TRACER PROVE_LOCKING
             |                 |     \         |           |
             \    (selects)    /      \        \ (selects) /
            TRACE_PREEMPT_TOGGLE       ----> TRACE_IRQFLAGS
                            \                  /
                             \ (depends on)   /
                           PREEMPTIRQ_TRACEPOINTS
      
      Other than the performance tests mentioned in the previous patch, I also
      ran the locking API test suite. I verified that all tests cases are
      passing.
      
      I also injected issues by not registering lockdep probes onto the
      tracepoints and I see failures to confirm that the probes are indeed
      working.
      
      This series + lockdep probes not registered (just to inject errors):
      [    0.000000]      hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
      [    0.000000]      soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
      [    0.000000]        sirq-safe-A => hirqs-on/12:FAILED|FAILED|  ok  |
      [    0.000000]        sirq-safe-A => hirqs-on/21:FAILED|FAILED|  ok  |
      [    0.000000]          hard-safe-A + irqs-on/12:FAILED|FAILED|  ok  |
      [    0.000000]          soft-safe-A + irqs-on/12:FAILED|FAILED|  ok  |
      [    0.000000]          hard-safe-A + irqs-on/21:FAILED|FAILED|  ok  |
      [    0.000000]          soft-safe-A + irqs-on/21:FAILED|FAILED|  ok  |
      [    0.000000]     hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
      [    0.000000]     soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
      
      With this series + lockdep probes registered, all locking tests pass:
      
      [    0.000000]      hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
      [    0.000000]      soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
      [    0.000000]        sirq-safe-A => hirqs-on/12:  ok  |  ok  |  ok  |
      [    0.000000]        sirq-safe-A => hirqs-on/21:  ok  |  ok  |  ok  |
      [    0.000000]          hard-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
      [    0.000000]          soft-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
      [    0.000000]          hard-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
      [    0.000000]          soft-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
      [    0.000000]     hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
      [    0.000000]     soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
      
      Link: http://lkml.kernel.org/r/20180730222423.196630-4-joel@joelfernandes.orgAcked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      c3bc8fd6
  7. 20 7月, 2018 3 次提交
    • J
      x86/mm/pti: Introduce pti_finalize() · b976690f
      Joerg Roedel 提交于
      Introduce a new function to finalize the kernel mappings for the userspace
      page-table after all ro/nx protections have been applied to the kernel
      mappings.
      
      Also move the call to pti_clone_kernel_text() to that function so that it
      will run on 32 bit kernels too.
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Tested-by: NPavel Machek <pavel@ucw.cz>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: linux-mm@kvack.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Waiman Long <llong@redhat.com>
      Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca>
      Cc: joro@8bytes.org
      Link: https://lkml.kernel.org/r/1531906876-13451-30-git-send-email-joro@8bytes.org
      b976690f
    • P
      sched/clock: Enable sched clock early · 857baa87
      Pavel Tatashin 提交于
      Allow sched_clock() to be used before schec_clock_init() is called.  This
      provides a way to get early boot timestamps on machines with unstable
      clocks.
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: douly.fnst@cn.fujitsu.com
      Cc: peterz@infradead.org
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-24-pasha.tatashin@oracle.com
      857baa87
    • P
      sched/clock: Move sched clock initialization and merge with generic clock · 5d2a4e91
      Pavel Tatashin 提交于
      sched_clock_postinit() initializes a generic clock on systems where no
      other clock is provided. This function may be called only after
      timekeeping_init().
      
      Rename sched_clock_postinit to generic_clock_inti() and call it from
      sched_clock_init(). Move the call for sched_clock_init() until after
      time_init().
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: steven.sistare@oracle.com
      Cc: daniel.m.jordan@oracle.com
      Cc: linux@armlinux.org.uk
      Cc: schwidefsky@de.ibm.com
      Cc: heiko.carstens@de.ibm.com
      Cc: john.stultz@linaro.org
      Cc: sboyd@codeaurora.org
      Cc: hpa@zytor.com
      Cc: douly.fnst@cn.fujitsu.com
      Cc: prarit@redhat.com
      Cc: feng.tang@intel.com
      Cc: pmladek@suse.com
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: linux-s390@vger.kernel.org
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: pbonzini@redhat.com
      Link: https://lkml.kernel.org/r/20180719205545.16512-23-pasha.tatashin@oracle.com
      5d2a4e91
  8. 26 5月, 2018 1 次提交
  9. 12 5月, 2018 1 次提交
  10. 07 5月, 2018 1 次提交
  11. 12 4月, 2018 2 次提交
  12. 08 4月, 2018 1 次提交
  13. 06 4月, 2018 2 次提交
    • S
      init, tracing: Have printk come through the trace events for initcall_debug · 4e37958d
      Steven Rostedt (VMware) 提交于
      With trace events set before and after the initcall function calls, instead
      of having a separate routine for printing out the initcalls when
      initcall_debug is specified on the kernel command line, have the code
      register a callback to the tracepoints where the initcall trace events are.
      
      This removes the need for having a separate function to do the initcalls as
      the tracepoint callbacks can handle the printk. It also includes other
      initcalls that are not called by the do_one_initcall() which includes
      console and security initcalls.
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      4e37958d
    • S
      init, tracing: Add initcall trace events · 4ee7c60d
      Steven Rostedt (VMware) 提交于
      Being able to trace the start and stop of initcalls is useful to see where
      the timings are an issue. There is already an "initcall_debug" parameter,
      but that can cause a large overhead itself, as the printing of the
      information may take longer than the initcall functions.
      
      Adding in a start and finish trace event around the initcall functions, as
      well as a trace event that records the level of the initcalls, one can get a
      much finer measurement of the times and interactions of the initcalls
      themselves, as trace events are much lighter than printk()s.
      Suggested-by: NAbderrahmane Benbachir <abderrahmane.benbachir@polymtl.ca>
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      4ee7c60d
  14. 03 4月, 2018 3 次提交
  15. 23 3月, 2018 1 次提交
    • S
      init: Fix initcall0 name as it is "pure" not "early" · a6fb6012
      Steven Rostedt (VMware) 提交于
      The early_initcall() functions get assigned to __initcall_start[]. These are
      called by do_pre_smp_initcalls(). The initcall_levels[] array starts with
      __initcall0_start[], and initcall_levels[] are to match the
      initcall_level_names[] array. The first name in that array is "early", but
      that is not correct. As pure_initcall() functions get assigned to
      __initcall0_start[] array.
      
      Change the first name in initcall_level_names[] array to "pure".
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      a6fb6012
  16. 20 3月, 2018 1 次提交
    • J
      jump_label: Disable jump labels in __exit code · 578ae447
      Josh Poimboeuf 提交于
      With the following commit:
      
        33352244 ("jump_label: Explicitly disable jump labels in __init code")
      
      ... we explicitly disabled jump labels in __init code, so they could be
      detected and not warned about in the following commit:
      
        dc1dd184 ("jump_label: Warn on failed jump_label patching attempt")
      
      In-kernel __exit code has the same issue.  It's never used, so it's
      freed along with the rest of initmem.  But jump label entries in __exit
      code aren't explicitly disabled, so we get the following warning when
      enabling pr_debug() in __exit code:
      
        can't patch jump_label at dmi_sysfs_exit+0x0/0x2d
        WARNING: CPU: 0 PID: 22572 at kernel/jump_label.c:376 __jump_label_update+0x9d/0xb0
      
      Fix the warning by disabling all jump labels in initmem (which includes
      both __init and __exit code).
      Reported-and-tested-by: NLi Wang <liwang@redhat.com>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: dc1dd184 ("jump_label: Warn on failed jump_label patching attempt")
      Link: http://lkml.kernel.org/r/7121e6e595374f06616c505b6e690e275c0054d1.1521483452.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      578ae447
  17. 21 2月, 2018 1 次提交
  18. 24 12月, 2017 1 次提交
    • T
      x86/mm/pti: Add infrastructure for page table isolation · aa8c6248
      Thomas Gleixner 提交于
      Add the initial files for kernel page table isolation, with a minimal init
      function and the boot time detection for this misfeature.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      aa8c6248
  19. 23 12月, 2017 1 次提交
    • T
      init: Invoke init_espfix_bsp() from mm_init() · 613e396b
      Thomas Gleixner 提交于
      init_espfix_bsp() needs to be invoked before the page table isolation
      initialization. Move it into mm_init() which is the place where pti_init()
      will be added.
      
      While at it get rid of the #ifdeffery and provide proper stub functions.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      613e396b
  20. 28 11月, 2017 1 次提交
  21. 18 11月, 2017 2 次提交
  22. 16 11月, 2017 1 次提交
  23. 27 10月, 2017 1 次提交
  24. 27 9月, 2017 1 次提交
    • D
      ACPI/init: Invoke early ACPI initialization earlier · 9c71206d
      Dou Liyang 提交于
      acpi_early_init() unmaps the temporary ACPI Table mappings which are used
      in the early startup code and prepares for permanent table mappings.
      
      Before the consolidation of the x86 APIC setup code the invocation of
      acpi_early_init() happened before the interrupt remapping unit was
      initialized. With the rework the remapping unit initialization moved in
      front of acpi_early_init() which causes an ACPI warning when the ACPI root
      tables get reallocated afterwards.
      
      Invoke acpi_early_init() before late_time_init() which is before the access
      to the DMAR tables happens.
      
      Fixes: 935356ce ("x86/apic: Initialize interrupt mode after timer init")
      Reported-by: NXiaolong Ye <xiaolong.ye@intel.com>
      Signed-off-by: NDou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-ia64@vger.kernel.org
      Cc: bhe@redhat.com
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-acpi@vger.kernel.org
      Cc: bp@alien8.de
      Cc: Lv" <lv.zheng@intel.com>
      Cc: yinghai@kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lkml.kernel.org/r/1505294274-441-1-git-send-email-douly.fnst@cn.fujitsu.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      9c71206d
  25. 09 9月, 2017 2 次提交
    • D
      init/main.c: extract early boot entropy from the passed cmdline · 33d72f38
      Daniel Micay 提交于
      Feed the boot command-line as to the /dev/random entropy pool
      
      Existing Android bootloaders usually pass data which may not be known by
      an external attacker on the kernel command-line.  It may also be the
      case on other embedded systems.  Sample command-line from a Google Pixel
      running CopperheadOS....
      
          console=ttyHSL0,115200,n8 androidboot.console=ttyHSL0
          androidboot.hardware=sailfish user_debug=31 ehci-hcd.park=3
          lpm_levels.sleep_disabled=1 cma=32M@0-0xffffffff buildvariant=user
          veritykeyid=id:dfcb9db0089e5b3b4090a592415c28e1cb4545ab
          androidboot.bootdevice=624000.ufshc androidboot.verifiedbootstate=yellow
          androidboot.veritymode=enforcing androidboot.keymaster=1
          androidboot.serialno=FA6CE0305299 androidboot.baseband=msm
          mdss_mdp.panel=1:dsi:0:qcom,mdss_dsi_samsung_ea8064tg_1080p_cmd:1:none:cfg:single_dsi
          androidboot.slot_suffix=_b fpsimd.fpsimd_settings=0
          app_setting.use_app_setting=0 kernelflag=0x00000000 debugflag=0x00000000
          androidboot.hardware.revision=PVT radioflag=0x00000000
          radioflagex1=0x00000000 radioflagex2=0x00000000 cpumask=0x00000000
          androidboot.hardware.ddr=4096MB,Hynix,LPDDR4 androidboot.ddrinfo=00000006
          androidboot.ddrsize=4GB androidboot.hardware.color=GRA00
          androidboot.hardware.ufs=32GB,Samsung androidboot.msm.hw_ver_id=268824801
          androidboot.qf.st=2 androidboot.cid=11111111 androidboot.mid=G-2PW4100
          androidboot.bootloader=8996-012001-1704121145
          androidboot.oem_unlock_support=1 androidboot.fp_src=1
          androidboot.htc.hrdump=detected androidboot.ramdump.opt=mem@2g:2g,mem@4g:2g
          androidboot.bootreason=reboot androidboot.ramdump_enable=0 ro
          root=/dev/dm-0 dm="system none ro,0 1 android-verity /dev/sda34"
          rootwait skip_initramfs init=/init androidboot.wificountrycode=US
          androidboot.boottime=1BLL:85,1BLE:669,2BLL:0,2BLE:1777,SW:6,KL:8136
      
      Among other things, it contains a value unique to the device
      (androidboot.serialno=FA6CE0305299), unique to the OS builds for the
      device variant (veritykeyid=id:dfcb9db0089e5b3b4090a592415c28e1cb4545ab)
      and timings from the bootloader stages in milliseconds
      (androidboot.boottime=1BLL:85,1BLE:669,2BLL:0,2BLE:1777,SW:6,KL:8136).
      
      [tytso@mit.edu: changelog tweak]
      [labbott@redhat.com: line-wrapped command line]
      Link: http://lkml.kernel.org/r/20170816231458.2299-3-labbott@redhat.comSigned-off-by: NDaniel Micay <danielmicay@gmail.com>
      Signed-off-by: NLaura Abbott <labbott@redhat.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Nick Kralevich <nnk@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      33d72f38
    • L
      init: move stack canary initialization after setup_arch · 121388a3
      Laura Abbott 提交于
      Patch series "Command line randomness", v3.
      
      A series to add the kernel command line as a source of randomness.
      
      This patch (of 2):
      
      Stack canary intialization involves getting a random number.  Getting this
      random number may involve accessing caches or other architectural specific
      features which are not available until after the architecture is setup.
      Move the stack canary initialization later to accommodate this.
      
      Link: http://lkml.kernel.org/r/20170816231458.2299-2-labbott@redhat.comSigned-off-by: NLaura Abbott <lauraa@codeaurora.org>
      Signed-off-by: NLaura Abbott <labbott@redhat.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: Nick Kralevich <nnk@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      121388a3
  26. 07 9月, 2017 1 次提交
  27. 14 8月, 2017 1 次提交
    • W
      debugobjects: Make kmemleak ignore debug objects · caba4cbb
      Waiman Long 提交于
      The allocated debug objects are either on the free list or in the
      hashed bucket lists. So they won't get lost. However if both debug
      objects and kmemleak are enabled and kmemleak scanning is done
      while some of the debug objects are transitioning from one list to
      the others, false negative reporting of memory leaks may happen for
      those objects. For example,
      
      [38687.275678] kmemleak: 12 new suspected memory leaks (see
      /sys/kernel/debug/kmemleak)
      unreferenced object 0xffff92e98aabeb68 (size 40):
        comm "ksmtuned", pid 4344, jiffies 4298403600 (age 906.430s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 d0 bc db 92 e9 92 ff ff  ................
          01 00 00 00 00 00 00 00 38 36 8a 61 e9 92 ff ff  ........86.a....
        backtrace:
          [<ffffffff8fa5378a>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff8f47c019>] kmem_cache_alloc+0xe9/0x320
          [<ffffffff8f62ed96>] __debug_object_init+0x3e6/0x400
          [<ffffffff8f62ef01>] debug_object_activate+0x131/0x210
          [<ffffffff8f330d9f>] __call_rcu+0x3f/0x400
          [<ffffffff8f33117d>] call_rcu_sched+0x1d/0x20
          [<ffffffff8f4a183c>] put_object+0x2c/0x40
          [<ffffffff8f4a188c>] __delete_object+0x3c/0x50
          [<ffffffff8f4a18bd>] delete_object_full+0x1d/0x20
          [<ffffffff8fa535c2>] kmemleak_free+0x32/0x80
          [<ffffffff8f47af07>] kmem_cache_free+0x77/0x350
          [<ffffffff8f453912>] unlink_anon_vmas+0x82/0x1e0
          [<ffffffff8f440341>] free_pgtables+0xa1/0x110
          [<ffffffff8f44af91>] exit_mmap+0xc1/0x170
          [<ffffffff8f29db60>] mmput+0x80/0x150
          [<ffffffff8f2a7609>] do_exit+0x2a9/0xd20
      
      The references in the debug objects may also hide a real memory leak.
      
      As there is no point in having kmemleak to track debug object
      allocations, kmemleak checking is now disabled for debug objects.
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1502718733-8527-1-git-send-email-longman@redhat.com
      caba4cbb
  28. 10 8月, 2017 1 次提交
  29. 27 7月, 2017 1 次提交
    • D
      percpu: replace area map allocator with bitmap · 40064aec
      Dennis Zhou (Facebook) 提交于
      The percpu memory allocator is experiencing scalability issues when
      allocating and freeing large numbers of counters as in BPF.
      Additionally, there is a corner case where iteration is triggered over
      all chunks if the contig_hint is the right size, but wrong alignment.
      
      This patch replaces the area map allocator with a basic bitmap allocator
      implementation. Each subsequent patch will introduce new features and
      replace full scanning functions with faster non-scanning options when
      possible.
      
      Implementation:
      This patchset removes the area map allocator in favor of a bitmap
      allocator backed by metadata blocks. The primary goal is to provide
      consistency in performance and memory footprint with a focus on small
      allocations (< 64 bytes). The bitmap removes the heavy memmove from the
      freeing critical path and provides a consistent memory footprint. The
      metadata blocks provide a bound on the amount of scanning required by
      maintaining a set of hints.
      
      In an effort to make freeing fast, the metadata is updated on the free
      path if the new free area makes a page free, a block free, or spans
      across blocks. This causes the chunk's contig hint to potentially be
      smaller than what it could allocate by up to the smaller of a page or a
      block. If the chunk's contig hint is contained within a block, a check
      occurs and the hint is kept accurate. Metadata is always kept accurate
      on allocation, so there will not be a situation where a chunk has a
      later contig hint than available.
      
      Evaluation:
      I have primarily done testing against a simple workload of allocation of
      1 million objects (2^20) of varying size. Deallocation was done by in
      order, alternating, and in reverse. These numbers were collected after
      rebasing ontop of a80099a1. I present the worst-case numbers here:
      
        Area Map Allocator:
      
              Object Size | Alloc Time (ms) | Free Time (ms)
              ----------------------------------------------
                    4B    |        310      |     4770
                   16B    |        557      |     1325
                   64B    |        436      |      273
                  256B    |        776      |      131
                 1024B    |       3280      |      122
      
        Bitmap Allocator:
      
              Object Size | Alloc Time (ms) | Free Time (ms)
              ----------------------------------------------
                    4B    |        490      |       70
                   16B    |        515      |       75
                   64B    |        610      |       80
                  256B    |        950      |      100
                 1024B    |       3520      |      200
      
      This data demonstrates the inability for the area map allocator to
      handle less than ideal situations. In the best case of reverse
      deallocation, the area map allocator was able to perform within range
      of the bitmap allocator. In the worst case situation, freeing took
      nearly 5 seconds for 1 million 4-byte objects. The bitmap allocator
      dramatically improves the consistency of the free path. The small
      allocations performed nearly identical regardless of the freeing
      pattern.
      
      While it does add to the allocation latency, the allocation scenario
      here is optimal for the area map allocator. The area map allocator runs
      into trouble when it is allocating in chunks where the latter half is
      full. It is difficult to replicate this, so I present a variant where
      the pages are second half filled. Freeing was done sequentially. Below
      are the numbers for this scenario:
      
        Area Map Allocator:
      
              Object Size | Alloc Time (ms) | Free Time (ms)
              ----------------------------------------------
                    4B    |       4118      |     4892
                   16B    |       1651      |     1163
                   64B    |        598      |      285
                  256B    |        771      |      158
                 1024B    |       3034      |      160
      
        Bitmap Allocator:
      
              Object Size | Alloc Time (ms) | Free Time (ms)
              ----------------------------------------------
                    4B    |        481      |       67
                   16B    |        506      |       69
                   64B    |        636      |       75
                  256B    |        892      |       90
                 1024B    |       3262      |      147
      
      The data shows a parabolic curve of performance for the area map
      allocator. This is due to the memmove operation being the dominant cost
      with the lower object sizes as more objects are packed in a chunk and at
      higher object sizes, the traversal of the chunk slots is the dominating
      cost. The bitmap allocator suffers this problem as well. The above data
      shows the inability to scale for the allocation path with the area map
      allocator and that the bitmap allocator demonstrates consistent
      performance in general.
      
      The second problem of additional scanning can result in the area map
      allocator completing in 52 minutes when trying to allocate 1 million
      4-byte objects with 8-byte alignment. The same workload takes
      approximately 16 seconds to complete for the bitmap allocator.
      
      V2:
      Fixed a bug in pcpu_alloc_first_chunk end_offset was setting the bitmap
      using bytes instead of bits.
      
      Added a comment to pcpu_cnt_pop_pages to explain bitmap_weight.
      Signed-off-by: NDennis Zhou <dennisszhou@gmail.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      40064aec
  30. 18 7月, 2017 1 次提交
    • T
      x86, swiotlb: Add memory encryption support · c7753208
      Tom Lendacky 提交于
      Since DMA addresses will effectively look like 48-bit addresses when the
      memory encryption mask is set, SWIOTLB is needed if the DMA mask of the
      device performing the DMA does not support 48-bits. SWIOTLB will be
      initialized to create decrypted bounce buffers for use by these devices.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Toshimitsu Kani <toshi.kani@hpe.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kvm@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/aa2d29b78ae7d508db8881e46a3215231b9327a7.1500319216.git.thomas.lendacky@amd.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c7753208
  31. 13 7月, 2017 1 次提交
    • K
      random: do not ignore early device randomness · ee7998c5
      Kees Cook 提交于
      The add_device_randomness() function would ignore incoming bytes if the
      crng wasn't ready.  This additionally makes sure to make an early enough
      call to add_latent_entropy() to influence the initial stack canary,
      which is especially important on non-x86 systems where it stays the same
      through the life of the boot.
      
      Link: http://lkml.kernel.org/r/20170626233038.GA48751@beastSigned-off-by: NKees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jessica Yu <jeyu@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Lokesh Vutla <lokeshvutla@ti.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee7998c5