1. 18 October 2019, 19 commits
    • x86/asm: Remove the last GLOBAL user and remove the macro · b4edca15
      Committed by Jiri Slaby
      Convert the remaining 32-bit users and finally remove the GLOBAL macro.
      In particular, this means using SYM_ENTRY for the singlestepping hack
      region.
      
      Exclude the global definition of GLOBAL from x86 too.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-20-jslaby@suse.cz
    • x86/asm/realmode: Use SYM_DATA_* instead of GLOBAL · 78f44330
      Committed by Jiri Slaby
      GLOBAL had several meanings and is going away. Convert all the data
      marked using GLOBAL to use SYM_DATA_START or SYM_DATA instead.
      
      Note that SYM_DATA_END_LABEL is used to generate tr_gdt_end too.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Pingfan Liu <kernelfans@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-19-jslaby@suse.cz
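      For illustration, the start/end pairing reads roughly like this sketch
      (the descriptor values are illustrative, not copied from the patch):

      ```asm
      	.balign	16
      SYM_DATA_START(tr_gdt)
      	.quad	0			/* NULL descriptor */
      	.quad	0x00af9b000000ffff	/* illustrative code-segment descriptor */
      SYM_DATA_END_LABEL(tr_gdt, SYM_L_LOCAL, tr_gdt_end)
      ```

      SYM_DATA_END_LABEL closes tr_gdt and emits tr_gdt_end as a zero-size
      local object at the end address, so both symbols get a correct ELF
      size and type.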
    • x86/asm: Use SYM_INNER_LABEL instead of GLOBAL · 26ba4e57
      Committed by Jiri Slaby
      The GLOBAL macro had several meanings and is going away. Convert all the
      inner function labels marked with GLOBAL to use SYM_INNER_LABEL instead.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-18-jslaby@suse.cz
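      The pattern is, as a sketch (body abbreviated; the symbol names follow
      entry_64.S):

      ```asm
      SYM_CODE_START(entry_SYSCALL_64)
      	swapgs
      	/* ... the hardware frame is built here ... */
      SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
      	/* ... common entry path continues ... */
      SYM_CODE_END(entry_SYSCALL_64)
      ```

      SYM_INNER_LABEL only places a label; it does not open a new function,
      so debuginfo tooling no longer sees nested symbols.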
    • x86/asm: Do not annotate functions with GLOBAL · 37818afd
      Committed by Jiri Slaby
      GLOBAL is an x86-specific macro and is going to die very soon. It was
      meant for global symbols, but here, it was used for functions. Instead,
      use the new macros SYM_FUNC_START* and SYM_CODE_START* (depending on the
      type of the function) which are dedicated to global functions. And since
      they both require a closing by SYM_*_END, do that here too.
      
      startup_64, which does not use GLOBAL but uses .globl explicitly, is
      converted too.
      
      "No alignments" are preserved.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cao jin <caoj.fnst@cn.fujitsu.com>
      Cc: Enrico Weigelt <info@metux.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: linux-arch@vger.kernel.org
      Cc: Maran Wilson <maran.wilson@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-17-jslaby@suse.cz
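      A sketch of the distinction (my_helper is a hypothetical name;
      startup_64 is named in the commit):

      ```asm
      /* C-callable: gets ELF type FUNC and the default alignment. */
      SYM_FUNC_START(my_helper)
      	ret
      SYM_FUNC_END(my_helper)

      /* Special calling convention, so CODE rather than FUNC; NOALIGN
       * preserves the "no alignment" the commit mentions. */
      SYM_CODE_START_NOALIGN(startup_64)
      	/* ... early boot setup ... */
      SYM_CODE_END(startup_64)
      ```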
    • x86/asm/purgatory: Start using annotations · b16fed65
      Committed by Jiri Slaby
      Purgatory used no annotations at all. So include linux/linkage.h and
      annotate everything:
      
      * code by SYM_CODE_*
      * data by SYM_DATA_*
      
       [ bp: Fixup comment in gdt: ]
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Alexios Zavras <alexios.zavras@intel.com>
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Enrico Weigelt <info@metux.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-16-jslaby@suse.cz
    • xen/pvh: Annotate data appropriately · 1de5bdce
      Committed by Jiri Slaby
      Use the new SYM_DATA_START_LOCAL, and SYM_DATA_END* macros to get:
      
        0000     8 OBJECT  LOCAL  DEFAULT    6 gdt
        0008    32 OBJECT  LOCAL  DEFAULT    6 gdt_start
        0028     0 OBJECT  LOCAL  DEFAULT    6 gdt_end
        0028   256 OBJECT  LOCAL  DEFAULT    6 early_stack
        0128     0 OBJECT  LOCAL  DEFAULT    6 early_stack_end
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Andy Shevchenko <andy@infradead.org>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: linux-arch@vger.kernel.org
      Cc: platform-driver-x86@vger.kernel.org
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Cc: xen-devel@lists.xenproject.org
      Link: https://lkml.kernel.org/r/20191011115108.12392-15-jslaby@suse.cz
    • x86/um: Annotate data appropriately · 773a37b1
      Committed by Jiri Slaby
      Use the new SYM_DATA_START and SYM_DATA_END_LABEL macros for vdso_start.
      
      Result is:
        0000  2376 OBJECT  GLOBAL DEFAULT    4 vdso_start
        0948     0 OBJECT  GLOBAL DEFAULT    4 vdso_end
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Richard Weinberger <richard@nod.at>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-um@lists.infradead.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: user-mode-linux-devel@lists.sourceforge.net
      Cc: user-mode-linux-user@lists.sourceforge.net
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-14-jslaby@suse.cz
    • x86/boot: Annotate data appropriately · b8c3f9b5
      Committed by Jiri Slaby
      Use the new SYM_DATA, SYM_DATA_START, and SYM_DATA_END* macros for data,
      so that the data in the object file look sane:
      
        Value   Size Type    Bind   Vis      Ndx Name
          0000    10 OBJECT  GLOBAL DEFAULT    3 efi32_boot_gdt
          000a    10 OBJECT  LOCAL  DEFAULT    3 save_gdt
          0014     8 OBJECT  LOCAL  DEFAULT    3 func_rt_ptr
          001c    48 OBJECT  GLOBAL DEFAULT    3 efi_gdt64
          004c     0 OBJECT  LOCAL  DEFAULT    3 efi_gdt64_end
      
          0000    48 OBJECT  LOCAL  DEFAULT    3 gdt
          0030     0 OBJECT  LOCAL  DEFAULT    3 gdt_end
          0030     8 OBJECT  LOCAL  DEFAULT    3 efi_config
          0038    49 OBJECT  GLOBAL DEFAULT    3 efi32_config
          0069    49 OBJECT  GLOBAL DEFAULT    3 efi64_config
      
      All have correct size and type now.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Cao jin <caoj.fnst@cn.fujitsu.com>
      Cc: Enrico Weigelt <info@metux.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wei Huang <wei@redhat.com>
      Cc: x86-ml <x86@kernel.org>
      Cc: Xiaoyao Li <xiaoyao.li@linux.intel.com>
      Link: https://lkml.kernel.org/r/20191011115108.12392-13-jslaby@suse.cz
    • x86/asm/head: Annotate data appropriately · b1bd27b9
      Committed by Jiri Slaby
      Use the new SYM_DATA, SYM_DATA_START, and SYM_DATA_END in both 32 and 64
      bit head_*.S. In the 64-bit version, define also
      SYM_DATA_START_PAGE_ALIGNED locally using the new SYM_START. It is used
      in the code instead of NEXT_PAGE() which was defined in this file and
      had been using the obsolete macro GLOBAL().
      
      Now, the data in the 64-bit object file look sane:
        Value   Size Type    Bind   Vis      Ndx Name
          0000  4096 OBJECT  GLOBAL DEFAULT   15 init_level4_pgt
          1000  4096 OBJECT  GLOBAL DEFAULT   15 level3_kernel_pgt
          2000  2048 OBJECT  GLOBAL DEFAULT   15 level2_kernel_pgt
          3000  4096 OBJECT  GLOBAL DEFAULT   15 level2_fixmap_pgt
          4000  4096 OBJECT  GLOBAL DEFAULT   15 level1_fixmap_pgt
          5000     2 OBJECT  GLOBAL DEFAULT   15 early_gdt_descr
          5002     8 OBJECT  LOCAL  DEFAULT   15 early_gdt_descr_base
          500a     8 OBJECT  GLOBAL DEFAULT   15 phys_base
          0000     8 OBJECT  GLOBAL DEFAULT   17 initial_code
          0008     8 OBJECT  GLOBAL DEFAULT   17 initial_gs
          0010     8 OBJECT  GLOBAL DEFAULT   17 initial_stack
          0000     4 OBJECT  GLOBAL DEFAULT   19 early_recursion_flag
          1000  4096 OBJECT  GLOBAL DEFAULT   19 early_level4_pgt
          2000 0x40000 OBJECT  GLOBAL DEFAULT   19 early_dynamic_pgts
          0000  4096 OBJECT  GLOBAL DEFAULT   22 empty_zero_page
      
      All have correct size and type now.
      
      Note that this also removes implicit 16B alignment previously inserted
      by ENTRY:
      
      * initial_code, setup_once_ref, initial_page_table, initial_stack,
        boot_gdt are still aligned
      * early_gdt_descr is now properly aligned, as was intended before ENTRY
        was added there a long time ago
      * phys_base's alignment is kept by an explicitly added new alignment
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cao jin <caoj.fnst@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Maran Wilson <maran.wilson@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-12-jslaby@suse.cz
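      The locally defined helper can be sketched like this (following the
      commit text; the exact definition in head_64.S may differ in detail):

      ```asm
      #define SYM_DATA_START_PAGE_ALIGNED(name)		\
      	SYM_START(name, SYM_L_GLOBAL, .balign PAGE_SIZE)

      SYM_DATA_START_PAGE_ALIGNED(empty_zero_page)
      	.skip	PAGE_SIZE
      SYM_DATA_END(empty_zero_page)
      ```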
    • x86/asm/entry: Annotate interrupt symbols properly · cc66936e
      Committed by Jiri Slaby
      * annotate functions properly by SYM_CODE_START, SYM_CODE_START_LOCAL*
        and SYM_CODE_END -- these are not C-like functions, so they have to
        be annotated using CODE.
      * use SYM_INNER_LABEL* for labels in the middle of other functions.
        This prevents nested label annotations.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-11-jslaby@suse.cz
    • x86/asm: Annotate aliases · e9b9d020
      Committed by Jiri Slaby
      _key_expansion_128 is an alias to _key_expansion_256a, __memcpy to
      memcpy, xen_syscall32_target to xen_sysenter_target, and so on. Annotate
      them all using the new SYM_FUNC_START_ALIAS, SYM_FUNC_START_LOCAL_ALIAS,
      and SYM_FUNC_END_ALIAS. This will make the tools generating the
      debuginfo happy as it avoids nesting and double symbols.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Juergen Gross <jgross@suse.com> [xen parts]
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-crypto@vger.kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Cc: xen-devel@lists.xenproject.org
      Link: https://lkml.kernel.org/r/20191011115108.12392-10-jslaby@suse.cz
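      As a sketch (the memcpy body here is deliberately simplified, not the
      real implementation):

      ```asm
      /* Two global names for one body; the ALIAS macros emit the second
       * symbol without opening a nested function. */
      SYM_FUNC_START_ALIAS(__memcpy)
      SYM_FUNC_START(memcpy)
      	movq	%rdi, %rax
      	movq	%rdx, %rcx
      	rep movsb			/* simplified body */
      	ret
      SYM_FUNC_END(memcpy)
      SYM_FUNC_END_ALIAS(__memcpy)
      ```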
    • x86/uaccess: Annotate local function · fa972201
      Committed by Jiri Slaby
      .Lcopy_user_handle_tail is a self-standing local function; annotate it
      as such using SYM_CODE_START_LOCAL.
      
      Again, no functional change, just documentation.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-9-jslaby@suse.cz
    • x86/boot: Annotate local functions · deff8a24
      Committed by Jiri Slaby
      .Lrelocated, .Lpaging_enabled, .Lno_longmode, and .Lin_pm32 are
      self-standing local functions; annotate them as such and preserve "no
      alignment".
      
      The annotations do not generate anything yet.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Cao jin <caoj.fnst@cn.fujitsu.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wei Huang <wei@redhat.com>
      Cc: x86-ml <x86@kernel.org>
      Cc: Xiaoyao Li <xiaoyao.li@linux.intel.com>
      Link: https://lkml.kernel.org/r/20191011115108.12392-8-jslaby@suse.cz
    • x86/asm/crypto: Annotate local functions · 74d8b90a
      Committed by Jiri Slaby
      Use the newly added SYM_FUNC_START_LOCAL to annotate beginnings of all
      functions which do not have ".globl" annotation, but their endings are
      annotated by ENDPROC. This is needed to balance ENDPROC for tools that
      generate debuginfo.
      
      These function names are not prepended with ".L" as they might appear
      in call traces, and they wouldn't be visible there after such a change.
      
      To be symmetric, the functions' ENDPROCs are converted to the new
      SYM_FUNC_END.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-crypto@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-7-jslaby@suse.cz
    • x86/asm: Annotate local pseudo-functions · ef77e688
      Committed by Jiri Slaby
      Use the newly added SYM_CODE_START_LOCAL* to annotate beginnings of
      all pseudo-functions (those ending with END until now) which do not
      have ".globl" annotation. This is needed to balance END for tools that
      generate debuginfo. Note that ENDs are switched to SYM_CODE_END too so
      that everybody can see the pairing.
      
      C-like functions (which handle frame ptr etc.) are not annotated here,
      hence SYM_CODE_* macros are used here, not SYM_FUNC_*. Note that the
      32bit version of early_idt_handler_common already had ENDPROC -- switch
      that to SYM_CODE_END for the same reason as above (and to be the same as
      64bit).
      
      While early_idt_handler_common is LOCAL, its name is not prepended with
      ".L" as it happens to appear in call traces.
      
      bad_get_user* and bad_put_user are now aligned, as they are separate
      functions. They do not mind being aligned -- there is no need to be
      compact there.
      
      early_idt_handler_common is aligned now too, as it comes after
      early_idt_handler_array, so there is likewise no need to be compact there.
      
      verify_cpu is self-standing and included in other .S files, so align it
      too.
      
      The others have alignment preserved to what it used to be (using the
      _NOALIGN variant of macros).
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Alexios Zavras <alexios.zavras@intel.com>
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cao jin <caoj.fnst@cn.fujitsu.com>
      Cc: Enrico Weigelt <info@metux.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Maran Wilson <maran.wilson@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-6-jslaby@suse.cz
    • x86/asm/entry: Annotate THUNKs · 76dc6d60
      Committed by Jiri Slaby
      Place SYM_*_START_NOALIGN and SYM_*_END around the THUNK macro body.
      Preserve @function by FUNC (64bit) and CODE (32bit). Given it was not
      marked as aligned, use NOALIGN.
      
      The result:
       Value  Size Type    Bind   Vis      Ndx Name
        0000    28 FUNC    GLOBAL DEFAULT    1 trace_hardirqs_on_thunk
        001c    28 FUNC    GLOBAL DEFAULT    1 trace_hardirqs_off_thunk
        0038    24 FUNC    GLOBAL DEFAULT    1 lockdep_sys_exit_thunk
        0050    24 FUNC    GLOBAL DEFAULT    1 ___preempt_schedule
        0068    24 FUNC    GLOBAL DEFAULT    1 ___preempt_schedule_notra
      
      The annotation of .L_restore does not generate anything (at the moment).
      Here, it just serves documentation purposes (as opening and closing
      brackets of functions).
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-5-jslaby@suse.cz
    • x86/asm: Annotate relocate_kernel_{32,64}.S · 6ec2a968
      Committed by Jiri Slaby
      There are functions in relocate_kernel_{32,64}.S which are not
      annotated. This makes automatic annotation of them rather hard. So
      annotate all the functions now.
      
      Note that these are not C-like functions, so FUNC is not used. Instead
      CODE markers are used. Also the functions are not aligned, so the
      NOALIGN versions are used:
      
      - SYM_CODE_START_NOALIGN
      - SYM_CODE_START_LOCAL_NOALIGN
      - SYM_CODE_END
      
      The result is:
        0000   108 NOTYPE  GLOBAL DEFAULT    1 relocate_kernel
        006c   165 NOTYPE  LOCAL  DEFAULT    1 identity_mapped
        0146   127 NOTYPE  LOCAL  DEFAULT    1 swap_pages
        0111    53 NOTYPE  LOCAL  DEFAULT    1 virtual_mapped
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Alexios Zavras <alexios.zavras@intel.com>
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Enrico Weigelt <info@metux.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-4-jslaby@suse.cz
    • x86/asm/suspend: Use SYM_DATA for data · 37503f73
      Committed by Jiri Slaby
      Some global data in the suspend code were marked as `ENTRY'. ENTRY was
      intended for functions and shall be paired with ENDPROC. ENTRY also
      aligns symbols to 16 bytes which creates unnecessary holes.
      
      Note that:
      
      * saved_magic (long) in wakeup_32 is still prepended by section's ALIGN
      * saved_magic (quad) in wakeup_64 follows a bunch of quads which are
        aligned (but need not be aligned to 16)
      
      Since historical markings are being dropped, make proper use of newly
      added SYM_DATA in this code.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-3-jslaby@suse.cz
    • linkage: Introduce new macros for assembler symbols · ffedeeb7
      Committed by Jiri Slaby
      Introduce new C macros for annotations of functions and data in
      assembly. There is a long-standing mess in macros like ENTRY, END,
      ENDPROC and similar. They are used in different manners and sometimes
      incorrectly.
      
      So introduce macros with clear use to annotate assembly as follows:
      
      a) Support macros for the ones below
         SYM_T_FUNC -- type used by assembler to mark functions
         SYM_T_OBJECT -- type used by assembler to mark data
         SYM_T_NONE -- type used by assembler to mark entries of unknown type
      
         They are defined as STT_FUNC, STT_OBJECT, and STT_NOTYPE
         respectively. According to the gas manual, this is the most portable
         way. I am not sure about other assemblers, so this can be switched
         back to %function and %object if this turns into a problem.
         Architectures can also override them by something like ", @function"
         if they need.
      
         SYM_A_ALIGN, SYM_A_NONE -- align the symbol?
         SYM_L_GLOBAL, SYM_L_WEAK, SYM_L_LOCAL -- linkage of symbols
      
      b) Mostly internal annotations, used by the ones below
         SYM_ENTRY -- use only if you have to (for non-paired symbols)
         SYM_START -- use only if you have to (for paired symbols)
         SYM_END -- use only if you have to (for paired symbols)
      
      c) Annotations for code
         SYM_INNER_LABEL_ALIGN -- only for labels in the middle of code
         SYM_INNER_LABEL -- only for labels in the middle of code
      
         SYM_FUNC_START_LOCAL_ALIAS -- use where there are two local names for
      	one function
         SYM_FUNC_START_ALIAS -- use where there are two global names for one
      	function
         SYM_FUNC_END_ALIAS -- the end of LOCAL_ALIASed or ALIASed function
      
         SYM_FUNC_START -- use for global functions
         SYM_FUNC_START_NOALIGN -- use for global functions, w/o alignment
         SYM_FUNC_START_LOCAL -- use for local functions
         SYM_FUNC_START_LOCAL_NOALIGN -- use for local functions, w/o
      	alignment
         SYM_FUNC_START_WEAK -- use for weak functions
         SYM_FUNC_START_WEAK_NOALIGN -- use for weak functions, w/o alignment
         SYM_FUNC_END -- the end of SYM_FUNC_START_LOCAL, SYM_FUNC_START,
      	SYM_FUNC_START_WEAK, ...
      
         For functions with special (non-C) calling conventions:
         SYM_CODE_START -- use for non-C (special) functions
         SYM_CODE_START_NOALIGN -- use for non-C (special) functions, w/o
      	alignment
         SYM_CODE_START_LOCAL -- use for local non-C (special) functions
         SYM_CODE_START_LOCAL_NOALIGN -- use for local non-C (special)
      	functions, w/o alignment
         SYM_CODE_END -- the end of SYM_CODE_START_LOCAL or SYM_CODE_START
      
      d) For data
         SYM_DATA_START -- global data symbol
         SYM_DATA_START_LOCAL -- local data symbol
         SYM_DATA_END -- the end of the SYM_DATA_START symbol
         SYM_DATA_END_LABEL -- the labeled end of SYM_DATA_START symbol
         SYM_DATA -- start+end wrapper around simple global data
         SYM_DATA_LOCAL -- start+end wrapper around simple local data
      
      ==========
      
      The macros allow pairing the starts and ends of functions and marking
      functions correctly in the output ELF objects.
      
      All users of the old macros in x86 are converted to use these in further
      patches.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Cc: xen-devel@lists.xenproject.org
      Link: https://lkml.kernel.org/r/20191011115108.12392-2-jslaby@suse.cz
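      Put together, a hypothetical .S file using the new families could look
      like this sketch (all symbol names invented for illustration):

      ```asm
      #include <linux/linkage.h>

      SYM_FUNC_START(global_func)			/* global, FUNC, aligned */
      	ret
      SYM_FUNC_END(global_func)

      SYM_FUNC_START_LOCAL_NOALIGN(local_func)	/* local, FUNC, compact */
      	ret
      SYM_FUNC_END(local_func)

      SYM_CODE_START(special_entry)			/* non-C calling convention */
      	/* ... */
      SYM_CODE_END(special_entry)

      SYM_DATA(answer, .long 42)			/* OBJECT with correct size */

      SYM_DATA_START_LOCAL(table)
      	.quad	0
      SYM_DATA_END_LABEL(table, SYM_L_LOCAL, table_end)
      ```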
  2. 11 October 2019, 1 commit
    • x86/asm: Make more symbols local · 30a2441c
      Committed by Jiri Slaby
      During the assembly cleanup patchset review, I found more symbols which
      are used only locally. So make them really local by prepending ".L" to
      them. Namely:
      
       - wakeup_idt is used only in realmode/rm/wakeup_asm.S.
       - in_pm32 is used only in boot/pmjump.S.
       - retint_user is used only in entry/entry_64.S, perhaps since commit
         2ec67971 ("x86/entry/64/compat: Remove most of the fast system
         call machinery"), where entry_64_compat's caller was removed.
      
      Drop GLOBAL from all of them too. I do not see more candidates in the
      series.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Acked-by: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/20191011092213.31470-1-jslaby@suse.cz
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
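      The conversion pattern, as a sketch (the two-word body of wakeup_idt is
      illustrative, not copied from the file):

      ```asm
      /* Before: GLOBAL kept the name visible in the symbol table. */
      GLOBAL(wakeup_idt)
      	.word	0
      	.long	0

      /* After: a ".L" prefix makes it assembler-local; no symbol is emitted. */
      .Lwakeup_idt:
      	.word	0
      	.long	0
      ```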
  3. 05 October 2019, 2 commits
    • x86/asm: Make boot_gdt_descr local · 5aa5cbd2
      Committed by Jiri Slaby
      As far as I can see, it was never used outside of head_32.S. Not even
      when added in 2004. So make it local.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191003095238.29831-2-jslaby@suse.cz
    • x86/asm: Reorder early variables · 1a8770b7
      Committed by Jiri Slaby
      Moving early_recursion_flag (4 bytes) after early_level4_pgt (4k) and
      early_dynamic_pgts (256k) saves 4k which are used for alignment of
      early_level4_pgt after early_recursion_flag.
      
      The real improvement is merely on the source code side. Previously it
      was:
      * __INITDATA + .balign
      * early_recursion_flag variable
      * a ton of CPP MACROS
      * __INITDATA (again)
      * early_top_pgt and early_dynamic_pgts variables
      * .data
      
      Now, it is a bit simpler:
      * a ton of CPP MACROS
      * __INITDATA + .balign
      * early_top_pgt and early_dynamic_pgts variables
      * early_recursion_flag variable
      * .data
      
      On the binary level the change looks like this:
      Before:
       (sections)
        12 .init.data    00042000  0000000000000000  0000000000000000 00008000  2**12
       (symbols)
        000000       4 OBJECT  GLOBAL DEFAULT   22 early_recursion_flag
        001000    4096 OBJECT  GLOBAL DEFAULT   22 early_top_pgt
        002000 0x40000 OBJECT  GLOBAL DEFAULT   22 early_dynamic_pgts
      
      After:
       (sections)
        12 .init.data    00041004  0000000000000000  0000000000000000 00008000  2**12
       (symbols)
        000000    4096 OBJECT  GLOBAL DEFAULT   22 early_top_pgt
        001000 0x40000 OBJECT  GLOBAL DEFAULT   22 early_dynamic_pgts
        041000       4 OBJECT  GLOBAL DEFAULT   22 early_recursion_flag
      
      So the resulting vmlinux is smaller by 4k with my toolchain as many
      other variables can be placed after early_recursion_flag to fill the
      rest of the page. Note that this is only .init data, so it is freed
      right after being booted anyway. Savings on-disk are none -- compression
      of zeros is easy, so the size of bzImage is the same pre and post the
      change.
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191003095238.29831-1-jslaby@suse.cz
  4. 30 September 2019, 5 commits
  5. 27 September 2019, 3 commits
  6. 26 September 2019, 6 commits
    • hexagon: drop empty and unused free_initrd_mem · c7cc8d77
      Committed by Mike Rapoport
      hexagon never reserves or initializes initrd and the only mention of it is
      the empty free_initrd_mem() function.
      
      As we have a generic implementation of free_initrd_mem(), there is no need
      to define an empty stub for the hexagon implementation and it can be
      dropped.
      
Link: http://lkml.kernel.org/r/1565858133-25852-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c7cc8d77
    • M
      mm: introduce MADV_PAGEOUT · 1a4e58cc
By Minchan Kim
When a process expects no accesses to a certain memory range for a long
time, it can hint to the kernel that the pages can be reclaimed
instantly while their data is preserved for future use.  This can reduce
working-set eviction and so ends up improving performance.
      
This patch introduces the new MADV_PAGEOUT hint to the madvise(2) syscall.
MADV_PAGEOUT can be used by a process to mark a memory range as not
expected to be used for a long time so that the kernel reclaims *any LRU*
pages instantly.  The hint helps the kernel decide which pages to
evict proactively.

A note: it intentionally does not apply the SWAP_CLUSTER_MAX LRU page
isolation limit because the batch is automatically bounded by the PMD
size.  If the PMD size (e.g., 256 pages) causes trouble, we could fix it
later by limiting the batch to SWAP_CLUSTER_MAX [1].
      
      - man-page material
      
      MADV_PAGEOUT (since Linux x.x)
      
The application does not expect access in the near future, so pages in
the specified regions can be reclaimed instantly regardless of memory
pressure.  An access within the range after a successful operation may
therefore incur a major page fault, but unlike with MADV_DONTNEED the
up-to-date contents are never lost.  Pages belonging to a shared mapping
are only processed if a write access is allowed for the calling process.
      
      MADV_PAGEOUT cannot be applied to locked pages, Huge TLB pages, or
      VM_PFNMAP pages.
      
      [1] https://lore.kernel.org/lkml/20190710194719.GS29695@dhcp22.suse.cz/
      
      [minchan@kernel.org: clear PG_active on MADV_PAGEOUT]
        Link: http://lkml.kernel.org/r/20190802200643.GA181880@google.com
      [akpm@linux-foundation.org: resolve conflicts with hmm.git]
Link: http://lkml.kernel.org/r/20190726023435.214162-5-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oleksandr Natalenko <oleksandr@redhat.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Sonny Rao <sonnyrao@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Tim Murray <timmurray@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1a4e58cc
    • M
      mm: introduce MADV_COLD · 9c276cc6
By Minchan Kim
      Patch series "Introduce MADV_COLD and MADV_PAGEOUT", v7.
      
      - Background
      
      The Android terminology used for forking a new process and starting an app
      from scratch is a cold start, while resuming an existing app is a hot
      start.  While we continually try to improve the performance of cold
      starts, hot starts will always be significantly less power hungry as well
      as faster so we are trying to make hot start more likely than cold start.
      
To increase hot starts, Android userspace manages the order in which
apps should be killed via a process called ActivityManagerService.
ActivityManagerService tracks every Android app or service that the user
could be interacting with at any time and translates that into a ranked
list for lmkd (the low memory killer daemon).  They are likely to be killed by
      lmkd if the system has to reclaim memory.  In that sense they are similar
      to entries in any other cache.  Those apps are kept alive for
      opportunistic performance improvements but those performance improvements
      will vary based on the memory requirements of individual workloads.
      
      - Problem
      
Naturally, cached apps were dominant consumers of memory on the system.
However, they were not significant consumers of swap even though they
are good candidates for swap.  Under investigation, swapping out only
begins once the low zone watermark is hit and kswapd wakes up, but the
overall allocation rate in the system might trip lmkd thresholds first
and cause a cached process to be killed.  (We measured swapping out vs.
zapping the memory by killing a process; unsurprisingly, zapping is
about 10x faster even though we use zram, which is much faster than real
storage.)  A kill from lmkd will thus often satisfy the high zone
watermark, resulting in very few pages actually being moved to swap.
      
      - Approach
      
The approach we chose was to add a new interface that allows userspace
to proactively reclaim entire processes by leveraging platform
information.  This let us bypass the inaccuracy of the kernel's LRUs for
pages that are known to be cold from userspace and avoid races with lmkd
by reclaiming apps as soon as they enter the cached state.
Additionally, it gives the platform many opportunities to use its richer
information to optimize memory efficiency.
      
To achieve this, the patchset introduces two new options for madvise.
One is MADV_COLD, which deactivates active pages, and the other is
MADV_PAGEOUT, which reclaims private pages instantly.  These new options
complement MADV_DONTNEED and MADV_FREE by adding non-destructive ways to
gain some free memory space.  MADV_PAGEOUT is similar to MADV_DONTNEED
in that it hints to the kernel that the memory region is not currently
needed and should be reclaimed immediately; MADV_COLD is similar to
MADV_FREE in that it hints to the kernel that the memory region is not
currently needed and should be reclaimed when memory pressure rises.
      
      This patch (of 5):
      
When a process expects no accesses to a certain memory range, it can
hint to the kernel that the pages can be reclaimed when memory pressure
occurs but that the data should be preserved for future use.  This can
reduce working-set eviction and so ends up improving performance.

This patch introduces the new MADV_COLD hint to the madvise(2) syscall.
MADV_COLD can be used by a process to mark a memory range as not expected
to be used in the near future.  The hint helps the kernel decide which
pages to evict early during memory pressure.
      
It works on all LRU pages, like MADV_[DONTNEED|FREE]. IOW, it moves

	active file page -> inactive file LRU
	active anon page -> inactive anon LRU

Unlike MADV_FREE, it doesn't move active anonymous pages to the head of
the inactive file LRU, because MADV_COLD has slightly different
semantics.  MADV_FREE means it is okay to discard the page under memory
pressure because its contents are *garbage*: freeing such pages has
almost zero overhead since we don't need to swap them out, and a later
access causes only a minor fault.  Thus it makes sense to put those
freeable pages on the inactive file LRU to compete with other used-once
pages.  It also makes sense from an implementation point of view,
because the memory is no longer swap-backed until it is re-dirtied; it
even gives a bonus in that such pages can be reclaimed on a swapless
system.  MADV_COLD, however, doesn't mean garbage, so reclaiming those
pages requires a swap-out and a later swap-in, which is a bigger cost.
Since we have designed VM LRU aging based on a cost model, cold
anonymous pages are better positioned on the inactive anon LRU list, not
the file LRU.  Furthermore, this helps avoid unnecessary scanning if the
system doesn't have a swap device.  Let's start with the simpler way
without adding complexity at this moment.  Keep in mind the caveat,
though, that workloads with a lot of page cache are likely to have
MADV_COLD effectively ignored for anonymous memory, because we rarely
age the anonymous LRU lists.
      
      * man-page material
      
      MADV_COLD (since Linux x.x)
      
      Pages in the specified regions will be treated as less-recently-accessed
      compared to pages in the system with similar access frequencies.  In
      contrast to MADV_FREE, the contents of the region are preserved regardless
      of subsequent writes to pages.
      
      MADV_COLD cannot be applied to locked pages, Huge TLB pages, or VM_PFNMAP
      pages.
      
      [akpm@linux-foundation.org: resolve conflicts with hmm.git]
Link: http://lkml.kernel.org/r/20190726023435.214162-2-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oleksandr Natalenko <oleksandr@redhat.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Sonny Rao <sonnyrao@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Tim Murray <timmurray@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9c276cc6
    • A
      lib: untag user pointers in strn*_user · 903f433f
By Andrey Konovalov
      Patch series "arm64: untag user pointers passed to the kernel", v19.
      
      === Overview
      
arm64 has a feature called Top Byte Ignore, which allows embedding
pointer tags in the top byte of each pointer.  Userspace programs (such
as HWASan, a memory debugging tool [1]) might use this feature and pass
tagged user pointers to the kernel through syscalls or other interfaces.
      
      Right now the kernel is already able to handle user faults with tagged
      pointers, due to these patches:
      
      1. 81cddd65 ("arm64: traps: fix userspace cache maintenance emulation on a
                   tagged pointer")
      2. 7dcd9dd8 ("arm64: hw_breakpoint: fix watchpoint matching for tagged
      	      pointers")
      3. 276e9327 ("arm64: entry: improve data abort handling of tagged
      	      pointers")
      
      This patchset extends tagged pointer support to syscall arguments.
      
      As per the proposed ABI change [3], tagged pointers are only allowed to be
      passed to syscalls when they point to memory ranges obtained by anonymous
      mmap() or sbrk() (see the patchset [3] for more details).
      
For non-memory syscalls this is done by untagging user pointers when the
kernel performs pointer checks to find out whether the pointer comes
from userspace (most notably in access_ok).  The untagging is done only
while the pointer is being checked: the tag is preserved as the pointer
makes its way through the kernel and stays tagged when the kernel
dereferences the pointer while performing user memory accesses.
      
      The mmap and mremap (only new_addr) syscalls do not currently accept
      tagged addresses.  Architectures may interpret the tag as a background
      colour for the corresponding vma.
      
      Other memory syscalls (mprotect, etc.) don't do user memory accesses but
      rather deal with memory ranges, and untagged pointers are better suited to
      describe memory ranges internally.  Thus for memory syscalls we untag
      pointers completely when they enter the kernel.
      
      === Other approaches
      
One of the alternative approaches to untagging that was considered is to
completely strip the pointer tag as the pointer enters the kernel with
some kind of syscall wrapper, but that won't work with the countless
different ioctl calls.  With this approach we would need a custom
wrapper for each ioctl variation, which doesn't seem practical.
      
An alternative approach to untagging pointers in memory syscall
prologues is to instead allow tagged pointers to be passed to find_vma()
(and other vma-related functions) and untag them there.  Unfortunately,
a lot of find_vma() callers then compare or subtract the returned vma's
start and end fields against the pointer that was being searched for.
Thus this approach would still require changing all find_vma() callers.
      
      === Testing
      
The following testing approaches have been taken to find potential
issues with user pointer untagging:
      
      1. Static testing (with sparse [2] and separately with a custom static
         analyzer based on Clang) to track casts of __user pointers to integer
         types to find places where untagging needs to be done.
      
      2. Static testing with grep to find parts of the kernel that call
         find_vma() (and other similar functions) or directly compare against
         vm_start/vm_end fields of vma.
      
      3. Static testing with grep to find parts of the kernel that compare
         user pointers with TASK_SIZE or other similar consts and macros.
      
      4. Dynamic testing: adding BUG_ON(has_tag(addr)) to find_vma() and running
         a modified syzkaller version that passes tagged pointers to the kernel.
      
Based on the results of the testing, the required patches have been
added to the patchset.
      
      === Notes
      
      This patchset is meant to be merged together with "arm64 relaxed ABI" [3].
      
      This patchset is a prerequisite for ARM's memory tagging hardware feature
      support [4].
      
      This patchset has been merged into the Pixel 2 & 3 kernel trees and is
      now being used to enable testing of Pixel phones with HWASan.
      
      Thanks!
      
      [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
      
      [2] https://github.com/lucvoo/sparse-dev/commit/5f960cb10f56ec2017c128ef9d16060e0145f292
      
      [3] https://lkml.org/lkml/2019/6/12/745
      
      [4] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture-2018-developments-armv85a
      
This patch (of 11):
      
This patch is part of a series that extends the kernel ABI to allow
passing tagged user pointers (with the top byte set to something other
than 0x00) as syscall arguments.
      
      strncpy_from_user and strnlen_user accept user addresses as arguments, and
      do not go through the same path as copy_from_user and others, so here we
      need to handle the case of tagged user addresses separately.
      
      Untag user pointers passed to these functions.
      
      Note, that this patch only temporarily untags the pointers to perform
      validity checks, but then uses them as is to perform user memory accesses.
      
      [andreyknvl@google.com: fix sparc4 build]
       Link: http://lkml.kernel.org/r/CAAeHK+yx4a-P0sDrXTUxMvO2H0CJZUFPffBrg_cU7oJOZyC7ew@mail.gmail.com
Link: http://lkml.kernel.org/r/c5a78bcad3e94d6cda71fcaa60a423231ae71e4c.1563904656.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Auger <eric.auger@redhat.com>
      Cc: Felix Kuehling <Felix.Kuehling@amd.com>
      Cc: Jens Wiklander <jens.wiklander@linaro.org>
      Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      903f433f
    • M
      augmented rbtree: add new RB_DECLARE_CALLBACKS_MAX macro · 315cc066
By Michel Lespinasse
      Add RB_DECLARE_CALLBACKS_MAX, which generates augmented rbtree callbacks
      for the case where the augmented value is a scalar whose definition
      follows a max(f(node)) pattern.  This actually covers all present uses of
      RB_DECLARE_CALLBACKS, and saves some (source) code duplication in the
      various RBCOMPUTE function definitions.
      
      [walken@google.com: fix mm/vmalloc.c]
        Link: http://lkml.kernel.org/r/CANN689FXgK13wDYNh1zKxdipeTuALG4eKvKpsdZqKFJ-rvtGiQ@mail.gmail.com
      [walken@google.com: re-add check to check_augmented()]
        Link: http://lkml.kernel.org/r/20190727022027.GA86863@google.com
Link: http://lkml.kernel.org/r/20190703040156.56953-3-walken@google.com
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      315cc066
    • P
      KVM: nVMX: cleanup and fix host 64-bit mode checks · fd3edd4a
By Paolo Bonzini
      KVM was incorrectly checking vmcs12->host_ia32_efer even if the "load
      IA32_EFER" exit control was reset.  Also, some checks were not using
      the new CC macro for tracing.
      
      Cleanup everything so that the vCPU's 64-bit mode is determined
      directly from EFER_LMA and the VMCS checks are based on that, which
      matches section 26.2.4 of the SDM.
      
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Fixes: 5845038c
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      fd3edd4a
7. 25 September 2019, 4 commits