1. 17 Feb 2018, 1 commit
  2. 02 Nov 2017, 1 commit
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Authored by Greg Kroah-Hartman
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information.
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier should be applied
      to a file was done in a spreadsheet of side-by-side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few thousand files.
      
      The 4.13 kernel was the starting point of the analysis, with 60,537 files
      assessed.  Kate Stewart did a file-by-file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      should be applied to the file. She confirmed any determination that was
      not immediately clear with lawyers working with the Linux Foundation.
      
      The criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source.
       - The file already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, the file was
         considered to have no license information in it, and the top-level
         COPYING file license was applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If such a file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note"; otherwise it was "GPL-2.0".  The results of that were:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or if it had no licensing
         in it (per the prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based in part on an older version of FOSSology, so
      the two are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with the SPDX license identifiers in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and these have been fixed to reflect
      the correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types).  Finally Greg ran the script using the .csv files to
      generate the patches.
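
      A minimal sketch of what the added tags look like (my illustration, not a
      hunk from this patch; as I understand the kernel's license-rules
      convention, headers keep classic C comments while .c files take a
      C++-style comment on the first line):

          /* In a uapi header, a classic C comment as the first line: */
          /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */

          // In a .c source file, a C++-style comment as the very first line:
          // SPDX-License-Identifier: GPL-2.0
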
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  3. 13 Jun 2017, 1 commit
  4. 02 May 2017, 5 commits
  5. 16 Mar 2017, 1 commit
    • x86: Remap GDT tables in the fixmap section · 69218e47
      Authored by Thomas Garnier
      Each processor holds a GDT in its per-cpu structure. The sgdt
      instruction gives the base address of the current GDT. This address can
      be used to bypass KASLR memory randomization. With another bug, an
      attacker could target other per-cpu structures or deduce the base of
      the main memory section (PAGE_OFFSET).
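
      As a rough illustration of the leak (my own user-space sketch, not part of
      the patch): SGDT is not a privileged instruction on hardware without UMIP,
      so even user space can read the current GDT base, which before this change
      is a per-cpu kernel address:

          #include <stdint.h>
          #include <stdio.h>

          struct __attribute__((packed)) gdt_ptr {
                  uint16_t limit;
                  uint64_t base;          /* linear address of the current GDT */
          };

          int main(void)
          {
                  struct gdt_ptr gdt = { 0 };

                  /* SGDT stores the 10-byte GDT descriptor (x86-64). */
                  asm volatile("sgdt %0" : "=m" (gdt));
                  printf("GDT base: 0x%016llx\n", (unsigned long long)gdt.base);
                  return 0;
          }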
      
      This patch relocates the GDT table for each processor inside the
      fixmap section. The space is reserved based on the number of supported
      processors.
      
      For consistency, the remapping is done by default on both 32-bit and 64-bit.
      
      Each processor switches to its remapped GDT at the end of
      initialization. For hibernation, the main processor returns with the
      original GDT and switches back to the remapping at completion.
      
      This patch was tested on both architectures. Hibernation and KVM were
      both tested specially for their usage of the GDT.
      
      Thanks to Boris Ostrovsky <boris.ostrovsky@oracle.com> for testing and
      recommending changes for Xen support.
      Signed-off-by: Thomas Garnier <thgarnie@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Luis R . Rodriguez <mcgrof@kernel.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rafael J . Wysocki <rjw@rjwysocki.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kernel-hardening@lists.openwall.com
      Cc: kvm@vger.kernel.org
      Cc: lguest@lists.ozlabs.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-pm@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Cc: zijun_hu <zijun_hu@htc.com>
      Link: http://lkml.kernel.org/r/20170314170508.100882-2-thgarnie@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      69218e47
  6. 02 Mar 2017, 1 commit
  7. 07 Feb 2017, 1 commit
  8. 13 Dec 2016, 1 commit
    • x86/smpboot: Make logical package management more robust · 9d85eb91
      Authored by Thomas Gleixner
      The logical package management has several issues:
      
       - The APIC ids provided by ACPI are not required to be the same as the
         initial APIC id which can be retrieved by CPUID. The APIC ids provided
         by ACPI are those which are written by the BIOS into the APIC. The
         initial id is set by hardware and can not be changed. The hardware
         provided ids contain the real hardware package information.
      
         Especially AMD sets the effective APIC id different from the hardware id
         as they need to reserve space for the IOAPIC ids starting at id 0.
      
         As a consequence those machines trigger the currently active firmware
         bug printouts in dmesg. These are obviously wrong.
      
       - Virtual machines have their own interesting ways of enumerating APICs
         and packages which are not reliably covered by the current implementation.
      
      The sizing of the mapping array has been tweaked to be generously large to
      handle systems which provide a wrong core count when HT is disabled so the
      whole magic which checks for space in the physical hotplug case is not
      needed anymore.
      
      Simplify the whole machinery and do the mapping when the CPU starts and the
      CPUID derived physical package information is available. This solves the
      observed problems on AMD machines and works for the virtualization issues
      as well.
      
      Remove the extra call from the XEN cpu bringup code as it is no longer
      required.
      
      Fixes: d49597fd ("x86/cpu: Deal with broken firmware (VMWare/XEN)")
      Reported-and-tested-by: Borislav Petkov <bp@suse.de>
      Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: M. Vefa Bicakci <m.v.b@runbox.com>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Cc: Charles (Chas) Williams <ciwillia@brocade.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612121102260.3429@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      9d85eb91
  9. 06 Oct 2016, 1 commit
    • xen/x86: Update topology map for PV VCPUs · a6a198bc
      Authored by Boris Ostrovsky
      Early during boot topology_update_package_map() computes
      logical_pkg_ids for all present processors.
      
      Later, when processors are brought up, identify_cpu() updates
      these values based on phys_pkg_id which is a function of
      initial_apicid. On PV guests the latter may point to a
      non-existing node, causing logical_pkg_ids to be set to -1.
      
      Intel's RAPL uses logical_pkg_id (as topology_logical_package_id())
      to index its arrays and therefore in this case will point to index
      65535 (since logical_pkg_id is a u16). This could lead to either a
      crash or an access to a random memory location.
      
      As a workaround, we recompute topology during CPU bringup to reset
      logical_pkg_id to a valid value.
      
      (The reason initial_apicid is bogus is that it is the initial_apicid
      of the processor from which the guest is launched. This value is
      CPUID(1).EBX[31:24].)
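
      For reference, a small user-space sketch (mine, not from the patch) of
      reading the initial APIC id from CPUID(1).EBX[31:24] using GCC's
      <cpuid.h> helper:

          #include <cpuid.h>
          #include <stdio.h>

          int main(void)
          {
                  unsigned int eax, ebx, ecx, edx;

                  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
                          return 1;

                  /* Bits 31:24 of EBX hold the initial APIC id of this CPU. */
                  printf("initial APIC id: %u\n", (ebx >> 24) & 0xff);
                  return 0;
          }
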
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      a6a198bc
  10. 30 Sep 2016, 1 commit
    • xen: Remove event channel notification through Xen PCI platform device · 72a9b186
      Authored by KarimAllah Ahmed
      Ever since commit 254d1a3f ("xen/pv-on-hvm kexec: shutdown watches
      from old kernel") using the INTx interrupt from Xen PCI platform
      device for event channel notification would just lockup the guest
      during bootup.  postcore_initcall now calls xs_reset_watches which
      will eventually try to read a value from XenStore and will get stuck
      on read_reply at XenBus forever since the platform driver is not
      probed yet and its INTx interrupt handler is not registered yet. That
      means that the guest can not be notified at this moment of any pending
      event channels and none of the per-event handlers will ever be invoked
      (including the XenStore one) and the reply will never be picked up by
      the kernel.
      
      The exact stack where things get stuck during xenbus_init:
      
      -xenbus_init
       -xs_init
        -xs_reset_watches
         -xenbus_scanf
          -xenbus_read
           -xs_single
            -xs_single
             -xs_talkv
      
      Vector callbacks have been the favourite event notification mechanism
      since their introduction in commit 38e20b07 ("x86/xen: event channels
      delivery on HVM."), and the vector callback feature has been advertised
      by Xen for quite some time. That is why INTx has been broken for several
      years now without impacting anyone.
      
      Luckily this also means that event channel notification through INTx
      is basically dead code which can be safely removed without impacting
      anybody, since it has been effectively disabled for more than 4 years
      with nobody complaining about it (at least as far as I'm aware).
      
      This commit removes event channel notification through Xen PCI
      platform device.
      
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Julien Grall <julien.grall@citrix.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
      Cc: xen-devel@lists.xenproject.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-pci@vger.kernel.org
      Cc: Anthony Liguori <aliguori@amazon.com>
      Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      72a9b186
  11. 25 Aug 2016, 1 commit
  12. 25 Jul 2016, 2 commits
    • xen/pvhvm: run xen_vcpu_setup() for the boot CPU · ee42d665
      Authored by Vitaly Kuznetsov
      Historically we didn't call VCPUOP_register_vcpu_info for CPU0 for
      PVHVM guests (while we had it for PV and ARM guests). This is usually
      fine as we can use vcpu info in the shared_info page but when we try
      booting on a vCPU with Xen's vCPU id > 31 (e.g. when we try to kdump
      after crashing on this CPU) we're not able to boot.
      
      Switch to always doing VCPUOP_register_vcpu_info for the boot CPU.
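
      A condensed sketch of the registration this refers to (helper and variable
      names recalled from the surrounding code and simplified; treat it as
      illustrative rather than a literal hunk of this patch):

          /* Register this vCPU's vcpu_info area with the hypervisor. */
          static void register_vcpu_info(int cpu, struct vcpu_info *vcpup)
          {
                  struct vcpu_register_vcpu_info info;

                  info.mfn = arbitrary_virt_to_mfn(vcpup);
                  info.offset = offset_in_page(vcpup);

                  /* After this change, done for the boot CPU on PVHVM as well. */
                  if (HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info,
                                         xen_vcpu_nr(cpu), &info))
                          pr_warn("vcpu_info registration failed for CPU %d\n", cpu);
          }
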
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      ee42d665
    • x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op · ad5475f9
      Authored by Vitaly Kuznetsov
      HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter
      while Xen's idea is expected. In some cases these ideas diverge so we
      need to do remapping.
      
      Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr().
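
      A representative, hypothetical example of such a conversion; the exact
      call sites in the patch differ, but the pattern is the same:

          /* Before: Linux's CPU number is passed straight to the hypervisor. */
          HYPERVISOR_vcpu_op(VCPUOP_down, cpu, NULL);

          /* After: translate to Xen's idea of the vCPU id first. */
          HYPERVISOR_vcpu_op(VCPUOP_down, xen_vcpu_nr(cpu), NULL);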
      
      Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as
      they're only being called by PV guests before percpu areas are
      initialized. While the issue could be solved by switching to
      early_percpu for xen_vcpu_id I think it's not worth it: PV guests will
      probably never get to the point where their idea of vCPU id diverges
      from Xen's.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      ad5475f9
  13. 29 Mar 2016, 1 commit
  14. 02 Mar 2016, 1 commit
    • arch/hotplug: Call into idle with a proper state · fc6d73d6
      Authored by Thomas Gleixner
      Let the non-boot CPUs call into idle with the corresponding hotplug state, so
      the hotplug core can handle the further bringup. That's a first step towards
      converting the boot side of the hotplugged CPUs to do all the synchronization
      with the other side through the state machine. For now it'll only start the
      hotplug thread and kick the full bringup of the CPU.
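
      On the arch side this boils down to changing the state passed into the
      idle loop, roughly as below (a sketch, with CPUHP_AP_ONLINE_IDLE being my
      recollection of the state introduced by this series):

          /* Secondary CPU startup path (e.g. start_secondary()):
           * previously this called cpu_startup_entry(CPUHP_ONLINE). */
          cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
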
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
      Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      fc6d73d6
  15. 09 Sep 2015, 1 commit
  16. 20 Aug 2015, 1 commit
    • xen/PMU: Initialization code for Xen PMU · 65d0cf0b
      Authored by Boris Ostrovsky
      Map the shared data structure that will hold CPU registers, the VPMU
      context, and the V/PCPU IDs of the CPU interrupted by a PMU interrupt.
      The hypervisor fills in this information in its handler and passes it to
      the guest for further processing.
      
      Set up PMU VIRQ.
      
      Now that the perf infrastructure will assume that a PMU is available on
      a PV guest, we need to be careful and make sure that accesses via the
      RDPMC instruction don't cause fatal traps by the hypervisor. Provide a
      nop RDPMC handler.
      
      For the same reason avoid issuing a warning on a write to APIC's LVTPC.
      
      Both of these will be made functional in later patches.
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      65d0cf0b
  17. 02 Apr 2015, 1 commit
  18. 25 Mar 2015, 1 commit
    • x86/asm/entry: Get rid of KERNEL_STACK_OFFSET · ef593260
      Authored by Denys Vlasenko
      PER_CPU_VAR(kernel_stack) was set up in a way where it points
      five stack slots below the top of stack.
      
      Presumably, this was done to avoid one "sub $5*8,%rsp"
      in the syscall/sysenter code paths, where the iret frame needs to be
      created by hand.
      
      Ironically, none of them benefits from this optimization,
      since all of them need to allocate additional data on stack
      (struct pt_regs), so they still have to perform subtraction.
      
      This patch eliminates KERNEL_STACK_OFFSET.
      
      PER_CPU_VAR(kernel_stack) now points directly to top of stack.
      pt_regs allocations are adjusted to allocate iret frame as well.
      Hopefully we can merge it later with 32-bit specific
      PER_CPU_VAR(cpu_current_top_of_stack) variable...
      
      Net result in generated code is that constants in several insns
      are changed.
      
      This change is necessary for changing struct pt_regs creation
      in SYSCALL64 code path from MOV to PUSH instructions.
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1426785469-15125-2-git-send-email-dvlasenk@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ef593260
  19. 12 Mar 2015, 1 commit
    • x86: Use common outgoing-CPU-notification code · 2a442c9c
      Authored by Paul E. McKenney
      This commit replaces the open-coded CPU-offline notification with new
      common code.  Among other things, this change avoids calling scheduler
      code using RCU from an offline CPU that RCU is ignoring.  It also allows
      Xen to notice at online time that the CPU did not go offline correctly.
      Note that Xen has the surviving CPU carry out some cleanup operations,
      so if the surviving CPU times out, these cleanup operations might have
      been carried out while the outgoing CPU was still running.  It might
      therefore be unwise to bring this CPU back online, and this commit
      avoids doing so.
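
      As I understand it, the common code referred to here is the
      cpu_wait_death()/cpu_report_death() pair; a rough sketch of the resulting
      split (my reconstruction, not quoted from this commit):

          /* Surviving CPU, in the __cpu_die() path: bounded wait for the dying CPU. */
          if (!cpu_wait_death(cpu, 5))
                  pr_err("CPU %u did not die, leaving it offline\n", cpu);

          /* Outgoing CPU, at the end of its play_dead()-style path. */
          (void)cpu_report_death();
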
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: <x86@kernel.org>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: <xen-devel@lists.xenproject.org>
      2a442c9c
  20. 26 Jan 2015, 1 commit
  21. 10 Nov 2014, 1 commit
  22. 06 Oct 2014, 1 commit
  23. 16 Sep 2014, 1 commit
  24. 15 Apr 2014, 1 commit
    • x86/xen: Fix 32-bit PV guests's usage of kernel_stack · 4461bbc0
      Authored by Boris Ostrovsky
      Commit 198d208d ("x86: Keep
      thread_info on thread stack in x86_32") made 32-bit kernels use
      kernel_stack to point to thread_info. That change missed a couple of
      updates needed by Xen's 32-bit PV guests:
      
      1. kernel_stack needs to be initialized for secondary CPUs
      
      2. GET_THREAD_INFO() now uses %fs register which may not be the
         kernel's version when executing xen_iret().
      
      With respect to the second issue, we don't need GET_THREAD_INFO()
      anymore: we used it as an intermediate step to get to per_cpu xen_vcpu
      and avoid referencing %fs. Now that we are going to use %fs anyway we
      may as well go directly to xen_vcpu.
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      4461bbc0
  25. 22 Jan 2014, 1 commit
    • xen/pvh: Set X86_CR0_WP and others in CR0 (v2) · c9f6e997
      Authored by Roger Pau Monne
      Otherwise some user-space applications that use 'clone' with
      CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID end up hitting an assert in
      glibc, manifested by:
      
      general protection ip:7f80720d364c sp:7fff98fd8a80 error:0 in
      libc-2.13.so[7f807209e000+180000]
      
      This is due to the nature of said operations which sets and clears
      the PID.  "In the successful one I can see that the page table of
      the parent process has been updated successfully to use a
      different physical page, so the write of the tid on
      that page only affects the child...
      
      On the other hand, in the failed case, the write seems to happen before
      the copy of the original page is done, so both the parent and the child
      end up with the same value (because the parent copies the page after
      the write of the child tid has already happened)."
      (Roger's analysis). The nature of this is due to Xen commit
      51e2cac257ec8b4080d89f0855c498cbbd76a5e5
      ("x86/pvh: set only minimal cr0 and cr4 flags in order to use paging"),
      in which CR0_WP was removed, so the COW features of the Linux kernel
      were not operating properly.
      
      While doing that, also update the rest of the CR0 flags to be in line
      with what a baremetal Linux kernel would set them to.
      
      In 'secondary_startup_64' (baremetal Linux) sets:
      
      X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP |
      X86_CR0_AM | X86_CR0_PG
      
      The hypervisor for HVM type guests (which PVH is a bit) sets:
      X86_CR0_PE | X86_CR0_ET | X86_CR0_TS
      For PVH it specifically sets:
      X86_CR0_PG
      
      Which means we need to set the rest: X86_CR0_MP | X86_CR0_NE  |
      X86_CR0_WP | X86_CR0_AM to have full parity.
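
      In code terms the fix amounts to OR-ing the missing bits into CR0 early in
      the PVH boot path, something like the sketch below (assuming the usual
      read_cr0()/write_cr0() accessors are usable at that point):

          /* Bring CR0 to parity with baremetal secondary_startup_64. */
          write_cr0(read_cr0() | X86_CR0_MP | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM);
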
      Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
      Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      [v1: Took out the cr4 writes to be a separate patch]
      [v2: 0-DAY kernel found xen_setup_gdt to be missing a static]
      c9f6e997
  26. 06 Jan 2014, 1 commit
    • xen/pvh: Secondary VCPU bringup (non-bootup CPUs) · 5840c84b
      Authored by Mukesh Rathor
      The VCPU bringup protocol follows the PV protocol with certain twists.
      From xen/include/public/arch-x86/xen.h:
      
      Also note that when calling DOMCTL_setvcpucontext and VCPU_initialise
      for HVM and PVH guests, not all information in this structure is updated:
      
       - For HVM guests, the structures read include: fpu_ctxt (if
       VGCT_I387_VALID is set), flags, user_regs, debugreg[*]
      
       - PVH guests are the same as HVM guests, but additionally use ctrlreg[3] to
       set cr3. All other fields not used should be set to 0.
      
      This is what we do. We piggyback on 'xen_setup_gdt' - but modify it a
      bit - as we need to call 'load_percpu_segment' so that 'switch_to_new_gdt'
      can load per-cpu data structures. It has no effect on VCPU0.
      
      We also piggyback on the %rdi register to pass in the CPU number - so
      that when we boot up a new CPU, cpu_bringup_and_idle will receive the
      CPU number as its first parameter (via %rdi for 64-bit).
      Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      5840c84b
  27. 07 Nov 2013, 1 commit
  28. 10 Oct 2013, 1 commit
    • xen: Fix possible user space selector corruption · 7cde9b27
      Authored by Frediano Ziglio
      Due to the way the kernel is initialized under Xen, it is possible that
      the ring1 selector used by the kernel for the boot cpu ends up being
      copied to userspace, leading to a segmentation fault in userspace.
      
      The Xen code in the kernel initializes non-boot cpus with correct
      selectors (ds and es set to __USER_DS) but the boot one keeps the ring1
      selector (passed by Xen). On task context switch (switch_to) we assume
      that ds, es and cs already point to __USER_DS and __KERNEL_CS, so these
      selectors are not changed.
      
      If the processor is an Intel one that supports the sysenter instruction,
      sysenter/sysexit is used, so ds and es are not restored when switching
      back from kernel to userspace. If the selectors point to ring1 instead
      of __USER_DS, the userspace code will crash on its first memory access
      attempt (to be precise, Xen, in the emulated iret used to do sysexit,
      will detect this and set ds and es to zero, which leads to a GPF anyway).
      
      Now, if a userspace process calls into the kernel using sysenter and gets
      rescheduled (for me it happened with a specific init calling wait4), it
      can happen that the ring1 selector ends up in ds and es.
      
      This is quite hard to detect because after a while these selectors are
      fixed (__USER_DS seems sticky).
      
      Bisecting the code, commit 7076aada appears
      to be the first one that has this issue.
      Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com>
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
      7cde9b27
  29. 10 Sep 2013, 2 commits
    • xen/smp: Update pv_lock_ops functions before alternative code starts under PVHVM · 26a79995
      Authored by Konrad Rzeszutek Wilk
      Before this patch we would patch all of the pv_lock_ops sites
      using the alternative assembler. Then later in the bootup cycle we would
      change unlock_kick and lock_spinning to the Xen-specific versions -
      without re-patching.
      
      That meant that for the core of the kernel we would be running
      with the baremetal version of unlock_kick and lock_spinning while
      for modules we would have the proper Xen-specific slowpaths.
      
      As most modules use some API from the core kernel, that ended
      up with slowpath lockers waiting forever to be kicked (because they
      would be using the Xen-specific slowpath logic), while the
      kick never came because the unlock path that was taken was the
      baremetal one.
      
      On PV we do not have the problem as we initialise before the
      alternative code kicks in.
      
      The fix is to make the updating of the pv_lock_ops function
      be done before the alternative code starts patching.
      
      Note that this patch fixes issues discovered by commit
      f10cd522
      ("xen: disable PV spinlocks on HVM"), wherein it mentioned:
      
         PV spinlocks cannot possibly work with the current code because they are
         enabled after pvops patching has already been done, and because PV
         spinlocks use a different data structure than native spinlocks so we
         cannot switch between them dynamically.
      
      The first problem is solved by this patch.
      
      The second problem has been solved by commit
      816434ec
      (Merge branch 'x86-spinlocks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
      
      P.S.
      There is still the commit 70dd4998
      (xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM) to
      revert but that can be done later after all other bugs have been
      fixed.
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      26a79995
    • xen/spinlock: Fix locking path engaging too soon under PVHVM. · 1fb3a8b2
      Authored by Konrad Rzeszutek Wilk
      xen_lock_spinning has a check for the kicker interrupt,
      and if it is not initialized it will spin normally (not enter
      the slowpath).
      
      But in the PVHVM case we would initialize the kicker interrupt
      before the CPU came online. This meant that if the booting
      CPU used a spinlock and went into the slowpath, it would
      block forever. The 'forever' part is because during bootup
      the spinlock would be taken _before_ the CPU sets itself
      online (more on this below), and we end up polling on the
      event channel forever.
      
      The bootup CPU (see commit fc78d343
      "xen/smp: initialize IPI vectors before marking CPU online"
      for details) and the CPU that started the bootup consult
      the cpu_online_mask to determine whether the booting CPU should
      get an IPI. The booting CPU has to set itself in this mask via:
      
        set_cpu_online(smp_processor_id(), true);
      
      However, if the spinlock is taken before this (and it is) and
      it polls on an event channel - it will never be woken up as
      the kernel will never send an IPI to an offline CPU.
      
      Note that the PVHVM logic in sending IPIs is using the HVM
      path which has numerous checks using the cpu_online_mask
      and cpu_active_mask. See above mention git commit for details.
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      1fb3a8b2
  30. 20 Aug 2013, 1 commit
    • xen/smp: initialize IPI vectors before marking CPU online · fc78d343
      Authored by Chuck Anderson
      An older PVHVM guest (v3.0 based) crashed during vCPU hot-plug with:
      
      	kernel BUG at drivers/xen/events.c:1328!
      
      RCU has detected that a CPU has not entered a quiescent state within the
      grace period.  It needs to send the CPU a reschedule IPI if it is not
      offline.  rcu_implicit_offline_qs() does this check:
      
      	/*
      	 * If the CPU is offline, it is in a quiescent state.  We can
      	 * trust its state not to change because interrupts are disabled.
      	 */
      	if (cpu_is_offline(rdp->cpu)) {
      		rdp->offline_fqs++;
      		return 1;
      	}
      
      	Else the CPU is online.  Send it a reschedule IPI.
      
      The CPU is in the middle of being hot-plugged and has been marked online
      (!cpu_is_offline()).  See start_secondary():
      
      	set_cpu_online(smp_processor_id(), true);
      	...
      	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
      
      start_secondary() then waits for the CPU bringing up the hot-plugged CPU to
      mark it as active:
      
      	/*
      	 * Wait until the cpu which brought this one up marked it
      	 * online before enabling interrupts. If we don't do that then
      	 * we can end up waking up the softirq thread before this cpu
      	 * reached the active state, which makes the scheduler unhappy
      	 * and schedule the softirq thread on the wrong cpu. This is
      	 * only observable with forced threaded interrupts, but in
      	 * theory it could also happen w/o them. It's just way harder
      	 * to achieve.
      	 */
      	while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask))
      		cpu_relax();
      
      	/* enable local interrupts */
      	local_irq_enable();
      
      The CPU being hot-plugged will be marked active after it has been fully
      initialized by the CPU managing the hot-plug.  In the Xen PVHVM case
      xen_smp_intr_init() is called to set up the hot-plugged vCPU's
      XEN_RESCHEDULE_VECTOR.
      
      The hot-plugging CPU is marked online, not marked active and does not have
      its IPI vectors set up.  rcu_implicit_offline_qs() sees the hot-plugging
      cpu is !cpu_is_offline() and tries to send it a reschedule IPI:
      This will lead to:
      
      	kernel BUG at drivers/xen/events.c:1328!
      
      	xen_send_IPI_one()
      	xen_smp_send_reschedule()
      	rcu_implicit_offline_qs()
      	rcu_implicit_dynticks_qs()
      	force_qs_rnp()
      	force_quiescent_state()
      	__rcu_process_callbacks()
      	rcu_process_callbacks()
      	__do_softirq()
      	call_softirq()
      	do_softirq()
      	irq_exit()
      	xen_evtchn_do_upcall()
      
      because xen_send_IPI_one() will attempt to use an uninitialized IRQ for
      the XEN_RESCHEDULE_VECTOR.
      
      There is at least one other place that has caused the same crash:
      
      	xen_smp_send_reschedule()
      	wake_up_idle_cpu()
      	add_timer_on()
      	clocksource_watchdog()
      	call_timer_fn()
      	run_timer_softirq()
      	__do_softirq()
      	call_softirq()
      	do_softirq()
      	irq_exit()
      	xen_evtchn_do_upcall()
      	xen_hvm_callback_vector()
      
      clocksource_watchdog() uses cpu_online_mask to pick the next CPU to handle
      a watchdog timer:
      
      	/*
      	 * Cycle through CPUs to check if the CPUs stay synchronized
      	 * to each other.
      	 */
      	next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
      	if (next_cpu >= nr_cpu_ids)
      		next_cpu = cpumask_first(cpu_online_mask);
      	watchdog_timer.expires += WATCHDOG_INTERVAL;
      	add_timer_on(&watchdog_timer, next_cpu);
      
      This resulted in an attempt to send an IPI to a hot-plugging CPU that
      had not initialized its reschedule vector. One option would be to make
      the RCU code check not for CPU offline but for CPU active, since
      becoming active is done after a CPU is online (in older kernels).
      
      But Srivatsa pointed out that "the cpu_active vs cpu_online ordering has been
      completely reworked - in the online path, cpu_active is set *before* cpu_online,
      and also, in the cpu offline path, the cpu_active bit is reset in the CPU_DYING
      notification instead of CPU_DOWN_PREPARE." Drilling into the bring-up
      path: "[brought up CPU].. send out a CPU_STARTING notification, and in response
      to that, the scheduler sets the CPU in the cpu_active_mask. Again, this mask
      is better left to the scheduler alone, since it has the intelligence to use it
      judiciously."
      
      The conclusion was that:
      "
      1. At the IPI sender side:
      
         It is incorrect to send an IPI to an offline CPU (cpu not present in
         the cpu_online_mask). There are numerous places where we check this
         and warn/complain.
      
      2. At the IPI receiver side:
      
         It is incorrect to let the world know of our presence (by setting
         ourselves in global bitmasks) until our initialization steps are complete
         to such an extent that we can handle the consequences (such as
         receiving interrupts without crashing the sender etc.)
      " (from Srivatsa)
      
      As the native code enables the interrupts at some point we need to be
      able to service them. In other words a CPU must have valid IPI vectors
      if it has been marked online.
      
      It doesn't need to handle the IPI (interrupts may be disabled) but needs
      to have valid IPI vectors because another CPU may find it in cpu_online_mask
      and attempt to send it an IPI.
      
      This patch will change the order of the Xen vCPU bring-up functions so that
      Xen vectors have been set up before start_secondary() is called.
      It also will not continue to bring up a Xen vCPU if xen_smp_intr_init() fails
      to initialize it.
      
      Orabug 13823853
      Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
      Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      fc78d343
  31. 09 Aug 2013, 2 commits
  32. 15 Jul 2013, 1 commit
    • x86: delete __cpuinit usage from all x86 files · 148f9bb8
      Authored by Paul Gortmaker
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
      and are flagged as __cpuinit -- so if we remove the __cpuinit from
      arch specific callers, we will also get section mismatch warnings.
      As an intermediate step, we intend to turn the linux/init.h cpuinit
      content into no-ops as early as possible, since that will get rid
      of these warnings.  In any case, they are temporary and harmless.
      
      This removes all the arch/x86 uses of the __cpuinit macros from
      all C files.  x86 only had the one __CPUINIT used in assembly files,
      and it wasn't paired off with a .previous or a __FINIT, so we can
      delete it directly w/o any corresponding additional change there.
      
      [1] https://lkml.org/lkml/2013/5/20/589
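
      The mechanical shape of the change is simply dropping the annotation,
      e.g. (illustrative function name, using the classic hotplug-notifier
      signature):

          /* Before: annotated so the function could be discarded after boot. */
          static int __cpuinit example_cpu_notify(struct notifier_block *nb,
                                                  unsigned long action, void *hcpu);

          /* After: a plain function living in regular .text. */
          static int example_cpu_notify(struct notifier_block *nb,
                                        unsigned long action, void *hcpu);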
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      148f9bb8
  33. 10 Jun 2013, 1 commit
    • xen/smp: Don't leak interrupt name when offlining. · b85fffec
      Authored by Konrad Rzeszutek Wilk
      When the user does:
      echo 0 > /sys/devices/system/cpu/cpu1/online
      echo 1 > /sys/devices/system/cpu/cpu1/online
      
      kmemleak reports:
      kmemleak: 7 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      
      unreferenced object 0xffff88003fa51240 (size 32):
        comm "swapper/0", pid 1, jiffies 4294667339 (age 1027.789s)
        hex dump (first 32 bytes):
          72 65 73 63 68 65 64 31 00 00 00 00 00 00 00 00  resched1........
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff81660721>] kmemleak_alloc+0x21/0x50
          [<ffffffff81190aac>] __kmalloc_track_caller+0xec/0x2a0
          [<ffffffff812fe1bb>] kvasprintf+0x5b/0x90
          [<ffffffff812fe228>] kasprintf+0x38/0x40
          [<ffffffff81047ed1>] xen_smp_intr_init+0x41/0x2c0
          [<ffffffff816636d3>] xen_cpu_up+0x393/0x3e8
          [<ffffffff8166bbf5>] _cpu_up+0xd1/0x14b
          [<ffffffff8166bd48>] cpu_up+0xd9/0xec
          [<ffffffff81ae6e4a>] smp_init+0x4b/0xa3
          [<ffffffff81ac4981>] kernel_init_freeable+0xdb/0x1e6
          [<ffffffff8165ce39>] kernel_init+0x9/0xf0
          [<ffffffff8167edfc>] ret_from_fork+0x7c/0xb0
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      This patch fixes some of it by using the 'struct xen_common_irq->name'
      field to stash away the allocated name string so that it can be freed
      when the interrupt line is destroyed.
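
      A trimmed sketch of the shape of the fix (field and variable names as I
      recall them from the surrounding code; illustrative only): the
      kasprintf()ed name is kept in the per-cpu bookkeeping and kfree()d when
      the IRQ is torn down:

          /* Setup side (xen_smp_intr_init): remember the allocated name. */
          resched_name = kasprintf(GFP_KERNEL, "resched%d", cpu);
          rc = bind_ipi_to_irqhandler(XEN_RESCHEDULE_VECTOR, cpu,
                                      xen_reschedule_interrupt,
                                      IRQF_PERCPU | IRQF_NOBALANCING,
                                      resched_name, NULL);
          if (rc < 0)
                  goto fail;
          per_cpu(xen_resched_irq, cpu).irq = rc;
          per_cpu(xen_resched_irq, cpu).name = resched_name;

          /* Teardown side (xen_smp_intr_free): free it together with the IRQ. */
          kfree(per_cpu(xen_resched_irq, cpu).name);
          per_cpu(xen_resched_irq, cpu).name = NULL;
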
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      b85fffec