1. 14 3月, 2020 1 次提交
  2. 10 2月, 2020 1 次提交
  3. 05 2月, 2020 1 次提交
  4. 01 2月, 2020 1 次提交
  5. 25 1月, 2020 2 次提交
    • J
      rcu: Remove kfree_call_rcu_nobatch() · 189a6883
      Joel Fernandes (Google) 提交于
      Now that the kfree_rcu() special-casing has been removed from tree RCU,
      this commit removes kfree_call_rcu_nobatch() since it is no longer needed.
      Signed-off-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: NPaul E. McKenney <paulmck@kernel.org>
      189a6883
    • J
      rcuperf: Add kfree_rcu() performance Tests · e6e78b00
      Joel Fernandes (Google) 提交于
      This test runs kfree_rcu() in a loop to measure performance of the new
      kfree_rcu() batching functionality.
      
      The following table shows results when booting with arguments:
      rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
      rcuperf.kfree_rcu_test=1 rcuperf.kfree_no_batch=X
      
      rcuperf.kfree_no_batch=X    # Grace Periods	Test Duration (s)
        X=1 (old behavior)              9133                 11.5
        X=0 (new behavior)              1732                 12.5
      
      On a 16 CPU system with the above boot parameters, we see that the total
      number of grace periods that elapse during the test drops from 9133 when
      not batching to 1732 when batching (a 5X improvement). The kfree_rcu()
      flood itself slows down a bit when batching, though, as shown.
      
      Note that the active memory consumption during the kfree_rcu() flood
      does increase to around 200-250MB due to the batching (from around 50MB
      without batching). However, this memory consumption is relatively
      constant. In other words, the system is able to keep up with the
      kfree_rcu() load. The memory consumption comes down considerably if
      KFREE_DRAIN_JIFFIES is increased from HZ/50 to HZ/80. A later patch will
      reduce memory consumption further by using multiple lists.
      
      Also, when running the test, please disable CONFIG_DEBUG_PREEMPT and
      CONFIG_PROVE_RCU for realistic comparisons with/without batching.
      Signed-off-by: NJoel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: NPaul E. McKenney <paulmck@kernel.org>
      e6e78b00
  6. 22 1月, 2020 1 次提交
    • M
      genirq, sched/isolation: Isolate from handling managed interrupts · 11ea68f5
      Ming Lei 提交于
      The affinity of managed interrupts is completely handled in the kernel and
      cannot be changed via the /proc/irq/* interfaces from user space. As the
      kernel tries to spread out interrupts evenly accross CPUs on x86 to prevent
      vector exhaustion, it can happen that a managed interrupt whose affinity
      mask contains both isolated and housekeeping CPUs is routed to an isolated
      CPU. As a consequence IO submitted on a housekeeping CPU causes interrupts
      on the isolated CPU.
      
      Add a new sub-parameter 'managed_irq' for 'isolcpus' and the corresponding
      logic in the interrupt affinity selection code.
      
      The subparameter indicates to the interrupt affinity selection logic that
      it should try to avoid the above scenario.
      
      This isolation is best effort and only effective if the automatically
      assigned interrupt mask of a device queue contains isolated and
      housekeeping CPUs. If housekeeping CPUs are online then such interrupts are
      directed to the housekeeping CPU so that IO submitted on the housekeeping
      CPU cannot disturb the isolated CPU.
      
      If a queue's affinity mask contains only isolated CPUs then this parameter
      has no effect on the interrupt routing decision, though interrupts are only
      happening when tasks running on those isolated CPUs submit IO. IO submitted
      on housekeeping CPUs has no influence on those queues.
      
      If the affinity mask contains both housekeeping and isolated CPUs, but none
      of the contained housekeeping CPUs is online, then the interrupt is also
      routed to an isolated CPU. Interrupts are only delivered when one of the
      isolated CPUs in the affinity mask submits IO. If one of the contained
      housekeeping CPUs comes online, the CPU hotplug logic migrates the
      interrupt automatically back to the upcoming housekeeping CPU. Depending on
      the type of interrupt controller, this can require that at least one
      interrupt is delivered to the isolated CPU in order to complete the
      migration.
      
      [ tglx: Removed unused parameter, added and edited comments/documentation
        	and rephrased the changelog so it contains more details. ]
      Signed-off-by: NMing Lei <ming.lei@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/20200120091625.17912-1-ming.lei@redhat.com
      11ea68f5
  7. 20 1月, 2020 1 次提交
    • A
      efi/x86: Limit EFI old memory map to SGI UV machines · 1f299fad
      Ard Biesheuvel 提交于
      We carry a quirk in the x86 EFI code to switch back to an older
      method of mapping the EFI runtime services memory regions, because
      it was deemed risky at the time to implement a new method without
      providing a fallback to the old method in case problems arose.
      
      Such problems did arise, but they appear to be limited to SGI UV1
      machines, and so these are the only ones for which the fallback gets
      enabled automatically (via a DMI quirk). The fallback can be enabled
      manually as well, by passing efi=old_map, but there is very little
      evidence that suggests that this is something that is being relied
      upon in the field.
      
      Given that UV1 support is not enabled by default by the distros
      (Ubuntu, Fedora), there is no point in carrying this fallback code
      all the time if there are no other users. So let's move it into the
      UV support code, and document that efi=old_map now requires this
      support code to be enabled.
      
      Note that efi=old_map has been used in the past on other SGI UV
      machines to work around kernel regressions in production, so we
      keep the option to enable it by hand, but only if the kernel was
      built with UV support.
      Signed-off-by: NArd Biesheuvel <ardb@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200113172245.27925-8-ardb@kernel.org
      1f299fad
  8. 11 1月, 2020 1 次提交
    • M
      efi: Allow disabling PCI busmastering on bridges during boot · 4444f854
      Matthew Garrett 提交于
      Add an option to disable the busmaster bit in the control register on
      all PCI bridges before calling ExitBootServices() and passing control
      to the runtime kernel. System firmware may configure the IOMMU to prevent
      malicious PCI devices from being able to attack the OS via DMA. However,
      since firmware can't guarantee that the OS is IOMMU-aware, it will tear
      down IOMMU configuration when ExitBootServices() is called. This leaves
      a window between where a hostile device could still cause damage before
      Linux configures the IOMMU again.
      
      If CONFIG_EFI_DISABLE_PCI_DMA is enabled or "efi=disable_early_pci_dma"
      is passed on the command line, the EFI stub will clear the busmaster bit
      on all PCI bridges before ExitBootServices() is called. This will
      prevent any malicious PCI devices from being able to perform DMA until
      the kernel reenables busmastering after configuring the IOMMU.
      
      This option may cause failures with some poorly behaved hardware and
      should not be enabled without testing. The kernel commandline options
      "efi=disable_early_pci_dma" or "efi=no_disable_early_pci_dma" may be
      used to override the default. Note that PCI devices downstream from PCI
      bridges are disconnected from their drivers first, using the UEFI
      driver model API, so that DMA can be disabled safely at the bridge
      level.
      
      [ardb: disconnect PCI I/O handles first, as suggested by Arvind]
      Co-developed-by: NMatthew Garrett <mjg59@google.com>
      Signed-off-by: NMatthew Garrett <mjg59@google.com>
      Signed-off-by: NArd Biesheuvel <ardb@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Matthew Garrett <matthewgarrett@google.com>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200103113953.9571-18-ardb@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4444f854
  9. 08 1月, 2020 1 次提交
    • S
      Documentation,selinux: fix references to old selinuxfs mount point · d41415eb
      Stephen Smalley 提交于
      selinuxfs was originally mounted on /selinux, and various docs and
      kconfig help texts referred to nodes under it.  In Linux 3.0,
      /sys/fs/selinux was introduced as the preferred mount point for selinuxfs.
      Fix all the old references to /selinux/ to /sys/fs/selinux/.
      While we are there, update the description of the selinux boot parameter
      to reflect the fact that the default value is always 1 since
      commit be6ec88f ("selinux: Remove SECURITY_SELINUX_BOOTPARAM_VALUE")
      and drop discussion of runtime disable since it is deprecated.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      d41415eb
  10. 23 11月, 2019 1 次提交
  11. 19 11月, 2019 1 次提交
    • Y
      ACPI: sysfs: Change ACPI_MASKABLE_GPE_MAX to 0x100 · a7583e72
      Yunfeng Ye 提交于
      The commit 0f27cff8 ("ACPI: sysfs: Make ACPI GPE mask kernel
      parameter cover all GPEs") says:
        "Use a bitmap of size 0xFF instead of a u64 for the GPE mask so 256
         GPEs can be masked"
      
      But the masking of GPE 0xFF it not supported and the check condition
      "gpe > ACPI_MASKABLE_GPE_MAX" is not valid because the type of gpe is
      u8.
      
      So modify the macro ACPI_MASKABLE_GPE_MAX to 0x100, and drop the "gpe >
      ACPI_MASKABLE_GPE_MAX" check. In addition, update the docs "Format" for
      acpi_mask_gpe parameter.
      
      Fixes: 0f27cff8 ("ACPI: sysfs: Make ACPI GPE mask kernel parameter cover all GPEs")
      Signed-off-by: NYunfeng Ye <yeyunfeng@huawei.com>
      [ rjw: Use u16 as gpe data type in acpi_gpe_apply_masked_gpes() ]
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      a7583e72
  12. 18 11月, 2019 1 次提交
  13. 16 11月, 2019 1 次提交
    • W
      x86/speculation: Fix incorrect MDS/TAA mitigation status · 64870ed1
      Waiman Long 提交于
      For MDS vulnerable processors with TSX support, enabling either MDS or
      TAA mitigations will enable the use of VERW to flush internal processor
      buffers at the right code path. IOW, they are either both mitigated
      or both not. However, if the command line options are inconsistent,
      the vulnerabilites sysfs files may not report the mitigation status
      correctly.
      
      For example, with only the "mds=off" option:
      
        vulnerabilities/mds:Vulnerable; SMT vulnerable
        vulnerabilities/tsx_async_abort:Mitigation: Clear CPU buffers; SMT vulnerable
      
      The mds vulnerabilities file has wrong status in this case. Similarly,
      the taa vulnerability file will be wrong with mds mitigation on, but
      taa off.
      
      Change taa_select_mitigation() to sync up the two mitigation status
      and have them turned off if both "mds=off" and "tsx_async_abort=off"
      are present.
      
      Update documentation to emphasize the fact that both "mds=off" and
      "tsx_async_abort=off" have to be specified together for processors that
      are affected by both TAA and MDS to be effective.
      
       [ bp: Massage and add kernel-parameters.txt change too. ]
      
      Fixes: 1b42f017 ("x86/speculation/taa: Add mitigation for TSX Async Abort")
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: linux-doc@vger.kernel.org
      Cc: Mark Gross <mgross@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191115161445.30809-2-longman@redhat.com
      64870ed1
  14. 13 11月, 2019 1 次提交
  15. 07 11月, 2019 2 次提交
    • D
      x86/efi: Add efi_fake_mem support for EFI_MEMORY_SP · 199c8471
      Dan Williams 提交于
      Given that EFI_MEMORY_SP is platform BIOS policy decision for marking
      memory ranges as "reserved for a specific purpose" there will inevitably
      be scenarios where the BIOS omits the attribute in situations where it
      is desired. Unlike other attributes if the OS wants to reserve this
      memory from the kernel the reservation needs to happen early in init. So
      early, in fact, that it needs to happen before e820__memblock_setup()
      which is a pre-requisite for efi_fake_memmap() that wants to allocate
      memory for the updated table.
      
      Introduce an x86 specific efi_fake_memmap_early() that can search for
      attempts to set EFI_MEMORY_SP via efi_fake_mem and update the e820 table
      accordingly.
      
      The KASLR code that scans the command line looking for user-directed
      memory reservations also needs to be updated to consider
      "efi_fake_mem=nn@ss:0x40000" requests.
      Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      199c8471
    • D
      efi: Common enable/disable infrastructure for EFI soft reservation · b617c526
      Dan Williams 提交于
      UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
      interpretation of the EFI Memory Types as "reserved for a specific
      purpose".
      
      The proposed Linux behavior for specific purpose memory is that it is
      reserved for direct-access (device-dax) by default and not available for
      any kernel usage, not even as an OOM fallback.  Later, through udev
      scripts or another init mechanism, these device-dax claimed ranges can
      be reconfigured and hot-added to the available System-RAM with a unique
      node identifier. This device-dax management scheme implements "soft" in
      the "soft reserved" designation by allowing some or all of the
      reservation to be recovered as typical memory. This policy can be
      disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with
      efi=nosoftreserve.
      
      As for this patch, define the common helpers to determine if the
      EFI_MEMORY_SP attribute should be honored. The determination needs to be
      made early to prevent the kernel from being loaded into soft-reserved
      memory, or otherwise allowing early allocations to land there. Follow-on
      changes are needed per architecture to leverage these helpers in their
      respective mem-init paths.
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      b617c526
  16. 05 11月, 2019 1 次提交
  17. 04 11月, 2019 1 次提交
    • P
      kvm: mmu: ITLB_MULTIHIT mitigation · b8e8c830
      Paolo Bonzini 提交于
      With some Intel processors, putting the same virtual address in the TLB
      as both a 4 KiB and 2 MiB page can confuse the instruction fetch unit
      and cause the processor to issue a machine check resulting in a CPU lockup.
      
      Unfortunately when EPT page tables use huge pages, it is possible for a
      malicious guest to cause this situation.
      
      Add a knob to mark huge pages as non-executable. When the nx_huge_pages
      parameter is enabled (and we are using EPT), all huge pages are marked as
      NX. If the guest attempts to execute in one of those pages, the page is
      broken down into 4K pages, which are then marked executable.
      
      This is not an issue for shadow paging (except nested EPT), because then
      the host is in control of TLB flushes and the problematic situation cannot
      happen.  With nested EPT, again the nested guest can cause problems shadow
      and direct EPT is treated in the same way.
      
      [ tglx: Fixup default to auto and massage wording a bit ]
      Originally-by: NJunaid Shahid <junaids@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b8e8c830
  18. 28 10月, 2019 3 次提交
  19. 26 10月, 2019 1 次提交
  20. 23 10月, 2019 1 次提交
  21. 22 10月, 2019 1 次提交
    • S
      arm64: Retrieve stolen time as paravirtualized guest · e0685fa2
      Steven Price 提交于
      Enable paravirtualization features when running under a hypervisor
      supporting the PV_TIME_ST hypercall.
      
      For each (v)CPU, we ask the hypervisor for the location of a shared
      page which the hypervisor will use to report stolen time to us. We set
      pv_time_ops to the stolen time function which simply reads the stolen
      value from the shared page for a VCPU. We guarantee single-copy
      atomicity using READ_ONCE which means we can also read the stolen
      time for another VCPU than the currently running one while it is
      potentially being updated by the hypervisor.
      Signed-off-by: NSteven Price <steven.price@arm.com>
      Signed-off-by: NMarc Zyngier <maz@kernel.org>
      e0685fa2
  22. 16 10月, 2019 1 次提交
  23. 11 10月, 2019 1 次提交
  24. 08 10月, 2019 1 次提交
    • B
      x86/xen: Return from panic notifier · c6875f3a
      Boris Ostrovsky 提交于
      Currently execution of panic() continues until Xen's panic notifier
      (xen_panic_event()) is called at which point we make a hypercall that
      never returns.
      
      This means that any notifier that is supposed to be called later as
      well as significant part of panic() code (such as pstore writes from
      kmsg_dump()) is never executed.
      
      There is no reason for xen_panic_event() to be this last point in
      execution since panic()'s emergency_restart() will call into
      xen_emergency_restart() from where we can perform our hypercall.
      
      Nevertheless, we will provide xen_legacy_crash boot option that will
      preserve original behavior during crash. This option could be used,
      for example, if running kernel dumper (which happens after panic
      notifiers) is undesirable.
      Reported-by: NJames Dingwall <james@dingwall.me.uk>
      Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      c6875f3a
  25. 04 10月, 2019 1 次提交
    • S
      of: property: Add functional dependency link from DT bindings · a3e1d1a7
      Saravana Kannan 提交于
      Add device links after the devices are created (but before they are
      probed) by looking at common DT bindings like clocks and
      interconnects.
      
      Automatically adding device links for functional dependencies at the
      framework level provides the following benefits:
      
      - Optimizes device probe order and avoids the useless work of
        attempting probes of devices that will not probe successfully
        (because their suppliers aren't present or haven't probed yet).
      
        For example, in a commonly available mobile SoC, registering just
        one consumer device's driver at an initcall level earlier than the
        supplier device's driver causes 11 failed probe attempts before the
        consumer device probes successfully. This was with a kernel with all
        the drivers statically compiled in. This problem gets a lot worse if
        all the drivers are loaded as modules without direct symbol
        dependencies.
      
      - Supplier devices like clock providers, interconnect providers, etc
        need to keep the resources they provide active and at a particular
        state(s) during boot up even if their current set of consumers don't
        request the resource to be active. This is because the rest of the
        consumers might not have probed yet and turning off the resource
        before all the consumers have probed could lead to a hang or
        undesired user experience.
      
        Some frameworks (Eg: regulator) handle this today by turning off
        "unused" resources at late_initcall_sync and hoping all the devices
        have probed by then. This is not a valid assumption for systems with
        loadable modules. Other frameworks (Eg: clock) just don't handle
        this due to the lack of a clear signal for when they can turn off
        resources. This leads to downstream hacks to handle cases like this
        that can easily be solved in the upstream kernel.
      
        By linking devices before they are probed, we give suppliers a clear
        count of the number of dependent consumers. Once all of the
        consumers are active, the suppliers can turn off the unused
        resources without making assumptions about the number of consumers.
      
      By default we just add device-links to track "driver presence" (probe
      succeeded) of the supplier device. If any other functionality provided
      by device-links are needed, it is left to the consumer/supplier
      devices to change the link when they probe.
      
      kbuild test robot reported clang error about missing const
      Reported-by: Nkbuild test robot <lkp@intel.com>
      Signed-off-by: NSaravana Kannan <saravanak@google.com>
      Link: https://lore.kernel.org/r/20190904211126.47518-4-saravanak@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3e1d1a7
  26. 01 10月, 2019 1 次提交
  27. 25 9月, 2019 1 次提交
    • V
      mm, page_owner, debug_pagealloc: save and dump freeing stack trace · 8974558f
      Vlastimil Babka 提交于
      The debug_pagealloc functionality is useful to catch buggy page allocator
      users that cause e.g.  use after free or double free.  When page
      inconsistency is detected, debugging is often simpler by knowing the call
      stack of process that last allocated and freed the page.  When page_owner
      is also enabled, we record the allocation stack trace, but not freeing.
      
      This patch therefore adds recording of freeing process stack trace to page
      owner info, if both page_owner and debug_pagealloc are configured and
      enabled.  With only page_owner enabled, this info is not useful for the
      memory leak debugging use case.  dump_page() is adjusted to print the
      info.  An example result of calling __free_pages() twice may look like
      this (note the page last free stack trace):
      
      BUG: Bad page state in process bash  pfn:13d8f8
      page:ffffc31984f63e00 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
      flags: 0x1affff800000000()
      raw: 01affff800000000 dead000000000100 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
      page dumped because: nonzero _refcount
      page_owner tracks the page as freed
      page last allocated via order 0, migratetype Unmovable, gfp_mask 0xcc0(GFP_KERNEL)
       prep_new_page+0x143/0x150
       get_page_from_freelist+0x289/0x380
       __alloc_pages_nodemask+0x13c/0x2d0
       khugepaged+0x6e/0xc10
       kthread+0xf9/0x130
       ret_from_fork+0x3a/0x50
      page last free stack trace:
       free_pcp_prepare+0x134/0x1e0
       free_unref_page+0x18/0x90
       khugepaged+0x7b/0xc10
       kthread+0xf9/0x130
       ret_from_fork+0x3a/0x50
      Modules linked in:
      CPU: 3 PID: 271 Comm: bash Not tainted 5.3.0-rc4-2.g07a1a73-default+ #57
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
      Call Trace:
       dump_stack+0x85/0xc0
       bad_page.cold+0xba/0xbf
       rmqueue_pcplist.isra.0+0x6c5/0x6d0
       rmqueue+0x2d/0x810
       get_page_from_freelist+0x191/0x380
       __alloc_pages_nodemask+0x13c/0x2d0
       __get_free_pages+0xd/0x30
       __pud_alloc+0x2c/0x110
       copy_page_range+0x4f9/0x630
       dup_mmap+0x362/0x480
       dup_mm+0x68/0x110
       copy_process+0x19e1/0x1b40
       _do_fork+0x73/0x310
       __x64_sys_clone+0x75/0x80
       do_syscall_64+0x6e/0x1e0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x7f10af854a10
      ...
      
      Link: http://lkml.kernel.org/r/20190820131828.22684-5-vbabka@suse.czSigned-off-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8974558f
  28. 14 9月, 2019 1 次提交
  29. 11 9月, 2019 1 次提交
  30. 05 9月, 2019 1 次提交
    • N
      powerpc/64s/radix: introduce options to disable use of the tlbie instruction · 2275d7b5
      Nicholas Piggin 提交于
      Introduce two options to control the use of the tlbie instruction. A
      boot time option which completely disables the kernel using the
      instruction, this is currently incompatible with HASH MMU, KVM, and
      coherent accelerators.
      
      And a debugfs option can be switched at runtime and avoids using tlbie
      for invalidating CPU TLBs for normal process and kernel address
      mappings. Coherent accelerators are still managed with tlbie, as will
      KVM partition scope translations.
      
      Cross-CPU TLB flushing is implemented with IPIs and tlbiel. This is a
      basic implementation which does not attempt to make any optimisation
      beyond the tlbie implementation.
      
      This is useful for performance testing among other things. For example
      in certain situations on large systems, using IPIs may be faster than
      tlbie as they can be directed rather than broadcast. Later we may also
      take advantage of the IPIs to do more interesting things such as trim
      the mm cpumask more aggressively.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190902152931.17840-7-npiggin@gmail.com
      2275d7b5
  31. 04 9月, 2019 1 次提交
  32. 03 9月, 2019 1 次提交
  33. 30 8月, 2019 1 次提交
  34. 28 8月, 2019 1 次提交
  35. 23 8月, 2019 1 次提交
  36. 22 8月, 2019 1 次提交