1. 13 January 2014 (4 commits)
    • sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs · 20d1c86a
      Committed by Peter Zijlstra
      Use a ring-buffer like multi-version object structure which allows
      always having a coherent object; we use this to avoid having to
      disable IRQs while reading sched_clock() and avoids a problem when
      getting an NMI while changing the cyc2ns data.
      
                              MAINLINE   PRE        POST
      
          sched_clock_stable: 1          1          1
          (cold) sched_clock: 329841     331312     257223
          (cold) local_clock: 301773     310296     309889
          (warm) sched_clock: 38375      38247      25280
          (warm) local_clock: 100371     102713     85268
          (warm) rdtsc:       27340      27289      24247
          sched_clock_stable: 0          0          0
          (cold) sched_clock: 382634     372706     301224
          (cold) local_clock: 396890     399275     399870
          (warm) sched_clock: 38194      38124      25630
          (warm) local_clock: 143452     148698     129629
          (warm) rdtsc:       27345      27365      24307
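
      The multi-version structure is essentially the seqcount "latch"
      pattern: two copies of the cyc2ns data plus a sequence counter, so a
      reader (even one arriving via NMI) always finds a coherent copy
      without disabling IRQs. A minimal sketch of the read side, using
      hypothetical type and helper names rather than the exact mainline
      code:

        struct cyc2ns_data {
                u32 cyc2ns_mul;
                u32 cyc2ns_shift;
                u64 cyc2ns_offset;
        };

        struct cyc2ns_latch {
                unsigned int seq;               /* changes around each update */
                struct cyc2ns_data data[2];     /* writer alternates slots */
        };

        static u64 cyc2ns_read(struct cyc2ns_latch *l, u64 cyc)
        {
                unsigned int seq;
                struct cyc2ns_data *d;
                u64 ns;

                do {
                        seq = READ_ONCE(l->seq);
                        smp_rmb();
                        d = &l->data[seq & 1];          /* stable slot */
                        ns = mul_u64_u32_shr(cyc, d->cyc2ns_mul,
                                             d->cyc2ns_shift) +
                             d->cyc2ns_offset;
                        smp_rmb();
                } while (seq != READ_ONCE(l->seq));

                return ns;
        }

      The writer increments seq around each slot update, so a reader that
      raced with an update notices the change and simply retries; neither
      side ever disables IRQs.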
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-s567in1e5ekq2nlyhn8f987r@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      20d1c86a
    • sched/clock, x86: Move some cyc2ns() code around · 57c67da2
      Committed by Peter Zijlstra
      There are no __cycles_2_ns() users outside of arch/x86/kernel/tsc.c,
      so move it there.
      
      There are no cycles_2_ns() users.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-01lslnavfgo3kmbo4532zlcj@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      57c67da2
    • sched/clock, x86: Use mul_u64_u32_shr() for native_sched_clock() · 5dd12c21
      Committed by Peter Zijlstra
      Use mul_u64_u32_shr() so that x86_64 can use a single 64x64->128 mul.
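
      For reference, the generic fallback computes the same result with two
      32x64 multiplies by splitting the 64-bit cycle count; roughly (a
      sketch of the idea, not necessarily the exact math64.h code):

        static inline u64 mul_u64_u32_shr(u64 a, u32 mul, unsigned int shift)
        {
                u32 ah = a >> 32, al = a;
                u64 ret;

                ret = ((u64)al * mul) >> shift;
                if (ah)
                        ret += ((u64)ah * mul) << (32 - shift);
                return ret;
        }

      On x86_64 the helper instead compiles down to one 64x64->128 mul
      followed by a shift, which is what the "After" disassembly below
      shows.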
      
      Before:
      
      0000000000000560 <native_sched_clock>:
       560:   44 8b 1d 00 00 00 00    mov    0x0(%rip),%r11d        # 567 <native_sched_clock+0x7>
       567:   55                      push   %rbp
       568:   48 89 e5                mov    %rsp,%rbp
       56b:   45 85 db                test   %r11d,%r11d
       56e:   75 4f                   jne    5bf <native_sched_clock+0x5f>
       570:   0f 31                   rdtsc
       572:   89 c0                   mov    %eax,%eax
       574:   48 c1 e2 20             shl    $0x20,%rdx
       578:   48 c7 c1 00 00 00 00    mov    $0x0,%rcx
       57f:   48 09 c2                or     %rax,%rdx
       582:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
       589:   65 8b 04 25 00 00 00    mov    %gs:0x0,%eax
       590:   00
       591:   48 98                   cltq
       593:   48 8b 34 c5 00 00 00    mov    0x0(,%rax,8),%rsi
       59a:   00
       59b:   48 89 d0                mov    %rdx,%rax
       59e:   81 e2 ff 03 00 00       and    $0x3ff,%edx
       5a4:   48 c1 e8 0a             shr    $0xa,%rax
       5a8:   48 0f af 14 0e          imul   (%rsi,%rcx,1),%rdx
       5ad:   48 0f af 04 0e          imul   (%rsi,%rcx,1),%rax
       5b2:   5d                      pop    %rbp
       5b3:   48 03 04 3e             add    (%rsi,%rdi,1),%rax
       5b7:   48 c1 ea 0a             shr    $0xa,%rdx
       5bb:   48 01 d0                add    %rdx,%rax
       5be:   c3                      retq
      
      After:
      
      0000000000000550 <native_sched_clock>:
       550:   8b 3d 00 00 00 00       mov    0x0(%rip),%edi        # 556 <native_sched_clock+0x6>
       556:   55                      push   %rbp
       557:   48 89 e5                mov    %rsp,%rbp
       55a:   48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
       55e:   85 ff                   test   %edi,%edi
       560:   75 2c                   jne    58e <native_sched_clock+0x3e>
       562:   0f 31                   rdtsc
       564:   89 c0                   mov    %eax,%eax
       566:   48 c1 e2 20             shl    $0x20,%rdx
       56a:   48 09 c2                or     %rax,%rdx
       56d:   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
       574:   00 00
       576:   89 c0                   mov    %eax,%eax
       578:   48 f7 e2                mul    %rdx
       57b:   65 48 8b 0c 25 00 00    mov    %gs:0x0,%rcx
       582:   00 00
       584:   c9                      leaveq
       585:   48 0f ac d0 0a          shrd   $0xa,%rdx,%rax
       58a:   48 01 c8                add    %rcx,%rax
       58d:   c3                      retq
      
                              MAINLINE   POST
      
    sched_clock_stable: 1          1
          (cold) sched_clock: 329841     331312
          (cold) local_clock: 301773     310296
          (warm) sched_clock: 38375      38247
          (warm) local_clock: 100371     102713
          (warm) rdtsc:       27340      27289
          sched_clock_stable: 0          0
          (cold) sched_clock: 382634     372706
          (cold) local_clock: 396890     399275
          (warm) sched_clock: 38194      38124
          (warm) local_clock: 143452     148698
          (warm) rdtsc:       27345      27365
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-piu203ses5y1g36bnyw2n16x@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5dd12c21
    • sched: Add new scheduler syscalls to support an extended scheduling parameters ABI · d50dde5a
      Committed by Dario Faggioli
      Add the syscalls needed for supporting scheduling algorithms
      with extended scheduling parameters (e.g., SCHED_DEADLINE).
      
      In general, these make it possible to specify a periodic/sporadic
      task that executes for a given amount of runtime at each instance,
      and is scheduled according to the urgency of its own timing
      constraints, i.e.:
      
       - a (maximum/typical) instance execution time,
       - a minimum interval between consecutive instances,
       - a time constraint by which each instance must be completed.
      
      Thus, both the data structure that holds the scheduling parameters of
      the tasks and the system calls dealing with it must be extended.
      Unfortunately, modifying the existing struct sched_param would break
      the ABI and result in potentially serious compatibility issues with
      legacy binaries.
      
      For these reasons, this patch:
      
       - defines the new struct sched_attr, containing all the fields
         that are necessary for specifying a task in the computational
         model described above;
      
       - defines and implements the new scheduling related syscalls that
         manipulate it, i.e., sched_setattr() and sched_getattr().
      
      Syscalls are introduced for x86 (32 and 64 bits) and ARM only, as a
      proof of concept and for developing and testing purposes. Making them
      available on other architectures is straightforward.
      
      Since no "user" for these new parameters is introduced in this patch,
      the implementation of the new system calls is just identical to their
      already existing counterpart. Future patches that implement scheduling
      policies able to exploit the new data structure must also take care of
      modifying the sched_*attr() calls accordingly with their own purposes.
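
      A hypothetical userspace sketch of the new interface (field names as
      introduced by this patch; the syscall number macro and the exact
      argument list may vary by architecture and kernel version):

        #include <stdint.h>
        #include <unistd.h>
        #include <sys/syscall.h>

        struct sched_attr {
                uint32_t size;            /* sizeof(struct sched_attr), for ABI extension */
                uint32_t sched_policy;
                uint64_t sched_flags;
                int32_t  sched_nice;      /* SCHED_NORMAL, SCHED_BATCH */
                uint32_t sched_priority;  /* SCHED_FIFO, SCHED_RR */
                uint64_t sched_runtime;   /* remaining fields are for deadline- */
                uint64_t sched_deadline;  /* style policies such as             */
                uint64_t sched_period;    /* SCHED_DEADLINE                     */
        };

        int set_self_attr(struct sched_attr *attr)
        {
                attr->size = sizeof(*attr);
                /* __NR_sched_setattr is assumed to be defined for this arch */
                return syscall(__NR_sched_setattr, 0 /* self */, attr, 0);
        }

      The size field is what keeps the ABI extensible: the kernel can
      accept both older (smaller) and newer (larger) versions of the
      structure from userspace.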
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      [ Rewrote to use sched_attr. ]
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      [ Removed sched_setscheduler2() for now. ]
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-3-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d50dde5a
  2. 02 January 2014 (1 commit)
    • KVM: nVMX: Unconditionally uninit the MMU on nested vmexit · 29bf08f1
      Committed by Jan Kiszka
      Three reasons for doing this: 1. arch.walk_mmu points to arch.mmu
      anyway when nested EPT is not in use. 2. this aligns VMX with SVM.
      But 3. is most important: nested_cpu_has_ept(vmcs12) queries the VMCS
      page, and if one guest VCPU manipulates the page of another VCPU in
      L2, we may be fooled into skipping nested_ept_uninit_mmu_context,
      leaving the MMU in nested state. That can crash the host later on if
      nested_ept_get_cr3 is invoked after L1 has already left vmxon and
      nested.current_vmcs12 has therefore become NULL.
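
      The shape of the fix, as the log describes it, is to stop consulting
      the guest-controlled VMCS page on the vmexit path (a sketch, not the
      exact diff):

        /* in nested_vmx_vmexit(): uninit unconditionally instead of
         * guarding it with the guest-influenced nested_cpu_has_ept() */
        nested_ept_uninit_mmu_context(vcpu);

      Uninitializing is harmless when nested EPT was never set up (reason 1
      above), and it guarantees the MMU cannot be left in nested state
      after vmexit.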
      
      Cc: stable@kernel.org
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      29bf08f1
  3. 31 December 2013 (1 commit)
  4. 20 December 2013 (1 commit)
  5. 19 December 2013 (2 commits)
    • mm: fix TLB flush race between migration, and change_protection_range · 20841405
      Committed by Rik van Riel
      There are a few subtle races, between change_protection_range (used by
      mprotect and change_prot_numa) on one side, and NUMA page migration and
      compaction on the other side.
      
      The basic race is that there is a time window between when the PTE gets
      made non-present (PROT_NONE or NUMA), and the TLB is flushed.
      
      During that time, a CPU may continue writing to the page.
      
      This is fine most of the time, however compaction or the NUMA migration
      code may come in, and migrate the page away.
      
      When that happens, the CPU may continue writing, through the cached
      translation, to what is no longer the current memory location of the
      process.
      
      This only affects x86, which has a somewhat optimistic pte_accessible.
      All other architectures appear to be safe, and will either always flush,
      or flush whenever there is a valid mapping, even with no permissions
      (SPARC).
      
      The basic race looks like this:
      
      CPU A			CPU B			CPU C
      
      						load TLB entry
      make entry PTE/PMD_NUMA
      			fault on entry
      						read/write old page
      			start migrating page
      			change PTE/PMD to new page
      						read/write old page [*]
      flush TLB
      						reload TLB from new entry
      						read/write new page
      						lose data
      
      [*] the old page may belong to a new user at this point!
      
      The obvious fix is to flush remote TLB entries, by making
      pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory
      may still be accessible if there is a TLB flush pending for the mm.
      
      This should fix both NUMA migration and compaction.
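
      Concretely, the idea is to make x86's pte_accessible() return true
      for a PROT_NONE/NUMA pte while a flush is pending (a sketch of the
      idea, assuming an mm_tlb_flush_pending() helper):

        static inline bool pte_accessible(struct mm_struct *mm, pte_t a)
        {
                if (pte_flags(a) & _PAGE_PRESENT)
                        return true;

                /* non-present in the page tables, but possibly still
                 * live in a remote TLB until the pending flush runs */
                if ((pte_flags(a) & (_PAGE_PROTNONE | _PAGE_NUMA)) &&
                    mm_tlb_flush_pending(mm))
                        return true;

                return false;
        }

      With that, the migration side treats such PTEs as accessible and
      issues the remote TLB flush before moving the page.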
      
      [mgorman@suse.de: fix build]
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      20841405
    • mm: numa: serialise parallel get_user_page against THP migration · 2b4847e7
      Committed by Mel Gorman
      Base pages are unmapped and flushed from cache and TLB during normal
      page migration and replaced with a migration entry that causes any
      parallel NUMA hinting fault or gup to block until migration completes.
      
      THP does not unmap pages, due to a lack of support for migration
      entries at the PMD level.  This allows races with get_user_pages and
      get_user_pages_fast, which commit 3f926ab9 ("mm: Close races between
      THP migration and PMD numa clearing") made worse by introducing a
      pmd_clear_flush().
      
      This patch forces get_user_page (fast and normal) on a pmd_numa page to
      go through the slow get_user_page path where it will serialise against
      THP migration and properly account for the NUMA hinting fault.  On the
      migration side the page table lock is taken for each PTE update.
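
      On the fast-gup side this amounts to treating a pmd_numa PMD like a
      not-present one, so the caller drops back to the slow path (a sketch
      of the idea, assumed shape of the x86 gup_pmd_range() walk):

        if (pmd_none(pmd) || pmd_trans_splitting(pmd) || pmd_numa(pmd))
                return 0;   /* 0 makes get_user_pages_fast() fall back to
                             * the slow, lock-taking get_user_pages() */

      The slow path then serialises on the page table lock that the
      migration code holds while it updates the PMD.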
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2b4847e7
  6. 13 December 2013 (3 commits)
    • KVM: x86: fix guest-initiated crash with x2apic (CVE-2013-6376) · 17d68b76
      Committed by Gleb Natapov
      A guest can cause a BUG_ON() leading to a host kernel crash.
      When the guest writes to the ICR to request an IPI while in x2apic
      mode, the following things happen: the destination is read from
      ICR2, which is a register that the guest can control.
      
      kvm_irq_delivery_to_apic_fast uses the high 16 bits of ICR2 as the
      cluster id.  A BUG_ON is triggered; it protects against accessing
      map->logical_map out of bounds, and so prevents anything really
      unsafe from occurring.
      
      The logic in the code is correct from a real-HW point of view. The
      problem is that KVM supports only one cluster with ID 0 in clustered
      mode, but the buggy code does not take this into account.
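
      The hardening follows directly from that observation: an out-of-range
      cluster id supplied by the guest should be dropped rather than hit a
      BUG_ON() (a sketch of the idea, not the exact patch):

        /* in kvm_irq_delivery_to_apic_fast() (assumed shape) */
        u16 cid = apic_cluster_id(map, mda);

        if (cid >= ARRAY_SIZE(map->logical_map))
                goto out;   /* bogus guest-supplied destination: ignore */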
      Reported-by: Lars Bull <larsbull@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      17d68b76
    • KVM: x86: Convert vapic synchronization to _cached functions (CVE-2013-6368) · fda4e2e8
      Committed by Andy Honig
      In kvm_lapic_sync_from_vapic and kvm_lapic_sync_to_vapic there is the
      potential to corrupt kernel memory if userspace provides an address
      that is at the end of a page.  This patch converts those functions to
      use kvm_write_guest_cached and kvm_read_guest_cached.  It also checks
      the vapic_address specified by userspace during ioctl processing and
      returns an error to userspace if the address is not a valid GPA.
      
      This is generally not guest triggerable, because the required write is
      done by firmware that runs before the guest.  Also, it only affects AMD
      processors and oldish Intel that do not have the FlexPriority feature
      (unless you disable FlexPriority, of course; then newer processors are
      also affected).
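
      The ioctl-time check pairs naturally with the cached accessors: the
      cache can only be initialized against a valid GPA, so a bad address
      is rejected up front (a sketch of the idea, assuming the generic
      gfn_to_hva cache API and a vapic_cache field on the lapic):

        /* while handling KVM_SET_VAPIC_ADDR (assumed shape) */
        if (kvm_gfn_to_hva_cache_init(vcpu->kvm,
                                      &vcpu->arch.apic->vapic_cache,
                                      vapic_addr, sizeof(u32)))
                return -EINVAL;

      Subsequent syncs then go through kvm_read_guest_cached() and
      kvm_write_guest_cached(), which handle page boundaries safely instead
      of writing past the end of a mapped page.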
      
      Fixes: b93463aa ('KVM: Accelerated apic support')
      Reported-by: Andrew Honig <ahonig@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      fda4e2e8
    • KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367) · b963a22e
      Committed by Andy Honig
      Under guest-controllable circumstances apic_get_tmcct will execute a
      divide by zero and cause a crash.  If the guest CPUID supports TSC
      deadline timers and the guest performs the following sequence of
      requests, the host will crash:
      - Set the mode to periodic
      - Set the TMICT to 0
      - Set the mode bits to 11 (neither periodic, nor one shot, nor tsc deadline)
      - Set the TMICT to non-zero.
      The lapic_timer.period will then be 0, but the TMICT will not be.  If
      the guest then reads from the TMCCT, the host will perform a divide
      by 0.
      
      This patch ensures that if the lapic_timer.period is 0, then the division
      does not occur.
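
      The guard is a one-liner early in apic_get_tmcct() (a sketch, not the
      exact patch):

        /* the TMCCT counts down from the TMICT; with no timer period
         * there is nothing to scale the remaining time against, so
         * report 0 instead of dividing by zero */
        if (apic->lapic_timer.period == 0)
                return 0;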
      Reported-by: Andrew Honig <ahonig@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b963a22e
  7. 11 December 2013 (3 commits)
  8. 10 December 2013 (2 commits)
  9. 05 December 2013 (3 commits)
  10. 04 December 2013 (1 commit)
  11. 29 November 2013 (1 commit)
  12. 22 November 2013 (1 commit)
  13. 21 November 2013 (1 commit)
  14. 20 November 2013 (1 commit)
  15. 19 November 2013 (1 commit)
  16. 17 November 2013 (3 commits)
    • um/vdso: add .gitignore for a couple of targets · b13a9bfc
      Committed by Ramkumar Ramachandra
      Cc: Richard Weinberger <richard@nod.at>
      Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      b13a9bfc
    • arch/um: make it work with defconfig and x86_64 · e40f04d0
      Committed by Ramkumar Ramachandra
      arch/um/defconfig only lists one default configuration, and that applies
      only to the i386 architecture.  Replace it with two minimal
      configuration files generated using `make savedefconfig`:
      
        i386_defconfig and x86_64_defconfig
      
      The build scripts now require two updates:
      
      1. um's Kconfig (arch/x86/um/Kconfig) should specify an ARCH_DEFCONFIG
         section explicitly pointing to these configuration files if the
         required variables are set.  Take care to remove the DEFCONFIG_LIST
         section defined in the included file arch/um/Kconfig.common.
      
      2. um's Makefile (arch/um/Makefile) should set KBUILD_DEFCONFIG properly
         for the top-level Makefile to pick up.  Copy the logic in
         arch/x86/Makefile to properly pick the defconfig file depending on
         the actual architecture; except we're working with $SUBARCH here,
         instead of $ARCH.
      
      Now, you can do:
      
        $ ARCH=um make defconfig
        $ ARCH=um make
      
      and successfully build User-Mode Linux on an x86_64 box in default
      configuration.
      
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      e40f04d0
    • um: Rewrite show_stack() · 9d1ee8ce
      Committed by Richard Weinberger
      Currently, stack traces on UML are not very reliable, and x86 and
      x86_64 each have their own implementation.
      This patch unifies the two and adds support for marking unreliable
      function calls.
      Signed-off-by: Richard Weinberger <richard@nod.at>
      9d1ee8ce
  17. 15 November 2013 (8 commits)
    • x86: Export 'boot_cpu_physical_apicid' to modules · cc08e04c
      Committed by David Rientjes
      Commit 9ebddac7 "ACPI, x86: Fix extended error log driver to depend on
      CONFIG_X86_LOCAL_APIC" fixed a build error when CONFIG_X86_LOCAL_APIC was not
      selected and !CONFIG_SMP.
      
      However, since CONFIG_ACPI_EXTLOG is tristate, there is a second build error:
      
        ERROR: "boot_cpu_physical_apicid" [drivers/acpi/acpi_extlog.ko] undefined!
      
      The symbol needs to be exported for it to be available.
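
      The fix itself is the one-line export next to the symbol's
      definition (made _GPL per the note below):

        EXPORT_SYMBOL_GPL(boot_cpu_physical_apicid);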
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Chen Gong <gong.chen@linux.intel.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1311141504080.30112@chino.kir.corp.google.com
      [ Changed it to a _GPL() export. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cc08e04c
    • kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS · 0a06ff06
      Committed by Christoph Hellwig
      We've switched every architecture that supports SMP over to it, so
      remove the now-useless config variable.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0a06ff06
    • mm: dynamically allocate page->ptl if it cannot be embedded to struct page · 49076ec2
      Committed by Kirill A. Shutemov
      If the split page table lock is in use, we embed the lock into the
      struct page of the table's page.  We have to disable the split lock
      if spinlock_t is too big to be embedded, as when DEBUG_SPINLOCK or
      DEBUG_LOCK_ALLOC is enabled.
      
      This patch adds support for dynamically allocating the split page
      table lock if we can't embed it in struct page.
      
      page->ptl is unsigned long now and we use it as a spinlock_t if
      sizeof(spinlock_t) <= sizeof(long); otherwise it's a pointer to a
      spinlock_t.
      
      The spinlock_t is allocated in pgtable_page_ctor() for PTE tables and
      in pgtable_pmd_page_ctor() for PMD tables.  All other helpers are
      converted to support the dynamically allocated page->ptl.
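
      A minimal sketch of the resulting accessor, assuming a helper name
      rather than the exact mainline code:

        static inline spinlock_t *ptlock_ptr(struct page *page)
        {
                /* the condition is a compile-time constant, so one branch
                 * folds away entirely */
                if (sizeof(spinlock_t) <= sizeof(page->ptl))
                        return (spinlock_t *)&page->ptl;    /* embedded */
                return (spinlock_t *)page->ptl;             /* allocated */
        }

      Callers always go through the accessor, so whether the lock lives in
      struct page or in a separately allocated object is invisible to them.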
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      49076ec2
    • x86: handle pgtable_page_ctor() fail · cecbd1b5
      Committed by Kirill A. Shutemov
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cecbd1b5
    • x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds · 09ef4939
      Committed by Kirill A. Shutemov
      In the split page table lock case, we embed spinlock_t into struct
      page.  For obvious reasons, we don't want to increase the size of
      struct page if spinlock_t is too big, as with DEBUG_SPINLOCK or
      DEBUG_LOCK_ALLOC or on an -rt kernel.  So we disable the split page
      table lock if spinlock_t is too big.
      
      This patchset allows the lock to be allocated dynamically if
      spinlock_t is big.  In that case page->ptl is used to store a pointer
      to the spinlock instead of the spinlock itself.  It costs an
      additional cache line for the indirect access, but fixes page fault
      scalability for multi-threaded applications.
      
      LOCK_STAT depends on DEBUG_SPINLOCK, so on current kernels enabling
      LOCK_STAT to analyse scalability issues breaks scalability.  ;)
      
      The patchset mostly fixes this.  Results for ./thp_memscale -c 80 -b 512M
      on 4-socket machine:
      
      baseline, no CONFIG_LOCK_STAT:	9.115460703 seconds time elapsed
      baseline, CONFIG_LOCK_STAT=y:	53.890567123 seconds time elapsed
      patched, no CONFIG_LOCK_STAT:	8.852250368 seconds time elapsed
      patched, CONFIG_LOCK_STAT=y:	11.069770759 seconds time elapsed
      
      The patch count is scary, but most of them are trivial. Overview:
      
       Patches 1-4	A few bug fixes. No dependencies on other patches.
      		Probably should be applied as soon as possible.
      
       Patch 5	Changes the signature of pgtable_page_ctor(). We will
      		use it for dynamic lock allocation, so it can fail.
      
       Patches 6-8	Add missing constructor/destructor calls on a few
      		archs.  This fixes NR_PAGETABLE accounting and prepares
      		for using the split ptl.
      
       Patches 9-33	Add pgtable_page_ctor() fail handling to all archs.
      
       Patch 34	Finally adds support for a dynamically allocated
      		page->ptl.  Also contains documentation for the split
      		page table lock.
      
      This patch (of 34):
      
      I'd missed that we preallocate a few pmds in pgd_alloc() if X86_PAE
      is enabled.  Let's add the missed constructor/destructor calls.
      
      I hadn't noticed it during testing since prep_new_page() clears
      page->mapping and therefore page->ptl.  That's effectively equal to
      spin_lock_init(&page->ptl).
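
      The shape of the fix in preallocate_pmds() (a sketch, assuming the
      bool-returning constructor introduced earlier in the series):

        pmd_t *pmd = (pmd_t *)__get_free_page(PGALLOC_GFP);

        if (pmd && !pgtable_pmd_page_ctor(virt_to_page(pmd))) {
                free_page((unsigned long)pmd);
                pmd = NULL;   /* treat ctor failure as allocation failure */
        }

      with the matching pgtable_pmd_page_dtor() call added on the free
      path.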
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09ef4939
    • x86, mm: enable split page table lock for PMD level · 9491846f
      Committed by Kirill A. Shutemov
      Enable PMD split page table lock for X86_64 and PAE.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: Alex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9491846f
    • mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS · 57c1ffce
      Committed by Kirill A. Shutemov
      We're going to introduce a split page table lock for the PMD level.
      Let's rename the existing split ptlock for the PTE level to avoid
      confusion.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: Alex Thorlton <athorlton@sgi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robin Holt <robinmholt@gmail.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      57c1ffce
    • ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node · 7b199811
      Committed by Rafael J. Wysocki
      Modify struct acpi_dev_node to contain a pointer to struct acpi_device
      associated with the given device object (that is, its ACPI companion
      device) instead of an ACPI handle corresponding to it.  Introduce two
      new macros for manipulating that pointer in a CONFIG_ACPI-safe way,
      ACPI_COMPANION() and ACPI_COMPANION_SET(), and rework the
      ACPI_HANDLE() macro to take the above changes into account.
      Drop the ACPI_HANDLE_SET() macro entirely and rework its users to
      use ACPI_COMPANION_SET() instead.  For those that used to pass the
      result of acpi_get_child() directly to ACPI_HANDLE_SET(), introduce
      a helper routine, acpi_preset_companion(), that does the equivalent.
      
      The main motivation for doing this is that there are things
      represented by struct acpi_device objects that don't have valid
      ACPI handles (so called fixed ACPI hardware features, such as
      power and sleep buttons) and we would like to create platform
      device objects for them and "glue" them to their ACPI companions
      in the usual way (which currently is impossible due to the
      lack of valid ACPI handles).  However, there are more reasons
      why it may be useful.
      
      First, struct acpi_device pointers allow much better type checking
      than the void pointers which ACPI handles are, so it should be more
      difficult to write buggy code using the modified struct acpi_dev_node
      and the new macros.  Second, the change should help to reduce (over
      time) the number of places in which the result of ACPI_HANDLE() is
      passed to acpi_bus_get_device() in order to obtain a pointer to the
      struct acpi_device associated with the given "physical" device,
      because now that pointer is returned by ACPI_COMPANION() directly.
      Finally, the change should make it easier to write generic code that
      will build both for CONFIG_ACPI set and unset without adding explicit
      compiler directives to it.
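
      A sketch of the CONFIG_ACPI-safe macros described above (assumed
      shape, abbreviated):

        #ifdef CONFIG_ACPI
        #define ACPI_COMPANION(dev)           ((dev)->acpi_node.companion)
        #define ACPI_COMPANION_SET(dev, adev) ((dev)->acpi_node.companion = (adev))
        #define ACPI_HANDLE(dev)              acpi_device_handle(ACPI_COMPANION(dev))
        #else
        #define ACPI_COMPANION(dev)           (NULL)
        #define ACPI_COMPANION_SET(dev, adev)
        #define ACPI_HANDLE(dev)              (NULL)
        #endif

      Generic code can then test ACPI_COMPANION(dev) directly and build
      unchanged whether or not CONFIG_ACPI is set.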
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> # on Haswell
      Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Reviewed-by: Aaron Lu <aaron.lu@intel.com> # for ATA and SDIO part
      7b199811
  18. 14 November 2013 (3 commits)