1. 05 Apr 2018, 6 commits
  2. 04 Apr 2018, 9 commits
    • cxl: Fix possible deadlock when processing page faults from cxllib · ad7b4e80
      Frederic Barrat committed
      cxllib_handle_fault() is called by an external driver when it needs to
      have the host resolve page faults for a buffer. The buffer can cover
      several pages and VMAs. The function iterates over all the pages used
      by the buffer, based on the page size of the VMA.
      
      To ensure some stability while processing the faults, the thread T1
      grabs the mm->mmap_sem semaphore with read access (R1). However, when
      processing a page fault for a single page, one of the underlying
      functions, copro_handle_mm_fault(), also grabs the same semaphore with
      read access (R2). So the thread T1 takes the semaphore twice.
      
      If another thread T2 tries to access the semaphore in write mode W1
      (say, because it wants to allocate memory and calls 'brk'), then that
      thread T2 will have to wait because there's a reader (R1). If the
      thread T1 is processing a new page at that time, it won't get an
      automatic grant at R2, because there's now a writer thread
      waiting (T2). And we have a deadlock.
      
      The timeline is:
      1. thread T1 owns the semaphore with read access R1
      2. thread T2 requests write access W1 and waits
      3. thread T1 requests read access R2 and waits
      
      The fix is for the thread T1 to release the semaphore R1 once it got
      the information it needs from the current VMA. The address space/VMAs
      could evolve while T1 iterates over the full buffer, but in the
      unlikely case where T1 misses a page, the external driver will raise a
      new page fault when retrying the memory access.
      
      Fixes: 3ced8d73 ("cxl: Export library to support IBM XSL")
      Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/hw_breakpoint: Only disable hw breakpoint if cpu supports it · 5d6a03eb
      Naveen N. Rao committed
      We get the below warning if we try to use kexec on P9:
         kexec_core: Starting new kernel
         WARNING: CPU: 0 PID: 1223 at arch/powerpc/kernel/process.c:826 __set_breakpoint+0xb4/0x140
         [snip]
         NIP __set_breakpoint+0xb4/0x140
         LR  kexec_prepare_cpus_wait+0x58/0x150
         Call Trace:
           0xc0000000ee70fb20 (unreliable)
           0xc0000000ee70fb20
           default_machine_kexec+0x234/0x2c0
           machine_kexec+0x84/0x90
           kernel_kexec+0xd8/0xe0
           SyS_reboot+0x214/0x2c0
           system_call+0x58/0x6c
      
      This happens since we are trying to clear hw breakpoint on POWER9,
      though we don't have CPU_FTR_DAWR enabled. Guard __set_breakpoint()
      within hw_breakpoint_disable() with ppc_breakpoint_available() to
      address this.
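The shape of the fix is a simple capability guard. Below is a hedged user-space sketch, not the kernel code: `cpu_has_dawr`, `breakpoint_available()` and `set_breakpoint()` are illustrative stand-ins for CPU_FTR_DAWR, ppc_breakpoint_available() and __set_breakpoint(), and `hw_writes` just counts touches of the (pretend) breakpoint register so the behavior is observable.

```c
#include <stdbool.h>

static bool cpu_has_dawr = false;  /* stand-in for CPU_FTR_DAWR */
static int hw_writes;              /* counts writes to the pretend breakpoint register */

/* Stand-in for ppc_breakpoint_available(). */
static bool breakpoint_available(void)
{
    return cpu_has_dawr;
}

/* Stand-in for __set_breakpoint(); in the kernel, calling this without
 * the feature is what fired the WARNING in process.c. */
static void set_breakpoint(unsigned long addr)
{
    (void)addr;
    hw_writes++;
}

/* Fixed hw_breakpoint_disable(): only touch the hardware if the CPU
 * actually supports a breakpoint register. */
static void breakpoint_disable(void)
{
    if (breakpoint_available())
        set_breakpoint(0);
}
```

With the guard in place, the kexec path on a DAWR-less POWER9 skips the register write entirely instead of warning.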
      
      Fixes: 96541531 ("powerpc: Disable DAWR in the base POWER9 CPU features")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/radix: Update command line parsing for disable_radix · 7a22d632
      Aneesh Kumar K.V committed
The kernel parameter disable_radix takes different options:
      disable_radix=yes|no|1|0, or just disable_radix.
      
      The prom_init parsing does not support these options.
      
      Fixes: 1fd6c022 ("powerpc/mm: Add a CONFIG option to choose if radix is used by default")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/radix: Parse disable_radix commandline correctly. · cec4e9b2
      Aneesh Kumar K.V committed
The kernel parameter disable_radix takes different options:
      disable_radix=yes|no|1|0, or just disable_radix. When using the latter
      format we get the error below:
      
       `Malformed early option 'disable_radix'`
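The accepted forms can be sketched as a small parser. This is an illustration of the forms the two radix commits describe, not the kernel's parser; the function name `parse_disable_radix` and its out-parameter convention are my own for the sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Accepted forms after the fix: bare "disable_radix" (meaning: disable),
 * or "disable_radix=yes|no|1|0".  Returns false on a malformed value,
 * which is the case the old code reported as a malformed early option. */
static bool parse_disable_radix(const char *arg, bool *disable)
{
    const char *val = strchr(arg, '=');

    if (!val) {                 /* bare "disable_radix" */
        *disable = true;
        return true;
    }
    val++;
    if (!strcmp(val, "yes") || !strcmp(val, "1")) {
        *disable = true;
        return true;
    }
    if (!strcmp(val, "no") || !strcmp(val, "0")) {
        *disable = false;
        return true;
    }
    return false;               /* malformed value */
}
```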
      
      Fixes: 1fd6c022 ("powerpc/mm: Add a CONFIG option to choose if radix is used by default")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/hugetlb: initialize the pagetable cache correctly for hugetlb · 6fa50483
      Aneesh Kumar K.V committed
With a 64k page size, we have hugetlb pte entries at the pmd and pud levels for
      book3s64, so we don't need to create a separate page table cache for them. With a 4k
      page size we need to make sure the hugepd page table cache for 16M is placed at the
      PUD level and the one for 16G at the PGD level.
      
      Simplify all this by not using HUGEPD_PD_SHIFT, which is confusing for book3s64.
      
      Without this patch, with a 64k page size we create pagetable caches with shift
      values 10 and 7, which are not used at all.
      
      Fixes: 419df06e ("powerpc: Reduce the PTE_INDEX_SIZE")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix · fb4e5dbd
      Aneesh Kumar K.V committed
With the split PTL (page table lock) config, we allocate the level
      4 (leaf) page table using the pte fragment framework instead of a slab
      cache as for the other levels. This was done to enable a split page table
      lock at level 4 of the page table. We use page->ptl of the page backing
      all the level 4 pte fragments for the lock.
      
      Currently with radix, we use only 16 fragments out of the allocated
      page. In radix each fragment is 256 bytes, which means we use only 4K
      out of the allocated 64K page, wasting 60K of the allocated memory.
      This was done earlier to keep it closer to hash.
      
      This patch updates the pte fragment count to 256, thereby using the
      full 64K page and reducing the memory usage. Performance tests show
      really low impact even with THP disabled. With THP enabled we will be
      contending even less on the level 4 ptl, and hence the impact should be
      even lower.
      
        256 threads:
          without patch (10 runs of ./ebizzy  -m -n 1000 -s 131072 -S 100)
            median = 15678.5
            stdev = 42.1209
      
          with patch:
            median = 15354
            stdev = 194.743
      
      This is with THP disabled. With THP enabled the impact of the patch
      will be less.
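The fragment arithmetic from the message above can be checked directly. The constant names below are illustrative (the kernel's actual macros differ); the numbers are the ones the commit states: 256-byte radix fragments in a 64K backing page.

```c
/* Radix level-4 pte fragment arithmetic from the commit message. */
#define RADIX_PTE_FRAG_SIZE 256            /* bytes per fragment on radix */
#define BACKING_PAGE_SIZE   (64 * 1024)    /* 64K page backing the fragments */

enum {
    OLD_FRAG_COUNT = 16,                                    /* old count: 16 * 256 = 4K used */
    NEW_FRAG_COUNT = BACKING_PAGE_SIZE / RADIX_PTE_FRAG_SIZE, /* 256: whole 64K page used */
};
```

So the old setting used 4K and wasted 60K of each 64K page, while the new count of 256 consumes the full page.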
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/keys: Update documentation and remove unnecessary check · f2ed480f
      Aneesh Kumar K.V committed
Add more code comments. We also remove an unnecessary pkey check
      that came after the pkey error check.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead · b9ee31e1
      Nicholas Piggin committed
      When stop is executed with EC=ESL=0, it appears to execute like a
      normal instruction (resuming from NIP when woken by interrupt). So all
      the save/restore handling can be avoided completely. In particular NV
      GPRs do not have to be saved, and MSR does not have to be switched
      back to kernel MSR.
      
      So move the test for EC=ESL=0 sleep states out to power9_idle_stop,
      and return directly to the caller after stop in that case.
      
This improves performance for the ping-pong benchmark with the stop0_lite
      idle state by 2.54% for 2 threads in the same core, and 2.57% for
      different cores. With HV_POSSIBLE defined, performance will improve
      further by avoiding the hwsync.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s/idle: Consolidate power9_offline_stop()/power9_idle_stop() · d0b791c0
      Michael Ellerman committed
      Commit 3d4fbffd ("powerpc/64s/idle: POWER9 implement a separate
      idle stop function for hotplug") that added power9_offline_stop() was
      written before commit 7672691a ("powerpc/powernv: Provide a way to
      force a core into SMT4 mode").
      
      When merging the former I failed to notice that it caused us to skip
      the force-SMT4 logic for offline CPUs. The result is that offlined
      CPUs will not correctly participate in the force-SMT4 logic, which
      presumably will result in badness (not tested).
      
Reconcile the two commits by making power9_offline_stop() a precursor
      to power9_idle_stop(), so that they share the force-SMT4 logic.
      
      This is based on an original commit from Nick, all breakage is my own.
      
      Fixes: 3d4fbffd ("powerpc/64s/idle: POWER9 implement a separate idle stop function for hotplug")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
3. 03 Apr 2018, 10 commits
  4. 01 Apr 2018, 5 commits
  5. 31 Mar 2018, 10 commits