1. 16 Nov 2016, 3 commits
  2. 25 Oct 2016, 1 commit
  3. 17 Oct 2016, 3 commits
  4. 08 Oct 2016, 1 commit
  5. 22 Sep 2016, 1 commit
    • s390/pci_dma: improve lazy flush for unmap · 13954fd6
      Committed by Sebastian Ott
      Lazy unmap (deferring the tlb flush after unmap until a dma address
      is reused) can greatly reduce the number of RPCIT instructions in
      the best case. In reality we are often far from that best case
      because our implementation suffers from the following problem:
      
      To create dma addresses we maintain an iommu bitmap and a pointer
      into that bitmap marking the start of the next search. That pointer
      moves from the start to the end of the bitmap, and we issue a global
      tlb flush once it wraps around. To prevent address reuse before that
      flush, we even have to move the next pointer during unmaps whenever
      a bit beyond it is cleared. This can lead to a situation where only
      the rear part of the bitmap is used and more tlb flushes are issued
      than expected.
      
      To fix this, we no longer clear bits during unmap but maintain a
      second bitmap which we use to mark addresses that can't be reused
      until we issue the global tlb flush after wrap-around.
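      
      A minimal C sketch of the resulting two-bitmap scheme (the struct
      and helper names here are illustrative, not the actual driver code):
      
      	#include <linux/bitmap.h>
      
      	struct zpci_iommu {			/* hypothetical container */
      		unsigned long *alloc_bitmap;	/* in-use dma pages */
      		unsigned long *lazy_bitmap;	/* freed, flush pending */
      		unsigned long pages;		/* size of both bitmaps */
      		unsigned long next;		/* start of the next search */
      	};
      
      	static unsigned long alloc_iommu(struct zpci_iommu *io, int pages)
      	{
      		unsigned long offset;
      
      		offset = bitmap_find_next_zero_area(io->alloc_bitmap,
      						    io->pages, io->next,
      						    pages, 0);
      		if (offset >= io->pages) {
      			/* wrap-around: one global tlb flush, after which
      			 * the lazily freed addresses are usable again */
      			zpci_global_tlb_flush(io);	/* hypothetical */
      			bitmap_andnot(io->alloc_bitmap, io->alloc_bitmap,
      				      io->lazy_bitmap, io->pages);
      			bitmap_zero(io->lazy_bitmap, io->pages);
      			offset = bitmap_find_next_zero_area(io->alloc_bitmap,
      							    io->pages, 0,
      							    pages, 0);
      		}
      		bitmap_set(io->alloc_bitmap, offset, pages);
      		io->next = offset + pages;
      		return offset;
      	}
      
      	static void free_iommu(struct zpci_iommu *io, unsigned long offset,
      			       int pages)
      	{
      		/* do not clear alloc_bitmap here; just mark the range as
      		 * not reusable until the next wrap-around flush */
      		bitmap_set(io->lazy_bitmap, offset, pages);
      	}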
      Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  6. 14 Sep 2016, 1 commit
  7. 08 Sep 2016, 2 commits
    • KVM: s390: allow 255 VCPUs when sca entries aren't used · a6940674
      Committed by David Hildenbrand
      If the SCA entries aren't used by the hardware (no SIGPIF), we can
      simply leave the entries unset, stick to the basic SCA and allow
      more than 64 VCPUs.
      
      To prevent any other facility from using these entries, let's
      properly provoke intercepts by not setting the MCN and keeping the
      entries unset.
      
      This effectively allows KVM, when running under KVM (vSIE) or under
      z/VM, to provide more than 64 VCPUs to a guest. Let's limit it to
      255 for now, so we don't run into problems if the CPU numbers are
      limited somewhere else.
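      
      A simplified sketch of the resulting limit check (sclp.has_sigpif is
      the real facility flag; the surrounding plumbing is abridged):
      
      	/* without SIGPIF the SCA entries go unused, so the 64 entry
      	 * slots of the basic SCA no longer cap the guest */
      	static int max_vcpus_sketch(void)
      	{
      		if (!sclp.has_sigpif)
      			return 255;	/* new limit, see above */
      		return 64;		/* slots in the basic SCA */
      	}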
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
    • KVM: Add provisioning for ulong vm stats and u64 vcpu stats · 8a7e75d4
      Committed by Suraj Jitindar Singh
      VMs and vcpus have statistics associated with them which can be
      viewed in debugfs. Currently it is assumed within the vcpu_stat_get()
      and vm_stat_get() functions that all of these statistics are
      represented as u32s; however, the next patch adds some u64 vcpu
      statistics.
      
      Change all vcpu statistics to u64 and modify vcpu_stat_get()
      accordingly. Since vcpu statistics are per vcpu, they will only be
      updated by a single vcpu at a time, so this shouldn't present a
      problem on 32-bit machines which can't atomically increment 64-bit
      numbers. However, vm statistics could potentially be updated by
      multiple vcpus from that vm at a time. To avoid the overhead of
      atomics, make all vm statistics ulong, such that they are 64-bit on
      64-bit systems, where they can be atomically incremented, and 32-bit
      on 32-bit systems, which may not be able to atomically increment
      64-bit numbers. Modify vm_stat_get() to expect ulongs.
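      
      The resulting split, sketched with a few representative fields (the
      real structs carry many more counters):
      
      	struct kvm_vm_stat {
      		/* ulong: native word size, so the increment stays
      		 * atomic on 32-bit hosts where vcpus may race */
      		ulong remote_tlb_flush;
      	};
      
      	struct kvm_vcpu_stat {
      		/* u64: written only by the owning vcpu, so a
      		 * non-atomic 64-bit increment is fine on 32-bit */
      		u64 halt_exits;
      		u64 halt_wakeup;
      	};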
      Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Reviewed-by: David Matlack <dmatlack@google.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
  8. 31 Aug 2016, 1 commit
    • mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS · 0d025d27
      Committed by Josh Poimboeuf
      There are three usercopy warnings which are currently being silenced for
      gcc 4.6 and newer:
      
      1) "copy_from_user() buffer size is too small" compile warning/error
      
         This is a static warning which happens when object size and copy size
         are both const, and copy size > object size.  I didn't see any false
         positives for this one.  So the function warning attribute seems to
         be working fine here.
      
         Note this scenario is always a bug and so I think it should be
         changed to *always* be an error, regardless of
         CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.
      
      2) "copy_from_user() buffer size is not provably correct" compile warning
      
         This is another static warning which happens when I enable
         __compiletime_object_size() for new compilers (and
         CONFIG_DEBUG_STRICT_USER_COPY_CHECKS).  It happens when object size
         is const, but copy size is *not*.  In this case there's no way to
         compare the two at build time, so it gives the warning.  (Note the
         warning is a byproduct of the fact that gcc has no way of knowing
         whether the overflow function will be called, so the call isn't dead
         code and the warning attribute is activated.)
      
         So this warning seems to only indicate "this is an unusual pattern,
         maybe you should check it out" rather than "this is a bug".
      
         I get 102(!) of these warnings with allyesconfig and the
         __compiletime_object_size() gcc check removed.  I don't know if there
         are any real bugs hiding in there, but from looking at a small
         sample, I didn't see any.  According to Kees, it does sometimes find
         real bugs.  But the false positive rate seems high.
      
      3) "Buffer overflow detected" runtime warning
      
         This is a runtime warning where object size is const, and copy size >
         object size.
      
      All three warnings (both static and runtime) were completely
      disabled for gcc 4.6 and newer with the following commit:
      
        2fb0815c ("gcc4: disable __compiletime_object_size for GCC 4.6+")
      
      That commit mistakenly assumed that the false positives were caused by a
      gcc bug in __compiletime_object_size().  But in fact,
      __compiletime_object_size() seems to be working fine.  The false
      positives were instead triggered by #2 above.  (Though I don't have an
      explanation for why the warnings supposedly only started showing up in
      gcc 4.6.)
      
      So remove warning #2 to get rid of all the false positives, and re-enable
      warnings #1 and #3 by reverting the above commit.
      
      Furthermore, since #1 is a real bug which is detected at compile time,
      upgrade it to always be an error.
      
      Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
      needed.
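      
      A hedged sketch of the check pattern this leaves behind (the helper
      names are illustrative; the kernel's actual wrappers differ in
      detail):
      
      	/* gcc error attribute: referencing this function from
      	 * reachable code fails the build */
      	extern void __buffer_overflow_error(void)
      		__attribute__((error("copy size exceeds object size")));
      
      	static inline unsigned long
      	checked_copy_from_user(void *to, const void __user *from,
      			       unsigned long n)
      	{
      		size_t sz = __builtin_object_size(to, 0);
      
      		/* warning #1, now an error: both sizes are constant
      		 * and the copy is provably too large */
      		if (sz != (size_t)-1 && __builtin_constant_p(n) && n > sz)
      			__buffer_overflow_error();
      		return copy_from_user(to, from, n); /* #3 at runtime */
      	}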
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 29 Aug 2016, 6 commits
    • s390/crypto: cpacf function detection · 69c0e360
      Committed by Martin Schwidefsky
      The CPACF code makes some assumptions about the availability of
      hardware support. E.g. if the machine supports KM(AES-256) without
      chaining, it is assumed that KMC(AES-256) with chaining is available
      as well. For the existing CPUs this is true, but the architecturally
      correct way is to check each CPACF function on its own. This is what
      the query function of each instruction is for.
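      
      A sketch of the query pattern this moves to (helper names follow the
      cpacf API introduced by this series):
      
      	static bool kmc_aes_256_available(void)
      	{
      		cpacf_mask_t mask;
      
      		/* ask KMC itself instead of inferring from KM: each
      		 * instruction reports its own function-code bitmask */
      		if (!cpacf_query(CPACF_KMC, &mask))
      			return false;	/* KMC not installed at all */
      		return cpacf_test_func(&mask, CPACF_KMC_AES_256);
      	}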
      Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/crypto: simplify return code handling · 0177db01
      Committed by Martin Schwidefsky
      The CPACF instructions can complete with three different condition codes:
      CC=0 for successful completion, CC=1 if the protected key verification
      failed, and CC=3 for partial completion.
      
      The inline functions restart the CPACF instruction on partial
      completion, which removes the CC=3 case. The CC=1 case is only relevant
      for the protected key functions of the KM, KMC, KMAC and KMCTR
      instructions. As the protected key functions are not used by the
      current code, there is no need for any kind of return code handling.
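      
      A sketch of why callers only ever see successful completion
      (issue_kmc() is a fictitious stand-in for the inline assembly):
      
      	static inline void cpacf_kmc_sketch(unsigned long func, void *param,
      					    u8 *dest, const u8 *src,
      					    unsigned long len)
      	{
      		int cc;
      
      		do {
      			cc = issue_kmc(func, param, &dest, &src, &len);
      		} while (cc == 3);	/* partial completion: restart */
      		/* CC=1 can only occur for the protected key function
      		 * codes, which no current caller uses */
      	}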
      Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/crypto: cleanup cpacf function codes · edc63a37
      Committed by Martin Schwidefsky
      Use a separate define for the decryption modifier bit instead of
      duplicating the function codes for encryption / decryption.
      In addition, use an unsigned type for the function code.
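      
      In essence (the define matches the architected modifier bit; the
      helper is illustrative):
      
      	#define CPACF_DECRYPT	0x80U	/* modifier bit in the fc */
      
      	static inline unsigned int cipher_fc(unsigned int fc, bool decrypt)
      	{
      		/* one function code per algorithm; OR in the modifier
      		 * instead of keeping a second set of _DEC codes */
      		return decrypt ? fc | CPACF_DECRYPT : fc;
      	}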
      Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • RAID/s390: add SIMD implementation for raid6 gen/xor · 474fd6e8
      Committed by Martin Schwidefsky
      Using vector registers is significantly faster:
      
      raid6: vx128x8  gen() 19705 MB/s
      raid6: vx128x8  xor() 11886 MB/s
      raid6: using algorithm vx128x8 gen() 19705 MB/s
      raid6: .... xor() 11886 MB/s, rmw enabled
      
      vs the software algorithms:
      
      raid6: int64x1  gen()  3018 MB/s
      raid6: int64x1  xor()  1429 MB/s
      raid6: int64x2  gen()  4661 MB/s
      raid6: int64x2  xor()  3143 MB/s
      raid6: int64x4  gen()  5392 MB/s
      raid6: int64x4  xor()  3509 MB/s
      raid6: int64x8  gen()  4441 MB/s
      raid6: int64x8  xor()  3207 MB/s
      raid6: using algorithm int64x4 gen() 5392 MB/s
      raid6: .... xor() 3509 MB/s, rmw enabled
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/fpu: improve kernel_fpu_[begin|end] · 7f79695c
      Committed by Martin Schwidefsky
      In case of nested use of the FPU or vector registers in the kernel,
      the current code uses the register mask of the previous context to
      decide which registers to save and restore.
      E.g. if the previous context used KERNEL_VXR_V0V7 and the next
      context wants to use KERNEL_VXR_V24V31 the first 8 vector registers
      are stored to the FPU state structure. But this is not necessary
      as the next context does not use these registers.
      
      Rework the FPU/vector register save and restore code. The new code
      does a few things differently:
      1) A lowcore field is used instead of a per-cpu variable.
      2) The kernel_fpu_end function now has two parameters just like
         kernel_fpu_begin. The register flags are required by both
         functions to save / restore the minimal register set.
      3) The inline functions kernel_fpu_begin/kernel_fpu_end now do the
         update of the register masks. If the user space FPU registers
         have already been stored neither save_fpu_regs nor the
         __kernel_fpu_begin/__kernel_fpu_end functions have to be called
         for the first context. In this case kernel_fpu_begin adds 7
         instructions and kernel_fpu_end adds 4 instructions.
      4) The inline assemblies in __kernel_fpu_begin / __kernel_fpu_end
         to save / restore the vector registers are simplified a bit.
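      
      Usage sketch of the reworked interface (the flag name is from the
      s390 headers; the call site is abridged):
      
      	struct kernel_fpu vxstate;
      
      	/* pass the same mask to begin and end so only the registers
      	 * this context actually clobbers are saved and restored */
      	kernel_fpu_begin(&vxstate, KERNEL_VXR_V24V31);
      	/* ... use vector registers V24-V31 ... */
      	kernel_fpu_end(&vxstate, KERNEL_VXR_V24V31);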
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • s390/vx: allow to include vx-insn.h with .include · 0eab11c7
      Committed by Martin Schwidefsky
      To make vx-insn.h more versatile, avoid cpp preprocessor macros and
      allow plain numbers to be used for vector and general purpose
      register operands. With that, you can emit an .include directive
      from a C file into the assembler text and then use the vx-insn
      macros in inline assemblies.
      
      For example:
      
      asm (".include \"asm/vx-insn.h\"");
      
      static inline void xor_vec(int x, int y, int z)
      {
      	asm volatile("VX %0,%1,%2"
      		     : : "i" (x), "i" (y), "i" (z));
      }
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  10. 26 Aug 2016, 1 commit
  11. 24 Aug 2016, 5 commits
  12. 08 Aug 2016, 1 commit
  13. 04 Aug 2016, 1 commit
    • dma-mapping: use unsigned long for dma_attrs · 00085f1e
      Committed by Krzysztof Kozlowski
      The dma-mapping core and the implementations do not change the DMA
      attributes passed by pointer.  Thus the pointer can point to const data.
      However, the attributes do not have to be a bitfield.  Instead, an
      unsigned long will do fine:
      
      1. This is just simpler, both in terms of reading the code and
         setting attributes.  Instead of initializing local attributes on
         the stack and passing a pointer to dma_set_attr(), just set the
         bits.
      
      2. It brings safety and const-correctness checking because the
         attributes are passed by value.
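      
      An illustrative call site before and after (using the real
      DMA_ATTR_SKIP_CPU_SYNC attribute):
      
          /* before: attributes built on the stack, passed by pointer */
          struct dma_attrs attrs;
          init_dma_attrs(&attrs);
          dma_set_attr(DMA_ATTR_SKIP_CPU_SYNC, &attrs);
          addr = dma_map_single_attrs(dev, buf, size, DMA_TO_DEVICE, &attrs);
      
          /* after: attributes are just bits in an unsigned long */
          addr = dma_map_single_attrs(dev, buf, size, DMA_TO_DEVICE,
                                      DMA_ATTR_SKIP_CPU_SYNC);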
      
      Semantic patches for this change (at least most of them):
      
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
      
          @@
          f(...,
          - struct dma_attrs *attrs
          + unsigned long attrs
          , ...)
          {
          ...
          }
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      and
      
          // Options: --all-includes
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
          type t;
      
          @@
          t f(..., struct dma_attrs *attrs);
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
      Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
      Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
      Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
      Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 31 Jul 2016, 2 commits
    • s390: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO · 68c5cf5a
      Committed by James Hogan
      AT_VECTOR_SIZE_ARCH should be defined with the maximum number of
      NEW_AUX_ENT entries that ARCH_DLINFO can contain, but it wasn't defined
      for s390 at all even though ARCH_DLINFO can contain one NEW_AUX_ENT when
      VDSO is enabled.
      
      This shouldn't be a problem, as AT_VECTOR_SIZE_BASE includes space
      for AT_BASE_PLATFORM which s390 doesn't use, but let's define it now
      and add the comment above ARCH_DLINFO found in several other
      architectures to remind future modifiers of ARCH_DLINFO to keep
      AT_VECTOR_SIZE_ARCH up to date.
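      
      The added define is tiny; a sketch consistent with the description
      above:
      
      	/* s390's ARCH_DLINFO emits at most one NEW_AUX_ENT
      	 * (AT_SYSINFO_EHDR when the vdso is enabled), hence: */
      	#define AT_VECTOR_SIZE_ARCH 1 /* entries in ARCH_DLINFO */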
      
      Fixes: b020632e ("[S390] introduce vdso on s390")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-s390@vger.kernel.org
    • s390/mm: clean up pte/pmd encoding · bc29b7ac
      Committed by Gerald Schaefer
      The hugetlbfs pte<->pmd conversion functions currently assume that the pmd
      bit layout is consistent with the pte layout, which is not really true.
      
      The SW read and write bits are encoded as the sequence "wr" in a pte, but
      in a pmd it is "rw". The hugetlbfs conversion assumes that the sequence
      is identical in both cases, which results in swapped read and write bits
      in the pmd. In practice this is not a problem, because those pmd bits are
      only relevant for THP pmds and not for hugetlbfs pmds. The hugetlbfs code
      works on (fake) ptes, and the converted pte bits are correct.
      
      There is another variation in the pte/pmd encoding which affects
      dirty prot-none ptes/pmds. In this case, a pmd has both its HW
      read-only and invalid bits set, while a pte only has the invalid bit
      set. This also has no effect in practice, but it should be
      consistent.
      
      This patch fixes both inconsistencies by changing the SW read/write bit
      layout for pmds as well as the PAGE_NONE encoding for ptes. It also makes
      the hugetlbfs conversion functions more robust by introducing a
      move_set_bit() macro that uses the pte/pmd bit #defines instead of
      constant shifts.
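      
      One possible shape of such a helper (illustrative; the bit names are
      real s390 #defines):
      
      	/* copy bit 'from' of x to bit 'to' of the result, so the
      	 * conversion is written with pte/pmd bit #defines rather
      	 * than hard-coded shifts */
      	#define move_set_bit(x, from, to)	(((x) & (from)) ? (to) : 0UL)
      
      	/* e.g. in the pmd -> pte direction: */
      	pte_val(pte) |= move_set_bit(pmd_val(pmd), _SEGMENT_ENTRY_READ,
      				     _PAGE_READ);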
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  15. 27 Jul 2016, 2 commits
  16. 18 Jul 2016, 1 commit
  17. 13 Jul 2016, 2 commits
  18. 06 Jul 2016, 1 commit
  19. 28 Jun 2016, 5 commits