  1. 12 Feb 2015, 2 commits
  2. 11 Feb 2015, 1 commit
  3. 06 Feb 2015, 1 commit
  4. 29 Jan 2015, 3 commits
  5. 23 Jan 2015, 1 commit
    • s390/spinlock: add compare-and-delay to lock wait loops · 2c72a44e
      Authored by Martin Schwidefsky
      Add the compare-and-delay instruction to the spin-lock and rw-lock
      retry loops. A CPU executing the compare-and-delay instruction stops
      until the lock value has changed. This is done to make the locking
      code for contended locks behave better with regard to the
      multi-threading facility. A thread of a core executing a
      compare-and-delay will allow the other threads of the core to get a
      larger share of the core resources.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
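
      For illustration, a minimal sketch of how such a retry loop can use
      the instruction. The helper compare_and_delay() is hypothetical (the
      real code uses inline assembly); it stands in for the new instruction.

        /* sketch only: compare_and_delay() is a hypothetical wrapper,
         * not the kernel's actual API */
        static inline void compare_and_delay(volatile unsigned int *lock,
                                             unsigned int old)
        {
                /* hardware stalls this thread until *lock != old */
        }

        static void spin_wait(volatile unsigned int *lock, unsigned int owner)
        {
                while (*lock == owner)          /* lock still held */
                        compare_and_delay(lock, owner);
        }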
  6. 22 Jan 2015, 2 commits
    • s390: add SMT support · 10ad34bc
      Authored by Martin Schwidefsky
      The multi-threading facility is introduced with the z13 processor family.
      This patch adds code to detect the multi-threading facility. With the
      facility enabled each core will surface multiple hardware threads to the
      system. Each hardware thread looks like a normal CPU to the operating
      system with all its registers and properties.
      
      The SCLP interface reports the SMT topology indirectly via the maximum
      thread id. Each reported CPU in the result of a read-scp-information
      is a core representing a number of hardware threads.
      
      To reflect the reduced CPU capacity when two hardware threads run on a
      single core, the MT utilization counter set is used to normalize the
      raw cputime obtained from the CPU timer deltas. This scaled cputime is
      reported via the taskstats interface. The normal /proc/stat numbers
      are based on the raw cputime and are not affected by the normalization.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
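
      A rough sketch of the normalization idea; the function and parameter
      names are invented, and the real code derives its scaling factors
      from the MT utilization counter set rather than taking them as
      arguments.

        /* sketch: mult/div form a scaling factor <= 1 derived from the
         * MT utilization counters, so e.g. two busy threads on one core
         * are each credited with roughly half a core */
        static u64 scale_cputime(u64 raw, u64 mult, u64 div)
        {
                return raw * mult / div;
        }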
    • s390: avoid z13 cache aliasing · 1f6b83e5
      Authored by Martin Schwidefsky
      Avoid cache aliasing on z13 by aligning shared objects to multiples
      of 512K. The virtual addresses of a page from a shared file need
      to have identical bits in the range 2^12 to 2^18.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
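
      A sketch of the address arithmetic this implies (an assumed shape,
      not the actual arch_get_unmapped_area() change): round the candidate
      address up to a 512K boundary, then copy in the file offset bits
      below 512K so that bits 2^12 to 2^18 match.

        #define SHARED_ALIGN 0x80000UL                  /* 512K */

        /* sketch only: align a candidate mmap address for a shared file */
        static unsigned long align_shared(unsigned long addr,
                                          unsigned long file_offset)
        {
                addr = (addr + SHARED_ALIGN - 1) & ~(SHARED_ALIGN - 1);
                return addr | (file_offset & (SHARED_ALIGN - 1));
        }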
  7. 08 Jan 2015, 1 commit
  8. 07 Jan 2015, 1 commit
    • s390/timex: fix get_tod_clock_ext() inline assembly · e38f9781
      Authored by Chen Gang
      In C an array parameter is treated as a pointer, so sizeof on an
      array parameter equals sizeof on a pointer, which causes a compiler
      warning (with allmodconfig under gcc 5):
      
        ./arch/s390/include/asm/timex.h: In function 'get_tod_clock_ext':
        ./arch/s390/include/asm/timex.h:76:32: warning: 'sizeof' on array function parameter 'clk' will return size of 'char *' [-Wsizeof-array-argument]
          typedef struct { char _[sizeof(clk)]; } addrtype;
                                        ^
      Use the macro CLOCK_STORE_SIZE instead of the related hard-coded
      numbers, which also avoids this warning. Also add a tab to the
      CLOCK_TICK_RATE definition to match the coding style.
      
      [heiko.carstens@de.ibm.com]:
      Chen's patch actually fixes a bug within the get_tod_clock_ext() inline assembly
      where we incorrectly tell the compiler that only 8 bytes of memory get changed
      instead of 16 bytes.
      This would allow gcc to generate incorrect code. Right now this doesn't seem to
      be the case.
      Also changed the patch slightly:
      - renamed CLOCK_STORE_SIZE to STORE_CLOCK_EXT_SIZE
      - changed get_tod_clock_ext() to receive a char pointer parameter
      Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
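
      The underlying C pitfall, shown standalone (a generic illustration,
      not the kernel code):

        #include <stdio.h>

        /* 'clk' decays to 'char *', so sizeof(clk) is the pointer size */
        static void show(char clk[16])
        {
                printf("as parameter: %zu\n", sizeof(clk));  /* 8 on 64-bit */
        }

        int main(void)
        {
                char clk[16];

                printf("as array:     %zu\n", sizeof(clk));  /* 16 */
                show(clk);
                return 0;
        }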
  9. 18 Dec 2014, 1 commit
  10. 12 Dec 2014, 2 commits
    • arch: Add lightweight memory barriers dma_rmb() and dma_wmb() · 1077fa36
      Authored by Alexander Duyck
      There are a number of situations where the mandatory barriers rmb() and
      wmb() are used to order memory/memory operations in the device drivers
      and those barriers are much heavier than they actually need to be.  For
      example in the case of PowerPC wmb() calls the heavy-weight sync
      instruction when for coherent memory operations all that is really needed
      is an lwsync or eieio instruction.
      
      This commit adds a coherent only version of the mandatory memory barriers
      rmb() and wmb().  In most cases this should result in the barrier being the
      same as the SMP barriers for the SMP case; however, in some cases we use a
      barrier that is somewhere in between rmb() and smp_rmb().  For example on
      ARM the rmb barriers break down as follows:
      
        Barrier   Call     Explanation
        --------- -------- ----------------------------------
        rmb()     dsb()    data synchronization barrier - system
        dma_rmb() dmb(osh) data memory barrier - outer shareable
        smp_rmb() dmb(ish) data memory barrier - inner shareable
      
      These new barriers are not as safe as the standard rmb() and wmb().
      Specifically they do not guarantee ordering between coherent and incoherent
      memories.  The primary use case for these would be to enforce ordering of
      reads and writes when accessing coherent memory that is shared between the
      CPU and a device.
      
      It may also be noted that there is no dma_mb().  Most architectures don't
      provide a good mechanism for performing a coherent only full barrier without
      resorting to the same mechanism used in mb().  As such there isn't much to
      be gained in trying to define such a function.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
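
      A typical driver-side use, hedged: the descriptor layout, the flag
      names, and the helper functions below are invented for illustration;
      only dma_wmb()/dma_rmb() are the real new API.

        struct desc {                   /* lives in coherent DMA memory */
                u32 addr;
                u32 status;             /* device sets DESC_DONE */
        };

        /* producer: publish the buffer address before handing the
         * descriptor to the device */
        static void give_to_hw(struct desc *d, u32 buf_dma)
        {
                d->addr = buf_dma;
                dma_wmb();
                d->status = DESC_HW_OWNED;
        }

        /* consumer: order the status read before reading the payload */
        static bool poll_done(struct desc *d)
        {
                if (!(d->status & DESC_DONE))
                        return false;
                dma_rmb();
                return true;            /* payload reads are now safe */
        }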
    • arch: Cleanup read_barrier_depends() and comments · 8a449718
      Authored by Alexander Duyck
      This patch is meant to clean up the handling of read_barrier_depends
      and smp_read_barrier_depends. In multiple spots in the kernel headers
      read_barrier_depends is defined as "do {} while (0)"; however, we
      then go into the SMP vs non-SMP sections and have the SMP version
      reference read_barrier_depends, while the non-SMP version defines it
      as yet another empty do/while.
      
      With this commit I went through and cleaned out the duplicate
      definitions and reduced the number of definitions down to two per
      header. In addition I moved the 50-line comments for the macro from
      the x86 and mips headers, which defined it as an empty do/while, to
      the headers that actually define the macro: alpha and blackfin.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
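
      The reduced shape per header looks roughly like this (a sketch of
      the two-definition pattern, not a verbatim copy of any one header):

        /* architectures with no ordering requirement for dependent reads */
        #define read_barrier_depends()          do { } while (0)
        #define smp_read_barrier_depends()      read_barrier_depends()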
  11. 11 Dec 2014, 1 commit
  12. 08 Dec 2014, 5 commits
  13. 28 Nov 2014, 6 commits
  14. 19 Nov 2014, 3 commits
  15. 10 Nov 2014, 1 commit
    • /dev/mem: Use more consistent data types · 4707a341
      Authored by Thierry Reding
      The xlate_dev_{kmem,mem}_ptr() functions take either a physical address
      or a kernel virtual address, so data types should be phys_addr_t and
      void *. They both return a kernel virtual address which is only ever
      used in calls to copy_{from,to}_user(), so make variables that store it
      void * rather than char * for consistency.
      
      Also only define a weak unxlate_dev_mem_ptr() function if architectures
      haven't overridden it in the asm/io.h header file.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
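
      The guarded weak default then has roughly this shape (a hedged
      sketch of the pattern, not the exact patch):

        /* drivers/char/mem.c: only used if asm/io.h did not provide one */
        #ifndef unxlate_dev_mem_ptr
        #define unxlate_dev_mem_ptr unxlate_dev_mem_ptr
        void __weak unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
        {
                /* nothing to undo for the default translation */
        }
        #endif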
  16. 03 Nov 2014, 3 commits
    • s390/pci: add sparse annotations · 5b9f2081
      Authored by Martin Schwidefsky
      Fix the following warnings from the sparse code checker:
      
      arch/s390/include/asm/pci_io.h:165:49: warning: cast removes address space of expression
      arch/s390/pci/pci.c:476:44: warning: cast removes address space of expression
      arch/s390/pci/pci.c:491:36: warning: incorrect type in argument 2 (different address spaces)
      arch/s390/pci/pci.c:491:36:    expected void [noderef] <asn:2>*addr
      arch/s390/pci/pci.c:491:36:    got void *<noident>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
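
      For context, the kind of annotation sparse expects (a generic
      illustration, not the s390 patch itself):

        /* __iomem marks the MMIO address space; an intentional
         * cross-space cast needs __force, otherwise sparse warns
         * "cast removes address space of expression" */
        static unsigned long bar_cookie(void __iomem *bar)
        {
                return (__force unsigned long)bar;
        }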
    • s390/pci: improve irq number check for msix · b19148f6
      Authored by Sebastian Ott
      s390's arch_setup_msi_irqs function ensures that we don't return with
      more irqs than the PCI architecture allows and that a single PCI
      function doesn't consume more irqs than the kernel is configured for.
      
      The latter check doesn't help much, as it would need to take the sum
      of all irqs into account. Since that's already done by irq_alloc_desc
      we can remove this check.
      
      As for the first check we should use the value provided by the
      firmware which can be less than what the PCI architecture allows.
      Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
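
      The remaining check then reduces to clamping against the
      firmware-reported limit, roughly one line inside arch_setup_msi_irqs
      (field name assumed for illustration):

        /* sketch: respect the firmware limit, which may be lower than
         * what the PCI architecture allows */
        msi_vecs = min_t(unsigned int, nvec, zdev->max_msi);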
    • s390/cmpxchg: use compiler builtins · f318a122
      Authored by Martin Schwidefsky
      The kernel build for s390 fails for gcc compilers with version 3.x,
      so set the minimum required version of gcc to version 4.3.
      
      As the atomic builtins are available with all gcc 4.x compilers,
      use the __sync_val_compare_and_swap and __sync_bool_compare_and_swap
      functions to replace the complex macro and inline assembler magic
      in include/asm/cmpxchg.h. The compiler can just-do-it and generates
      better code with the builtins.
      
      While we are at it use __sync_bool_compare_and_swap for the
      _raw_compare_and_swap function in the spinlock code as well.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
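
      The shape of the replacement, consistent with the description above
      (simplified; the real header handles multiple operand sizes):

        /* returns the previous value; stores new only if *ptr == old */
        static inline unsigned long cmpxchg_sketch(unsigned long *ptr,
                                                   unsigned long old,
                                                   unsigned long new)
        {
                return __sync_val_compare_and_swap(ptr, old, new);
        }

        /* returns nonzero if the swap succeeded, as the spinlock code
         * wants */
        static inline int raw_cas_sketch(unsigned int *lock,
                                         unsigned int old, unsigned int new)
        {
                return __sync_bool_compare_and_swap(lock, old, new);
        }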
  17. 28 Oct 2014, 3 commits
  18. 27 Oct 2014, 3 commits
    • s390/mm: pmdp_get_and_clear_full optimization · fcbe08d6
      Authored by Martin Schwidefsky
      Analogous to ptep_get_and_clear_full, define a variant of the
      pmdp_get_and_clear primitive which gets the full hint from the
      mmu_gather struct. This allows s390 to avoid a costly instruction
      when destroying an address space.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
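
      The generic fallback pattern, sketched (the s390 version instead
      uses the hint to pick a cheaper instruction):

        /* 'full' is nonzero while the whole address space is being torn
         * down, so per-entry TLB bookkeeping can be skipped */
        static inline pmd_t pmdp_get_and_clear_full(struct mm_struct *mm,
                                                    unsigned long addr,
                                                    pmd_t *pmdp, int full)
        {
                return pmdp_get_and_clear(mm, addr, pmdp);
        }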
    • s390/ftrace,kprobes: allow to patch first instruction · c933146a
      Authored by Heiko Carstens
      If the function tracer is enabled, allow setting kprobes on the first
      instruction of a function (which is the function trace caller):
      
      If no kprobe is set, enabling and disabling function tracing of a
      function simply patches the first instruction. Either it is a nop
      (right now it's an unconditional branch, which skips the mcount block),
      or it's a branch to the ftrace_caller() function.
      
      If a kprobe is being placed on a function tracer calling instruction
      we encode whether we actually have a nop or a branch in the remaining
      after the breakpoint instruction (illegal opcode).
      This is possible, since the size of the instruction used for the nop
      and branch is six bytes, while the size of the breakpoint is only
      two bytes.
      Therefore the first two bytes contain the illegal opcode and the last
      four bytes contain either "0" for nop or "1" for branch. The kprobes
      code will then execute/simulate the correct instruction.
      
      Instruction patching for kprobes and function tracer is always done
      with stop_machine(). Therefore we don't have any races where an
      instruction is patched concurrently on a different cpu.
      Besides that, the program check handler which executes the function
      trace caller instruction won't be executed concurrently with any
      stop_machine() execution.
      
      This allows keeping the full fault-based kprobes handling which generates
      correct pt_regs contents automatically.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
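
      The patch-site layout described above, sketched (the struct name is
      invented; 0x0002 is assumed here as the two-byte illegal opcode the
      text refers to):

        /* 6-byte patch site once a kprobe is armed:
         *   bytes 0-1: breakpoint (illegal opcode)
         *   bytes 2-5: 0 = site held the nop, 1 = branch to ftrace_caller()
         */
        struct ftrace_kprobe_insn {
                u16 opc;        /* e.g. 0x0002, triggers the program check */
                u32 flag;       /* which instruction to simulate */
        } __attribute__((packed));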
    • s390/mm: disable KSM for storage key enabled pages · 3ac8e380
      Authored by Dominik Dingel
      When storage keys are enabled, unmerge already merged pages and
      prevent new pages from being merged.
      Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
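
      A hedged sketch of the mechanism: ksm_madvise() is the generic KSM
      entry point, while the walk and the function name here are
      illustrative only.

        /* unmerge existing KSM pages and clear VM_MERGEABLE so no new
         * pages get merged */
        static int unmerge_all(struct mm_struct *mm)
        {
                struct vm_area_struct *vma;
                int rc = 0;

                for (vma = mm->mmap; vma && !rc; vma = vma->vm_next)
                        rc = ksm_madvise(vma, vma->vm_start, vma->vm_end,
                                         MADV_UNMERGEABLE, &vma->vm_flags);
                return rc;
        }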