1. 19 10月, 2014 1 次提交
    • A
      x86,kvm,vmx: Preserve CR4 across VM entry · d974baa3
      Andy Lutomirski 提交于
      CR4 isn't constant; at least the TSD and PCE bits can vary.
      
      TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
      like it's correct.
      
      This adds a branch and a read from cr4 to each vm entry.  Because it is
      extremely likely that consecutive entries into the same vcpu will have
      the same host cr4 value, this fixes up the vmcs instead of restoring cr4
      after the fact.  A subsequent patch will add a kernel-wide cr4 shadow,
      reducing the overhead in the common case to just two memory reads and a
      branch.
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Petr Matousek <pmatouse@redhat.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d974baa3
  2. 16 10月, 2014 1 次提交
    • M
      powerpc/pci: Fix IO space breakage after of_pci_range_to_resource() change · aeba3731
      Michael Ellerman 提交于
      Commit 0b0b0893 "of/pci: Fix the conversion of IO ranges into IO
      resources" changed the behaviour of of_pci_range_to_resource().
      
      Previously it simply populated the resource based on the arguments. Now
      it calls pci_register_io_range() and pci_address_to_pio(). These both
      have two implementations depending on whether PCI_IOBASE is defined,
      which it is not for powerpc.
      
      Further complicating matters, both routines are weak, and powerpc
      implements it's own version of one - pci_address_to_pio(). However
      powerpc's implementation depends on other initialisations which are done
      later in boot.
      
      The end result is incorrectly initialised IO space. Often we can get
      away with that, because we don't make much use of IO space. However
      virtio requires it, so we see eg:
      
        pci_bus 0000:00: root bus resource [io  0xffff] (bus address [0xffffffffffffffff-0xffffffffffffffff])
        PCI: Cannot allocate resource region 0 of device 0000:00:01.0, will remap
        virtio-pci 0000:00:01.0: can't enable device: BAR 0 [io  size 0x0020] not assigned
      
      The simplest fix for now is to just stop using of_pci_range_to_resource(),
      and open-code the original implementation, that's all we want it to do.
      
      Fixes: 0b0b0893 ("of/pci: Fix the conversion of IO ranges into IO resources")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      aeba3731
  3. 15 10月, 2014 3 次提交
    • D
      sparc64: Fix FPU register corruption with AES crypto offload. · f4da3628
      David S. Miller 提交于
      The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the
      key material is preloaded into the FPU registers, and then we loop
      over and over doing the crypt operation, reusing those pre-cooked key
      registers.
      
      There are intervening blkcipher*() calls between the crypt operation
      calls.  And those might perform memcpy() and thus also try to use the
      FPU.
      
      The sparc64 kernel FPU usage mechanism is designed to allow such
      recursive uses, but with a catch.
      
      There has to be a trap between the two FPU using threads of control.
      
      The mechanism works by, when the FPU is already in use by the kernel,
      allocating a slot for FPU saving at trap time.  Then if, within the
      trap handler, we try to use the FPU registers, the pre-trap FPU
      register state is saved into the slot.  Then at trap return time we
      notice this and restore the pre-trap FPU state.
      
      Over the long term there are various more involved ways we can make
      this work, but for a quick fix let's take advantage of the fact that
      the situation where this happens is very limited.
      
      All sparc64 chips that support the crypto instructiosn also are using
      the Niagara4 memcpy routine, and that routine only uses the FPU for
      large copies where we can't get the source aligned properly to a
      multiple of 8 bytes.
      
      We look to see if the FPU is already in use in this context, and if so
      we use the non-large copy path which only uses integer registers.
      
      Furthermore, we also limit this special logic to when we are doing
      kernel copy, rather than a user copy.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f4da3628
    • I
    • A
      x86: bpf_jit: fix two bugs in eBPF JIT compiler · e0ee9c12
      Alexei Starovoitov 提交于
      1.
      JIT compiler using multi-pass approach to converge to final image size,
      since x86 instructions are variable length. It starts with large
      gaps between instructions (so some jumps may use imm32 instead of imm8)
      and iterates until total program size is the same as in previous pass.
      This algorithm works only if program size is strictly decreasing.
      Programs that use LD_ABS insn need additional code in prologue, but it
      was not emitted during 1st pass, so there was a chance that 2nd pass would
      adjust imm32->imm8 jump offsets to the same number of bytes as increase in
      prologue, which may cause algorithm to erroneously decide that size converged.
      Fix it by always emitting largest prologue in the first pass which
      is detected by oldproglen==0 check.
      Also change error check condition 'proglen != oldproglen' to fail gracefully.
      
      2.
      while staring at the code realized that 64-byte buffer may not be enough
      when 1st insn is large, so increase it to 128 to avoid buffer overflow
      (theoretical maximum size of prologue+div is 109) and add runtime check.
      
      Fixes: 62258278 ("net: filter: x86: internal BPF JIT")
      Reported-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Tested-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e0ee9c12
  4. 14 10月, 2014 14 次提交
  5. 13 10月, 2014 2 次提交
  6. 12 10月, 2014 1 次提交
    • H
      parisc: Reduce SIGRTMIN from 37 to 32 to behave like other Linux architectures · 1f25df2e
      Helge Deller 提交于
      This patch reduces the value of SIGRTMIN on PARISC from 37 to 32, thus
      increasing the number of available RT signals and bring it in sync with other
      Linux architectures.
      
      Historically we wanted to natively support HP-UX 32bit binaries with the
      PA-RISC Linux port.  Because of that we carried the various available signals
      from HP-UX (e.g. SIGEMT and SIGLOST) and folded them in between the native
      Linux signals.  Although this was the right decision at that time, this
      required us to increase SIGRTMIN to at least 37 which left us with 27 (64-37)
      RT signals.
      
      Those 27 RT signals haven't been a problem in the past, but with the upcoming
      importance of systemd we now got the problem that systemd alloctes (hardcoded)
      signals up to SIGRTMIN+29 which is beyond our NSIG of 64. Because of that we
      have not been able to use systemd on the PARISC Linux port yet.
      
      Of course we could ask the systemd developers to not use those hardcoded
      values, but this change is very unlikely, esp. with PA-RISC being a niche
      architecture.
      
      The other possibility would be to increase NSIG to e.g. 128, but this would
      mean to duplicate most of the existing Linux signal handling code into the
      parisc specific Linux kernel tree which would most likely introduce lots of new
      bugs beside the code duplication.
      
      The third option is to drop some HP-UX signals and shuffle some other signals
      around to bring SIGRTMIN to 32.  This is of course an ABI change, but testing
      has shown that existing Linux installations are not visibly affected by this
      change - most likely because we move those signals around which are rarely used
      and move them to slots which haven't been used in Linux yet. In an existing
      installation I was able to exchange either the Linux kernel or glibc (or both)
      without affecting the boot process and installed applications.
      
      Dropping the HP-UX signals isn't an issue either, since support for HP-UX was
      basically dropped a few months back with Kernel 3.14 in commit
      f5a408d5 already, when we changed EWOULDBLOCK
      to be equal to EAGAIN.
      
      So, even if this is an ABI change, it's better to change it now and thus bring
      PARISC Linux in sync with other architectures to avoid other issues in the
      future.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: Carlos O'Donell <carlos@systemhalted.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: PARISC Linux Kernel Mailinglist <linux-parisc@vger.kernel.org>
      Tested-by: NAaro Koskinen <aaro.koskinen@iki.fi>
      1f25df2e
  7. 11 10月, 2014 2 次提交
    • D
      sparc64: Fix lockdep warnings on reboot on Ultra-5 · bdcf81b6
      David S. Miller 提交于
      Inconsistently, the raw_* IRQ routines do not interact with and update
      the irqflags tracing and lockdep state, whereas the raw_* spinlock
      interfaces do.
      
      This causes problems in p1275_cmd_direct() because we disable hardirqs
      by hand using raw_local_irq_restore() and then do a raw_spin_lock()
      which triggers a lockdep trace because the CPU's hw IRQ state doesn't
      match IRQ tracing's internal software copy of that state.
      
      The CPU's irqs are disabled, yet current->hardirqs_enabled is true.
      
      ====================
      reboot: Restarting system
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3536 check_flags+0x7c/0x240()
      DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
      Modules linked in: openpromfs
      CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G        W      3.17.0-dirty #145
      Call Trace:
       [000000000045919c] warn_slowpath_common+0x5c/0xa0
       [0000000000459210] warn_slowpath_fmt+0x30/0x40
       [000000000048f41c] check_flags+0x7c/0x240
       [0000000000493280] lock_acquire+0x20/0x1c0
       [0000000000832b70] _raw_spin_lock+0x30/0x60
       [000000000068f2fc] p1275_cmd_direct+0x1c/0x60
       [000000000068ed28] prom_reboot+0x28/0x40
       [000000000043610c] machine_restart+0x4c/0x80
       [000000000047d2d4] kernel_restart+0x54/0x80
       [000000000047d618] SyS_reboot+0x138/0x200
       [00000000004060b4] linux_sparc_syscall32+0x34/0x60
      ---[ end trace 5c439fe81c05a100 ]---
      possible reason: unannotated irqs-off.
      irq event stamp: 2010267
      hardirqs last  enabled at (2010267): [<000000000049a358>] vprintk_emit+0x4b8/0x580
      hardirqs last disabled at (2010266): [<0000000000499f08>] vprintk_emit+0x68/0x580
      softirqs last  enabled at (2010046): [<000000000045d278>] __do_softirq+0x378/0x4a0
      softirqs last disabled at (2010039): [<000000000042bf08>] do_softirq_own_stack+0x28/0x40
      Resetting ...
      ====================
      
      Use local_* variables of the hw IRQ interfaces so that IRQ tracing sees
      all of our changes.
      Reported-by: NMeelis Roos <mroos@linux.ee>
      Tested-by: NMeelis Roos <mroos@linux.ee>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bdcf81b6
    • I
      dtb: Add 10GbE node to APM X-Gene SoC device tree · 5fb32417
      Iyappan Subramanian 提交于
      Added 10GbE interface and clock nodes.
      Signed-off-by: NIyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: NKeyur Chudgar <kchudgar@apm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5fb32417
  8. 10 10月, 2014 16 次提交