1. 29 January 2015 (2 commits)
  2. 21 January 2015 (1 commit)
  3. 13 January 2015 (2 commits)
    • ARM: 8255/1: perf: Prevent wraparound during overflow · 2d9ed740
      Committed by Daniel Thompson
      If the overflow threshold for a counter is set above or near the
      0xffffffff boundary then the kernel may lose track of the overflow,
      causing only events that occur *after* the overflow to be recorded.
      Specifically, the problem occurs when the value of the performance
      counter overtakes its original programmed value due to wraparound.
      
      Typical solutions to this problem are either to avoid programming in
      values likely to be overtaken or to treat the overflow bit as the 33rd
      bit of the counter.
      
      It's somewhat fiddly to refactor the code to correctly handle the 33rd
      bit during irqsave sections (context switches for example) so instead
      we take the simpler approach of avoiding values likely to be overtaken.
      
      We set the limit to half of max_period because this matches the limit
      imposed in __hw_perf_event_init(). This causes a doubling of the
      interrupt rate for large threshold values; however, even with a very
      fast counter ticking at 4GHz the interrupt rate would only be ~1Hz.
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
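      A minimal sketch of the clamp described above (illustrative userspace
      C, not the kernel patch itself; MAX_PERIOD and the helper name are
      made up for the example):

		#include <stdint.h>
		#include <stdio.h>

		/* The ARM PMU counts up and interrupts on overflow, so a
		 * period is programmed as (u32)(-left). If 'left' is close
		 * to max_period, the live counter can overtake the freshly
		 * programmed value before the overflow fires. Clamping to
		 * max_period/2 keeps ~2^31 counts of headroom. */
		#define MAX_PERIOD 0xffffffffULL

		static uint32_t program_counter_value(uint64_t left)
		{
			if (left > (MAX_PERIOD >> 1))	/* the fix */
				left = MAX_PERIOD >> 1;
			return (uint32_t)(MAX_PERIOD - left + 1);
		}

		int main(void)
		{
			printf("left=0x10000    -> 0x%08x\n",
			       program_counter_value(0x10000));
			printf("left=0xfffffff0 -> 0x%08x (clamped)\n",
			       program_counter_value(0xfffffff0ULL));
			return 0;
		}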
    • ARM: 8266/1: Remove early stack deallocation from restore_user_regs · a18f3645
      Committed by Daniel Thompson
      Currently restore_user_regs deallocates the SVC stack early in
      its execution and relies on no exception being taken between
      the deallocation and the registers being restored. The introduction
      of a default FIQ handler that also uses the SVC stack breaks this
      assumption and can result in corrupted register state.
      
      This patch works around the problem by removing the early
      stack deallocation and using r2 as a temporary instead. I have
      not found a way to do this without introducing an extra mov
      instruction to the macro.
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  4. 10 January 2015 (1 commit)
  5. 08 January 2015 (3 commits)
    • ARM: 8253/1: mm: use phys_addr_t type in map_lowmem() for kernel mem region · ac084688
      Committed by Grygorii Strashko
      Currently the local variables kernel_x_start and kernel_x_end are
      defined using the 'unsigned long' type, which is wrong because they
      represent a physical memory range and will be truncated if LPAE is
      enabled. As a result, all following code in map_lowmem() will not
      work correctly.
      
      For example, Keystone 2 boot is broken because
       kernel_x_start == 0x0000 0000
       kernel_x_end   == 0x0080 0000
      
      instead of
       kernel_x_start == 0x0000 0008 0000 0000
       kernel_x_end   == 0x0000 0008 0080 0000
      and as a result the whole of low memory is mapped with MT_MEMORY_RW
      permissions by the code below (start >= kernel_x_end):
      		} else if (start >= kernel_x_end) {
      			map.pfn = __phys_to_pfn(start);
      			map.virtual = __phys_to_virt(start);
      			map.length = end - start;
      			map.type = MT_MEMORY_RW;
      
      			create_mapping(&map);
      		}
      
      Hence, fix it by using the phys_addr_t type for the variables
      kernel_x_start and kernel_x_end.
      Tested-by: Murali Karicheri <m-karicheri2@ti.com>
      Signed-off-by: Grygorii Strashko <grygorii.strashko@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
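      The fix itself is a one-line type change; the fragment below (plain
      userspace C, with values taken from the commit message) just
      demonstrates the truncation that 'unsigned long' causes under LPAE:

		#include <stdint.h>
		#include <stdio.h>

		int main(void)
		{
			/* Keystone 2 kernel base, a 64-bit physical address */
			uint64_t kernel_x_start = 0x0000000800000000ULL;
			/* what a 32-bit 'unsigned long' on ARM retains */
			uint32_t truncated = (uint32_t)kernel_x_start;

			printf("phys_addr_t  : 0x%016llx\n",
			       (unsigned long long)kernel_x_start);
			printf("unsigned long: 0x%08x (top bits lost)\n",
			       truncated);
			return 0;
		}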
    • ARM: 8249/1: mm: dump: don't skip regions · cca547e9
      Committed by Mark Rutland
      Currently the arm page table dumping code starts dumping page tables
      from USER_PGTABLES_CEILING. This is unnecessary for skipping any
      entries related to userspace, as swapper_pg_dir does not contain such
      entries, and it results in a couple of unfortunate side effects.
      
      Firstly, any kernel mappings which might exist below
      USER_PGTABLES_CEILING will not be accounted in the dump output. This
      masks any entries erroneously created below this address.
      
      Secondly, if the final page table entry walked is part of a valid
      mapping the page table dumping code will not log the region this entry
      is part of, as the final note_page call in walk_pgd will trigger an
      early return when 0 < USER_PGTABLES_CEILING. Luckily this isn't seen on
      contemporary systems as they typically don't have enough RAM to extend
      the linear mapping right to the end of the address space.
      
      Due to the way addr is constructed in the walk_* functions, it can never
      be less than USER_PGTABLES_CEILING when walking the page tables, so it
      is not necessary to avoid dereferencing invalid table addresses. The
      existing checks for st->current_prot and st->marker[1].start_address are
      sufficient to ensure we will not print and/or dereference garbage when
      trying to log information.
      
      This patch removes both problematic uses of USER_PGTABLES_CEILING from
      the arm page table dumping code, preventing both of these issues. We
      will now report any low mappings, and the final note_page call will not
      return early, ensuring all regions are logged.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
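      A hedged sketch of the shape of the change, reconstructed from the
      commit message (simplified; the exact arch/arm/mm/dump.c code may
      differ):

		static void walk_pgd(struct seq_file *m)
		{
			/* was: swapper_pg_dir + pgd_index(USER_PGTABLES_CEILING) */
			pgd_t *pgd = swapper_pg_dir;
			struct pg_state st = { .seq = m, .marker = address_markers };
			unsigned long addr;
			unsigned int i;

			for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
				/* was: USER_PGTABLES_CEILING + i * PGDIR_SIZE */
				addr = i * PGDIR_SIZE;
				if (!pgd_none(*pgd))
					walk_pud(&st, pgd, addr);
				else
					note_page(&st, addr, 1, pgd_val(*pgd));
			}

			/* flush the final region; no longer cut short by an
			 * early return now that the walk starts at 0 */
			note_page(&st, 0, 0, 0);
		}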
    • ARM: wire up execveat syscall · 841ee230
      Committed by Russell King
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
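      A short usage sketch for the newly wired syscall (assumes a libc
      that defines SYS_execveat; invoked via syscall(2) since libc
      wrappers postdated this commit):

		#define _GNU_SOURCE
		#include <fcntl.h>
		#include <unistd.h>
		#include <sys/syscall.h>

		#ifndef AT_EMPTY_PATH
		#define AT_EMPTY_PATH 0x1000
		#endif

		int main(void)
		{
			/* execveat() can exec the file behind an fd itself */
			int fd = open("/bin/true", O_PATH | O_CLOEXEC);
			char *argv[] = { "true", NULL };
			char *envp[] = { NULL };

			if (fd >= 0)
				syscall(SYS_execveat, fd, "", argv, envp,
					AT_EMPTY_PATH);
			return 1; /* reached only on failure */
		}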
  6. 20 December 2014 (15 commits)
  7. 19 December 2014 (2 commits)
  8. 18 December 2014 (12 commits)
  9. 17 December 2014 (2 commits)
    • KVM: PPC: Book3S HV: Improve H_CONFER implementation · 90fd09f8
      Committed by Sam Bobroff
      Currently the H_CONFER hcall is implemented in kernel virtual mode,
      meaning that whenever a guest thread does an H_CONFER, all the threads
      in that virtual core have to exit the guest.  This is bad for
      performance because it interrupts the other threads even if they
      are doing useful work.
      
      The H_CONFER hcall is called by a guest VCPU when it is spinning on a
      spinlock and it detects that the spinlock is held by a guest VCPU that
      is currently not running on a physical CPU.  The idea is to give this
      VCPU's time slice to the holder VCPU so that it can make progress
      towards releasing the lock.
      
      To avoid having the other threads exit the guest unnecessarily,
      we add a real-mode implementation of H_CONFER that checks whether
      the other threads are doing anything.  If all the other threads
      are idle (i.e. in H_CEDE) or trying to confer (i.e. in H_CONFER),
      it returns H_TOO_HARD which causes a guest exit and allows the
      H_CONFER to be handled in virtual mode.
      
      Otherwise it spins for a short time (up to 10 microseconds) to give
      other threads the chance to observe that this thread is trying to
      confer.  The spin loop also terminates when any thread exits the guest
      or when all other threads are idle or trying to confer.  If the
      timeout is reached, the H_CONFER returns H_SUCCESS.  In this case the
      guest VCPU will recheck the spinlock word and most likely call
      H_CONFER again.
      
      This also improves the implementation of the H_CONFER virtual mode
      handler.  If the VCPU is part of a virtual core (vcore) which is
      runnable, there will be a 'runner' VCPU which has taken responsibility
      for running the vcore.  In this case we yield to the runner VCPU
      rather than the target VCPU.
      
      We also introduce a check on the target VCPU's yield count: if it
      differs from the yield count passed to H_CONFER, the target VCPU
      has run since H_CONFER was called and may have already released
      the lock.  This check is required by PAPR.
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
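      A hedged sketch of the real-mode handler described above (helper and
      field names approximate the upstream patch; treat it as a sketch of
      the spin-and-punt logic, not the definitive implementation):

		int kvmppc_rm_h_confer(struct kvm_vcpu *vcpu, int target,
				       unsigned int yield_count)
		{
			struct kvmppc_vcore *vc = vcpu->arch.vcore;
			/* spin for at most ~10 microseconds */
			u64 stop = get_tb() + 10 * tb_ticks_per_usec;
			int rv = H_SUCCESS;	/* default: don't yield */

			set_bit(vcpu->arch.ptid, &vc->conferring_threads);
			while (get_tb() < stop && !VCORE_EXIT_COUNT(vc)) {
				int running    = VCORE_ENTRY_COUNT(vc);
				int ceded      = hweight32(vc->napping_threads);
				int conferring = hweight32(vc->conferring_threads);

				/* all other threads idle or conferring:
				 * exit the guest and confer in virtual mode */
				if (ceded + conferring >= running) {
					rv = H_TOO_HARD;
					break;
				}
			}
			clear_bit(vcpu->arch.ptid, &vc->conferring_threads);

			return rv;
		}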
    • KVM: PPC: Book3S HV: Fix endianness of instruction obtained from HEIR register · 4a157d61
      Committed by Paul Mackerras
      There are two ways in which a guest instruction can be obtained from
      the guest in the guest exit code in book3s_hv_rmhandlers.S.  If the
      exit was caused by a Hypervisor Emulation interrupt (i.e. an illegal
      instruction), the offending instruction is in the HEIR register
      (Hypervisor Emulation Instruction Register).  If the exit was caused
      by a load or store to an emulated MMIO device, we load the instruction
      from the guest by turning data relocation on and loading the instruction
      with an lwz instruction.
      
      Unfortunately, in the case where the guest has opposite endianness to
      the host, these two methods give results of different endianness, but
      both get put into vcpu->arch.last_inst.  The HEIR value has been loaded
      using guest endianness, whereas the lwz will load the instruction using
      host endianness.  The rest of the code that uses vcpu->arch.last_inst
      assumes it was loaded using host endianness.
      
      To fix this, we define a new vcpu field to store the HEIR value.  Then,
      in kvmppc_handle_exit_hv(), we transfer the value from this new field to
      vcpu->arch.last_inst, doing a byte-swap if the guest and host endianness
      differ.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
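      A hedged sketch of the transfer in kvmppc_handle_exit_hv() (the
      field name emul_inst follows the upstream patch; surrounding
      details may differ slightly):

		case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
			/* HEIR was captured in guest endianness; normalize
			 * to host order before the common code uses it */
			vcpu->arch.last_inst = kvmppc_need_byteswap(vcpu) ?
				swab32(vcpu->arch.emul_inst) :
				vcpu->arch.emul_inst;
			break;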