提交 · 555fa17b2b8cc865c203de263443041eb75bc7f7 · openanolis / cloud-kernel

31 3月, 2015 11 次提交

crypto: sha-mb - mark Multi buffer SHA1 helper cipher · 555fa17b

由 Stephan Mueller 提交于 3月 30, 2015

Flag all Multi buffer SHA1 helper ciphers as internal ciphers
to prevent them from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

555fa17b

crypto: twofish_avx - mark Twofish AVX helper ciphers · 4dda66f6

由 Stephan Mueller 提交于 3月 30, 2015

Flag all Twofish AVX helper ciphers as internal ciphers to prevent
them from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

4dda66f6

crypto: serpent_sse2 - mark Serpent SSE2 helper ciphers · 748be1f1

由 Stephan Mueller 提交于 3月 30, 2015

Flag all Serpent SSE2 helper ciphers as internal ciphers to prevent
them from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

748be1f1

crypto: serpent_avx - mark Serpent AVX helper ciphers · 65aed539

由 Stephan Mueller 提交于 3月 30, 2015

Flag all Serpent AVX helper ciphers as internal ciphers to prevent
them from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

65aed539

crypto: serpent_avx2 - mark Serpent AVX2 helper ciphers · f82419ac

由 Stephan Mueller 提交于 3月 30, 2015

Flag all Serpent AVX2 helper ciphers as internal ciphers to prevent
them from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

f82419ac

crypto: cast6_avx - mark CAST6 helper ciphers · e69b8a46

由 Stephan Mueller 提交于 3月 30, 2015

Flag all CAST6 helper ciphers as internal ciphers to prevent them
from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

e69b8a46

crypto: camellia_aesni_avx - mark AVX Camellia helper ciphers · 7d2c31dd

由 Stephan Mueller 提交于 3月 30, 2015

Flag all AVX Camellia helper ciphers as internal ciphers to prevent
them from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

7d2c31dd

crypto: cast5_avx - mark CAST5 helper ciphers · 680574e8

由 Stephan Mueller 提交于 3月 30, 2015

Flag all CAST5 helper ciphers as internal ciphers to prevent them
from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

680574e8

crypto: camellia_aesni_avx2 - mark AES-NI Camellia helper ciphers · a62356a9

由 Stephan Mueller 提交于 3月 30, 2015

Flag all AES-NI Camellia helper ciphers as internal ciphers to
prevent them from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

a62356a9

crypto: clmulni - mark ghash clmulni helper ciphers · 6a9b52b7

由 Stephan Mueller 提交于 3月 30, 2015

Flag all ash clmulni helper ciphers as internal ciphers to prevent them
from being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

6a9b52b7

crypto: aesni - mark AES-NI helper ciphers · eabdc320

由 Stephan Mueller 提交于 3月 30, 2015

Flag all AES-NI helper ciphers as internal ciphers to prevent them from
being called by normal users.
Signed-off-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

eabdc320

16 3月, 2015 1 次提交

crypto: sha1-mb - Syntax error · c42e9902

由 Ameen Ali 提交于 3月 13, 2015

fixing a syntax-error .
Signed-off-by: NAmeen Ali <AmeenAli023@gmail.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

c42e9902

13 3月, 2015 1 次提交

crypto: don't export static symbol · 05713ba9

由 Julia Lawall 提交于 3月 11, 2015

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
type T;
identifier f;
@@

static T f (...) { ... }

@@
identifier r.f;
declarer name EXPORT_SYMBOL_GPL;
@@

-EXPORT_SYMBOL_GPL(f);
// </smpl>
Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

05713ba9

28 2月, 2015 1 次提交

crypto: aesni - make driver-gcm-aes-aesni helper a proper aead alg · 81e397d9

由 Tadeusz Struk 提交于 2月 06, 2015

Changed the __driver-gcm-aes-aesni to be a proper aead algorithm.
This required a valid setkey and setauthsize functions to be added and also
some changes to make sure that math context is not corrupted when the alg is
used directly.
Note that the __driver-gcm-aes-aesni should not be used directly by modules
that can use it in interrupt context as we don't have a good fallback mechanism
in this case.
Signed-off-by: NAdrian Hoban <adrian.hoban@intel.com>
Signed-off-by: NTadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

81e397d9

27 2月, 2015 1 次提交

crypto: sha-mb - Fix big integer constant sparse warning · 66c046b4

由 Lad, Prabhakar 提交于 2月 05, 2015

this patch fixes following sparse warning:

sha1_mb_mgr_init_avx2.c:59:31: warning: constant 0xF76543210 is so big it is long
Signed-off-by: NLad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

66c046b4

20 2月, 2015 2 次提交

x86/mm/ASLR: Avoid PAGE_SIZE redefinition for UML subarch · 570e1aa8

由 Jiri Kosina 提交于 2月 20, 2015

Commit f47233c2 ("x86/mm/ASLR: Propagate base load address
calculation") causes PAGE_SIZE redefinition warnings for UML
subarch  builds. This is caused by added includes that were
leftovers from previous  patch versions are are not actually
needed (especially page_types.h  inlcude in module.c). Drop
those stray includes.
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1502201017240.28769@pobox.suse.czSigned-off-by: NIngo Molnar <mingo@kernel.org>

570e1aa8

x86: pte_protnone() and pmd_protnone() must check entry is not present · e3a1f6ca

由 David Vrabel 提交于 2月 19, 2015

Since _PAGE_PROTNONE aliases _PAGE_GLOBAL it is only valid if
_PAGE_PRESENT is clear.  Make pte_protnone() and pmd_protnone() check
for this.

This fixes a 64-bit Xen PV guest regression introduced by 8a0516ed
("mm: convert p[te|md]_numa users to p[te|md]_protnone_numa").  Any
userspace process would endlessly fault.

In a 64-bit PV guest, userspace page table entries have _PAGE_GLOBAL set
by the hypervisor.  This meant that any fault on a present userspace
entry (e.g., a write to a read-only mapping) would be misinterpreted as
a NUMA hinting fault and the fault would not be correctly handled,
resulting in the access endlessly faulting.
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NMel Gorman <mgorman@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e3a1f6ca

19 2月, 2015 14 次提交

x86/microcode/intel: Handle truncated microcode images more robustly · 35a9ff4e

由 Quentin Casasnovas 提交于 2月 03, 2015

We do not check the input data bounds containing the microcode before
copying a struct microcode_intel_header from it. A specially crafted
microcode could cause the kernel to read invalid memory and lead to a
denial-of-service.
Signed-off-by: NQuentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1422964824-22056-3-git-send-email-quentin.casasnovas@oracle.com
[ Made error message differ from the next one and flipped comparison. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>

35a9ff4e

x86/microcode/intel: Guard against stack overflow in the loader · f84598bd

由 Quentin Casasnovas 提交于 2月 03, 2015

mc_saved_tmp is a static array allocated on the stack, we need to make
sure mc_saved_count stays within its bounds, otherwise we're overflowing
the stack in _save_mc(). A specially crafted microcode header could lead
to a kernel crash or potentially kernel execution.
Signed-off-by: NQuentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1422964824-22056-1-git-send-email-quentin.casasnovas@oracle.comSigned-off-by: NBorislav Petkov <bp@suse.de>

f84598bd

x86, mm/ASLR: Fix stack randomization on 64-bit systems · 4e7c22d4

由 Hector Marco-Gisbert 提交于 2月 14, 2015

The issue is that the stack for processes is not properly randomized on
64 bit architectures due to an integer overflow.

The affected function is randomize_stack_top() in file
"fs/binfmt_elf.c":

  static unsigned long randomize_stack_top(unsigned long stack_top)
  {
           unsigned int random_variable = 0;

           if ((current->flags & PF_RANDOMIZE) &&
                   !(current->personality & ADDR_NO_RANDOMIZE)) {
                   random_variable = get_random_int() & STACK_RND_MASK;
                   random_variable <<= PAGE_SHIFT;
           }
           return PAGE_ALIGN(stack_top) + random_variable;
           return PAGE_ALIGN(stack_top) - random_variable;
  }

Note that, it declares the "random_variable" variable as "unsigned int".
Since the result of the shifting operation between STACK_RND_MASK (which
is 0x3fffff on x86_64, 22 bits) and PAGE_SHIFT (which is 12 on x86_64):

	  random_variable <<= PAGE_SHIFT;

then the two leftmost bits are dropped when storing the result in the
"random_variable". This variable shall be at least 34 bits long to hold
the (22+12) result.

These two dropped bits have an impact on the entropy of process stack.
Concretely, the total stack entropy is reduced by four: from 2^28 to
2^30 (One fourth of expected entropy).

This patch restores back the entropy by correcting the types involved
in the operations in the functions randomize_stack_top() and
stack_maxrandom_size().

The successful fix can be tested with:

  $ for i in `seq 1 10`; do cat /proc/self/maps | grep stack; done
  7ffeda566000-7ffeda587000 rw-p 00000000 00:00 0                          [stack]
  7fff5a332000-7fff5a353000 rw-p 00000000 00:00 0                          [stack]
  7ffcdb7a1000-7ffcdb7c2000 rw-p 00000000 00:00 0                          [stack]
  7ffd5e2c4000-7ffd5e2e5000 rw-p 00000000 00:00 0                          [stack]
  ...

Once corrected, the leading bytes should be between 7ffc and 7fff,
rather than always being 7fff.
Signed-off-by: NHector Marco-Gisbert <hecmargi@upv.es>
Signed-off-by: NIsmael Ripoll <iripoll@upv.es>
[ Rebased, fixed 80 char bugs, cleaned up commit message, added test example and CVE ]
Signed-off-by: NKees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: CVE-2015-1593
Link: http://lkml.kernel.org/r/20150214173350.GA18393@www.outflux.netSigned-off-by: NBorislav Petkov <bp@suse.de>

4e7c22d4

x86/mm/init: Fix incorrect page size in init_memory_mapping() printks · f15e0518

由 Dave Hansen 提交于 2月 10, 2015

With 32-bit non-PAE kernels, we have 2 page sizes available
(at most): 4k and 4M.

Enabling PAE replaces that 4M size with a 2M one (which 64-bit
systems use too).

But, when booting a 32-bit non-PAE kernel, in one of our
early-boot printouts, we say:

  init_memory_mapping: [mem 0x00000000-0x000fffff]
   [mem 0x00000000-0x000fffff] page 4k
  init_memory_mapping: [mem 0x37000000-0x373fffff]
   [mem 0x37000000-0x373fffff] page 2M
  init_memory_mapping: [mem 0x00100000-0x36ffffff]
   [mem 0x00100000-0x003fffff] page 4k
   [mem 0x00400000-0x36ffffff] page 2M
  init_memory_mapping: [mem 0x37400000-0x377fdfff]
   [mem 0x37400000-0x377fdfff] page 4k

Which is obviously wrong.  There is no 2M page available.  This
is probably because of a badly-named variable: in the map_range
code: PG_LEVEL_2M.

Instead of renaming all the PG_LEVEL_2M's.  This patch just
fixes the printout:

  init_memory_mapping: [mem 0x00000000-0x000fffff]
   [mem 0x00000000-0x000fffff] page 4k
  init_memory_mapping: [mem 0x37000000-0x373fffff]
   [mem 0x37000000-0x373fffff] page 4M
  init_memory_mapping: [mem 0x00100000-0x36ffffff]
   [mem 0x00100000-0x003fffff] page 4k
   [mem 0x00400000-0x36ffffff] page 4M
  init_memory_mapping: [mem 0x37400000-0x377fdfff]
   [mem 0x37400000-0x377fdfff] page 4k
  BRK [0x03206000, 0x03206fff] PGTABLE
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20150210212030.665EC267@viggo.jf.intel.comSigned-off-by: NBorislav Petkov <bp@suse.de>

f15e0518

x86/mm/ASLR: Propagate base load address calculation · f47233c2

由 Jiri Kosina 提交于 2月 13, 2015

Commit:

  e2b32e67 ("x86, kaslr: randomize module base load address")

makes the base address for module to be unconditionally randomized in
case when CONFIG_RANDOMIZE_BASE is defined and "nokaslr" option isn't
present on the commandline.

This is not consistent with how choose_kernel_location() decides whether
it will randomize kernel load base.

Namely, CONFIG_HIBERNATION disables kASLR (unless "kaslr" option is
explicitly specified on kernel commandline), which makes the state space
larger than what module loader is looking at. IOW CONFIG_HIBERNATION &&
CONFIG_RANDOMIZE_BASE is a valid config option, kASLR wouldn't be applied
by default in that case, but module loader is not aware of that.

Instead of fixing the logic in module.c, this patch takes more generic
aproach. It introduces a new bootparam setup data_type SETUP_KASLR and
uses that to pass the information whether kaslr has been applied during
kernel decompression, and sets a global 'kaslr_enabled' variable
accordingly, so that any kernel code (module loading, livepatching, ...)
can make decisions based on its value.

x86 module loader is converted to make use of this flag.
Signed-off-by: NJiri Kosina <jkosina@suse.cz>
Acked-by: NKees Cook <keescook@chromium.org>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Link: https://lkml.kernel.org/r/alpine.LNX.2.00.1502101411280.10719@pobox.suse.cz
[ Always dump correct kaslr status when panicking ]
Signed-off-by: NBorislav Petkov <bp@suse.de>

f47233c2

x86/intel/quark: Fix simple_return.cocci warnings · c11a25f4

由 Fengguang Wu 提交于 2月 19, 2015

arch/x86/platform/intel-quark/imr.c:129:1-4: WARNING: end returns can be simpified

 Simplify a trivial if-return sequence.  Possibly combine with a preceding function call.

Generated by: scripts/coccinelle/misc/simple_return.cocci
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Cc: Andy Shevchenko <andy.schevchenko@gmail.com>
Cc: Ong, Boon Leong <boon.leong.ong@intel.com>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: kbuild-all@01.org
Link: http://lkml.kernel.org/r/20150219081432.GA21996@waimeaSigned-off-by: NIngo Molnar <mingo@kernel.org>

c11a25f4

x86/intel/quark: Fix ptr_ret.cocci warnings · 32d39169

由 Fengguang Wu 提交于 2月 19, 2015

arch/x86/platform/intel-quark/imr.c:280:1-3: WARNING: PTR_ERR_OR_ZERO can be used

 Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Cc: Andy Shevchenko <andy.schevchenko@gmail.com>
Cc: Ong, Boon Leong <boon.leong.ong@intel.com>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: kbuild-all@01.org
Link: http://lkml.kernel.org/r/20150219081432.GA21983@waimeaSigned-off-by: NIngo Molnar <mingo@kernel.org>

32d39169

x86/intel/quark: Add Intel Quark platform support · 8bbc2a13

由 Bryan O'Donoghue 提交于 1月 30, 2015

Add Intel Quark platform support. Quark needs to pull down all
unlocked IMRs to ensure agreement with the EFI memory map post
boot.

This patch adds an entry in Kconfig for Quark as a platform and
makes IMR support mandatory if selected.
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Suggested-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: NOng, Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
Reviewed-by: NAndy Shevchenko <andy.schevchenko@gmail.com>
Reviewed-by: NDarren Hart <dvhart@linux.intel.com>
Reviewed-by: NOng, Boon Leong <boon.leong.ong@intel.com>
Cc: dvhart@infradead.org
Link: http://lkml.kernel.org/r/1422635379-12476-3-git-send-email-pure.logic@nexus-software.ieSigned-off-by: NIngo Molnar <mingo@kernel.org>

8bbc2a13

x86/intel/quark: Add Isolated Memory Regions for Quark X1000 · 28a375df

由 Bryan O'Donoghue 提交于 1月 30, 2015

Intel's Quark X1000 SoC contains a set of registers called
Isolated Memory Regions. IMRs are accessed over the IOSF mailbox
interface. IMRs are areas carved out of memory that define
read/write access rights to the various system agents within the
Quark system. For a given agent in the system it is possible to
specify if that agent may read or write an area of memory
defined by an IMR with a granularity of 1 KiB.

Quark_SecureBootPRM_330234_001.pdf section 4.5 details the
concept of IMRs quark-x1000-datasheet.pdf section 12.7.4 details
the implementation of IMRs in silicon.

eSRAM flush, CPU Snoop write-only, CPU SMM Mode, CPU non-SMM
mode, RMU and PCIe Virtual Channels (VC0 and VC1) can have
individual read/write access masks applied to them for a given
memory region in Quark X1000. This enables IMRs to treat each
memory transaction type listed above on an individual basis and
to filter appropriately based on the IMR access mask for the
memory region. Quark supports eight IMRs.

Since all of the DMA capable SoC components in the X1000 are
mapped to VC0 it is possible to define sections of memory as
invalid for DMA write operations originating from Ethernet, USB,
SD and any other DMA capable south-cluster component on VC0.
Similarly it is possible to mark kernel memory as non-SMM mode
read/write only or to mark BIOS runtime memory as SMM mode
accessible only depending on the particular memory footprint on
a given system.

On an IMR violation Quark SoC X1000 systems are configured to
reset the system, so ensuring that the IMR memory map is
consistent with the EFI provided memory map is critical to
ensure no IMR violations reset the system.

The API for accessing IMRs is based on MTRR code but doesn't
provide a /proc or /sys interface to manipulate IMRs. Defining
the size and extent of IMRs is exclusively the domain of
in-kernel code.

Quark firmware sets up a series of locked IMRs around pieces of
memory that firmware owns such as ACPI runtime data. During boot
a series of unlocked IMRs are placed around items in memory to
guarantee no DMA modification of those items can take place.
Grub also places an unlocked IMR around the kernel boot params
data structure and compressed kernel image. It is necessary for
the kernel to tear down all unlocked IMRs in order to ensure
that the kernel's view of memory passed via the EFI memory map
is consistent with the IMR memory map. Without tearing down all
unlocked IMRs on boot transitory IMRs such as those used to
protect the compressed kernel image will cause IMR violations and system reboots.

The IMR init code tears down all unlocked IMRs and sets a
protective IMR around the kernel .text and .rodata as one
contiguous block. This sanitizes the IMR memory map with respect
to the EFI memory map and protects the read-only portions of the
kernel from unwarranted DMA access.
Tested-by: NOng, Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
Reviewed-by: NAndy Shevchenko <andy.schevchenko@gmail.com>
Reviewed-by: NDarren Hart <dvhart@linux.intel.com>
Reviewed-by: NOng, Boon Leong <boon.leong.ong@intel.com>
Cc: andy.shevchenko@gmail.com
Cc: dvhart@infradead.org
Link: http://lkml.kernel.org/r/1422635379-12476-2-git-send-email-pure.logic@nexus-software.ieSigned-off-by: NIngo Molnar <mingo@kernel.org>

28a375df

x86/apic: Fix the devicetree build in certain configs · b273c2c2

由 Ricardo Ribalda Delgado 提交于 2月 02, 2015

Without this patch:

  LD      init/built-in.o
  arch/x86/built-in.o: In function `dtb_lapic_setup': kernel/devicetree.c:155:
  undefined reference to `apic_force_enable'
  Makefile:923: recipe for target 'vmlinux' failed
  make: *** [vmlinux] Error 1
Signed-off-by: NRicardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Reviewed-by: NMaciej W. Rozycki <macro@linux-mips.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jan Beulich <JBeulich@suse.com>
Link: http://lkml.kernel.org/r/1422905231-16067-1-git-send-email-ricardo.ribalda@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

b273c2c2

kprobes/x86: Mark 2 bytes NOP as boostable · b7e37567

由 Wang Nan 提交于 2月 10, 2015

Currently, x86 kprobes is unable to boost 2 bytes nop like:

  nopl 0x0(%rax,%rax,1)

which is 0x0f 0x1f 0x44 0x00 0x00.

Such nops have exactly 5 bytes to hold a relative jmp
instruction. Boosting them should be obviously safe.

This patch enable boosting such nops by simply updating
twobyte_is_boostable[] array.
Signed-off-by: NWang Nan <wangnan0@huawei.com>
Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1423532045-41049-1-git-send-email-wangnan0@huawei.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

b7e37567

uprobes/x86: Fix 2-byte opcode table · 5154d4f2

由 Denys Vlasenko 提交于 2月 12, 2015

Enabled probing of lar, lsl, popcnt, lddqu, prefetch insns.
They should be safe to probe, they throw no exceptions.

Enabled probing of 3-byte opcodes 0f 38-3f xx - these are
vector isns, so should be safe.

Enabled probing of many currently undefined 0f xx insns.
At the rate new vector instructions are getting added,
we don't want to constantly enable more bits.
We want to only occasionally *disable* ones which
for some reason can't be probed.
This includes 0f 24,26 opcodes, which are undefined
since Pentium. On 486, they were "mov to/from test register".

Explained more fully what 0f 78,79 opcodes are.

Explained what 0f ae opcode is. (It's unclear why we don't allow
probing it, but let's not change it for now).
Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1423768732-32194-3-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

5154d4f2

uprobes/x86: Fix 1-byte opcode tables · 67fc8092

由 Denys Vlasenko 提交于 2月 12, 2015

This change fixes 1-byte opcode tables so that only insns
for which we have real reasons to disallow probing are marked
with unset bits.

To that end:

Set bits for all prefix bytes. Their setting is ignored anyway -
we check the bitmap against OPCODE1(insn), not against first
byte. Keeping them set to 0 only confuses code reader with
"why we don't support that opcode" question.

Thus: enable bytes c4,c5 in 64-bit mode (VEX prefixes).
Byte 62 (EVEX prefix) is not yet enabled since insn decoder
does not support that yet.

For 32-bit mode, enable probing of opcodes 63 (arpl) and d6
(salc). They don't require any special handling.

For 64-bit mode, disable 9a and ea - these undefined opcodes
were mistakenly left enabled.
Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1423768732-32194-2-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

67fc8092

uprobes/x86: Add comment with insn opcodes, mnemonics and why we dont support them · 097f4e5e

由 Denys Vlasenko 提交于 2月 12, 2015

After adding these, it's clear we have some awkward choices
there. Some valid instructions are prohibited from uprobing
while several invalid ones are allowed.

Hopefully future edits to the good-opcode tables will fix wrong
bits or explain why those bits are not wrong.

No actual code changes.
Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1423768732-32194-1-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

097f4e5e

18 2月, 2015 3 次提交

x86/irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable() · d97eb896

由 Joerg Roedel 提交于 2月 04, 2015

When an interrupt is migrated away from a cpu it will stay
in its vector_irq array until smp_irq_move_cleanup_interrupt
succeeded. The cfg->move_in_progress flag is cleared already
when the IPI was sent.

When the interrupt is destroyed after migration its 'struct
irq_desc' is freed and the vector_irq arrays are cleaned up.
But since cfg->move_in_progress is already 0 the references
at cpus before the last migration will not be cleared. So
this would leave a reference to an already destroyed irq
alive.

When the cpu is taken down at this point, the
check_irq_vectors_for_cpu_disable() function finds a valid irq
number in the vector_irq array, but gets NULL for its
descriptor and dereferences it, causing a kernel panic.

This has been observed on real systems at shutdown. Add a
check to check_irq_vectors_for_cpu_disable() for a valid
'struct irq_desc' to prevent this issue.
Signed-off-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NJiang Liu <jiang.liu@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: alnovak@suse.com
Cc: joro@8bytes.org
Link: http://lkml.kernel.org/r/20150204132754.GA10078@suse.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

d97eb896

x86/irq: Fix regression caused by commit · 1ea76fba

由 Jiang Liu 提交于 2月 16, 2015

Commit b568b860 ("Treat SCI interrupt as normal GSI interrupt")
accidently removes support of legacy PIC interrupt when fixing a
regression for Xen, which causes a nasty regression on HP/Compaq
nc6000 where we fail to register the ACPI interrupt, and thus
lose eg. thermal notifications leading a potentially overheated
machine.

So reintroduce support of legacy PIC based ACPI SCI interrupt.
Reported-by: NVille Syrjälä <syrjala@sci.fi>
Tested-by: NVille Syrjälä <syrjala@sci.fi>
Signed-off-by: NJiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NPavel Machek <pavel@ucw.cz>
Cc: <stable@vger.kernel.org> # 3.19+
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: linux-pm@vger.kernel.org
Link: http://lkml.kernel.org/r/1424052673-22974-1-git-send-email-jiang.liu@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

1ea76fba

x86/spinlocks/paravirt: Fix memory corruption on unlock · d6abfdb2

由 Raghavendra K T 提交于 2月 06, 2015

Paravirt spinlock clears slowpath flag after doing unlock.
As explained by Linus currently it does:

                prev = *lock;
                add_smp(&lock->tickets.head, TICKET_LOCK_INC);

                /* add_smp() is a full mb() */

                if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
                        __ticket_unlock_slowpath(lock, prev);

which is *exactly* the kind of things you cannot do with spinlocks,
because after you've done the "add_smp()" and released the spinlock
for the fast-path, you can't access the spinlock any more.  Exactly
because a fast-path lock might come in, and release the whole data
structure.

Linus suggested that we should not do any writes to lock after unlock(),
and we can move slowpath clearing to fastpath lock.

So this patch implements the fix with:

 1. Moving slowpath flag to head (Oleg):
    Unlocked locks don't care about the slowpath flag; therefore we can keep
    it set after the last unlock, and clear it again on the first (try)lock.
    -- this removes the write after unlock. note that keeping slowpath flag would
    result in unnecessary kicks.
    By moving the slowpath flag from the tail to the head ticket we also avoid
    the need to access both the head and tail tickets on unlock.

 2. use xadd to avoid read/write after unlock that checks the need for
    unlock_kick (Linus):
    We further avoid the need for a read-after-release by using xadd;
    the prev head value will include the slowpath flag and indicate if we
    need to do PV kicking of suspended spinners -- on modern chips xadd
    isn't (much) more expensive than an add + load.

Result:
 setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
 benchmark overcommit %improve
 kernbench  1x           -0.13
 kernbench  2x            0.02
 dbench     1x           -1.77
 dbench     2x           -0.63

[Jeremy: Hinted missing TICKET_LOCK_INC for kick]
[Oleg: Moved slowpath flag to head, ticket_equals idea]
[PeterZ: Added detailed changelog]
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Reported-by: NSasha Levin <sasha.levin@oracle.com>
Tested-by: NSasha Levin <sasha.levin@oracle.com>
Signed-off-by: NRaghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Fernando Luis Vázquez Cao <fernando_b1@lab.ntt.co.jp>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: a.ryabinin@samsung.com
Cc: dave@stgolabs.net
Cc: hpa@zytor.com
Cc: jasowang@redhat.com
Cc: jeremy@goop.org
Cc: paul.gortmaker@windriver.com
Cc: riel@redhat.com
Cc: tglx@linutronix.de
Cc: waiman.long@hp.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20150215173043.GA7471@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

d6abfdb2

14 2月, 2015 6 次提交

kasan: enable instrumentation of global variables · bebf56a1

由 Andrey Ryabinin 提交于 2月 13, 2015

This feature let us to detect accesses out of bounds of global variables.
This will work as for globals in kernel image, so for globals in modules.
Currently this won't work for symbols in user-specified sections (e.g.
__init, __read_mostly, ...)

The idea of this is simple.  Compiler increases each global variable by
redzone size and add constructors invoking __asan_register_globals()
function.  Information about global variable (address, size, size with
redzone ...) passed to __asan_register_globals() so we could poison
variable's redzone.

This patch also forces module_alloc() to return 8*PAGE_SIZE aligned
address making shadow memory handling (
kasan_module_alloc()/kasan_module_free() ) more simple.  Such alignment
guarantees that each shadow page backing modules address space correspond
to only one module_alloc() allocation.
Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: NAndrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bebf56a1

mm: vmalloc: pass additional vm_flags to __vmalloc_node_range() · cb9e3c29

由 Andrey Ryabinin 提交于 2月 13, 2015

For instrumenting global variables KASan will shadow memory backing memory
for modules.  So on module loading we will need to allocate memory for
shadow and map it at address in shadow that corresponds to the address
allocated in module_alloc().

__vmalloc_node_range() could be used for this purpose, except it puts a
guard hole after allocated area.  Guard hole in shadow memory should be a
problem because at some future point we might need to have a shadow memory
at address occupied by guard hole.  So we could fail to allocate shadow
for module_alloc().

Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into
__vmalloc_node_range().  Add new parameter 'vm_flags' to
__vmalloc_node_range() function.
Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: NAndrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cb9e3c29

kasan: enable stack instrumentation · c420f167

由 Andrey Ryabinin 提交于 2月 13, 2015

Stack instrumentation allows to detect out of bounds memory accesses for
variables allocated on stack.  Compiler adds redzones around every
variable on stack and poisons redzones in function's prologue.

Such approach significantly increases stack usage, so all in-kernel stacks
size were doubled.
Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: NAndrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c420f167

x86_64: kasan: add interceptors for memset/memmove/memcpy functions · 393f203f

由 Andrey Ryabinin 提交于 2月 13, 2015

Recently instrumentation of builtin functions calls was removed from GCC
5.0.  To check the memory accessed by such functions, userspace asan
always uses interceptors for them.

So now we should do this as well.  This patch declares
memset/memmove/memcpy as weak symbols.  In mm/kasan/kasan.c we have our
own implementation of those functions which checks memory before accessing
it.

Default memset/memmove/memcpy now now always have aliases with '__'
prefix.  For files that built without kasan instrumentation (e.g.
mm/slub.c) original mem* replaced (via #define) with prefixed variants,
cause we don't want to check memory accesses there.
Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: NAndrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

393f203f

x86_64: add KASan support · ef7f0d6a

由 Andrey Ryabinin 提交于 2月 13, 2015

This patch adds arch specific code for kernel address sanitizer.

16TB of virtual addressed used for shadow memory.  It's located in range
[ffffec0000000000 - fffffc0000000000] between vmemmap and %esp fixup
stacks.

At early stage we map whole shadow region with zero page.  Latter, after
pages mapped to direct mapping address range we unmap zero pages from
corresponding shadow (see kasan_map_shadow()) and allocate and map a real
shadow memory reusing vmemmap_populate() function.

Also replace __pa with __pa_nodebug before shadow initialized.  __pa with
CONFIG_DEBUG_VIRTUAL=y make external function call (__phys_addr)
__phys_addr is instrumented, so __asan_load could be called before shadow
area initialized.
Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: NAndrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jim Davis <jim.epost@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ef7f0d6a

x86: use %*pb[l] to print bitmaps including cpumasks and nodemasks · bf58b487

由 Tejun Heo 提交于 2月 13, 2015

printk and friends can now format bitmaps using '%*pb[l]'.  cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.

* Unnecessary buffer size calculation and condition on the lenght
  removed from intel_cacheinfo.c::show_shared_cpu_map_func().

* uv_nmi_nr_cpus_pr() got overly smart and implemented "..."
  abbreviation if the output stretched over the predefined 1024 byte
  buffer.  Replaced with plain printk.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bf58b487

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功