1. 22 Feb 2019 (1 commit)
  2. 21 Feb 2019 (1 commit)
    • KVM: Call kvm_arch_memslots_updated() before updating memslots · 15248258
      Authored by Sean Christopherson
      kvm_arch_memslots_updated() is at this point in time an x86-specific
      hook for handling MMIO generation wraparound.  x86 stashes 19 bits of
      the memslots generation number in its MMIO sptes in order to avoid
      full page fault walks for repeat faults on emulated MMIO addresses.
      Because only 19 bits are used, wrapping the MMIO generation number is
      possible, if unlikely.  kvm_arch_memslots_updated() alerts x86 that
      the generation has changed so that it can invalidate all MMIO sptes in
      case the effective MMIO generation has wrapped so as to avoid using a
      stale spte, e.g. a (very) old spte that was created with generation==0.
      
      Given that the purpose of kvm_arch_memslots_updated() is to prevent
      consuming stale entries, it needs to be called before the new generation
      is propagated to memslots.  Invalidating the MMIO sptes after updating
      memslots means that there is a window where a vCPU could dereference
      the new memslots generation, e.g. 0, and incorrectly reuse an old MMIO
      spte that was created with (pre-wrap) generation==0.
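
      A minimal userspace sketch of the wrap hazard described above; the
      19-bit width comes from this commit, while MMIO_GEN_BITS, spte_gen()
      and the sample values are names invented for the illustration:

      #include <stdint.h>
      #include <stdio.h>

      /* Only 19 bits of the memslots generation fit in an MMIO spte, so
       * the stored value eventually wraps. */
      #define MMIO_GEN_BITS 19
      #define MMIO_GEN_MASK ((1u << MMIO_GEN_BITS) - 1)

      static uint32_t spte_gen(uint64_t memslots_gen)
      {
              return (uint32_t)(memslots_gen & MMIO_GEN_MASK);
      }

      int main(void)
      {
              uint64_t stale = 0;                       /* spte created at generation 0 */
              uint64_t wrapped = 1ull << MMIO_GEN_BITS; /* generation has wrapped to 0 */

              /* The truncated values collide, so the stale spte would be
               * reused unless the architecture hook zapped it before the
               * new generation became visible. */
              printf("stale=%u wrapped=%u -> %s\n",
                     spte_gen(stale), spte_gen(wrapped),
                     spte_gen(stale) == spte_gen(wrapped) ? "collision" : "ok");
              return 0;
      }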
      
      Fixes: e59dbe09 ("KVM: Introduce kvm_arch_memslots_updated()")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      15248258
  3. 05 Feb 2019 (8 commits)
  4. 12 Jan 2019 (2 commits)
  5. 06 Jan 2019 (2 commits)
  6. 05 Jan 2019 (2 commits)
    • mm: treewide: remove unused address argument from pte_alloc functions · 4cf58924
      Authored by Joel Fernandes (Google)
      Patch series "Add support for fast mremap".
      
      This series speeds up the mremap(2) syscall by copying page tables at
      the PMD level even for non-THP systems.  There was a concern that the
      extra 'address' argument that mremap passes to pte_alloc might do
      something subtle and architecture-specific in the future that would
      make the scheme not work.  We also find that there is no point in
      passing the 'address' to pte_alloc, since it is unused.  This patch
      therefore removes the argument tree-wide, resulting in a nice negative
      diff, while also verifying along the way that the enabled
      architectures do not do anything funky with the 'address' argument
      that would go unnoticed by the optimization.
      
      Build and boot tested on x86-64.  Build tested on arm64.  The config
      enablement patch for arm64 will be posted in the future after more
      testing.
      
      The changes were obtained by applying the following Coccinelle script
      (thanks to Julia for answering all my Coccinelle questions!).  The
      following fix-ups were done manually:
      * Removal of the address argument from pte_fragment_alloc
      * Removal of the pte_alloc_one_fast definitions from m68k and microblaze.
      
      // Options: --include-headers --no-includes
      // Note: I split the 'identifier fn' line, so if you are manually
      // running it, please unsplit it so it runs for you.
      
      virtual patch
      
      @pte_alloc_func_def depends on patch exists@
      identifier E2;
      identifier fn =~
      "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      type T2;
      @@
      
       fn(...
      - , T2 E2
       )
       { ... }
      
      @pte_alloc_func_proto_noarg depends on patch exists@
      type T1, T2, T3, T4;
      identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      @@
      
      (
      - T3 fn(T1, T2);
      + T3 fn(T1);
      |
      - T3 fn(T1, T2, T4);
      + T3 fn(T1, T2);
      )
      
      @pte_alloc_func_proto depends on patch exists@
      identifier E1, E2, E4;
      type T1, T2, T3, T4;
      identifier fn =~
      "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      @@
      
      (
      - T3 fn(T1 E1, T2 E2);
      + T3 fn(T1 E1);
      |
      - T3 fn(T1 E1, T2 E2, T4 E4);
      + T3 fn(T1 E1, T2 E2);
      )
      
      @pte_alloc_func_call depends on patch exists@
      expression E2;
      identifier fn =~
      "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      @@
      
       fn(...
      -,  E2
       )
      
      @pte_alloc_macro depends on patch exists@
      identifier fn =~
      "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      identifier a, b, c;
      expression e;
      position p;
      @@
      
      (
      - #define fn(a, b, c) e
      + #define fn(a, b) e
      |
      - #define fn(a, b) e
      + #define fn(a) e
      )
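
      For concreteness, this is the shape of the change the script applies
      to one of the matched functions; the before/after prototypes below
      are a sketch of __pte_alloc as I read the interface, not a quote
      from the diff:

      /* before: 'address' was threaded through but never used */
      int __pte_alloc(struct mm_struct *mm, pmd_t *pmd, unsigned long address);

      /* after */
      int __pte_alloc(struct mm_struct *mm, pmd_t *pmd);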
      
      Link: http://lkml.kernel.org/r/20181108181201.88826-2-joelaf@google.com
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
      Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4cf58924
    • fls: change parameter to unsigned int · 3fc2579e
      Authored by Matthew Wilcox
      When testing in userspace, UBSAN pointed out that shifting into the sign
      bit is undefined behaviour.  It doesn't really make sense to ask for the
      highest set bit of a negative value, so just turn the argument type into
      an unsigned int.
      
      Some architectures (e.g. ppc) already had it declared as an unsigned
      int, so I don't expect too many problems.
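
      A minimal sketch of a generic fls() with the new signature;
      fls_generic is a name made up for this example, and the kernel's
      real implementations are per-architecture:

      #include <stdio.h>

      /* Position of the highest set bit, 1-based; returns 0 for 0.
       * Taking unsigned int sidesteps the undefined behaviour of
       * shifting a signed value into the sign bit. */
      static int fls_generic(unsigned int x)
      {
              int r = 0;

              while (x) {
                      x >>= 1;
                      r++;
              }
              return r;
      }

      int main(void)
      {
              printf("fls(1) = %d\n", fls_generic(1u));                   /* 1 */
              printf("fls(0x80000000) = %d\n", fls_generic(0x80000000u)); /* 32 */
              return 0;
      }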
      
      Link: http://lkml.kernel.org/r/20181105221117.31828-1-willy@infradead.org
      Signed-off-by: Matthew Wilcox <willy@infradead.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3fc2579e
  7. 04 Jan 2019 (1 commit)
    • Remove 'type' argument from access_ok() function · 96d4f267
      Authored by Linus Torvalds
      Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
      of the user address range verification function since we got rid of the
      old racy i386-only code to walk page tables by hand.
      
      It existed because the original 80386 would not honor the write protect
      bit when in kernel mode, so you had to do COW by hand before doing any
      user access.  But we haven't supported that in a long time, and these
      days the 'type' argument is a purely historical artifact.
      
      A discussion about extending 'user_access_begin()' to do the range
      checking resulted in this patch, because there is no way we're going
      to move the old VERIFY_xyz interface to that model.  And it's best
      done at the end of the merge window when I've done most of my merges,
      so let's just get this done once and for all.
      
      This patch was mostly done with a sed-script, with manual fix-ups for
      the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
      
      There were a couple of notable cases:
      
       - csky still had the old "verify_area()" name as an alias.
      
       - the iter_iov code had magical hardcoded knowledge of the actual
         values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
         really used it)
      
       - microblaze used the type argument for a debug printout
      
      but other than those oddities this should be a total no-op patch.
      
      I tried to fix up all architectures, did fairly extensive grepping for
      access_ok() uses, and the changes are trivial, but I may have missed
      something.  Any missed conversion should be trivially fixable, though.
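
      The shape of the conversion, with a userspace stand-in for the macro
      so the fragment compiles; only the dropped VERIFY_WRITE argument is
      the point here:

      #include <stdio.h>

      /* Stand-in so this compiles in userspace; the real access_ok()
       * validates a user address range and, after this patch, takes no
       * 'type' argument. */
      #define access_ok(addr, size) ((addr) != 0 && (size) > 0)

      int main(void)
      {
              char buf[16];

              /* before: if (!access_ok(VERIFY_WRITE, buf, sizeof(buf))) */
              if (!access_ok(buf, sizeof(buf)))
                      return 1;
              printf("range ok\n");
              return 0;
      }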
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      96d4f267
  8. 07 Dec 2018 (1 commit)
  9. 30 Nov 2018 (2 commits)
  10. 06 Nov 2018 (1 commit)
  11. 02 Nov 2018 (2 commits)
    • s390/kasan: increase instrumented stack size to 64k · 9fed920e
      Authored by Vasily Gorbik
      Increase the kasan instrumented kernel stack size from 32k to 64k.
      Other architectures seem to get away with just doubling the kernel
      stack size under kasan, but on s390 this appears to be not enough due
      to the bigger frame size.  The particular pain point is kasan inlined
      checks (CONFIG_KASAN_INLINE vs CONFIG_KASAN_OUTLINE).  With inlined
      checks, one particular case hitting a stack overflow is an fs sync on
      an xfs filesystem:
      
       #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
       #1 [9a0684a8]  432 bytes  check_usage at 34c710
       #2 [9a068658]  1048 bytes  validate_chain at 35044a
       #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
       #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
       #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
       #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
       #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
       #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
       #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
       #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
       #11 [9a069e78]  200 bytes  insert_work at 23fc2e
       #12 [9a069f40]  648 bytes  __queue_work at 2487c0
       #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
       #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
       #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
       #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
       #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
       #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
       #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
       #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
       #21 [9a06acf8]  320 bytes  schedule at 219e476
       #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
       #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
       #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
       #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
       #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
       #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
       #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
       #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
       #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
       #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
       #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
       #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
       #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
       #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
       #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
       #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
       #38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
       #39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
       #40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
       #41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
       #42 [9a06e518]  88 bytes  do_writepages at 73fe6a
       #43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
       #44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
       #45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
       #46 [9a06ec98]  928 bytes  wb_writeback at a2500e
       #47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
       #48 [9a06f388]  536 bytes  wb_workfn at a28228
       #49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
       #50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
       #51 [9a06fe40]  104 bytes  kthread at 26545a
       #52 [9a06fea8]             kernel_thread_starter at 21b6b62
      
      To be able to increase the stack size to 64k, reuse the LLILL
      instruction in the __switch_to function to load the
      64k - STACK_FRAME_OVERHEAD - __PT_SIZE (65192) value as an unsigned
      immediate.
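
      The arithmetic, for reference: 65192 = 65536 - 344, where 344 is the
      combined STACK_FRAME_OVERHEAD + __PT_SIZE, derived from the figures
      in the text.  Since 65192 does not fit in a signed 16-bit immediate,
      it has to be loaded as an unsigned halfword, which is what LLILL
      does:

      #include <stdio.h>

      int main(void)
      {
              /* 64k stack minus STACK_FRAME_OVERHEAD + __PT_SIZE (344 combined) */
              int val = 64 * 1024 - 344;

              printf("%d\n", val);         /* 65192, per the commit message */
              printf("%d\n", val > 32767); /* 1: too big for a signed 16-bit
                                              immediate, hence the unsigned load */
              return 0;
      }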
      Reported-by: Benjamin Block <bblock@linux.ibm.com>
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      9fed920e
    • s390/mm: fix mis-accounting of pgtable_bytes · e12e4044
      Authored by Martin Schwidefsky
      In case a fork or a clone system call fails in copy_process and the
      error handling does the mmput() at the bad_fork_cleanup_mm label, the
      following warning message will appear on the console:
      
        BUG: non-zero pgtables_bytes on freeing mm: 16384
      
      The reason for that is the tricks we play with mm_inc_nr_puds() and
      mm_inc_nr_pmds() in init_new_context().
      
      A normal 64-bit process has 3 levels of page tables; the p4d level
      and the pud level are folded.  On process termination the
      free_pud_range() function in mm/memory.c will subtract 16KB from
      pgtable_bytes with a mm_dec_nr_puds() call, but there is not actually
      a real pud table.
      
      One issue with this is the fact that pgtable_bytes is usually off by
      a few kilobytes, but the more severe problem is that for a failed
      fork or clone the free_pgtables() function is not called.  In this
      case there is no mm_dec_nr_puds() or mm_dec_nr_pmds() to balance the
      mm_inc_nr_puds() and mm_inc_nr_pmds() in init_new_context().
      pgtable_bytes will be off by 16384 or 32768 bytes and we get the BUG
      message.  The message itself is purely cosmetic, but annoying.
      
      To fix this, override the mm_pmd_folded, mm_pud_folded and
      mm_p4d_folded functions to check for the true size of the address
      space.
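
      A sketch of the shape of one such override; the comparison of the
      address-space (ASCE) limit against a region size reflects my reading
      of the fix, so treat the field and constant names as illustrative:

      /* The pud level only really exists for sufficiently large address
       * spaces, so report it as folded otherwise and skip the accounting. */
      static inline int mm_pud_folded(struct mm_struct *mm)
      {
              return mm->context.asce_limit <= _REGION2_SIZE;
      }
      #define mm_pud_folded(mm) mm_pud_folded(mm)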
      Reported-by: Li Wang <liwang@redhat.com>
      Tested-by: Li Wang <liwang@redhat.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      e12e4044
  12. 31 Oct 2018 (1 commit)
  13. 22 Oct 2018 (1 commit)
  14. 10 Oct 2018 (2 commits)
  15. 09 Oct 2018 (13 commits)