提交 · 0f80bc85c587e8fdeecece4f294a47eca4922ea2 · openanolis / cloud-kernel

18 9月, 2005 4 次提交

[PATCH] uml: move libc code out of mem_user.c and tempfile.c · 0f80bc85

由 Jeff Dike 提交于 9月 16, 2005

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves all system calls from mem_user.c and tempfile.c files under
os-Linux dir.
Signed-off-by: NGennady Sharapov <Gennady.V.Sharapov@intel.com>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

0f80bc85

[PATCH] uml: preserve errno in error paths · b4fd310e

由 Jeff Dike 提交于 9月 16, 2005

The poster child for this patch is the third tuntap_user hunk.  When an ioctl
fails, it properly closes the opened file descriptor and returns.  However,
the close resets errno to 0, and the 'return errno' that follows returns 0
rather than the value that ioctl set.  This caused the caller to believe that
the device open succeeded and had opened file descriptor 0, which caused no
end of interesting behavior.

The rest of this patch is a pass through the UML sources looking for places
where errno could be reset before being passed back out.  A common culprit is
printk, which could call write, being called before errno is returned.

In some cases, where the code ends up being much smaller, I just deleted the
printk.

There was another case where a caller of run_helper looked at errno after a
failure, rather than the return value of run_helper, which was the errno value
that it wanted.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b4fd310e

[PATCH] uml: breakpoint an arbitrary thread · 3eddddcf

由 Jeff Dike 提交于 9月 16, 2005

This patch implements a stack trace for a thread, not unlike sysrq-t does.
The advantage to this is that a break point can be placed on showreqs, so that
upon showing the stack, you jump immediately into the debugger. While sysrq-t
does the same thing, sysrq-t shows *all* threads stacks. It also doesn't work
right now. In the future, I thought it might be acceptable to make this show
all pids stacks, but perhaps leaving well enough alone and just using sysrq-t
would be okay. For now, upon receiving the stack command, UML switches
context to that thread, dumps its registers, and then switches context back to
the original thread. Since UML compacts all threads into one of 4 host
threads, this sort of mechanism could be expanded in the future to include
other debugging helpers that sysrq does not cover.

Note by jdike - The main benefit to this is that it brings an arbitrary thread
back into context, where it can be examined by gdb. The fact that it dumps it
stack is secondary. This provides the capability to examine a sleeping
thread, which has existed in tt mode, but not in skas mode until now.

Also, the other threads, that sysrq doesn't cover, can be gdb-ed directly
anyway.

Signed-off-by: Allan Graves<allan.graves@gmail.com>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

3eddddcf

[PATCH] uml: _switch_to code consolidation · f6e34c6a

由 Jeff Dike 提交于 9月 16, 2005

This patch moves code that is in both switch_to_tt and switch_to_skas to the
top level _switch_to function, keeping us from duplicating code.  It is
required for the stack trace patch to work properly.
Signed-off-by: NAllan Graves <allan.graves@gmail.com>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f6e34c6a

11 9月, 2005 4 次提交

[PATCH] uml: avoid already done dirtying · 16b03678

由 Paolo 'Blaisorblade' Giarrusso 提交于 9月 10, 2005

The PTE returned from handle_mm_fault is already marked as dirty and accessed
if needed.

Also, since this is not set with set_pte() (which sets NEWPAGE and NEWPROT as
needed), this wouldn't work anyway.

This version has been updated and fixed, thanks to some feedback from Jeff Dike.
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

16b03678

[PATCH] uml: fix fault handler on write · d129f312

由 Paolo 'Blaisorblade' Giarrusso 提交于 9月 10, 2005

The UML fault handler was recently changed to enforce PROT_NONE protections,
by requiring VM_READ or VM_EXEC on VMA's.

However, by mistake, things were changed such that VM_READ is always checked,
also on write faults; so a VMA mapped with only PROT_WRITE is not readable
(unless it's prefaulted with MAP_POPULATE or with a write), which is different
from i386.

Discovered while testing remap_file_pages protection support.
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d129f312

[PATCH] uml: inline mk_pte and various friends · d99c4022

由 Paolo 'Blaisorblade' Giarrusso 提交于 9月 10, 2005

Turns out that, for UML, a *lot* of VM-related trivial functions are not
inlined but rather normal functions.

In other sections of UML code, this is justified by having files which
interact with the host and cannot therefore include kernel headers, but in
this case there's no such justification.

I've had to turn many of them to macros because of missing declarations. While
doing this, I've decided to reuse some already existing macros.
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d99c4022

[PATCH] i386 / uml: add dwarf sections to static link script · a7d0c210

由 Paolo 'Blaisorblade' Giarrusso 提交于 9月 10, 2005

Inside the linker script, insert the code for DWARF debug info sections. This
may help GDB'ing a Uml binary. Actually, it seems that ld is able to guess
what I added correctly, but normal linker scripts include this section so it
should be correct anyway adding it.

On request by Sam Ravnborg <sam@ravnborg.org>, I've added it to
asm-generic/vmlinux.lds.s. I've also moved there the stabs debug section,
used the new macro in i386 linker script and added DWARF debug section to
that.

In the truth, I've not been able to verify the difference in GDB behaviour
after this change (I've seen large improvements with another patch). This
may depend on my binutils version, older one may have worse defaults.

However, this section is present in normal linker script, so add it at
least for the sake of cleanness.
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: NSam Ravnborg <sam@ravnborg.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

a7d0c210

10 9月, 2005 2 次提交

kbuild: um fix so it compile with generic asm-offsets.h support · f64a227b

由 Sam Ravnborg 提交于 9月 09, 2005

um has it own set of files for asm-offsets. So for now the
gen-asm-offset macro is just duplicated in the um Makefile.

This may well be the final solution since um is a bit special compared
to other architectures - time will tell.

Also added a dummy arch/um/kernel/asm-offsets.h file to keep kbuild happy.
Signed-off-by: NSam Ravnborg <sam@ravnborg.org>

f64a227b

[PATCH] uaccess.h annotations (uml) · 85c39206

由 viro@ZenIV.linux.org.uk 提交于 9月 09, 2005

Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

85c39206

08 9月, 2005 1 次提交
- V
  [PATCH] bogus #if (arch/um/kernel/mem.c) · 9a0b3869
  由 viro@ZenIV.linux.org.uk 提交于 9月 07, 2005
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
```
  9a0b3869
05 9月, 2005 11 次提交

[PATCH] uml: fix x86_64 page leak · 7ef93905

由 Jeff Dike 提交于 9月 03, 2005

We were leaking pmd pages when 3_LEVEL_PGTABLES was enabled.  This fixes that.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

7ef93905

[PATCH] uml: skas0 stubs now check system call return values · 07bf731e

由 Bodo Stroesser 提交于 9月 03, 2005

Change syscall-stub's data to include a "expected retval".

Stub now checks syscalls retval and aborts execution of syscall list, if
retval != expected retval.

run_syscall_stub prints the data of the failed syscall, using the data pointer
and retval written by the stub to the beginning of the stack.

one_syscall_stub is removed, to simplify code, because only some instructions
are saved by one_syscall_stub, no host-syscall.

Using the stub with additional data (modify_ldt via stub)
is prepared also.
Signed-off-by: NBodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

07bf731e

[PATCH] uml: increase granularity of host capability checking · 8b51304e

由 Bodo Stroesser 提交于 9月 03, 2005

This change enables SKAS0/SKAS3 to work with all combinations of /proc/mm and
PTRACE_FAULTINFO being available or not.

Also it changes the initialization of proc_mm and ptrace_faultinfo slightly,
to ease forcing SKAS0 on a patched host.  Forcing UML to run without /proc/mm
or PTRACE_FAULTINFO by cmdline parameter can be implemented with a setup
resetting the related variable.
Signed-off-by: NBodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

8b51304e

[PATCH] uml: move libc-dependent startup and signal code · 60d339f6

由 Gennady Sharapov 提交于 9月 03, 2005

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves all systemcalls from process.c file under os-Linux dir and join
process.c and process_kern.c files.
Signed-off-by: NGennady Sharapov <gennady.v.sharapov@intel.com>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

60d339f6

[PATCH] uml: use host AIO support · 75e5584c

由 Jeff Dike 提交于 9月 03, 2005

This patch makes UML use host AIO support when it (and
/usr/include/linux/aio_abi.h) are present.  This is only the support, with no
consumers - a consumer is coming in the next patch.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

75e5584c

[PATCH] uml: system call path cleanup · e32dacb9

由 Jeff Dike 提交于 9月 03, 2005

This merges two sets of files which had no business being split apart in the
first place.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e32dacb9

[PATCH] uml: TLB operation batching · c5600490

由 Jeff Dike 提交于 9月 03, 2005

This adds VM op batching to skas0.  Rather than having a context switch to and
from the userspace stub for each address space change, we write a number of
operations to the stub data page and invoke a different stub which loops over
them and executes them all in one go.

The operations are stored as [ system call number, arg1, arg2, ... ] tuples.

The set is terminated by a system call number of 0.  Single operations, i.e.
page faults, are handled in the old way, since that is slightly more
efficient.

For a kernel build, a minority (~1/4) of the operations are part of a set.
These sets averaged ~100 in length, so for this quarter, the context switching
overhead is greatly reduced.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c5600490

[PATCH] uml: remove duplicated exports · 77fa5adc

由 Jeff Dike 提交于 9月 03, 2005

Al Viro <viro@parcelfarce.linux.theplanet.co.uk> spotted a bunch of duplicated
exports - this removes them.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

77fa5adc

[PATCH] uml: fault handler micro-cleanups · 3b52166c

由 Paolo 'Blaisorblade' Giarrusso 提交于 9月 03, 2005

Avoid chomping low bits of address for functions doing it by themselves,
fix whitespace, add a correctness checking.

I did this for remap-file-pages protection support, it was useful on its
own too.
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

3b52166c

[PATCH] uml: workaround GDB problems on debugging · 02215759

由 Paolo 'Blaisorblade' Giarrusso 提交于 9月 03, 2005

Apparently, GDB gets confused when we do an execvp() on ourselves.

Since it's simply done to allocate further space for command line arguments
(which we'll use to allow gathering the startup command line for guest
processes through the host), allow the user to disable that to get a
debuggable UML binary.
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

02215759

[PATCH] uml: remove debugging code from page fault path · 96e59245

由 Jeff Dike 提交于 9月 03, 2005

This eliminates the segfault info ring buffer, which added a system call to
each page fault, and which hadn't been useful for debugging in ages.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

96e59245

30 8月, 2005 1 次提交

[PATCH] convert signal handling of NODEFER to act like other Unix boxes. · 69be8f18

由 Steven Rostedt 提交于 8月 29, 2005

It has been reported that the way Linux handles NODEFER for signals is
not consistent with the way other Unix boxes handle it.  I've written a
program to test the behavior of how this flag affects signals and had
several reports from people who ran this on various Unix boxes,
confirming that Linux seems to be unique on the way this is handled.

The way NODEFER affects signals on other Unix boxes is as follows:

1) If NODEFER is set, other signals in sa_mask are still blocked.

2) If NODEFER is set and the signal is in sa_mask, then the signal is
still blocked. (Note: this is the behavior of all tested but Linux _and_
NetBSD 2.0 *).

The way NODEFER affects signals on Linux:

1) If NODEFER is set, other signals are _not_ blocked regardless of
sa_mask (Even NetBSD doesn't do this).

2) If NODEFER is set and the signal is in sa_mask, then the signal being
handled is not blocked.

The patch converts signal handling in all current Linux architectures to
the way most Unix boxes work.

Unix boxes that were tested:  DU4, AIX 5.2, Irix 6.5, NetBSD 2.0, SFU
3.5 on WinXP, AIX 5.3, Mac OSX, and of course Linux 2.6.13-rcX.

* NetBSD was the only other Unix to behave like Linux on point #2. The
main concern was brought up by point #1 which even NetBSD isn't like
Linux.  So with this patch, we leave NetBSD as the lonely one that
behaves differently here with #2.
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

69be8f18

19 8月, 2005 1 次提交

[PATCH] uml: fix a crash under screen · 2eaa297c

由 Jeff Dike 提交于 8月 18, 2005

Running UML inside a detached screen delivers SIGWINCH when UML is not
expecting it.  This patch ignores them.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

2eaa297c

29 7月, 2005 5 次提交

[PATCH] uml: Clean up prink calls · 30f417c6

由 Christophe Lucas 提交于 7月 28, 2005

printk() calls should include appropriate KERN_* constant.
Signed-off-by: NChristophe Lucas <clucas@rotomalug.org>
Signed-off-by: NDomen Puncer <domen@coderock.org>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

30f417c6

[PATCH] uml: Fix typo · 201134ca

由 Bodo Stroesser 提交于 7月 28, 2005

Fix a typo in wait_stub_done.
Signed-off-by: NBodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

201134ca

[PATCH] uml: Fix load average >=1 · 7e1f49da

由 Jeff Dike 提交于 7月 28, 2005

update_process_times was missing its irq_enter/irq_exit wrapper.  This caused
ksoftirqd to be scheduled on every clock tick.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

7e1f49da

[PATCH] uml: Fix redundant assignment · d9b7cc84

由 Jeff Dike 提交于 7月 28, 2005

By this point, .is_user has already been set, so this assignment is useless.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d9b7cc84

[PATCH] uml: fix TT mode by reverting "use fork instead of clone" · b85e9680

由 Jeff Dike 提交于 7月 28, 2005

With Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>

Revert the following patch, because of miscompilation problems in different
environments leading to UML not working *at all* in TT mode; it was merged
lately in 2.6 development cycle, a little after being written, and has
caused problems to lots of people; I know it's a bit too long, but it
shouldn't have been merged in first place, so I still apply for inclusion
in the -stable tree. Anyone using this feature currently is either using
some older kernel (some reports even used 2.6.12-rc4-mm2) or using this
patch, as included in my -bs patchset.

For now there's not yet a fix for this patch, so for now the best thing is
to drop it (which was widely reported to give a working kernel, and as such
was even merged in -stable tree).

"Convert the boot-time host ptrace testing from clone to fork. They were
essentially doing fork anyway. This cleans up the code a bit, and makes
valgrind a bit happier about grinding it."

URL:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=98fdffccea6cc3fe9dba32c0fcc310bcb5d71529Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b85e9680

28 7月, 2005 3 次提交

[PATCH] turn many #if $undefined_string into #ifdef $undefined_string · 44456d37

由 Olaf Hering 提交于 7月 27, 2005

turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS
Signed-off-by: NOlaf Hering <olh@suse.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

44456d37

[PATCH] uml: fix misdeclared function · 0d4579ed

由 Jeff Dike 提交于 7月 27, 2005

This fixes an interface which differed from its declaration, and includes
the relevant header so that this doesn't happen again.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

0d4579ed

[PATCH] uml: add skas0 command-line option · cb66504d

由 Paolo 'Blaisorblade' Giarrusso 提交于 7月 27, 2005

This adds the "skas0" parameter to force skas0 operation on SKAS3 host and
shows which operating mode has been selected.
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

cb66504d

27 7月, 2005 1 次提交

[PATCH] Don't export machine_restart, machine_halt, or machine_power_off. · 59586e5a

由 Eric W. Biederman 提交于 7月 26, 2005

machine_restart, machine_halt and machine_power_off are machine
specific hooks deep into the reboot logic, that modules
have no business messing with.  Usually code should be calling
kernel_restart, kernel_halt, kernel_power_off, or
emergency_restart. So don't export machine_restart,
machine_halt, and machine_power_off so we can catch buggy users.
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

59586e5a

15 7月, 2005 1 次提交

[PATCH] uml: gcc 2.95 fix and Makefile cleanup · 1c30385a

由 Paolo 'Blaisorblade' Giarrusso 提交于 7月 14, 2005

1) Cleanup an ugly hyper-nested code in Makefile (now only the arith.
   expression is passed through the host bash).

2) Fix a problem with GCC 2.95: according to a report from Raphael Bossek,
   .remap_data : { arch/um/sys-SUBARCH/unmap_fin.o (.data .bss) } is expanded
   into: .remap_data : { arch/um/sys-i386 /unmap_fin.o (.data .bss) }

(because I didn't use ## to join the two tokens), thus stopping linking.  Pass
the whole path from the Makefile as a simple and nice fix.
Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Raphael Bossek <raphael.bossek@gmx.de>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

1c30385a

08 7月, 2005 3 次提交

[PATCH] uml: Proper clone support for skas0 · 9786a8f3

由 Bodo Stroesser 提交于 7月 07, 2005

This patch implements the clone-stub mechanism, which allows skas0 to run
with proc_mm==0, even if the clib in UML uses modify_ldt.

Note: There is a bug in skas3.v7 host patch, that avoids UML-skas from
running properly on a SMP-box. In full skas3, I never really saw problems,
but in skas0 they showed up.

More commentary by jdike - What this patch does is makes sure that the host
parent of each new host process matches the UML parent of the corresponding
UML process. This ensures that any changed LDTs are inherited. This is
done by having clone actually called by the UML process from its stub,
rather than by the kernel. We have special syscall stubs that are loaded
onto the stub code page because that code must be completely
self-contained. These stubs are given C interfaces, and used like normal C
functions, but there are subtleties. Principally, we have to be careful
about stack variables in stub_clone_handler after the clone. The code is
written so that there aren't any - everything boils down to a fixed
address. If there were any locals, references to them after the clone
would be wrong because the stack just changed.
Signed-off-by: NBodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

9786a8f3

[PATCH] uml: skas0 - separate kernel address space on stock hosts · d67b569f

由 Jeff Dike 提交于 7月 07, 2005

UML has had two modes of operation - an insecure, slow mode (tt mode) in
which the kernel is mapped into every process address space which requires
no host kernel modifications, and a secure, faster mode (skas mode) in
which the UML kernel is in a separate host address space, which requires a
patch to the host kernel.

This patch implements something very close to skas mode for hosts which
don't support skas - I'm calling this skas0.  It provides the security of
the skas host patch, and some of the performance gains.

The two main things that are provided by the skas patch, /proc/mm and
PTRACE_FAULTINFO, are implemented in a way that require no host patch.

For the remote address space changing stuff (mmap, munmap, and mprotect),
we set aside two pages in the process above its stack, one of which
contains a little bit of code which can call mmap et al.

To update the address space, the system call information (system call
number and arguments) are written to the stub page above the code.  The
%esp is set to the beginning of the data, the %eip is set the the start of
the stub, and it repeatedly pops the information into its registers and
makes the system call until it sees a system call number of zero.  This is
to amortize the cost of the context switch across multiple address space
updates.

When the updates are done, it SIGSTOPs itself, and the kernel process
continues what it was doing.

For a PTRACE_FAULTINFO replacement, we set up a SIGSEGV handler in the
child, and let it handle segfaults rather than nullifying them.  The
handler is in the same page as the mmap stub.  The second page is used as
the stack.  The handler reads cr2 and err from the sigcontext, sticks them
at the base of the stack in a faultinfo struct, and SIGSTOPs itself.  The
kernel then reads the faultinfo and handles the fault.

A complication on x86_64 is that this involves resetting the registers to
the segfault values when the process is inside the kill system call.  This
breaks on x86_64 because %rcx will contain %rip because you tell SYSRET
where to return to by putting the value in %rcx.  So, this corrupts $rcx on
return from the segfault.  To work around this, I added an
arch_finish_segv, which on x86 does nothing, but which on x86_64 ptraces
the child back through the sigreturn.  This causes %rcx to be restored by
sigreturn and avoids the corruption.  Ultimately, I think I will replace
this with the trick of having it send itself a blocked signal which will be
unblocked by the sigreturn.  This will allow it to be stopped just after
the sigreturn, and PTRACE_SYSCALLed without all the back-and-forth of
PTRACE_SYSCALLing it through sigreturn.

This runs on a stock host, so theoretically (and hopefully), tt mode isn't
needed any more.  We need to make sure that this is better in every way
than tt mode, though.  I'm concerned about the speed of address space
updates and page fault handling, since they involve extra round-trips to
the child.  We can amortize the round-trip cost for large address space
updates by writing all of the operations to the data page and having the
child execute them all at the same time.  This will help fork and exec, but
not page faults, since they involve only one page.

I can't think of any way to help page faults, except to add something like
PTRACE_FAULTINFO to the host.  There is PTRACE_SIGINFO, but UML doesn't use
siginfo for SIGSEGV (or anything else) because there isn't enough
information in the siginfo struct to handle page faults (the faulting
operation type is missing).  Adding that would make PTRACE_SIGINFO a usable
equivalent to PTRACE_FAULTINFO.

As for the code itself:

- The system call stub is in arch/um/kernel/sys-$(SUBARCH)/stub.S.  It is
  put in its own section of the binary along with stub_segv_handler in
  arch/um/kernel/skas/process.c.  This is manipulated with run_syscall_stub
  in arch/um/kernel/skas/mem_user.c.  syscall_stub will execute any system
  call at all, but it's only used for mmap, munmap, and mprotect.

- The x86_64 stub calls sigreturn by hand rather than allowing the normal
  sigreturn to happen, because the normal sigreturn is a SA_RESTORER in
  UML's address space provided by libc.  Needless to say, this is not
  available in the child's address space.  Also, it does a couple of odd
  pops before that which restore the stack to the state it was in at the
  time the signal handler was called.

- There is a new field in the arch mmu_context, which is now a union.
  This is the pid to be manipulated rather than the /proc/mm file
  descriptor.  Code which deals with this now checks proc_mm to see whether
  it should use the usual skas code or the new code.

- userspace_tramp is now used to create a new host process for every UML
  process, rather than one per UML processor.  It checks proc_mm and
  ptrace_faultinfo to decide whether to map in the pages above its stack.

- start_userspace now makes CLONE_VM conditional on proc_mm since we need
  separate address spaces now.

- switch_mm_skas now just sets userspace_pid[0] to the new pid rather
  than PTRACE_SWITCH_MM.  There is an addition to userspace which updates
  its idea of the pid being manipulated each time around the loop.  This is
  important on exec, when the pid will change underneath userspace().

- The stub page has a pte, but it can't be mapped in using tlb_flush
  because it is part of tlb_flush.  This is why it's required for it to be
  mapped in by userspace_tramp.

Other random things:

- The stub section in uml.lds.S is page aligned.  This page is written
  out to the backing vm file in setup_physmem because it is mapped from
  there into user processes.

- There's some confusion with TASK_SIZE now that there are a couple of
  extra pages that the process can't use.  TASK_SIZE is considered by the
  elf code to be the usable process memory, which is reasonable, so it is
  decreased by two pages.  This confuses the definition of
  USER_PGDS_IN_LAST_PML4, making it too small because of the rounding down
  of the uneven division.  So we round it to the nearest PGDIR_SIZE rather
  than the lower one.

- I added a missing PT_SYSCALL_ARG6_OFFSET macro.

- um_mmu.h was made into a userspace-usable file.

- proc_mm and ptrace_faultinfo are globals which say whether the host
  supports these features.

- There is a bad interaction between the mm.nr_ptes check at the end of
  exit_mmap, stack randomization, and skas0.  exit_mmap will stop freeing
  pages at the PGDIR_SIZE boundary after the last vma.  If the stack isn't
  on the last page table page, the last pte page won't be freed, as it
  should be since the stub ptes are there, and exit_mmap will BUG because
  there is an unfreed page.  To get around this, TASK_SIZE is set to the
  next lowest PGDIR_SIZE boundary and mm->nr_ptes is decremented after the
  calls to init_stub_pte.  This ensures that we know the process stack (and
  all other process mappings) will be below the top page table page, and
  thus we know that mm->nr_ptes will be one too many, and can be
  decremented.

Things that need fixing:

- We may need better assurrences that the stub code is PIC.

- The stub pte is set up in init_new_context_skas.

- alloc_pgdir is probably the right place.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d67b569f

[PATCH] uml: kill some useless vmalloc tlb flushing · eda80228

由 Jeff Dike 提交于 7月 07, 2005

There is absolutely no reason to flush the kernel's VM area during a
tlb_flush_mm.

This results in a noticable performance increase in the kernel build
benchmark.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

eda80228

26 6月, 2005 3 次提交

[PATCH] uml: hot-unplug code cleanup · 29d56cfe

由 Jeff Dike 提交于 6月 25, 2005

Clean up the hot-unplugging code.  There is now an id procedure which is
called to figure out what device we're talking to.  The error messages from
that are now done from mconsole_remove instead of the driver.  remove is now
called with the device number, after it has been checked, so doesn't need to
do sanity checking on it.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

29d56cfe

[PATCH] uml: time initialization tidying · fc47a0d1

由 Jeff Dike 提交于 6月 25, 2005

user_time_init_skas and user_time_init_tt were essentially the same.  So, this
merges them, deleting the mode-specific functions and declarations.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

fc47a0d1

[PATCH] uml: always disable kmalloc during shutdown · 026549d2

由 Jeff Dike 提交于 6月 25, 2005

kmalloc wasn't being disabled during panic.  This patch ensures that, no
matter how UML is exiting, it is disabled.  This matters because part of the
cleanup is to remove the umid file, which involves readdir, which calls
malloc.  This must map to libc malloc, rather than kmalloc or vmalloc.
Signed-off-by: NJeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

026549d2

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功