- 15 1月, 2010 1 次提交
-
-
由 Michael S. Tsirkin 提交于
What it is: vhost net is a character device that can be used to reduce the number of system calls involved in virtio networking. Existing virtio net code is used in the guest without modification. There's similarity with vringfd, with some differences and reduced scope - uses eventfd for signalling - structures can be moved around in memory at any time (good for migration, bug work-arounds in userspace) - write logging is supported (good for migration) - support memory table and not just an offset (needed for kvm) common virtio related code has been put in a separate file vhost.c and can be made into a separate module if/when more backends appear. I used Rusty's lguest.c as the source for developing this part : this supplied me with witty comments I wouldn't be able to write myself. What it is not: vhost net is not a bus, and not a generic new system call. No assumptions are made on how guest performs hypercalls. Userspace hypervisors are supported as well as kvm. How it works: Basically, we connect virtio frontend (configured by userspace) to a backend. The backend could be a network device, or a tap device. Backend is also configured by userspace, including vlan/mac etc. Status: This works for me, and I haven't see any crashes. Compared to userspace, people reported improved latency (as I save up to 4 system calls per packet), as well as better bandwidth and CPU utilization. Features that I plan to look at in the future: - mergeable buffers - zero copy - scalability tuning: figure out the best threading model to use Note on RCU usage (this is also documented in vhost.h, near private_pointer which is the value protected by this variant of RCU): what is happening is that the rcu_dereference() is being used in a workqueue item. The role of rcu_read_lock() is taken on by the start of execution of the workqueue item, of rcu_read_unlock() by the end of execution of the workqueue item, and of synchronize_rcu() by flush_workqueue()/flush_work(). In the future we might need to apply some gcc attribute or sparse annotation to the function passed to INIT_WORK(). Paul's ack below is for this RCU usage. (Includes fixes by Alan Cox <alan@linux.intel.com>, David L Stevens <dlstevens@us.ibm.com>, Chris Wright <chrisw@redhat.com>) Acked-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 12月, 2009 5 次提交
-
-
由 Heiko Carstens 提交于
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Oleg Nesterov 提交于
Nobody except ptrace itself should use task->ptrace or PT_PTRACED directly, change arch/s390/kernel/traps.c to use the helper. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Acked-by: NRoland McGrath <roland@redhat.com> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
The elf notes number for the upper register halves is s390 specific. Change the name of the elf notes to include S390. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Roel Kluin 提交于
Return the PTR_ERR of the correct pointer. Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 16 12月, 2009 3 次提交
-
-
由 Christoph Hellwig 提交于
Currently all architectures but microblaze unconditionally define USE_ELF_CORE_DUMP. The microblaze omission seems like an error to me, so let's kill this ifdef and make sure we are the same everywhere. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Cc: <linux-arch@vger.kernel.org> Cc: Michal Simek <michal.simek@petalogix.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 André Goddard Rosa 提交于
Makes use of skip_spaces() defined in lib/string.c for removing leading spaces from strings all over the tree. It decreases lib.a code size by 47 bytes and reuses the function tree-wide: text data bss dec hex filename 64688 584 592 65864 10148 (TOTALS-BEFORE) 64641 584 592 65817 10119 (TOTALS-AFTER) Also, while at it, if we see (*str && isspace(*str)), we can be sure to remove the first condition (*str) as the second one (isspace(*str)) also evaluates to 0 whenever *str == 0, making it redundant. In other words, "a char equals zero is never a space". Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below, and found occurrences of this pattern on 3 more files: drivers/leds/led-class.c drivers/leds/ledtrig-timer.c drivers/video/output.c @@ expression str; @@ ( // ignore skip_spaces cases while (*str && isspace(*str)) { \(str++;\|++str;\) } | - *str && isspace(*str) ) Signed-off-by: NAndré Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 12月, 2009 6 次提交
-
-
由 Boaz Harrosh 提交于
Some un-used includes removed. This patch is in an effort to cleanup nfsd headers and move private definitions to source directory. Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NJ. Bruce Fields <bfields@citi.umich.edu>
-
由 Thomas Gleixner 提交于
Name space cleanup for rwlock functions. No functional change. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NIngo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
-
由 Thomas Gleixner 提交于
Not strictly necessary for -rt as -rt does not have non sleeping rwlocks, but it's odd to not have a consistent naming convention. No functional change. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NIngo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
-
由 Thomas Gleixner 提交于
Name space cleanup. No functional change. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NIngo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
-
由 Thomas Gleixner 提交于
Further name space cleanup. No functional change Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NIngo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
-
由 Thomas Gleixner 提交于
The raw_spin* namespace was taken by lockdep for the architecture specific implementations. raw_spin_* would be the ideal name space for the spinlocks which are not converted to sleeping locks in preempt-rt. Linus suggested to convert the raw_ to arch_ locks and cleanup the name space instead of using an artifical name like core_spin, atomic_spin or whatever No functional change. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NIngo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
-
- 12 12月, 2009 1 次提交
-
-
由 Sam Ravnborg 提交于
The simplest method was to add an extra asm-offsets.h file in arch/$ARCH/include/asm that references the generated file. We can now migrate the architectures one-by-one to reference the generated file direct - and when done we can delete the temporary arch/$ARCH/include/asm/asm-offsets.h file. Signed-off-by: NSam Ravnborg <sam@ravnborg.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NMichal Marek <mmarek@suse.cz>
-
- 11 12月, 2009 2 次提交
-
-
由 Al Viro 提交于
New helper - sys_mmap_pgoff(); switch syscalls to using it. Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
We've had TASK_SIZE set to 1<<31 for 31bit tasks since May 2004. Before that old32_mmap() had to deal with do_mmap_pgoff() giving it an address out of range. It had tried to do that by checking return value and doing do_munmap() (at wrong address, BTW). IOW, that code had been dead for 5.5 years (and bogus - for 8). Kill. Acked-by: NHugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 07 12月, 2009 18 次提交
-
-
由 Boaz Harrosh 提交于
Some unused includes removed. This patch is in an effort to cleanup nfsd headers and move private definitions to source directory. Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Aristeu Rozanski 提交于
Trying to build a s390x kernel with CONFIG_FTRACE_SYSCALLS will fail because ftrace.o is not built/linked. Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NAristeu Rozanski <aris@redhat.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Fix this compile error in linux-next: arch/s390/kernel/time.c: In function 'get_sync_clock': arch/s390/kernel/time.c:337: error: 'clock_sync_sync' undeclared (first use in this function) Gets exposed because the new per cpu code references the variable passed to put_cpu_var. This was not a real bug. Reported-by: NSachin Sant <sachinp@in.ibm.com> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
We dont need the dirty bit if a write access is done via the kernel mapping. In that case SetPageDirty and friends are used anyway, no need to do that a second time. We can use the change-recording overide function for the kernel mapping, if available. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
A compare shows that arch/s390/include/asm/sockios.h is the same as include/asm-generic/sockios.h. This patch lets arch/s390/include/asm/ termbits.h include the generic header. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
A compare shows that arch/s390/include/asm/termbits.h is the same as include/asm-generic.h. This patch lets arch/s390/include/asm/termbits.h include the generic header. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Remove unused typedef, defines, update copyright, remove unneeded includes, remove unneeded ifdefs. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
The pages allocated by the cmm memory balloon should be freed before the hibernation image is created. Otherwise the memory reserved by the balloon gets written to the swap device but there is no content in these pages that need to be preserved. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
Git commit ea435467 changed the definition of atomic_t and atomic64_t for s390 by adding the volatile modifier to the counter field. This has an unfortunate side effect with newer versions of the gcc. The typeof operator now picks up the volatile modifier from the expression. This causes the compiler to think that it has to store the two temporary variable old_val and new_val in the __CS_LOOP for the different atomic operations to the stack as the variables are now volatile. Both stores are superfluous. The hack to replace typeof(ptr->counter) with int in __CS_LOOP and and long long in __CSG_LOOP avoids the two stores. A better solution would be to drop the volatile from the counter field of the atomic_t and atomic64_t definition. But that is a touchy subject .. Cc: Matthew Wilcox <matthew@wil.cx> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
the todclk.h header file is dead code. Remove it. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Gerald Schaefer 提交于
The pagetable walk usercopy functions have used a modified copy of the do_exception() function for fault handling. This lead to inconsistencies with recent changes to do_exception(), e.g. performance counters. This patch changes the pagetable walk usercopy code to call do_exception() directly, eliminating the redundancy. A new parameter is added to do_exception() to specify the fault address. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
Simplify the check of the vma->flags in do_exception for the different fault types. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
Slim down the do_exception function to handle only the fast path of a fault and move the exceptional cases into a new function. That slightly increases the performance of the fault handling. Build fix for !CONFIG_COMPAT by Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
notify_page_fault does a preempt_disable/preempt_enable for each fault generated by a kernel access to user space. If kprobes is not active that is unnecessary since the interrupts are not reenabled yet. To play safe repeat the kprobe_running check after preempt_disable(). Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
Introduce user_mode to replace the two variables switch_amode and s390_noexec. There are three valid combinations of the old values: 1) switch_amode == 0 && s390_noexec == 0 2) switch_amode == 1 && s390_noexec == 0 3) switch_amode == 1 && s390_noexec == 1 They get replaced by 1) user_mode == HOME_SPACE_MODE 2) user_mode == PRIMARY_SPACE_MODE 3) user_mode == SECONDARY_SPACE_MODE The new kernel parameter user_mode=[primary,secondary,home] lets you choose the address space mode the user space processes should use. In addition the CONFIG_S390_SWITCH_AMODE config option is removed. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
A data access in access-register mode always is a user mode access, the code to inspect the access-registers can be removed. The second change is to use a different test to check for no-execute fault. The third change is to pass the translation exception identification as parameter, in theory the trans_exc_code in the lowcore could have been overwritten by the time the call to check_space from do_no_context is done. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Split setting (driver wants feature enabled) and status (feature setup was successful) for PGID related ccw device features so that setup errors can be detected. Previously, incorrectly handled setup errors could in rare cases lead to erratic I/O behavior and permanently unusuable devices. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Hendrik Brueckner 提交于
When the kernel is IPLed without the CLEAR option and switches to 64-bit, the high-order half of the registers might contain random values. This can cause addressing exceptions and the kernel enters an interrupt loop. Initialize the high-order half of the general purpose registers with zeros after switching to 64-bit mode. Cc: <stable@kernel.org> Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 06 12月, 2009 1 次提交
-
-
由 David Daney 提交于
Use the new unreachable() macro instead of for(;;); Signed-off-by: NDavid Daney <ddaney@caviumnetworks.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> CC: Heiko Carstens <heiko.carstens@de.ibm.com> CC: linux390@de.ibm.com CC: linux-s390@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 12月, 2009 1 次提交
-
-
由 André Goddard Rosa 提交于
That is "success", "unknown", "through", "performance", "[re|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: NAndré Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 03 12月, 2009 2 次提交
-
-
由 Carsten Otte 提交于
This patch corrects the checking of the new address for the prefix register. On s390, the prefix register is used to address the cpu's lowcore (address 0...8k). This check is supposed to verify that the memory is readable and present. copy_from_guest is a helper function, that can be used to read from guest memory. It applies prefixing, adds the start address of the guest memory in user, and then calls copy_from_user. Previous code was obviously broken for two reasons: - prefixing should not be applied here. The current prefix register is going to be updated soon, and the address we're looking for will be 0..8k after we've updated the register - we're adding the guest origin (gmsor) twice: once in subject code and once in copy_from_guest With kuli, we did not hit this problem because (a) we were lucky with previous prefix register content, and (b) our guest memory was mmaped very low into user address space. Cc: stable@kernel.org Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Reported-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Carsten Otte 提交于
This patch moves s390 processor status word into the base kvm_run struct and keeps it up-to date on all userspace exits. The userspace ABI is broken by this, however there are no applications in the wild using this. A capability check is provided so users can verify the updated API exists. Cc: stable@kernel.org Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-