- 14 12月, 2009 1 次提交
-
-
由 Li Zefan 提交于
Use a generic trace_event_raw_init() function for all event's raw_init callbacks (but kprobes) instead of defining the same version for each of these. This shrinks the kernel code: text data bss dec hex filename 5355293 1961928 7103260 14420481 dc0a01 vmlinux.o.old 5346802 1961864 7103260 14411926 dbe896 vmlinux.o raw_init can't be removed, because ftrace events and kprobe events use different raw_init callbacks. Though it's possible to totally remove raw_init, I choose to leave it as it is for now. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <4B1DC48C.7080603@cn.fujitsu.com> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
-
- 10 12月, 2009 2 次提交
-
-
由 Johannes Berg 提交于
The trace_seq buffer might fill up, and right now one needs to check the return value of each printf into the buffer to check for that. Instead, have the buffer keep track of whether it is full or not, and reject more input if it is full or would have overflowed with an input that wasn't added. Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: NJohannes Berg <johannes@sipsolutions.net> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
If the seq_read fills the buffer it will call s_start again on the next itertation with the same position. This causes a problem with the function_graph tracer because it consumes the iteration in order to determine leaf functions. What happens is that the iterator stores the entry, and the function graph plugin will look at the next entry. If that next entry is a return of the same function and task, then the function is a leaf and the function_graph plugin calls ring_buffer_read which moves the ring buffer iterator forward (the trace iterator still points to the function start entry). The copying of the trace_seq to the seq_file buffer will fail if the seq_file buffer is full. The seq_read will not show this entry. The next read by userspace will cause seq_read to again call s_start which will reuse the trace iterator entry (the function start entry). But the function return entry was already consumed. The function graph plugin will think that this entry is a nested function and not a leaf. To solve this, the trace code now checks the return status of the seq_printf (trace_print_seq). If the writing to the seq_file buffer fails, we set a flag in the iterator (leftover) and we do not reset the trace_seq buffer. On the next call to s_start, we check the leftover flag, and if it is set, we just reuse the trace_seq buffer and do not call into the plugin print functions. Before this patch: 2) | fput() { 2) | __fput() { 2) 0.550 us | inotify_inode_queue_event(); 2) | __fsnotify_parent() { 2) 0.540 us | inotify_dentry_parent_queue_event(); After the patch: 2) | fput() { 2) | __fput() { 2) 0.550 us | inotify_inode_queue_event(); 2) 0.548 us | __fsnotify_parent(); 2) 0.540 us | inotify_dentry_parent_queue_event(); [ Updated the patch to fix a missing return 0 from the trace_print_seq() stub when CONFIG_TRACING is disabled. Reported-by: NIngo Molnar <mingo@elte.hu> ] Reported-by: NJiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 07 12月, 2009 2 次提交
-
-
由 Jean Delvare 提交于
The legacy probe and force module parameters are obsolete now, the same can be achieved using the new_device sysfs interface, which is both more flexible and cheaper (it is implemented by i2c-core rather than replicated in every driver module.) The legacy ignore module parameters can be dropped as well. Ignoring can be done by instantiating a "dummy" device at the problematic address. This is the first step of a huge cleanup to i2c-core's i2c_detect function, i2c.h's I2C_CLIENT_INSMOD* macros, and all drivers that made use of them. Signed-off-by: NJean Delvare <khali@linux-fr.org>
-
由 Mika Kuoppala 提交于
Low priority thread holding the i2c bus mutex could block higher priority threads to access the bus resulting in unacceptable latencies. Change the mutex type to rt_mutex preventing priority inversion. Tested-by: NPeter Ujfalusi <peter.ujfalusi@nokia.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@nokia.com> Signed-off-by: NJean Delvare <khali@linux-fr.org>
-
- 06 12月, 2009 2 次提交
-
-
由 Rafael J. Wysocki 提交于
Apparently, there are devices that can wake up the system from sleep states and yet are incapable of generating wake-up events at run time. Thus, introduce a flag indicating if given device is capable of generating run-time wake-up events. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
-
由 David Daney 提交于
Starting with version 4.5, GCC has a new built-in function __builtin_unreachable() that can be used in places like the kernel's BUG() where inline assembly is used to transfer control flow. This eliminated the need for an endless loop in these places. The patch adds a new macro 'unreachable()' that will expand to either __builtin_unreachable() or an endless loop depending on the compiler version. Change from v1: Simplify unreachable() for non-GCC 4.5 case. Signed-off-by: NDavid Daney <ddaney@caviumnetworks.com> Acked-by: NRalf Baechle <ralf@linux-mips.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 12月, 2009 12 次提交
-
-
由 Louis Rilling 提交于
With CLONE_IO, parent's io_context->nr_tasks is incremented, but never decremented whenever copy_process() fails afterwards, which prevents exit_io_context() from calling IO schedulers exit functions. Give a task_struct to exit_io_context(), and call exit_io_context() instead of put_io_context() in copy_process() cleanup path. Signed-off-by: NLouis Rilling <louis.rilling@kerlabs.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Patrick Mullaney 提交于
Provide common routine for the transition of operational state for a leaf device during a root device transition. Signed-off-by: NPatrick Mullaney <pmullaney@novell.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Oliver Neukum 提交于
Using remote wakeup and delayed transmission to allow online device to go into usb autosuspend. Minimal alternate support for devices that don't support remote wakeup. Signed-off-by: NOliver Neukum <oliver@neukum.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Martin K. Petersen 提交于
ata_set_lba_range_entries used the variable max for two different things which was confusing. Make the function take a buffer size in bytes as argument and return the used buffer size upon completion. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Martin K. Petersen 提交于
Our current TRIM payload is a single sector that can accommodate 64 * 65535 blocks being unmapped. Report this value in the Block Limits Maximum Unmap LBA count field. If a storage device supports TRIM and the DRAT and RZAT bits are set, report TPRZ=1 in Read Capacity(16). Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Matt Carlson 提交于
This patch cleans up the VPD code by creating preprocessor definitions and using them in the place of hardcoded constants. Signed-off-by: NMatt Carlson <mcarlson@broadcom.com> Reviewed-by: NMichael Chan <mchan@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
commit 8ec1e0ebe26087bfc5c0394ada5feb5758014fc8 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:16:35 2009 +0100 ipv4: add sysctl to accept packets with local source addresses Change fib_validate_source() to accept packets with a local source address when the "accept_local" sysctl is set for the incoming inet device. Combined with the previous patches, this allows to communicate between multiple local interfaces over the wire. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
commit 68144d350f4f6c348659c825cde6a82b34c27a91 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:05:25 2009 +0100 net: fib_rules: add oif classification Support routing table lookup based on the flow's oif. This is useful to classify packets originating from sockets bound to interfaces differently. The route cache already includes the oif and needs no changes. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
commit 229e77eec406ad68662f18e49fda8b5d366768c5 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:05:23 2009 +0100 net: fib_rules: rename ifindex/ifname/FRA_IFNAME to iifindex/iifname/FRA_IIFNAME The next patch will add oif classification, rename interface related members and attributes to reflect that they're used for iif classification. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
This brings struct ata_device in-line with struct ata_{port,host}. Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Alan Cox 提交于
We were never able to get docs for this out of Toshiba for years. Dave Barnes produced a NetBSD driver however and from that we can fill in the needed tables. As we correct the PCI identifiers a bit also update the old ide generic driver at the same time so it stays compiling. Signed-off-by: NAlan Cox <alan@linux.intel.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Vivek Goyal 提交于
o This is basic implementation of blkio controller cgroup interface. This is the common interface visible to user space and should be used by different IO control policies as we implement those. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 03 12月, 2009 21 次提交
-
-
由 Wu Fengguang 提交于
It will lower the flush priority for NFS, and maybe more in future. Signed-off-by: NWu Fengguang <fengguang.wu@intel.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Steven Whitehouse 提交于
There are two spare field in the header common to all GFS2 metadata. One is just the right size to fit a journal id in it, and this patch updates the journal code so that each time a metadata block is modified, we tag it with the journal id of the node which is performing the modification. The reason for this is that it should make it much easier to debug issues which arise if we can tell which node was the last to modify a particular metadata block. Since the field is updated before the block is written into the journal, each journal should only contain metadata which is tagged with its own journal id. The one exception to this is the journal header block, which might have a different node's id in it, if that journal was recovered by another node in the cluster. Thus each journal will contain a record of which nodes recovered it, via the journal header. The other field in the metadata header could potentially be used to hold information about what kind of operation was performed, but for the time being we just zero it on each transaction so that if we use it for that in future, we'll know that the information (where it exists) is reliable. I did consider using the other field to hold the journal sequence number, however since in GFS2's journaling we write the modified data into the journal and not the original data, this gives no information as to what action caused the modification, so I think we can probably come up with a better use for those 64 bits in the future. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Sending a message to userspace in a generic format to warn of events (e.g. quota exceeded) in the quota subsystem is a generically useful feature. This patch makes some minor changes to the send_message function from dquot.c renaming it quota_send_message, moving it to quota.c and exporting it for use by filesystems which do not use the dquot code. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This is required for cluster filesystems which want to use cached ACLs so that they can invalidate the cache when required. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Cc: Alexander Viro <aviro@redhat.com> Cc: Christoph Hellwig <hch@infradead.org>
-
由 Martin K. Petersen 提交于
The discard ioctl is used by mkfs utilities to clear a block device prior to putting metadata down. However, not all devices return zeroed blocks after a discard. Some drives return stale data, potentially containing old superblocks. It is therefore important to know whether discarded blocks are properly zeroed. Both ATA and SCSI drives have configuration bits that indicate whether zeroes are returned after a discard operation. Implement a block level interface that allows this information to be bubbled up the stack and queried via a new block device ioctl. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Christoph Hellwig 提交于
Add support for the ATA TRIM command in libata. We translate a WRITE SAME 16 command with the unmap bit set into an ATA TRIM command and export enough information in READ CAPACITY 16 and the block limits EVPD page so that the new SCSI layer discard support will driver this for us. Note that I hardcode the WRITE_SAME_16 opcode for now as the patch to introduce the symbolic is not in 2.6.32 yet but only in the SCSI tree - as soon as it is merged we can fix it up to properly use the symbolic name. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Tejun Heo 提交于
If ATA device failed FLUSH, it means that the device failed to write out some amount of data and the error needs to be reported to upper layers. As retries can't recover the lost data, FLUSH failures need to be reported immediately in general. However, if FLUSH fails due to transmission errors, the FLUSH needs to be retried; otherwise, filesystems may switch to RO mode and/or raid array may drop a drive for a random transmission glitch. This condition can be rather easily reproduced on certain ahci controllers which go through a PHY event after powersave mode switch + ext4 combination. Powersave mode switch is often closely followed by flush from the filesystem failing the FLUSH with ATA bus error which makes the filesystem code believe that data is lost and drop to RO mode. This was reported in the following bugzilla bug. http://bugzilla.kernel.org/show_bug.cgi?id=14543 This patch makes libata EH retry FLUSH if it wasn't failed by the device. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NAndrey Vihrov <andrey.vihrov@gmail.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Carsten Otte 提交于
This patch moves s390 processor status word into the base kvm_run struct and keeps it up-to date on all userspace exits. The userspace ABI is broken by this, however there are no applications in the wild using this. A capability check is provided so users can verify the updated API exists. Cc: stable@kernel.org Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Jan Kiszka 提交于
This new IOCTL exports all yet user-invisible states related to exceptions, interrupts, and NMIs. Together with appropriate user space changes, this fixes sporadic problems of vmsave/restore, live migration and system reset. [avi: future-proof abi by adding a flags field] Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
These happen when we trap an exception when another exception is being delivered; we only expect these with MCEs and page faults. If something unexpected happens, things probably went south and we're better off reporting an internal error and freezing. Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
Usually userspace will freeze the guest so we can inspect it, but some internal state is not available. Add extra data to internal error reporting so we can expose it to the debugger. Extra data is specific to the suberror. Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Jan Kiszka 提交于
Obviously, people tend to extend this header at the bottom - more or less blindly. Ensure that deprecated stuff gets its own corner again by moving things to the top. Also add some comments and reindent IOCTLs to make them more readable and reduce the risk of number collisions. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Glauber Costa 提交于
When we migrate a kvm guest that uses pvclock between two hosts, we may suffer a large skew. This is because there can be significant differences between the monotonic clock of the hosts involved. When a new host with a much larger monotonic time starts running the guest, the view of time will be significantly impacted. Situation is much worse when we do the opposite, and migrate to a host with a smaller monotonic clock. This proposed ioctl will allow userspace to inform us what is the monotonic clock value in the source host, so we can keep the time skew short, and more importantly, never goes backwards. Userspace may also need to trigger the current data, since from the first migration onwards, it won't be reflected by a simple call to clock_gettime() anymore. [marcelo: future-proof abi with a flags field] [jan: fix KVM_GET_CLOCK by clearing flags field instead of checking it] Signed-off-by: NGlauber Costa <glommer@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Ed Swierk 提交于
Support for Xen PV-on-HVM guests can be implemented almost entirely in userspace, except for handling one annoying MSR that maps a Xen hypercall blob into guest address space. A generic mechanism to delegate MSR writes to userspace seems overkill and risks encouraging similar MSR abuse in the future. Thus this patch adds special support for the Xen HVM MSR. I implemented a new ioctl, KVM_XEN_HVM_CONFIG, that lets userspace tell KVM which MSR the guest will write to, as well as the starting address and size of the hypercall blobs (one each for 32-bit and 64-bit) that userspace has loaded from files. When the guest writes to the MSR, KVM copies one page of the blob from userspace to the guest. I've tested this patch with a hacked-up version of Gerd's userspace code, booting a number of guests (CentOS 5.3 i386 and x86_64, and FreeBSD 8.0-RC1 amd64) and exercising PV network and block devices. [jan: fix i386 build warning] [avi: future proof abi with a flags field] Signed-off-by: NEd Swierk <eswierk@aristanetworks.com> Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Zhai, Edwin 提交于
Introduce kvm_vcpu_on_spin, to be used by VMX/SVM to yield processing once the cpu detects pause-based looping. Signed-off-by: N"Zhai, Edwin" <edwin.zhai@intel.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
由 Alexander Graf 提交于
X86 CPUs need to have some magic happening to enable the virtualization extensions on them. This magic can result in unpleasant results for users, like blocking other VMMs from working (vmx) or using invalid TLB entries (svm). Currently KVM activates virtualization when the respective kernel module is loaded. This blocks us from autoloading KVM modules without breaking other VMMs. To circumvent this problem at least a bit, this patch introduces on demand activation of virtualization. This means, that instead virtualization is enabled on creation of the first virtual machine and disabled on destruction of the last one. So using this, KVM can be easily autoloaded, while keeping other hypervisors usable. Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Mask irq notifier list is already there. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Maintain back mapping from irqchip/pin to gsi to speedup interrupt acknowledgment notifications. [avi: build fix on non-x86/ia64] Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
Use gsi indexed array instead of scanning all entries on each interrupt injection. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-
由 Gleb Natapov 提交于
This removes assumptions that max GSIs is smaller than number of pins. Sharing is tracked on pin level not GSI level. [avi: no PIC on ia64] Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAvi Kivity <avi@redhat.com>
-