- 14 Aug 2009, 12 commits
-
-
Committed by Tejun Heo
With x86 converted to the embedding allocator, lpage doesn't have any users left. Kill it along with its cpa handling code. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jan Beulich <JBeulich@novell.com> -
Committed by Tejun Heo
Now that the percpu core can handle very sparse units, given that vmalloc space is large enough, the embedding first chunk allocator can use any memory to build the first chunk. This patch teaches pcpu_embed_first_chunk() about distances between cpus and to use alloc/free callbacks to allocate node-specific areas for each group and use them for the first chunk. This brings the benefits of the embedding allocator to NUMA configurations - no extra TLB pressure, with the flexibility of the unified dynamic allocator and no need to restructure arch code to build a memory layout suitable for percpu. With units put into atom_size aligned groups according to cpu distances, using large pages for dynamic chunks is also easily possible, falling back to regular pages if large allocation fails. Embedding allocator users are converted to specify NULL cpu_distance_fn, so this patch doesn't cause any visible behavior difference; following patches will convert them to use the NUMA capability. Signed-off-by: Tejun Heo <tj@kernel.org> -
Committed by Tejun Heo
To directly use spread NUMA memories for percpu units, the percpu allocator will be updated to allow sparsely mapping units in a chunk. As the distances between units can be very large, this makes allocating a single vmap area for each chunk undesirable. This patch implements pcpu_get_vm_areas() and pcpu_free_vm_areas(), which allocate and free sparse congruent vmap areas. pcpu_get_vm_areas() takes @offsets and @sizes arrays which define the distances and sizes of the vmap areas. It scans down from the top of the vmalloc area looking for the top-most address which can accommodate all the areas. The top-down scan avoids interacting with regular vmallocs, which could push these congruent areas up little by little, ending up wasting address space and page tables. To speed up the top-down scan, the highest possible address hint is maintained. Although the scan is linear from the hint, given the usual large holes between the memory addresses of NUMA nodes, the scan is highly likely to finish after finding the first hole for the last unit, which is scanned first. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Nick Piggin <npiggin@suse.de> -
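A sketch of how a caller might use the new interface, reconstructed from the description above (the exact signature, including the trailing align parameter, is an assumption):

/* @offsets/@sizes describe nr_vms congruent areas; on success the
 * returned vm_structs share one base address found by the top-down scan */
const unsigned long offsets[] = { 0, 1UL << 30 };  /* hypothetical group offsets */
const size_t sizes[] = { 2 * PAGE_SIZE, 2 * PAGE_SIZE };
struct vm_struct **vms;

vms = pcpu_get_vm_areas(offsets, sizes, 2, PAGE_SIZE);
if (!vms)
        return -ENOMEM;
/* ... map percpu pages into vms[i]->addr ... */
pcpu_free_vm_areas(vms, 2);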
Committed by Tejun Heo
Currently units are mapped sequentially into address space. This patch adds pcpu_unit_offsets[], which allows units to be mapped at arbitrary offsets from the chunk base address. This is necessary to allow sparse embedding, which might need to allocate address ranges and memory areas which aren't aligned to the unit size but to the allocation atom size (page or large page size). This also simplifies things a bit by removing the need to calculate the offset from the unit number. With this change, there's no need for the arch code to know pcpu_unit_size. Update pcpu_setup_first_chunk() and the first chunk allocators to return a regular 0 or -errno return code instead of unit size or -errno. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: David S. Miller <davem@davemloft.net> -
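A minimal sketch of the addressing change this enables (names follow the commit text; the helper body is a reconstruction, not necessarily the exact kernel code):

/* before: addr = base + unit * pcpu_unit_size
 * after:  each cpu's unit carries its own offset from the chunk base */
static unsigned long pcpu_chunk_addr(struct pcpu_chunk *chunk,
                                     unsigned int cpu, int page_idx)
{
        return (unsigned long)chunk->base_addr + pcpu_unit_offsets[cpu] +
               (page_idx << PAGE_SHIFT);
}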
Committed by Tejun Heo
Till now, the non-linear cpu->unit map was expressed using an integer array which maps each cpu to a unit, used only by the lpage allocator. Although how many units have been placed in a single contiguous area (group) is known while building unit_map, the information is lost when the result is recorded into the unit_map array. For the lpage allocator, as all allocations are done by lpages and whether two adjacent lpages are in the same group or not is irrelevant, this didn't cause any problem. The non-linear cpu->unit mapping will be used for sparse embedding, and this grouping information is necessary for that. This patch introduces pcpu_alloc_info, which contains all the information necessary for initializing the percpu allocator. pcpu_alloc_info contains an array of pcpu_group_info which describes how units are grouped and mapped to cpus. pcpu_group_info also has a base_offset field to specify its offset from the chunk's base address. pcpu_build_alloc_info() initializes this field as if all groups are allocated back-to-back, as is currently done, but this will be used to sparsely place groups. pcpu_alloc_info is a rather complex data structure which contains a flexible array which in turn points to nested cpu_map arrays.
* pcpu_alloc_alloc_info() and pcpu_free_alloc_info() are provided to help deal with pcpu_alloc_info.
* pcpu_lpage_build_unit_map() is updated to build pcpu_alloc_info, generalized and renamed to pcpu_build_alloc_info(). @cpu_distance_fn may be NULL, indicating that all cpus are of LOCAL_DISTANCE.
* pcpul_lpage_dump_cfg() is updated to process pcpu_alloc_info, generalized and renamed to pcpu_dump_alloc_info(). It now also prints which group each alloc unit belongs to.
* pcpu_setup_first_chunk() now takes pcpu_alloc_info instead of the separate parameters. All first chunk allocators are updated to use pcpu_build_alloc_info() to build the alloc_info and call pcpu_setup_first_chunk() with it. This has the side effect of packing units for sparse possible cpus, i.e. if cpus 0, 2 and 4 are possible, they'll be assigned units 0, 1 and 2 instead of 0, 2 and 4.
* x86 setup_pcpu_lpage() is updated to deal with alloc_info.
* sparc64 setup_per_cpu_areas() is updated to build alloc_info.
Although the changes made by this patch are pretty pervasive, it doesn't cause any behavior difference other than the packing of sparse cpus. It mostly changes how information is passed among initialization functions and makes room for more flexibility. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> -
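A sketch of the two structures as the commit describes them (field list reconstructed from the text; exact types and ordering may differ):

struct pcpu_group_info {
        int             nr_units;      /* units in this group */
        unsigned long   base_offset;   /* group's offset from the chunk base */
        unsigned int    *cpu_map;      /* nested unit -> cpu map */
};

struct pcpu_alloc_info {
        size_t  static_size;
        size_t  reserved_size;
        size_t  dyn_size;
        size_t  unit_size;
        size_t  atom_size;
        size_t  alloc_size;
        int     nr_groups;
        struct pcpu_group_info groups[];   /* flexible array of groups */
};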
Committed by Tejun Heo
Unit map handling will be generalized and extended, and used for embedding a sparse first chunk and other purposes. Relocate two unit_map related functions upward in preparation. This patch just moves the code without any actual change. Signed-off-by: Tejun Heo <tj@kernel.org> -
Committed by Tejun Heo
pcpu_fc_alloc_fn_t is about to see more interesting usage; add an @align parameter. Signed-off-by: Tejun Heo <tj@kernel.org> -
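The resulting callback type, sketched from the description (the pre-patch signature lacked the final parameter):

/* arch-supplied allocator for first-chunk memory, now alignment-aware */
typedef void * (*pcpu_fc_alloc_fn_t)(unsigned int cpu, size_t size,
                                     size_t align);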
Committed by Tejun Heo
Now that all actual first chunk allocation and copying happen in the first chunk allocators and helpers, there's no reason for pcpu_setup_first_chunk() to try to determine @dyn_size automatically. The only user left is the page first chunk allocator. Make it determine dyn_size like the other allocators, and make @dyn_size mandatory for pcpu_setup_first_chunk(). Signed-off-by: Tejun Heo <tj@kernel.org> -
Committed by Tejun Heo
First chunk allocators assume percpu areas have been linked using one of the PERCPU_*() macros and depend on the __per_cpu_load symbol defined by those macros, so there isn't much point in passing in the static area size explicitly when it can easily be calculated from __per_cpu_start and __per_cpu_end. Drop @static_size from all percpu first chunk allocators and helpers. Signed-off-by: Tejun Heo <tj@kernel.org> -
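The calculation that replaces the parameter, as a one-line sketch using the linker-provided symbols named above:

extern char __per_cpu_start[], __per_cpu_end[];

const size_t static_size = __per_cpu_end - __per_cpu_start;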
Committed by Tejun Heo
Now that all first chunk allocators are in mm/percpu.c, it makes sense to generalize the percpu_alloc kernel parameter. Define PCPU_FC_* and set pcpu_chosen_fc using early_param() in mm/percpu.c. Arch code can use the set value to determine which first chunk allocator to use. Signed-off-by: Tejun Heo <tj@kernel.org> -
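A minimal sketch of the early_param() wiring described here (the exact set of enum values is an assumption; the arch-specific lpage choice is omitted):

enum pcpu_fc {
        PCPU_FC_AUTO,
        PCPU_FC_EMBED,
        PCPU_FC_PAGE,
        PCPU_FC_NR,
};

enum pcpu_fc pcpu_chosen_fc __initdata = PCPU_FC_AUTO;

static int __init percpu_alloc_setup(char *str)
{
        if (!strcmp(str, "embed"))
                pcpu_chosen_fc = PCPU_FC_EMBED;
        else if (!strcmp(str, "page"))
                pcpu_chosen_fc = PCPU_FC_PAGE;
        return 0;
}
early_param("percpu_alloc", percpu_alloc_setup);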
Committed by Tejun Heo
There's no need to build unused first chunk allocators in. Define CONFIG_NEED_PER_CPU_*_FIRST_CHUNK and let archs enable them selectively. Signed-off-by: Tejun Heo <tj@kernel.org> -
Committed by Tejun Heo
Page size isn't always 4k, depending on arch and configuration. Rename the 4k first chunk allocator to page. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: David Howells <dhowells@redhat.com>
-
- 13 Aug 2009, 2 commits
-
-
Committed by Peter Zijlstra
Replace PERF_SAMPLE_GROUP with PERF_SAMPLE_READ and introduce PERF_FORMAT_GROUP to deal with group reads in a more generic way. This allows you to get group reads out of read() as well. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey J Ashford <cjashfor@us.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> LKML-Reference: <20090813103655.117411814@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
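A sketch of the read() layout PERF_FORMAT_GROUP implies, reconstructed from the description (the field grouping is illustrative, not the exact ABI definition):

/*
 * struct read_format {
 *     u64 nr;               // number of counters in the group
 *     u64 time_enabled;     // if PERF_FORMAT_TOTAL_TIME_ENABLED
 *     u64 time_running;     // if PERF_FORMAT_TOTAL_TIME_RUNNING
 *     struct {
 *         u64 value;
 *         u64 id;           // if PERF_FORMAT_ID
 *     } cntr[nr];           // one entry per group member
 * };
 */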
-
Committed by Ingo Molnar
Provide weak aliases for hw_perf_counter_setup_online(). This is used by the BTS patches (for v2.6.32), but it interacts with fixes, so propagate this upstream. (It has no effect as of yet.) Also export perf_counter_output() to architecture code. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 12 Aug 2009, 1 commit
-
-
Committed by Trond Myklebust
We can't call nfs_readdata_release()/nfs_writedata_release() without first initialising and referencing args.context. Doing so inside nfs_direct_read_schedule_segment()/nfs_direct_write_schedule_segment() causes an Oops. We should rather be calling nfs_readdata_free()/nfs_writedata_free() in those cases. Looking at the O_DIRECT code, the "struct nfs_direct_req" is already referencing the nfs_open_context for us. Since the readdata and writedata structures carry a reference to that, we can simplify things by getting rid of the extra nfs_open_context references, so that we can replace all instances of nfs_readdata_release()/nfs_writedata_release(). Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 Aug 2009, 2 commits
-
-
Committed by Peter Zijlstra
Give waitqueue spinlocks their own lockdep classes when they are initialised from init_waitqueue_head(). This means that struct wait_queue::func functions can operate on other waitqueues. This is used by CacheFiles to catch the page from a backing fs being unlocked and to wake up another thread to take a copy of it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Takashi Iwai <tiwai@suse.de> Cc: linux-cachefs@redhat.com Cc: torvalds@osdl.org Cc: akpm@linux-foundation.org LKML-Reference: <20090810113305.17284.81508.stgit@warthog.procyon.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
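The standard kernel idiom for per-callsite lockdep keys, sketched as the shape this change takes (a reconstruction, not the verbatim patch):

/* each init_waitqueue_head() callsite gets its own static key, so
 * lockdep can tell the resulting spinlock classes apart */
#define init_waitqueue_head(q)                          \
        do {                                            \
                static struct lock_class_key __key;     \
                __init_waitqueue_head((q), &__key);     \
        } while (0)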
-
Committed by Peter Zijlstra
PERF_SAMPLE_* output switches should unconditionally output the correct format, as they are the only way to unambiguously parse the PERF_EVENT_SAMPLE data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249896447.17467.74.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 09 Aug 2009, 2 commits
-
-
Committed by Frederic Weisbecker
Based on Peter's comments, make tracepoint sampling generic, just like all the other sampling bits are. This is a rename with no code changes:
- PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW
- struct perf_tracepoint_record to perf_raw_record
We want the mechanism that transports raw tracepoint sample events into the perf ring buffer to be generic and usable by any type of counter. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Frederic Weisbecker
This patch implements the kernel-side support for ftrace event record sampling. A new counter sampling attribute is added: PERF_SAMPLE_TP_RECORD, which requests ftrace event record sampling. In this case, if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint fires, we emit the tracepoint binary record to the perfcounter event buffer as a sample. Result, after setting the PERF_SAMPLE_TP_RECORD attribute from perf record:

perf record -f -F 1 -a -e workqueue:workqueue_execution
perf report -D

0x21e18 [0x48]: event: 9
.
. ... raw event: size 72 bytes
.  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
.  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
.  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
.  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
.  0040:  e0 b1 31 81 ff ff ff ff                          .......
.
0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33

The raw ftrace binary record starts at offset 0020. Translation:

struct trace_entry {
        type          = 0x2b = 43;
        flags         = 1;
        preempt_count = 2;
        pid           = 0xa = 10;
        tgid          = 0xa = 10;
}

thread_comm = "events/1"
thread_pid  = 0xa = 10;
func        = 0xffffffff8131b1e0 = flush_to_ldisc()

What will come next?
- Userspace support ('perf trace'), 'flight data recorder' mode for perf trace, etc.
- The unconditional copy from the profiling callback brings some cost, however, if someone wants no such sampling to occur, and needs to be fixed in the future. For that we need instant access to the perf counter attribute. This is a matter of a flag to add in struct ftrace_event.
- Take care of event recursion! Don't ever try to record a lock event, for example: some locking is used in the profiling fast path and leads to tracing recursion. That will be fixed using raw spinlocks or recursion protection.
- [...]
- Profit! :-)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 08 Aug 2009, 4 commits
-
-
Committed by Phillip Lougher
Fix and improve the comments in decompress/generic.h that describe the decompressor API. Also remove an unused definition, and rename INBUF_LEN in lib/decompress_inflate.c to conform to the bzip2/lzma naming. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by KAMEZAWA Hiroyuki
At first, init_task's mems_allowed is initialized as:

init_task->mems_allowed == node_state[N_POSSIBLE]

And cpuset's top_cpuset mask is initialized as:

top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY]

Before 2.6.29, a policy's mems_allowed was initialized as follows:
1. update task->mems_allowed from its cpuset->mems_allowed.
2. policy->mems_allowed = nodes_and(task->mems_allowed, user's mask)

The task's mems_allowed was updated in reference to top_cpuset's, and a cpuset's mems_allowed is always aware of N_HIGH_MEMORY. In 2.6.30, after commit 58568d2a ("cpuset,mm: update tasks' mems_allowed in time"), a policy's mems_allowed is initialized as:
1. policy->mems_allowed = nodes_and(task->mems_allowed, user's mask)

Here, if the task is in top_cpuset, task->mems_allowed is not updated from init's one. Assume the user executes a command such as:

# numactl --interleave=all ...

Then:

policy->mems_allowed = nodes_and(N_POSSIBLE, ALL_SET_MASK)

So the policy's mems_allowed can include a possible node which has no pgdat. MPOL_INTERLEAVE just scans the nodemask of task->mems_allowed and accesses NODE_DATA(nid)->zonelist directly, even if NODE_DATA(nid) == NULL. So what we need is to make policy->mems_allowed aware of N_HIGH_MEMORY. This patch does that. But to do so, an extra nodemask would end up on the stack. Because I know cpumask has the CPUMASK_ALLOC() interface, I added the equivalent for nodes. This patch restores the old behavior, but I feel this fix itself is just a Band-Aid. To do a fundamental fix, we have to take care of memory hotplug, and that takes time (task->mems_allowed should be N_HIGH_MEMORY, I think). mpol_set_nodemask() should be aware of N_HIGH_MEMORY, and the policy's nodemask should include only online nodes. In the old behavior, this was guaranteed by frequent references to cpuset code. Now, most of those are removed and mempolicy has to check it by itself. To do the check, a few nodemask_t are used for calculating the nodemask, but the size of nodemask_t can be big and it's not good to allocate them on the stack. cpumask_t has CPUMASK_ALLOC/FREE, an easy way to get scratch area; NODEMASK_ALLOC/FREE should be there too. [akpm@linux-foundation.org: cleanups & tweaks] Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Paul Menage <menage@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: David Rientjes <rientjes@google.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
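A sketch of the NODEMASK_ALLOC/NODEMASK_FREE helpers being proposed, modeled on CPUMASK_ALLOC(); the size threshold and exact macro bodies here are assumptions for illustration:

/* large nodemasks come from the heap instead of the stack */
#if NODES_SHIFT > 8
#define NODEMASK_ALLOC(x, m)  struct x *m = kmalloc(sizeof(*m), GFP_KERNEL)
#define NODEMASK_FREE(m)      kfree(m)
#else
#define NODEMASK_ALLOC(x, m)  struct x _##m, *m = &_##m
#define NODEMASK_FREE(m)      do { } while (0)
#endif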
-
Committed by Christoph Hellwig
When we want to tear down an inode that lost the add-to-cache race in XFS, we must not call into ->destroy_inode, because that would delete the inode that won the race from the inode cache radix tree. This patch provides the __destroy_inode helper needed to fix this; the actual fix will be in the next patch. As XFS was the only reason destroy_inode was exported, we shift the export to the new __destroy_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
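A sketch of the split this helper creates, reconstructed from the description (the exact body of __destroy_inode is an assumption):

/* generic teardown only - no filesystem ->destroy_inode callback, so a
 * loser of the cache race can be freed without touching the winner's
 * radix-tree entry */
void __destroy_inode(struct inode *inode)
{
        BUG_ON(inode_has_buffers(inode));
        security_inode_free(inode);
}
EXPORT_SYMBOL(__destroy_inode);

void destroy_inode(struct inode *inode)
{
        __destroy_inode(inode);
        if (inode->i_sb->s_op->destroy_inode)
                inode->i_sb->s_op->destroy_inode(inode);
        else
                kmem_cache_free(inode_cachep, inode);
}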
-
Committed by Christoph Hellwig
Currently inode_init_always calls into ->destroy_inode if the additional initialization fails. That's not only counter-intuitive, because inode_init_always did not allocate the inode structure, but in the case of XFS it's actively harmful, as ->destroy_inode might delete the inode from a radix tree that it has never been added to. This in turn might end up deleting the inode for the same inum that has been instantiated by another process and cause lots of subtle problems. Also, in the case of re-initializing a reclaimable inode in XFS, it would free an inode we still want to keep alive. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
-
- 06 Aug 2009, 2 commits
-
-
Committed by Eric Miao
Remove assumptions about the shift and size of rows/columns from the matrix_keypad driver. Signed-off-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
-
Committed by Peter Zijlstra
Not all tracepoints are created equal; in particular, the ftrace tracepoints are created with TRACE_EVENT_FORMAT(), which does not generate the bits needed to tie them into perf counters. For those events, don't create the 'id' file, and fail ->profile_enable when their ID is specified through other means. Reported-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1249497664.5890.4.camel@laptop> [ v2: fix build error in the !CONFIG_EVENT_PROFILE case ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 05 Aug 2009, 2 commits
-
-
Committed by Michael S. Tsirkin
kvm_notify_acked_irq does not check the irq type, so it sometimes interprets an MSI vector as an irq. As a result, ack notifiers are not called, which typically hangs the guest. The fix is to track and check the irq type. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Committed by Linus Torvalds
This is pure preparation for changing the ldisc reference counting to be a true refcount that defines the lifetime of the ldisc. But this is a purely syntactic change for now, to make the next steps easier. This patch should make no semantic changes at all. But I wanted to make the ldisc refcount be an atomic (I will be touching it without locks soon enough), and I wanted to rename it so that there isn't quite as much confusion between 'ldo->refcount' (ldisc operations refcount) and 'ld->refcount' (ldisc refcount itself) in the same file. So it's now an atomic 'ld->users' count. It still starts at zero, despite having a reference from 'tty->ldisc', but that will change once we turn it into a _real_ refcount. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
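The shape of the rename, sketched from the commit text (surrounding fields elided):

struct tty_ldisc {
        struct tty_ldisc_ops *ops;   /* 'ldo' - has its own refcount */
        atomic_t users;              /* was: int refcount */
};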
-
- 03 Aug 2009, 2 commits
-
-
Committed by Saeed Bishara
This patch fixes a bug when converting a dev to an mtd_info: use the drvdata of the dev. The previous code used container_of(dev, struct mtd_info, dev), which doesn't work for the mtdXro devices, as they are created without being contained inside an mtd_info structure. Signed-off-by: Saeed Bishara <saeed@marvell.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
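A sketch of the lookup before and after the fix (the helper name is illustrative):

static struct mtd_info *dev_to_mtd(struct device *dev)
{
        /* broken for mtdXro devices, which are not embedded in mtd_info:
         *     return container_of(dev, struct mtd_info, dev);
         * fixed: the mtd core stores the mtd_info as drvdata instead */
        return dev ? dev_get_drvdata(dev) : NULL;
}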
-
Committed by Nicolas Pitre
When declaring static MTD partitions in board-specific code, including only <linux/mtd/partitions.h> should suffice, without gcc nagging us with:

In file included from arch/arm/mach-kirkwood/sheevaplug-setup.c:14:
include/linux/mtd/partitions.h:50: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:50: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mtd/partitions.h:51: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:61: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:67: warning: 'struct mtd_info' declared inside parameter list

Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
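Those warnings indicate a missing forward declaration; a minimal sketch of the fix in the header:

/* forward declaration so the prototypes below have a well-scoped type */
struct mtd_info;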
-
- 02 Aug 2009, 1 commit
-
-
Committed by Peter Zijlstra
In order to be able to distinguish between no samples due to inactivity and no samples due to the task having ended, Arjan asked for PERF_EVENT_EXIT events. This is useful for the boot delay instrumentation (bootchart) app. This patch changes PERF_EVENT_FORK to be emitted on every clone, and adds PERF_EVENT_EXIT, emitted on task exit after the task's counters have been closed. This task tracing is controlled through attr.comm || attr.mmap and through the new attr.task field. Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> [ cleaned up perf_counter.h a bit ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
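How a consumer would opt in, sketched from the attribute names in the text (the perf_counter-era struct name and single-bit layout are assumptions):

struct perf_counter_attr attr = {};

attr.task = 1;  /* new bit: request PERF_EVENT_FORK/PERF_EVENT_EXIT records */
/* attr.comm = 1; and/or attr.mmap = 1; also enable the task tracing */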
-
- 01 Aug 2009, 1 commit
-
-
Committed by Martin K. Petersen
Introduce blk_limits_io_min() and make blk_queue_io_min() call it. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
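A sketch of the refactor (the clamping against the block sizes is an assumption about the helper's body, based on how io_min behaves):

void blk_limits_io_min(struct queue_limits *limits, unsigned int min)
{
        limits->io_min = min;

        /* io_min can never be smaller than the device's block sizes */
        if (limits->io_min < limits->logical_block_size)
                limits->io_min = limits->logical_block_size;
        if (limits->io_min < limits->physical_block_size)
                limits->io_min = limits->physical_block_size;
}

void blk_queue_io_min(struct request_queue *q, unsigned int min)
{
        blk_limits_io_min(&q->limits, min);   /* now a thin wrapper */
}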
-
- 31 Jul 2009, 4 commits
-
-
Committed by Magnus Damm
To fix the common case where ->enable() does not set up mult, make sure mult_orig is saved in mult on disable. Also add comments to explain why we do this. Signed-off-by: Magnus Damm <damm@igel.co.jp> Cc: johnstul@us.ibm.com Cc: lethal@linux-sh.org Cc: akpm@linux-foundation.org LKML-Reference: <20090618152432.10136.9932.sendpatchset@rx1.opensource.se> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Sebastian Andrzej Siewior
The code already uses flush_kernel_dcache_page(). This patch updates the driver to the recent sg API changes, which require that either SG_MITER_TO_SG or SG_MITER_FROM_SG is set. SG_MITER_TO_SG calls flush_kernel_dcache_page() in sg_miter_stop(). Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Acked-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Pierre Ossman <pierre@ossman.eu>
-
Committed by Sebastian Andrzej Siewior
sg_miter_start() is currently unaware of the direction of the copy process (to or from the scatterlist). It is important to know the direction because the page has to be flushed in case the data written is seen on a different mapping in user land on cache-incoherent architectures. Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Pierre Ossman <pierre@ossman.eu>
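A usage sketch of the direction-aware iterator (flag names from the two commits above; the loop mirrors the usual sg_miter pattern):

struct sg_mapping_iter miter;

/* SG_MITER_FROM_SG: we only read the scatterlist, no flush needed */
sg_miter_start(&miter, sgl, sg_len, SG_MITER_FROM_SG);
while (sg_miter_next(&miter)) {
        memcpy(dst, miter.addr, miter.length); /* miter.addr is the mapped chunk */
        dst += miter.length;
}
sg_miter_stop(&miter);  /* with SG_MITER_TO_SG this flushes the dirty page */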
-
Committed by Li Zefan
Commit d9c7d394 ("block: prevent possible io_context->refcount overflow") mistakenly changed atomic_inc(&ioc->nr_tasks) to atomic_long_inc(&ioc->refcount). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 30 Jul 2009, 5 commits
-
-
Committed by Rusty Russell
I've been doing this for years, and akpm picked me up on it about 12 months ago. lguest partly serves as example code, so let's do it Right. Also, remove two unused fields in struct vblk_info in the example launcher. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com> -
Committed by Rusty Russell
I don't really notice it (except to begrudge the extra vertical space), but Ingo does. And he pointed out that one excuse for lguest is as a teaching tool; it should set a good example. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com> -
Committed by Jiri Slaby
To avoid userspace build failures such as:

.../linux/uio.h:37: error: expected `=', `,', `;', `asm' or `__attribute__' before `iov_length'
.../linux/uio.h:47: error: expected declaration specifiers or `...' before `size_t'

move the uio functions inside a __KERNEL__ block. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
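A sketch of the guard (iov_length is named in the error output above; its body here follows the long-standing kernel helper):

#ifdef __KERNEL__
/* kernel-only helpers - hidden from the exported userspace header */
static inline size_t iov_length(const struct iovec *iov, unsigned long nr_segs)
{
        unsigned long seg;
        size_t ret = 0;

        for (seg = 0; seg < nr_segs; seg++)
                ret += iov[seg].iov_len;
        return ret;
}
#endif /* __KERNEL__ */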
-
Committed by Dave Hansen
Once a structure goes over PAGE_SIZE*2, we see occasional allocation failures. Some people have chosen to switch over to things like vmalloc() that will let them keep array-like access to such large structures. But vmalloc() has plenty of downsides. Here's an alternative. I think it's what Andrew was suggesting here: http://lkml.org/lkml/2009/7/2/518

I call it a flexible array. It does all of its work in PAGE_SIZE bits, so it never does an order>0 allocation. The base level has PAGE_SIZE-2*sizeof(int) bytes of storage for pointers to the second level. So, with a 32-bit arch, you get about 4MB (4183112 bytes) of total storage when the objects pack nicely into a page. It is half that on 64-bit because the pointers are twice the size. There's a table detailing this in the code. There are kerneldocs for the functions, but here's an overview:

flex_array_alloc() - dynamically allocate a base structure
flex_array_free() - free the array and all of the second-level pages
flex_array_free_parts() - free the second-level pages, but not the base (for static bases)
flex_array_put() - copy into the array at the given index
flex_array_get() - copy out of the array at the given index
flex_array_prealloc() - preallocate the second-level pages between the given indexes to guarantee no allocs will occur at put() time

We could also potentially just pass the "element_size" into each of the API functions instead of storing it internally. That would get us one more base pointer on 32-bit. I've been testing this by running it in userspace. The header and patch that I've been using are here, as well as the little script I'm using to generate the size table which goes in the kerneldocs: http://sr71.net/~dave/linux/flexarray/

[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
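A usage sketch of the API overview above (the item struct, sizes and indexes are hypothetical; error handling elided):

struct my_item { int a, b; };
struct my_item item = { 1, 2 }, *p;
struct flex_array *fa;

fa = flex_array_alloc(sizeof(struct my_item), 1000, GFP_KERNEL);
flex_array_prealloc(fa, 0, 999, GFP_KERNEL); /* no allocations at put() time */
flex_array_put(fa, 42, &item, GFP_KERNEL);   /* copies item into slot 42 */
p = flex_array_get(fa, 42);                  /* retrieves the stored copy */
flex_array_free(fa);                         /* frees base + second level */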
-
Committed by Dave Jones
Found with make headers_check:

/usr/include/linux/pps.h:52: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Dave Jones <davej@redhat.com> Cc: Rodolfo Giometti <giometti@linux.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
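The fix headers_check is asking for, as a one-line sketch at the top of the header:

#include <linux/types.h>   /* for the __u32 etc. types used below */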
-