Commits · 8bb79224b87aab92071e94d46e70bd160d89bf34 · openanolis / cloud-kernel

27 Jul, 2008 1 commit

dma-mapping: add the device argument to dma_mapping_error() · 8d8bb39b

Committed by FUJITA Tomonori on Jul 25, 2008

Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
architecture does:

This enables us to cleanly fix the Calgary IOMMU issue that some devices
are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).

I think that per-device dma_mapping_ops support would also be helpful for
KVM people to support PCI passthrough but Andi thinks that this makes it
difficult to support the PCI passthrough (see the above thread).  So I
CC'ed this to KVM camp.  Comments are appreciated.

A pointer to dma_mapping_ops is added to struct dev_archdata.  If the
pointer is non-NULL, DMA operations in asm/dma-mapping.h use it.  If it's
NULL, the system-wide dma_ops pointer is used as before.

If it's useful for KVM people, I plan to implement a mechanism to register
a hook called when a new pci (or dma capable) device is created (it works
with hot plugging).  It enables IOMMUs to set up an appropriate
dma_mapping_ops per device.

The major obstacle is that dma_mapping_error doesn't take a pointer to the
device unlike other DMA operations.  So x86 can't have dma_mapping_ops per
device.  Note all the POWER IOMMUs use the same dma_mapping_error function
so this is not a problem for POWER but x86 IOMMUs use different
dma_mapping_error functions.

The first patch adds the device argument to dma_mapping_error.  The patch
is trivial but large since it touches lots of drivers and dma-mapping.h in
all the architectures.

This patch:

dma_mapping_error() doesn't take a pointer to the device unlike other DMA
operations.  So we can't have dma_mapping_ops per device.

Note that POWER already has dma_mapping_ops per device but all the POWER
IOMMUs use the same dma_mapping_error function.  x86 IOMMUs use device
argument.
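
The fallback rule described above (use the per-device ops pointer if it is
non-NULL, otherwise the system-wide one) can be sketched in plain C.  This is
only an illustration under stated assumptions: the struct layouts, the two
error conventions and every name except dma_mapping_error are made up for the
example and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* Hypothetical, reduced ops table: only the error check is modelled. */
struct dma_mapping_ops {
	bool (*mapping_error)(dma_addr_t addr);
};

/* Hypothetical device: a NULL dma_ops means "use the system-wide ops". */
struct device {
	const struct dma_mapping_ops *dma_ops;
};

/* Two made-up error conventions, standing in for different IOMMUs. */
static bool nommu_mapping_error(dma_addr_t addr) { return addr == 0; }
static bool iommu_mapping_error(dma_addr_t addr) { return addr == ~(dma_addr_t)0; }

static const struct dma_mapping_ops nommu_ops = { nommu_mapping_error };
static const struct dma_mapping_ops iommu_ops = { iommu_mapping_error };

/* System-wide default, used when a device has no ops of its own. */
static const struct dma_mapping_ops *system_dma_ops = &iommu_ops;

static const struct dma_mapping_ops *get_dma_ops(const struct device *dev)
{
	if (dev != NULL && dev->dma_ops != NULL)
		return dev->dma_ops;
	return system_dma_ops;
}

/* With the device argument, the check can dispatch per device. */
static bool dma_mapping_error(const struct device *dev, dma_addr_t addr)
{
	return get_dma_ops(dev)->mapping_error(addr);
}
```

Without the dev argument, dma_mapping_error() in this sketch would have no way
to tell which of the two conventions applies to a given address.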

[akpm@linux-foundation.org: fix sge]
[akpm@linux-foundation.org: fix svc_rdma]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix bnx2x]
[akpm@linux-foundation.org: fix s2io]
[akpm@linux-foundation.org: fix pasemi_mac]
[akpm@linux-foundation.org: fix sdhci]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc]
[akpm@linux-foundation.org: fix ibmvscsi]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

26 Jul, 2008 3 commits

remove dummy asm/kvm.h files · 7dcf2a9f

Committed by Adrian Bunk on Jul 01, 2008

This patch removes the dummy asm/kvm.h files on architectures not (yet)
supporting KVM and uses the same conditional headers installation as
already used for a.out.h .

Also removed are superfluous install rules in the s390 and x86 Kbuild
files (they are already in Kbuild.asm).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

include/asm/ptrace.h userspace headers cleanup · f22ab814

Committed by Adrian Bunk on Jul 25, 2008

This patch contains the following cleanups for the asm/ptrace.h
userspace headers:

- include/asm-generic/Kbuild.asm already lists ptrace.h, remove
  the superfluous listings in the Kbuild files of the following
  architectures:
  - cris
  - frv
  - powerpc
  - x86
- don't expose function prototypes and macros to userspace:
  - arm
  - blackfin
  - cris
  - mn10300
  - parisc
- remove #ifdef CONFIG_'s around #define's:
  - blackfin
  - m68knommu
- sh: AFAIK __SH5__ should work in both kernel and userspace,
      no need to leak CONFIG_SUPERH64 to userspace
- xtensa: cosmetic change to remove empty
            #ifndef __ASSEMBLY__ #else #endif
          from the userspace headers

Not changed by this patch is the fact that the following architectures
have a different struct pt_regs depending on CONFIG_ variables:
- h8300
- m68knommu
- mips

This does not work in userspace.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Chris Zankel <chris@zankel.net>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

clean up duplicated alloc/free_thread_info · b69c49b7

Committed by FUJITA Tomonori on Jul 25, 2008

We duplicate alloc/free_thread_info defines on many platforms (the
majority uses __get_free_pages/free_pages).  This patch defines common
defines and removes these duplicated defines.
__HAVE_ARCH_THREAD_INFO_ALLOCATOR is introduced for platforms that do
something different.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

25 Jul, 2008 4 commits

ide: remove <asm/ide.h> for some archs · 2a8f7450

Committed by Bartlomiej Zolnierkiewicz on Jul 24, 2008

* Remove <linux/irq.h> include from <asm-ia64.h> (<linux/ide.h> includes
  <linux/interrupt.h> which is enough).

* Remove <asm/ide.h> for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa
  (this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]).

There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: define MAX_HWIFS in <linux/ide.h> · d83b8b85

Committed by Bartlomiej Zolnierkiewicz on Jul 24, 2008

* Now that ide_hwif_t instances are allocated dynamically
  the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10
  is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs
  except these ones that use MAX_HWIFS == 1.

* Define MAX_HWIFS in <linux/ide.h> instead of <asm/ide.h>.

[ Please note that avr32/cris/v850 have no <asm/ide.h>
  and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: fix <asm-xtensa/ide.h> · b0a62817

Committed by Bartlomiej Zolnierkiewicz on Jul 24, 2008

* Add missing <asm-generic/ide_iops.h> include.

While at it:

* Remove needless ide_default_{irq,io_base}() inlines.

Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures · 27ac792c

Committed by Andrea Righi on Jul 23, 2008

On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
boundary. For example:

	u64 val = PAGE_ALIGN(size);

always returns a value < 4GB even if size is greater than 4GB.

The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
example):

#define PAGE_SHIFT      12
#define PAGE_SIZE       (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK       (~(PAGE_SIZE-1))
...
#define PAGE_ALIGN(addr)       (((addr)+PAGE_SIZE-1)&PAGE_MASK)

The "~" is performed on a 32-bit value, so everything in "and" with
PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
Using the ALIGN() macro seems to be the right way, because it uses
typeof(addr) for the mask.

Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h into
include/linux/mm.h.

See also lkml discussion: http://lkml.org/lkml/2008/6/11/237
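
The truncation can be reproduced in portable C.  The sketch below is an
illustration, not the kernel macros: it models a 32-bit unsigned long with
uint32_t, and the function names are made up for the example.  The kernel's
ALIGN() derives the mask type with typeof(); here the mask is simply widened
to 64 bits by hand.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
/* 32-bit page size and mask, as on an arch where unsigned long is 32 bits. */
#define DEMO_PAGE_SIZE  ((uint32_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Old behaviour: the "~" is computed in 32 bits, so the mask is
 * zero-extended to 0x00000000fffff000 before the 64-bit AND. */
static uint64_t page_align_old(uint64_t addr)
{
	return (addr + DEMO_PAGE_SIZE - 1) & DEMO_PAGE_MASK;
}

/* Fixed behaviour: build the mask in the type of the value being aligned. */
static uint64_t page_align_new(uint64_t addr)
{
	uint64_t low_bits = (uint64_t)DEMO_PAGE_SIZE - 1;

	return (addr + low_bits) & ~low_bits;
}
```

For a value just above 4GB, such as 0x100000001, the old form yields 0x1000
while the fixed form yields 0x100001000.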

[akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
[akpm@linux-foundation.org: fix v850]
[akpm@linux-foundation.org: fix powerpc]
[akpm@linux-foundation.org: fix arm]
[akpm@linux-foundation.org: fix mips]
[akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
[akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
[akpm@linux-foundation.org: fix powerpc]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

24 Jul, 2008 1 commit

Remove asm/semaphore.h · 2351ec53

Committed by Matthew Wilcox on Jul 24, 2008

All users have now been converted to linux/semaphore.h and we don't need
to keep these files around any longer.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>

15 May, 2008 1 commit

asm-{alpha,h8300,um,v850,xtensa}/param.h: unbreak HZ for userspace · b7cffc1f

Committed by Mike Frysinger on May 14, 2008

I noticed this because alpha was broken due to the recent commit
bdc80787 ("avoid overflows in kernel/time.c").  Most arches do something like
this in their asm/param.h:

#ifdef __KERNEL__
# define HZ CONFIG_HZ
#else
# define HZ 100
#endif

A few arches though (namely alpha/h8300/um/v850/xtensa) either do not set
HZ at all for !__KERNEL__, or they set it wrongly.  This should bring all
arches in line by setting up HZ for userspace.

Without this currently perl 5.10 doesn't build on alpha:

perl.c: In function 'perl_construct':
perl.c:388: error: 'CONFIG_HZ' undeclared (first use in this function)
-> http://buildd.debian.org/fetch.cgi?pkg=perl;ver=5.10.0-10;arch=alpha;stamp=1210252894
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: maximilian attems <max@stro.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ HZ on alpha is 1024 for historical reasons.  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

03 May, 2008 1 commit

xtensa: types: use <asm-generic/int-*.h> for the xtensa architecture · 4cf63c8a

Committed by H. Peter Anvin on Apr 06, 2008

This modifies <asm-xtensa/types.h> to use the <asm-generic/int-*.h>
generic include files.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Chris Zankel <chris@zankel.net>

29 Apr, 2008 1 commit

kernel: Move arches to use common unaligned access · 6510d419

Committed by Harvey Harrison on Apr 29, 2008

Unaligned access is ok for the following arches:
cris, m68k, mn10300, powerpc, s390, x86

Arches that use the memmove implementation for native endian, and
the byteshifting for the opposite endianness:
h8300, m32r, xtensa

Packed struct for native endian, byteshifting for other endian:
alpha, blackfin, ia64, parisc, sparc, sparc64, mips, sh

m68knommu is generic_be for Coldfire, otherwise unaligned access is ok.

frv and arm choose endianness based on compiler settings and use the byteshifting
versions.  Remove the unaligned trap handler from frv as it is now unused.

v850 is le, uses the byteshifting versions for both be and le.

Remove the now unused asm-generic implementation.
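
The two portable strategies named above can be illustrated in userspace C.
This is a sketch only: the demo_ function names are hypothetical and are not
the kernel's helpers, and memcpy stands in for the memmove-based variant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byteshifting: assembles the value one byte at a time, so it is
 * endian-explicit and safe at any alignment. */
static uint32_t demo_get_unaligned_le32(const void *p)
{
	const uint8_t *b = p;

	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

static uint32_t demo_get_unaligned_be32(const void *p)
{
	const uint8_t *b = p;

	return (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16 |
	       (uint32_t)b[2] << 8 | (uint32_t)b[3];
}

/* Copy-based: moves the bytes into an aligned temporary, which gives the
 * native-endian value without ever issuing an unaligned load. */
static uint32_t demo_get_unaligned_native32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Reading from an odd offset of a byte buffer exercises all three; the native
result matches the little-endian or the big-endian one depending on the host.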
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

28 Apr, 2008 1 commit

mm: introduce pte_special pte bit · 7e675137

Committed by Nick Piggin on Apr 28, 2008

s390 for one, cannot implement VM_MIXEDMAP with pfn_valid, due to their memory
model (which is more dynamic than most).  Instead, they had proposed to
implement it with an additional path through vm_normal_page(), using a bit in
the pte to determine whether or not the page should be refcounted:

vm_normal_page()
{
	...
        if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
                if (vma->vm_flags & VM_MIXEDMAP) {
#ifdef s390
			if (!mixedmap_refcount_pte(pte))
				return NULL;
#else
                        if (!pfn_valid(pfn))
                                return NULL;
#endif
                        goto out;
                }
	...
}

This is fine, however if we are allowed to use a bit in the pte to determine
refcountedness, we can use that to _completely_ replace all the vma based
schemes.  So instead of adding more cases to the already complex vma-based
scheme, we can have a clearly separate and simple pte-based scheme (and get
slightly better code generation in the process):

vm_normal_page()
{
#ifdef s390
	if (!mixedmap_refcount_pte(pte))
		return NULL;
	return pte_page(pte);
#else
	...
#endif
}

And finally, we may rather make this concept usable by any architecture rather
than making it s390 only, so implement a new type of pte state for this.
Unfortunately the old vma based code must stay, because some architectures may
not be able to spare pte bits.  This makes vm_normal_page a little bit more
ugly than we would like, but the 2 cases are clearly separate.

So introduce a pte_special pte state, and use it in mm/memory.c.  It is
currently a noop for all architectures, so this doesn't actually result in any
compiled code changes to mm/memory.o.

BTW:
I haven't put vm_normal_page() into arch code as-per an earlier suggestion.
The reason is that, regardless of where vm_normal_page is actually
implemented, the *abstraction* is still exactly the same. Also, while it
depends on whether the architecture has pte_special or not, those are the
only two possible cases, and it really isn't an arch specific function --
the role of the arch code should be to provide primitive functions and
accessors with which to build the core code; pte_special does that. We do
not want architectures to know or care about vm_normal_page itself, and
we definitely don't want them being able to invent something new there
out of sight of mm/ code. If we made vm_normal_page an arch function, then
we have to make vm_insert_mixed (next patch) an arch function too. So I
don't think moving it to arch code fundamentally improves any abstractions,
while it does practically make the code more difficult to follow, for both
mm and arch developers, and easier to misuse.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Carsten Otte <cotte@de.ibm.com>
Cc: Jared Hulbert <jaredeh@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

17 Apr, 2008 1 commit

Generic semaphore implementation · 64ac24e7

Committed by Matthew Wilcox on Mar 07, 2008

Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility.  Thanks to Peter Zijlstra for fixing the lockdep
warning.  Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>

03 Apr, 2008 1 commit

kvm: provide kvm.h for all architecture: fixes headers_install · dd135ebb

Committed by Christian Borntraeger on Apr 02, 2008

Currently include/linux/kvm.h is not considered by make headers_install,
because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h".  This problem
was introduced by

commit fb56dbb3
Author: Avi Kivity <avi@qumranet.com>
Date:   Sun Dec 2 10:50:06 2007 +0200

    KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM

    Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
    includes, not existing.  Rather than add a zillion <asm/kvm.h>s, export kvm.
    only if the arch actually supports it.
    Signed-off-by: Avi Kivity <avi@qumranet.com>

which makes this a 2.6.25 regression.

One way of solving the issue is to enhance Kbuild, but Avi and David convinced
me that changing headers_install is not the way to go.  This patch changes
the definition for linux/kvm.h to unifdef-y.

If unifdef-y is used for linux/kvm.h "make headers_check" will fail on all
architectures without asm/kvm.h.  Therefore, this patch also provides
asm/kvm.h on all architectures.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@qumranet.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

14 Feb, 2008 12 commits

[XTENSA] Allow debugger to modify the WINDOWBASE register. · 42086cec

Committed by Chris Zankel on Jan 28, 2008

For the 'return' command, GDB needs to adjust WINDOWBASE.
In case WB is different from 0, we need to rotate the
window register file and update WINDOWSTART and WMASK.
This patch also removes some ret|= statements for
__get_user/__put_user as the address range was already
checked a couple of lines earlier.
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Fix cache flush macro for D$/I$ aliasing/non-aliasing · 9f8fcf38

Committed by Chris Zankel on Jan 18, 2008

For configurations that have aliasing in the data cache but
not in the instruction cache, we don't need to flush the
instruction cache. Thus, we didn't define the macros to
flush the instruction cache. Some cache-flush functions,
however, were using those macros.
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Exclude thread-global registers from the xtregs structures. · 67926257

Committed by Chris Zankel on Jan 15, 2008

Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Add support for configurable registers and coprocessors · c658eac6

Committed by Chris Zankel on Feb 12, 2008

The Xtensa architecture allows defining custom instructions and
registers. Registers that are bound to a coprocessor are only
accessible if the corresponding enable bit is set, which allows
implementing a 'lazy' context switch mechanism. Other registers
need to be saved and restored at the time of the context switch
or during interrupt handling.

This patch adds support for these additional states:

- save and restore registers that are used by the compiler upon
  interrupt entry and exit.
- context switch additional registers unbound to any coprocessor
- 'lazy' context switch of registers bound to a coprocessor
- ptrace interface to provide access to additional registers
- update configuration files in include/asm-xtensa/variant-fsf
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Clean up stat structs. · 71d28e6c

Committed by Bob Wilson on Oct 16, 2007

Avoid using typedefs for stat fields.
Make stat64.st_blocks an unsigned long long to avoid endian-specific
padding with 32-bit values.
Clean up signed vs. unsigned and int vs. long types to be consistent
with other uses of these values.
Signed-off-by: Bob Wilson <bob.wilson@acm.org>
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Add volatile keyword to asm statements accessing counter registers · de6b0345

Committed by Chris Zankel on Dec 19, 2007

The compiler sometimes gets too smart and doesn't reread the
counter registers and the kernel doesn't schedule until the
counter wraps around.
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Fix modules for non-exec processor configurations · 3b4a49e2

Committed by Chris Zankel on Jan 07, 2008

We need to use vmalloc_exec for module loading. Also remove
the definitions MODULE_START and MODULE_END, which weren't
used, and increase the VMALLOC memory range accordingly.
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Add missing cast in elf.h ELF_CORE_COPY_REGS() · 3e92501a

Committed by Marc Gauthier on Dec 11, 2007

Avoids compiler warning.
Signed-off-by: Marc Gauthier <marc@tensilica.com>

[XTENSA] Remove oldmask from sigcontext and fix register flush · 3befce8f

Committed by Chris Zankel on Feb 12, 2008

Remove oldmask from the sigcontext structure. Also update wmask
and windowstart when we flush the AR registers to stack.
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Clean up elf-gregset. · 8d7e8240

Committed by Chris Zankel on Feb 12, 2008

Remove additional registers from the ELF gregset structure that
are only used by the kernel or are not required or invalid in
user-space. The ar registers are always aligned to a windowbase
value of 0, and the WB register is always assumed to be 0.
Increase the size of the structure to 128 entries. This will
provide enough space in future.
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Fix clobbered register in asm macro · 70e137eb

Committed by Chris Zankel on Oct 23, 2007

We dangerously re-used an input operand to an asm macro
without defining a constraint. By defining a separate
output operand (instead of input/output operand), the
compiler is more flexible during register allocation.
Signed-off-by: Chris Zankel <chris@zankel.net>

[XTENSA] Fix non-existent pte_token_t typedef to pgtable_t · e584d85f

Committed by Chris Zankel on Feb 13, 2008

This bug was introduced in 2f569afd.
(CONFIG_HIGHPTE vs. sub-page page tables)
Signed-off-by: Chris Zankel <chris@zankel.net>

09 Feb, 2008 5 commits

fix xtensa timerfd breakage · 3a984a85

Committed by Adrian Bunk on Feb 08, 2008

In file included from /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/xtensa/kernel/syscall.c:39:
include2/asm/unistd.h:681: error: 'sys_timerfd' undeclared here (not in a function)
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Christian Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

CONFIG_HIGHPTE vs. sub-page page tables. · 2f569afd

Committed by Martin Schwidefsky on Feb 08, 2008

Background: I've implemented 1K/2K page tables for s390. These sub-page
page tables are required to properly support the s390 virtualization
instruction with KVM. The SIE instruction requires that the page tables
have 256 page table entries (pte) followed by 256 page status table entries
(pgste). The pgstes are only required if the process is using the SIE
instruction. The pgstes are updated by the hardware and by the hypervisor
for a number of reasons, one of them is dirty and reference bit tracking.
To avoid wasting memory the standard pte table allocation should return
1K/2K (31/64 bit) and 2K/4K if the process is using SIE.

Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means
the s390 version for pte_alloc_one cannot return a pointer to a struct
page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
cannot return a pointer to a pte either, since that would require more than
32 bit for the return value of pte_alloc_one (and the pte * would not be
accessible since it's not kmapped).

Solution: The only solution I found to this dilemma is a new typedef: a
pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a
later patch. For everybody else it will be a (struct page *). The
additional problem with the initialization of the ptl lock and the
NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
a destructor pgtable_page_dtor. The page table allocation and free
functions need to call these two whenever a page table page is allocated or
freed. pmd_populate will get a pgtable_t instead of a struct page pointer.
To get the pgtable_t back from a pmd entry that has been installed with
pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page
call in free_pte_range and apply_to_pte_range.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

avoid overflows in kernel/time.c · bdc80787

Committed by H. Peter Anvin on Feb 08, 2008

When the conversion factor between jiffies and milli- or microseconds is
not a single multiply or divide, as for the case of HZ == 300, we currently
do a multiply followed by a divide.  The intervening result, however, is
subject to overflows, especially since the fraction is not simplified (for
HZ == 300, we multiply by 300 and divide by 1000).

This is exposed to the user when passing a large timeout to poll(), for
example.

This patch replaces the multiply-divide with a reciprocal multiplication on
32-bit platforms.  When the input is an unsigned long, there is no portable
way to do this on 64-bit platforms since it requires a 128-bit intermediate
result (which gcc does support on 64-bit platforms but may generate libgcc
calls, e.g. on 64-bit s390), but since the output is a 32-bit integer in the
cases affected, just simplify the multiply-divide (*3/10 instead of
*300/1000).

The reciprocal multiply used can have off-by-one errors in the upper half
of the valid output range.  This could be avoided at the expense of having
to deal with a potential 65-bit intermediate result.  Since the intent is
to avoid overflow problems and most of the other time conversions are only
semiexact, the off-by-one errors were considered an acceptable tradeoff.
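
The overflow and the reciprocal-multiply idea can be sketched for HZ == 300
as follows.  This is an illustration only: the function names and the
constant are chosen for the example (the constant is ceil(3 * 2^32 / 10)) and
are not the values the patch's Perl script generates, and rounding is plain
truncation.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HZ           300u
#define DEMO_MSEC_PER_SEC 1000u

/* Multiply then divide: the 32-bit product m * 300 wraps once m exceeds
 * roughly 14.3 million milliseconds. */
static uint32_t demo_msecs_to_jiffies_naive(uint32_t m)
{
	return (m * DEMO_HZ) / DEMO_MSEC_PER_SEC;
}

/* Reciprocal multiply: scale by ceil(0.3 * 2^32) using a 64-bit
 * intermediate, then shift right by 32 instead of dividing. */
#define DEMO_MUL32 1288490189ull
#define DEMO_SHR32 32

static uint32_t demo_msecs_to_jiffies_recip(uint32_t m)
{
	return (uint32_t)(((uint64_t)m * DEMO_MUL32) >> DEMO_SHR32);
}
```

Because the constant is rounded up, the shifted result can come out one too
high for very large inputs, which is the kind of off-by-one error the
paragraph above accepts as a tradeoff.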

At Ralf Baechle's suggestion, this version uses a Perl script to compute
the necessary constants.  We already have dependencies on Perl for kernel
compiles.  This does, however, require the Perl module Math::BigInt, which
is included in the standard Perl distribution starting with version 5.8.0.
In order to support older versions of Perl, include a table of canned
constants in the script itself, and structure the script so that
Math::BigInt isn't required if pulling values from said table.

Running the script requires that the HZ value is available from the
Makefile.  Thus, this patch also adds the Kconfig variable CONFIG_HZ to the
architectures which didn't already have it (alpha, cris, frv, h8300, m32r,
m68k, m68knommu, sparc, v850, and xtensa.) It does *not* touch the sh or
sh64 architectures, since Paul Mundt has dealt with those separately in the
sh tree.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ralf Baechle <ralf@linux-mips.org>,
Cc: Sam Ravnborg <sam@ravnborg.org>,
Cc: Paul Mundt <lethal@linux-sh.org>,
Cc: Richard Henderson <rth@twiddle.net>,
Cc: Michael Starvik <starvik@axis.com>,
Cc: David Howells <dhowells@redhat.com>,
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>,
Cc: Hirokazu Takata <takata@linux-m32r.org>,
Cc: Geert Uytterhoeven <geert@linux-m68k.org>,
Cc: Roman Zippel <zippel@linux-m68k.org>,
Cc: William L. Irwin <sparclinux@vger.kernel.org>,
Cc: Chris Zankel <chris@zankel.net>,
Cc: H. Peter Anvin <hpa@zytor.com>,
Cc: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

asm-*/posix_types.h: scrub __GLIBC__ · 531d7d42

Committed by Mike Frysinger on Feb 08, 2008

Some arches (like alpha and ia64) already have a clean posix_types.h header.
This brings all the others in line by removing all references to __GLIBC__
(and some undocumented __USE_ALL).
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

aout: move STACK_TOP[_MAX] to asm/processor.h · 922a70d3

Committed by David Howells on Feb 08, 2008

Move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're
required whether or not A.OUT format is available.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

08 Feb, 2008 2 commits

Add cmpxchg_local to xtensa · 9a7744f9

Committed by Mathieu Desnoyers on Feb 07, 2008

Use the architecture specific __cmpxchg_u32 for 32 bits cmpxchg_local. Else,
use the new generic cmpxchg_local (disables interrupt).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Cleanup asm/{elf,page,user}.h: #ifdef __KERNEL__ is no longer needed · 516c25a8

Committed by Kirill A. Shutemov on Feb 07, 2008

asm/elf.h, asm/page.h and asm/user.h don't export to userspace now, so we can
drop #ifdef __KERNEL__ for them.

[k.shutemov@gmail.com: remove #ifdef __KERNEL__]
Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com>
Reviewed-by: David Woodhouse <dwmw2@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

06 Feb, 2008 1 commit

add mm argument to pte/pmd/pud/pgd_free · 5e541973

Committed by Benjamin Herrenschmidt on Feb 04, 2008

(with Martin Schwidefsky <schwidefsky@de.ibm.com>)

The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as
first argument.  The free functions do not get the mm_struct argument.  This
is 1) asymmetrical and 2) to do mm related page table allocations the mm
argument is needed on the free function as well.

[kamalesh@linux.vnet.ibm.com: i386 fix]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

01 Feb, 2008 1 commit

[NET]: Introducing socket mark socket option. · 4a19ec58

Committed by Laszlo Attila Toth on Jan 30, 2008

A userspace program may wish to set the mark for each packet it sends
without using the netfilter MARK target. Changing the mark can be used
for mark based routing without netfilter or for packet filtering.

It requires CAP_NET_ADMIN capability.
Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Oct, 2007 2 commits

xtensa: dma-mapping.h is using linux/scatterlist.h functions, so include it · 8c7837c4

Committed by Jens Axboe on Oct 24, 2007

It's currently using asm/scatterlist.h, but that is not enough.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

xtensa: fix sg->page fallout · 891039a9

Committed by Emil Medve on Oct 23, 2007

Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

23 Oct, 2007 2 commits

Add CONFIG_DEBUG_SG sg validation · d6ec0842

Committed by Jens Axboe on Oct 22, 2007

Add a Kconfig entry which will toggle some sanity checks on the sg
entry and tables.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Change table chaining layout · 18dabf47

Committed by Jens Axboe on Oct 22, 2007

Change the page member of the scatterlist structure to be an unsigned
long, and encode more stuff in the lower bits:

- Bits 0 and 1 zero: this is a normal sg entry. Next sg entry is located
  at sg + 1.
- Bit 0 set: this is a chain entry, the next real entry is at ->page_link
  with the two low bits masked off.
- Bit 1 set: this is the final entry in the sg table. sg_next() will return
  NULL when passed such an entry.

It's thus important that sg table users use the proper accessors to get
and set the page member.
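
The bit layout described above can be modelled in standalone C.  The sketch
below is an assumption-laden illustration: the struct is reduced to the
page_link word, uintptr_t replaces unsigned long, and the demo_ names are
hypothetical rather than the kernel's accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SG_CHAIN ((uintptr_t)0x01) /* bit 0: entry points at next table */
#define DEMO_SG_END   ((uintptr_t)0x02) /* bit 1: last entry of the list */

/* Reduced entry: only the tagged word is modelled. */
struct demo_sg {
	uintptr_t page_link;
};

/* Turn a slot into a chain entry pointing at the next table. */
static void demo_sg_chain(struct demo_sg *slot, struct demo_sg *next_table)
{
	slot->page_link = (uintptr_t)next_table | DEMO_SG_CHAIN;
}

static void demo_sg_mark_end(struct demo_sg *sg)
{
	sg->page_link |= DEMO_SG_END;
}

/* Walk to the next real entry, following a chain entry if one is hit. */
static struct demo_sg *demo_sg_next(struct demo_sg *sg)
{
	if (sg->page_link & DEMO_SG_END)
		return NULL;
	sg++;
	if (sg->page_link & DEMO_SG_CHAIN)
		sg = (struct demo_sg *)(sg->page_link & ~(uintptr_t)0x03);
	return sg;
}
```

The scheme relies on entries being at least 4-byte aligned, so the two low
bits of any pointer stored in page_link are free to carry the flags.
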
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

openanolis / cloud-kernel · synced successfully about 1 year ago