提交 · f1eecf0e4f0796911cc076f38fcf05fea0b353d5 · openeuler / Kernel

20 10月, 2008 1 次提交

container freezer: implement freezer cgroup subsystem · dc52ddc0

由 Matt Helsley 提交于 16年前

This patch implements a new freezer subsystem in the control groups
framework.  It provides a way to stop and resume execution of all tasks in
a cgroup by writing in the cgroup filesystem.

The freezer subsystem in the container filesystem defines a file named
freezer.state.  Writing "FROZEN" to the state file will freeze all tasks
in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
the cgroup.  Reading will return the current state.

* Examples of usage :

   # mkdir /containers/freezer
   # mount -t cgroup -ofreezer freezer  /containers
   # mkdir /containers/0
   # echo $some_pid > /containers/0/tasks

to get status of the freezer subsystem :

   # cat /containers/0/freezer.state
   RUNNING

to freeze all tasks in the container :

   # echo FROZEN > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   FREEZING
   # cat /containers/0/freezer.state
   FROZEN

to unfreeze all tasks in the container :

   # echo RUNNING > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   RUNNING

This is the basic mechanism which should do the right thing for user space
task in a simple scenario.

It's important to note that freezing can be incomplete.  In that case we
return EBUSY.  This means that some tasks in the cgroup are busy doing
something that prevents us from completely freezing the cgroup at this
time.  After EBUSY, the cgroup will remain partially frozen -- reflected
by freezer.state reporting "FREEZING" when read.  The state will remain
"FREEZING" until one of these things happens:

	1) Userspace cancels the freezing operation by writing "RUNNING" to
		the freezer.state file
	2) Userspace retries the freezing operation by writing "FROZEN" to
		the freezer.state file (writing "FREEZING" is not legal
		and returns EIO)
	3) The tasks that blocked the cgroup from entering the "FROZEN"
		state disappear from the cgroup's set of tasks.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: export thaw_process]
Signed-off-by: NCedric Le Goater <clg@fr.ibm.com>
Signed-off-by: NMatt Helsley <matthltc@us.ibm.com>
Acked-by: NSerge E. Hallyn <serue@us.ibm.com>
Tested-by: NMatt Helsley <matthltc@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

dc52ddc0

17 10月, 2008 4 次提交

FRV: Eliminate NULL test and memset after alloc_bootmem · 70a3075d

由 Julia Lawall 提交于 16年前

As noted by Akinobu Mita in patch b1fceac2,
alloc_bootmem and related functions never return NULL and always return a
zeroed region of memory.  Thus a NULL test or memset after calls to these
functions is unnecessary.

 arch/frv/mm/init.c |    2 --
 1 file changed, 2 deletions(-)

This was fixed using the following semantic patch.
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
expression E;
statement S;
@@

E = \(alloc_bootmem\|alloc_bootmem_low\|alloc_bootmem_pages\|alloc_bootmem_low_pages\)(...)
... when != E
(
- BUG_ON (E == NULL);
|
- if (E == NULL) S
)

@@
expression E,E1;
@@

E = \(alloc_bootmem\|alloc_bootmem_low\|alloc_bootmem_pages\|alloc_bootmem_low_pages\)(...)
... when != E
- memset(E,0,E1);
// </smpl>
Signed-off-by: NJulia Lawall <julia@diku.dk>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

70a3075d

FRV: Provide dma_map_page() for NOMMU and fix comments · c9af956c

由 David Howells 提交于 16年前

Provide dma_map_page() for the NOMMU-mode FRV arch.

Also do some fixing on the comments attached to the various DMA functions for
both MMU and NOMMU mode FRV code.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c9af956c

frv: use generic pci_enable_resources() · 9bd8f9c6

由 Bjorn Helgaas 提交于 16年前

Use the generic pci_enable_resources() instead of the arch-specific code.

Unlike this arch-specific code, the generic version:
    - checks PCI_NUM_RESOURCES (11), not 6, resources
    - skips resources that have neither IORESOURCE_IO nor IORESOURCE_MEM set
    - skips ROM resources unless IORESOURCE_ROM_ENABLE is set
    - checks for resource collisions with "!r->parent"
Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9bd8f9c6

sysctl: simplify ->strategy · f221e726

由 Alexey Dobriyan 提交于 16年前

name and nlen parameters passed to ->strategy hook are unused, remove
them.  In general ->strategy hook should know what it's doing, and don't
do something tricky for which, say, pointer to original userspace array
may be needed (name).
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ]
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f221e726

02 8月, 2008 1 次提交

FRV: Wire up new system calls · 784dd7b6

由 David Howells 提交于 16年前

Wire up for FRV the system calls that were added in the last merge window.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

784dd7b6

27 7月, 2008 2 次提交

frv: use generic show_mem() · e275e0a6

由 Johannes Weiner 提交于 16年前

Remove arch-specific show_mem() in favor of the generic version.

This also removes the following redundant information display:

	- free pages, printed by show_free_areas()

where show_mem() calls show_free_areas().
Signed-off-by: NJohannes Weiner <hannes@saeurebad.de>
Acked-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e275e0a6

frv: use the common ascii hex helpers · 19caeed6

由 Harvey Harrison 提交于 16年前

Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Acked-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

19caeed6

25 7月, 2008 1 次提交

remove include/linux/pm_legacy.h · d75f65fd

由 Adrian Bunk 提交于 16年前

Remove the obsolete and no longer used include/linux/pm_legacy.h
Reviewed-by: NRobert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: NAdrian Bunk <bunk@kernel.org>
Cc: Pavel Machek <pavel@suse.cz>
Acked-by: N"Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d75f65fd

28 6月, 2008 1 次提交

PCI: remove unused arch pcibios_update_resource() functions · 0aea5313

由 Adrian Bunk 提交于 16年前

Russell King did the following back in 2003:

<--  snip  -->

    [PCI] pci-9: Kill per-architecture pcibios_update_resource()

    Kill pcibios_update_resource(), replacing it with pci_update_resource().
    pci_update_resource() uses pcibios_resource_to_bus() to convert a
    resource to a device BAR - the transformation should be exactly the
    same as the transformation used for the PCI bridges.

    pci_update_resource "knows" about 64-bit BARs, but doesn't attempt to
    set the high 32-bits to anything non-zero - currently no architecture
    attempts to do something different.  If anyone cares, please fix; I'm
    going to reflect current behaviour for the time being.

    Ivan pointed out the following architectures need to examine their
    pcibios_update_resource() implementation - they should make sure that
    this new implementation does the right thing.  #warning's have been
    added where appropriate.

        ia64
        mips
        mips64

    This cset also includes a fix for the problem reported by AKPM where
    64-bit arch compilers complain about the resource mask being placed
    in a u32.

<--  snip  -->

This patch removes the unused pcibios_update_resource() functions the
kernel gained since, from FRV, m68k, mips & sh architectures.
Signed-off-by: NAdrian Bunk <bunk@kernel.org>
Acked-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NGreg Ungerer <gerg@uclinux.org>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Acked-by: NRalf Baechle <ralf@linux-mips.org>
Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>

0aea5313

07 6月, 2008 1 次提交

Fix various old email addresses for dwmw2 · 44d1b980

由 David Woodhouse 提交于 16年前

Although if people have questions about ARCnet, perhaps it's _better_
for them to be mailing dwmw2@cam.ac.uk about it...
Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

44d1b980

25 5月, 2008 1 次提交

frv: export empty_zero_page · fb56f0f9

由 Adrian Bunk 提交于 16年前

Fix the following build error:

ERROR: "empty_zero_page" [fs/ext4/ext4dev.ko] undefined!
Reported-by: NAdrian Bunk <bunk@kernel.org>
Signed-off-by: NAdrian Bunk <bunk@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fb56f0f9

17 5月, 2008 1 次提交
- A
  [PATCH] take init_files to fs/file.c · f52111b1
  由 Al Viro 提交于 16年前
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  f52111b1
04 5月, 2008 1 次提交

unified (weak) sys_pipe implementation · d35c7b0e

由 Ulrich Drepper 提交于 16年前

This replaces the duplicated arch-specific versions of "sys_pipe()" with
one unified implementation.  This removes almost 250 lines of duplicated
code.

It's marked __weak, so that *if* an architecture wants to override the
default implementation it can do so by simply having its own replacement
version, since many architectures use alternate calling conventions for
the 'pipe()' system call for legacy reasons (ie traditional UNIX
implementations often return the two file descriptors in registers)

I still haven't changed the cris version even though Linus says the BKL
isn't needed.  The arch maintainer can easily do it if there are really
no obstacles.
Signed-off-by: NUlrich Drepper <drepper@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d35c7b0e

01 5月, 2008 1 次提交

frv: unbreak misalignment handling changes · adafbedf

由 David Howells 提交于 16年前

Fix a reference in a arch/frv/mm/Makefile to unaligned.c which has now been
deleted.

Also revert the change to the guard macro name in include/asm-frv/unaligned.h.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

adafbedf

29 4月, 2008 5 次提交

frv: use kbuild.h instead of defining macros in asm-offsets.c · de400bd2

由 Christoph Lameter 提交于 16年前

Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

de400bd2

kernel: Move arches to use common unaligned access · 6510d419

由 Harvey Harrison 提交于 16年前

Unaligned access is ok for the following arches:
cris, m68k, mn10300, powerpc, s390, x86

Arches that use the memmove implementation for native endian, and
the byteshifting for the opposite endianness.
h8300, m32r, xtensa

Packed struct for native endian, byteshifting for other endian:
alpha, blackfin, ia64, parisc, sparc, sparc64, mips, sh

m86knommu is generic_be for Coldfire, otherwise unaligned access is ok.

frv, arm chooses endianness based on compiler settings, uses the byteshifting
versions.  Remove the unaligned trap handler from frv as it is now unused.

v850 is le, uses the byteshifting versions for both be and le.

Remove the now unused asm-generic implementation.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6510d419

Remove the macro get_personality · ecd0fa98

由 WANG Cong 提交于 16年前

Remove the macro get_personality, use ->personality instead.

Cc: Christoph Hellwig <hch@infradead.org
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: NWANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ecd0fa98

iomap: fix 64 bits resources on 32 bits · b70d3a2c

由 Benjamin Herrenschmidt 提交于 16年前

Almost all implementations of pci_iomap() in the kernel, including the generic
lib/iomap.c one, copies the content of a struct resource into unsigned long's
which will break on 32 bits platforms with 64 bits resources.

This fixes all definitions of pci_iomap() to use resource_size_t.  I also
"fixed" the 64bits arch for consistency.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b70d3a2c

frv si_addr annotations · ff471b24

由 Al Viro 提交于 16年前

Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Acked-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ff471b24

22 4月, 2008 1 次提交

frv: unexport kmap_atomic_to_page · 9fd91217

由 Adrian Bunk 提交于 16年前

This patch removes the no longer used export of kmap_atomic_to_page.
Signed-off-by: NAdrian Bunk <bunk@kernel.org>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9fd91217

21 4月, 2008 2 次提交

PCI: remove pcibios_fixup_ghosts() · 6355f3d1

由 Greg Kroah-Hartman 提交于 17年前

This function was obviously never being used since early 2.5 days as any
device that it would try to remove would never really be removed from
the system due to the PCI device list being held in the driver core, not
the general list of PCI devices.

As we have not had a single report of a problem here in 4 years, I think
it's safe to remove now.
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

6355f3d1

PCI: remove initial bios sort of PCI devices on x86 · 1ba6ab11

由 Greg Kroah-Hartman 提交于 17年前

We currently keep 2 lists of PCI devices in the system, one in the
driver core, and one all on its own.  This second list is sorted at boot
time, in "BIOS" order, to try to remain compatible with older kernels
(2.2 and earlier days).  There was also a "nosort" option to turn this
sorting off, to remain compatible with even older kernel versions, but
that just ends up being what we have been doing from 2.5 days...

Unfortunately, the second list of devices is not really ever used to 
determine the probing order of PCI devices or drivers[1].  That is done
using the driver core list instead.  This change happened back in the
early 2.5 days.

Relying on BIOS ording for the binding of drivers to specific device
names is problematic for many reasons, and userspace tools like udev
exist to properly name devices in a persistant manner if that is needed,
no reliance on the BIOS is needed.

Matt Domsch and others at Dell noticed this back in 2006, and added a
boot option to sort the PCI device lists (both of them) in a
breadth-first manner to help remain compatible with the 2.4 order, if
needed for any reason.  This option is not going away, as some systems
rely on them.

This patch removes the sorting of the internal PCI device list in "BIOS"
mode, as it's not needed at all anymore, and hasn't for many years.
I've also removed the PCI flags for this from some other arches that for
some reason defined them, but never used them.

This should not change the ordering of any drivers or device probing.

[1] The old-style pci_get_device and pci_find_device() still used this
sorting order, but there are very few drivers that use these functions,
as they are deprecated for use in this manner.  If for some reason, a
driver rely on the order and uses these functions, the breadth-first
boot option will resolve any problem.

Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

1ba6ab11

17 4月, 2008 1 次提交

Generic semaphore implementation · 64ac24e7

由 Matthew Wilcox 提交于 16年前

Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility.  Thanks to Peter Zijlstra for fixing the lockdep
warning.  Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.
Signed-off-by: NMatthew Wilcox <willy@linux.intel.com>
Acked-by: NIngo Molnar <mingo@elte.hu>

64ac24e7

15 4月, 2008 1 次提交

PM: Remove legacy PM · 6afe1a1f

由 Pavel Machek 提交于 16年前

AFAICT pm_send_all is a nop when noone uses pm_register...

Hmm.. can we just force CONFIG_PM_LEGACY=n, and see what happens?

Or maybe this is better idea? It may break build somewhere, but it
should be easy to fix... (it builds here, i386 and x86-64).
Signed-off-by: NPavel Machek <pavel@suse.cz>
Acked-by: NRalf Baechle <ralf@linux-mips.org>
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: NLen Brown <len.brown@intel.com>

6afe1a1f

14 4月, 2008 1 次提交

FRV: Correctly determine the address of an illegal instruction · 4f3f8e94

由 David Howells 提交于 16年前

Correctly determine the address of an illegal instruction. The EPCR0 register
holds this value (masked by EPCR0_PC) if the validity bit is set (masked by
EPCR0_V). So the test as to whether the contents of the register are usable
should be involve checking the _V bit, not the _PC bits.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4f3f8e94

11 4月, 2008 2 次提交

FRV: Make NOMMU-mode work with base addresses other than 0xC0000000 [try #2] · ed9b949f

由 David Howells 提交于 16年前

Make NOMMU-mode work with base addresses other than 0xC0000000 by:

 (1) Giving the code that sets up the protection registers the right address
     in __sdram_base.  Rather than being hard coded to 0xC0000000, the value
     of __page_offset is obtained from the linker script.

 (2) Eliminate the check in __switch_to() that verifies the current thread
     info is in the 0xCxxxxxxx region.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ed9b949f

FRV: Add support for emulation of userspace atomic ops [try #2] · e31c243f

由 David Howells 提交于 16年前

Use traps 120-126 to emulate atomic cmpxchg32, xchg32, and XOR-, OR-, AND-, SUB-
and ADD-to-memory operations for userspace.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e31c243f

21 2月, 2008 2 次提交

FRV: Change the timerfd syscalls to be the same as i386 · e80af3a8

由 David Howells 提交于 17年前

Change the FRV timerfd syscalls to be the same as i386 timerfd syscalls.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e80af3a8

FRV: Drop the .data.idt section for FRV · 2d0e2baa

由 David Howells 提交于 17年前

There is no .data.idt section for FRV, so drop it from the linker script.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2d0e2baa

14 2月, 2008 2 次提交

xtime_lock vs update_process_times · aa02cd2d

由 Peter Zijlstra 提交于 17年前

Commit d3d74453 ("hrtimer: fixup the
HRTIMER_CB_IRQSAFE_NO_SOFTIRQ fallback") broke several archs, and since
only Russell bothered to merge the fix, and Greg to ACK his arch, I'm
sending this for merger.

I have confirmation that the Alpha bit results in a booting kernel.
That leaves: blackfin, frv, sh and sparc untested.

The deadlock in question was found by Russell:

  IRQ handle
    -> timer_tick() - xtime seqlock held for write
      -> update_process_times()
        -> run_local_timers()
          -> hrtimer_run_queues()
            -> hrtimer_get_softirq_time() - tries to get a read lock

Now, Thomas assures me the fix is trivial, only do_timer() needs to be
done under the xtime_lock, and update_process_times() can savely be
removed from under it.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NGreg Ungerer <gerg@uclinux.org>
CC: Richard Henderson <rth@twiddle.net>
CC: Bryan Wu <bryan.wu@analog.com>
CC: David Howells <dhowells@redhat.com>
CC: Paul Mundt <lethal@linux-sh.org>
CC: William Irwin <wli@holomorphy.com>
Acked-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NIvan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

aa02cd2d

FRV: Fix up parse error in linker script · d897d2b5

由 David Howells 提交于 17年前

Fix up parse error in FRV linker script, presumably introduced through changes
to the INIT_TEXT and EXIT_TEXT macros.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d897d2b5

09 2月, 2008 4 次提交

ide: introduce HAVE_IDE · ec7748b5

由 Sam Ravnborg 提交于 17年前

To allow flexible configuration of IDE introduce HAVE_IDE.
All archs except arm, um and s390 unconditionally select it.
For arm the actual configuration determine if IDE is supported.

This is a step towards introducing drivers/Kconfig for arm.
Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
Acked-by: NRussell King - ARM Linux <linux@arm.linux.org.uk>
Acked-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ec7748b5

CONFIG_HIGHPTE vs. sub-page page tables. · 2f569afd

由 Martin Schwidefsky 提交于 17年前

Background: I've implemented 1K/2K page tables for s390. These sub-page
page tables are required to properly support the s390 virtualization
instruction with KVM. The SIE instruction requires that the page tables
have 256 page table entries (pte) followed by 256 page status table entries
(pgste). The pgstes are only required if the process is using the SIE
instruction. The pgstes are updated by the hardware and by the hypervisor
for a number of reasons, one of them is dirty and reference bit tracking.
To avoid wasting memory the standard pte table allocation should return
1K/2K (31/64 bit) and 2K/4K if the process is using SIE.

Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means
the s390 version for pte_alloc_one cannot return a pointer to a struct
page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
cannot return a pointer to a pte either, since that would require more than
32 bit for the return value of pte_alloc_one (and the pte * would not be
accessible since its not kmapped).

Solution: The only solution I found to this dilemma is a new typedef: a
pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a
later patch. For everybody else it will be a (struct page *). The
additional problem with the initialization of the ptl lock and the
NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
a destructor pgtable_page_dtor. The page table allocation and free
functions need to call these two whenever a page table page is allocated or
freed. pmd_populate will get a pgtable_t instead of a struct page pointer.
To get the pgtable_t back from a pmd entry that has been installed with
pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page
call in free_pte_range and apply_to_pte_range.
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2f569afd

avoid overflows in kernel/time.c · bdc80787

由 H. Peter Anvin 提交于 17年前

When the conversion factor between jiffies and milli- or microseconds is
not a single multiply or divide, as for the case of HZ == 300, we currently
do a multiply followed by a divide.  The intervening result, however, is
subject to overflows, especially since the fraction is not simplified (for
HZ == 300, we multiply by 300 and divide by 1000).

This is exposed to the user when passing a large timeout to poll(), for
example.

This patch replaces the multiply-divide with a reciprocal multiplication on
32-bit platforms.  When the input is an unsigned long, there is no portable
way to do this on 64-bit platforms there is no portable way to do this
since it requires a 128-bit intermediate result (which gcc does support on
64-bit platforms but may generate libgcc calls, e.g.  on 64-bit s390), but
since the output is a 32-bit integer in the cases affected, just simplify
the multiply-divide (*3/10 instead of *300/1000).

The reciprocal multiply used can have off-by-one errors in the upper half
of the valid output range.  This could be avoided at the expense of having
to deal with a potential 65-bit intermediate result.  Since the intent is
to avoid overflow problems and most of the other time conversions are only
semiexact, the off-by-one errors were considered an acceptable tradeoff.

At Ralf Baechle's suggestion, this version uses a Perl script to compute
the necessary constants.  We already have dependencies on Perl for kernel
compiles.  This does, however, require the Perl module Math::BigInt, which
is included in the standard Perl distribution starting with version 5.8.0.
In order to support older versions of Perl, include a table of canned
constants in the script itself, and structure the script so that
Math::BigInt isn't required if pulling values from said table.

Running the script requires that the HZ value is available from the
Makefile.  Thus, this patch also adds the Kconfig variable CONFIG_HZ to the
architectures which didn't already have it (alpha, cris, frv, h8300, m32r,
m68k, m68knommu, sparc, v850, and xtensa.) It does *not* touch the sh or
sh64 architectures, since Paul Mundt has dealt with those separately in the
sh tree.
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Ralf Baechle <ralf@linux-mips.org>,
Cc: Sam Ravnborg <sam@ravnborg.org>,
Cc: Paul Mundt <lethal@linux-sh.org>,
Cc: Richard Henderson <rth@twiddle.net>,
Cc: Michael Starvik <starvik@axis.com>,
Cc: David Howells <dhowells@redhat.com>,
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>,
Cc: Hirokazu Takata <takata@linux-m32r.org>,
Cc: Geert Uytterhoeven <geert@linux-m68k.org>,
Cc: Roman Zippel <zippel@linux-m68k.org>,
Cc: William L. Irwin <sparclinux@vger.kernel.org>,
Cc: Chris Zankel <chris@zankel.net>,
Cc: H. Peter Anvin <hpa@zytor.com>,
Cc: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bdc80787

procfs: constify function pointer tables · 03a44825

由 Jan Engelhardt 提交于 17年前

Signed-off-by: NJan Engelhardt <jengelh@computergmbh.de>
Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Acked-by: NMike Frysinger <vapier@gentoo.org>
Acked-By: NDavid Howells <dhowells@redhat.com>
Acked-by: NBryan Wu <bryan.wu@analog.com>
Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

03a44825

08 2月, 2008 1 次提交

Introduce flags for reserve_bootmem() · 72a7fe39

由 Bernhard Walle 提交于 17年前

This patchset adds a flags variable to reserve_bootmem() and uses the
BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions
between crashkernel area and already used memory.

This patch:

Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE.
If that flag is set, the function returns with -EBUSY if the memory already
has been reserved in the past.  This is to avoid conflicts.

Because that code runs before SMP initialisation, there's no race condition
inside reserve_bootmem_core().

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix powerpc build]
Signed-off-by: NBernhard Walle <bwalle@suse.de>
Cc: <linux-arch@vger.kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

72a7fe39

07 2月, 2008 1 次提交

calibrate_delay() must be __cpuinit · 6c81c32f

由 Adrian Bunk 提交于 17年前

calibrate_delay() must be __cpuinit, not __{dev,}init.

I've verified that this is correct for all users.

While doing the latter, I also did the following cleanups:
- remove pointless additional prototypes in C files
- ensure all users #include <linux/delay.h>

This fixes the following section mismatches with CONFIG_HOTPLUG=n,
CONFIG_HOTPLUG_CPU=y:

WARNING: vmlinux.o(.text+0x1128d): Section mismatch: reference to .init.text.1:calibrate_delay (between 'check_cx686_slop' and 'set_cx86_reorder')
WARNING: vmlinux.o(.text+0x25102): Section mismatch: reference to .init.text.1:calibrate_delay (between 'smp_callin' and 'cpu_coregroup_map')
Signed-off-by: NAdrian Bunk <bunk@kernel.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Zankel <chris@zankel.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6c81c32f

06 2月, 2008 2 次提交

timerfd: fix remaining architectures · 9692bd9c

由 Andrew Morton 提交于 17年前

Cc: David Howells <dhowells@redhat.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9692bd9c

frv: use find_task_by_vpid in cxn_pin_by_pid · 540e3102

由 Pavel Emelyanov 提交于 17年前

The function is question gets the pid from sysctl table, so this one is a
virtual pid, i.e.  the pid of a task as it is seen from inside a namespace.

So the find_task_by_vpid() must be used here.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

540e3102

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功