1. 13 10月, 2009 1 次提交
    • A
      net: Introduce recvmmsg socket syscall · a2e27255
      Arnaldo Carvalho de Melo 提交于
      Meaning receive multiple messages, reducing the number of syscalls and
      net stack entry/exit operations.
      
      Next patches will introduce mechanisms where protocols that want to
      optimize this operation will provide an unlocked_recvmsg operation.
      
      This takes into account comments made by:
      
      . Paul Moore: sock_recvmsg is called only for the first datagram,
        sock_recvmsg_nosec is used for the rest.
      
      . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
        works in the same fashion as the ppoll one.
      
        If the underlying protocol returns a datagram with MSG_OOB set, this
        will make recvmmsg return right away with as many datagrams (+ the OOB
        one) it has received so far.
      
      . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
        datagrams and then recvmsg returns an error, recvmmsg will return
        the successfully received datagrams, store the error and return it
        in the next call.
      
      This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
      where we will be able to acquire the lock only at batch start and end, not at
      every underlying recvmsg call.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a2e27255
  2. 21 9月, 2009 1 次提交
  3. 09 9月, 2009 1 次提交
  4. 02 9月, 2009 2 次提交
    • D
      KEYS: Add a keyctl to install a process's session keyring on its parent [try #6] · ee18d64c
      David Howells 提交于
      Add a keyctl to install a process's session keyring onto its parent.  This
      replaces the parent's session keyring.  Because the COW credential code does
      not permit one process to change another process's credentials directly, the
      change is deferred until userspace next starts executing again.  Normally this
      will be after a wait*() syscall.
      
      To support this, three new security hooks have been provided:
      cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
      the blank security creds and key_session_to_parent() - which asks the LSM if
      the process may replace its parent's session keyring.
      
      The replacement may only happen if the process has the same ownership details
      as its parent, and the process has LINK permission on the session keyring, and
      the session keyring is owned by the process, and the LSM permits it.
      
      Note that this requires alteration to each architecture's notify_resume path.
      This has been done for all arches barring blackfin, m68k* and xtensa, all of
      which need assembly alteration to support TIF_NOTIFY_RESUME.  This allows the
      replacement to be performed at the point the parent process resumes userspace
      execution.
      
      This allows the userspace AFS pioctl emulation to fully emulate newpag() and
      the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
      alter the parent process's PAG membership.  However, since kAFS doesn't use
      PAGs per se, but rather dumps the keys into the session keyring, the session
      keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
      the newpag flag.
      
      This can be tested with the following program:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <keyutils.h>
      
      	#define KEYCTL_SESSION_TO_PARENT	18
      
      	#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)
      
      	int main(int argc, char **argv)
      	{
      		key_serial_t keyring, key;
      		long ret;
      
      		keyring = keyctl_join_session_keyring(argv[1]);
      		OSERROR(keyring, "keyctl_join_session_keyring");
      
      		key = add_key("user", "a", "b", 1, keyring);
      		OSERROR(key, "add_key");
      
      		ret = keyctl(KEYCTL_SESSION_TO_PARENT);
      		OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");
      
      		return 0;
      	}
      
      Compiled and linked with -lkeyutils, you should see something like:
      
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: _ses
      	355907932 --alswrv   4043    -1   \_ keyring: _uid.4043
      	[dhowells@andromeda ~]$ /tmp/newpag
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: _ses
      	1055658746 --alswrv   4043  4043   \_ user: a
      	[dhowells@andromeda ~]$ /tmp/newpag hello
      	[dhowells@andromeda ~]$ keyctl show
      	Session Keyring
      	       -3 --alswrv   4043  4043  keyring: hello
      	340417692 --alswrv   4043  4043   \_ user: a
      
      Where the test program creates a new session keyring, sticks a user key named
      'a' into it and then installs it on its parent.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      ee18d64c
    • D
      KEYS: Extend TIF_NOTIFY_RESUME to (almost) all architectures [try #6] · d0420c83
      David Howells 提交于
      Implement TIF_NOTIFY_RESUME for most of those architectures in which isn't yet
      available, and, whilst we're at it, have it call the appropriate tracehook.
      
      After this patch, blackfin, m68k* and xtensa still lack support and need
      alteration of assembly code to make it work.
      
      Resume notification can then be used (by a later patch) to install a new
      session keyring on the parent of a process.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      
      cc: linux-arch@vger.kernel.org
      Signed-off-by: NJames Morris <jmorris@namei.org>
      d0420c83
  5. 09 7月, 2009 2 次提交
    • T
      linker script: unify usage of discard definition · 023bf6f1
      Tejun Heo 提交于
      Discarded sections in different archs share some commonality but have
      considerable differences.  This led to linker script for each arch
      implementing its own /DISCARD/ definition, which makes maintaining
      tedious and adding new entries error-prone.
      
      This patch makes all linker scripts to move discard definitions to the
      end of the linker script and use the common DISCARDS macro.  As ld
      uses the first matching section definition, archs can include default
      discarded sections by including them earlier in the linker script.
      
      ia64 is notable because it first throws away some ia64 specific
      subsections and then include the rest of the sections into the final
      image, so those sections must be discarded before the inclusion.
      
      defconfig compile tested for x86, x86-64, powerpc, powerpc64, ia64,
      alpha, sparc, sparc64 and s390.  Michal Simek tested microblaze.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Acked-by: NMike Frysinger <vapier@gentoo.org>
      Tested-by: NMichal Simek <monstr@monstr.eu>
      Cc: linux-arch@vger.kernel.org
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: microblaze-uclinux@itee.uq.edu.au
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Tony Luck <tony.luck@intel.com>
      023bf6f1
    • J
      Remove multiple KERN_ prefixes from printk formats · ad361c98
      Joe Perches 提交于
      Commit 5fd29d6c ("printk: clean up
      handling of log-levels and newlines") changed printk semantics.  printk
      lines with multiple KERN_<level> prefixes are no longer emitted as
      before the patch.
      
      <level> is now included in the output on each additional use.
      
      Remove all uses of multiple KERN_<level>s in formats.
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ad361c98
  6. 24 6月, 2009 1 次提交
    • T
      linker script: throw away .discard section · 405d967d
      Tejun Heo 提交于
      x86 throws away .discard section but no other archs do.  Also,
      .discard is not thrown away while linking modules.  Make every arch
      and module linking throw it away.  This will be used to define dummy
      variables for percpu declarations and definitions.
      
      This patch is based on Ivan Kokshaysky's alpha percpu patch.
      
      [ Impact: always throw away everything in .discard ]
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Bryan Wu <cooloney@kernel.org>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      405d967d
  7. 19 6月, 2009 1 次提交
  8. 17 6月, 2009 1 次提交
  9. 13 6月, 2009 1 次提交
    • H
      avr32: Fix oops on unaligned user access · bb6e6470
      Haavard Skinnemoen 提交于
      The unaligned address exception handler (and others) does not scan the
      fixup tables before oopsing. This is bad because it means passing a
      badly aligned pointer from user space might crash the kernel.
      
      Fix this by scanning the fixup tables in _exception(). This should
      resolve the issue for unaligned addresses as well as other less common
      exceptions that might be happening during a userspace access. The page
      fault handler already does fixup processing.
      Signed-off-by: NHaavard Skinnemoen <haavard.skinnemoen@atmel.com>
      bb6e6470
  10. 12 6月, 2009 1 次提交
  11. 21 5月, 2009 1 次提交
    • E
      syscall: Sort out syscall_restart name clash. · 288ddad5
      Eric W. Biederman 提交于
      Stephen Rothwell <sfr@canb.auug.org.au> writes:
      
      > Today's linux-next build of at least some av32 and arm configs failed like this:
      >
      > arch/avr32/kernel/signal.c:216: error: conflicting types for 'restart_syscall'
      > include/linux/sched.h:2184: error: previous definition of 'restart_syscall' was here
      >
      > Caused by commit 690cc3ff ("syscall:
      > Implement a convinience function restart_syscall") from the net tree.
      
      Grrr. Some days it feels like all of the good names are already taken.
      
      Let's just rename the two static users in arm and avr32 to get this
      sorted out.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      288ddad5
  12. 22 4月, 2009 1 次提交
  13. 03 4月, 2009 1 次提交
  14. 16 1月, 2009 1 次提交
    • B
      avr32: Fix out-of-range rcalls in large kernels · 8d29b7b9
      Ben Nizette 提交于
      Replace handcoded rcall instructions with the call pseudo-instruction.
      For kernels too far over 1MB the rcall instruction can't reach and
      linking will fail.  We already call the final linker with --relax which
      converts call pseudo-instructions to the right things anyway.
      
      This fixes
      
      arch/avr32/kernel/built-in.o: In function `syscall_exit_work':
      (.ex.text+0x198): relocation truncated to fit: R_AVR32_22H_PCREL against symbol `schedule' defined in .sched.text section in kernel/built-in.o
      arch/avr32/kernel/built-in.o: In function `fault_exit_work':
      (.ex.text+0x3b6): relocation truncated to fit: R_AVR32_22H_PCREL against symbol `schedule' defined in .sched.text section in kernel/built-in.o
      
      But I'm still left with
      
      arch/avr32/kernel/built-in.o:(.fixup+0x2): relocation truncated to fit: R_AVR32_22H_PCREL against `.text'+45a
      arch/avr32/kernel/built-in.o:(.fixup+0x8): relocation truncated to fit: R_AVR32_22H_PCREL against `.text'+8ea
      arch/avr32/kernel/built-in.o:(.fixup+0xe): relocation truncated to fit: R_AVR32_22H_PCREL against `.text'+abe
      arch/avr32/kernel/built-in.o:(.fixup+0x14): relocation truncated to fit: R_AVR32_22H_PCREL against `.text'+ac8
      arch/avr32/kernel/built-in.o:(.fixup+0x1a): relocation truncated to fit: R_AVR32_22H_PCREL against `.text'+ad2
      arch/avr32/kernel/built-in.o:(.fixup+0x20): relocation truncated to fit: R_AVR32_22H_PCREL against `.text'+adc
      arch/avr32/kernel/built-in.o:(.fixup+0x26): relocation truncated to fit: R_AVR32_22H_PCREL against `.text'+ae6
      arch/avr32/kernel/built-in.o:(.fixup+0x2c): relocation truncated to fit: R_AVR32_22H_PCREL against `.text'+af0
      arch/avr32/kernel/built-in.o:(.fixup+0x32): additional relocation overflows omitted from the output
      
      These are caused by a similar problem with 'rjmp' instructions.
      Unfortunately, there's no easy fix for these at the moment since we
      don't have a arbitrary-range 'jmp' instruction similar to 'call'.
      Signed-off-by: NBen Nizette <bn@niasdigital.com>
      Signed-off-by: NHaavard Skinnemoen <haavard.skinnemoen@atmel.com>
      8d29b7b9
  15. 13 1月, 2009 1 次提交
  16. 07 1月, 2009 1 次提交
    • R
      remove linux/hardirq.h from asm-generic/local.h · ba84be23
      Russell King 提交于
      While looking at reducing the amount of architecture namespace pollution
      in the generic kernel, I found that asm/irq.h is included in the vast
      majority of compilations on ARM (around 650 files.)
      
      Since asm/irq.h includes a sub-architecture include file on ARM, this
      causes a negative impact on the ccache's ability to re-use the build
      results from other sub-architectures, so we have a desire to reduce the
      dependencies on asm/irq.h.
      
      It turns out that a major cause of this is the needless include of
      linux/hardirq.h into asm-generic/local.h.  The patch below removes this
      include, resulting in some 250 to 300 files (around half) of the kernel
      then omitting asm/irq.h.
      
      My test builds still succeed, provided two ARM files are fixed
      (arch/arm/kernel/traps.c and arch/arm/mm/fault.c) - so there may be
      negative impacts for this on other architectures.
      
      Note that x86 does not include asm/irq.h nor linux/hardirq.h in its
      asm/local.h, so this patch can be viewed as bringing the generic version
      into line with the x86 version.
      
      [kosaki.motohiro@jp.fujitsu.com: add #include <linux/irqflags.h> to acpi/processor_idle.c]
      [adobriyan@gmail.com: fix sparc64]
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ba84be23
  17. 01 1月, 2009 1 次提交
  18. 17 12月, 2008 1 次提交
  19. 13 12月, 2008 1 次提交
  20. 09 10月, 2008 1 次提交
  21. 22 9月, 2008 1 次提交
  22. 20 9月, 2008 4 次提交
  23. 01 9月, 2008 1 次提交
  24. 05 8月, 2008 1 次提交
  25. 25 7月, 2008 1 次提交
  26. 22 7月, 2008 1 次提交
    • A
      sysdev: Pass the attribute to the low level sysdev show/store function · 4a0b2b4d
      Andi Kleen 提交于
      This allow to dynamically generate attributes and share show/store
      functions between attributes. Right now most attributes are generated
      by special macros and lots of duplicated code. With the attribute
      passed it's instead possible to attach some data to the attribute
      and then use that in shared low level functions to do different things.
      
      I need this for the dynamically generated bank attributes in the x86
      machine check code, but it'll allow some further cleanups.
      
      I converted all users in tree to the new show/store prototype. It's a single
      huge patch to avoid unbisectable sections.
      
      Runtime tested: x86-32, x86-64
      Compiled only: ia64, powerpc
      Not compile tested/only grep converted: sh, arm, avr32
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      4a0b2b4d
  27. 19 7月, 2008 1 次提交
    • T
      nohz: prevent tick stop outside of the idle loop · b8f8c3cf
      Thomas Gleixner 提交于
      Jack Ren and Eric Miao tracked down the following long standing
      problem in the NOHZ code:
      
      	scheduler switch to idle task
      	enable interrupts
      
      Window starts here
      
      	----> interrupt happens (does not set NEED_RESCHED)
      	      	irq_exit() stops the tick
      
      	----> interrupt happens (does set NEED_RESCHED)
      
      	return from schedule()
      	
      	cpu_idle(): preempt_disable();
      
      Window ends here
      
      The interrupts can happen at any point inside the race window. The
      first interrupt stops the tick, the second one causes the scheduler to
      rerun and switch away from idle again and we end up with the tick
      disabled.
      
      The fact that it needs two interrupts where the first one does not set
      NEED_RESCHED and the second one does made the bug obscure and extremly
      hard to reproduce and analyse. Kudos to Jack and Eric.
      
      Solution: Limit the NOHZ functionality to the idle loop to make sure
      that we can not run into such a situation ever again.
      
      cpu_idle()
      {
      	preempt_disable();
      
      	while(1) {
      		 tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
      		 			          are in the idle loop
      
      		 while (!need_resched())
      		       halt();
      
      		 tick_nohz_restart_sched_tick(); <- disables NOHZ mode
      		 preempt_enable_no_resched();
      		 schedule();
      		 preempt_disable();
      	}
      }
      
      In hindsight we should have done this forever, but ... 
      
      /me grabs a large brown paperbag.
      
      Debugged-by: Jack Ren <jack.ren@marvell.com>, 
      Debugged-by: Neric miao <eric.y.miao@gmail.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b8f8c3cf
  28. 18 7月, 2008 1 次提交
    • H
      fix core/stacktrace changes on avr32, mips, sh · 8b95d917
      Heiko Carstens 提交于
      Fixes this type of problem:
      
        CC      arch/s390/kernel/stacktrace.o
      arch/s390/kernel/stacktrace.c:84: warning: data definition has no type or storage class
      arch/s390/kernel/stacktrace.c:84: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
      arch/s390/kernel/stacktrace.c:84: warning: parameter names (without types) in function declaration
      arch/s390/kernel/stacktrace.c:97: warning: data definition has no type or storage class
      arch/s390/kernel/stacktrace.c:97: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
      arch/s390/kernel/stacktrace.c:97: warning: parameter names (without types) in function declaration
      
      caused by "stacktrace: export save_stack_trace[_tsk]"
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8b95d917
  29. 03 7月, 2008 1 次提交
  30. 02 7月, 2008 2 次提交
    • H
      avr32: Cover the kernel page tables in the user PGDs · a9a934f2
      Haavard Skinnemoen 提交于
      Expand the per-process PGDs so that they cover the kernel virtual
      memory area as well. This simplifies the TLB miss handler fastpath
      since it doesn't have to check for kernel addresses anymore.
      
      If a TLB miss happens on a kernel address and a second-level page
      table can't be found, we check swapper_pg_dir and copy the PGD entry
      into the user PGD if it can be found there.
      Signed-off-by: NHaavard Skinnemoen <haavard.skinnemoen@atmel.com>
      a9a934f2
    • H
      avr32: Store virtual addresses in the PGD · cfd23e93
      Haavard Skinnemoen 提交于
      Instead of storing physical addresses along with page flags in the
      PGD, store virtual addresses and use NULL to indicate a not present
      second-level page table. A non-page-aligned page table indicates a bad
      PMD.
      
      This simplifies the TLB miss handler since it no longer has to check
      the Present bit and no longer has to convert the PGD entry from
      physical to virtual address. Instead, it has to check for a NULL
      entry, which is slightly cheaper than either.
      Signed-off-by: NHaavard Skinnemoen <haavard.skinnemoen@atmel.com>
      cfd23e93
  31. 27 6月, 2008 3 次提交
  32. 26 5月, 2008 1 次提交