  1. 29 Aug, 2011 2 commits
  2. 27 Aug, 2011 1 commit
  3. 26 Aug, 2011 2 commits
    • kernel/printk: do not turn off bootconsole in printk_late_init() if keep_bootcon · 4c30c6f5
      Committed by Nishanth Aravamudan
      It seems that 7bf69395 ("console: allow to retain boot console via
      boot option keep_bootcon") doesn't always achieve its aim, because when
      printk_late_init() runs it unconditionally turns off all boot consoles.
      With this patch, I am able to see more messages on the boot console in
      KVM guests than I can without, when keep_bootcon is specified.
      
      I think it is appropriate for the relevant -stable trees.  However, it's
      more of an annoyance than a serious bug (ideally you don't need to keep
      the boot console around as console handover should be working -- I was
      encountering a situation where the console handover wasn't working and
      not having the boot console available meant I couldn't see why).
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Greg KH <gregkh@suse.de>
      Acked-by: Fabio M. Di Nitto <fdinitto@redhat.com>
      Cc: <stable@kernel.org> [2.6.39.x, 3.0.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
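The behaviour described in this commit can be pictured with a small toy model. This is only a sketch in Python, not kernel code: the `Console` class and its `is_boot` field are hypothetical stand-ins for a registered console and its boot-console flag, and the function merely borrows the name `printk_late_init()` for readability.

```python
from dataclasses import dataclass


@dataclass
class Console:
    name: str
    is_boot: bool            # hypothetical stand-in for the boot-console flag
    registered: bool = True


def printk_late_init(consoles, keep_bootcon):
    """Toy model: drop boot consoles late in boot unless keep_bootcon is set."""
    for con in consoles:
        if con.is_boot and not keep_bootcon:
            con.registered = False
    return [con.name for con in consoles if con.registered]


if __name__ == "__main__":
    cons = [Console("bootcon0", True), Console("ttyS0", False)]
    print(printk_late_init(cons, keep_bootcon=True))
```

Before the fix, the model would correspond to ignoring `keep_bootcon` in the loop, so the boot console always disappeared at this point.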
    • Add a personality to report 2.6.x version numbers · be27425d
      Committed by Andi Kleen
      I ran into a couple of programs which broke with the new Linux 3.0
      version.  Some of those were binary only.  I tried to use LD_PRELOAD to
      work around it, but it was quite difficult and in one case impossible
      because of a mix of 32bit and 64bit executables.
      
      For example, all kinds of management software from HP don't work unless
      we pretend to run a 2.6 kernel.
      
        $ uname -a
        Linux svivoipvnx001 3.0.0-08107-g97cd98f #1062 SMP Fri Aug 12 18:11:45 CEST 2011 i686 i686 i386 GNU/Linux
      
        $ hpacucli ctrl all show
      
        Error: No controllers detected.
      
        $ rpm -qf /usr/sbin/hpacucli
        hpacucli-8.75-12.0
      
      Another notable case is that Python now reports "linux3" from
      sys.platform, which in turn can break things that were checking
      sys.platform == "linux2":
      
        https://bugzilla.mozilla.org/show_bug.cgi?id=664564
      
      It seems pretty clear to me though it's a bug in the apps that are using
      '==' instead of .startswith(), but this allows us to unbreak broken
      programs.
      
      This patch adds a UNAME26 personality that makes the kernel report a
      2.6.40+x version number instead.  The x is the x in 3.x.
      
      I know this is somewhat ugly, but I didn't find a better workaround, and
      compatibility with existing programs is important.
      
      Some programs also read /proc/sys/kernel/osrelease.  This can be worked
      around in user space with mount --bind (and a mount namespace).
      
      To use:
      
        wget ftp://ftp.kernel.org/pub/linux/kernel/people/ak/uname26/uname26.c
        gcc -o uname26 uname26.c
        ./uname26 program
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
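The version mapping and the application-side fix mentioned in this commit can be sketched as follows. This is an illustrative Python sketch only: `uname26_release()` and `is_linux()` are hypothetical helper names, and the sketch drops any suffix after the minor number, which is a simplification rather than a description of what the kernel does with the rest of the release string.

```python
import re


def uname26_release(release):
    """Map a 3.x release string to the 2.6.(40+x) form described above.

    Simplified sketch: anything after the minor number is dropped, and
    strings that do not start with "3.<digits>" are returned unchanged.
    """
    match = re.match(r"3\.(\d+)", release)
    if match is None:
        return release
    return "2.6.%d" % (40 + int(match.group(1)))


def is_linux(platform_name):
    """Robust platform check: prefix match instead of == "linux2"."""
    return platform_name.startswith("linux")


if __name__ == "__main__":
    print(uname26_release("3.0.0"))
    print(is_linux("linux3"))
```

The prefix check is what the commit message means by using `.startswith()` instead of `==`: it keeps working whether the platform string ends in 2 or 3.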
  4. 24 Aug, 2011 1 commit
    • Revert "irq: Always set IRQF_ONESHOT if no primary handler is specified" · 69dd3d8e
      Committed by Linus Torvalds
      This reverts commit f3637a5f.
      
      It turns out that this breaks several drivers, one example being OMAP
      boards which use the on-board OMAP UARTs and the omap-serial driver that
      will not boot to userspace after the commit.
      
      Paul Walmsley reports that enabling CONFIG_DEBUG_SHIRQ reveals 'IRQ
      handler type mismatch' errors:
      
        IRQ handler type mismatch for IRQ 74
        current handler: serial idle
        ...
      
      and the reason is that setting IRQF_ONESHOT will now result in those
      interrupt handlers having different IRQF flags, and thus being
      unsharable.  So the commit log in the reverted commit:
      
                                  "Since it is required for those users and
          there is no difference for others it makes sense to add this flag
          unconditionally."
      
      is simply not true: there may be no difference in the actions taken at
      interrupt time, but there is a *big* difference in how the irq management
      code tests this flag (see __setup_irq() in kernel/irq/manage.c).
      
      One solution may be to stop verifying IRQF_ONESHOT in __setup_irq(), but
      right now the safe course of action is to revert the change.  Let's
      revisit this in a later merge window.
      Reported-by: Paul Walmsley <paul@pwsan.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Requested-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
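Why a differing IRQF_ONESHOT makes two handlers unsharable can be illustrated with a toy version of the compatibility test. This Python sketch is an assumption-laden simplification of the check in __setup_irq(): the flag values are illustrative, `can_share()` is a hypothetical name, and the real code checks more than these two flags.

```python
# Illustrative flag values; treat them as assumptions, not kernel constants.
IRQF_SHARED = 0x0080
IRQF_ONESHOT = 0x2000


def can_share(old_flags, new_flags):
    """Toy model: both handlers must request sharing and agree on ONESHOT."""
    both_shared = (old_flags & new_flags) & IRQF_SHARED
    oneshot_differs = (old_flags ^ new_flags) & IRQF_ONESHOT
    return bool(both_shared) and not oneshot_differs


if __name__ == "__main__":
    # Two plain shared handlers are compatible.
    print(can_share(IRQF_SHARED, IRQF_SHARED))
    # Forcing ONESHOT onto only one of them yields a type mismatch.
    print(can_share(IRQF_SHARED | IRQF_ONESHOT, IRQF_SHARED))
```

In this model the reverted commit corresponds to silently adding `IRQF_ONESHOT` to one side, which flips the second call from compatible to mismatched.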
  5. 19 Aug, 2011 1 commit
  6. 14 Aug, 2011 1 commit
  7. 13 Aug, 2011 1 commit
    • xfs: remove subdirectories · c59d87c4
      Committed by Christoph Hellwig
      Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the
      annoying subdirectories in the XFS source code.  Besides the large
      number of file renames, the only changes are to the Makefile, a few
      files including headers with the subdirectory prefix, and the binary
      sysctl compat code that includes a header under fs/xfs/ from
      kernel/.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
  8. 12 Aug, 2011 1 commit
    • move RLIMIT_NPROC check from set_user() to do_execve_common() · 72fa5997
      Committed by Vasiliy Kulikov
      The patch http://lkml.org/lkml/2003/7/13/226 introduced an RLIMIT_NPROC
      check in set_user() to check for NPROC exceeding via setuid() and
      similar functions.
      
      Before the check, an unprivileged user could greatly exceed the allowed
      number of processes if the program relied on rlimit only.  But the check
      created a new security threat: many poorly written programs simply don't
      check the setuid() return code and assume it cannot fail if executed with
      root privileges.  So, the check is removed in this patch because buggy
      programs have too often led to privilege escalations.
      
      The NPROC limit can still be enforced in the common code flow of daemons
      spawning user processes.  Most daemons do fork()+setuid()+execve().
      The check introduced in execve() (1) enforces the same limit as in
      setuid() and (2) doesn't create similar security issues.
      
      Neil Brown suggested tracking which specific process has exceeded the
      limit by setting the PF_NPROC_EXCEEDED process flag.  With the change only
      this process would fail on execve(), and other processes' execve()
      behaviour is not changed.
      
      Solar Designer suggested re-checking whether the NPROC limit is still
      exceeded at the moment of execve().  If the process was sleeping for
      days between set*uid() and execve(), and the NPROC counter dropped back
      under the limit, a deferred execve() failure because the NPROC limit was
      exceeded days ago would be unexpected.  If the limit is not exceeded
      anymore, we clear the flag on successful calls to execve() and fork().
      
      The flag is also cleared on successful calls to set_user() as the limit
      was exceeded for the previous user, not the current one.
      
      A similar check was introduced in -ow patches (without the process flag).
      
      v3 - clear PF_NPROC_EXCEEDED on successful calls to set_user().
      Reviewed-by: James Morris <jmorris@namei.org>
      Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
      Acked-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
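The flag handling described in this commit can be walked through with a small state model. This is a Python sketch under stated assumptions, not kernel code: `Task`, `set_user()` and `execve()` here are toy stand-ins, the process count and limit are passed in explicitly, and the exact comparison operators are my approximation of the behaviour the message describes.

```python
import errno


class Task:
    def __init__(self):
        # Stand-in for the PF_NPROC_EXCEEDED process flag.
        self.nproc_exceeded = False


def set_user(task, user_processes, nproc_limit):
    """Never fails on NPROC; only records whether the new user is over the limit."""
    task.nproc_exceeded = user_processes >= nproc_limit
    return 0


def execve(task, user_processes, nproc_limit):
    """Fail only if the flag is set AND the limit is still exceeded right now."""
    if task.nproc_exceeded and user_processes > nproc_limit:
        return -errno.EAGAIN
    task.nproc_exceeded = False  # success clears the flag
    return 0


if __name__ == "__main__":
    task = Task()
    set_user(task, user_processes=10, nproc_limit=5)
    print(execve(task, user_processes=10, nproc_limit=5))  # still over the limit
    print(execve(task, user_processes=3, nproc_limit=5))   # counter dropped back
```

The point of the design is visible in `set_user()`: it always returns 0, so a program that never checks the setuid() return code cannot end up running on with root privileges because of NPROC.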
  9. 11 Aug, 2011 1 commit
    • blktrace: add FLUSH/FUA support · c09c47ca
      Committed by Namhyung Kim
      Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or
      FUA follows WRITE, use the same 'F' flag for both cases and
      distinguish them by their (relative) position. The end results
      look like (other flags might be shown also):
      
       - WRITE:            W
       - WRITE_FLUSH:      FW
       - WRITE_FUA:        WF
       - WRITE_FLUSH_FUA:  FWF
      
      Note that we reuse TC_BARRIER due to lack of bit space of act_mask
      so that the older versions of blktrace tools will report flush
      requests as barriers from now on.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
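The positional use of the 'F' flag can be shown with a few lines of Python. This is only a sketch of the string layout listed in the commit message: `rwbs()` is a hypothetical helper, and it ignores the other flags that the message says may also be shown.

```python
def rwbs(write=False, flush=False, fua=False):
    """Build the flag string: FLUSH before 'W', FUA after 'W', both as 'F'."""
    out = ""
    if flush:
        out += "F"   # a leading F means a flush precedes the write
    if write:
        out += "W"
    if fua:
        out += "F"   # a trailing F means FUA follows the write
    return out


if __name__ == "__main__":
    for args in [(True, False, False), (True, True, False),
                 (True, False, True), (True, True, True)]:
        print(rwbs(*args))
```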
  10. 10 Aug, 2011 1 commit
  11. 09 Aug, 2011 1 commit
  12. 06 Aug, 2011 1 commit
    • jump label: Reduce the cycle count by changing the link order · b77f0f3c
      Committed by Jason Baron
      In the course of testing jump labels for use with the CFS
      bandwidth controller, Paul Turner discovered that using jump
      labels reduced the branch count and the instruction count, but
      did not reduce the cycle count or wall time.
      
      I noticed that having the jump_label.o included in the kernel
      but not used in any way still caused this increase in cycle
      count and wall time. Thus, I moved jump_label.o in the
      kernel/Makefile, thus changing the link order, and presumably
      moving it out of hot icache areas. This brought down the cycle
      count/time as expected.
      
      In addition to Paul's testing,  I've tested the patch using a
      single 'static_branch()' in the getppid() path, and basically
      running tight loops of calls to getppid(). Here are my results
      for the branch disabled case:
      
      With jump labels turned on (CONFIG_JUMP_LABEL), branch disabled:
      
       Performance counter stats for 'bash -c /tmp/getppid;true' (50 runs):
      
           3,969,510,217 instructions             #	   0.864 IPC     ( +-0.000% )
           4,592,334,954 cycles                     ( +-   0.046% )
             751,634,470 branches                   ( +-   0.000% )
      
              1.722635797  seconds time elapsed   ( +-   0.046% )
      
      Jump labels turned off (CONFIG_JUMP_LABEL not set), branch
      disabled:
      
       Performance counter stats for 'bash -c /tmp/getppid;true' (50 runs):
      
           4,009,611,846 instructions             #	   0.867 IPC     ( +-0.000% )
           4,622,210,580 cycles                     ( +-   0.012% )
             771,662,904 branches                   ( +-   0.000% )
      
              1.734341454  seconds time elapsed   ( +-   0.022% )
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: rth@redhat.com
      Cc: a.p.zijlstra@chello.nl
      Cc: rostedt@goodmis.org
      Link: http://lkml.kernel.org/r/20110805204040.GG2522@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Tested-by: Paul Turner <pjt@google.com>
  13. 04 Aug, 2011 6 commits
  14. 02 Aug, 2011 4 commits
  15. 31 Jul, 2011 1 commit
  16. 28 Jul, 2011 6 commits
  17. 27 Jul, 2011 7 commits
    • atomic: use <linux/atomic.h> · 60063497
      Committed by Arun Sharma
      This allows us to move duplicated code in <asm/atomic.h>
      (atomic_inc_not_zero() for now) to <linux/atomic.h>.
      Signed-off-by: Arun Sharma <asharma@fb.com>
      Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: Mike Frysinger <vapier@gentoo.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • panic: panic=-1 for immediate reboot · 4302fbc8
      Committed by Hugh Dickins
      When a kernel BUG or oops occurs, ChromeOS intends to panic and
      immediately reboot, with stacktrace and other messages preserved in RAM
      across reboot.
      
      But the longer we delay, the more likely the user is to power off and
      lose the info.
      
      panic_timeout (seconds before rebooting) is set by panic= boot option or
      sysctl or /proc/sys/kernel/panic; but 0 means wait forever, so at
      present we have to delay at least 1 second.
      
      Let a negative number mean reboot immediately (with the small cosmetic
      benefit of suppressing that newline-less "Rebooting in %d seconds.."
      message).
      Signed-off-by: Hugh Dickins <hughd@chromium.org>
      Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Olaf Hering <olaf@aepfle.de>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Dave Airlie <airlied@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
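The three cases for panic_timeout after this change can be summarised in a short Python sketch. `panic_action()` is a hypothetical helper that only classifies the value; it does not read /proc/sys/kernel/panic or reboot anything.

```python
def panic_action(panic_timeout):
    """Classify a panic_timeout value according to the semantics described above."""
    if panic_timeout > 0:
        return "reboot after %d seconds" % panic_timeout
    if panic_timeout < 0:
        return "reboot immediately"   # new with this patch, e.g. panic=-1
    return "wait forever"             # 0 keeps its old meaning


if __name__ == "__main__":
    for value in (5, 0, -1):
        print(value, "->", panic_action(value))
```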
    • gcov: disable CONSTRUCTORS for UML · 947be5df
      Committed by Vitaliy Ivanov
      Selecting GCOV for UML causes a configuration mismatch:
      
        warning: (GCOV_KERNEL) selects CONSTRUCTORS which has unmet direct dependencies (!UML)
      
      Constructors are not needed for UML.
      Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
      Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Acked-by: Richard Weinberger <richard@nod.at>
      Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ipc: introduce shm_rmid_forced sysctl · b34a6b1d
      Committed by Vasiliy Kulikov
      Add support for the shm_rmid_forced sysctl.  If set to 1, all shared
      memory objects in current ipc namespace will be automatically forced to
      use IPC_RMID.
      
      The POSIX way of handling shmem allows one to create shm objects and
      call shmdt(), leaving the shm object associated with no process, thus
      consuming memory not counted via rlimits.
      
      With shm_rmid_forced=1 the shared memory object is counted against at
      least one process, so the OOM killer can effectively kill the fat process
      holding the shared memory.
      
      It obviously breaks POSIX - some programs relying on the feature would
      stop working.  So set shm_rmid_forced=1 only if you're sure nobody uses
      "orphaned" memory.  Use shm_rmid_forced=0 by default for compatibility
      reasons.
      
      The feature was previously implemented in -ow as a configure option.
      
      [akpm@linux-foundation.org: fix documentation, per Randy]
      [akpm@linux-foundation.org: fix warning]
      [akpm@linux-foundation.org: readability/conventionality tweaks]
      [akpm@linux-foundation.org: fix shm_rmid_forced/shm_forced_rmid confusion, use standard comment layout]
      Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Serge E. Hallyn" <serge.hallyn@canonical.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Solar Designer <solar@openwall.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cpusets: randomize node rotor used in cpuset_mem_spread_node() · 778d3b0f
      Committed by Michal Hocko
      [ This patch has already been accepted as commit 0ac0c0d0 but was later
        reverted (commit 35926ff5) because it introduced an arch-specific
        __node_random, which was defined only for x86 code, so it broke other
        archs.  This is a followup without any arch-specific code.  Other than
        that there are no functional changes. ]
      
      Some workloads that create a large number of small files tend to assign
      too many pages to node 0 (multi-node systems).  Part of the reason is
      that the rotor (in cpuset_mem_spread_node()) used to assign nodes starts
      at node 0 for newly created tasks.
      
      This patch changes the rotor to be initialized to a random node number
      of the cpuset.
      
      [akpm@linux-foundation.org: fix layout]
      [Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration]
      [mhocko@suse.cz: Make it arch independent]
      [akpm@linux-foundation.org: fix CONFIG_NUMA=y, MAX_NUMNODES>1 build]
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Paul Menage <menage@google.com>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • futex: Fix regression with read only mappings · 9ea71503
      Committed by Shawn Bohrer
      Commit 7485d0d3 (futexes: Remove rw parameter from get_futex_key()) in
      2.6.33 fixed two problems:  First, it prevented a loop when encountering
      a ZERO_PAGE.  Second, it fixed RW MAP_PRIVATE futex operations by forcing
      the COW to occur by unconditionally performing a write access
      get_user_pages_fast() to get the page.  The commit also introduced a
      user-mode regression in that it broke futex operations on read-only
      memory maps.  For example, this breaks workloads that have one or more
      reader processes doing a FUTEX_WAIT on a futex within a read only shared
      file mapping, and a writer process that has a writable mapping issuing
      the FUTEX_WAKE.
      
      This fixes the regression for valid futex operations on RO mappings by
      trying a RO get_user_pages_fast() when the RW get_user_pages_fast()
      fails. This change makes it necessary to also check for invalid use
      cases, such as anonymous RO mappings (which can never change) and the
      ZERO_PAGE which the commit referenced above was written to address.
      
      This patch does restore the original behavior with RO MAP_PRIVATE
      mappings, which have inherent user-mode usage problems and don't really
      make sense.  With this patch performing a FUTEX_WAIT within a RO
      MAP_PRIVATE mapping will be successfully woken provided another process
      updates the region of the underlying mapped file.  However, the mmap()
      man page states that for a MAP_PRIVATE mapping:
      
        It is unspecified whether changes made to the file after
        the mmap() call are visible in the mapped region.
      
      So user-mode users attempting to use futex operations on RO MAP_PRIVATE
      mappings are depending on unspecified behavior.  Additionally a
      RO MAP_PRIVATE mapping could fail to wake up in the following case.
      
        Thread-A: call futex(FUTEX_WAIT, memory-region-A).
                  get_futex_key() return inode based key.
                  sleep on the key
        Thread-B: call mprotect(PROT_READ|PROT_WRITE, memory-region-A)
        Thread-B: write memory-region-A.
                  COW happen. This process's memory-region-A become related
                  to new COWed private (ie PageAnon=1) page.
        Thread-B: call futex(FUTEX_WAKE, memory-region-A).
                  get_futex_key() return mm based key.
                  IOW, we fail to wake up Thread-A.
      
      Once again doing something like this is just silly and users who do
      something like this get what they deserve.
      
      While RO MAP_PRIVATE mappings are nonsensical, checking for a private
      mapping requires walking the vmas and was deemed too costly to avoid a
      userspace hang.
      
      This patch is based on Peter Zijlstra's initial patch with modifications to
      only allow RO mappings for futex operations that need VERIFY_READ access.
      Reported-by: David Oliver <david@rgmadvisors.com>
      Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Darren Hart <dvhart@linux.intel.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: peterz@infradead.org
      Cc: eric.dumazet@gmail.com
      Cc: zvonler@rgmadvisors.com
      Cc: hughd@google.com
      Link: http://lkml.kernel.org/r/1309450892-30676-1-git-send-email-sbohrer@rgmadvisors.com
      Cc: stable@kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
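The read-only fallback described in this commit can be modelled in a few lines. This Python sketch is a deliberate simplification and not the kernel's get_futex_key(): `pin_futex_page()` is a hypothetical function, the mapping is reduced to a single "writable" boolean, and the extra checks for anonymous RO mappings and the ZERO_PAGE are left out.

```python
import errno

# Stand-ins for the access type requested by the futex operation.
VERIFY_READ, VERIFY_WRITE = 0, 1


def pin_futex_page(mapping_writable, access):
    """Toy model: try a RW pin first, fall back to RO only for read-type ops."""
    if mapping_writable:
        return "rw"                      # the write-access pin succeeds
    if access == VERIFY_READ:
        return "ro"                      # fallback added by this patch
    raise OSError(errno.EFAULT, "futex op needs write access to the mapping")


if __name__ == "__main__":
    print(pin_futex_page(True, VERIFY_WRITE))    # writer with a writable mapping
    print(pin_futex_page(False, VERIFY_READ))    # reader doing FUTEX_WAIT on RO map
```

In this model the regression corresponds to the second branch being absent, so every operation on a read-only mapping failed regardless of whether it needed to write.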
  18. 26 Jul, 2011 2 commits