1. 26 11月, 2010 1 次提交
  2. 20 11月, 2010 1 次提交
    • L
      Revert "kernel: make /proc/kallsyms mode 400 to reduce ease of attacking" · 33e0d57f
      Linus Torvalds 提交于
      This reverts commit 59365d13.
      
      It turns out that this can break certain existing user land setups.
      Quoth Sarah Sharp:
      
       "On Wednesday, I updated my branch to commit 460781b5 from linus' tree,
        and my box would not boot.  klogd segfaulted, which stalled the whole
        system.
      
        At first I thought it actually hung the box, but it continued booting
        after 5 minutes, and I was able to log in.  It dropped back to the
        text console instead of the graphical bootup display for that period
        of time.  dmesg surprisingly still works.  I've bisected the problem
        down to this commit (commit 59365d13)
      
        The box is running klogd 1.5.5ubuntu3 (from Jaunty).  Yes, I know
        that's old.  I read the bit in the commit about changing the
        permissions of kallsyms after boot, but if I can't boot that doesn't
        help."
      
      So let's just keep the old default, and encourage distributions to do
      the "chmod -r /proc/kallsyms" in their bootup scripts.  This is not
      worth a kernel option to change default behavior, since it's so easily
      done in user space.
      Reported-and-bisected-by: NSarah Sharp <sarah.a.sharp@linux.intel.com>
      Cc: Marcus Meissner <meissner@suse.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Eugene Teo <eugeneteo@kernel.org>
      Cc: Jesper Juhl <jj@chaosbits.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      33e0d57f
  3. 18 11月, 2010 3 次提交
  4. 17 11月, 2010 1 次提交
    • M
      kernel: make /proc/kallsyms mode 400 to reduce ease of attacking · 59365d13
      Marcus Meissner 提交于
      Making /proc/kallsyms readable only for root by default makes it
      slightly harder for attackers to write generic kernel exploits by
      removing one source of knowledge where things are in the kernel.
      
      This is the second submit, discussion happened on this on first submit
      and mostly concerned that this is just one hole of the sieve ...  but
      one of the bigger ones.
      
      Changing the permissions of at least System.map and vmlinux is also
      required to fix the same set, but a packaging issue.
      
      Target of this starter patch and follow ups is removing any kind of
      kernel space address information leak from the kernel.
      
      [ Side note: the default of root-only reading is the "safe" value, and
        it's easy enough to then override at any time after boot.  The /proc
        filesystem allows root to change the permissions with a regular
        chmod, so you can "revert" this at run-time by simply doing
      
          chmod og+r /proc/kallsyms
      
        as root if you really want regular users to see the kernel symbols.
        It does help some tools like "perf" figure them out without any
        setup, so it may well make sense in some situations.  - Linus ]
      Signed-off-by: NMarcus Meissner <meissner@suse.de>
      Acked-by: NTejun Heo <tj@kernel.org>
      Acked-by: NEugene Teo <eugeneteo@kernel.org>
      Reviewed-by: NJesper Juhl <jj@chaosbits.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      59365d13
  5. 16 11月, 2010 3 次提交
  6. 12 11月, 2010 3 次提交
  7. 11 11月, 2010 5 次提交
    • P
      sched: Fix cross-sched-class wakeup preemption · 1e5a7405
      Peter Zijlstra 提交于
      Instead of dealing with sched classes inside each check_preempt_curr()
      implementation, pull out this logic into the generic wakeup preemption
      path.
      
      This fixes a hang in KVM (and others) where we are waiting for the
      stop machine thread to run ...
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Tested-by: NMarcelo Tosatti <mtosatti@redhat.com>
      Tested-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1288891946.2039.31.camel@laptop>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1e5a7405
    • M
      PM / OPP: Hide OPP configuration when SoCs do not provide an implementation · 43e60861
      Mark Brown 提交于
      Since the OPP API is only useful with an appropraite SoC-specific
      implementation there is no point in offering the ability to enable
      the API on general systems. Provide an ARCH_HAS OPP Kconfig symbol
      which masks out the option unless selected by an implementation.
      Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
      Acked-by: NNishanth Menon <nm@ti.com>
      Acked-by: NKevin Hilman <khilman@deeprootsystems.com>
      Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>
      43e60861
    • P
      sched: Fix runnable condition for stoptask · 2d467090
      Peter Zijlstra 提交于
      Heiko reported that the TASK_RUNNING check is not sufficient for
      CONFIG_PREEMPT=y since we can get preempted with !TASK_RUNNING.
      
      He suggested adding a ->se.on_rq test to the existing TASK_RUNNING
      one, however TASK_RUNNING will always have ->se.on_rq, so we might as
      well reduce that to a single test.
      
      [ stop tasks should never get preempted, but its good to handle
        this case correctly should this ever happen ]
      Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2d467090
    • S
      sched: Use group weight, idle cpu metrics to fix imbalances during idle · aae6d3dd
      Suresh Siddha 提交于
      Currently we consider a sched domain to be well balanced when the imbalance
      is less than the domain's imablance_pct. As the number of cores and threads
      are increasing, current values of imbalance_pct (for example 25% for a
      NUMA domain) are not enough to detect imbalances like:
      
      a) On a WSM-EP system (two sockets, each having 6 cores and 12 logical threads),
      24 cpu-hogging tasks get scheduled as 13 on one socket and 11 on another
      socket. Leading to an idle HT cpu.
      
      b) On a hypothetial 2 socket NHM-EX system (each socket having 8 cores and
      16 logical threads), 16 cpu-hogging tasks can get scheduled as 9 on one
      socket and 7 on another socket. Leaving one core in a socket idle
      whereas in another socket we have a core having both its HT siblings busy.
      
      While this issue can be fixed by decreasing the domain's imbalance_pct
      (by making it a function of number of logical cpus in the domain), it
      can potentially cause more task migrations across sched groups in an
      overloaded case.
      
      Fix this by using imbalance_pct only during newly_idle and busy
      load balancing. And during idle load balancing, check if there
      is an imbalance in number of idle cpu's across the busiest and this
      sched_group or if the busiest group has more tasks than its weight that
      the idle cpu in this_group can pull.
      Reported-by: NNikhil Rao <ncrao@google.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1284760952.2676.11.camel@sbsiddha-MOBL3.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      aae6d3dd
    • S
      perf_events: Fix time tracking in samples · eed01528
      Stephane Eranian 提交于
      This patch corrects time tracking in samples. Without this patch
      both time_enabled and time_running are bogus when user asks for
      PERF_SAMPLE_READ.
      
      One uses PERF_SAMPLE_READ to sample the values of other counters
      in each sample. Because of multiplexing, it is necessary to know
      both time_enabled, time_running to be able to scale counts correctly.
      
      In this second version of the patch, we maintain a shadow
      copy of ctx->time which allows us to compute ctx->time without
      calling update_context_time() from NMI context. We avoid the
      issue that update_context_time() must always be called with
      ctx->lock held.
      
      We do not keep shadow copies of the other event timings
      because if the lead event is overflowing then it is active
      and thus it's been scheduled in via event_sched_in() in
      which case neither tstamp_stopped, tstamp_running can be modified.
      
      This timing logic only applies to samples when PERF_SAMPLE_READ
      is used.
      
      Note that this patch does not address timing issues related
      to sampling inheritance between tasks. This will be addressed
      in a future patch.
      
      With this patch, the libpfm4 example task_smpl now reports
      correct counts (shown on 2.4GHz Core 2):
      
      $ task_smpl -p 2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears  noploop 5
      noploop for 5 seconds
      IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3
      	2,400,000,254 unhalted_core_cycles:u (33)
      	2,399,273,744 instructions_retired:u (34)
      	53,340 baclears (35)
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4cc6e14b.1e07e30a.256e.5190@mx.google.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      eed01528
  8. 10 11月, 2010 3 次提交
    • C
      block: remove REQ_HARDBARRIER · 02e031cb
      Christoph Hellwig 提交于
      REQ_HARDBARRIER is dead now, so remove the leftovers.  What's left
      at this point is:
      
       - various checks inside the block layer.
       - sanity checks in bio based drivers.
       - now unused bio_empty_barrier helper.
       - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while,
         but Xen really needs to sort out it's barrier situaton.
       - setting of ordered tags in uas - dead code copied from old scsi
         drivers.
       - scsi different retry for barriers - it's dead and should have been
         removed when flushes were converted to FS requests.
       - blktrace handling of barriers - removed.  Someone who knows blktrace
         better should add support for REQ_FLUSH and REQ_FUA, though.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
      02e031cb
    • D
      futex: Address compiler warnings in exit_robust_list · 4c115e95
      Darren Hart 提交于
      Since commit 1dcc41bb (futex: Change 3rd arg of fetch_robust_entry()
      to unsigned int*) some gcc versions decided to emit the following
      warning:
      
      kernel/futex.c: In function ‘exit_robust_list’:
      kernel/futex.c:2492: warning: ‘next_pi’ may be used uninitialized in this function
      
      The commit did not introduce the warning as gcc should have warned
      before that commit as well. It's just gcc being silly.
      
      The code path really can't result in next_pi being unitialized (or
      should not), but let's keep the build clean. Annotate next_pi as an
      uninitialized_var.
      
      [ tglx: Addressed the same issue in futex_compat.c and massaged the
        	changelog ]
      Signed-off-by: NDarren Hart <dvhart@linux.intel.com>
      Tested-by: NMatt Fleming <matt@console-pimps.org>
      Tested-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <1288897200-13008-1-git-send-email-dvhart@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      4c115e95
    • H
      [S390] ftrace: build without frame pointers on s390 · becf91f1
      Heiko Carstens 提交于
      s390 doesn't need FRAME_POINTERS in order to have a working function tracer.
      We don't need frame pointers in order to get strack traces since we always
      have valid backchains by using the -mkernel-backchain gcc option.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      becf91f1
  9. 06 11月, 2010 2 次提交
    • D
      watchdog: Fix section mismatch and potential undefined behavior. · 433039e9
      David Daney 提交于
      Commit d9ca07a0 ("watchdog: Avoid kernel crash when disabling
      watchdog") introduces a section mismatch.
      
      Now that we reference no_watchdog from non-__init code it can no longer
      be __initdata.
      Signed-off-by: NDavid Daney <ddaney@caviumnetworks.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      433039e9
    • O
      posix-cpu-timers: workaround to suppress the problems with mt exec · e0a70217
      Oleg Nesterov 提交于
      posix-cpu-timers.c correctly assumes that the dying process does
      posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
      timers from signal->cpu_timers list.
      
      But, it also assumes that timer->it.cpu.task is always the group
      leader, and thus the dead ->task means the dead thread group.
      
      This is obviously not true after de_thread() changes the leader.
      After that almost every posix_cpu_timer_ method has problems.
      
      It is not simple to fix this bug correctly. First of all, I think
      that timer->it.cpu should use struct pid instead of task_struct.
      Also, the locking should be reworked completely. In particular,
      tasklist_lock should not be used at all. This all needs a lot of
      nontrivial and hard-to-test changes.
      
      Change __exit_signal() to do posix_cpu_timers_exit_group() when
      the old leader dies during exec. This is not the fix, just the
      temporary hack to hide the problem for 2.6.37 and stable. IOW,
      this is obviously wrong but this is what we currently have anyway:
      cpu timers do not work after mt exec.
      
      In theory this change adds another race. The exiting leader can
      detach the timers which were attached to the new leader. However,
      the window between de_thread() and release_task() is small, we
      can pretend that sys_timer_create() was called before de_thread().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e0a70217
  10. 05 11月, 2010 1 次提交
  11. 30 10月, 2010 12 次提交
  12. 29 10月, 2010 1 次提交
  13. 28 10月, 2010 4 次提交