1. 01 7月, 2011 5 次提交
  2. 28 6月, 2011 1 次提交
    • K
      Fix node_start/end_pfn() definition for mm/page_cgroup.c · c6830c22
      KAMEZAWA Hiroyuki 提交于
      commit 21a3c964 uses node_start/end_pfn(nid) for detection start/end
      of nodes. But, it's not defined in linux/mmzone.h but defined in
      /arch/???/include/mmzone.h which is included only under
      CONFIG_NEED_MULTIPLE_NODES=y.
      
      Then, we see
        mm/page_cgroup.c: In function 'page_cgroup_init':
        mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
        mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'
      
      So, fixiing page_cgroup.c is an idea...
      
      But node_start_pfn()/node_end_pfn() is a very generic macro and
      should be implemented in the same manner for all archs.
      (m32r has different implementation...)
      
      This patch removes definitions of node_start/end_pfn() in each archs
      and defines a unified one in linux/mmzone.h. It's not under
      CONFIG_NEED_MULTIPLE_NODES, now.
      
      A result of macro expansion is here (mm/page_cgroup.c)
      
      for !NUMA
       start_pfn = ((&contig_page_data)->node_start_pfn);
        end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});
      
      for NUMA (x86-64)
        start_pfn = ((node_data[nid])->node_start_pfn);
        end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});
      
      Changelog:
       - fixed to avoid using "nid" twice in node_end_pfn() macro.
      Reported-and-acked-by: NRandy Dunlap <randy.dunlap@oracle.com>
      Reported-and-tested-by: NIngo Molnar <mingo@elte.hu>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c6830c22
  3. 20 6月, 2011 4 次提交
  4. 19 6月, 2011 1 次提交
  5. 17 6月, 2011 1 次提交
    • K
      xen/setup: Fix for incorrect xen_extra_mem_start. · acd049c6
      Konrad Rzeszutek Wilk 提交于
      The earlier attempts (24bdb0b6)
      at fixing this problem caused other problems to surface (PV guests
      with no PCI passthrough would have SWIOTLB turned on - which meant
      64MB of precious contingous DMA32 memory being eaten up per guest).
      The problem was: "on xen we add an extra memory region at the end of
      the e820, and on this particular machine this extra memory region
      would start below 4g and cross over the 4g boundary:
      
      [0xfee01000-0x192655000)
      
      Unfortunately e820_end_of_low_ram_pfn does not expect an
      e820 layout like that so it returns 4g, therefore initial_memory_mapping
      will map [0 - 0x100000000), that is a memory range that includes some
      reserved memory regions."
      
      The memory range was the IOAPIC regions, and with the 1-1 mapping
      turned on, it would map them as RAM, not as MMIO regions. This caused
      the hypervisor to complain. Fortunately this is experienced only under
      the initial domain so we guard for it.
      Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      acd049c6
  6. 16 6月, 2011 2 次提交
  7. 15 6月, 2011 1 次提交
  8. 14 6月, 2011 1 次提交
  9. 10 6月, 2011 1 次提交
    • M
      exec: delay address limit change until point of no return · dac853ae
      Mathias Krause 提交于
      Unconditionally changing the address limit to USER_DS and not restoring
      it to its old value in the error path is wrong because it prevents us
      using kernel memory on repeated calls to this function.  This, in fact,
      breaks the fallback of hard coded paths to the init program from being
      ever successful if the first candidate fails to load.
      
      With this patch applied switching to USER_DS is delayed until the point
      of no return is reached which makes it possible to have a multi-arch
      rootfs with one arch specific init binary for each of the (hard coded)
      probed paths.
      
      Since the address limit is already set to USER_DS when start_thread()
      will be invoked, this redundancy can be safely removed.
      Signed-off-by: NMathias Krause <minipli@googlemail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dac853ae
  10. 09 6月, 2011 2 次提交
  11. 08 6月, 2011 1 次提交
    • T
      x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU · fd8a7de1
      Thomas Gleixner 提交于
      After a newly plugged CPU sets the cpu_online bit it enables
      interrupts and goes idle. The cpu which brought up the new cpu waits
      for the cpu_online bit and when it observes it, it sets the cpu_active
      bit for this cpu. The cpu_active bit is the relevant one for the
      scheduler to consider the cpu as a viable target.
      
      With forced threaded interrupt handlers which imply forced threaded
      softirqs we observed the following race:
      
      cpu 0                         cpu 1
      
      bringup(cpu1);
                                    set_cpu_online(smp_processor_id(), true);
      		              local_irq_enable();
      while (!cpu_online(cpu1));
                                    timer_interrupt()
                                      -> wake_up(softirq_thread_cpu1);
                                           -> enqueue_on(softirq_thread_cpu1, cpu0);
      
                                                                              ^^^^
      
      cpu_notify(CPU_ONLINE, cpu1);
        -> sched_cpu_active(cpu1)
           -> set_cpu_active((cpu1, true);
      
      When an interrupt happens before the cpu_active bit is set by the cpu
      which brought up the newly onlined cpu, then the scheduler refuses to
      enqueue the woken thread which is bound to that newly onlined cpu on
      that newly onlined cpu due to the not yet set cpu_active bit and
      selects a fallback runqueue. Not really an expected and desirable
      behaviour.
      
      So far this has only been observed with forced hard/softirq threading,
      but in theory this could happen without forced threaded hard/softirqs
      as well. It's probably unobservable as it would take a massive
      interrupt storm on the newly onlined cpu which causes the softirq loop
      to wake up the softirq thread and an even longer delay of the cpu
      which waits for the cpu_online bit.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: stable@kernel.org # 2.6.39
      fd8a7de1
  12. 07 6月, 2011 1 次提交
    • J
      x86/amd-iommu: Fix boot crash with hidden PCI devices · 26018874
      Joerg Roedel 提交于
      Some PCIe cards ship with a PCI-PCIe bridge which is not
      visible as a PCI device in Linux. But the device-id of the
      bridge is present in the IOMMU tables which causes a boot
      crash in the IOMMU driver.
      This patch fixes by removing these cards from the IOMMU
      handling. This is a pure -stable fix, a real fix to handle
      this situation appriatly will follow for the next merge
      window.
      
      Cc: stable@kernel.org	# > 2.6.32
      Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
      26018874
  13. 06 6月, 2011 3 次提交
  14. 04 6月, 2011 1 次提交
  15. 02 6月, 2011 1 次提交
    • M
      x86/PCI/ACPI: fix type mismatch · 6e33a852
      Márton Németh 提交于
      The flags field of struct resource from linux/ioport.h is "unsigned
      long". Change the "type" parameter of coalesce_windows() function to
      match that field. This fixes the following warning messages when
      compiling with "make C=1 W=1 bzImage modules":
      
      arch/x86/pci/acpi.c: In function ‘coalesce_windows’:
      arch/x86/pci/acpi.c:198: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result
      arch/x86/pci/acpi.c:203: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result
      Signed-off-by: NMárton Németh <nm127@freemail.hu>
      Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      6e33a852
  16. 30 5月, 2011 4 次提交
  17. 29 5月, 2011 6 次提交
    • L
      x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param · 5d4c47e0
      Len Brown 提交于
      mwait_idle() is a C1-only idle loop intended to be more efficient
      than HLT on SMP hardware that supports it.
      
      But mwait_idle() has been replaced by the more general
      mwait_idle_with_hints(), which handles both C1 and deeper C-states.
      ACPI uses only mwait_idle_with_hints(), and never uses mwait_idle().
      
      Deprecate mwait_idle() and the "idle=mwait" cmdline param
      to simplify the x86 idle code.
      
      After this change, kernels configured with
      (!CONFIG_ACPI=n && !CONFIG_INTEL_IDLE=n) when run on hardware
      that support MWAIT will simply use HLT.  If MWAIT is desired
      on those systems, cpuidle and the cpuidle drivers above
      can be used.
      
      cc: x86@kernel.org
      cc: stable@kernel.org # .39.x
      Signed-off-by: NLen Brown <len.brown@intel.com>
      5d4c47e0
    • L
      x86 idle: deprecate "no-hlt" cmdline param · cdaab4a0
      Len Brown 提交于
      We'd rather that modern machines not check if HLT works on
      every entry into idle, for the benefit of machines that had
      marginal electricals 15-years ago.  If those machines are still running
      the upstream kernel, they can use "idle=poll".  The only difference
      will be that they'll now invoke HLT in machine_hlt().
      
      cc: x86@kernel.org # .39.x
      Signed-off-by: NLen Brown <len.brown@intel.com>
      cdaab4a0
    • L
      x86 idle APM: deprecate CONFIG_APM_CPU_IDLE · 99c63221
      Len Brown 提交于
      We don't want to export the pm_idle function pointer to modules.
      Currently CONFIG_APM_CPU_IDLE w/ CONFIG_APM_MODULE forces us to.
      
      CONFIG_APM_CPU_IDLE is of dubious value, it runs only on 32-bit
      uniprocessor laptops that are over 10 years old.  It calls into
      the BIOS during idle, and is known to cause a number of machines
      to fail.
      
      Removing CONFIG_APM_CPU_IDLE and will allow us to stop exporting
      pm_idle.  Any systems that were calling into the APM BIOS
      at run-time will simply use HLT instead.
      
      cc: x86@kernel.org
      cc: Jiri Kosina <jkosina@suse.cz>
      cc: stable@kernel.org # .39.x
      Signed-off-by: NLen Brown <len.brown@intel.com>
      99c63221
    • L
      x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it · 06ae40ce
      Len Brown 提交于
      In the long run, we don't want default_idle() or (pm_idle)() to
      be exported outside of process.c.  Start by not exporting them
      to modules, unless the APM build demands it.
      
      cc: x86@kernel.org
      cc: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      06ae40ce
    • L
      x86 idle: clarify AMD erratum 400 workaround · 02c68a02
      Len Brown 提交于
      The workaround for AMD erratum 400 uses the term "c1e" falsely suggesting:
      1. Intel C1E is somehow involved
      2. All AMD processors with C1E are involved
      
      Use the string "amd_c1e" instead of simply "c1e" to clarify that
      this workaround is specific to AMD's version of C1E.
      Use the string "e400" to clarify that the workaround is specific
      to AMD processors with Erratum 400.
      
      This patch is text-substitution only, with no functional change.
      
      cc: x86@kernel.org
      Acked-by: NBorislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      02c68a02
    • E
      ns: Wire up the setns system call · 7b21fddd
      Eric W. Biederman 提交于
      32bit and 64bit on x86 are tested and working.  The rest I have looked
      at closely and I can't find any problems.
      
      setns is an easy system call to wire up.  It just takes two ints so I
      don't expect any weird architecture porting problems.
      
      While doing this I have noticed that we have some architectures that are
      very slow to get new system calls.  cris seems to be the slowest where
      the last system calls wired up were preadv and pwritev.  avr32 is weird
      in that recvmmsg was wired up but never declared in unistd.h.  frv is
      behind with perf_event_open being the last syscall wired up.  On h8300
      the last system call wired up was epoll_wait.  On m32r the last system
      call wired up was fallocate.  mn10300 has recvmmsg as the last system
      call wired up.  The rest seem to at least have syncfs wired up which was
      new in the 2.6.39.
      
      v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
      v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
      v4: Moved wiring up of the system call to another patch
      v5: ported to v2.6.39-rc6
      v6: rebased onto parisc-next and net-next to avoid syscall  conflicts.
      v7: ported to Linus's latest post 2.6.39 tree.
      
      >  arch/blackfin/include/asm/unistd.h     |    3 ++-
      >  arch/blackfin/mach-common/entry.S      |    1 +
      Acked-by: NMike Frysinger <vapier@gentoo.org>
      
      Oh - ia64 wiring looks good.
      Acked-by: NTony Luck <tony.luck@intel.com>
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7b21fddd
  18. 28 5月, 2011 1 次提交
    • S
      x86: Put back -pg to tsc.o and add no GCOV to vread_tsc_64.o · 89e1be50
      Steven Rostedt 提交于
      The commit 44259b1a
          Author: Andy Lutomirski <luto@MIT.EDU>
          x86-64: Move vread_tsc into a new file with sensible options
      
      Removed the -pg from tsc.o which caused the function graph tracer
      to go into an infinite function call recursion as it uses the tsc
      internally outside its recursion protection, thus tracing the tsc
      breaks the function graph tracer.
      
      This commit also added the file vread_tsc_64.c that gets used
      by vdso but failed to prevent GCOV from monkeying with it,
      causing userspace to try to access kernel data when GCOV was
      enabled.
      
      Thanks to Thomas Gleixner for pointing out GCOV as the likely
      culprit that added strange kernel accesses into the vread_tsc()
      call.
      
      Cc: Author: Andy Lutomirski <luto@MIT.EDU>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      89e1be50
  19. 27 5月, 2011 3 次提交