1. 22 4月, 2009 1 次提交
  2. 03 4月, 2009 2 次提交
  3. 01 4月, 2009 2 次提交
  4. 13 3月, 2009 1 次提交
    • R
      UML on UML fixed: it did not start · 86d6f2bf
      Renzo Davoli 提交于
      It is currently impossible to run a user-mode linux machine inside another
      user-mode linux (UML on UML).  It breaks after a few instructions.  When
      it tries to check whether SYSEMU is installed (the inner) UML receives an
      inconsistent result (from the outer UML).
      
      This is the output of a broken attempt:
      $ ./linux mem=256m ubd0=cow
      Locating the bottom of the address space ... 0x0
      Locating the top of the address space ... 0xc0000000
      Core dump limits :
              soft - 0
              hard - NONE
      Checking that ptrace can change system call numbers...OK
      Checking ptrace new tags for syscall emulation...unsupported
      Checking syscall emulation patch for ptrace...check_sysemu : expected SIGTRAP, got status = 256
      $
      
      The problem is the following:
      
      PTRACE_SYSCALL/SINGLESTEP is currently managed inside arch_ptrace for ARCH=um.
      
      PTRACE_SYSEMU/SUSEMU_SINGLESTEP is not captured in arch_ptrace's switch,
      therefore it is erroneously passed back to ptrace_request (in
      kernel/ptrace).
      
      This simple patch simply forces ptrace to return an error on
      PTRACE_SYSEMU/SUSEMU_SINGLESTEP as it is unsupported on ARCH=um, and fixes
      the problem.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NRenzo Davoli <renzo@cs.unibo.it>
      Reviewed-by: NWANG Cong <xiyou.wangcong@gmail.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      86d6f2bf
  5. 11 1月, 2009 1 次提交
    • Y
      sparseirq: use kstat_irqs_cpu instead · dee4102a
      Yinghai Lu 提交于
      Impact: build fix
      
      Ingo Molnar wrote:
      
      > tip/arch/blackfin/kernel/irqchip.c: In function 'show_interrupts':
      > tip/arch/blackfin/kernel/irqchip.c:85: error: 'struct kernel_stat' has no member named 'irqs'
      > make[2]: *** [arch/blackfin/kernel/irqchip.o] Error 1
      > make[2]: *** Waiting for unfinished jobs....
      >
      
      So could move kstat_irqs array to irq_desc struct.
      
      (s390, m68k, sparc) are not touched yet, because they don't support genirq
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dee4102a
  6. 07 1月, 2009 1 次提交
    • N
      mm: invoke oom-killer from page fault · 1c0fe6e3
      Nick Piggin 提交于
      Rather than have the pagefault handler kill a process directly if it gets
      a VM_FAULT_OOM, have it call into the OOM killer.
      
      With increasingly sophisticated oom behaviour (cpusets, memory cgroups,
      oom killing throttling, oom priority adjustment or selective disabling,
      panic on oom, etc), it's silly to unconditionally kill the faulting
      process at page fault time.  Create a hook for pagefault oom path to call
      into instead.
      
      Only converted x86 and uml so far.
      
      [akpm@linux-foundation.org: make __out_of_memory() static]
      [akpm@linux-foundation.org: fix comment]
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Cc: Jeff Dike <jdike@addtoit.com>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1c0fe6e3
  7. 01 1月, 2009 1 次提交
  8. 13 12月, 2008 2 次提交
    • R
      cpumask: convert struct clock_event_device to cpumask pointers. · 320ab2b0
      Rusty Russell 提交于
      Impact: change calling convention of existing clock_event APIs
      
      struct clock_event_timer's cpumask field gets changed to take pointer,
      as does the ->broadcast function.
      
      Another single-patch change.  For safety, we BUG_ON() in
      clockevents_register_device() if it's not set.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      320ab2b0
    • R
      cpumask: centralize cpu_online_map and cpu_possible_map · 98a79d6a
      Rusty Russell 提交于
      Impact: cleanup
      
      Each SMP arch defines these themselves.  Move them to a central
      location.
      
      Twists:
      1) Some archs (m32, parisc, s390) set possible_map to all 1, so we add a
         CONFIG_INIT_ALL_POSSIBLE for this rather than break them.
      
      2) mips and sparc32 '#define cpu_possible_map phys_cpu_present_map'.
         Those archs simply have phys_cpu_present_map replaced everywhere.
      
      3) Alpha defined cpu_possible_map to cpu_present_map; this is tricky
         so I just manipulate them both in sync.
      
      4) IA64, cris and m32r have gratuitous 'extern cpumask_t cpu_possible_map'
         declarations.
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Reviewed-by: NGrant Grundler <grundler@parisc-linux.org>
      Tested-by: NTony Luck <tony.luck@intel.com>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      Cc: Mike Travis <travis@sgi.com>
      Cc: ink@jurassic.park.msu.ru
      Cc: rmk@arm.linux.org.uk
      Cc: starvik@axis.com
      Cc: tony.luck@intel.com
      Cc: takata@linux-m32r.org
      Cc: ralf@linux-mips.org
      Cc: grundler@parisc-linux.org
      Cc: paulus@samba.org
      Cc: schwidefsky@de.ibm.com
      Cc: lethal@linux-sh.org
      Cc: wli@holomorphy.com
      Cc: davem@davemloft.net
      Cc: jdike@addtoit.com
      Cc: mingo@redhat.com
      98a79d6a
  9. 23 10月, 2008 2 次提交
  10. 17 10月, 2008 1 次提交
  11. 09 9月, 2008 1 次提交
    • M
      kernel/cpu.c: create a CPU_STARTING cpu_chain notifier · e545a614
      Manfred Spraul 提交于
      Right now, there is no notifier that is called on a new cpu, before the new
      cpu begins processing interrupts/softirqs.
      Various kernel function would need that notification, e.g. kvm works around
      by calling smp_call_function_single(), rcu polls cpu_online_map.
      
      The patch adds a CPU_STARTING notification. It also adds a helper function
      that sends the message to all cpu_chain handlers.
      
      Tested on x86-64.
      All other archs are untested. Especially on sparc, I'm not sure if I got
      it right.
      Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e545a614
  12. 27 7月, 2008 1 次提交
  13. 25 7月, 2008 3 次提交
  14. 19 7月, 2008 1 次提交
    • T
      nohz: prevent tick stop outside of the idle loop · b8f8c3cf
      Thomas Gleixner 提交于
      Jack Ren and Eric Miao tracked down the following long standing
      problem in the NOHZ code:
      
      	scheduler switch to idle task
      	enable interrupts
      
      Window starts here
      
      	----> interrupt happens (does not set NEED_RESCHED)
      	      	irq_exit() stops the tick
      
      	----> interrupt happens (does set NEED_RESCHED)
      
      	return from schedule()
      	
      	cpu_idle(): preempt_disable();
      
      Window ends here
      
      The interrupts can happen at any point inside the race window. The
      first interrupt stops the tick, the second one causes the scheduler to
      rerun and switch away from idle again and we end up with the tick
      disabled.
      
      The fact that it needs two interrupts where the first one does not set
      NEED_RESCHED and the second one does made the bug obscure and extremly
      hard to reproduce and analyse. Kudos to Jack and Eric.
      
      Solution: Limit the NOHZ functionality to the idle loop to make sure
      that we can not run into such a situation ever again.
      
      cpu_idle()
      {
      	preempt_disable();
      
      	while(1) {
      		 tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
      		 			          are in the idle loop
      
      		 while (!need_resched())
      		       halt();
      
      		 tick_nohz_restart_sched_tick(); <- disables NOHZ mode
      		 preempt_enable_no_resched();
      		 schedule();
      		 preempt_disable();
      	}
      }
      
      In hindsight we should have done this forever, but ... 
      
      /me grabs a large brown paperbag.
      
      Debugged-by: Jack Ren <jack.ren@marvell.com>, 
      Debugged-by: Neric miao <eric.y.miao@gmail.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      b8f8c3cf
  15. 26 6月, 2008 1 次提交
  16. 07 6月, 2008 1 次提交
    • T
      uml: deal with inaccessible address space start · 40fb16a3
      Tom Spink 提交于
      This patch makes os_get_task_size locate the bottom of the address space,
      as well as the top.  This is for systems which put a lower limit on mmap
      addresses.  It works by manually scanning pages from zero onwards until a
      valid page is found.
      
      Because the bottom of the address space may not be zero, it's not
      sufficient to assume the top of the address space is the size of the
      address space.  The size is the difference between the top address and
      bottom address.
      
      [jdike@addtoit.com: changed the name to reflect that this function is
      supposed to return the top of the process address space, not its size and
      changed the return value to reflect that.  Also some minor formatting
      changes]
      Signed-off-by: NTom Spink <tspink@gmail.com>
      Signed-off-by: NJeff Dike <jdike@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      40fb16a3
  17. 22 5月, 2008 1 次提交
  18. 17 5月, 2008 1 次提交
  19. 13 5月, 2008 5 次提交
  20. 04 5月, 2008 1 次提交
    • U
      unified (weak) sys_pipe implementation · d35c7b0e
      Ulrich Drepper 提交于
      This replaces the duplicated arch-specific versions of "sys_pipe()" with
      one unified implementation.  This removes almost 250 lines of duplicated
      code.
      
      It's marked __weak, so that *if* an architecture wants to override the
      default implementation it can do so by simply having its own replacement
      version, since many architectures use alternate calling conventions for
      the 'pipe()' system call for legacy reasons (ie traditional UNIX
      implementations often return the two file descriptors in registers)
      
      I still haven't changed the cris version even though Linus says the BKL
      isn't needed.  The arch maintainer can easily do it if there are really
      no obstacles.
      Signed-off-by: NUlrich Drepper <drepper@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d35c7b0e
  21. 29 4月, 2008 2 次提交
  22. 28 4月, 2008 1 次提交
  23. 25 2月, 2008 1 次提交
  24. 24 2月, 2008 2 次提交
  25. 09 2月, 2008 4 次提交
    • J
      uml: fix mm_context memory leak · ac2a6599
      Jeff Dike 提交于
      [ Spotted by Miklos ]
      
      Fix a memory leak in init_new_context.  The struct page ** buffer allocated
      for install_special_mapping was never recorded, and thus leaked when the
      mm_struct was freed.  Fix it by saving the pointer in mm_context_t and freeing
      it in arch_exit_mmap.
      Signed-off-by: NJeff Dike <jdike@linux.intel.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ac2a6599
    • J
      uml: runtime host VMSPLIT detection · 536788fe
      Jeff Dike 提交于
      Calculate TASK_SIZE at run-time by figuring out the host's VMSPLIT - this is
      needed on i386 if UML is to run on hosts with varying VMSPLITs without
      recompilation.
      
      TASK_SIZE is now defined in terms of a variable, task_size.  This gets rid of
      an include of pgtable.h from processor.h, which can cause include loops.
      
      On i386, task_size is calculated early in boot by probing the address space in
      a binary search to figure out where the boundary between usable and non-usable
      memory is.  This tries to make sure that a page that is considered to be in
      userspace is, or can be made, read-write.  I'm concerned about a system-global
      VDSO page in kernel memory being hit and considered to be a userspace page.
      
      On x86_64, task_size is just the old value of CONFIG_TOP_ADDR.
      
      A bunch of config variable are gone now.  CONFIG_TOP_ADDR is directly replaced
      by TASK_SIZE.  NEST_LEVEL is gone since the relocation of the stubs makes it
      irrelevant.  All the HOST_VMSPLIT stuff is gone.  All references to these in
      arch/um/Makefile are also gone.
      
      I noticed and fixed a missing extern in os.h when adding os_get_task_size.
      
      Note: This has been revised to fix the 32-bit UML on 64-bit host bug that
      Miklos ran into.
      Signed-off-by: NJeff Dike <jdike@linux.intel.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      536788fe
    • M
      CONFIG_HIGHPTE vs. sub-page page tables. · 2f569afd
      Martin Schwidefsky 提交于
      Background: I've implemented 1K/2K page tables for s390.  These sub-page
      page tables are required to properly support the s390 virtualization
      instruction with KVM.  The SIE instruction requires that the page tables
      have 256 page table entries (pte) followed by 256 page status table entries
      (pgste).  The pgstes are only required if the process is using the SIE
      instruction.  The pgstes are updated by the hardware and by the hypervisor
      for a number of reasons, one of them is dirty and reference bit tracking.
      To avoid wasting memory the standard pte table allocation should return
      1K/2K (31/64 bit) and 2K/4K if the process is using SIE.
      
      Problem: Page size on s390 is 4K, page table size is 1K or 2K.  That means
      the s390 version for pte_alloc_one cannot return a pointer to a struct
      page.  Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
      cannot return a pointer to a pte either, since that would require more than
      32 bit for the return value of pte_alloc_one (and the pte * would not be
      accessible since its not kmapped).
      
      Solution: The only solution I found to this dilemma is a new typedef: a
      pgtable_t.  For s390 pgtable_t will be a (pte *) - to be introduced with a
      later patch.  For everybody else it will be a (struct page *).  The
      additional problem with the initialization of the ptl lock and the
      NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
      a destructor pgtable_page_dtor.  The page table allocation and free
      functions need to call these two whenever a page table page is allocated or
      freed.  pmd_populate will get a pgtable_t instead of a struct page pointer.
       To get the pgtable_t back from a pmd entry that has been installed with
      pmd_populate a new function pmd_pgtable is added.  It replaces the pmd_page
      call in free_pte_range and apply_to_pte_range.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2f569afd
    • D
      aout: remove unnecessary inclusions of {asm, linux}/a.out.h · 1eb11411
      David Howells 提交于
      Remove now unnecessary inclusions of {asm,linux}/a.out.h.
      
      [akpm@linux-foundation.org: fix alpha build]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1eb11411