1. 09 May, 2007 1 commit
    • move die notifier handling to common code · 1eeb66a1
      Christoph Hellwig committed
      This patch moves the die notifier handling to common code.  Previously,
      various architectures had exactly the same code for it.  Note that the new
      code is compiled unconditionally; this should be understood as an appeal to
      the other architecture maintainers to implement support for it as well (aka
      sprinkling a notify_die or two in the proper place).
      
      arm had a notify_die that did something totally different; I renamed it to
      arm_notify_die as part of the patch and made it static to the file it's
      declared and used in.  avr32 used to pass slightly less information through
      this interface and I brought it into line with the other architectures.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
      [bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: <linux-arch@vger.kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Bryan Wu <bryan.wu@analog.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
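The notifier mechanism this commit consolidates can be pictured with a small userspace model. This is a sketch only, not the kernel code: every `model_` name is hypothetical, and the real chain lives in the kernel's notifier infrastructure with its own locking and return-code masks. It assumes handlers are kept in descending priority order and that a handler can stop the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; all model_* names are hypothetical. */
enum { MODEL_NOTIFY_DONE = 0, MODEL_NOTIFY_STOP = 1 };

struct model_notifier {
    int (*call)(struct model_notifier *self, unsigned long event, void *data);
    int priority;                    /* higher priority runs first */
    struct model_notifier *next;
};

static struct model_notifier *model_die_chain;
static int model_calls;              /* counts handler invocations */

/* Insert a handler, keeping the chain sorted by descending priority. */
static void model_register_die_notifier(struct model_notifier *nb)
{
    struct model_notifier **pos = &model_die_chain;

    assert(nb != NULL && nb->call != NULL);
    while (*pos && (*pos)->priority >= nb->priority)
        pos = &(*pos)->next;
    nb->next = *pos;
    *pos = nb;
}

/* Walk the chain until one handler asks to stop. */
static int model_notify_die(unsigned long event, void *data)
{
    struct model_notifier *nb;
    int ret = MODEL_NOTIFY_DONE;

    for (nb = model_die_chain; nb; nb = nb->next) {
        ret = nb->call(nb, event, data);
        if (ret == MODEL_NOTIFY_STOP)
            break;
    }
    return ret;
}

/* Example handler that lets the walk continue. */
static int model_count_handler(struct model_notifier *self,
                               unsigned long event, void *data)
{
    (void)self; (void)event; (void)data;
    model_calls++;
    return MODEL_NOTIFY_DONE;
}

/* Example handler that claims the event, as a debugger might. */
static int model_stop_handler(struct model_notifier *self,
                              unsigned long event, void *data)
{
    (void)self; (void)event; (void)data;
    model_calls++;
    return MODEL_NOTIFY_STOP;
}
```

An architecture that "sprinkles a notify_die" would correspond here to calling `model_notify_die()` from its trap paths and acting on the return value.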
  2. 03 May, 2007 2 commits
    • [PATCH] i386: PARAVIRT: Allow paravirt backend to choose kernel PMD sharing · 5311ab62
      Jeremy Fitzhardinge committed
      Normally when running in PAE mode, the 4th PMD maps the kernel address space,
      which can be shared among all processes (since they all need the same kernel
      mappings).
      
      Xen, however, does not allow guests to have the kernel pmd shared between page
      tables, so parameterize pgtable.c to allow both modes of operation.
      
      There are several side-effects of this.  One is that vmalloc will update the
      kernel address space mappings, and those updates need to be propagated into
      all processes if the kernel mappings are not intrinsically shared.  In the
      non-PAE case, this is done by maintaining a pgd_list of all processes; this
      list is used when all process pagetables must be updated.  pgd_list is
      threaded via otherwise unused entries in the page structure for the pgd, which
      means that the pgd must be page-sized for this to work.
      
      Normally the PAE pgd is only 4 x 64-bit entries (32 bytes) large, but Xen
      requires the PAE pgd to be page aligned anyway, so this patch forces the pgd
      to be page aligned and page sized when the kernel pmd is unshared, to
      accommodate both these requirements.
      
      Also, since there may be several distinct kernel pmds (if the user/kernel
      split is below 3G), there's no point in allocating them from a slab cache;
      they're just allocated with get_free_page and initialized appropriately.  (Of
      course they could be cached if there is just a single kernel pmd - which is
      the default with a 3G user/kernel split - but it doesn't seem worthwhile to
      add yet another case into this code.)
      
      [ Many thanks to wli for review comments. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • [PATCH] i386: adjustments to page table dump during oops (v4) · 28609f6e
      Jan Beulich committed
      - make the page table contents printing PAE capable
      - make sure the address stored in current->thread.cr2 is unmodified
        from what was read from CR2
      - don't call oops_may_print() multiple times, when one time suffices
      - print pte even in highpte case, as long as the pte page isn't
        actually in high memory (which is specifically the case for all page
        tables covering kernel space)
      
      (Changes to v3: Use sizeof()*2 rather than the suggested sizeof()*4 for
      printing width, use fixed 16-nibble width for PAE, and also apply the
      max_low_pfn range check to the middle level lookup on PAE.)
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
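The printing-width rule in the v3 change note (two hex digits per byte of the entry, which gives 16 digits for the 64-bit PAE entries) can be illustrated in userspace. This is a sketch under assumptions: `format_entry` is a hypothetical helper, and the kernel itself prints with printk rather than snprintf.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format one page-table entry as zero-padded hex.
 * The width is two nibbles per byte of the entry type, so a 4-byte
 * entry prints 8 digits and an 8-byte (PAE) entry prints 16.
 * Returns the number of characters written, as snprintf does.
 */
static int format_entry(char *buf, size_t len,
                        unsigned long long entry, size_t entry_bytes)
{
    int width = (int)(entry_bytes * 2);

    assert(buf != NULL && len > (size_t)width);
    return snprintf(buf, len, "%0*llx", width, entry);
}
```

Passing `sizeof(unsigned long)` as `entry_bytes` mirrors the `sizeof()*2` choice for the non-PAE case; passing 8 mirrors the fixed 16-nibble PAE width.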
  3. 13 Feb, 2007 1 commit
  4. 12 Feb, 2007 1 commit
  5. 07 Dec, 2006 1 commit
  6. 30 Sep, 2006 2 commits
    • [PATCH] pidspace: is_init() · f400e198
      Sukadev Bhattiprolu committed
      This is an updated version of Eric Biederman's is_init() patch.
      (http://lkml.org/lkml/2006/2/6/280).  It applies cleanly to 2.6.18-rc3 and
      replaces a few more instances of ->pid == 1 with is_init().
      
      Further, is_init() checks pid and thus removes the dependency on Eric's
      other patches for now.
      
      Eric's original description:
      
      	There are a lot of places in the kernel where we test for init
      	because we give it special properties.  Most significantly, init
      	must not die.  This results in code all over the kernel testing
      	->pid == 1.
      
      	Introduce is_init to capture this case.
      
      	With multiple pid spaces for all of the cases affected we are
      	looking for only the first process on the system, not some other
      	process that has pid == 1.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: <lxc-devel@lists.sourceforge.net>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
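The idea of wrapping the scattered `->pid == 1` tests in one helper can be sketched as follows. This is a userspace illustration only: `struct model_task` and `model_is_init` are hypothetical stand-ins for the kernel's task structure and the is_init() helper, and the body simply checks the pid as the commit message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's task structure. */
struct model_task {
    int pid;
};

/*
 * Single place that answers "is this the init process?".
 * Callers stop open-coding pid == 1, so the test can later change
 * (for example, for multiple pid spaces) without touching them.
 */
static inline int model_is_init(const struct model_task *tsk)
{
    assert(tsk != NULL);
    return tsk->pid == 1;
}
```

A call site that previously read `if (tsk->pid == 1)` would become `if (model_is_init(tsk))` in this model.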
    • [PATCH] make PROT_WRITE imply PROT_READ · df67b3da
      Jason Baron committed
      Make PROT_WRITE imply PROT_READ for a number of architectures which don't
      support write only in hardware.
      
      While looking at this, I noticed that some architectures which do not
      support write only mappings already take the exact same approach.  For
      example, in arch/alpha/mm/fault.c:
      
      "
              if (cause < 0) {
                      if (!(vma->vm_flags & VM_EXEC))
                              goto bad_area;
              } else if (!cause) {
                      /* Allow reads even for write-only mappings */
                      if (!(vma->vm_flags & (VM_READ | VM_WRITE)))
                              goto bad_area;
              } else {
                      if (!(vma->vm_flags & VM_WRITE))
                              goto bad_area;
              }
      "
      
      Thus, this patch brings other architectures which do not support write only
      mappings in-line and consistent with the rest.  I've verified the patch on
      ia64, x86_64 and x86.
      
      Additional discussion:
      
      Several architectures, including x86, cannot support write-only mappings.
      The pte for x86 reserves a single bit for protection and its two states are
      read only or read/write.  Thus, write only is not supported in h/w.
      
      Currently, if I 'mmap' a page write-only, the first read attempt on that page
      creates a page fault and will SEGV.  That check is enforced in
      arch/blah/mm/fault.c.  However, if I first write that page it will fault in
      and the pte will be set to read/write.  Thus, any subsequent reads to the page
      will succeed.  It is this inconsistency in behavior that this patch is
      attempting to address.  Furthermore, if the page is swapped out and then
      brought back, the first read will also cause a SEGV.  Thus, any arbitrary read
      on a page can potentially result in a SEGV.
      
      According to the SUSv3 spec, "if the application requests only PROT_WRITE, the
      implementation may also allow read access." Also, as mentioned, some
      architectures, such as alpha (shown above), already take the approach that I
      am suggesting.
      
      The counter-argument to this, raised by Arjan, is that the kernel is enforcing
      the write-only mapping the best it can given the h/w limitations.  This is
      true; however, Alan Cox and I would argue that the inconsistency in
      behavior, where applications sometimes work and sometimes fail, is highly
      undesirable.  If you read through the thread, I think people came to an
      agreement on the last patch I posted, as nobody has objected to it...
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Andi Kleen <ak@muc.de>
      Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Ian Molton <spyro@f2s.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
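The behavioral change for a read fault can be modeled in a few lines. This is a sketch, not the patch itself: the `MODEL_VM_*` flags and both functions are hypothetical, and it assumes the pre-patch check on the affected architectures accepted a read only when the mapping was readable or executable, while the post-patch check follows the alpha code quoted in the message and also accepts writable mappings.

```c
#include <assert.h>

/* Hypothetical model of the vma permission bits. */
#define MODEL_VM_READ  0x1u
#define MODEL_VM_WRITE 0x2u
#define MODEL_VM_EXEC  0x4u
#define MODEL_VM_ALL   (MODEL_VM_READ | MODEL_VM_WRITE | MODEL_VM_EXEC)

/* Assumed pre-patch rule: a read fault on a write-only mapping is refused. */
static int model_read_fault_ok_strict(unsigned int vm_flags)
{
    assert((vm_flags & ~MODEL_VM_ALL) == 0);
    return (vm_flags & (MODEL_VM_READ | MODEL_VM_EXEC)) != 0;
}

/* Post-patch rule: any access permission, including write-only, allows reads. */
static int model_read_fault_ok_relaxed(unsigned int vm_flags)
{
    assert((vm_flags & ~MODEL_VM_ALL) == 0);
    return (vm_flags & MODEL_VM_ALL) != 0;
}
```

Under the strict rule, whether a read of a write-only page succeeds depends on whether a write already faulted the page in; the relaxed rule makes the outcome the same in both cases.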
  7. 26 Sep, 2006 2 commits
    • [PATCH] i386: Allow a kernel not to be in ring 0 · 78be3706
      Rusty Russell committed
      We allow for the fact that the guest kernel may not run in ring 0.  This
      requires some abstraction in a few places when setting %cs or checking
      privilege level (user vs kernel).
      
      This is Chris' [RFC PATCH 15/33] move segment checks to subarch, except rather
      than using #define USER_MODE_MASK which depends on a config option, we use
      Zach's more flexible approach of assuming ring 3 == userspace.  I also used
      "get_kernel_rpl()" over "get_kernel_cs()" because I think it reads better in
      the code...
      
      1) Remove the hardcoded 3 and introduce #define SEGMENT_RPL_MASK 3
      2) Add a get_kernel_rpl() macro, and don't assume it's zero.
      
      And:
      
      Clean up of patch for letting kernel run other than ring 0:
      
      a. Add some comments about the SEGMENT_IS_*_CODE() macros.
      b. Add a USER_RPL macro.  (Code was comparing a value to a mask
         in some places and to the magic number 3 in other places.)
      c. Add macros for table indicator field and use them.
      d. Change the entry.S tests for LDT stack segment to use the macros
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
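The selector tests described above can be sketched in userspace. The `MODEL_` macros and helper functions are hypothetical illustrations of the RPL mask, USER_RPL, and table indicator macros the message mentions; the bit positions (RPL in bits 0-1, table indicator in bit 2) follow the x86 segment selector layout, and the selector values in the usage note are illustrative.

```c
#include <assert.h>

/* x86 segment selector layout: bits 0-1 = RPL, bit 2 = table indicator. */
#define MODEL_SEGMENT_RPL_MASK 0x3u
#define MODEL_USER_RPL         0x3u   /* assume ring 3 == userspace */
#define MODEL_SEGMENT_TI_MASK  0x4u
#define MODEL_SEGMENT_LDT      0x4u
#define MODEL_SEGMENT_GDT      0x0u

/* Compare the masked RPL against USER_RPL instead of a magic 3. */
static int model_selector_is_user(unsigned int sel)
{
    assert(sel <= 0xffffu);            /* selectors are 16 bits wide */
    return (sel & MODEL_SEGMENT_RPL_MASK) == MODEL_USER_RPL;
}

/* Test the table indicator bit to see whether the selector refers to the LDT. */
static int model_selector_uses_ldt(unsigned int sel)
{
    assert(sel <= 0xffffu);
    return (sel & MODEL_SEGMENT_TI_MASK) == MODEL_SEGMENT_LDT;
}
```

For example, a selector such as 0x73 (RPL 3, GDT) is classified as user mode, while 0x60 (RPL 0, GDT) is not. A guest kernel running in ring 1 would have a nonzero kernel RPL yet still fail the user test, which is why the check compares against ring 3 rather than against zero.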
    • [PATCH] i386: make fault notifier unconditional and export it · 474c2568
      Andi Kleen committed
      It's needed for external debuggers and overhead is very small.
      
      Also make the actual notifier chain they use static.
      
      Cc: jbeulich@novell.com
      Signed-off-by: Andi Kleen <ak@suse.de>
  8. 01 Jul, 2006 1 commit
  9. 27 Jun, 2006 1 commit
  10. 23 Jun, 2006 2 commits
  11. 23 Mar, 2006 3 commits
    • [PATCH] pause_on_oops command line option · dd287796
      Andrew Morton committed
      Attempt to fix the problem wherein people's oops reports scroll off the screen
      due to repeated oopsing or to oopses on other CPUs.
      
      If this happens the user can reboot with the `pause_on_oops=<seconds>' option.
      It will allow the first oopsing CPU to print an oops record just a single
      time.  Second oopsing attempts, or oopses on other CPUs will cause those CPUs
      to enter a tight loop until the specified number of seconds have elapsed.
      
      The patch implements the infrastructure generically in the expectation that
      architectures other than x86 will find it useful.
      
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] make bug messages more consistent · 91368d73
      Ingo Molnar committed
      Consolidate all kernel bug printouts to begin with the "BUG: " string.
      Makes it easier to find them in large bootup logs.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i386: actively synchronize vmalloc area when registering certain callbacks · 101f12af
      Jan Beulich committed
      Registering a callback handler through register_die_notifier() is obviously
      primarily intended for use by modules.  However, the way these currently
      get called, it is basically impossible for them to actually be used by
      modules, as there is, on non-PAE configurations, a good chance (the larger
      the module, the better) for the system to crash as a result.
      
      This is because the callback gets invoked
      
      (a) in the page fault path before the top level page table propagation
          gets carried out (hence a fault to propagate the top level page table
          entry/entries mapping to module's code/data would nest infinitely) and

      (b) in the NMI path, where nested faults must absolutely not happen,
          since otherwise the IRET from the nested fault re-enables NMIs,
          potentially resulting in nested NMI occurrences.
      
      Besides the modular aspect, similar problems would even arise for
      in-kernel consumers of the API if they touched ioremap()ed or vmalloc()ed
      memory inside their handlers.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  12. 31 Oct, 2005 1 commit
  13. 08 Sep, 2005 1 commit
  14. 05 Sep, 2005 2 commits
    • [PATCH] i386: inline asm cleanup · 4bb0d3ec
      Zachary Amsden committed
      i386 Inline asm cleanup.  Use cr/dr accessor functions.
      
      Also, a potential bugfix.  Also, some CR accessors really should be volatile.
      Reads from CR0 (numeric state may change in an exception handler), writes to
      CR4 (flipping CR4.TSD) and reads from CR2 (page fault) prevent instruction
      re-ordering.  I did not add memory clobber to CR3 / CR4 / CR0 updates, as it
      was not there to begin with, and in no case should kernel memory be clobbered,
      except when doing a TLB flush, which already has memory clobber.
      
      I noticed that page invalidation does not have a memory clobber.  I can't find
      a bug as a result, but there is definitely a potential for a bug here:
      
      #define __flush_tlb_single(addr) \
      	__asm__ __volatile__("invlpg %0": :"m" (*(char *) addr))
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86: compress the stack layout of do_page_fault() · 869f96a0
      Ingo Molnar committed
      This patch pushes the creation of a rare signal frame (SIGBUS or SIGSEGV)
      into a separate function, thus saving stackspace in the main
      do_page_fault() stackframe.  The effect is 132 bytes less of stack used by
      the typical do_page_fault() invocation - resulting in a denser
      cache-layout.
      
      (Another minor effect is that in case of kernel crashes that come from a
      pagefault, we add less space to the already existing frame, giving the
      crash functions a slightly higher chance to do their stuff without
      overflowing the stack.)
      
      (The changes also result in slightly cleaner code.)
      
      argument bugfix from "Guillaume C." <guichaz@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
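The stack-saving technique described above, moving a large and rarely needed local into a separate non-inlined function, can be sketched in userspace. Everything here is a hypothetical model: `struct model_siginfo` stands in for the signal information structure, the signal number 11 is illustrative, and `__attribute__((noinline))` is a GCC/Clang extension used so the compiler keeps the large local out of the caller's frame.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a large signal-info structure. */
struct model_siginfo {
    int signo;
    int code;
    void *addr;
    char pad[112];
};

static int model_last_signo;   /* records the "delivered" signal */

/*
 * Rare path: the big structure lives only in this function's frame.
 * noinline prevents the compiler from merging it into the hot caller.
 */
__attribute__((noinline))
static void model_force_sig_info_fault(int signo, int code, void *addr)
{
    struct model_siginfo info;

    assert(signo > 0);
    memset(&info, 0, sizeof(info));
    info.signo = signo;
    info.code = code;
    info.addr = addr;
    model_last_signo = info.signo;   /* stand-in for signal delivery */
}

/* Hot path: the common case returns without reserving siginfo space. */
static int model_handle_fault(void *addr, int mapped)
{
    if (mapped)
        return 0;
    model_force_sig_info_fault(11, 1, addr);
    return -1;
}
```

The frame of `model_handle_fault()` no longer has to reserve room for the structure on every call, which is the effect the commit message attributes to its 132-byte saving.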
  15. 26 Jun, 2005 2 commits
  16. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!