1. 13 12月, 2012 1 次提交
  2. 20 9月, 2012 1 次提交
    • A
      x86: get rid of TIF_IRET hackery · e76623d6
      Al Viro 提交于
      TIF_NOTIFY_RESUME will work in precisely the same way; all that
      is achieved by TIF_IRET is appearing that there's some work to be
      done, so we end up on the iret exit path.  Just use NOTIFY_RESUME.
      And for execve() do that in 32bit start_thread(), not sys_execve()
      itself.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      e76623d6
  3. 06 6月, 2012 1 次提交
  4. 22 3月, 2012 1 次提交
    • A
      mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode · 1a5a9906
      Andrea Arcangeli 提交于
      In some cases it may happen that pmd_none_or_clear_bad() is called with
      the mmap_sem hold in read mode.  In those cases the huge page faults can
      allocate hugepmds under pmd_none_or_clear_bad() and that can trigger a
      false positive from pmd_bad() that will not like to see a pmd
      materializing as trans huge.
      
      It's not khugepaged causing the problem, khugepaged holds the mmap_sem
      in write mode (and all those sites must hold the mmap_sem in read mode
      to prevent pagetables to go away from under them, during code review it
      seems vm86 mode on 32bit kernels requires that too unless it's
      restricted to 1 thread per process or UP builds).  The race is only with
      the huge pagefaults that can convert a pmd_none() into a
      pmd_trans_huge().
      
      Effectively all these pmd_none_or_clear_bad() sites running with
      mmap_sem in read mode are somewhat speculative with the page faults, and
      the result is always undefined when they run simultaneously.  This is
      probably why it wasn't common to run into this.  For example if the
      madvise(MADV_DONTNEED) runs zap_page_range() shortly before the page
      fault, the hugepage will not be zapped, if the page fault runs first it
      will be zapped.
      
      Altering pmd_bad() not to error out if it finds hugepmds won't be enough
      to fix this, because zap_pmd_range would then proceed to call
      zap_pte_range (which would be incorrect if the pmd become a
      pmd_trans_huge()).
      
      The simplest way to fix this is to read the pmd in the local stack
      (regardless of what we read, no need of actual CPU barriers, only
      compiler barrier needed), and be sure it is not changing under the code
      that computes its value.  Even if the real pmd is changing under the
      value we hold on the stack, we don't care.  If we actually end up in
      zap_pte_range it means the pmd was not none already and it was not huge,
      and it can't become huge from under us (khugepaged locking explained
      above).
      
      All we need is to enforce that there is no way anymore that in a code
      path like below, pmd_trans_huge can be false, but pmd_none_or_clear_bad
      can run into a hugepmd.  The overhead of a barrier() is just a compiler
      tweak and should not be measurable (I only added it for THP builds).  I
      don't exclude different compiler versions may have prevented the race
      too by caching the value of *pmd on the stack (that hasn't been
      verified, but it wouldn't be impossible considering
      pmd_none_or_clear_bad, pmd_bad, pmd_trans_huge, pmd_none are all inlines
      and there's no external function called in between pmd_trans_huge and
      pmd_none_or_clear_bad).
      
      		if (pmd_trans_huge(*pmd)) {
      			if (next-addr != HPAGE_PMD_SIZE) {
      				VM_BUG_ON(!rwsem_is_locked(&tlb->mm->mmap_sem));
      				split_huge_page_pmd(vma->vm_mm, pmd);
      			} else if (zap_huge_pmd(tlb, vma, pmd, addr))
      				continue;
      			/* fall through */
      		}
      		if (pmd_none_or_clear_bad(pmd))
      
      Because this race condition could be exercised without special
      privileges this was reported in CVE-2012-1179.
      
      The race was identified and fully explained by Ulrich who debugged it.
      I'm quoting his accurate explanation below, for reference.
      
      ====== start quote =======
            mapcount 0 page_mapcount 1
            kernel BUG at mm/huge_memory.c:1384!
      
          At some point prior to the panic, a "bad pmd ..." message similar to the
          following is logged on the console:
      
            mm/memory.c:145: bad pmd ffff8800376e1f98(80000000314000e7).
      
          The "bad pmd ..." message is logged by pmd_clear_bad() before it clears
          the page's PMD table entry.
      
              143 void pmd_clear_bad(pmd_t *pmd)
              144 {
          ->  145         pmd_ERROR(*pmd);
              146         pmd_clear(pmd);
              147 }
      
          After the PMD table entry has been cleared, there is an inconsistency
          between the actual number of PMD table entries that are mapping the page
          and the page's map count (_mapcount field in struct page). When the page
          is subsequently reclaimed, __split_huge_page() detects this inconsistency.
      
             1381         if (mapcount != page_mapcount(page))
             1382                 printk(KERN_ERR "mapcount %d page_mapcount %d\n",
             1383                        mapcount, page_mapcount(page));
          -> 1384         BUG_ON(mapcount != page_mapcount(page));
      
          The root cause of the problem is a race of two threads in a multithreaded
          process. Thread B incurs a page fault on a virtual address that has never
          been accessed (PMD entry is zero) while Thread A is executing an madvise()
          system call on a virtual address within the same 2 MB (huge page) range.
      
                     virtual address space
                    .---------------------.
                    |                     |
                    |                     |
                  .-|---------------------|
                  | |                     |
                  | |                     |<-- B(fault)
                  | |                     |
            2 MB  | |/////////////////////|-.
            huge <  |/////////////////////|  > A(range)
            page  | |/////////////////////|-'
                  | |                     |
                  | |                     |
                  '-|---------------------|
                    |                     |
                    |                     |
                    '---------------------'
      
          - Thread A is executing an madvise(..., MADV_DONTNEED) system call
            on the virtual address range "A(range)" shown in the picture.
      
          sys_madvise
            // Acquire the semaphore in shared mode.
            down_read(&current->mm->mmap_sem)
            ...
            madvise_vma
              switch (behavior)
              case MADV_DONTNEED:
                   madvise_dontneed
                     zap_page_range
                       unmap_vmas
                         unmap_page_range
                           zap_pud_range
                             zap_pmd_range
                               //
                               // Assume that this huge page has never been accessed.
                               // I.e. content of the PMD entry is zero (not mapped).
                               //
                               if (pmd_trans_huge(*pmd)) {
                                   // We don't get here due to the above assumption.
                               }
                               //
                               // Assume that Thread B incurred a page fault and
                   .---------> // sneaks in here as shown below.
                   |           //
                   |           if (pmd_none_or_clear_bad(pmd))
                   |               {
                   |                 if (unlikely(pmd_bad(*pmd)))
                   |                     pmd_clear_bad
                   |                     {
                   |                       pmd_ERROR
                   |                         // Log "bad pmd ..." message here.
                   |                       pmd_clear
                   |                         // Clear the page's PMD entry.
                   |                         // Thread B incremented the map count
                   |                         // in page_add_new_anon_rmap(), but
                   |                         // now the page is no longer mapped
                   |                         // by a PMD entry (-> inconsistency).
                   |                     }
                   |               }
                   |
                   v
          - Thread B is handling a page fault on virtual address "B(fault)" shown
            in the picture.
      
          ...
          do_page_fault
            __do_page_fault
              // Acquire the semaphore in shared mode.
              down_read_trylock(&mm->mmap_sem)
              ...
              handle_mm_fault
                if (pmd_none(*pmd) && transparent_hugepage_enabled(vma))
                    // We get here due to the above assumption (PMD entry is zero).
                    do_huge_pmd_anonymous_page
                      alloc_hugepage_vma
                        // Allocate a new transparent huge page here.
                      ...
                      __do_huge_pmd_anonymous_page
                        ...
                        spin_lock(&mm->page_table_lock)
                        ...
                        page_add_new_anon_rmap
                          // Here we increment the page's map count (starts at -1).
                          atomic_set(&page->_mapcount, 0)
                        set_pmd_at
                          // Here we set the page's PMD entry which will be cleared
                          // when Thread A calls pmd_clear_bad().
                        ...
                        spin_unlock(&mm->page_table_lock)
      
          The mmap_sem does not prevent the race because both threads are acquiring
          it in shared mode (down_read).  Thread B holds the page_table_lock while
          the page's map count and PMD table entry are updated.  However, Thread A
          does not synchronize on that lock.
      
      ====== end quote =======
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Reported-by: NUlrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Jones <davej@redhat.com>
      Acked-by: NLarry Woodman <lwoodman@redhat.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Cc: <stable@vger.kernel.org>		[2.6.38+]
      Cc: Mark Salter <msalter@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1a5a9906
  5. 13 3月, 2012 1 次提交
  6. 18 1月, 2012 2 次提交
    • A
      x86-32: Fix build failure with AUDIT=y, AUDITSYSCALL=n · 6015ff10
      Al Viro 提交于
      JONGMAN HEO reports:
      
        With current linus git (commit a25a2b84), I got following build error,
      
        arch/x86/kernel/vm86_32.c: In function 'do_sys_vm86':
        arch/x86/kernel/vm86_32.c:340: error: implicit declaration of function '__audit_syscall_exit'
        make[3]: *** [arch/x86/kernel/vm86_32.o] Error 1
      
      OK, I can reproduce it (32bit allmodconfig with AUDIT=y, AUDITSYSCALL=n)
      
      It's due to commit d7e7528b: "Audit: push audit success and retcode
      into arch ptrace.h".
      Reported-by: NJONGMAN HEO <jongman.heo@samsung.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6015ff10
    • E
      Audit: push audit success and retcode into arch ptrace.h · d7e7528b
      Eric Paris 提交于
      The audit system previously expected arches calling to audit_syscall_exit to
      supply as arguments if the syscall was a success and what the return code was.
      Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things
      by converting from negative retcodes to an audit internal magic value stating
      success or failure.  This helper was wrong and could indicate that a valid
      pointer returned to userspace was a failed syscall.  The fix is to fix the
      layering foolishness.  We now pass audit_syscall_exit a struct pt_reg and it
      in turns calls back into arch code to collect the return value and to
      determine if the syscall was a success or failure.  We also define a generic
      is_syscall_success() macro which determines success/failure based on if the
      value is < -MAX_ERRNO.  This works for arches like x86 which do not use a
      separate mechanism to indicate syscall failure.
      
      We make both the is_syscall_success() and regs_return_value() static inlines
      instead of macros.  The reason is because the audit function must take a void*
      for the regs.  (uml calls theirs struct uml_pt_regs instead of just struct
      pt_regs so audit_syscall_exit can't take a struct pt_regs).  Since the audit
      function takes a void* we need to use static inlines to cast it back to the
      arch correct structure to dereference it.
      
      The other major change is that on some arches, like ia64, MIPS and ppc, we
      change regs_return_value() to give us the negative value on syscall failure.
      THE only other user of this macro, kretprobe_example.c, won't notice and it
      makes the value signed consistently for the audit functions across all archs.
      
      In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
      audit code as the return value.  But the ptrace_64.h code defined the macro
      regs_return_value() as regs[3].  I have no idea which one is correct, but this
      patch now uses the regs_return_value() function, so it now uses regs[3].
      
      For powerpc we previously used regs->result but now use the
      regs_return_value() function which uses regs->gprs[3].  regs->gprs[3] is
      always positive so the regs_return_value(), much like ia64 makes it negative
      before calling the audit code when appropriate.
      Signed-off-by: NEric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
      Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
      Acked-by: Richard Weinberger <richard@nod.at> [for uml]
      Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
      d7e7528b
  7. 14 1月, 2011 1 次提交
  8. 24 9月, 2010 1 次提交
    • B
      x86, vm86: Fix preemption bug for int1 debug and int3 breakpoint handlers. · 6554287b
      Bart Oldeman 提交于
      Impact: fix kernel bug such as:
      BUG: scheduling while atomic: dosemu.bin/19680/0x00000004
      See also Ubuntu bug 455067 at
      https://bugs.launchpad.net/ubuntu/+source/linux/+bug/455067
      
      Commits 4915a35e
      ("Use preempt_conditional_sti/cli in do_int3, like on x86_64.")
      and 3d2a71a5
      ("x86, traps: converge do_debug handlers")
      started disabling preemption in int1 and int3 handlers on i386.
      The problem with vm86 is that the call to handle_vm86_trap() may jump
      straight to entry_32.S and never returns so preempt is never enabled
      again, and there is an imbalance in the preempt count.
      
      Commit be716615 ("x86, vm86:
      fix preemption bug"), which was later (accidentally?) reverted by commit
      08d68323 ("hw-breakpoints: modifying
      generic debug exception to use thread-specific debug registers")
      fixed the problem for debug exceptions but not for breakpoints.
      
      There are three solutions to this problem.
      
      1. Reenable preemption before calling handle_vm86_trap(). This
      was the approach that was later reverted.
      
      2. Do not disable preemption for i386 in breakpoint and debug handlers.
      This was the situation before October 2008. As far as I understand
      preemption only needs to be disabled on x86_64 because a seperate stack is
      used, but it's nice to have things work the same way on
      i386 and x86_64.
      
      3. Let handle_vm86_trap() return instead of jumping to assembly code.
      By setting a flag in _TIF_WORK_MASK, either TIF_IRET or TIF_NOTIFY_RESUME,
      the code in entry_32.S is instructed to return to 32 bit mode from
      V86 mode. The logic in entry_32.S was already present to handle signals.
      (I chose TIF_IRET because it's slightly more efficient in
      do_notify_resume() in signal.c, but in fact TIF_IRET can probably be
      replaced by TIF_NOTIFY_RESUME everywhere.)
      
      I'm submitting approach 3, because I believe it is the most elegant
      and prevents future confusion. Still, an obvious
      preempt_conditional_cli(regs); is necessary in traps.c to correct the
      bug.
      
      [ hpa: This is technically a regression, but because:
        1. the regression is so old,
        2. the patch seems relatively high risk, justifying more testing, and
        3. we're late in the 2.6.36-rc cycle,
      
        I'm queuing it up for the 2.6.37 merge window.  It might, however,
        justify as a -stable backport at a latter time, hence Cc: stable. ]
      Signed-off-by: NBart Oldeman <bartoldeman@users.sourceforge.net>
      LKML-Reference: <alpine.DEB.2.00.1009231312330.4732@localhost.localdomain>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Cc: <stable@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      6554287b
  9. 10 12月, 2009 1 次提交
  10. 07 6月, 2009 1 次提交
  11. 07 5月, 2009 1 次提交
  12. 12 2月, 2009 1 次提交
    • B
      x86: use regparm(3) for passed-in pt_regs pointer · b12bdaf1
      Brian Gerst 提交于
      Some syscalls need to access the pt_regs structure, either to copy
      user register state or to modifiy it.  This patch adds stubs to load
      the address of the pt_regs struct into the %eax register, and changes
      the syscalls to take the pointer as an argument instead of relying on
      the assumption that the pt_regs structure overlaps the function
      arguments.
      
      Drop the use of regparm(1) due to concern about gcc bugs, and to move
      in the direction of the eventual removal of regparm(0) for asmlinkage.
      Signed-off-by: NBrian Gerst <brgerst@gmail.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      b12bdaf1
  13. 11 2月, 2009 1 次提交
  14. 10 2月, 2009 1 次提交
    • T
      x86: add %gs accessors for x86_32 · d9a89a26
      Tejun Heo 提交于
      Impact: cleanup
      
      On x86_32, %gs is handled lazily.  It's not saved and restored on
      kernel entry/exit but only when necessary which usually is during task
      switch but there are few other places.  Currently, it's done by
      calling savesegment() and loadsegment() explicitly.  Define
      get_user_gs(), set_user_gs() and task_user_gs() and use them instead.
      
      While at it, clean up register access macros in signal.c.
      
      This cleans up code a bit and will help future changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d9a89a26
  15. 22 7月, 2008 1 次提交
  16. 17 4月, 2008 4 次提交
  17. 30 1月, 2008 5 次提交
  18. 14 10月, 2007 1 次提交
    • D
      Delete filenames in comments. · 835c34a1
      Dave Jones 提交于
      Since the x86 merge, lots of files that referenced their own filenames
      are no longer correct.  Rather than keep them up to date, just delete
      them, as they add no real value.
      
      Additionally:
      - fix up comment formatting in scx200_32.c
      - Remove a credit from myself in setup_64.c from a time when we had no SCM
      - remove longwinded history from tsc_32.c which can be figured out from
        git.
      Signed-off-by: NDave Jones <davej@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      835c34a1
  19. 11 10月, 2007 2 次提交
  20. 09 5月, 2007 1 次提交
  21. 13 2月, 2007 1 次提交
  22. 07 12月, 2006 1 次提交
    • J
      [PATCH] i386: Update sys_vm86 to cope with changed pt_regs and %gs usage · 49d26b6e
      Jeremy Fitzhardinge 提交于
      sys_vm86 uses a struct kernel_vm86_regs, which is identical to pt_regs, but
      adds an extra space for all the segment registers.  Previously this structure
      was completely independent, so changes in pt_regs had to be reflected in
      kernel_vm86_regs.  This changes just embeds pt_regs in kernel_vm86_regs, and
      makes the appropriate changes to vm86.c to deal with the new naming.
      
      Also, since %gs is dealt with differently in the kernel, this change adjusts
      vm86.c to reflect this.
      
      While making these changes, I also cleaned up some frankly bizarre code which
      was added when auditing was added to sys_vm86.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: Chuck Ebbert <76306.1226@compuserve.com>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      49d26b6e
  23. 05 10月, 2006 1 次提交
    • D
      IRQ: Maintain regs pointer globally rather than passing to IRQ handlers · 7d12e780
      David Howells 提交于
      Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
      of passing regs around manually through all ~1800 interrupt handlers in the
      Linux kernel.
      
      The regs pointer is used in few places, but it potentially costs both stack
      space and code to pass it around.  On the FRV arch, removing the regs parameter
      from all the genirq function results in a 20% speed up of the IRQ exit path
      (ie: from leaving timer_interrupt() to leaving do_IRQ()).
      
      Where appropriate, an arch may override the generic storage facility and do
      something different with the variable.  On FRV, for instance, the address is
      maintained in GR28 at all times inside the kernel as part of general exception
      handling.
      
      Having looked over the code, it appears that the parameter may be handed down
      through up to twenty or so layers of functions.  Consider a USB character
      device attached to a USB hub, attached to a USB controller that posts its
      interrupts through a cascaded auxiliary interrupt controller.  A character
      device driver may want to pass regs to the sysrq handler through the input
      layer which adds another few layers of parameter passing.
      
      I've build this code with allyesconfig for x86_64 and i386.  I've runtested the
      main part of the code on FRV and i386, though I can't test most of the drivers.
      I've also done partial conversion for powerpc and MIPS - these at least compile
      with minimal configurations.
      
      This will affect all archs.  Mostly the changes should be relatively easy.
      Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
      
      	struct pt_regs *old_regs = set_irq_regs(regs);
      
      And put the old one back at the end:
      
      	set_irq_regs(old_regs);
      
      Don't pass regs through to generic_handle_irq() or __do_IRQ().
      
      In timer_interrupt(), this sort of change will be necessary:
      
      	-	update_process_times(user_mode(regs));
      	-	profile_tick(CPU_PROFILING, regs);
      	+	update_process_times(user_mode(get_irq_regs()));
      	+	profile_tick(CPU_PROFILING);
      
      I'd like to move update_process_times()'s use of get_irq_regs() into itself,
      except that i386, alone of the archs, uses something other than user_mode().
      
      Some notes on the interrupt handling in the drivers:
      
       (*) input_dev() is now gone entirely.  The regs pointer is no longer stored in
           the input_dev struct.
      
       (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking.  It does
           something different depending on whether it's been supplied with a regs
           pointer or not.
      
       (*) Various IRQ handler function pointers have been moved to type
           irq_handler_t.
      Signed-Off-By: NDavid Howells <dhowells@redhat.com>
      (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
      7d12e780
  24. 01 7月, 2006 1 次提交
  25. 01 5月, 2006 1 次提交
  26. 21 3月, 2006 1 次提交
    • J
      [PATCH] make vm86 call audit_syscall_exit · 7e7f8a03
      Jason Baron 提交于
      hi,
      
      The motivation behind the patch below was to address messages in
      /var/log/messages such as:
      
      Jan 31 10:54:15 mets kernel: audit(:0): major=252 name_count=0: freeing
      multiple contexts (1)
      Jan 31 10:54:15 mets kernel: audit(:0): major=113 name_count=0: freeing
      multiple contexts (2)
      
      I can reproduce by running 'get-edid' from:
      http://john.fremlin.de/programs/linux/read-edid/.
      
      These messages come about in the log b/c the vm86 calls do not exit via
      the normal system call exit paths and thus do not call
      'audit_syscall_exit'. The next system call will then free the context for
      itself and for the vm86 context, thus generating the above messages. This
      patch addresses the issue by simply adding a call to 'audit_syscall_exit'
      from the vm86 code.
      
      Besides fixing the above error messages the patch also now allows vm86
      system calls to become auditable. This is useful since strace does not
      appear to properly record the return values from sys_vm86.
      
      I think this patch is also a step in the right direction in terms of
      cleaning up some core auditing code. If we can correct any other paths
      that do not properly call the audit exit and entries points, then we can
      also eliminate the notion of context chaining.
      
      I've tested this patch by verifying that the log messages no longer
      appear, and that the audit records for sys_vm86 appear to be correct.
      Also, 'read_edid' produces itentical output.
      
      thanks,
      
      -Jason
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      7e7f8a03
  27. 15 1月, 2006 1 次提交
  28. 13 1月, 2006 1 次提交
  29. 12 1月, 2006 1 次提交
  30. 30 10月, 2005 1 次提交
    • H
      [PATCH] mm: i386 sh sh64 ready for split ptlock · 60ec5585
      Hugh Dickins 提交于
      Use pte_offset_map_lock, instead of pte_offset_map (or inappropriate
      pte_offset_kernel) and mm-wide page_table_lock, in sundry arch places.
      
      The i386 vm86 mark_screen_rdonly: yes, there was and is an assumption that the
      screen fits inside the one page table, as indeed it does.
      
      The sh __do_page_fault: which handles both kernel faults (without lock) and
      user mm faults (locked - though it set_pte without locking before).
      
      The sh64 flush_cache_range and helpers: which wrongly thought callers held
      page_table_lock before (only its tlb_start_vma did, and no longer does so);
      moved the flush loop down, and adjusted the large versus small range decision
      to consider a range which spans page tables as large.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Acked-by: NPaul Mundt <lethal@linux-sh.org>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      60ec5585
  31. 05 9月, 2005 1 次提交