1. 24 8月, 2010 1 次提交
    • M
      [S390] fix tlb flushing vs. concurrent /proc accesses · 050eef36
      Martin Schwidefsky 提交于
      The tlb flushing code uses the mm_users field of the mm_struct to
      decide if each page table entry needs to be flushed individually with
      IPTE or if a global flush for the mm_struct is sufficient after all page
      table updates have been done. The comment for mm_users says "How many
      users with user space?" but the /proc code increases mm_users after it
      found the process structure by pid without creating a new user process.
      Which makes mm_users useless for the decision between the two tlb
      flusing methods. The current code can be confused to not flush tlb
      entries by a concurrent access to /proc files if e.g. a fork is in
      progres. The solution for this problem is to make the tlb flushing
      logic independent from the mm_users field.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      050eef36
  2. 07 12月, 2009 1 次提交
    • M
      [S390] Improve address space mode selection. · b11b5334
      Martin Schwidefsky 提交于
      Introduce user_mode to replace the two variables switch_amode and
      s390_noexec. There are three valid combinations of the old values:
        1) switch_amode == 0 && s390_noexec == 0
        2) switch_amode == 1 && s390_noexec == 0
        3) switch_amode == 1 && s390_noexec == 1
      They get replaced by
        1) user_mode == HOME_SPACE_MODE
        2) user_mode == PRIMARY_SPACE_MODE
        3) user_mode == SECONDARY_SPACE_MODE
      The new kernel parameter user_mode=[primary,secondary,home] lets
      you choose the address space mode the user space processes should
      use. In addition the CONFIG_S390_SWITCH_AMODE config option
      is removed.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      b11b5334
  3. 26 3月, 2009 1 次提交
  4. 28 10月, 2008 1 次提交
    • C
      [S390] pgtables: Fix race in enable_sie vs. page table ops · 250cf776
      Christian Borntraeger 提交于
      The current enable_sie code sets the mm->context.pgstes bit to tell
      dup_mm that the new mm should have extended page tables. This bit is also
      used by the s390 specific page table primitives to decide about the page
      table layout - which means context.pgstes has two meanings. This can cause
      any kind of bugs. For example  - e.g. shrink_zone can call
      ptep_clear_flush_young while enable_sie is running. ptep_clear_flush_young
      will test for context.pgstes. Since enable_sie changed that value of the old
      struct mm without changing the page table layout ptep_clear_flush_young will
      do the wrong thing.
      The solution is to split pgstes into two bits
      - one for the allocation
      - one for the current state
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      250cf776
  5. 02 8月, 2008 1 次提交
  6. 27 4月, 2008 1 次提交
    • C
      s390: KVM preparation: provide hook to enable pgstes in user pagetable · 402b0862
      Carsten Otte 提交于
      The SIE instruction on s390 uses the 2nd half of the page table page to
      virtualize the storage keys of a guest. This patch offers the s390_enable_sie
      function, which reorganizes the page tables of a single-threaded process to
      reserve space in the page table:
      s390_enable_sie makes sure that the process is single threaded and then uses
      dup_mm to create a new mm with reorganized page tables. The old mm is freed
      and the process has now a page status extended field after every page table.
      
      Code that wants to exploit pgstes should SELECT CONFIG_PGSTE.
      
      This patch has a small common code hit, namely making dup_mm non-static.
      
      Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's
      review feedback. Now we do have the prototype for dup_mm in
      include/linux/sched.h. Following Martin's suggestion, s390_enable_sie() does now
      call task_lock() to prevent race against ptrace modification of mm_users.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NCarsten Otte <cotte@de.ibm.com>
      Acked-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAvi Kivity <avi@qumranet.com>
      402b0862
  7. 10 2月, 2008 3 次提交
  8. 26 1月, 2008 1 次提交
    • M
      [S390] Fix tlb flushing with idte. · 6f457e1a
      Martin Schwidefsky 提交于
      The clear-by-asce operation of the idte instruction gets an asce
      (address-space-control-element) as argument to specify which TLBs
      need to get flushed. The current code passes a plain pointer to
      the start of the pgd without the additional bits which would make
      the pointer an asce. The current machines don't mind the difference
      but a future model might want to use the designation type control
      bits in the asce as a filter for the TLBs to flush.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      6f457e1a
  9. 22 10月, 2007 1 次提交
  10. 03 5月, 2007 1 次提交
    • J
      [PATCH] x86: PARAVIRT: add hooks to intercept mm creation and destruction · d6dd61c8
      Jeremy Fitzhardinge 提交于
      Add hooks to allow a paravirt implementation to track the lifetime of
      an mm.  Paravirtualization requires three hooks, but only two are
      needed in common code.  They are:
      
      arch_dup_mmap, which is called when a new mmap is created at fork
      
      arch_exit_mmap, which is called when the last process reference to an
        mm is dropped, which typically happens on exit and exec.
      
      The third hook is activate_mm, which is called from the arch-specific
      activate_mm() macro/function, and so doesn't need stub versions for
      other architectures.  It's called when an mm is first used.
      Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: linux-arch@vger.kernel.org
      Cc: James Bottomley <James.Bottomley@SteelEye.com>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      d6dd61c8
  11. 06 2月, 2007 1 次提交
    • G
      [S390] noexec protection · c1821c2e
      Gerald Schaefer 提交于
      This provides a noexec protection on s390 hardware. Our hardware does
      not have any bits left in the pte for a hw noexec bit, so this is a
      different approach using shadow page tables and a special addressing
      mode that allows separate address spaces for code and data.
      
      As a special feature of our "secondary-space" addressing mode, separate
      page tables can be specified for the translation of data addresses
      (storage operands) and instruction addresses. The shadow page table is
      used for the instruction addresses and the standard page table for the
      data addresses.
      The shadow page table is linked to the standard page table by a pointer
      in page->lru.next of the struct page corresponding to the page that
      contains the standard page table (since page->private is not really
      private with the pte_lock and the page table pages are not in the LRU
      list).
      Depending on the software bits of a pte, it is either inserted into
      both page tables or just into the standard (data) page table. Pages of
      a vma that does not have the VM_EXEC bit set get mapped only in the
      data address space. Any try to execute code on such a page will cause a
      page translation exception. The standard reaction to this is a SIGSEGV
      with two exceptions: the two system call opcodes 0x0a77 (sys_sigreturn)
      and 0x0aad (sys_rt_sigreturn) are allowed. They are stored by the
      kernel to the signal stack frame. Unfortunately, the signal return
      mechanism cannot be modified to use an SA_RESTORER because the
      exception unwinding code depends on the system call opcode stored
      behind the signal stack frame.
      
      This feature requires that user space is executed in secondary-space
      mode and the kernel in home-space mode, which means that the addressing
      modes need to be switched and that the noexec protection only works
      for user space.
      After switching the addressing modes, we cannot use the mvcp/mvcs
      instructions anymore to copy between kernel and user space. A new
      mvcos instruction has been added to the z9 EC/BC hardware which allows
      to copy between arbitrary address spaces, but on older hardware the
      page tables need to be walked manually.
      Signed-off-by: NGerald Schaefer <geraldsc@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      c1821c2e
  12. 09 11月, 2005 1 次提交
  13. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4