1. 17 5月, 2013 1 次提交
  2. 03 5月, 2013 1 次提交
  3. 30 4月, 2013 1 次提交
    • G
      mm/hugetlb: add more arch-defined huge_pte functions · 106c992a
      Gerald Schaefer 提交于
      Commit abf09bed ("s390/mm: implement software dirty bits")
      introduced another difference in the pte layout vs.  the pmd layout on
      s390, thoroughly breaking the s390 support for hugetlbfs.  This requires
      replacing some more pte_xxx functions in mm/hugetlbfs.c with a
      huge_pte_xxx version.
      
      This patch introduces those huge_pte_xxx functions and their generic
      implementation in asm-generic/hugetlb.h, which will now be included on
      all architectures supporting hugetlbfs apart from s390.  This change
      will be a no-op for those architectures.
      
      [akpm@linux-foundation.org: fix warning]
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>	[for !s390 parts]
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      106c992a
  4. 23 4月, 2013 1 次提交
  5. 17 4月, 2013 1 次提交
    • L
      s390: move dummy io_remap_pfn_range() to asm/pgtable.h · 4f2e2903
      Linus Torvalds 提交于
      Commit b4cbb197 ("vm: add vm_iomap_memory() helper function") added
      a helper function wrapper around io_remap_pfn_range(), and every other
      architecture defined it in <asm/pgtable.h>.
      
      The s390 choice of <asm/io.h> may make sense, but is not very convenient
      for this case, and gratuitous differences like that cause unexpected errors like this:
      
         mm/memory.c: In function 'vm_iomap_memory':
         mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration]
      
      Glory be the kbuild test robot who noticed this, bisected it, and
      reported it to the guilty parties (ie me).
      
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4f2e2903
  6. 02 4月, 2013 2 次提交
    • H
      s390/mm: provide emtpy check_pgt_cache() function · 765a0cac
      Heiko Carstens 提交于
      All architectures need to provide a check_pgt_cache() function. The s390 one
      got lost somewhere.
      So reintroduce it to prevent future compile errors e.g. if Thomas Gleixner's
      idle loop rework patches get merged.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      765a0cac
    • H
      s390/uaccess: fix page table walk · ea81531d
      Heiko Carstens 提交于
      When translating user space addresses to kernel addresses the follow_table()
      function had two bugs:
      
      - PROT_NONE mappings could be read accessed via the kernel mapping. That is
        e.g. putting a filename into a user page, then protecting the page with
        PROT_NONE and afterwards issuing the "open" syscall with a pointer to
        the filename would incorrectly succeed.
      
      - when walking the page tables it used the pgd/pud/pmd/pte primitives which
        with dynamic page tables give no indication which real level of page tables
        is being walked (region2, region3, segment or page table). So in case of an
        exception the translation exception code passed to __handle_fault() is not
        necessarily correct.
        This is not really an issue since __handle_fault() doesn't evaluate the code.
        Only in case of e.g. a SIGBUS this code gets passed to user space. If user
        space can do something sane with the value is a different question though.
      
      To fix these issues don't use any Linux primitives. Only walk the page tables
      like the hardware would do it, however we leave quite some checks away since
      we know that we only have full size page tables and each index is within bounds.
      
      In theory this should fix all issues...
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Reviewed-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      ea81531d
  7. 08 3月, 2013 1 次提交
  8. 28 2月, 2013 1 次提交
  9. 14 2月, 2013 2 次提交
    • M
      s390/mm: implement software dirty bits · abf09bed
      Martin Schwidefsky 提交于
      The s390 architecture is unique in respect to dirty page detection,
      it uses the change bit in the per-page storage key to track page
      modifications. All other architectures track dirty bits by means
      of page table entries. This property of s390 has caused numerous
      problems in the past, e.g. see git commit ef5d437f
      "mm: fix XFS oops due to dirty pages without buffers on s390".
      
      To avoid future issues in regard to per-page dirty bits convert
      s390 to a fault based software dirty bit detection mechanism. All
      user page table entries which are marked as clean will be hardware
      read-only, even if the pte is supposed to be writable. A write by
      the user process will trigger a protection fault which will cause
      the user pte to be marked as dirty and the hardware read-only bit
      is removed.
      
      With this change the dirty bit in the storage key is irrelevant
      for Linux as a host, but the storage key is still required for
      KVM guests. The effect is that page_test_and_clear_dirty and the
      related code can be removed. The referenced bit in the storage
      key is still used by the page_test_and_clear_young primitive to
      provide page age information.
      
      For page cache pages of mappings with mapping_cap_account_dirty
      there will not be any change in behavior as the dirty bit tracking
      already uses read-only ptes to control the amount of dirty pages.
      Only for swap cache pages and pages of mappings without
      mapping_cap_account_dirty there can be additional protection faults.
      To avoid an excessive number of additional faults the mk_pte
      primitive checks for PageDirty if the pgprot value allows for writes
      and pre-dirties the pte. That avoids all additional faults for
      tmpfs and shmem pages until these pages are added to the swap cache.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      abf09bed
    • H
      s390/mm: provide PAGE_SHARED define · bddb7ae2
      Heiko Carstens 提交于
      Only needed to make some drivers compile...
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      bddb7ae2
  10. 22 1月, 2013 1 次提交
  11. 13 1月, 2013 1 次提交
  12. 13 12月, 2012 1 次提交
  13. 23 11月, 2012 2 次提交
  14. 26 10月, 2012 1 次提交
  15. 09 10月, 2012 6 次提交
  16. 20 7月, 2012 1 次提交
    • H
      s390/comments: unify copyright messages and remove file names · a53c8fab
      Heiko Carstens 提交于
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of sightly
      different statements and wanted to change them one after another
      whenever a file gets touched. However that never happened. Instead
      people start to take the old/"wrong" statements to use as a template
      for new files.
      So unify all of them in one go.
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      a53c8fab
  17. 24 5月, 2012 1 次提交
  18. 27 12月, 2011 1 次提交
    • M
      [S390] add support for physical memory > 4TB · 14045ebf
      Martin Schwidefsky 提交于
      The kernel address space of a 64 bit kernel currently uses a three level
      page table and the vmemmap array has a fixed address and a fixed maximum
      size. A three level page table is good enough for systems with less than
      3.8TB of memory, for bigger systems four page table levels need to be
      used. Each page table level costs a bit of performance, use 3 levels for
      normal systems and 4 levels only for the really big systems.
      To avoid bloating sparse.o too much set MAX_PHYSMEM_BITS to 46 for a
      maximum of 64TB of memory.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      14045ebf
  19. 01 12月, 2011 1 次提交
  20. 14 11月, 2011 1 次提交
  21. 30 10月, 2011 2 次提交
  22. 20 9月, 2011 1 次提交
  23. 24 7月, 2011 1 次提交
    • M
      [S390] kvm guest address space mapping · e5992f2e
      Martin Schwidefsky 提交于
      Add code that allows KVM to control the virtual memory layout that
      is seen by a guest. The guest address space uses a second page table
      that shares the last level pte-tables with the process page table.
      If a page is unmapped from the process page table it is automatically
      unmapped from the guest page table as well.
      
      The guest address space mapping starts out empty, KVM can map any
      individual 1MB segments from the process virtual memory to any 1MB
      aligned location in the guest virtual memory. If a target segment in
      the process virtual memory does not exist or is unmapped while a
      guest mapping exists the desired target address is stored as an
      invalid segment table entry in the guest page table.
      The population of the guest page table is fault driven.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      e5992f2e
  24. 06 6月, 2011 1 次提交
    • M
      [S390] fix kvm defines for 31 bit compile · 6c61cfe9
      Martin Schwidefsky 提交于
      KVM is not available for 31 bit but the KVM defines cause warnings:
      
      arch/s390/include/asm/pgtable.h: In function 'ptep_test_and_clear_user_dirty':
      arch/s390/include/asm/pgtable.h:817: warning: integer constant is too large for 'unsigned long' type
      arch/s390/include/asm/pgtable.h:818: warning: integer constant is too large for 'unsigned long' type
      arch/s390/include/asm/pgtable.h: In function 'ptep_test_and_clear_user_young':
      arch/s390/include/asm/pgtable.h:837: warning: integer constant is too large for 'unsigned long' type
      arch/s390/include/asm/pgtable.h:838: warning: integer constant is too large for 'unsigned long' type
      
      Add 31 bit versions of the KVM defines to remove the warnings.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      6c61cfe9
  25. 29 5月, 2011 1 次提交
    • H
      [S390] mm: fix storage key handling · a43a9d93
      Heiko Carstens 提交于
      page_get_storage_key() and page_set_storage_key() expect a page address
      and not its page frame number. This got inconsistent with 2d42552d
      "[S390] merge page_test_dirty and page_clear_dirty".
      
      Result is that we read/write storage keys from random pages and do not
      have a working dirty bit tracking at all.
      E.g. SetPageUpdate() doesn't clear the dirty bit of requested pages, which
      for example ext4 doesn't like very much and panics after a while.
      
      Unable to handle kernel paging request at virtual user address (null)
      Oops: 0004 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      Modules linked in:
      CPU: 1 Not tainted 2.6.39-07551-g139f37f5-dirty #152
      Process flush-94:0 (pid: 1576, task: 000000003eb34538, ksp: 000000003c287b70)
      Krnl PSW : 0704c00180000000 0000000000316b12 (jbd2_journal_file_inode+0x10e/0x138)
                 R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
      Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000 0700000000000000
                 0000000000316a62 000000003eb34cd0 0000000000000025 000000003c287b88
                 0000000000000001 000000003c287a70 000000003f1ec678 000000003f1ec000
                 0000000000000000 000000003e66ec00 0000000000316a62 000000003c287988
      Krnl Code: 0000000000316b04: f0a0000407f4       srp     4(11,%r0),2036,0
                 0000000000316b0a: b9020022           ltgr    %r2,%r2
                 0000000000316b0e: a7740015           brc     7,316b38
                >0000000000316b12: e3d0c0000024       stg     %r13,0(%r12)
                 0000000000316b18: 4120c010           la      %r2,16(%r12)
                 0000000000316b1c: 4130d060           la      %r3,96(%r13)
                 0000000000316b20: e340d0600004       lg      %r4,96(%r13)
                 0000000000316b26: c0e50002b567       brasl   %r14,36d5f4
      Call Trace:
      ([<0000000000316a62>] jbd2_journal_file_inode+0x5e/0x138)
       [<00000000002da13c>] mpage_da_map_and_submit+0x2e8/0x42c
       [<00000000002daac2>] ext4_da_writepages+0x2da/0x504
       [<00000000002597e8>] writeback_single_inode+0xf8/0x268
       [<0000000000259f06>] writeback_sb_inodes+0xd2/0x18c
       [<000000000025a700>] writeback_inodes_wb+0x80/0x168
       [<000000000025aa92>] wb_writeback+0x2aa/0x324
       [<000000000025abde>] wb_do_writeback+0xd2/0x274
       [<000000000025ae3a>] bdi_writeback_thread+0xba/0x1c4
       [<00000000001737be>] kthread+0xa6/0xb0
       [<000000000056c1da>] kernel_thread_starter+0x6/0xc
       [<000000000056c1d4>] kernel_thread_starter+0x0/0xc
      INFO: lockdep is turned off.
      Last Breaking-Event-Address:
       [<0000000000316a8a>] jbd2_journal_file_inode+0x86/0x138
      Reported-by: NSebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      a43a9d93
  26. 23 5月, 2011 3 次提交
    • M
      [S390] refactor page table functions for better pgste support · b2fa47e6
      Martin Schwidefsky 提交于
      Rework the architecture page table functions to access the bits in the
      page table extension array (pgste). There are a number of changes:
      1) Fix missing pgste update if the attach_count for the mm is <= 1.
      2) For every operation that affects the invalid bit in the pte or the
         rcp byte in the pgste the pcl lock needs to be acquired. The function
         pgste_get_lock gets the pcl lock and returns the current pgste value
         for a pte pointer. The function pgste_set_unlock stores the pgste
         and releases the lock. Between these two calls the bits in the pgste
         can be shuffled.
      3) Define two software bits in the pte _PAGE_SWR and _PAGE_SWC to avoid
         calling SetPageDirty and SetPageReferenced from pgtable.h. If the
         host reference backup bit or the host change backup bit has been
         set the dirty/referenced state is transfered to the pte. The common
         code will pick up the state from the pte.
      4) Add ptep_modify_prot_start and ptep_modify_prot_commit for mprotect.
      5) Remove pgd_populate_kernel, pud_populate_kernel, pmd_populate_kernel
         pgd_clear_kernel, pud_clear_kernel, pmd_clear_kernel and ptep_invalidate.
      6) Rename kvm_s390_test_and_clear_page_dirty to
         ptep_test_and_clear_user_dirty and add ptep_test_and_clear_user_young.
      7) Define mm_exclusive() and mm_has_pgste() helper to improve readability.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      b2fa47e6
    • M
      [S390] merge page_test_dirty and page_clear_dirty · 2d42552d
      Martin Schwidefsky 提交于
      The page_clear_dirty primitive always sets the default storage key
      which resets the access control bits and the fetch protection bit.
      That will surprise a KVM guest that sets non-zero access control
      bits or the fetch protection bit. Merge page_test_dirty and
      page_clear_dirty back to a single function and only clear the
      dirty bit from the storage key.
      
      In addition move the function page_test_and_clear_dirty and
      page_test_and_clear_young to page.h where they belong. This
      requires to change the parameter from a struct page * to a page
      frame number.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      2d42552d
    • M
      [S390] Remove data execution protection · 043d0708
      Martin Schwidefsky 提交于
      The noexec support on s390 does not rely on a bit in the page table
      entry but utilizes the secondary space mode to distinguish between
      memory accesses for instructions vs. data. The noexec code relies
      on the assumption that the cpu will always use the secondary space
      page table for data accesses while it is running in the secondary
      space mode. Up to the z9-109 class machines this has been the case.
      Unfortunately this is not true anymore with z10 and later machines.
      The load-relative-long instructions lrl, lgrl and lgfrl access the
      memory operand using the same addressing-space mode that has been
      used to fetch the instruction.
      This breaks the noexec mode for all user space binaries compiled
      with march=z10 or later. The only option is to remove the current
      noexec support.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      043d0708
  27. 27 10月, 2010 1 次提交
  28. 25 10月, 2010 2 次提交