1. 23 5月, 2013 1 次提交
    • V
      ARC: copy_(to|from)_user() to honor usermode-access permissions · a950549c
      Vineet Gupta 提交于
      This manifested as grep failing psuedo-randomly:
      
      -------------->8---------------------
      [ARCLinux]$ ip address show lo | grep inet
      [ARCLinux]$ ip address show lo | grep inet
      [ARCLinux]$ ip address show lo | grep inet
      [ARCLinux]$
      [ARCLinux]$ ip address show lo | grep inet
          inet 127.0.0.1/8 scope host lo
      -------------->8---------------------
      
      ARC700 MMU provides fully orthogonal permission bits per page:
      Ur, Uw, Ux, Kr, Kw, Kx
      
      The user mode page permission templates used to have all Kernel mode
      access bits enabled.
      This caused a tricky race condition observed with uClibc buffered file
      read and UNIX pipes.
      
      1. Read access to an anon mapped page in libc .bss: write-protected
         zero_page mapped: TLB Entry installed with Ur + K[rwx]
      
      2. grep calls libc:getc() -> buffered read layer calls read(2) with the
         internal read buffer in same .bss page.
         The read() call is on STDIN which has been redirected to a pipe.
         read(2) => sys_read() => pipe_read() => copy_to_user()
      
      3. Since page has Kernel-write permission (despite being user-mode
         write-protected), copy_to_user() suceeds w/o taking a MMU TLB-Miss
         Exception (page-fault for ARC). core-MM is unaware that kernel
         erroneously wrote to the reserved read-only zero-page (BUG #1)
      
      4. Control returns to userspace which now does a write to same .bss page
         Since Linux MM is not aware that page has been modified by kernel, it
         simply reassigns a new writable zero-init page to mapping, loosing the
         prior write by kernel - effectively zero'ing out the libc read buffer
         under the hood - hence grep doesn't see right data (BUG #2)
      
      The fix is to make all kernel-mode access permissions mirror the
      user-mode ones. Note that the kernel still has full access to pages,
      when accessed directly (w/o MMU) - this fix ensures that kernel-mode
      access in copy_to_from() path uses the same faulting access model as for
      pure user accesses to keep MM fully aware of page state.
      
      The issue is peudo-random because it only shows up if the TLB entry
      installed in #1 is present at the time of #3. If it is evicted out, due
      to TLB pressure or some-such, then copy_to_user() does take a TLB Miss
      Exception, with a routine write-to-anon COW processing installing a
      fresh page for kernel writes and also usable as it is in userspace.
      
      Further the issue was dormant for so long as it depends on where the
      libc internal read buffer (in .bss) is mapped at runtime.
      If it happens to reside in file-backed data mapping of libc (in the
      page-aligned slack space trailing the file backed data), loader zero
      padding the slack space, does the early cow page replacement, setting
      things up at the very beginning itself.
      
      With gcc 4.8 based builds, the libc buffer got pushed out to a real
      anon mapping which triggers the issue.
      Reported-by: NAnton Kolesov <akolesov@synopsys.com>
      Cc: <stable@vger.kernel.org> # 3.9
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      a950549c
  2. 10 5月, 2013 1 次提交
    • V
      ARC: [mm] Aliasing VIPT dcache support 2/4 · 4102b533
      Vineet Gupta 提交于
      This is the meat of the series which prevents any dcache alias creation
      by always keeping the U and K mapping of a page congruent.
      If a mapping already exists, and other tries to access the page, prev
      one is flushed to physical page (wback+inv)
      
      Essentially flush_dcache_page()/copy_user_highpage() create K-mapping
      of a page, but try to defer flushing, unless U-mapping exist.
      When page is actually mapped to userspace, update_mmu_cache() flushes
      the K-mapping (in certain cases this can be optimised out)
      
      Additonally flush_cache_mm(), flush_cache_range(), flush_cache_page()
      handle the puring of stale userspace mappings on exit/munmap...
      
      flush_anon_page() handles the existing U-mapping for anon page before
      kernel reads it via the GUP path.
      
      Note that while not complete, this is enough to boot a simple
      dynamically linked Busybox based rootfs
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      4102b533
  3. 07 5月, 2013 1 次提交
    • V
      ARC: [mm] optimize needless full mm TLB flush on munmap · 8d56bec2
      Vineet Gupta 提交于
      munmap ends up calling tlb_flush() which for ARC was flushing the entire
      TLB unconditionally (by moving the MMU to a new ASID)
      
      do_munmap
        unmap_region
          unmap_vmas
            unmap_single_vma
               unmap_page_range
                  tlb_start_vma
                  zap_pud_range
                  tlb_end_vma()
        tlb_finish_mmu
          tlb_flush()  ---> unconditional flush_tlb_mm()
      
      So even a single page munmap, a frequent operation when uClibc dynamic
      linker (ldso) is loading the dependent shared libraries, would move the
      the ASID multiple times - needlessly invalidating the pre-faulted TLB
      entries (and increasing the rate of ASID wraparound + full TLB flush).
      
      This is now optimised to only be called if tlb->full_mm (which means
      for exit/execve) cases only. And for those cases, flush_tlb_mm() is
      already optimised to be a no-op for mm->mm_users == 0.
      
      So essentially there are no mmore full mm flushes - except for fork which
      anyhow needs it for properly COW'ing parent address space.
      
      munmap now needs to do TLB range flush, which is implemented with
      tlb_end_vma()
      
      Results
      -------
      1. ASID now consistenly moves by 4 during a simple ls (as opposed to 5 or
         7 before).
      
      2. LMBench microbenchmark also shows improvements
      
      Basic system parameters
      ------------------------------------------------------------------------------
      Host                 OS Description              Mhz  tlb  cache  mem scal
                                                           pages line   par load
                                                                 bytes
      --------- ------------- ----------------------- ---- ----- ----- ------ ----
      3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba   80     8    64 1.1000 1
      3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full   80     8    64 1.1200 1
      
      Processor, Processes - times in microseconds - smaller is better
      ------------------------------------------------------------------------------
      Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                                   call  I/O stat clos TCP  inst hndl proc proc proc
      --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
      3.9-rc5-0 Linux 3.9.0-r   80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K
      3.9-rc5-0 Linux 3.9.0-r   80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K
      
      File & VM system latencies in microseconds - smaller is better
      -------------------------------------------------------------------------------
      Host                 OS   0K File      10K File     Mmap    Prot   Page 100fd
                              Create Delete Create Delete Latency Fault  Fault selct
      --------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
      3.9-rc5-0 Linux 3.9.0-r  314.7  223.2 1054.9  390.2  3615.0 1.590 20.1 126.6
      3.9-rc5-0 Linux 3.9.0-r  265.8  183.8 1014.2  314.1  3193.0 6.910 18.8 110.4
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      8d56bec2
  4. 16 2月, 2013 2 次提交