1. 29 8月, 2013 2 次提交
    • V
      ARC: MMUv4 preps/1 - Fold PTE K/U access flags · 64b703ef
      Vineet Gupta 提交于
      The current ARC VM code has 13 flags in Page Table entry: some software
      (accesed/dirty/non-linear-maps) and rest hardware specific. With 8k MMU
      page, we need 19 bits for addressing page frame so remaining 13 bits is
      just about enough to accomodate the current flags.
      
      In MMUv4 there are 2 additional flags, SZ (normal or super page) and WT
      (cache access mode write-thru) - and additionally PFN is 20 bits (vs. 19
      before for 8k). Thus these can't be held in current PTE w/o making each
      entry 64bit wide.
      
      It seems there is some scope of compressing the current PTE flags (and
      freeing up a few bits). Currently PTE contains fully orthogonal distinct
      access permissions for kernel and user mode (Kr, Kw, Kx; Ur, Uw, Ux)
      which can be folded into one set (R, W, X). The translation of 3 PTE
      bits into 6 TLB bits (when programming the MMU) can be done based on
      following pre-requites/assumptions:
      
      1. For kernel-mode-only translations (vmalloc: 0x7000_0000 to
         0x7FFF_FFFF), PTE additionally has PAGE_GLOBAL flag set (and user
         space entries can never be global). Thus such a PTE can translate
         to Kr, Kw, Kx (as appropriate) and zero for User mode counterparts.
      
      2. For non global entries, the PTE flags can be used to create mirrored
         K and U TLB bits. This is true after commit a950549c
         "ARC: copy_(to|from)_user() to honor usermode-access permissions"
         which ensured that user-space translations _MUST_ have same access
         permissions for both U/K mode accesses so that  copy_{to,from}_user()
         play fair with fault based CoW break and such...
      
      There is no such thing as free lunch - the cost is slightly infalted
      TLB-Miss Handlers.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      64b703ef
    • V
      ARC: Code cosmetics (Nothing semantical) · 4b06ff35
      Vineet Gupta 提交于
      * reduce editor lines taken by pt_regs
      * ARCompact ISA specific part of TLB Miss handlers clubbed together
      * cleanup some comments
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      4b06ff35
  2. 26 8月, 2013 2 次提交
    • V
      ARC: Entry Handler tweaks: Optimize away redundant IRQ_DISABLE_SAVE · fce16bc3
      Vineet Gupta 提交于
      In the exception return path, for both U/K cases, intr are already
      disabled (for various existing reasons). So when we drop down to
      @restore_regs, we need not redo that.
      
      There was subtle issue - when intr were NOT being disabled for
      ret-to-kernel-but-no-preemption case - now fixed by moving the
      IRQ_DISABLE further up in @resume_kernel_mode.
      
      So what do we gain:
      
      * Shaves off a few insn in return path.
      
      * Eliminates the need for IRQ_DISABLE_SAVE assembler macro for ARCv2
        hence allows for entry code sharing.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      fce16bc3
    • V
      ARC: Exception Handlers Code consolidation · 37f3ac49
      Vineet Gupta 提交于
      After the recent cleanups, all the exception handlers now have same
      boilerplate prologue code. Move that into common macro.
      
      This reduces readability but helps greatly with sharing / duplicating
      entry code with ARCv2 ISA where the handlers are pretty much the same,
      just the entry prologue is different (due to hardware assist).
      
      Also while at it, add the missing FAKE_RET_FROM_EXCPN calls in couple of
      places to drop down to pure kernel mode (from exception mode) before
      jumping off into "C" code.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      37f3ac49
  3. 27 7月, 2013 1 次提交
  4. 29 6月, 2013 1 次提交
  5. 27 6月, 2013 1 次提交
    • P
      arc: delete __cpuinit usage from all arc files · ce759956
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
      are flagged as __cpuinit  -- so if we remove the __cpuinit from
      arch specific callers, we will also get section mismatch warnings.
      As an intermediate step, we intend to turn the linux/init.h cpuinit
      content into no-ops as early as possible, since that will get rid
      of these warnings.  In any case, they are temporary and harmless.
      
      This removes all the arch/arc uses of the __cpuinit macros from
      all C files.  Currently arc does not have any __CPUINIT used in
      assembly files.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      ce759956
  6. 26 6月, 2013 2 次提交
    • V
      ARC: Remove explicit passing around of ECR · 38a9ff6d
      Vineet Gupta 提交于
      With ECR now part of pt_regs
      
      * No need to propagate from lowest asm handlers as arg
      * No need to save it in tsk->thread.cause_code
      * Avoid bit chopping to access the bit-fields
      
      More code consolidation, cleanup
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      38a9ff6d
    • V
      ARC: pt_regs update #5: Use real ECR for pt_regs->event vs. synth values · 502a0c77
      Vineet Gupta 提交于
      pt_regs->event was set with artificial values to identify the low level
      system event (syscall trap / breakpoint trap / exceptions / interrupts)
      
      With r8 saving out of the way, the full word can be used to save real
      ECR (Exception Cause Register) which helps idenify the event naturally,
      including additional info such as cause code, param.
      Only for Interrupts, where ECR is not applicable, do we resort to
      synthetic non ECR values.
      
      SAVE_ALL_TRAP/EXCEPTIONS can now be merged as they both use ECR with
      different runtime values.
      
      The ptrace helpers now use the sub-fields of ECR to distinguish the
      events (e.g. vector 0x25 is trap, param 0 is syscall...)
      
      The following benefits will follow:
      
      (1) This centralizes the location of where ECR is saved and will allow
          the cleanup of task->thread.cause_code ECR placeholder which is set
          in non-uniform way. Then ARC VM code can safely rely on it being
          there for purpose of finer grained VM_EXEC dcache flush (based on
          exec fault: I-TLB Miss)
      
      (2) Further, ECR being passed around from low level handlers as arg can
          be eliminated as it is part of standard reg-file in pt_regs
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      502a0c77
  7. 22 6月, 2013 14 次提交
    • V
      ARC: stop using pt_regs->orig_r8 · 352c1d95
      Vineet Gupta 提交于
      Historically, pt_regs have had orig_r8, an overloaded container for
        (1) backup copy of r8 (syscall number Trap Exceptions)
        (2) additional system state: (syscall/Exception/Interrupt)
      
      There is no point in keeping (1) since syscall number is never clobbered
      in-place, in pt_regs, unlike r0 which duals as first syscall arg as well
      as syscall return value and in case of syscall restart, the orig arg0
      needs restoring (from orig_r0)  after having been updated in-place with
      syscall ret value.
      
      This further paves way to convert (2) to contain ECR itself (rather than
      current madeup values)
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      352c1d95
    • V
      ARC: pt_regs update #4: r25 saved/restored unconditionally · 359105bd
      Vineet Gupta 提交于
      (This is a VERY IMP change for low level interrupt/exception handling)
      
      -----------------------------------------------------------------------
      WHAT
      -----------------------------------------------------------------------
      * User 25 now saved in pt_regs->user_r25 (vs. tsk->thread_info.user_r25)
      
      * This allows Low level interrupt code to unconditionally save r25
        (vs. the prev version which would only do it for U->K transition).
        Ofcourse for nested interrupts, only the pt_regs->user_r25 of
        bottom-most frame is useful.
      
      * simplifies the interrupt prologue/epilogue
      
      * Needed for ARCv2 ISA code and done here to keep design similar with
        ARCompact event handling
      
      -----------------------------------------------------------------------
      WHY
      -------------------------------------------------------------------------
      With CONFIG_ARC_CURR_IN_REG, r25 is used to cache "current" task pointer
      in kernel mode. So when entering kernel mode from User Mode
      - user r25 is specially safe-kept (it being a callee reg is NOT part of
        pt_regs which are saved by default on each interrupt/trap/exception)
      - r25 loaded with current task pointer.
      
      Further, if interrupt was taken in kernel mode, this is skipped since we
      know that r25 already has valid "current" pointer.
      
      With 2 level of interrupts in ARCompact ISA, detecting this is difficult
      but still possible, since we could be in kernel mode but r25 not already saved
      (in fact the stack itself might not have been switched).
      
      A. User mode
      B. L1 IRQ taken
      C. L2 IRQ taken (while on 1st line of L1 ISR)
      
      So in #C, although in kernel mode, r25 not saved (infact SP not
      switched at all)
      
      Given that ARcompact has manual stack switching, we could use a bit of
      trickey - The low level code would make sure that SP is only set to kernel
      mode value at the very end (after saving r25). So a non kernel mode SP,
      even if in kernel mode, meant r25 was NOT saved.
      
      The same paradigm won't work in ARCv2 ISA since SP is auto-switched so
      it's setting can't be delayed/constrained.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      359105bd
    • V
      ARC: K/U SP saved from one location in stack switching macro · ba3558c7
      Vineet Gupta 提交于
      This paves way for further simplifications.
      
      There's an overhead of 1 insn for the non-common case of interrupt taken
      from kernel mode.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      ba3558c7
    • V
    • V
      ARC: Increase readability of entry handlers · 3ebedbb2
      Vineet Gupta 提交于
      * use artificial PUSH/POP contructs for CORE Reg save/restore to stack
      * use artificial PUSHAX/POPAX contructs for Auxiliary Space regs
      * macro'ize multiple copies of callee-reg-save/restore (SAVE_R13_TO_R24)
      * use BIC insn for inverse-and operation
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      3ebedbb2
    • V
      ARC: pt_regs update #3: Remove unused gutter at start of callee_regs · 16f9afe6
      Vineet Gupta 提交于
      This is trickier than prev two:
      
      * context switching code saves kernel mode callee regs in the format of
        struct callee_regs thus needs adjustment. This also reduces the height
        of topmost kernel stack frame by 1 word.
      
      * Since kernel stack unwinder is sensitive to height of topmost kernel
        stack frame, that needs a word of adjustment too.
      
      ptrace needs a bit of updating since pt_regs now diverges from
      user_regs_struct.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      16f9afe6
    • V
    • V
      ARC: pt_regs update #1: Align pt_regs end with end of kernel stack page · 283237a0
      Vineet Gupta 提交于
      Historically, pt_regs would end at offset of 1 word from end of stack
      page.
      
              -----------------  -> START of page (task->stack)
              |               |
              | thread_info   |
              -----------------
              |               |
         ^    ~               ~
         |    ~               ~
         |    |               |
         |    |               | <---- pt_regs used to END here
              -----------------
              | 1 word GUTTER |
              ----------------- -> End of page (START of kernel stack)
      
      This required special "one-off" considerations in low level code.
      
      The root cause is very likely assumption of "empty" SP by the original
      ARC kernel hackers, despite ARC700 always been "full" SP.
      
      So finally RIP one word gutter !
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      283237a0
    • V
      ARC: pt_regs update #0: remove kernel stack canary · bed30976
      Vineet Gupta 提交于
      This stack slot is going to be used in subsequent commits
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      bed30976
    • V
      ARC: [mm] Make stack/heap Non-executable by default · 3abc9448
      Vineet Gupta 提交于
      1. For VM_EXEC based delayed dcache/icache flush, reduces the number of
         flushes.
      
      2. Makes this security feature ON by default rather than OFF before.
      
      3. Applications can use mprotect() to selectively override this.
      
      4. ELF binaries have a GNU_STACK segment which can easily override the
         kernel default permissions.
         For nested-functions/trampolines, gcc already auto-enables executable
         stack in elf. Others needing this can use -Wl,-z,execstack option.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      3abc9448
    • V
      ARC: [mm] Assume pagecache page dirty by default · 2ed21dae
      Vineet Gupta 提交于
      Similar to ARM/SH
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      2ed21dae
    • V
      ARC: cache detection code bitrot · 30499186
      Vineet Gupta 提交于
      * Number of (i|d)cache ways can be retrieved from BCRs and hence no need
        to cross check with with built-in constants
      * Use of IS_ENABLED() to check for a Kconfig option
      * is_not_cache_aligned() not used anymore
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      30499186
    • V
      ARC: Disintegrate arcregs.h · da1677b0
      Vineet Gupta 提交于
      * Move the various sub-system defines/types into relevant files/functions
        (reduces compilation time)
      
      * move CPU specific stuff out of asm/tlb.h into asm/mmu.h
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      da1677b0
    • V
  8. 25 5月, 2013 1 次提交
    • V
      ARC: lazy dcache flush broke gdb in non-aliasing configs · 7bb66f6e
      Vineet Gupta 提交于
      gdbserver inserting a breakpoint ends up calling copy_user_page() for a
      code page. The generic version of which (non-aliasing config) didn't set
      the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for
      corresponding dynamic loader code page - causing garbade to be executed.
      
      So now aliasing versions of copy_user_highpage()/clear_page() are made
      default. There is no significant overhead since all of special alias
      handling code is compiled out for non-aliasing build
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      7bb66f6e
  9. 23 5月, 2013 3 次提交
    • V
      ARC: Use enough bits for determining page's cache color · 006dfb3c
      Vineet Gupta 提交于
      The current code uses 2 bits for determining page's dcache color, thus
      sorting pages into 4 bins, whereas the aliasing dcache really has 2 bins
      (8k page, 64k dcache - 4 way-set-assoc).
      This can cause extraneous flushes - e.g. color 0 and 2.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      006dfb3c
    • V
      ARC: Brown paper bag bug in macro for checking cache color · 3e87974d
      Vineet Gupta 提交于
      The VM_EXEC check in update_mmu_cache() was getting optimized away
      because of a stupid error in definition of macro addr_not_cache_congruent()
      
      The intention was to have the equivalent of following:
      
      	if (a || (1 ? b : 0))
      
      but we ended up with following:
      
      	if (a || 1 ? b : 0)
      
      And because precedence of '||' is more that that of '?', gcc was optimizing
      away evaluation of <a>
      
      Nasty Repercussions:
      1. For non-aliasing configs it would mean some extraneous dcache flushes
         for non-code pages if U/K mappings were not congruent.
      2. For aliasing config, some needed dcache flush for code pages might
         be missed if U/K mappings were congruent.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      3e87974d
    • V
      ARC: copy_(to|from)_user() to honor usermode-access permissions · a950549c
      Vineet Gupta 提交于
      This manifested as grep failing psuedo-randomly:
      
      -------------->8---------------------
      [ARCLinux]$ ip address show lo | grep inet
      [ARCLinux]$ ip address show lo | grep inet
      [ARCLinux]$ ip address show lo | grep inet
      [ARCLinux]$
      [ARCLinux]$ ip address show lo | grep inet
          inet 127.0.0.1/8 scope host lo
      -------------->8---------------------
      
      ARC700 MMU provides fully orthogonal permission bits per page:
      Ur, Uw, Ux, Kr, Kw, Kx
      
      The user mode page permission templates used to have all Kernel mode
      access bits enabled.
      This caused a tricky race condition observed with uClibc buffered file
      read and UNIX pipes.
      
      1. Read access to an anon mapped page in libc .bss: write-protected
         zero_page mapped: TLB Entry installed with Ur + K[rwx]
      
      2. grep calls libc:getc() -> buffered read layer calls read(2) with the
         internal read buffer in same .bss page.
         The read() call is on STDIN which has been redirected to a pipe.
         read(2) => sys_read() => pipe_read() => copy_to_user()
      
      3. Since page has Kernel-write permission (despite being user-mode
         write-protected), copy_to_user() suceeds w/o taking a MMU TLB-Miss
         Exception (page-fault for ARC). core-MM is unaware that kernel
         erroneously wrote to the reserved read-only zero-page (BUG #1)
      
      4. Control returns to userspace which now does a write to same .bss page
         Since Linux MM is not aware that page has been modified by kernel, it
         simply reassigns a new writable zero-init page to mapping, loosing the
         prior write by kernel - effectively zero'ing out the libc read buffer
         under the hood - hence grep doesn't see right data (BUG #2)
      
      The fix is to make all kernel-mode access permissions mirror the
      user-mode ones. Note that the kernel still has full access to pages,
      when accessed directly (w/o MMU) - this fix ensures that kernel-mode
      access in copy_to_from() path uses the same faulting access model as for
      pure user accesses to keep MM fully aware of page state.
      
      The issue is peudo-random because it only shows up if the TLB entry
      installed in #1 is present at the time of #3. If it is evicted out, due
      to TLB pressure or some-such, then copy_to_user() does take a TLB Miss
      Exception, with a routine write-to-anon COW processing installing a
      fresh page for kernel writes and also usable as it is in userspace.
      
      Further the issue was dormant for so long as it depends on where the
      libc internal read buffer (in .bss) is mapped at runtime.
      If it happens to reside in file-backed data mapping of libc (in the
      page-aligned slack space trailing the file backed data), loader zero
      padding the slack space, does the early cow page replacement, setting
      things up at the very beginning itself.
      
      With gcc 4.8 based builds, the libc buffer got pushed out to a real
      anon mapping which triggers the issue.
      Reported-by: NAnton Kolesov <akolesov@synopsys.com>
      Cc: <stable@vger.kernel.org> # 3.9
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      a950549c
  10. 10 5月, 2013 4 次提交
  11. 07 5月, 2013 6 次提交
    • V
      ARC: [mm] Lazy D-cache flush (non aliasing VIPT) · eacd0e95
      Vineet Gupta 提交于
      flush_dcache_page( ) is MM hook to ensure that a page has consistent
      views between kernel and userspace. Thus it is called when
      
      * kernel writes to a page which at some later point could get mapped to
        userspace (so kernel mapping needs to be flushed-n-inv)
      * kernel is about to read from a page with possible userspace mappings
        (so userspace mappings needs to be made coherent with kernel ones)
      
      However for Non aliasing VIPT dcache, any userspace mapping will always
      be congruent to kernel mapping. Thus d-cache need need not be flushed at
      all (or delayed indefinitely).
      
      The only reason it does need to be flushed is when mapping code pages.
      Since icache doesn't snoop dcache, those dirty dcache lines need to be
      written back to memory and icache line invalidated so that icache lines
      fetch will get the right data.
      
      Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks.
      
      (1) FPGA @ 80 MHZ
      
      Processor, Processes - times in microseconds - smaller is better
      ------------------------------------------------------------------------------
      Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                                   call  I/O stat clos TCP  inst hndl proc proc proc
      --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
      3.9-rc6-a Linux 3.9.0-r   80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K
      3.9-rc6-b Linux 3.9.0-r   80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K
      3.9-rc7-c Linux 3.9.0-r   80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K
                                                                      ^^^^ ^^^^ ^^^
      
      File & VM system latencies in microseconds - smaller is better
      -------------------------------------------------------------------------------
      Host                 OS   0K File      10K File     Mmap    Prot   Page 100fd
                              Create Delete Create Delete Latency Fault  Fault selct
      --------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
      3.9-rc6-a Linux 3.9.0-r  317.8  204.2 1122.3  375.1 3522.0 4.288     20.7 126.8
      3.9-rc6-b Linux 3.9.0-r  298.7  223.0 1141.6  367.8 3531.0 4.866     20.9 126.4
      3.9-rc7-c Linux 3.9.0-r  278.4  179.2  862.1  339.3 3705.0 3.223     20.3 126.6
                               ^^^^^  ^^^^^  ^^^^^  ^^^^
      
      (2) Customer Silicon @ 500 MHz (166 MHz mem)
      
      ------------------------------------------------------------------------------
      Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                                   call  I/O stat clos TCP  inst hndl proc proc proc
      --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
      abilis-ba Linux 3.9.0-r  497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K
      abilis-ca Linux 3.9.0-r  497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K
                                                                      ^^^^ ^^^^ ^^^
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      eacd0e95
    • V
      ARC: [mm] consolidate icache/dcache sync code · 94bad1af
      Vineet Gupta 提交于
      Now that we have same helper used for all icache invalidates (i.e.
      vaddr+paddr based exact line invalidate), consolidate the open coded
      calls into one place.
      
      Also rename flush_icache_range_vaddr => __sync_icache_dcache
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      94bad1af
    • V
      ARC: [mm] optimise icache flush for user mappings · 24603fdd
      Vineet Gupta 提交于
      ARC icache doesn't snoop dcache thus executable pages need to be made
      coherent before mapping into userspace in flush_icache_page().
      
      However ARC700 CDU (hardware cache flush module) requires both vaddr
      (index in cache) as well as paddr (tag match) to correctly identify a
      line in the VIPT cache. A typical ARC700 SoC has aliasing icache, thus
      the paddr only based flush_icache_page() API couldn't be implemented
      efficiently. It had to loop thru all possible alias indexes and perform
      the invalidate operation (ofcourse the cache op would only succeed at
      the index(es) where tag matches - typically only 1, but the cost of
      visiting all the cache-bins needs to paid nevertheless).
      
      Turns out however that the vaddr (along with paddr) is available in
      update_mmu_cache() hence better suits ARC icache flush semantics.
      With both vaddr+paddr, exactly one flush operation per line is done.
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      24603fdd
    • V
      ARC: [mm] optimize needless full mm TLB flush on munmap · 8d56bec2
      Vineet Gupta 提交于
      munmap ends up calling tlb_flush() which for ARC was flushing the entire
      TLB unconditionally (by moving the MMU to a new ASID)
      
      do_munmap
        unmap_region
          unmap_vmas
            unmap_single_vma
               unmap_page_range
                  tlb_start_vma
                  zap_pud_range
                  tlb_end_vma()
        tlb_finish_mmu
          tlb_flush()  ---> unconditional flush_tlb_mm()
      
      So even a single page munmap, a frequent operation when uClibc dynamic
      linker (ldso) is loading the dependent shared libraries, would move the
      the ASID multiple times - needlessly invalidating the pre-faulted TLB
      entries (and increasing the rate of ASID wraparound + full TLB flush).
      
      This is now optimised to only be called if tlb->full_mm (which means
      for exit/execve) cases only. And for those cases, flush_tlb_mm() is
      already optimised to be a no-op for mm->mm_users == 0.
      
      So essentially there are no mmore full mm flushes - except for fork which
      anyhow needs it for properly COW'ing parent address space.
      
      munmap now needs to do TLB range flush, which is implemented with
      tlb_end_vma()
      
      Results
      -------
      1. ASID now consistenly moves by 4 during a simple ls (as opposed to 5 or
         7 before).
      
      2. LMBench microbenchmark also shows improvements
      
      Basic system parameters
      ------------------------------------------------------------------------------
      Host                 OS Description              Mhz  tlb  cache  mem scal
                                                           pages line   par load
                                                                 bytes
      --------- ------------- ----------------------- ---- ----- ----- ------ ----
      3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba   80     8    64 1.1000 1
      3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full   80     8    64 1.1200 1
      
      Processor, Processes - times in microseconds - smaller is better
      ------------------------------------------------------------------------------
      Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                                   call  I/O stat clos TCP  inst hndl proc proc proc
      --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
      3.9-rc5-0 Linux 3.9.0-r   80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K
      3.9-rc5-0 Linux 3.9.0-r   80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K
      
      File & VM system latencies in microseconds - smaller is better
      -------------------------------------------------------------------------------
      Host                 OS   0K File      10K File     Mmap    Prot   Page 100fd
                              Create Delete Create Delete Latency Fault  Fault selct
      --------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
      3.9-rc5-0 Linux 3.9.0-r  314.7  223.2 1054.9  390.2  3615.0 1.590 20.1 126.6
      3.9-rc5-0 Linux 3.9.0-r  265.8  183.8 1014.2  314.1  3193.0 6.910 18.8 110.4
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      8d56bec2
    • C
      ARC: [TB10x] Add support for TB10x platform · 072eb693
      Christian Ruppert 提交于
      Infrastructure required to make the Linux kernel compile and boot on the
      Abilis Systems TB10x series of SOCs based on ARC700 CPUs:
        - Kmake related files (Kconfig, Makefile, tb10x_defconfig)
        - TB10x platform initialisation
      Signed-off-by: NChristian Ruppert <christian.ruppert@abilis.com>
      Signed-off-by: NPierrick Hascoet <pierrick.hascoet@abilis.com>
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      072eb693
    • C
      ARC: Prepare interrupt code for external controllers · a37cdacc
      Christian Ruppert 提交于
      This patch adds some room for CPU-external interrupt controllers in the
      Linux interrupt space. Until now, only the 32 CPU internal interrupt lines
      were supported which does not allow for external interrupt controllers such
      as GPIO modules etc.
      Signed-off-by: NChristian Ruppert <christian.ruppert@abilis.com>
      Signed-off-by: NPierrick Hascoet <pierrick.hascoet@abilis.com>
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      a37cdacc
  12. 09 4月, 2013 1 次提交
    • C
      ARC: Add implicit compiler barrier to raw_local_irq* functions · 79e5f05e
      Christian Ruppert 提交于
      ARC irqsave/restore macros were missing the compiler barrier, causing a
      stale load in irq-enabled region be used in irq-safe region, despite
      being changed, because the register holding the value was still live.
      
      The problem manifested as random crashes in timer code when stress
      testing ARCLinux (3.9-rc3) on a !SMP && !PREEMPT_COUNT
      
      Here's the exact sequence which caused this:
       (0). tv1[x] <----> t1 <---> t2
       (1). mod_timer(t1) interrupted after it calls timer_pending()
       (2). mod_timer(t2) completes
       (3). mod_timer(t1) resumes but messes up the list
       (4). __runt_timers( ) uses bogus timer_list entry / crashes in
            timer->function
      
      Essentially mod_timer() was racing against itself and while the spinlock
      serialized the tv1[] timer link list, timer_pending() called outside the
      spinlock, cached timer link list element in a register.
      With low register pressure (and a deep register file), lack of barrier
      in raw_local_irqsave() as well as preempt_disable (!PREEMPT_COUNT
      version), there was nothing to force gcc to reload across the spinlock,
      causing a stale value in reg be used for link list manipulation - ensuing
      a corruption.
      
      ARcompact disassembly which shows the culprit generated code:
      
      mod_timer:
          push_s blink
          mov_s r13,r0	# timer, timer
      ..
          ###### timer_pending( )
          ld_s r3,[r13]       # <------ <variable>.entry.next LOADED
          brne r3, 0, @.L163
      
      .L163:
      ..
          ###### spin_lock_irq( )
          lr  r5, [status32]  # flags
          bic r4, r5, 6       # temp, flags,
          and.f 0, r5, 6      # flags,
          flag.nz r4
      
          ###### detach_if_pending( ) begins
      
          tst_s r3,r3  <--------------
      			# timer_pending( ) checks timer->entry.next
                              # r3 is NOT reloaded by gcc, using stale value
          beq.d @.L169
          mov.eq r0,0
      
          #####  detach_timer( ): __list_del( )
      
          ld r4,[r13,4]    	# <variable>.entry.prev, D.31439
          st r4,[r3,4]     	# <variable>.prev, D.31439
          st r3,[r4]       	# <variable>.next, D.30246
      
      We initially tried to fix this by adding barrier() to preempt_* macros
      for !PREEMPT_COUNT but Linus clarified that it was anything but wrong.
      http://www.spinics.net/lists/kernel/msg1512709.html
      
      [vgupta: updated commitlog]
      
      Reported-by/Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
      Cc: Christian Ruppert <christian.ruppert@abilis.com>
      Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com>
      Debugged-by/Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      79e5f05e
  13. 20 3月, 2013 1 次提交
    • V
      ARC: Fix the typo in event identifier flags used by ptrace · 367f3fcd
      Vineet Gupta 提交于
      orig_r8_IS_EXCPN and orig_r8_IS_BRKPT were same values due to a
      copy/paste error. Although it looks bad and is wrong, it really doesn't
      affect gdb working.
      
      orig_r8_IS_BRKPT is the one relevant to debugging (breakpoints), since
      it is used to provide EFA vs. ERET to a ptrace "stop_pc" request.
      
      So when gdb has inserted a breakpoint, orig_r8_IS_BRKPT is already set,
      and anything else (i.e. orig_r8_IS_EXCPN) becoming same as it, really
      doesn't hurt gdb. The corollary case, could be nasty but nobody uses the
      ptrace "stop_pc" request in that case
      Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
      367f3fcd
  14. 19 3月, 2013 1 次提交
新手
引导
客服 返回
顶部