  1. 25 Sep, 2016 3 commits
  2. 23 Sep, 2016 1 commit
  3. 09 Jul, 2016 6 commits
    • C
      powerpc/8xx: add CONFIG_PIN_TLB_IMMR · 62f64b49
      Christophe Leroy authored
      CONFIG_PIN_TLB maps the IMMR area and the first 24 Mbytes of memory.
      In some circumstances it might be more useful not to map the
      IMMR but to map 32 Mbytes of memory instead.
      
      Therefore we add the config option CONFIG_PIN_TLB_IMMR to select whether
      the IMMR shall be pinned, and hence whether we pin 24 or 32 Mbytes of RAM.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      62f64b49
    • C
      powerpc/8xx: Rework CONFIG_PIN_TLB handling · 4ad27450
      Christophe Leroy authored
      On recent kernels, with some debug options such as
      CONFIG_LOCKDEP, the BSS requires more than 8M of memory, although
      the kernel code fits in the first 8M.
      Today, it is necessary to activate CONFIG_PIN_TLB to get more than 8M
      at startup, although pinning the TLB is not necessary for that.
      
      We could have unconditionally mapped 16M or 24M at startup,
      but some old hardware only has 8M, and mapping non-existent RAM
      would be an issue due to speculative accesses.
      
      With the preceding patch, however, the TLB entries are populated on
      demand. By setting up the TLB miss handler to handle up to 24M until
      the handler is patched for the entire memory space, it is possible
      to allow access to more memory without mapping non-existent RAM.
      
      It is therefore no longer necessary to map data memory at all
      at startup. It will be handled by the TLB miss handler.
      
      One might still want to pin the IMMR and the first 24M of RAM.
      It is now possible to do this in the C memory initialisation
      functions. In addition, we now know how much memory we have
      when we do it, so we are able to adapt the pinning to the
      actual amount of memory available. Boards with less than 24M
      can therefore now also benefit from PIN_TLB.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      4ad27450
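The idea of adapting the pinning to the real amount of memory, described in 4ad27450, can be illustrated with a small sketch. This is not the kernel's code: the helper name `pinned_ram_bytes`, the constants, and the rule of rounding down to whole 8M pages are assumptions made for illustration. Only the 24M ceiling and the requirement never to map non-existent RAM come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_8M        (8u * 1024u * 1024u)   /* size of one large page */
#define MAX_PINNED_RAM (24u * 1024u * 1024u)  /* ceiling quoted in the commit message */

/*
 * Hypothetical helper: how many bytes of RAM to pin, given the RAM size
 * known at memory initialisation time. Only whole 8M pages that lie
 * entirely inside existing RAM are counted, so non-existent memory is
 * never mapped.
 */
static uint32_t pinned_ram_bytes(uint32_t mem_bytes)
{
    uint32_t whole_pages = mem_bytes / PAGE_8M;  /* round down */
    uint32_t pinned = whole_pages * PAGE_8M;

    if (pinned > MAX_PINNED_RAM)
        pinned = MAX_PINNED_RAM;

    assert(pinned <= mem_bytes);  /* never pin beyond real RAM */
    return pinned;
}
```

Under these assumptions a 16M board would pin 16M, while a 128M board would be capped at 24M.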
    • C
      powerpc/8xx: Don't use page table for linear memory space · bb7f3808
      Christophe Leroy authored
      Instead of using the first level page table to define mappings for
      the linear memory space, we can use direct mapping from the TLB
      handling routines. This has several advantages:
      * No need to read the tables at each TLB miss
      * No issue in 16k pages mode where the 1st level table maps 64 Mbytes
      
      The size of the available linear space is known at system startup.
      To avoid a data access at each TLB miss to find the memory
      size, the TLB routine is patched at startup with the proper size.
      
      This patch provides a 10%-15% improvement in TLB miss handling for
      kernel addresses.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      bb7f3808
    • C
      powerpc/8xx: unpin all TLBs before flushing · 6264dbb9
      Christophe Leroy authored
      The bootloader may have pinned some TLB entries, so the kernel must
      unpin them before flushing TLBs with tlbia; otherwise the pinned TLB
      entries won't get flushed.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      6264dbb9
    • C
      powerpc/8xx: Map IMMR area with 512k page at a fixed address · 4badd43a
      Christophe Leroy authored
      Once the linear memory space has been mapped with 8Mb pages, as
      seen in the related commit, we get 11 million DTLB misses during
      the reference 600s period. 77% of the misses are on user addresses
      and 23% are on kernel addresses (one quarter for the linear address
      space and three quarters for the virtual address space).
      
      Traditionally, each driver manages one computer board which has its
      own components with its own memory maps.
      But on embedded chips like the MPC8xx, the SOC has all registers
      located in the same IO area.
      
      When looking at the ioremaps done during startup, we see that
      many drivers are re-mapping small parts of the IMMR for their own use,
      and all those small pieces get their own 4k page, amplifying the
      number of TLB misses: in our system we get 0xff000000 mapped 31 times
      and 0xff003000 mapped 9 times.
      
      Even if each part of the IMMR was mapped only once with 4k pages, it would
      still be several small mappings towards the linear area.
      
      This patch maps the IMMR with a single 512k page.
      
      With this patch applied, the number of DTLB misses during the 10 min
      period is reduced to 11.8 million for a duration of 5.8s, which
      represents 2% of the non-idle time, hence yet another 10% reduction.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      4badd43a
    • C
      powerpc/8xx: Fix vaddr for IMMR early remap · f86ef74e
      Christophe Leroy authored
      Memory: 124428K/131072K available (3748K kernel code, 188K rwdata,
      648K rodata, 508K init, 290K bss, 6644K reserved)
      Kernel virtual memory layout:
        * 0xfffdf000..0xfffff000  : fixmap
        * 0xfde00000..0xfe000000  : consistent mem
        * 0xfddf6000..0xfde00000  : early ioremap
        * 0xc9000000..0xfddf6000  : vmalloc & ioremap
      SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
      
      Today, the IMMR is mapped 1:1 at startup.
      
      Mapping the IMMR 1:1 is just wrong because it may overlap with another
      area. On most mpc8xx boards it is OK, as the IMMR is set to 0xff000000,
      but on the EP88xC board, for instance, the IMMR is at 0xfa200000, which
      overlaps with the VM ioremap area.
      
      This patch fixes the virtual address for remapping the IMMR with the fixmap,
      regardless of the value of the IMMR.
      
      The size of the IMMR area is 256 kbytes (CPM at offset 0, security engine
      at offset 128k), so a 512k page is enough.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      f86ef74e
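The overlap described in f86ef74e can be checked with simple range arithmetic. The sketch below is only an illustration: the function and macro names are hypothetical, and the "vmalloc & ioremap" bounds are copied from the boot log quoted in the commit message, so they describe that particular system rather than every MPC8xx board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bounds of "vmalloc & ioremap" from the quoted boot log (end exclusive). */
#define VMALLOC_START 0xc9000000ull
#define VMALLOC_END   0xfddf6000ull
#define IMMR_SIZE     (256ull * 1024ull)  /* 256 kbytes, per the commit message */

/* True if the half-open ranges [a_start, a_end) and [b_start, b_end) intersect. */
static bool ranges_overlap(uint64_t a_start, uint64_t a_end,
                           uint64_t b_start, uint64_t b_end)
{
    assert(a_start < a_end && b_start < b_end);
    return a_start < b_end && b_start < a_end;
}

/* With a 1:1 mapping, the IMMR's virtual address equals its physical base. */
static bool immr_1to1_hits_vmalloc(uint32_t immr_base)
{
    return ranges_overlap(immr_base, immr_base + IMMR_SIZE,
                          VMALLOC_START, VMALLOC_END);
}
```

With these bounds, a base of 0xfa200000 (the EP88xC case) falls inside the vmalloc and ioremap range, while 0xff000000 lies above it. Mapping through the fixmap avoids the question altogether because the virtual address no longer depends on the IMMR value.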
  4. 12 Mar, 2016 4 commits
    • C
      powerpc/8xx: rewrite set_context() in C · a7761fe4
      Christophe Leroy authored
      There is no real need to have set_context() in assembly.
      Now that we have mtspr() handling CPU6 ERRATA directly, we
      can rewrite set_context() in C for easier maintenance.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      a7761fe4
    • C
      powerpc/8xx: remove special handling of CPU6 errata in set_dec() · 63e9e1c2
      Christophe Leroy authored
      CPU6 ERRATA is now handled directly in mtspr(), so we can use the
      standard set_dec() function in all cases.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      63e9e1c2
    • C
      powerpc/8xx: Map linear kernel RAM with 8M pages · a372acfa
      Christophe Leroy authored
      On a live running system (VoIP gateway for Air Traffic Control), over
      a 10-minute period (with 277s idle), we get 87 million DTLB misses
      and approximately 35 seconds are spent in the DTLB handler.
      This represents 5.8% of the overall time and even 10.8% of the
      non-idle time.
      Among those 87 million DTLB misses, 15% are on user addresses and
      85% are on kernel addresses. And within the kernel addresses, 93%
      are on addresses from the linear address space and only 7% are on
      addresses from the virtual address space.
      
      MPC8xx has no BATs but it has an 8Mb page size. This patch implements
      mapping of kernel RAM using 8Mb pages, on the same model as what is
      done on the 40x.
      
      In 4k pages mode, each PGD entry maps a 4Mb area: we map every two
      entries to the same 8Mb physical page. In each second entry, we add
      4Mb to the page physical address to ease the life of the FixupDAR
      routine. This is just ignored by HW.
      
      In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry
      will point to the first page of the area. The DTLB handler adds
      the 3 bits from EPN to map the correct page.
      
      With this patch applied, we now get only 13 million TLB misses
      during the 10-minute period. The idle time has increased to 313s
      and the overall time spent in the DTLB miss handler is 6.3s, which
      represents 1% of the overall time and 2.2% of the non-idle time.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      a372acfa
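The percentages quoted in a372acfa follow from the measured durations: the overall share divides handler time by the 600s period, and the non-idle share divides it by the period minus the idle time. The helpers below are a hypothetical sketch of that arithmetic only; their names are not taken from the kernel.

```c
#include <assert.h>

/* Share of a time window spent in the handler, in percent. */
static double percent_of(double handler_s, double window_s)
{
    assert(window_s > 0.0);
    return 100.0 * handler_s / window_s;
}

/* Non-idle time is the measurement period minus the idle time. */
static double non_idle_s(double period_s, double idle_s)
{
    assert(period_s > idle_s);
    return period_s - idle_s;
}
```

For example, `percent_of(35.0, 600.0)` gives about 5.8 and `percent_of(35.0, non_idle_s(600.0, 277.0))` about 10.8, matching the figures before the patch. With 6.3s in the handler and 313s idle, the same calls give roughly 1 and 2.2.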
    • C
      powerpc/8xx: Save r3 all the time in DTLB miss handler · 913a6b3d
      Christophe Leroy authored
      We are spending between 40 and 160 cycles, with a mean of 65 cycles, in
      the DTLB handling routine (measured with mftbl), so make it
      simpler, although this adds one instruction.
      With this modification, we get three registers available at all times,
      which will help with the following patch.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
      913a6b3d
  5. 10 Mar, 2016 1 commit
  6. 03 Jun, 2015 7 commits
  7. 30 Jan, 2015 6 commits
  8. 08 Nov, 2014 12 commits