1. 11 Nov 2014, 9 commits
  2. 14 Oct 2014, 1 commit
  3. 12 Jun 2014, 1 commit
  4. 05 Jun 2014, 1 commit
  5. 18 Apr 2014, 1 commit
  6. 08 Apr 2014, 1 commit
    • mm: per-thread vma caching · 615d6e87
      Authored by Davidlohr Bueso
      This patch is a continuation of efforts trying to optimize find_vma(),
      avoiding potentially expensive rbtree walks to locate a vma upon faults.
      The original approach (https://lkml.org/lkml/2013/11/1/410), where the
      largest vma was also cached, ended up being too specific and random,
      thus further comparison with other approaches was needed.  There are
      two things to consider when dealing with this, the cache hit rate and
      the latency of find_vma().  Improving the hit-rate does not necessarily
      translate into finding the vma any faster, as the overhead of a fancy
      caching scheme can be too high to be worthwhile.
      
      We currently cache the last used vma for the whole address space, which
      provides a nice optimization, cutting the total cycles spent in
      find_vma() by as much as 2.5x for workloads with good locality.  On the other hand, this
      simple scheme is pretty much useless for workloads with poor locality.
      Analyzing ebizzy runs shows that, no matter how many threads are
      running, the mmap_cache hit rate is less than 2%, and in many situations
      below 1%.
      
      The proposed approach is to replace this scheme with a small per-thread
      cache, maximizing hit rates at a very low maintenance cost.
      Invalidations are performed by simply bumping up a 32-bit sequence
      number.  The only expensive operation is in the rare case of a seq
      number overflow, where all caches that share the same address space are
      flushed.  Upon a miss, the proposed replacement policy is based on the
      page number that contains the virtual address in question.  Concretely,
      the following results are seen on an 80 core, 8 socket x86-64 box:
      
      1) System bootup: Most programs are single threaded, so the per-thread
         scheme improves on the ~50% baseline hit rate just by adding a few
         more slots to the cache.
      
      +----------------+----------+------------------+
      | caching scheme | hit-rate | cycles (billion) |
      +----------------+----------+------------------+
      | baseline       | 50.61%   | 19.90            |
      | patched        | 73.45%   | 13.58            |
      +----------------+----------+------------------+
      
      2) Kernel build: This one is already pretty good with the current
         approach as we're dealing with good locality.
      
      +----------------+----------+------------------+
      | caching scheme | hit-rate | cycles (billion) |
      +----------------+----------+------------------+
      | baseline       | 75.28%   | 11.03            |
      | patched        | 88.09%   | 9.31             |
      +----------------+----------+------------------+
      
      3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.
      
      +----------------+----------+------------------+
      | caching scheme | hit-rate | cycles (billion) |
      +----------------+----------+------------------+
      | baseline       | 70.66%   | 17.14            |
      | patched        | 91.15%   | 12.57            |
      +----------------+----------+------------------+
      
      4) Ebizzy: There's a fair amount of variation from run to run, but this
         approach consistently shows nearly perfect hit rates, while the
         baseline hit rate is just about non-existent.  Total cycles fluctuate
         anywhere from ~60 to ~116 billion under the baseline scheme, but this
         approach reduces them considerably.  For instance, with 80 threads:
      
      +----------------+----------+------------------+
      | caching scheme | hit-rate | cycles (billion) |
      +----------------+----------+------------------+
      | baseline       | 1.06%    | 91.54            |
      | patched        | 99.97%   | 14.18            |
      +----------------+----------+------------------+
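
The mechanism described above can be sketched in userspace C. The names below (VMACACHE_SIZE, the seqnum fields) mirror the kernel's mm/vmacache.c, but this is a simplified model for illustration, not the actual kernel code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT     12
#define VMACACHE_BITS  2
#define VMACACHE_SIZE  (1U << VMACACHE_BITS)   /* 4 slots per thread */
#define VMACACHE_HASH(addr) (((addr) >> PAGE_SHIFT) & (VMACACHE_SIZE - 1))

struct vma { uintptr_t start, end; };

/* Shared, per-address-space state. */
struct mm {
    uint32_t seqnum;   /* bumped on every unmap to invalidate all caches */
};

/* Private, per-thread state. */
struct task {
    uint32_t seqnum;                     /* snapshot of mm->seqnum */
    struct vma *cache[VMACACHE_SIZE];
};

/* Invalidation is just a counter bump; no cross-thread writes. */
static void vmacache_invalidate(struct mm *mm)
{
    mm->seqnum++;
    /* In the kernel, a 32-bit overflow triggers an explicit flush of
     * every cache sharing this mm; that rare path is omitted here. */
}

static struct vma *vmacache_find(struct task *tsk, struct mm *mm,
                                 uintptr_t addr)
{
    if (tsk->seqnum != mm->seqnum)
        return NULL;                     /* stale: an unmap happened */
    struct vma *v = tsk->cache[VMACACHE_HASH(addr)];
    if (v && v->start <= addr && addr < v->end)
        return v;
    return NULL;
}

/* On a miss, the replacement slot is chosen by the faulting page number. */
static void vmacache_update(struct task *tsk, struct mm *mm,
                            uintptr_t addr, struct vma *v)
{
    if (tsk->seqnum != mm->seqnum) {     /* resync and drop stale slots */
        memset(tsk->cache, 0, sizeof(tsk->cache));
        tsk->seqnum = mm->seqnum;
    }
    tsk->cache[VMACACHE_HASH(addr)] = v;
}
```

The design keeps invalidation O(1) and write-free for other threads, which is why the maintenance cost stays low even with many threads.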
      
      [akpm@linux-foundation.org: fix nommu build, per Davidlohr]
      [akpm@linux-foundation.org: document vmacache_valid() logic]
      [akpm@linux-foundation.org: attempt to untangle header files]
      [akpm@linux-foundation.org: add vmacache_find() BUG_ON]
      [hughd@google.com: add vmacache_valid_mm() (from Oleg)]
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: adjust and enhance comments]
      Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: Michel Lespinasse <walken@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Tested-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 26 Feb 2014, 1 commit
  8. 25 Jan 2014, 1 commit
  9. 04 Oct 2013, 1 commit
  10. 01 May 2013, 1 commit
  11. 02 Mar 2013, 8 commits
  12. 05 Feb 2013, 1 commit
  13. 12 Jan 2013, 1 commit
  14. 12 Oct 2012, 3 commits
    • kdb,vt_console: Fix missed data due to pager overruns · 17b572e8
      Authored by Jason Wessel
      It is possible to miss data when using the kdb pager.  The kdb pager
      does not pay attention to the maximum column constraint of the screen
      or serial terminal.  As a result, the shown-line count is not
      incremented correctly and the pager prints more lines than fit on the screen.
      Obviously that is less than useful when using a VGA console where you
      cannot scroll back.
      
      The pager will now look at the kdb_buffer string to see how many
      characters are printed.  It might not be perfect considering you can
      output ASCII that might move the cursor position, but it is a
      substantially better approximation for viewing dmesg and trace logs.
      
      This also means that the vt screen needs to set the kdb COLUMNS
      variable.
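
The accounting the pager needs can be sketched like this: count how many physical screen rows a buffer occupies when the terminal wraps at a given column width. This is a simplified illustration, not the actual kdb code, and like the commit itself concedes, it ignores control characters that move the cursor:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the physical screen rows a buffer occupies when the terminal
 * wraps at 'columns'.  A pager can sum this per print and stop after
 * one screenful, instead of counting only newlines.
 */
static size_t rows_used(const char *buf, size_t columns)
{
    size_t rows = 0, col = 0;
    for (const char *p = buf; *p; p++) {
        if (*p == '\n') {
            rows++;                    /* explicit line break */
            col = 0;
        } else if (++col > columns) {  /* wrapped onto the next row */
            rows++;
            col = 1;
        }
    }
    return rows + (col ? 1 : 0);       /* a partial final row counts */
}
```

With this, a single 100-character line on an 80-column VGA console correctly counts as two rows rather than one.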
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kdb: Fix dmesg/bta scroll to quit with 'q' · d1871b38
      Authored by Jason Wessel
      If you press 'q', the pager should exit instead of printing everything
      from dmesg, which can really bog down a 9600-baud serial link.
      
      The same is true for the bta command.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: Add module event hooks · f30fed10
      Authored by Jason Wessel
      Allow gdb to auto-load kernel modules when it is attached, which
      makes it trivially easy to debug module init functions or to
      pre-set breakpoints in a kernel module that has not loaded yet.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
  15. 27 Sep 2012, 2 commits
  16. 31 Jul 2012, 3 commits
  17. 22 Jul 2012, 3 commits
  18. 30 Mar 2012, 1 commit
    • kgdb,debug_core: pass the breakpoint struct instead of address and memory · 98b54aa1
      Authored by Jason Wessel
      There is extra state information that needs to be exposed in the
      kgdb_bpt structure for tracking how a breakpoint was installed.  The
      debug_core only uses probe_kernel_write() to install breakpoints,
      but this is not enough for all archs.  Some archs, such as x86,
      need to use text_poke() in order to install a breakpoint into a
      read-only page.
      
      Passing the kgdb_bpt structure to kgdb_arch_set_breakpoint() and
      kgdb_arch_remove_breakpoint() allows other archs to set the type
      variable which indicates how the breakpoint was installed.
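
The shape of the API change can be sketched as follows. The struct fields approximate the kernel's kgdb_bkpt, but the memory and the two writers stand in for probe_kernel_write() and text_poke(), so this is an illustrative model rather than the real arch code:

```c
#include <assert.h>
#include <string.h>

#define BREAK_INSTR_SIZE 1
#define BPT_INSTR 0xcc               /* x86 int3, as an example */

enum bp_type { BP_BREAKPOINT, BP_POKE_BREAKPOINT };

/* Roughly the shape of the breakpoint struct after this change:
 * 'type' records HOW the breakpoint was installed, so removal can
 * take the matching path. */
struct kgdb_bpt {
    unsigned long addr;
    unsigned char saved_instr[BREAK_INSTR_SIZE];
    enum bp_type type;
};

/* Mocked kernel text plus two writers standing in for
 * probe_kernel_write() and text_poke(). */
static unsigned char text[16];
static int page_is_writable = 0;

static int probe_write(unsigned long addr, unsigned char val)
{
    if (!page_is_writable)
        return -1;                   /* read-only page: would fault */
    text[addr] = val;
    return 0;
}

static void poke_write(unsigned long addr, unsigned char val)
{
    text[addr] = val;                /* remaps the page writable first */
}

/* The arch hook now receives the whole struct, not just an address
 * and a scratch buffer, so it can record which path succeeded. */
static int arch_set_breakpoint(struct kgdb_bpt *bpt)
{
    bpt->saved_instr[0] = text[bpt->addr];
    if (probe_write(bpt->addr, BPT_INSTR) == 0) {
        bpt->type = BP_BREAKPOINT;
        return 0;
    }
    poke_write(bpt->addr, BPT_INSTR);    /* read-only text fallback */
    bpt->type = BP_POKE_BREAKPOINT;
    return 0;
}

static int arch_remove_breakpoint(struct kgdb_bpt *bpt)
{
    /* Removal consults the recorded type to pick the matching writer. */
    if (bpt->type == BP_POKE_BREAKPOINT) {
        poke_write(bpt->addr, bpt->saved_instr[0]);
        return 0;
    }
    return probe_write(bpt->addr, bpt->saved_instr[0]);
}
```

Without the type field, removal would have to guess which writer installed the breakpoint, which is exactly the ambiguity this commit removes.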
      
      Cc: stable@vger.kernel.org # >= 2.6.36
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>