1. 16 November 2017, 11 commits
    • mm: remove __GFP_COLD · 453f85d4
      Submitted by Mel Gorman
      As the page free path makes no distinction between cache hot and cold
      pages, there is no real useful ordering of pages in the free list that
      allocation requests can take advantage of.  Judging from the users of
      __GFP_COLD, it is likely that a number of them are the result of copying
      other sites instead of actually measuring the impact.  Remove the
      __GFP_COLD parameter which simplifies a number of paths in the page
      allocator.
      
      This is potentially controversial but bear in mind that the size of the
      per-cpu pagelists versus modern cache sizes means that the whole per-cpu
      list can often fit in the L3 cache.  Hence, there is only a potential
      benefit for microbenchmarks that alloc/free pages in a tight loop.  THP
      is even worse off: it has little or no chance of getting a cache-hot
      page because the per-cpu list is bypassed and the zeroing of multiple
      pages will thrash the cache anyway.
      
      The truncate microbenchmarks are not shown as this patch affects the
      allocation path and not the free path.  A page fault microbenchmark was
      tested but it showed no significant difference, which is not surprising
      given that the __GFP_COLD branches are a minuscule percentage of the
      fault path.
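
      For illustration, a hedged sketch of a typical call-site conversion
      (not a hunk from this patch; helper names are from the v4.14-era tree):

      #include <linux/gfp.h>
      #include <linux/pagemap.h>

      /* __GFP_COLD used to ask the allocator for a cache-cold page; after
       * this change the hint no longer exists and callers pass the plain
       * gfp mask.
       */
      static struct page *grab_page(struct address_space *mapping)
      {
              /* before: __page_cache_alloc(mapping_gfp_mask(mapping) | __GFP_COLD); */
              return __page_cache_alloc(mapping_gfp_mask(mapping));
      }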
      
      Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove cold parameter for release_pages · c6f92f9f
      Submitted by Mel Gorman
      All callers of release_pages claim the pages being released are cache
      hot.  As no one cares about the hotness of pages being released to the
      allocator, just ditch the parameter.
      
      No performance impact is expected as the overhead is marginal.  The
      parameter is removed simply because it is a bit stupid to have a useless
      parameter copied everywhere.
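
      A hedged sketch of a caller after the change (helper names from the
      v4.14-era tree; the old form was release_pages(pages, nr, true)):

      #include <linux/mm.h>
      #include <linux/pagevec.h>

      /* The bool simply disappears from the call. */
      static void drop_cached_pages(struct pagevec *pvec)
      {
              release_pages(pvec->pages, pagevec_count(pvec));
              pagevec_reinit(pvec);
      }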
      
      Link: http://lkml.kernel.org/r/20171018075952.10627-7-mgorman@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, pagevec: remove cold parameter for pagevecs · 86679820
      Submitted by Mel Gorman
      Every pagevec_init user claims the pages being released are hot even in
      cases where it is unlikely the pages are hot.  As no one cares about the
      hotness of pages being released to the allocator, just ditch the
      parameter.
      
      No performance impact is expected as the overhead is marginal.  The
      parameter is removed simply because it is a bit stupid to have a useless
      parameter copied everywhere.
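
      A hedged sketch of the call-site change (illustrative only):

      #include <linux/pagevec.h>

      /* The cold argument disappears from pagevec_init():
       *   before: pagevec_init(&pvec, 0);
       *   after:  pagevec_init(&pvec);
       */
      static void init_pvec(struct pagevec *pvec)
      {
              pagevec_init(pvec);
      }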
      
      Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/block/zram/zram_drv.c: make zram_page_end_io() static · 384bc41f
      Submitted by Colin Ian King
      zram_page_end_io() is local to the source and does not need to be in
      global scope, so make it static.
      
      Cleans up sparse warning:
      
        symbol 'zram_page_end_io' was not declared. Should it be static?
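
      The fix is a one-word linkage change; this sketch reconstructs the
      function from memory of the v4.14 tree, so treat the body as
      illustrative:

      #include <linux/bio.h>
      #include <linux/pagemap.h>

      /* Adding `static` gives the symbol internal linkage, which is all
       * sparse asks for: the function is only referenced from zram_drv.c.
       */
      static void zram_page_end_io(struct bio *bio)
      {
              struct page *page = bio->bi_io_vec[0].bv_page;

              page_endio(page, op_is_write(bio_op(bio)),
                         blk_status_to_errno(bio->bi_status));
              bio_put(bio);
      }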
      
      Link: http://lkml.kernel.org/r/20171016173336.20320-1-colin.king@canonical.com
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kmemcheck: remove annotations · 49502766
      Submitted by Levin, Alexander (Sasha Levin)
      Patch series "kmemcheck: kill kmemcheck", v2.
      
      As discussed at LSF/MM, kill kmemcheck.
      
      KASan is a replacement that works without kmemcheck's limitations
      (single CPU, slow), and it is already upstream.
      
      We are also not aware of any users of kmemcheck (or users who don't
      consider KASan as a suitable replacement).
      
      The only objection was that, since KASAN wasn't supported by all GCC
      versions provided by distros at that time, we should hold off for 2
      years and try again.
      
      Now that 2 years have passed, and all distros provide gcc that supports
      KASAN, kill kmemcheck again for the very same reasons.
      
      This patch (of 4):
      
      Remove kmemcheck annotations, and calls to kmemcheck from the kernel.
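
      For illustration, a hedged sketch of the kind of annotation being
      deleted (the bitfield helpers follow the classic kmemcheck pattern;
      the struct shown here is made up):

      #include <linux/kmemcheck.h>

      struct example {
              kmemcheck_bitfield_begin(flags);
              unsigned int    cloned : 1,
                              nohdr  : 1;
              kmemcheck_bitfield_end(flags);
      };

      static void example_init(struct example *e)
      {
              kmemcheck_annotate_bitfield(e, flags); /* removed by this series */
      }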
      
      [alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs]
        Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com
      Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.com
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tim Hansen <devtimhansen@gmail.com>
      Cc: Vegard Nossum <vegardno@ifi.uio.no>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: remove zlib from the list of recommended algorithms · 0b07ff39
      Submitted by Sergey Senozhatsky
      ZSTD tends to outperform deflate/inflate, thus we remove zlib from the
      list of recommended algorithms and recommend zstd instead.
      
      Link: http://lkml.kernel.org/r/20170912050005.3247-2-sergey.senozhatsky@gmail.com
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: add zstd to the supported algorithms list · 5ef3a8b1
      Submitted by Sergey Senozhatsky
      Add ZSTD to the list of supported compression algorithms.
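
      A hedged sketch of where the new algorithm shows up (table layout from
      memory of drivers/block/zram/zcomp.c in v4.14; illustrative only):

      /* The string must match a crypto API compressor name. */
      static const char * const backends[] = {
              "lzo",
      #if IS_ENABLED(CONFIG_CRYPTO_LZ4)
              "lz4",
      #endif
      #if IS_ENABLED(CONFIG_CRYPTO_ZSTD)
              "zstd",         /* new in this patch */
      #endif
              NULL
      };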
      
      ZRAM fio perf test:
      
                            LZO         DEFLATE         ZSTD
      
      #jobs1
      WRITE:              (2180MB/s)   (77.2MB/s)      (1429MB/s)
      WRITE:              (1617MB/s)   (77.7MB/s)      (1202MB/s)
      READ:                (426MB/s)   (595MB/s)       (1181MB/s)
      READ:                (422MB/s)   (572MB/s)       (1020MB/s)
      READ:                (318MB/s)   (67.8MB/s)      (563MB/s)
      WRITE:               (318MB/s)   (67.9MB/s)      (564MB/s)
      READ:                (336MB/s)   (68.3MB/s)      (583MB/s)
      WRITE:               (335MB/s)   (68.2MB/s)      (582MB/s)
      #jobs2
      WRITE:              (3441MB/s)   (152MB/s)       (2141MB/s)
      WRITE:              (2507MB/s)   (147MB/s)       (1888MB/s)
      READ:                (801MB/s)   (1146MB/s)      (1890MB/s)
      READ:                (767MB/s)   (1096MB/s)      (2073MB/s)
      READ:                (621MB/s)   (126MB/s)       (1009MB/s)
      WRITE:               (621MB/s)   (126MB/s)       (1009MB/s)
      READ:                (656MB/s)   (125MB/s)       (1075MB/s)
      WRITE:               (657MB/s)   (126MB/s)       (1077MB/s)
      #jobs3
      WRITE:              (4772MB/s)   (225MB/s)       (3394MB/s)
      WRITE:              (3905MB/s)   (211MB/s)       (2939MB/s)
      READ:               (1216MB/s)   (1608MB/s)      (3218MB/s)
      READ:               (1159MB/s)   (1431MB/s)      (2981MB/s)
      READ:                (906MB/s)   (156MB/s)       (1457MB/s)
      WRITE:               (907MB/s)   (156MB/s)       (1458MB/s)
      READ:                (953MB/s)   (158MB/s)       (1595MB/s)
      WRITE:               (952MB/s)   (157MB/s)       (1593MB/s)
      #jobs4
      WRITE:              (6036MB/s)   (265MB/s)       (4469MB/s)
      WRITE:              (5059MB/s)   (263MB/s)       (3951MB/s)
      READ:               (1618MB/s)   (2066MB/s)      (4276MB/s)
      READ:               (1573MB/s)   (1942MB/s)      (3830MB/s)
      READ:               (1202MB/s)   (227MB/s)       (1971MB/s)
      WRITE:              (1200MB/s)   (227MB/s)       (1968MB/s)
      READ:               (1265MB/s)   (226MB/s)       (2116MB/s)
      WRITE:              (1264MB/s)   (226MB/s)       (2114MB/s)
      #jobs5
      WRITE:              (5339MB/s)   (233MB/s)       (3781MB/s)
      WRITE:              (4298MB/s)   (234MB/s)       (3276MB/s)
      READ:               (1626MB/s)   (2048MB/s)      (4081MB/s)
      READ:               (1567MB/s)   (1929MB/s)      (3758MB/s)
      READ:               (1174MB/s)   (205MB/s)       (1747MB/s)
      WRITE:              (1173MB/s)   (204MB/s)       (1746MB/s)
      READ:               (1214MB/s)   (208MB/s)       (1890MB/s)
      WRITE:              (1215MB/s)   (208MB/s)       (1892MB/s)
      #jobs6
      WRITE:              (5666MB/s)   (270MB/s)       (4338MB/s)
      WRITE:              (4828MB/s)   (267MB/s)       (3772MB/s)
      READ:               (1803MB/s)   (2058MB/s)      (4946MB/s)
      READ:               (1805MB/s)   (2156MB/s)      (4711MB/s)
      READ:               (1334MB/s)   (235MB/s)       (2135MB/s)
      WRITE:              (1335MB/s)   (235MB/s)       (2137MB/s)
      READ:               (1364MB/s)   (236MB/s)       (2268MB/s)
      WRITE:              (1365MB/s)   (237MB/s)       (2270MB/s)
      #jobs7
      WRITE:              (5474MB/s)   (270MB/s)       (4300MB/s)
      WRITE:              (4666MB/s)   (266MB/s)       (3817MB/s)
      READ:               (2022MB/s)   (2319MB/s)      (5472MB/s)
      READ:               (1924MB/s)   (2260MB/s)      (5031MB/s)
      READ:               (1369MB/s)   (242MB/s)       (2153MB/s)
      WRITE:              (1370MB/s)   (242MB/s)       (2155MB/s)
      READ:               (1499MB/s)   (246MB/s)       (2310MB/s)
      WRITE:              (1497MB/s)   (246MB/s)       (2307MB/s)
      #jobs8
      WRITE:              (5558MB/s)   (273MB/s)       (4439MB/s)
      WRITE:              (4763MB/s)   (271MB/s)       (3918MB/s)
      READ:               (2201MB/s)   (2599MB/s)      (6062MB/s)
      READ:               (2105MB/s)   (2463MB/s)      (5413MB/s)
      READ:               (1490MB/s)   (252MB/s)       (2238MB/s)
      WRITE:              (1488MB/s)   (252MB/s)       (2236MB/s)
      READ:               (1566MB/s)   (254MB/s)       (2434MB/s)
      WRITE:              (1568MB/s)   (254MB/s)       (2437MB/s)
      #jobs9
      WRITE:              (5120MB/s)   (264MB/s)       (4035MB/s)
      WRITE:              (4531MB/s)   (267MB/s)       (3740MB/s)
      READ:               (1940MB/s)   (2258MB/s)      (4986MB/s)
      READ:               (2024MB/s)   (2387MB/s)      (4871MB/s)
      READ:               (1343MB/s)   (246MB/s)       (2038MB/s)
      WRITE:              (1342MB/s)   (246MB/s)       (2037MB/s)
      READ:               (1553MB/s)   (238MB/s)       (2243MB/s)
      WRITE:              (1552MB/s)   (238MB/s)       (2242MB/s)
      #jobs10
      WRITE:              (5345MB/s)   (271MB/s)       (3988MB/s)
      WRITE:              (4750MB/s)   (254MB/s)       (3668MB/s)
      READ:               (1876MB/s)   (2363MB/s)      (5150MB/s)
      READ:               (1990MB/s)   (2256MB/s)      (5080MB/s)
      READ:               (1355MB/s)   (250MB/s)       (2019MB/s)
      WRITE:              (1356MB/s)   (251MB/s)       (2020MB/s)
      READ:               (1490MB/s)   (252MB/s)       (2202MB/s)
      WRITE:              (1488MB/s)   (252MB/s)       (2199MB/s)
      
      jobs1                              perfstat (columns: LZO / DEFLATE / ZSTD)
      instructions                 52,065,555,710 (    0.79)    855,731,114,587 (    2.64)       54,280,709,944 (    1.40)
      branches                     14,020,427,116 ( 725.847)    101,733,449,582 (1074.521)       11,170,591,067 ( 992.869)
      branch-misses                    22,626,174 (   0.16%)        274,197,885 (   0.27%)           25,915,805 (   0.23%)
      jobs2                              perfstat
      instructions                103,633,110,402 (    0.75)  1,710,822,100,914 (    2.59)      107,879,874,104 (    1.28)
      branches                     27,931,237,282 ( 679.203)    203,298,267,479 (1037.326)       22,185,350,842 ( 884.427)
      branch-misses                    46,103,811 (   0.17%)        533,747,204 (   0.26%)           49,682,483 (   0.22%)
      jobs3                              perfstat
      instructions                154,857,283,657 (    0.76)  2,565,748,974,197 (    2.57)      161,515,435,813 (    1.31)
      branches                     41,759,490,355 ( 670.529)    304,905,605,277 ( 978.765)       33,215,805,907 ( 888.003)
      branch-misses                    74,263,293 (   0.18%)        759,746,240 (   0.25%)           76,841,196 (   0.23%)
      jobs4                              perfstat
      instructions                206,215,849,076 (    0.75)  3,420,169,460,897 (    2.60)      215,003,061,664 (    1.31)
      branches                     55,632,141,739 ( 666.501)    406,394,977,433 ( 927.241)       44,214,322,251 ( 883.532)
      branch-misses                   102,287,788 (   0.18%)      1,098,617,314 (   0.27%)          103,891,040 (   0.23%)
      jobs5                              perfstat
      instructions                258,711,315,588 (    0.67)  4,275,657,533,244 (    2.23)      269,332,235,685 (    1.08)
      branches                     69,802,821,166 ( 588.823)    507,996,211,252 ( 797.036)       55,450,846,129 ( 735.095)
      branch-misses                   129,217,214 (   0.19%)      1,243,284,991 (   0.24%)          173,512,278 (   0.31%)
      jobs6                              perfstat
      instructions                312,796,166,008 (    0.61)  5,133,896,344,660 (    2.02)      323,658,769,588 (    1.04)
      branches                     84,372,488,583 ( 520.541)    610,310,494,402 ( 697.642)       66,683,292,992 ( 693.939)
      branch-misses                   159,438,978 (   0.19%)      1,396,368,563 (   0.23%)          174,406,934 (   0.26%)
      jobs7                              perfstat
      instructions                363,211,372,930 (    0.56)  5,988,205,600,879 (    1.75)      377,824,674,156 (    0.93)
      branches                     98,057,013,765 ( 463.117)    711,841,255,974 ( 598.762)       77,879,009,954 ( 600.443)
      branch-misses                   199,513,153 (   0.20%)      1,507,651,077 (   0.21%)          248,203,369 (   0.32%)
      jobs8                              perfstat
      instructions                413,960,354,615 (    0.52)  6,842,918,558,378 (    1.45)      431,938,486,581 (    0.83)
      branches                    111,812,574,884 ( 414.224)    813,299,084,518 ( 491.173)       89,062,699,827 ( 517.795)
      branch-misses                   233,584,845 (   0.21%)      1,531,593,921 (   0.19%)          286,818,489 (   0.32%)
      jobs9                              perfstat
      instructions                465,976,220,300 (    0.53)  7,698,467,237,372 (    1.47)      486,352,600,321 (    0.84)
      branches                    125,931,456,162 ( 424.063)    915,207,005,715 ( 498.192)      100,370,404,090 ( 517.439)
      branch-misses                   256,992,445 (   0.20%)      1,782,809,816 (   0.19%)          345,239,380 (   0.34%)
      jobs10                             perfstat
      instructions                517,406,372,715 (    0.53)  8,553,527,312,900 (    1.48)      540,732,653,094 (    0.84)
      branches                    139,839,780,676 ( 427.732)  1,016,737,699,389 ( 503.172)      111,696,557,638 ( 516.750)
      branch-misses                   259,595,561 (   0.19%)      1,952,570,279 (   0.19%)          357,818,661 (   0.32%)
      
      seconds elapsed (columns: LZO / DEFLATE / ZSTD)
      seconds elapsed        20.630411534     96.084546565    12.743373571
      seconds elapsed        22.292627625     100.984155001   14.407413560
      seconds elapsed        22.396016966     110.344880848   14.032201392
      seconds elapsed        22.517330949     113.351459170   14.243074935
      seconds elapsed        28.548305104     156.515193765   19.159286861
      seconds elapsed        30.453538116     164.559937678   19.362492717
      seconds elapsed        33.467108086     188.486827481   21.492612173
      seconds elapsed        35.617727591     209.602677783   23.256422492
      seconds elapsed        42.584239509     243.959902566   28.458540338
      seconds elapsed        47.683632526     269.635248851   31.542404137
      
      Overall, ZSTD has slower WRITE but much faster READ (perhaps a static
      compression buffer used during the test helped ZSTD a lot), which
      results in faster overall test times.
      
      Memory consumption (zram mm_stat file):
      
      zram LZO mm_stat
      mm_stat (jobs1): 2147483648 23068672 33558528        0 33558528        0        0
      mm_stat (jobs2): 2147483648 23068672 33558528        0 33558528        0        0
      mm_stat (jobs3): 2147483648 23068672 33558528        0 33562624        0        0
      mm_stat (jobs4): 2147483648 23068672 33558528        0 33558528        0        0
      mm_stat (jobs5): 2147483648 23068672 33558528        0 33558528        0        0
      mm_stat (jobs6): 2147483648 23068672 33558528        0 33562624        0        0
      mm_stat (jobs7): 2147483648 23068672 33558528        0 33566720        0        0
      mm_stat (jobs8): 2147483648 23068672 33558528        0 33558528        0        0
      mm_stat (jobs9): 2147483648 23068672 33558528        0 33558528        0        0
      mm_stat (jobs10): 2147483648 23068672 33558528        0 33562624        0        0
      
      zram DEFLATE mm_stat
      mm_stat (jobs1): 2147483648 16252928 25178112        0 25178112        0        0
      mm_stat (jobs2): 2147483648 16252928 25178112        0 25178112        0        0
      mm_stat (jobs3): 2147483648 16252928 25178112        0 25178112        0        0
      mm_stat (jobs4): 2147483648 16252928 25178112        0 25178112        0        0
      mm_stat (jobs5): 2147483648 16252928 25178112        0 25178112        0        0
      mm_stat (jobs6): 2147483648 16252928 25178112        0 25178112        0        0
      mm_stat (jobs7): 2147483648 16252928 25178112        0 25190400        0        0
      mm_stat (jobs8): 2147483648 16252928 25178112        0 25190400        0        0
      mm_stat (jobs9): 2147483648 16252928 25178112        0 25178112        0        0
      mm_stat (jobs10): 2147483648 16252928 25178112        0 25178112        0        0
      
      zram ZSTD mm_stat
      mm_stat (jobs1): 2147483648 11010048 16781312        0 16781312        0        0
      mm_stat (jobs2): 2147483648 11010048 16781312        0 16781312        0        0
      mm_stat (jobs3): 2147483648 11010048 16781312        0 16785408        0        0
      mm_stat (jobs4): 2147483648 11010048 16781312        0 16781312        0        0
      mm_stat (jobs5): 2147483648 11010048 16781312        0 16781312        0        0
      mm_stat (jobs6): 2147483648 11010048 16781312        0 16781312        0        0
      mm_stat (jobs7): 2147483648 11010048 16781312        0 16781312        0        0
      mm_stat (jobs8): 2147483648 11010048 16781312        0 16781312        0        0
      mm_stat (jobs9): 2147483648 11010048 16781312        0 16785408        0        0
      mm_stat (jobs10): 2147483648 11010048 16781312        0 16781312        0        0
      
      ==================================================================================
      
      Official benchmarks [1]:
      
      Compressor name         Ratio   Compression     Decompress.
      zstd 1.1.3 -1           2.877   430 MB/s        1110 MB/s
      zlib 1.2.8 -1           2.743   110 MB/s        400 MB/s
      brotli 0.5.2 -0         2.708   400 MB/s        430 MB/s
      quicklz 1.5.0 -1        2.238   550 MB/s        710 MB/s
      lzo1x 2.09 -1           2.108   650 MB/s        830 MB/s
      lz4 1.7.5               2.101   720 MB/s        3600 MB/s
      snappy 1.1.3            2.091   500 MB/s        1650 MB/s
      lzf 3.6 -1              2.077   400 MB/s        860 MB/s
      
      Minchan said:
      
      : I did test with my sample data and compared zstd with deflate.  zstd's
      : compress ratio is lower a little bit but compression speed is much faster
      : 3 times more and decompress speed is too 2 times more.  With different
      : data, it is different but overall, zstd would be better for speed at the
      : cost of a little lower compress ratio(about 5%) so I believe it's worth to
      : replace deflate.
      
      [1] https://github.com/facebook/zstd
      
      Link: http://lkml.kernel.org/r/20170912050005.3247-1-sergey.senozhatsky@gmail.com
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Tested-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • bdi: introduce BDI_CAP_SYNCHRONOUS_IO · 23c47d2a
      Submitted by Minchan Kim
      As discussed at
      
        https://lkml.kernel.org/r/<20170728165604.10455-1-ross.zwisler@linux.intel.com>
      
      someday we will remove rw_page().  If so, we need something to detect
      such super-fast storage on which synchronous IO operations like the
      current rw_page are always a win.
      
      Introduce BDI_CAP_SYNCHRONOUS_IO to indicate such devices.  With it, we
      can apply various optimization techniques.
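
      A hedged sketch of how the capability is used; the setter is a plain
      flag OR, and the query helper name follows the existing bdi_cap_*
      convention and is assumed here, not verified:

      #include <linux/backing-dev.h>

      /* A driver for super-fast synchronous storage raises the flag once
       * at setup time...
       */
      static void mark_bdi_synchronous(struct backing_dev_info *bdi)
      {
              bdi->capabilities |= BDI_CAP_SYNCHRONOUS_IO;
      }

      /* ...and mm-side code can then special-case such devices. */
      static bool can_do_sync_io(struct backing_dev_info *bdi)
      {
              return bdi_cap_synchronous_io(bdi);
      }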
      
      Link: http://lkml.kernel.org/r/1505886205-9671-3-git-send-email-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: set BDI_CAP_STABLE_WRITES once · e447a015
      Submitted by Minchan Kim
      With fast swap storage, the platform wants to use swap more aggressively
      and swap-in is crucial to application latency.
      
      The rw_page()-based synchronous devices like zram, pmem and btt are
      such fast storage.  When I profile swap-in performance with a zram lz4
      decompress test, software overhead is more than 70%.  It might be even
      bigger on nvdimm.
      
      This patchset reduces swap-in latency by skipping swapcache if the swap
      device is a synchronous device like a rw_page() based device.
      
      It improves my swap-in test (5G sequential swap-in, no readahead) by
      45%, from 2.41sec to 1.64sec.
      
      This patch (of 4):
      
      Commit 19b7ccf8 ("block: get rid of blk_integrity_revalidate()")
      fixed a weird thing (i.e., resetting the BDI_CAP_STABLE_WRITES flag
      unconditionally whenever revalidate_disk is called), so zram no longer
      needs to reset the flag when revalidating the bdev.  Instead, set the
      flag just once when the zram device is created.
      
      It shouldn't change any behavior.
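
      A hedged sketch of the shape of the change (field spelling from memory
      of v4.14; illustrative):

      #include <linux/backing-dev.h>
      #include <linux/genhd.h>

      /* Instead of re-setting the flag in the revalidate path, zram_add()
       * raises it once when the disk is created.
       */
      static void zram_set_stable_writes(struct gendisk *disk)
      {
              disk->queue->backing_dev_info->capabilities |=
                      BDI_CAP_STABLE_WRITES;
      }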
      
      Link: http://lkml.kernel.org/r/1505886205-9671-2-git-send-email-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/infiniband/sw/rdmavt/qp.c: use kmalloc_array_node() · 3c073478
      Submitted by Johannes Thumshirn
      Now that we have a NUMA-aware version of kmalloc_array() we can use it
      instead of kmalloc_node() without an overflow check in the size
      calculation.
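
      A hedged sketch of the conversion pattern (the helper was added earlier
      in this same series):

      #include <linux/slab.h>

      /* The open-coded multiplication could overflow and needed its own
       * check; kmalloc_array_node() performs the overflow check internally.
       */
      static void *alloc_per_node_table(size_t n, size_t elem_size, int node)
      {
              /* before: kmalloc_node(n * elem_size, GFP_KERNEL, node); */
              return kmalloc_array_node(n, elem_size, GFP_KERNEL, node);
      }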
      
      Link: http://lkml.kernel.org/r/20170927082038.3782-5-jthumshirn@suse.de
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Damien Le Moal <damien.lemoal@wdc.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mike Marciniszyn <infinipath@intel.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/infiniband/hw/qib/qib_init.c: use kmalloc_array_node() · 7d502071
      Submitted by Johannes Thumshirn
      Now that we have a NUMA-aware version of kmalloc_array() we can use it
      instead of kmalloc_node() without an overflow check in the size
      calculation.
      
      Link: http://lkml.kernel.org/r/20170927082038.3782-4-jthumshirn@suse.de
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Cc: Mike Marciniszyn <infinipath@intel.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Damien Le Moal <damien.lemoal@wdc.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 15 November 2017, 4 commits
    • geneve: fix fill_info when link down · fd7eafd0
      Submitted by Hangbin Liu
      geneve->sock4/6 are set up in geneve_open and released in geneve_stop.
      So when the geneve link is down, we are unable to show the remote
      address and checksum info after commit 11387fe4 ("geneve: fix
      fill_info when using collect_metadata").
      
      Fix this by not passing *_REMOTE{,6} for COLLECT_METADATA, since they
      are mutually exclusive, and by always showing the UDP_ZERO_CSUM6_RX
      info.
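
      A hedged sketch of the resulting fill_info logic; the attribute and
      field names are recalled from the v4.14-era driver and may not match
      exactly, so treat this as illustrative:

      /* geneve_dev is private to drivers/net/geneve.c; only the fields the
       * logic needs are assumed here.
       */
      static int geneve_fill_info_sketch(struct sk_buff *skb,
                                         struct geneve_dev *geneve)
      {
              /* Remote attrs are skipped for COLLECT_METADATA: the two are
               * mutually exclusive, so there is nothing meaningful to dump.
               */
              if (!geneve->collect_md &&
                  nla_put_in_addr(skb, IFLA_GENEVE_REMOTE,
                                  geneve->info.key.u.ipv4.dst))
                      return -EMSGSIZE;

              /* ...but the zero-csum6-rx setting is always reported. */
              if (nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
                             !geneve->use_udp6_rx_checksums))
                      return -EMSGSIZE;

              return 0;
      }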
      
      Fixes: 11387fe4 ("geneve: fix fill_info when using collect_metadata")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cdc_ncm: GetNtbFormat endian fix · 6314dab4
      Submitted by Bjørn Mork
      The GetNtbFormat and SetNtbFormat requests operate on 16 bit little
      endian values. We get away with ignoring this most of the time, because
      we only care about USB_CDC_NCM_NTB16_FORMAT which is 0x0000.  This
      fails for USB_CDC_NCM_NTB32_FORMAT.
      
      Fix comparison between LE value from device and constant by converting
      the constant to LE.
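
      A hedged sketch of the core comparison after the fix:

      #include <linux/usb/cdc.h>

      /* The value read back from GetNtbFormat is 16-bit little endian.
       * Comparing it against a host-order constant happened to work for
       * NTB16 (0x0000) but breaks for NTB32, so the constant is converted.
       */
      static bool ntb_format_is_32(__le16 curr_ntb_format)
      {
              return curr_ntb_format == cpu_to_le16(USB_CDC_NCM_NTB32_FORMAT);
      }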
      Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Fixes: 2b02c20c ("cdc_ncm: Set NTB format again after altsetting switch for Huawei devices")
      Cc: Enrico Mioso <mrkiko.rs@gmail.com>
      Cc: Christian Panton <christian@panton.org>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Acked-by: Enrico Mioso <mrkiko.rs@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • usbnet: ipheth: prevent TX queue timeouts when device not ready · bb1b40c7
      Submitted by Alexander Kappner
      iOS devices require the host to be "trusted" before servicing network
      packets. Establishing trust requires the user to confirm a dialog on the
      iOS device. Until trust is established, the iOS device will silently discard
      network packets from the host. Currently, the ipheth driver does not detect
      whether an iOS device has established trust with the host, and immediately
      sets up the transmit queues.
      
      This causes the following problems:
      
      - Kernel taint due to WARN() in netdev watchdog.
      - Dmesg spam ("TX timeout").
      - Disruption of user space networking activity (dhcpd, etc.) when a new
        interface comes up but cannot be used.
      - Unnecessary host and device wakeups and USB traffic.
      
      Example dmesg output:
      
      [ 1101.319778] NETDEV WATCHDOG: eth1 (ipheth): transmit queue 0 timed out
      [ 1101.319817] ------------[ cut here ]------------
      [ 1101.319828] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x20f/0x220
      [ 1101.319831] Modules linked in: ipheth usbmon nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) iwlmvm mac80211 iwlwifi btusb btrtl btbcm btintel qmi_wwan bluetooth cfg80211 ecdh_generic thinkpad_acpi rfkill [last unloaded: ipheth]
      [ 1101.319861] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P           O    4.13.12.1 #1
      [ 1101.319864] Hardware name: LENOVO 20ENCTO1WW/20ENCTO1WW, BIOS N1EET62W (1.35 ) 11/10/2016
      [ 1101.319867] task: ffffffff81e11500 task.stack: ffffffff81e00000
      [ 1101.319873] RIP: 0010:dev_watchdog+0x20f/0x220
      [ 1101.319876] RSP: 0018:ffff8810a3c03e98 EFLAGS: 00010292
      [ 1101.319880] RAX: 000000000000003a RBX: 0000000000000000 RCX: 0000000000000000
      [ 1101.319883] RDX: ffff8810a3c15c48 RSI: ffffffff81ccbfc2 RDI: 00000000ffffffff
      [ 1101.319886] RBP: ffff880c04ebc41c R08: 0000000000000000 R09: 0000000000000379
      [ 1101.319889] R10: 00000100696589d0 R11: 0000000000000378 R12: ffff880c04ebc000
      [ 1101.319892] R13: 0000000000000000 R14: 0000000000000001 R15: ffff880c2865fc80
      [ 1101.319896] FS:  0000000000000000(0000) GS:ffff8810a3c00000(0000) knlGS:0000000000000000
      [ 1101.319899] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1101.319902] CR2: 00007f3ff24ac000 CR3: 0000000001e0a000 CR4: 00000000003406f0
      [ 1101.319905] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1101.319908] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1101.319910] Call Trace:
      [ 1101.319914]  <IRQ>
      [ 1101.319921]  ? dev_graft_qdisc+0x70/0x70
      [ 1101.319928]  ? dev_graft_qdisc+0x70/0x70
      [ 1101.319934]  ? call_timer_fn+0x2e/0x170
      [ 1101.319939]  ? dev_graft_qdisc+0x70/0x70
      [ 1101.319944]  ? run_timer_softirq+0x1ea/0x440
      [ 1101.319951]  ? timerqueue_add+0x54/0x80
      [ 1101.319956]  ? enqueue_hrtimer+0x38/0xa0
      [ 1101.319963]  ? __do_softirq+0xed/0x2e7
      [ 1101.319970]  ? irq_exit+0xb4/0xc0
      [ 1101.319976]  ? smp_apic_timer_interrupt+0x39/0x50
      [ 1101.319981]  ? apic_timer_interrupt+0x8c/0xa0
      [ 1101.319983]  </IRQ>
      [ 1101.319992]  ? cpuidle_enter_state+0xfa/0x2a0
      [ 1101.319999]  ? do_idle+0x1a3/0x1f0
      [ 1101.320004]  ? cpu_startup_entry+0x5f/0x70
      [ 1101.320011]  ? start_kernel+0x444/0x44c
      [ 1101.320017]  ? early_idt_handler_array+0x120/0x120
      [ 1101.320023]  ? x86_64_start_kernel+0x145/0x154
      [ 1101.320028]  ? secondary_startup_64+0x9f/0x9f
      [ 1101.320033] Code: 20 04 00 00 eb 9f 4c 89 e7 c6 05 59 44 71 00 01 e8 a7 df fd ff 89 d9 4c 89 e6 48 c7 c7 70 b7 cd 81 48 89 c2 31 c0 e8 97 64 90 ff <0f> ff eb bf 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
      [ 1101.320103] ---[ end trace 0cc4d251e2b57080 ]---
      [ 1101.320110] ipheth 1-5:4.2: ipheth_tx_timeout: TX timeout
      
      The last message "TX timeout" is repeated every 5 seconds until trust is
      established or the device is disconnected, filling up dmesg.
      
      The proposed patch eliminates the problem by, upon connection, keeping the
      TX queue and carrier disabled until a packet is first received from the iOS
      device. This is reflected by the confirmed_pairing variable in the device
      structure. Only after at least one packet has been received from the iOS
      device, the transmit queue and carrier are brought up during the periodic
      device poll in ipheth_carrier_set. Because the iOS device will always send
      a packet immediately upon trust being established, this should not delay
      the interface becoming usable. To prevent failed URBs in
      ipheth_rcvbulk_callback from perpetually re-enabling the queue if it was
      disabled, a new check is added so only successful transfers re-enable the
      queue, whereas failed transfers only trigger an immediate poll.
      
      This has the added benefit of removing the periodic control requests to the
      iOS device until trust has been established and thus should reduce wakeup
      events on both the host and the iOS device.
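
      A hedged sketch of the gating described above; confirmed_pairing is
      named in this message, but the helper shape and the simplified struct
      are illustrative:

      #include <linux/netdevice.h>

      struct ipheth_device_sketch {
              struct net_device *net;
              bool confirmed_pairing; /* set once a packet arrives from iOS */
      };

      static void ipheth_carrier_update(struct ipheth_device_sketch *dev)
      {
              if (dev->confirmed_pairing) {
                      netif_carrier_on(dev->net);
                      netif_wake_queue(dev->net);
              } else {
                      netif_carrier_off(dev->net);
              }
      }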
      Signed-off-by: Alexander Kappner <agk@godking.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vhost_net: conditionally enable tx polling · feb8892c
      Submitted by Jason Wang
      We always poll tx for the socket.  This is suboptimal since it slightly
      increases waitqueue traversal time and, more importantly, vhost cannot
      benefit from commit 9e641bdc ("net-tun: restructure tun_do_read for
      better sleep/wakeup efficiency"): even though we stop rx polling during
      handle_rx(), the tx poll is still left in the waitqueue.
      
      Pktgen from a remote host to a VM over mlx4, on two 2.00GHz Xeon
      E5-2650 machines, shows an 11.7% improvement in rx PPS (from 1.28Mpps
      to 1.44Mpps).
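
      A hedged sketch of the idea, not the literal diff; vhost_net and
      vhost_virtqueue are private to drivers/vhost, and the helper name
      vhost_net_enable_vq is recalled from that file:

      /* The tx poll entry starts disarmed and is only (re)enabled when the
       * socket would block, so a wait entry is not parked on the socket's
       * waitqueue permanently.
       */
      static void handle_tx_idea(struct vhost_net *net,
                                 struct vhost_virtqueue *vq,
                                 struct socket *sock, struct msghdr *msg,
                                 size_t len)
      {
              int err = sock->ops->sendmsg(sock, msg, len);

              if (err == -EAGAIN)
                      vhost_net_enable_vq(net, vq); /* arm polling on demand */
      }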
      
      Cc: Wei Xu <wexu@redhat.com>
      Cc: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 14 November 2017, 25 commits