1. 21 May, 2016 4 commits
    • S
      zram: introduce per-device debug_stat sysfs node · 623e47fc
      Sergey Senozhatsky committed
      debug_stat sysfs is read-only and represents various debugging data that
      zram developers may need.  This file is not meant to be used by anyone
      else: its content is not documented and will change any time w/o any
      notice.  Therefore, the output of debug_stat file contains a version
      string.  To avoid any confusion, we will increase the version number
      every time we modify the output.
      
      At the moment this file exports only one value -- the number of
      re-compressions, IOW, the number of times compression fast path has
      failed.  This stat is temporary and will be useful if any per-cpu
      compression stream regressions are reported.
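The versioning idea described above can be sketched in userspace C. This is only an illustration: the two-line layout, the `DEBUG_STAT_VERSION` constant, and both function names are assumptions made for the example, since the real debug_stat content is deliberately undocumented.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical version number; bumped whenever the layout below changes. */
#define DEBUG_STAT_VERSION 1

/*
 * Format a versioned debug blob into buf. The layout is an assumption for
 * illustration, not the kernel's actual debug_stat format.
 * Returns what snprintf() returns.
 */
static int format_debug_stat(char *buf, size_t size, uint64_t recompressions)
{
    return snprintf(buf, size, "version: %d\n%llu\n",
                    DEBUG_STAT_VERSION, (unsigned long long)recompressions);
}

/*
 * Parse a blob produced by format_debug_stat(). A reader that sees an
 * unknown version refuses to interpret the rest of the output.
 * Returns 0 on success, -1 on malformed input or version mismatch.
 */
static int parse_debug_stat(const char *buf, uint64_t *recompressions)
{
    int version;
    unsigned long long value;

    if (sscanf(buf, "version: %d\n%llu", &version, &value) != 2)
        return -1;
    if (version != DEBUG_STAT_VERSION)
        return -1;
    *recompressions = value;
    return 0;
}
```

A tool written against version 1 would then fail loudly rather than misread a later layout.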
      
      Link: http://lkml.kernel.org/r/20160513230834.GB26763@bbox
      Link: http://lkml.kernel.org/r/20160511134553.12655-1-sergey.senozhatsky@gmail.com
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      623e47fc
    • S
      zram: remove max_comp_streams internals · 43209ea2
      Sergey Senozhatsky committed
      Remove the internal part of max_comp_streams interface, since we
      switched to per-cpu streams.  We will keep RW max_comp_streams attr
      around, because:
      
      a) we may (silently) switch back to idle compression streams list and
         don't want to disturb user space
      
      b) max_comp_streams attr must wait for the next 'lay off cycle'; we
         give user space 2 years to adjust before we remove/downgrade the attr,
         and there are already several attrs scheduled for removal in 4.11, so
         it's too late for max_comp_streams.
      
      This slightly changes user-visible behaviour:
      
      - First, reading from the max_comp_streams file will now always return
        the number of online CPUs.
      
      - Second, writing to max_comp_streams will have no effect.
      
      Link: http://lkml.kernel.org/r/20160503165546.25201-1-sergey.senozhatsky@gmail.com
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      43209ea2
    • S
      zram: use per-cpu compression streams · da9556a2
      Sergey Senozhatsky committed
      Remove idle streams list and keep compression streams in per-cpu data.
      This removes two contended spin_lock()/spin_unlock() calls from the write
      path and also prevents a write OP from being preempted while holding the
      compression stream, which can cause slowdowns.
      
      For instance, let's assume that we have N cpus and N-2
      max_comp_streams.  TASK1 owns the last idle stream; TASK2 and TASK3 come
      in with write requests:
      
        TASK1            TASK2              TASK3
       zram_bvec_write()
        spin_lock
        find stream
        spin_unlock
      
        compress
      
        <<preempted>>   zram_bvec_write()
                         spin_lock
                         find stream
                         spin_unlock
                           no_stream
                             schedule
                                           zram_bvec_write()
                                            spin_lock
                                            find_stream
                                            spin_unlock
                                              no_stream
                                                schedule
         spin_lock
         release stream
         spin_unlock
           wake up TASK2
      
      Not only will TASK2 and TASK3 fail to get the stream, but TASK1 will
      also be preempted in the middle of its operation, whereas we would
      prefer it to finish compression and release the stream.
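The per-cpu alternative to the scenario above can be modelled in a few lines of userspace C. This is a simplified sketch, not the zram code: `NR_CPUS`, `struct comp_stream`, and the `stream_get()`/`stream_put()` helpers are hypothetical names, and the CPU number is passed in explicitly because the sketch cannot disable preemption.

```c
#include <stddef.h>

#define NR_CPUS 4   /* assumption: fixed CPU count for the sketch */

struct comp_stream {
    int cpu;      /* CPU slot this stream belongs to */
    int in_use;   /* set while a write is compressing */
};

/* One stream per CPU instead of a shared, lock-protected idle list. */
static struct comp_stream percpu_streams[NR_CPUS];

/*
 * Take the stream of the given CPU. In the kernel the caller would run
 * with preemption disabled, so no other task on that CPU can compete for
 * the slot; no lock and no search of an idle list is needed.
 */
static struct comp_stream *stream_get(int cpu)
{
    struct comp_stream *s;

    if (cpu < 0 || cpu >= NR_CPUS)
        return NULL;
    s = &percpu_streams[cpu];
    s->cpu = cpu;
    s->in_use = 1;
    return s;
}

/* Release the stream; in the kernel this would re-enable preemption. */
static void stream_put(struct comp_stream *s)
{
    if (s)
        s->in_use = 0;
}
```

Because every CPU has its own slot, a writer never waits for another task to release a stream, which is the situation TASK2 and TASK3 end up in with the idle list.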
      
      Test environment: x86_64, 4 CPU box, 3G zram, lzo
      
      The following fio tests were executed:
            read, randread, write, randwrite, rw, randrw
      with the increasing number of jobs from 1 to 10.
      
                        4 streams        8 streams       per-cpu
        ===========================================================
        jobs1
        READ:           2520.1MB/s       2566.5MB/s      2491.5MB/s
        READ:           2102.7MB/s       2104.2MB/s      2091.3MB/s
        WRITE:          1355.1MB/s       1320.2MB/s      1378.9MB/s
        WRITE:          1103.5MB/s       1097.2MB/s      1122.5MB/s
        READ:           434013KB/s       435153KB/s      439961KB/s
        WRITE:          433969KB/s       435109KB/s      439917KB/s
        READ:           403166KB/s       405139KB/s      403373KB/s
        WRITE:          403223KB/s       405197KB/s      403430KB/s
        jobs2
        READ:           7958.6MB/s       8105.6MB/s      8073.7MB/s
        READ:           6864.9MB/s       6989.8MB/s      7021.8MB/s
        WRITE:          2438.1MB/s       2346.9MB/s      3400.2MB/s
        WRITE:          1994.2MB/s       1990.3MB/s      2941.2MB/s
        READ:           981504KB/s       973906KB/s      1018.8MB/s
        WRITE:          981659KB/s       974060KB/s      1018.1MB/s
        READ:           937021KB/s       938976KB/s      987250KB/s
        WRITE:          934878KB/s       936830KB/s      984993KB/s
        jobs3
        READ:           13280MB/s        13553MB/s       13553MB/s
        READ:           11534MB/s        11785MB/s       11755MB/s
        WRITE:          3456.9MB/s       3469.9MB/s      4810.3MB/s
        WRITE:          3029.6MB/s       3031.6MB/s      4264.8MB/s
        READ:           1363.8MB/s       1362.6MB/s      1448.9MB/s
        WRITE:          1361.9MB/s       1360.7MB/s      1446.9MB/s
        READ:           1309.4MB/s       1310.6MB/s      1397.5MB/s
        WRITE:          1307.4MB/s       1308.5MB/s      1395.3MB/s
        jobs4
        READ:           20244MB/s        20177MB/s       20344MB/s
        READ:           17886MB/s        17913MB/s       17835MB/s
        WRITE:          4071.6MB/s       4046.1MB/s      6370.2MB/s
        WRITE:          3608.9MB/s       3576.3MB/s      5785.4MB/s
        READ:           1824.3MB/s       1821.6MB/s      1997.5MB/s
        WRITE:          1819.8MB/s       1817.4MB/s      1992.5MB/s
        READ:           1765.7MB/s       1768.3MB/s      1937.3MB/s
        WRITE:          1767.5MB/s       1769.1MB/s      1939.2MB/s
        jobs5
        READ:           18663MB/s        18986MB/s       18823MB/s
        READ:           16659MB/s        16605MB/s       16954MB/s
        WRITE:          3912.4MB/s       3888.7MB/s      6126.9MB/s
        WRITE:          3506.4MB/s       3442.5MB/s      5519.3MB/s
        READ:           1798.2MB/s       1746.5MB/s      1935.8MB/s
        WRITE:          1792.7MB/s       1740.7MB/s      1929.1MB/s
        READ:           1727.6MB/s       1658.2MB/s      1917.3MB/s
        WRITE:          1726.5MB/s       1657.2MB/s      1916.6MB/s
        jobs6
        READ:           21017MB/s        20922MB/s       21162MB/s
        READ:           19022MB/s        19140MB/s       18770MB/s
        WRITE:          3968.2MB/s       4037.7MB/s      6620.8MB/s
        WRITE:          3643.5MB/s       3590.2MB/s      6027.5MB/s
        READ:           1871.8MB/s       1880.5MB/s      2049.9MB/s
        WRITE:          1867.8MB/s       1877.2MB/s      2046.2MB/s
        READ:           1755.8MB/s       1710.3MB/s      1964.7MB/s
        WRITE:          1750.5MB/s       1705.9MB/s      1958.8MB/s
        jobs7
        READ:           21103MB/s        20677MB/s       21482MB/s
        READ:           18522MB/s        18379MB/s       19443MB/s
        WRITE:          4022.5MB/s       4067.4MB/s      6755.9MB/s
        WRITE:          3691.7MB/s       3695.5MB/s      5925.6MB/s
        READ:           1841.5MB/s       1933.9MB/s      2090.5MB/s
        WRITE:          1842.7MB/s       1935.3MB/s      2091.9MB/s
        READ:           1832.4MB/s       1856.4MB/s      1971.5MB/s
        WRITE:          1822.3MB/s       1846.2MB/s      1960.6MB/s
        jobs8
        READ:           20463MB/s        20194MB/s       20862MB/s
        READ:           18178MB/s        17978MB/s       18299MB/s
        WRITE:          4085.9MB/s       4060.2MB/s      7023.8MB/s
        WRITE:          3776.3MB/s       3737.9MB/s      6278.2MB/s
        READ:           1957.6MB/s       1944.4MB/s      2109.5MB/s
        WRITE:          1959.2MB/s       1946.2MB/s      2111.4MB/s
        READ:           1900.6MB/s       1885.7MB/s      2082.1MB/s
        WRITE:          1896.2MB/s       1881.4MB/s      2078.3MB/s
        jobs9
        READ:           19692MB/s        19734MB/s       19334MB/s
        READ:           17678MB/s        18249MB/s       17666MB/s
        WRITE:          4004.7MB/s       4064.8MB/s      6990.7MB/s
        WRITE:          3724.7MB/s       3772.1MB/s      6193.6MB/s
        READ:           1953.7MB/s       1967.3MB/s      2105.6MB/s
        WRITE:          1953.4MB/s       1966.7MB/s      2104.1MB/s
        READ:           1860.4MB/s       1897.4MB/s      2068.5MB/s
        WRITE:          1858.9MB/s       1895.9MB/s      2066.8MB/s
        jobs10
        READ:           19730MB/s        19579MB/s       19492MB/s
        READ:           18028MB/s        18018MB/s       18221MB/s
        WRITE:          4027.3MB/s       4090.6MB/s      7020.1MB/s
        WRITE:          3810.5MB/s       3846.8MB/s      6426.8MB/s
        READ:           1956.1MB/s       1994.6MB/s      2145.2MB/s
        WRITE:          1955.9MB/s       1993.5MB/s      2144.8MB/s
        READ:           1852.8MB/s       1911.6MB/s      2075.8MB/s
        WRITE:          1855.7MB/s       1914.6MB/s      2078.1MB/s
      
      perf stat
      
                                        4 streams                       8 streams                       per-cpu
        ====================================================================================================================
        jobs1
        stalled-cycles-frontend      23,174,811,209 (  38.21%)     23,220,254,188 (  38.25%)       23,061,406,918 (  38.34%)
        stalled-cycles-backend       11,514,174,638 (  18.98%)     11,696,722,657 (  19.27%)       11,370,852,810 (  18.90%)
        instructions                 73,925,005,782 (    1.22)     73,903,177,632 (    1.22)       73,507,201,037 (    1.22)
        branches                     14,455,124,835 ( 756.063)     14,455,184,779 ( 755.281)       14,378,599,509 ( 758.546)
        branch-misses                    69,801,336 (   0.48%)         80,225,529 (   0.55%)           72,044,726 (   0.50%)
        jobs2
        stalled-cycles-frontend      49,912,741,782 (  46.11%)     50,101,189,290 (  45.95%)       32,874,195,633 (  35.11%)
        stalled-cycles-backend       27,080,366,230 (  25.02%)     27,949,970,232 (  25.63%)       16,461,222,706 (  17.58%)
        instructions                122,831,629,690 (    1.13)    122,919,846,419 (    1.13)      121,924,786,775 (    1.30)
        branches                     23,725,889,239 ( 692.663)     23,733,547,140 ( 688.062)       23,553,950,311 ( 794.794)
        branch-misses                    90,733,041 (   0.38%)         96,320,895 (   0.41%)           84,561,092 (   0.36%)
        jobs3
        stalled-cycles-frontend      66,437,834,608 (  45.58%)     63,534,923,344 (  43.69%)       42,101,478,505 (  33.19%)
        stalled-cycles-backend       34,940,799,661 (  23.97%)     34,774,043,148 (  23.91%)       21,163,324,388 (  16.68%)
        instructions                171,692,121,862 (    1.18)    171,775,373,044 (    1.18)      170,353,542,261 (    1.34)
        branches                     32,968,962,622 ( 628.723)     32,987,739,894 ( 630.512)       32,729,463,918 ( 717.027)
        branch-misses                   111,522,732 (   0.34%)        110,472,894 (   0.33%)           99,791,291 (   0.30%)
        jobs4
        stalled-cycles-frontend      98,741,701,675 (  49.72%)     94,797,349,965 (  47.59%)       54,535,655,381 (  33.53%)
        stalled-cycles-backend       54,642,609,615 (  27.51%)     55,233,554,408 (  27.73%)       27,882,323,541 (  17.14%)
        instructions                220,884,807,851 (    1.11)    220,930,887,273 (    1.11)      218,926,845,851 (    1.35)
        branches                     42,354,518,180 ( 592.105)     42,362,770,587 ( 590.452)       41,955,552,870 ( 716.154)
        branch-misses                   138,093,449 (   0.33%)        131,295,286 (   0.31%)          121,794,771 (   0.29%)
        jobs5
        stalled-cycles-frontend     116,219,747,212 (  48.14%)    110,310,397,012 (  46.29%)       66,373,082,723 (  33.70%)
        stalled-cycles-backend       66,325,434,776 (  27.48%)     64,157,087,914 (  26.92%)       32,999,097,299 (  16.76%)
        instructions                270,615,008,466 (    1.12)    270,546,409,525 (    1.14)      268,439,910,948 (    1.36)
        branches                     51,834,046,557 ( 599.108)     51,811,867,722 ( 608.883)       51,412,576,077 ( 729.213)
        branch-misses                   158,197,086 (   0.31%)        142,639,805 (   0.28%)          133,425,455 (   0.26%)
        jobs6
        stalled-cycles-frontend     138,009,414,492 (  48.23%)    139,063,571,254 (  48.80%)       75,278,568,278 (  32.80%)
        stalled-cycles-backend       79,211,949,650 (  27.68%)     79,077,241,028 (  27.75%)       37,735,797,899 (  16.44%)
        instructions                319,763,993,731 (    1.12)    319,937,782,834 (    1.12)      316,663,600,784 (    1.38)
        branches                     61,219,433,294 ( 595.056)     61,250,355,540 ( 598.215)       60,523,446,617 ( 733.706)
        branch-misses                   169,257,123 (   0.28%)        154,898,028 (   0.25%)          141,180,587 (   0.23%)
        jobs7
        stalled-cycles-frontend     162,974,812,119 (  49.20%)    159,290,061,987 (  48.43%)       88,046,641,169 (  33.21%)
        stalled-cycles-backend       92,223,151,661 (  27.84%)     91,667,904,406 (  27.87%)       44,068,454,971 (  16.62%)
        instructions                369,516,432,430 (    1.12)    369,361,799,063 (    1.12)      365,290,380,661 (    1.38)
        branches                     70,795,673,950 ( 594.220)     70,743,136,124 ( 597.876)       69,803,996,038 ( 732.822)
        branch-misses                   181,708,327 (   0.26%)        165,767,821 (   0.23%)          150,109,797 (   0.22%)
        jobs8
        stalled-cycles-frontend     185,000,017,027 (  49.30%)    182,334,345,473 (  48.37%)       99,980,147,041 (  33.26%)
        stalled-cycles-backend      105,753,516,186 (  28.18%)    107,937,830,322 (  28.63%)       51,404,177,181 (  17.10%)
        instructions                418,153,161,055 (    1.11)    418,308,565,828 (    1.11)      413,653,475,581 (    1.38)
        branches                     80,035,882,398 ( 592.296)     80,063,204,510 ( 589.843)       79,024,105,589 ( 730.530)
        branch-misses                   199,764,528 (   0.25%)        177,936,926 (   0.22%)          160,525,449 (   0.20%)
        jobs9
        stalled-cycles-frontend     210,941,799,094 (  49.63%)    204,714,679,254 (  48.55%)      114,251,113,756 (  33.96%)
        stalled-cycles-backend      122,640,849,067 (  28.85%)    122,188,553,256 (  28.98%)       58,360,041,127 (  17.35%)
        instructions                468,151,025,415 (    1.10)    467,354,869,323 (    1.11)      462,665,165,216 (    1.38)
        branches                     89,657,067,510 ( 585.628)     89,411,550,407 ( 588.990)       88,360,523,943 ( 730.151)
        branch-misses                   218,292,301 (   0.24%)        191,701,247 (   0.21%)          178,535,678 (   0.20%)
        jobs10
        stalled-cycles-frontend     233,595,958,008 (  49.81%)    227,540,615,689 (  49.11%)      160,341,979,938 (  43.07%)
        stalled-cycles-backend      136,153,676,021 (  29.03%)    133,635,240,742 (  28.84%)       65,909,135,465 (  17.70%)
        instructions                517,001,168,497 (    1.10)    516,210,976,158 (    1.11)      511,374,038,613 (    1.37)
        branches                     98,911,641,329 ( 585.796)     98,700,069,712 ( 591.583)       97,646,761,028 ( 728.712)
        branch-misses                   232,341,823 (   0.23%)        199,256,308 (   0.20%)          183,135,268 (   0.19%)
      
      per-cpu streams tend to cause significantly fewer stalled cycles, execute
      fewer branches, and hit fewer branch-misses.
      
      perf stat reported execution time
      
                                4 streams        8 streams       per-cpu
        ====================================================================
        jobs1
        seconds elapsed        20.909073870     20.875670495    20.817838540
        jobs2
        seconds elapsed        18.529488399     18.720566469    16.356103108
        jobs3
        seconds elapsed        18.991159531     18.991340812    16.766216066
        jobs4
        seconds elapsed        19.560643828     19.551323547    16.246621715
        jobs5
        seconds elapsed        24.746498464     25.221646740    20.696112444
        jobs6
        seconds elapsed        28.258181828     28.289765505    22.885688857
        jobs7
        seconds elapsed        32.632490241     31.909125381    26.272753738
        jobs8
        seconds elapsed        35.651403851     36.027596308    29.108024711
        jobs9
        seconds elapsed        40.569362365     40.024227989    32.898204012
        jobs10
        seconds elapsed        44.673112304     43.874898137    35.632952191
      
      Please see
        Link: http://marc.info/?l=linux-kernel&m=146166970727530
        Link: http://marc.info/?l=linux-kernel&m=146174716719650
      for more test results (under low memory conditions).
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      da9556a2
    • S
      zsmalloc: require GFP in zs_malloc() · d0d8da2d
      Sergey Senozhatsky committed
      Pass GFP flags to zs_malloc() instead of using a fixed mask supplied to
      zs_create_pool(), so we can be more flexible, but, more importantly, we
      need this to switch zram to per-cpu compression streams -- zram will try
      to allocate handle with preemption disabled in a fast path and switch to
      a slow path (using different gfp mask) if the fast one has failed.
      
      Apart from that, this also aligns the zs_malloc() interface with zpool/zbud.
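The fast-path/slow-path allocation described in this commit can be sketched as follows. This is a userspace model under stated assumptions: the `SKETCH_GFP_*` values, `sketch_alloc()`, `alloc_handle()`, and the failure knob are all hypothetical stand-ins and are not the zsmalloc or zram API.

```c
#include <stdlib.h>
#include <stdbool.h>

/* Hypothetical flag values standing in for real GFP masks. */
#define SKETCH_GFP_NOWAIT     0x1u  /* caller must not sleep */
#define SKETCH_GFP_MAY_SLEEP  0x2u  /* caller may sleep */

/* Test knob: simulates memory pressure on the non-sleeping path. */
static bool sketch_fast_path_fails;

/* Counts how often the fast path failed and the slow path was taken. */
static unsigned long sketch_recompressions;

/* Allocator that takes the mask per call rather than per pool. */
static void *sketch_alloc(size_t size, unsigned int gfp)
{
    if ((gfp & SKETCH_GFP_NOWAIT) && sketch_fast_path_fails)
        return NULL;
    return malloc(size);
}

static void *alloc_handle(size_t size)
{
    /* Fast path: the per-cpu stream is held, so sleeping is not allowed. */
    void *handle = sketch_alloc(size, SKETCH_GFP_NOWAIT);

    if (handle)
        return handle;

    /*
     * Slow path: the stream would be released here (not modelled), the
     * allocation is retried with a mask that may sleep, and the data has
     * to be compressed again afterwards.
     */
    sketch_recompressions++;
    return sketch_alloc(size, SKETCH_GFP_MAY_SLEEP);
}
```

Passing the mask per call is what makes the two attempts possible; with a single mask fixed at pool creation, both attempts would behave identically.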
      
      [sergey.senozhatsky@gmail.com: pass GFP flags to zs_malloc() instead of using a fixed mask]
        Link: http://lkml.kernel.org/r/20160429150942.GA637@swordfish
      Link: http://lkml.kernel.org/r/20160429150942.GA637@swordfish
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d0d8da2d
  2. 20 May, 2016 1 commit
    • J
      mm: rename _count, field of the struct page, to _refcount · 0139aa7b
      Joonsoo Kim committed
      Many developers already know that the reference count field of struct
      page is _count and that it has an atomic type.  They may try to handle
      it directly, which would defeat the purpose of the page reference count
      tracepoint.  To prevent direct _count modification, this patch renames
      it to _refcount and adds a warning message in the code.  After that,
      developers who need to handle the reference count will see that the
      field should not be accessed directly.
      
      [akpm@linux-foundation.org: fix comments, per Vlastimil]
      [akpm@linux-foundation.org: Documentation/vm/transhuge.txt too]
      [sfr@canb.auug.org.au: sync ethernet driver changes]
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Sunil Goutham <sgoutham@cavium.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Manish Chopra <manish.chopra@qlogic.com>
      Cc: Yuval Mintz <yuval.mintz@qlogic.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0139aa7b
  3. 11 May, 2016 1 commit
  4. 28 April, 2016 2 commits
    • I
      rbd: report unsupported features to syslog · d3767f0f
      Ilya Dryomov committed
      ... instead of just returning an error.
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Josh Durgin <jdurgin@redhat.com>
      d3767f0f
    • I
      rbd: fix rbd map vs notify races · 811c6688
      Ilya Dryomov committed
      A while ago, commit 9875201e ("rbd: fix use-after free of
      rbd_dev->disk") fixed rbd unmap vs notify race by introducing
      an exported wrapper for flushing notifies and sticking it into
      do_rbd_remove().
      
      A similar problem exists on the rbd map path, though: the watch is
      registered in rbd_dev_image_probe(), while the disk is set up quite
      a few steps later, in rbd_dev_device_setup().  Nothing prevents
      a notify from coming in and crashing on a NULL rbd_dev->disk:
      
          BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
          Call Trace:
           [<ffffffffa0508344>] rbd_watch_cb+0x34/0x180 [rbd]
           [<ffffffffa04bd290>] do_event_work+0x40/0xb0 [libceph]
           [<ffffffff8109d5db>] process_one_work+0x17b/0x470
           [<ffffffff8109e3ab>] worker_thread+0x11b/0x400
           [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400
           [<ffffffff810a5acf>] kthread+0xcf/0xe0
           [<ffffffff810b41b3>] ? finish_task_switch+0x53/0x170
           [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
           [<ffffffff81645dd8>] ret_from_fork+0x58/0x90
           [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
          RIP  [<ffffffffa050828a>] rbd_dev_refresh+0xfa/0x180 [rbd]
      
      If an error occurs during rbd map, we have to error out, potentially
      tearing down a watch.  Just like on rbd unmap, notifies have to be
      flushed, otherwise rbd_watch_cb() may end up trying to read in the
      image header after rbd_dev_image_release() has run:
      
          Assertion failure in rbd_dev_header_info() at line 4722:
      
           rbd_assert(rbd_image_format_valid(rbd_dev->image_format));
      
          Call Trace:
           [<ffffffff81cccee0>] ? rbd_parent_request_create+0x150/0x150
           [<ffffffff81cd4e59>] rbd_dev_refresh+0x59/0x390
           [<ffffffff81cd5229>] rbd_watch_cb+0x69/0x290
           [<ffffffff81fde9bf>] do_event_work+0x10f/0x1c0
           [<ffffffff81107799>] process_one_work+0x689/0x1a80
           [<ffffffff811076f7>] ? process_one_work+0x5e7/0x1a80
           [<ffffffff81132065>] ? finish_task_switch+0x225/0x640
           [<ffffffff81107110>] ? pwq_dec_nr_in_flight+0x2b0/0x2b0
           [<ffffffff81108c69>] worker_thread+0xd9/0x1320
           [<ffffffff81108b90>] ? process_one_work+0x1a80/0x1a80
           [<ffffffff8111b02d>] kthread+0x21d/0x2e0
           [<ffffffff8111ae10>] ? kthread_stop+0x550/0x550
           [<ffffffff82022802>] ret_from_fork+0x22/0x40
           [<ffffffff8111ae10>] ? kthread_stop+0x550/0x550
          RIP  [<ffffffff81ccd8f9>] rbd_dev_header_info+0xa19/0x1e30
      
      To fix this, a) check if RBD_DEV_FLAG_EXISTS is set before calling
      revalidate_disk(), b) move ceph_osdc_flush_notifies() call into
      rbd_dev_header_unwatch_sync() to cover rbd map error paths and c) turn
      header read-in into a critical section.  The latter also happens to
      take care of rbd map foo@bar vs rbd snap rm foo@bar race.
      
      Fixes: http://tracker.ceph.com/issues/15490
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Josh Durgin <jdurgin@redhat.com>
      811c6688
  5. 26 April, 2016 1 commit
    • J
      skd: remove broken discard support · 49bdedb3
      Jeff Moyer committed
      Simply creating a file system on an skd device, followed by mount and
      fstrim will result in errors in the logs and then a BUG().  Let's remove
      discard support from that driver.  As far as I can tell, it hasn't
      worked right since it was merged.  This patch also has a side-effect of
      cleaning up an unintentional shadowed declaration inside of
      skd_end_request.
      
      I tested to ensure that I can still do I/O to the device using xfstests
      ./check -g quick.  I didn't do anything more extensive than that,
      though.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      49bdedb3
  6. 15 April, 2016 1 commit
    • M
      block: loop: fix filesystem corruption in case of aio/dio · a7297a6a
      Ming Lei committed
      Starting from commit e36f6204 ("block: split bios to max possible
      length"), the block core may split a bio in the middle of a bvec.
      
      Unfortunately, loop dio/aio doesn't consider this situation and
      always treats 'iter.iov_offset' as zero, which results in filesystem
      corruption.
      
      This patch figures out the offset of the base bvec via
      'bio->bi_iter.bi_bvec_done' and fixes the issue by passing the offset
      to the iov iterator.
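The effect of the missing offset can be seen in a small userspace model. The structures below are simplified, hypothetical stand-ins for the kernel's bvec, bio iterator, and iov iterator; only the idea of carrying the "bytes already done in the first bvec" value into the iterator comes from the commit message.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; names are hypothetical. */
struct sketch_bvec {
    size_t len;            /* length of this segment in bytes */
};

struct sketch_bio_iter {
    size_t idx;            /* index of the current segment */
    size_t bvec_done;      /* bytes already consumed in that segment */
};

struct sketch_iov_iter {
    const struct sketch_bvec *bvec;  /* first segment to process */
    size_t nr_segs;                  /* segments remaining */
    size_t iov_offset;               /* start offset inside the first segment */
    size_t count;                    /* total bytes to transfer */
};

/* Build the iterator for a bio that may start in the middle of a segment. */
static void sketch_iter_init(struct sketch_iov_iter *it,
                             const struct sketch_bvec *vecs, size_t nr_vecs,
                             const struct sketch_bio_iter *bi, size_t count)
{
    it->bvec = vecs + bi->idx;
    it->nr_segs = nr_vecs - bi->idx;
    it->count = count;
    /* The fix: start where the split left off instead of assuming 0. */
    it->iov_offset = bi->bvec_done;
}

/* Bytes the iterator will take from its first segment. */
static size_t sketch_first_seg_bytes(const struct sketch_iov_iter *it)
{
    size_t avail = it->bvec[0].len - it->iov_offset;

    return avail < it->count ? avail : it->count;
}
```

With two 4096-byte segments and a bio split 1024 bytes into the first one, the iterator starts at offset 1024 and takes 3072 bytes from that segment; treating the offset as zero would read or write the wrong 1024 bytes.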
      
      Fixes: e36f6204 (block: split bios to max possible length)
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: stable@vger.kernel.org (4.5)
      Signed-off-by: Ming Lei <ming.lei@canonical.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      a7297a6a
  7. 14 April, 2016 1 commit
  8. 13 April, 2016 11 commits
  9. 06 April, 2016 1 commit
  10. 05 April, 2016 2 commits
    • K
      mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage · ea1754a0
      Kirill A. Shutemov committed
      Mostly direct substitution with occasional adjustment or removing
      outdated comments.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ea1754a0
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov committed
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
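The first two substitutions in the list above are safe because the shift amount is zero once both constants are equal. A small sketch, assuming a 4 KiB page (shift of 12) and using hypothetical `SKETCH_*` names rather than the kernel macros:

```c
/* Assumption for the sketch: 4 KiB pages, so the shift is 12. */
#define SKETCH_PAGE_SHIFT        12
/* The page cache unit was always defined to equal the page size. */
#define SKETCH_PAGE_CACHE_SHIFT  SKETCH_PAGE_SHIFT

/* Old style: convert a page cache index to a page index via a shift. */
static unsigned long old_cache_index_to_page_index(unsigned long index)
{
    return index << (SKETCH_PAGE_CACHE_SHIFT - SKETCH_PAGE_SHIFT);
}

/* New style: the shift by zero is dropped and the value is used as-is. */
static unsigned long new_cache_index_to_page_index(unsigned long index)
{
    return index;
}
```

Both functions return the same value for every input, which is why the conversion can be done mechanically.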
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  11. 26 March, 2016 3 commits
  12. 19 March, 2016 2 commits
  13. 18 March, 2016 1 commit
    • J
      mm: introduce page reference manipulation functions · fe896d18
      Joonsoo Kim committed
      The success of CMA allocation largely depends on the success of
      migration, and a key factor in that is the page reference count.  Until
      now, the page reference count has been manipulated by calling atomic
      functions directly, so we cannot track who manipulates it and where.
      That makes it hard to find the actual reason for a CMA allocation
      failure.  CMA allocation should be guaranteed to succeed, so finding the
      offending place is really important.
      
      In this patch, call sites where the page reference count is manipulated
      are converted to the newly introduced wrapper functions.  This is a
      preparation step for adding a tracepoint to each page reference
      manipulation function.  With this facility, we can easily find the
      reason for a CMA allocation failure.  There is no functional change in
      this patch.
      
      In addition, this patch also converts reference read sites.  This will
      help a second step that renames page._count to something else and
      prevents later attempts to access it directly (suggested by Andrew).
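The wrapper approach can be illustrated with C11 atomics in userspace. Everything here is a hypothetical model: `struct sketch_page`, the `sketch_page_ref_*` helpers, and the event counter that stands in for a future tracepoint are assumptions made for the example, not the kernel's definitions.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct page; only the counter is modelled. */
struct sketch_page {
    atomic_int refcount;   /* not meant to be touched outside the wrappers */
};

/* Stand-in for a tracepoint: counts every reference manipulation. */
static unsigned long sketch_ref_events;

static void sketch_trace_ref(struct sketch_page *page, int delta)
{
    (void)page;
    (void)delta;
    sketch_ref_events++;
}

/* Read side: callers use this instead of reading the field directly. */
static int sketch_page_ref_count(struct sketch_page *page)
{
    return atomic_load(&page->refcount);
}

/* Take a reference through the single, traceable entry point. */
static void sketch_page_ref_inc(struct sketch_page *page)
{
    atomic_fetch_add(&page->refcount, 1);
    sketch_trace_ref(page, 1);
}

/* Drop a reference; returns nonzero when the count reaches zero. */
static int sketch_page_ref_dec_and_test(struct sketch_page *page)
{
    int remaining = atomic_fetch_sub(&page->refcount, 1) - 1;

    sketch_trace_ref(page, -1);
    return remaining == 0;
}
```

Once every call site goes through helpers like these, a hook such as `sketch_trace_ref()` sees all modifications, and the field itself can later be renamed without touching the callers.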
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fe896d18
  14. 16 March, 2016 3 commits
  15. 14 March, 2016 1 commit
  16. 05 March, 2016 1 commit
    • A
      nbd: use correct div_s64 helper · 5e454c67
      Arnd Bergmann committed
      The do_div() macro now checks its arguments for the correct type,
      and refuses anything other than u64, so we get a warning about
      nbd_ioctl passing in an loff_t:
      
      drivers/block/nbd.c: In function '__nbd_ioctl':
      drivers/block/nbd.c:757:77: error: comparison of distinct pointer types lacks a cast [-Werror]
      
      This changes the nbd code to use div_s64() instead, which takes
      a signed argument.
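The difference can be modelled in userspace. The helper below mirrors the shape of the kernel's div_s64() (signed 64-bit dividend, signed 32-bit divisor, quotient returned), but `sketch_div_s64()` and `sketch_size_in_blocks()` are hypothetical names and the block-count use is an assumption for illustration.

```c
#include <stdint.h>

/*
 * Userspace stand-in for a signed 64-by-32 division helper: unlike
 * do_div(), it takes the dividend by value and accepts a signed type,
 * so a signed byte count such as loff_t can be passed without a cast.
 */
static int64_t sketch_div_s64(int64_t dividend, int32_t divisor)
{
    return dividend / divisor;
}

/* Hypothetical use: number of whole blocks in a device of bytesize bytes. */
static int64_t sketch_size_in_blocks(int64_t bytesize, int32_t blksize)
{
    return sketch_div_s64(bytesize, blksize);
}
```

For example, a 1 MiB device with 4096-byte blocks yields 256 blocks.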
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 37091fdd ("nbd: Create size change events for userspace")
      Signed-off-by: Jens Axboe <axboe@fb.com>
      5e454c67
  17. 04 March, 2016 4 commits