1. 26 Mar, 2019 4 commits
    • io_uring: offload write to async worker in case of -EAGAIN · 9bf7933f
      Roman Penyaev committed
      For a direct write, -EAGAIN is returned if the page cache was
      previously populated.  To avoid completing the request immediately
      with an -EAGAIN error, the write has to be offloaded to the async
      worker, as io_read() does.
      Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
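The offload described in this commit can be pictured with a small user-space sketch. It is not the io_uring code: `struct write_req`, `try_write_nowait()`, `submit_write()` and `async_worker_run()` are hypothetical names, and the "worker" is just a function the caller runs later. It only shows the control flow: a non-blocking attempt that returns -EAGAIN is deferred instead of being completed with the error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request; not the kernel's request structure. */
struct write_req {
    bool page_cache_populated; /* makes the non-blocking attempt fail */
    bool queued_async;         /* set when handed to the async worker */
    bool completed;            /* set once a result has been posted */
    int result;                /* valid only when completed is true */
};

/* Non-blocking attempt: returns -EAGAIN if it would have to block. */
static int try_write_nowait(const struct write_req *req, size_t len)
{
    if (req->page_cache_populated)
        return -EAGAIN;
    return (int)len;
}

/* Blocking path, run later from the (simulated) async worker. */
static void async_worker_run(struct write_req *req, size_t len)
{
    req->result = (int)len; /* a blocking write may wait, then succeed */
    req->completed = true;
}

/* Submission: complete inline on success, defer on -EAGAIN. */
static void submit_write(struct write_req *req, size_t len)
{
    int ret;

    assert(req != NULL);
    ret = try_write_nowait(req, len);
    if (ret == -EAGAIN) {
        req->queued_async = true; /* do not complete with the error */
        return;
    }
    req->result = ret;
    req->completed = true;
}
```

A caller would invoke `submit_write()` first and run `async_worker_run()` only for requests whose `queued_async` flag is set.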
    • sbitmap: order READ/WRITE freed instance and setting clear bit · e6d1fa58
      Ming Lei committed
      Inside sbitmap_queue_clear(), once the clear bit is set, it is
      visible to the allocation path immediately. Meanwhile, READ/WRITE on
      the old associated instance (such as the request in the case of
      blk-mq) may be reordered against setting the clear bit, so a race
      with re-allocation may be triggered.
      
      Add one memory barrier to order READ/WRITE of the freed associated
      instance before setting the clear bit, avoiding the race with
      re-allocation.
      
      This may fix the following kernel oops triggered by block/006 on aarch64:
      
      [  142.330954] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000330
      [  142.338794] Mem abort info:
      [  142.341554]   ESR = 0x96000005
      [  142.344632]   Exception class = DABT (current EL), IL = 32 bits
      [  142.350500]   SET = 0, FnV = 0
      [  142.353544]   EA = 0, S1PTW = 0
      [  142.356678] Data abort info:
      [  142.359528]   ISV = 0, ISS = 0x00000005
      [  142.363343]   CM = 0, WnR = 0
      [  142.366305] user pgtable: 64k pages, 48-bit VAs, pgdp = 000000002a3c51c0
      [  142.372983] [0000000000000330] pgd=0000000000000000, pud=0000000000000000
      [  142.379777] Internal error: Oops: 96000005 [#1] SMP
      [  142.384613] Modules linked in: null_blk ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp vfat fat rpcrdma sunrpc rdma_ucm ib_iser rdma_cm iw_cm libiscsi ib_umad scsi_transport_iscsi ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core sbsa_gwdt crct10dif_ce ghash_ce ipmi_ssif sha2_ce ipmi_devintf sha256_arm64 sg sha1_ce ipmi_msghandler ip_tables xfs libcrc32c mlx5_core sdhci_acpi mlxfw ahci_platform at803x sdhci libahci_platform qcom_emac mmc_core hdma hdma_mgmt i2c_dev [last unloaded: null_blk]
      [  142.429753] CPU: 7 PID: 1983 Comm: fio Not tainted 5.0.0.cki #2
      [  142.449458] pstate: 00400005 (nzcv daif +PAN -UAO)
      [  142.454239] pc : __blk_mq_free_request+0x4c/0xa8
      [  142.458830] lr : blk_mq_free_request+0xec/0x118
      [  142.463344] sp : ffff00003360f6a0
      [  142.466646] x29: ffff00003360f6a0 x28: ffff000010e70000
      [  142.471941] x27: ffff801729a50048 x26: 0000000000010000
      [  142.477232] x25: ffff00003360f954 x24: ffff7bdfff021440
      [  142.482529] x23: 0000000000000000 x22: 00000000ffffffff
      [  142.487830] x21: ffff801729810000 x20: 0000000000000000
      [  142.493123] x19: ffff801729a50000 x18: 0000000000000000
      [  142.498413] x17: 0000000000000000 x16: 0000000000000001
      [  142.503709] x15: 00000000000000ff x14: ffff7fe000000000
      [  142.509003] x13: ffff8017dcde09a0 x12: 0000000000000000
      [  142.514308] x11: 0000000000000001 x10: 0000000000000008
      [  142.519597] x9 : ffff8017dcde09a0 x8 : 0000000000002000
      [  142.524889] x7 : ffff8017dcde0a00 x6 : 000000015388f9be
      [  142.530187] x5 : 0000000000000001 x4 : 0000000000000000
      [  142.535478] x3 : 0000000000000000 x2 : 0000000000000000
      [  142.540777] x1 : 0000000000000001 x0 : ffff00001041b194
      [  142.546071] Process fio (pid: 1983, stack limit = 0x000000006460a0ea)
      [  142.552500] Call trace:
      [  142.554926]  __blk_mq_free_request+0x4c/0xa8
      [  142.559181]  blk_mq_free_request+0xec/0x118
      [  142.563352]  blk_mq_end_request+0xfc/0x120
      [  142.567444]  end_cmd+0x3c/0xa8 [null_blk]
      [  142.571434]  null_complete_rq+0x20/0x30 [null_blk]
      [  142.576194]  blk_mq_complete_request+0x108/0x148
      [  142.580797]  null_handle_cmd+0x1d4/0x718 [null_blk]
      [  142.585662]  null_queue_rq+0x60/0xa8 [null_blk]
      [  142.590171]  blk_mq_try_issue_directly+0x148/0x280
      [  142.594949]  blk_mq_try_issue_list_directly+0x9c/0x108
      [  142.600064]  blk_mq_sched_insert_requests+0xb0/0xd0
      [  142.604926]  blk_mq_flush_plug_list+0x16c/0x2a0
      [  142.609441]  blk_flush_plug_list+0xec/0x118
      [  142.613608]  blk_finish_plug+0x3c/0x4c
      [  142.617348]  blkdev_direct_IO+0x3b4/0x428
      [  142.621336]  generic_file_read_iter+0x84/0x180
      [  142.625761]  blkdev_read_iter+0x50/0x78
      [  142.629579]  aio_read.isra.6+0xf8/0x190
      [  142.633409]  __io_submit_one.isra.8+0x148/0x738
      [  142.637912]  io_submit_one.isra.9+0x88/0xb8
      [  142.642078]  __arm64_sys_io_submit+0xe0/0x238
      [  142.646428]  el0_svc_handler+0xa0/0x128
      [  142.650238]  el0_svc+0x8/0xc
      [  142.653104] Code: b9402a63 f9000a7f 3100047f 540000a0 (f9419a81)
      [  142.659202] ---[ end trace 467586bc175eb09d ]---
      
      Fixes: ea86ea2c ("sbitmap: ammortize cost of clearing bits")
      Reported-and-bisected_and_tested-by: Yi Zhang <yi.zhang@redhat.com>
      Cc: Yi Zhang <yi.zhang@redhat.com>
      Cc: "jianchao.wang" <jianchao.w.wang@oracle.com>
      Reviewed-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
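The ordering requirement in this commit can be illustrated in user space with C11 atomics. This is a sketch under assumptions, not the sbitmap code: `struct pool`, `pool_free()` and `pool_try_alloc()` are hypothetical, and a release fence stands in for the "one memory barrier" the message mentions. The point is that every access to the freed instance is ordered before the bit that makes the slot visible to allocators.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define POOL_SLOTS 32u

/* Hypothetical pool: payload[] stands in for the associated instances. */
struct pool {
    int payload[POOL_SLOTS];
    atomic_uint cleared; /* bit N set: slot N may be re-allocated */
};

static void pool_init(struct pool *p)
{
    for (unsigned int i = 0; i < POOL_SLOTS; i++)
        p->payload[i] = 0;
    atomic_init(&p->cleared, 0u);
}

/* Free path: finish touching the instance, then publish the clear bit. */
static void pool_free(struct pool *p, unsigned int bit)
{
    assert(bit < POOL_SLOTS);
    p->payload[bit] = 0;                       /* last access to old instance */
    atomic_thread_fence(memory_order_release); /* order it before the bit set */
    atomic_fetch_or_explicit(&p->cleared, 1u << bit, memory_order_relaxed);
}

/* Allocation path: claim the bit; acquire pairs with the release fence. */
static bool pool_try_alloc(struct pool *p, unsigned int bit)
{
    unsigned int mask = 1u << bit;
    unsigned int old;

    assert(bit < POOL_SLOTS);
    old = atomic_fetch_and_explicit(&p->cleared, ~mask, memory_order_acquire);
    return (old & mask) != 0;
}
```

Without the fence, the store to `payload` could become visible after the bit is set, so an allocator that already reused the slot might see its fresh data overwritten.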
    • blk-mq: fix sbitmap ws_active for shared tags · e8618575
      Jens Axboe committed
      We now wrap sbitmap waitqueues in an active counter, so we can avoid
      iterating wakeups unless we have waiters there. This works as long as
      everyone that's manipulating the waitqueues uses the proper helpers. For
      the tag wait case for shared tags, however, we add ourselves to the
      waitqueue without incrementing/decrementing the ->ws_active count. This
      means that wakeups can take a long time to happen.
      
      Fix this by manually doing the inc/dec as needed for the wait queue
      handling.
      Reported-by: Michael Leun <kbug@newton.leun.net>
      Tested-by: Michael Leun <kbug@newton.leun.net>
      Cc: stable@vger.kernel.org
      Reviewed-by: Omar Sandoval <osandov@fb.com>
      Fixes: 5d2ee712 ("sbitmap: optimize wakeup check")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
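A single-threaded sketch may help show why a waiter that bypasses the helpers is missed. Everything here is hypothetical (`struct wait_set`, `ws_add_waiter()`, `ws_wake_all()`); only the `ws_active` name comes from the commit message. The wake path skips its scan when the counter is zero, so a waiter that queued itself without incrementing the counter is not woken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_WAITERS 8

/* Hypothetical wait set; waiting[] stands in for wait queue entries. */
struct wait_set {
    atomic_int ws_active;      /* number of accounted waiters */
    bool waiting[MAX_WAITERS];
};

static void ws_init(struct wait_set *ws)
{
    atomic_init(&ws->ws_active, 0);
    for (int i = 0; i < MAX_WAITERS; i++)
        ws->waiting[i] = false;
}

/* Proper helper: queue the waiter and account for it. */
static void ws_add_waiter(struct wait_set *ws, int id)
{
    assert(id >= 0 && id < MAX_WAITERS);
    ws->waiting[id] = true;
    atomic_fetch_add(&ws->ws_active, 1);
}

/* Proper helper: dequeue the waiter and drop its accounting. */
static void ws_del_waiter(struct wait_set *ws, int id)
{
    assert(id >= 0 && id < MAX_WAITERS);
    ws->waiting[id] = false;
    atomic_fetch_sub(&ws->ws_active, 1);
}

/* Wake path: returns how many waiters were woken. */
static int ws_wake_all(struct wait_set *ws)
{
    int woken = 0;

    if (atomic_load(&ws->ws_active) == 0)
        return 0; /* fast path: nobody is accounted for, skip the scan */
    for (int i = 0; i < MAX_WAITERS; i++) {
        if (ws->waiting[i]) {
            ws_del_waiter(ws, i);
            woken++;
        }
    }
    return woken;
}
```

Setting `waiting[id]` directly, without `ws_add_waiter()`, reproduces the situation the commit describes: the wake path returns early and the waiter stays queued.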
    • io_uring: fix big-endian compat signal mask handling · 9e75ad5d
      Arnd Bergmann committed
      On big-endian architectures, the signal masks are different
      between 32-bit and 64-bit tasks, so we have to use a different
      function for reading them from user space.
      
      io_cqring_wait() initially got this wrong, and always interprets
      this as a native structure. This is ok on x86 and most arm64,
      but not on s390, ppc64be, mips64be, sparc64 and parisc.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
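The layout difference can be sketched with plain integer arithmetic. The sketch assumes a 32-bit task stores its mask as 32-bit words and a 64-bit kernel uses 64-bit words; `compat_to_native()` and `misread_as_native_be()` are hypothetical helpers, not kernel functions. The second function models what a big-endian 64-bit reader sees when it treats the same bytes as a native word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Explicit conversion: word 0 holds signals 1..32, word 1 holds 33..64. */
static uint64_t compat_to_native(const uint32_t words[2])
{
    assert(words != NULL);
    return (uint64_t)words[0] | ((uint64_t)words[1] << 32);
}

/*
 * Model of the bug on a big-endian machine: the first 32-bit word in
 * memory lands in the HIGH half of a 64-bit load, so the halves swap.
 */
static uint64_t misread_as_native_be(const uint32_t words[2])
{
    assert(words != NULL);
    return ((uint64_t)words[0] << 32) | (uint64_t)words[1];
}
```

On a little-endian machine the naive 64-bit load happens to give the same value as `compat_to_native()`, which matches the commit's remark that the original code was fine on x86 and most arm64.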
  2. 25 Mar, 2019 2 commits
  3. 24 Mar, 2019 3 commits
    • Merge tag 'io_uring-20190323' of git://git.kernel.dk/linux-block · 1bdd3dbf
      Linus Torvalds committed
      Pull io_uring fixes and improvements from Jens Axboe:
       "The first five in this series are heavily inspired by the work Al did
        on the aio side to fix the races there.
      
        The last two re-introduce a feature that was in io_uring before it got
        merged, but which I pulled since we didn't have a good way to have
        BVEC iters that already have a stable reference. These aren't
        necessarily related to block, it's just how io_uring pins fixed
        buffers"
      
      * tag 'io_uring-20190323' of git://git.kernel.dk/linux-block:
        block: add BIO_NO_PAGE_REF flag
        iov_iter: add ITER_BVEC_FLAG_NO_REF flag
        io_uring: mark me as the maintainer
        io_uring: retry bulk slab allocs as single allocs
        io_uring: fix poll races
        io_uring: fix fget/fput handling
        io_uring: add prepped flag
        io_uring: make io_read/write return an integer
        io_uring: use regular request ref counts
    • Merge tag 'for-linus-20190323' of git://git.kernel.dk/linux-block · 2335cbe6
      Linus Torvalds committed
      Pull block fixes from Jens Axboe:
       "A set of fixes/changes that should go into this series. This contains:
      
         - Kernel doc / comment updates (Bart, Shenghui)
      
         - Un-export of core-only used function (Bart)
      
         - Fix race on loop file access (Dongli)
      
         - pf/pcd queue cleanup fixes (me)
      
         - Use appropriate helper for RESTART bit set (Yufen)
      
         - Use named identifier for classic poll (Yufen)"
      
      * tag 'for-linus-20190323' of git://git.kernel.dk/linux-block:
        sbitmap: trivial - update comment for sbitmap_deferred_clear_bit
        blkcg: Fix kernel-doc warnings
        blk-iolatency: #include "blk.h"
        block: Unexport blk_mq_add_to_requeue_list()
        block: add BLK_MQ_POLL_CLASSIC for hybrid poll and return EINVAL for unexpected value
        blk-mq: remove unused 'nr_expired' from blk_mq_hw_ctx
        loop: access lo_backing_file only when the loop device is Lo_bound
        blk-mq: use blk_mq_sched_mark_restart_hctx to set RESTART
        paride/pcd: cleanup queues when detection fails
        paride/pf: cleanup queues when detection fails
    • Merge tag 'ceph-for-5.1-rc2' of git://github.com/ceph/ceph-client · 9a1050ad
      Linus Torvalds committed
      Pull ceph fixes from Ilya Dryomov:
       "A follow up for the new alloc_size logic and a blacklisting fix,
        marked for stable"
      
      * tag 'ceph-for-5.1-rc2' of git://github.com/ceph/ceph-client:
        rbd: drop wait_for_latest_osdmap()
        libceph: wait for latest osdmap in ceph_monc_blacklist_add()
        rbd: set io_min, io_opt and discard_granularity to alloc_size
  4. 23 Mar, 2019 8 commits
  5. 22 Mar, 2019 18 commits
  6. 21 Mar, 2019 5 commits
    • mmc: renesas_sdhi: limit block count to 16 bit for old revisions · c9a9497c
      Wolfram Sang committed
      R-Car Gen2 has two different SDHI incarnations in the same chip. The
      older one does not support the recently introduced 32 bit register
      access to the block count register. Make sure we use this feature only
      after the first known version.
      
      Thanks to the Renesas Testing team for this bug report!
      
      Fixes: 5603731a ("mmc: tmio: fix access width of Block Count Register")
      Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
      Tested-by: Phong Hoang <phong.hoang.wz@renesas.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    • mmc: alcor: fix DMA reads · 5ea47691
      Daniel Drake committed
      Setting max_blk_count to 1 here was causing the mmc block layer
      to always use the MMC_READ_SINGLE_BLOCK command here, which the
      driver does not DMA-accelerate.
      
      Drop the max_blk_ settings here. The mmc host defaults suffice,
      along with the max_segs and max_seg_size settings, which I have
      now documented in more detail.
      
      Now each MMC command reads 4 512-byte blocks, using DMA instead of
      PIO. On my SD card, this increases read performance (measured with dd)
      from 167kb/sec to 4.6mb/sec.
      
      Link: http://lkml.kernel.org/r/CAD8Lp47L5T3jnAjBiPs1cQ+yFA3L6LJtgFvMETnBrY63-Zdi2g@mail.gmail.com
      Signed-off-by: Daniel Drake <drake@endlessm.com>
      Reviewed-by: Oleksij Rempel <linux@rempel-privat.de>
      Fixes: c5413ad8 ("mmc: add new Alcor Micro Cardreader SD/MMC driver")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    • mmc: sdhci-omap: Set caps2 to indicate no physical write protect pin · 031d2ccc
      Kishon Vijay Abraham I committed
      After commit 6d5cd068 ("mmc: sdhci: use WP GPIO in
      sdhci_check_ro()") and commit 39ee32ce ("mmc: sdhci-omap: drop
      ->get_ro() implementation"), sdhci-omap relied on SDHCI_PRESENT_STATE
      to check if the card is read-only, if wp-gpios is not populated
      in device tree. However SDHCI_PRESENT_STATE in sdhci-omap does not have
      correct read-only state.
      
      sdhci-omap can be used by platforms with both a micro SD slot and a
      standard SD slot with a physical write-protect pin (using GPIO). Set
      MMC_CAP2_NO_WRITE_PROTECT in caps2 depending on whether the wp-gpios
      property is populated.
      
      This fix is required because existing device-tree nodes don't have the
      "disable-wp" property, and it preserves old-dt compatibility.
      
      Fixes: 6d5cd068 ("mmc: sdhci: use WP GPIO in sdhci_check_ro()")
      Fixes: 39ee32ce ("mmc: sdhci-omap: drop ->get_ro() implementation")
      Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    • powerpc/security: Fix spectre_v2 reporting · 92edf8df
      Michael Ellerman committed
      When I updated the spectre_v2 reporting to handle software count cache
      flush I got the logic wrong when there's no software count cache
      enabled at all.
      
      The result is that on systems with the software count cache flush
      disabled we print:
      
        Mitigation: Indirect branch cache disabled, Software count cache flush
      
      Which correctly indicates that the count cache is disabled, but
      incorrectly says the software count cache flush is enabled.
      
      The root of the problem is that we are trying to handle all
      combinations of options. But we know now that we only expect to see
      the software count cache flush enabled if the other options are false.
      
      So split the two cases, which simplifies the logic and fixes the bug.
      We were also missing a space before "(hardware accelerated)".
      
      The result is we see one of:
      
        Mitigation: Indirect branch serialisation (kernel only)
        Mitigation: Indirect branch cache disabled
        Mitigation: Software count cache flush
        Mitigation: Software count cache flush (hardware accelerated)
      
      Fixes: ee13cb24 ("powerpc/64s: Add support for software count cache flush")
      Cc: stable@vger.kernel.org # v4.19+
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Michael Neuling <mikey@neuling.org>
      Reviewed-by: Diana Craciun <diana.craciun@nxp.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
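The "split the two cases" idea can be sketched as a small function that returns one of the four strings listed in this commit message. This is an illustration, not the powerpc code: the parameter names, the `enum flush_type` values and `report_is_mitigated()` are assumptions. The message does not say how the two hardware options combine or what is printed when nothing is enabled, so the sketch checks the hardware options in order and returns an empty string in the last case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flush states; names are not taken from the kernel. */
enum flush_type { FLUSH_NONE, FLUSH_SW, FLUSH_HW };

/*
 * Hardware options are handled first; the software flush text is only
 * reachable when both of them are false, which is the split described.
 */
static const char *spectre_v2_report(bool branch_serialised,
                                     bool count_cache_disabled,
                                     enum flush_type flush)
{
    if (branch_serialised)
        return "Mitigation: Indirect branch serialisation (kernel only)";
    if (count_cache_disabled)
        return "Mitigation: Indirect branch cache disabled";

    switch (flush) {
    case FLUSH_SW:
        return "Mitigation: Software count cache flush";
    case FLUSH_HW:
        return "Mitigation: Software count cache flush (hardware accelerated)";
    default:
        return ""; /* case not covered by the commit message */
    }
}

/* Convenience check used by callers that only need a yes/no answer. */
static bool report_is_mitigated(const char *report)
{
    assert(report != NULL);
    return strncmp(report, "Mitigation:", strlen("Mitigation:")) == 0;
}
```

With the count cache disabled and no software flush, the function returns only the "Indirect branch cache disabled" line, which is the case the original logic got wrong.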
    • mmc: mxcmmc: "Revert mmc: mxcmmc: handle highmem pages" · 2b77158f
      Alexander Shiyan committed
      This reverts commit b189e758.
      
      Unable to handle kernel paging request at virtual address c8358000
      pgd = efa405c3
      [c8358000] *pgd=00000000
      Internal error: Oops: 805 [#1] PREEMPT ARM
      CPU: 0 PID: 711 Comm: kworker/0:2 Not tainted 4.20.0+ #30
      Hardware name: Freescale i.MX27 (Device Tree Support)
      Workqueue: events mxcmci_datawork
      PC is at mxcmci_datawork+0xbc/0x2ac
      LR is at mxcmci_datawork+0xac/0x2ac
      pc : [<c04e33c8>]    lr : [<c04e33b8>]    psr: 60000013
      sp : c6c93f08  ip : 24004180  fp : 00000008
      r10: c8358000  r9 : c78b3e24  r8 : c6c92000
      r7 : 00000000  r6 : c7bb8680  r5 : c7bb86d4  r4 : c78b3de0
      r3 : 00002502  r2 : c090b2e0  r1 : 00000880  r0 : 00000000
      Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
      Control: 0005317f  Table: a68a8000  DAC: 00000055
      Process kworker/0:2 (pid: 711, stack limit = 0x389543bc)
      Stack: (0xc6c93f08 to 0xc6c94000)
      3f00:                   c7bb86d4 00000000 00000000 c6cbfde0 c7bb86d4 c7ee4200
      3f20: 00000000 c0907ea8 00000000 c7bb86d8 c0907ea8 c012077c c6cbfde0 c7bb86d4
      3f40: c6cbfde0 c6c92000 c6cbfdf4 c09280ba c0907ea8 c090b2e0 c0907ebc c0120c18
      3f60: c6cbfde0 00000000 00000000 c6cbb580 c7ba7c40 c7837edc c6cbb598 00000000
      3f80: c6cbfde0 c01208f8 00000000 c01254fc c7ba7c40 c0125400 00000000 00000000
      3fa0: 00000000 00000000 00000000 c01010d0 00000000 00000000 00000000 00000000
      3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
      [<c04e33c8>] (mxcmci_datawork) from [<c012077c>] (process_one_work+0x1f0/0x338)
      [<c012077c>] (process_one_work) from [<c0120c18>] (worker_thread+0x320/0x474)
      [<c0120c18>] (worker_thread) from [<c01254fc>] (kthread+0xfc/0x118)
      [<c01254fc>] (kthread) from [<c01010d0>] (ret_from_fork+0x14/0x24)
      Exception stack(0xc6c93fb0 to 0xc6c93ff8)
      3fa0:                                     00000000 00000000 00000000 00000000
      3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
      Code: e3500000 1a000059 e5153050 e5933038 (e48a3004)
      ---[ end trace 54ca629b75f0e737 ]---
      note: kworker/0:2[711] exited with preempt_count 1
      Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
      Fixes: b189e758 ("mmc: mxcmmc: handle highmem pages")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>