1. 06 Jan 2018, 3 commits
    • block: introduce zoned block devices zone write locking · 6cc77e9c
      Committed by Christoph Hellwig
      Components relying only on the request_queue structure for accessing
      block devices (e.g. I/O schedulers) have limited knowledge of the
      device characteristics. In particular, the device capacity cannot be
      easily discovered, which for a zoned block device also results in the
      inability to easily know the number of zones of the device (the zone
      size is indicated by the chunk_sectors field of the queue limits).
      
      Introduce the nr_zones field to the request_queue structure to simplify
      access to this information. Also, add the bitmap seq_zones_bitmap which
      indicates which zones of the device are sequential zones (write
      preferred or write required) and the bitmap seq_zones_wlock which
      indicates if a zone is write locked, that is, if a write request
      targeting a zone was dispatched to the device. These fields are
      initialized by the low level block device driver (sd.c for ZBC/ZAC
      disks). They are not initialized by stacking drivers (device mappers)
      handling zoned block devices (e.g. dm-linear).
      
      Using this, I/O schedulers can implement zone write locking to control
      request dispatching to a zoned block device and avoid write request
      reordering, by allowing at most a single write request per zone to be
      outside the scheduler at any time.
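      
      For illustration, here is a minimal C sketch of how an I/O scheduler
      could consume these fields. The helper names are illustrative, not the
      exact API added by this patch, and the code relies on the power-of-two
      zone size of zoned block devices:
      
      #include <linux/blkdev.h>
      #include <linux/log2.h>
      
      /* Zone number of a request: chunk_sectors (the zone size) is a power of two. */
      static inline unsigned int rq_zone_no(struct request *rq)
      {
      	return blk_rq_pos(rq) >> ilog2(rq->q->limits.chunk_sectors);
      }
      
      /* Only writes targeting a sequential zone need write locking. */
      static inline bool rq_needs_zone_wlock(struct request *rq)
      {
      	return rq->q->seq_zones_wlock &&
      	       req_op(rq) == REQ_OP_WRITE &&
      	       test_bit(rq_zone_no(rq), rq->q->seq_zones_bitmap);
      }
      
      /* Try to take the zone write lock; true if this request now owns it. */
      static inline bool rq_zone_wlock(struct request *rq)
      {
      	return !test_and_set_bit(rq_zone_no(rq), rq->q->seq_zones_wlock);
      }
      
      /* Release the zone write lock when the write completes. */
      static inline void rq_zone_wunlock(struct request *rq)
      {
      	clear_bit(rq_zone_no(rq), rq->q->seq_zones_wlock);
      }
      
      A dispatch path would then skip a write for which rq_zone_wlock() fails
      and retry it once the in-flight write to that zone completes.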
      
      Based on previous patches from Damien Le Moal.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      [Damien]
      * Fixed comments and indentation in blkdev.h
      * Changed helper functions
      * Fixed this commit message
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • pktcdvd: Fix a recently introduced NULL pointer dereference · 882d4171
      Committed by Bart Van Assche
      Call bdev_get_queue(bdev) after bdev->bd_disk has been initialized
      instead of just before that pointer is initialized. This patch
      prevents the following command
      
      pktsetup 1 /dev/sr0
      
      from triggering the kernel crash below:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000548
      IP: pkt_setup_dev+0x2db/0x670 [pktcdvd]
      CPU: 2 PID: 724 Comm: pktsetup Not tainted 4.15.0-rc4-dbg+ #1
      Call Trace:
       pkt_ctl_ioctl+0xce/0x1c0 [pktcdvd]
       do_vfs_ioctl+0x8e/0x670
       SyS_ioctl+0x3c/0x70
       entry_SYSCALL_64_fastpath+0x23/0x9a
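      
      A simplified sketch of the reordering (condensed from the pktcdvd
      setup path): blkdev_get() is what populates bdev->bd_disk, so the
      queue lookup must come after it succeeds:
      
      /* Before the fix: bdev->bd_disk is still NULL here, so
       * bdev_get_queue() dereferences a NULL pointer. */
      q = bdev_get_queue(bdev);
      ret = blkdev_get(bdev, FMODE_READ | FMODE_NDELAY, NULL);
      
      /* After the fix: open the device first, then look up its queue. */
      ret = blkdev_get(bdev, FMODE_READ | FMODE_NDELAY, NULL);
      if (ret)
      	return ret;
      q = bdev_get_queue(bdev);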
      Reported-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Fixes: commit ca18d6f7 ("block: Make most scsi_req_init() calls implicit")
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Cc: <stable@vger.kernel.org> # v4.13
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • pktcdvd: Fix pkt_setup_dev() error path · 5a0ec388
      Committed by Bart Van Assche
      Commit 523e1d39 ("block: make gendisk hold a reference to its queue")
      modified add_disk() and disk_release() but did not update any of the
      error paths that trigger a put_disk() call after disk->queue has been
      assigned. That introduced the following behavior in the pktcdvd driver
      if pkt_new_dev() fails:
      
      Kernel BUG at 00000000e98fd882 [verbose debug info unavailable]
      
      Since disk_release() calls blk_put_queue() anyway if disk->queue != NULL,
      fix this by removing the blk_cleanup_queue() call from the pkt_setup_dev()
      error path.
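      
      A condensed sketch of the double release (labels simplified from the
      pkt_setup_dev() error path): blk_cleanup_queue() drops the queue
      reference, and disk_release() drops it again via blk_put_queue() when
      put_disk() runs:
      
      /* Before the fix: the queue reference is dropped twice. */
      out_new_dev:
      	blk_cleanup_queue(disk->queue);	/* puts the queue... */
      out_mem2:
      	put_disk(disk);			/* ...and disk_release() puts it again */
      
      /* After the fix: put_disk() alone is enough, since disk_release()
       * calls blk_put_queue() whenever disk->queue != NULL. */
      out_new_dev:
      out_mem2:
      	put_disk(disk);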
      
      Fixes: commit 523e1d39 ("block: make gendisk hold a reference to its queue")
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Cc: <stable@vger.kernel.org> # v3.2
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 05 Jan 2018, 26 commits
  3. 23 Dec 2017, 1 commit
    • blk-mq: improve heavily contended tag case · 4e5dff41
      Committed by Jens Axboe
      Even with a number of waitqueues, we can get into a situation where we
      are heavily contended on the waitqueue lock. I got a report on spc1
      where we're spending seconds doing this. Arguably the use case is nasty;
      I reproduced it with one device and 1000 threads banging on the device.
      But that doesn't mean we shouldn't be handling it better.
      
      What ends up happening is that a thread will fail to get a tag, add
      itself to the waitqueue, and subsequently get woken up when a tag is
      freed - only to find itself going back to sleep on the waitqueue.
      
      Instead of waking all threads, use an exclusive wait and wake up our
      sbitmap batch count instead. This seems to work well for me (massive
      improvement for this use case), and it survives basic testing. But I
      haven't fully verified it yet.
      
      An additional improvement is running the queue and checking for a new
      tag BEFORE needing to add ourselves to the waitqueue.
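      
      A hedged sketch of the two changes (condensed; __blk_mq_get_tag()
      stands in for the internal tag allocator and ws for the sbitmap wait
      state): waiters queue exclusively and retry before sleeping, and the
      free side wakes only a batch:
      
      /* Allocation slow path: exclusive wait plus a retry before sleeping. */
      do {
      	prepare_to_wait_exclusive(&ws->wait, &wait, TASK_UNINTERRUPTIBLE);
      
      	/* Run the queue and look for a tag BEFORE sleeping: one may
      	 * have been freed while we were adding ourselves here. */
      	blk_mq_run_hw_queue(hctx, false);
      	tag = __blk_mq_get_tag(data, bt);
      	if (tag != -1)
      		break;
      
      	io_schedule();
      } while (1);
      finish_wait(&ws->wait, &wait);
      
      /* Free side (lib/sbitmap.c): wake one batch of exclusive waiters
       * instead of the whole waitqueue. */
      wake_up_nr(&ws->wait, wake_batch);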
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 18 Dec 2017, 9 commits
  5. 17 Dec 2017, 1 commit
    • x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow · 2aeb0736
      Committed by Andrey Ryabinin
      [ Note, this is a Git cherry-pick of the following commit:
      
          d17a1d97: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      The KASAN shadow is currently mapped using vmemmap_populate() since that
      provides a semi-convenient way to map pages into init_top_pgt. However,
      since vmemmap_populate() no longer zeroes the mapped pages, it is not
      suitable for KASAN, which requires zeroed shadow memory.
      
      Add a kasan_populate_shadow() interface and use it instead of
      vmemmap_populate(). In addition, this allows us to take advantage of
      gigantic pages and use them to populate the shadow, which should save
      some memory otherwise wasted on page tables and reduce TLB pressure.
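      
      A condensed sketch of the idea, assuming hypothetical map_huge() and
      map_4k() helpers (the real patch walks the pgd/p4d/pud/pmd levels
      explicitly): allocate pre-zeroed memory from memblock and install the
      largest mapping the range permits:
      
      static void __init populate_shadow_range(unsigned long start, unsigned long end)
      {
      	unsigned long addr = start;
      
      	while (addr < end) {
      		void *p;
      
      		if (IS_ALIGNED(addr, PMD_SIZE) && end - addr >= PMD_SIZE) {
      			/* One pre-zeroed 2M page: no PTE table needed,
      			 * fewer TLB entries covering the shadow. */
      			p = memblock_virt_alloc(PMD_SIZE, PMD_SIZE);
      			map_huge(addr, __pa(p));	/* hypothetical helper */
      			addr += PMD_SIZE;
      		} else {
      			/* Fall back to a single zeroed 4K page. */
      			p = memblock_virt_alloc(PAGE_SIZE, PAGE_SIZE);
      			map_4k(addr, __pa(p));		/* hypothetical helper */
      			addr += PAGE_SIZE;
      		}
      	}
      }
      
      memblock_virt_alloc() returns zeroed memory, which is exactly the
      property vmemmap_populate() lost; the same pattern extends to 1G pages
      where PUD_SIZE alignment allows.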
      
      Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Bob Picco <bob.picco@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>