提交 · efea2abcb03215f2efadfe994ff7f652aaff196b · openanolis / cloud-kernel

27 10月, 2017 1 次提交

virtio_blk: Fix an SG_IO regression · efea2abc

由 Bart Van Assche 提交于 10月 27, 2017

Avoid that submitting an SG_IO ioctl triggers a kernel oops that
is preceded by:

usercopy: kernel memory overwrite attempt detected to (null) (<null>) (6 bytes)
kernel BUG at mm/usercopy.c:72!
Reported-by: NDann Frazier <dann.frazier@canonical.com>
Fixes: commit ca18d6f7 ("block: Make most scsi_req_init() calls implicit")
Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Dann Frazier <dann.frazier@canonical.com>
Cc: <stable@vger.kernel.org> # v4.13
Reviewed-by: NChristoph Hellwig <hch@lst.de>

Moved virtblk_initialize_rq() inside CONFIG_VIRTIO_BLK_SCSI.
Signed-off-by: NJens Axboe <axboe@kernel.dk>

efea2abc

25 10月, 2017 1 次提交

nbd: handle interrupted sendmsg with a sndtimeo set · 32e67a3a

由 Josef Bacik 提交于 10月 24, 2017

If you do not set sk_sndtimeo you will get -ERESTARTSYS if there is a
pending signal when you enter sendmsg, which we handle properly.
However if you set a timeout for your commands we'll set sk_sndtimeo to
that timeout, which means that sendmsg will start returning -EINTR
instead of -ERESTARTSYS.  Fix this by checking either cases and doing
the correct thing.

Cc: stable@vger.kernel.org
Fixes: dc88e34d ("nbd: set sk->sk_sndtimeo for our sockets")
Reported-and-tested-by: NDaniel Xu <dlxu@fb.com>
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

32e67a3a

10 10月, 2017 1 次提交

nbd: don't set the device size until we're connected · 639812a1

由 Josef Bacik 提交于 10月 09, 2017

A user reported a regression with using the normal ioctl interface on
newer kernels.  This happens because I was setting the device size
before the device was actually connected, which caused us to error out
and close everything down.  This didn't happen on netlink because we
hold the device lock the whole time we're setting things up, but we
don't do that for the ioctl path.  This fixes the problem.

Cc: stable@vger.kernel.org
Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

639812a1

09 10月, 2017 1 次提交

skd: Use kmem_cache_free · 09aa97c7

由 Himanshu Jha 提交于 10月 09, 2017

Use kmem_cache_free instead of kfree for freeing the memory previously
allocated with kmem_cache_zalloc/kmem_cache_alloc/kmem_cache_node.
Signed-off-by: NHimanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

09aa97c7

04 10月, 2017 2 次提交

zram: fix null dereference of handle · ae94264e

由 Minchan Kim 提交于 10月 03, 2017

In testing I found handle passed to zs_map_object in __zram_bvec_read is
NULL so eh kernel goes oops in pin_object().

The reason is there is no routine to check the slot's freeing after
getting the slot's lock.  This patch fixes it.

[minchan@kernel.org: v2]
  Link: http://lkml.kernel.org/r/1505887347-10881-1-git-send-email-minchan@kernel.org
Link: http://lkml.kernel.org/r/1505788488-26723-1-git-send-email-minchan@kernel.org
Fixes: 1f7319c7 ("zram: partial IO refactoring")
Signed-off-by: NMinchan Kim <minchan@kernel.org>
Reviewed-by: NSergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ae94264e

null_blk: change configfs dependency to select · 6cd1a6fe

由 Jens Axboe 提交于 10月 03, 2017

A recent commit made null_blk depend on configfs, which is kind of
annoying since you now have to find this dependency and enable that
as well. Discovered this since I no longer had null_blk available
on a box I needed to debug, since it got killed when the config
updated after the configfs change was merged.

Fixes: 3bf2bd20 ("nullb: add configfs interface")
Reviewed-by: NShaohua Li <shli@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6cd1a6fe

03 10月, 2017 1 次提交

nbd: fix -ERESTARTSYS handling · 6e60a3bb

由 Josef Bacik 提交于 10月 02, 2017

Christoph made it so that if we return'ed BLK_STS_RESOURCE whenever we
got ERESTARTSYS from sending our packets we'd return BLK_STS_OK, which
means we'd never requeue and just hang.  We really need to return the
right value from the upper layer.

Fixes: fc17b653 ("blk-mq: switch ->queue_rq return value to blk_status_t")
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6e60a3bb

25 9月, 2017 3 次提交

loop: remove union of use_aio and ref in struct loop_cmd · e5313c14

由 Omar Sandoval 提交于 9月 20, 2017

When the request is completed, lo_complete_rq() checks cmd->use_aio.
However, if this is in fact an aio request, cmd->use_aio will have
already been reused as cmd->ref by lo_rw_aio*. Fix it by not using a
union. On x86_64, there's a hole after the union anyways, so this
doesn't make struct loop_cmd any bigger.

Fixes: 92d77332 ("block/loop: fix use after free")
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

e5313c14

nbd: ignore non-nbd ioctl's · 1dae69be

由 Josef Bacik 提交于 5月 05, 2017

In testing we noticed that nbd would spew if you ran a fio job against
the raw device itself.  This is because fio calls a block device
specific ioctl, however the block layer will first pass this back to the
driver ioctl handler in case the driver wants to do something special.
Since the device was setup using netlink this caused us to spew every
time fio called this ioctl.  Since we don't have special handling, just
error out for any non-nbd specific ioctl's that come in.  This fixes the
spew.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

1dae69be

brd: fix overflow in __brd_direct_access · 02a48436

由 Mikulas Patocka 提交于 9月 13, 2017

The code in __brd_direct_access multiplies the pgoff variable by page size
and divides it by 512. It can cause overflow on 32-bit architectures. The
overflow happens if we create ramdisk larger than 4G and use it as a
sparse device.

This patch replaces multiplication and division with multiplication by the
number of sectors per page.
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Fixes: 1647b9b9 ("brd: add dax_operations support")
Cc: stable@vger.kernel.org	# 4.12+
Signed-off-by: NJens Axboe <axboe@kernel.dk>

02a48436

09 9月, 2017 1 次提交

drivers/block/zram/zram_drv.c: convert to using memset_l · 48ad1abe

由 Matthew Wilcox 提交于 9月 08, 2017

zram was the motivation for creating memset_l().  Minchan Kim sees a 7%
performance improvement on x86 with 100MB of non-zero deduplicatable
data:

        perf stat -r 10 dd if=/dev/zram0 of=/dev/null

vanilla:        0.232050465 seconds time elapsed ( +-  0.51% )
memset_l:	0.217219387 seconds time elapsed ( +-  0.07% )

Link: http://lkml.kernel.org/r/20170720184539.31609-7-willy@infradead.orgSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
Tested-by: NMinchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

48ad1abe

07 9月, 2017 12 次提交

block, THP: make block_device_operations.rw_page support THP · 98cc093c

由 Huang Ying 提交于 9月 06, 2017

The .rw_page in struct block_device_operations is used by the swap
subsystem to read/write the page contents from/into the corresponding
swap slot in the swap device.  To support the THP (Transparent Huge
Page) swap optimization, the .rw_page is enhanced to support to
read/write THP if possible.

Link: http://lkml.kernel.org/r/20170724051840.2309-6-ying.huang@intel.comSigned-off-by: N"Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@intel.com> [for brd.c, zram_drv.c, pmem.c]
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal L Verma <vishal.l.verma@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

98cc093c

zram: add config and doc file for writeback feature · 5a47074f

由 Minchan Kim 提交于 9月 06, 2017

This patch adds document and kconfig for using of writeback feature.

Link: http://lkml.kernel.org/r/1498459987-24562-10-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5a47074f

zram: read page from backing device · 8e654f8f

由 Minchan Kim 提交于 9月 06, 2017

This patch enables read IO from backing device.  For the feature, it
implements two IO read functions to transfer data from backing storage.

One is asynchronous IO function and other is synchronous one.

A reason I need synchrnous IO is due to partial write which need to
complete read IO before the overwriting partial data.

We can make the partial IO's case asynchronous, too but at the moment, I
don't feel adding more complexity to support such rare use cases so want
to go with simple.

[xieyisheng1@huawei.com: read_from_bdev_async(): return 1 to avoid call page_endio() in zram_rw_page()]
  Link: http://lkml.kernel.org/r/1502707447-6944-1-git-send-email-xieyisheng1@huawei.com
Link: http://lkml.kernel.org/r/1498459987-24562-9-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Signed-off-by: NYisheng Xie <xieyisheng1@huawei.com>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8e654f8f

zram: write incompressible pages to backing device · db8ffbd4

由 Minchan Kim 提交于 9月 06, 2017

This patch enables write IO to transfer data to backing device.  For
that, it implements write_to_bdev function which creates new bio and
chaining with parent bio to make the parent bio asynchrnous.

For rw_page which don't have parent bio, it submit owned bio and handle
IO completion by zram_page_end_io.

Also, this patch defines new flag ZRAM_WB to mark written page for later
read IO.

[xieyisheng1@huawei.com: fix typo in comment]
  Link: http://lkml.kernel.org/r/1502707447-6944-2-git-send-email-xieyisheng1@huawei.com
Link: http://lkml.kernel.org/r/1498459987-24562-8-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Signed-off-by: NYisheng Xie <xieyisheng1@huawei.com>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

db8ffbd4

zram: identify asynchronous IO's return value · ae85a807

由 Minchan Kim 提交于 9月 06, 2017

For upcoming asynchronous IO like writeback, zram_rw_page should be
aware of that whether requested IO was completed or submitted
successfully, otherwise error.

For the goal, zram_bvec_rw has three return values.

-errno: returns error number
     0: IO request is done synchronously
     1: IO request is issued successfully.

Link: http://lkml.kernel.org/r/1498459987-24562-7-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ae85a807

zram: add free space management in backing device · 1363d466

由 Minchan Kim 提交于 9月 06, 2017

With backing device, zram needs management of free space of backing
device.

This patch adds bitmap logic to manage free space which is very naive.
However, it would be simple enough as considering uncompressible pages's
frequenty in zram.

Link: http://lkml.kernel.org/r/1498459987-24562-6-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1363d466

zram: add interface to specif backing device · 013bf95a

由 Minchan Kim 提交于 9月 06, 2017

For writeback feature, user should set up backing device before the zram
working.

This patch enables the interface via /sys/block/zramX/backing_dev.

Currently, it supports block device only but it could be enhanced for
file as well.

Link: http://lkml.kernel.org/r/1498459987-24562-5-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

013bf95a

zram: rename zram_decompress_page to __zram_bvec_read · 693dc1ce

由 Minchan Kim 提交于 9月 06, 2017

zram_decompress_page naming is not proper because it doesn't decompress
if page was dedup hit or stored with compression.

Use more abstract term and consistent with write path function
__zram_bvec_write.

Link: http://lkml.kernel.org/r/1498459987-24562-4-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

693dc1ce

zram: inline zram_compress · 97ec7c8b

由 Minchan Kim 提交于 9月 06, 2017

zram_compress does several things, compress, entry alloc and check
limitation.  I did for just readbility but it hurts modulization.:(

So this patch removes zram_compress functions and inline it in
__zram_bvec_write for upcoming patches.

Link: http://lkml.kernel.org/r/1498459987-24562-3-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

97ec7c8b

zram: clean up duplicated codes in __zram_bvec_write · 4ebbe7f7

由 Minchan Kim 提交于 9月 06, 2017

Patch series "writeback incompressible pages to storage", v1.

zRam is useful for memory saving with compressible pages but sometime,
workload can be changed and system has lots of incompressible pages
which is very harmful for zram.

This patch supports writeback feature of zram so admin can set up a
block device and with it, zram can save the memory via writing out the
incompressile pages once it found it's incompressible pages (1/4 comp
ratio) instead of keeping the page in memory.

[1-3] is just clean up and [4-8] is step by step feature enablement.
[4-8] is logically not bisectable(ie, logical unit separation)
although I tried to compiled out without breaking but I think it would
be better to review.

This patch (of 9):

__zram_bvec_write has some of duplicated logic for zram meta data
handling of same_page|compressed_page.  This patch aims to clean it up
without behavior change.

[xieyisheng1@huawei.com: fix compr_data_size stat]
  Link: http://lkml.kernel.org/r/1502707447-6944-1-git-send-email-xieyisheng1@huawei.com
Link: http://lkml.kernel.org/r/1496019048-27016-1-git-send-email-minchan@kernel.org
Link: http://lkml.kernel.org/r/1498459987-24562-2-git-send-email-minchan@kernel.orgSigned-off-by: NMinchan Kim <minchan@kernel.org>
Signed-off-by: NYisheng Xie <xieyisheng1@huawei.com>
Reviewed-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Juneho Choi <juno.choi@lge.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4ebbe7f7

rbd: silence bogus uninitialized use warning in rbd_acquire_lock() · 37f13252

由 Kefeng Wang 提交于 7月 13, 2017

  drivers/block/rbd.c: In function 'rbd_acquire_lock':
  drivers/block/rbd.c:3602:44: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]

Silence the warning, found it when built old kernel(3.10) with
OBS(opensuse build service).
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

37f13252

loop: set physical block size to logical block size · bf093753

由 Omar Sandoval 提交于 9月 05, 2017

Commit 6c6b6f28 ("loop: set physical block size to PAGE_SIZE")
caused mkfs.xfs to barf on ppc64 [1]. Always using PAGE_SIZE as the
physical block size still makes the most sense semantically, but let's
just lie and always set it to the same value as the logical block size
(same goes for io_min). In the future we might want to at least bump up
io_min to PAGE_SIZE but I'm sick of these stupid changes so let's play
it safe.

1: https://marc.info/?l=linux-xfs&m=150459024723753&w=2Tested-by: NChandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

bf093753

02 9月, 2017 2 次提交

block/loop: remove unused field · bc75705d

由 Shaohua Li 提交于 9月 01, 2017

nobody uses the list.
Signed-off-by: NShaohua Li <shli@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

bc75705d

block/loop: fix use after free · 92d77332

由 Shaohua Li 提交于 9月 01, 2017

lo_rw_aio->call_read_iter->
1       aops->direct_IO
2       iov_iter_revert
lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could
be freed before 2, which accesses bvec.
Signed-off-by: NShaohua Li <shli@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

92d77332

01 9月, 2017 6 次提交

block/loop: allow request merge for directio mode · 40326d8a

由 Shaohua Li 提交于 8月 31, 2017

Currently loop disables merge. While it makes sense for buffer IO mode,
directio mode can benefit from request merge. Without merge, loop could
send small size IO to underlayer disk and harm performance.
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NShaohua Li <shli@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

40326d8a

block/loop: set hw_sectors · 54bb0ade

由 Shaohua Li 提交于 8月 31, 2017

Loop can handle any size of request. Limiting it to 255 sectors just
burns the CPU for bio split and request merge for underlayer disk and
also cause bad fs block allocation in directio mode.
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NShaohua Li <shli@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

54bb0ade

loop: fold loop_switch() into callers · 43cade80

由 Omar Sandoval 提交于 8月 24, 2017

The comments here are really outdated, and blk-mq made flushing much
simpler, so just fold the two cases into the callers.
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

43cade80

loop: add ioctl for changing logical block size · 89e4fdec

由 Omar Sandoval 提交于 8月 24, 2017

This is a different approach from the first attempt in f2c6df7d
("loop: support 4k physical blocksize"). Rather than extending
LOOP_{GET,SET}_STATUS, add a separate ioctl just for setting the block
size.
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

89e4fdec

loop: set physical block size to PAGE_SIZE · 6c6b6f28

由 Omar Sandoval 提交于 8月 24, 2017

The physical block size is "the lowest possible sector size that the
hardware can operate on without reverting to read-modify-write
operations" (from the comment on blk_queue_physical_block_size()). Since
loop does buffered I/O on the backing file by default, the RMW unit is a
page. This isn't the case for direct I/O mode, but let's keep it simple.
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6c6b6f28

loop: get rid of lo_blocksize · 8a0740c4

由 Omar Sandoval 提交于 8月 24, 2017

This is only used for setting the soft block size on the struct
block_device once and then never used again.
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

8a0740c4

30 8月, 2017 9 次提交

drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set · 974c5856

由 NeilBrown 提交于 8月 30, 2017

Careful analysis shows that this flag is not needed.

The RESCUER flag is only needed when a make_request_fn might:
 - allocate a bio from the bioset
 - submit it with generic_make_request() or similar
 - allocate another bio from the bioset

The second allocation can block until the first bio is processed, so
a rescuer is needed to ensure the first bio does get processed.  With
a rescuer it will only get processed when the make_request_fn completes.

In drbd, allocations from drbd_io_bio_set happen from drbd_new_req()
or w_restart_disk_io() which is only called to handle
RESTART_FROZEN_DISK_IO.

In former is called precisely once from the make_request_fn.
The later is never called by within the make_request_fn.

So there cannot be two allocations in the same call to the
make_request_fn, so a rescuer is not needed.

Allocations from drbd_md_io_bio_set are used for IO to the bitmap and
the activity log.  There are only accessed from worker threads and
workqueues, never directly from make_request_fn.
Again, the rescuer isn't needed.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

974c5856

drbd: Fix allyesconfig build, fix recent commit · 5fc1efd5

由 Philipp Reisner 提交于 8月 30, 2017

Globals where prefixed with drbd_, that was missed in the
in #ifdef'nd code when it is built-in.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Fixes: 183ece30 ("drbd: move global variables to drbd namespace and make some static")
Signed-off-by: NJens Axboe <axboe@kernel.dk>

5fc1efd5

drbd: switch from kmalloc() to kmalloc_array() · 365cf663

由 Roland Kammerer 提交于 8月 29, 2017

We had one call to kmalloc that actually allocates an array. Switch that
one to the kmalloc_array() function.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

365cf663

drbd: abort drbd_start_resync if there is no connection · d3d2948f

由 Roland Kammerer 提交于 8月 29, 2017

This was found by a static analysis tool. While highly unlikely, be sure
to return without dereferencing the NULL pointer.
Reported-by: NShaobo <shaobo@cs.utah.edu>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d3d2948f

drbd: move global variables to drbd namespace and make some static · 183ece30

由 Roland Kammerer 提交于 8月 29, 2017

This is a follow-up to Gregs complaints that drbd clutteres the global
namespace.
Some of DRBD's module parameters are only used within one compilation
unit. Make these static.
Signed-off-by: NRoland Kammerer <roland.kammerer@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

183ece30

drbd: rename "usermode_helper" to "drbd_usermode_helper" · 8ab761e1

由 Greg Kroah-Hartman 提交于 8月 29, 2017

Nothing like having a very generic global variable in a tiny driver
subsystem to make a mess of the global namespace...

Note, there are many other "generic" named global variables in the drbd
subsystem, someone should fix those up one day before they hit a linking
error.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

8ab761e1

drbd: fix race between handshake and admin disconnect/down · cde81d99

由 Lars Ellenberg 提交于 8月 29, 2017

conn_try_disconnect() could potentialy hit the BUG_ON()
in _conn_set_state() where it iterates over _drbd_set_state()
and "asserts" via BUG_ON() that the latter was successful.

If the STATE_SENT bit was not yet visible to conn_is_valid_transition()
early in _conn_request_state(), but became visible before conn_set_state()
later in that call path, we could hit the BUG_ON() after _drbd_set_state(),
because it returned SS_IN_TRANSIENT_STATE.

To avoid that race, we better protect set_bit(SENT_STATE) with the spinlock.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

cde81d99

drbd: fix potential deadlock when trying to detach during handshake · 33d32fa7

由 Lars Ellenberg 提交于 8月 29, 2017

When requesting a detach, we first suspend IO, and also inhibit meta-data IO
by means of drbd_md_get_buffer(), because we don't want to "fail" the disk
while there is IO in-flight: the transition into D_FAILED for detach purposes
may get misinterpreted as actual IO error in a confused endio function.

We wrap it all into wait_event(), to retry in case the drbd_req_state()
returns SS_IN_TRANSIENT_STATE, as it does for example during an ongoing
connection handshake.

In that example, the receiver thread may need to grab drbd_md_get_buffer()
during the handshake to make progress. To avoid potential deadlock with
detach, detach needs to grab and release the meta data buffer inside of
that wait_event retry loop. To avoid lock inversion between
mutex_lock(&device->state_mutex) and drbd_md_get_buffer(device),
introduce a new enum chg_state_flag CS_INHIBIT_MD_IO, and move the
call to drbd_md_get_buffer() inside the state_mutex grabbed in
drbd_req_state().
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

33d32fa7

drbd: A single dot should be put into a sequence. · 427fd2be

由 Markus Elfring 提交于 8月 29, 2017

Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.
Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net>
Signed-off-by: NRoland Kammerer <roland.kammerer@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

427fd2be

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功