1. 16 Mar, 2016 3 commits
  2. 14 Mar, 2016 1 commit
  3. 05 Mar, 2016 1 commit
    • nbd: use correct div_s64 helper · 5e454c67
      Committed by Arnd Bergmann
      The do_div() macro now checks its arguments for the correct type,
      and refuses anything other than u64, so we get a warning about
      nbd_ioctl passing in an loff_t:
      
      drivers/block/nbd.c: In function '__nbd_ioctl':
      drivers/block/nbd.c:757:77: error: comparison of distinct pointer types lacks a cast [-Werror]
      
      This changes the nbd code to use div_s64() instead, which takes
      a signed argument.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 37091fdd ("nbd: Create size change events for userspace")
      Signed-off-by: Jens Axboe <axboe@fb.com>
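The signed division this fix relies on can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: `div_s64_sketch` and `blocks_in` are hypothetical names standing in for a signed 64-by-32 division helper of the kind the commit message calls div_s64().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a signed 64-by-32 division helper.
 * The quotient is truncated toward zero, as C99 integer division does. */
static int64_t div_s64_sketch(int64_t dividend, int32_t divisor)
{
    assert(divisor != 0); /* division by zero is a caller bug */
    return dividend / divisor;
}

/* Hypothetical helper: number of whole blocks in a signed byte size,
 * e.g. a size held in a signed, loff_t-like variable. */
static int64_t blocks_in(int64_t bytesize, int32_t blksize)
{
    return div_s64_sketch(bytesize, blksize);
}
```

Because the dividend stays signed end to end, no cast to an unsigned type is needed at the call site, which is the kind of mismatch the compiler error above points at.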
  4. 04 Mar, 2016 13 commits
  5. 15 Feb, 2016 1 commit
    • nbd: Create size change events for userspace · 37091fdd
      Committed by Markus Pargmann
      Userspace needs to know when nbd devices are ready for use.
      Currently no events are created for userspace, which does not work for
      systemd.
      
      See the discussion here: https://github.com/systemd/systemd/pull/358
      
      This patch uses a central point to set up the nbd-internal sizes. An ioctl
      to set a size does not lead to a visible size change. The size of the
      block device is kept at 0 until nbd is connected. As soon as it
      connects, the size is changed to the real value and a uevent is
      created. When disconnecting, the block device is set to size 0 and
      another uevent is generated.
      Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
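The size lifecycle described in this commit message can be modelled in a few lines. This is a hedged userspace sketch of the described behaviour only: `struct nbd_model` and the `model_*` functions are hypothetical, and the `events` counter merely stands in for emitted uevents.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state described in the commit message. */
struct nbd_model {
    int64_t bytesize;      /* internal size, set through an ioctl */
    int64_t visible_size;  /* size the block device reports */
    bool connected;
    unsigned events;       /* stand-in for the number of uevents emitted */
};

/* Central point that changes the visible size and emits one event. */
static void model_publish(struct nbd_model *m, int64_t size)
{
    m->visible_size = size;
    m->events++;
}

/* Setting a size only records it; nothing becomes visible yet. */
static void model_set_size(struct nbd_model *m, int64_t bytes)
{
    m->bytesize = bytes;
}

/* Connecting publishes the real size. */
static void model_connect(struct nbd_model *m)
{
    m->connected = true;
    model_publish(m, m->bytesize);
}

/* Disconnecting resets the visible size to 0 and emits another event. */
static void model_disconnect(struct nbd_model *m)
{
    m->connected = false;
    model_publish(m, 0);
}
```

In this model a listener sees exactly two events per session, one when the device becomes usable and one when it goes away.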
  6. 11 Feb, 2016 1 commit
    • null_blk: oops when initializing without lightnvm · a514379b
      Committed by Matias Bjørling
      If the LightNVM subsystem is not compiled into the kernel and the
      null_blk device driver requests lightnvm to be initialized, the call to
      nvm_register fails and the null_add_dev function cleans up the
      initialization. However, at this point the null block device has
      already been added to the nullb_list, so a second cleanup occurs
      after the function has returned, leading to a double call to
      blk_cleanup_queue.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
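One general way to avoid this class of double cleanup is to publish an object on a shared list only after every fallible step has succeeded. The sketch below illustrates that ordering in userspace. It is an assumption for illustration, not the actual patch, and `struct dev_model`, `model_add_dev` and `model_teardown` are hypothetical names.

```c
/* Hypothetical model of a device with a queue that must be cleaned once. */
struct dev_model {
    int queue_cleanups; /* how many times the queue was cleaned up */
    int on_list;        /* 1 once the device is on the shared list */
};

/* Fallible setup: on failure, clean up locally and never publish. */
static int model_add_dev(struct dev_model *d, int register_ok)
{
    if (!register_ok) {
        d->queue_cleanups++; /* the only cleanup this device will see */
        return -1;
    }
    d->on_list = 1;          /* publish last, after all checks passed */
    return 0;
}

/* Teardown walks the shared list, so it only touches published devices. */
static void model_teardown(struct dev_model *d)
{
    if (d->on_list) {
        d->queue_cleanups++;
        d->on_list = 0;
    }
}
```

With this ordering a failed registration and a later teardown together clean the queue exactly once.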
  7. 07 Feb, 2016 1 commit
    • floppy: refactor open() flags handling · 09954bad
      Committed by Jiri Kosina
      In case /dev/fdX is opened with O_NDELAY / O_NONBLOCK, floppy_open()
      immediately succeeds, without performing any further media / controller
      preparations. That's "correct" wrt. the O_NDELAY flag, but is hardly
      correct wrt. the rest of the floppy driver, which is not really
      O_NONBLOCK ready at all. Therefore it's not too surprising that
      subsequent attempts to work with the file descriptor produce bad
      results. Namely, the syzkaller tool has been able to livelock mmap() on
      the returned fd, which keeps waiting on the page unlock bit forever.
      
      Quite frankly, I have trouble defining what non-blocking behavior would be for
      floppies. Is waiting ages for the driver to actually succeed reading a sector
      a blocking operation? Is waiting for the drive motor to start a blocking
      operation? How about in case of virtualized floppies?
      
      One option would be returning EWOULDBLOCK in case O_NDELAY / O_NONBLOCK is
      being passed to open(). That has a theoretical potential of breaking some
      arcane and archaic userspace though.
      
      Let's take a more conservative approach, and accept the O_NDELAY flag, and let
      the driver behave as usual.
      
      While at it, clean up the handling of the !(mode & (FMODE_READ|FMODE_WRITE))
      case and return EINVAL there instead of succeeding.
      
      Spotted by syzkaller tool.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
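The resulting open-mode policy can be sketched as a small check. This is a userspace illustration under stated assumptions: the `SK_FMODE_*` constants and `open_mode_check` are hypothetical names with illustrative values, and the function only models the decision described above, not the driver code.

```c
#include <errno.h>

/* Hypothetical mode bits; the values are illustrative only. */
#define SK_FMODE_READ   0x1u
#define SK_FMODE_WRITE  0x2u
#define SK_FMODE_NDELAY 0x40u

/* Returns 0 if the open may proceed with the usual preparations,
 * or -EINVAL if neither read nor write access was requested. */
static int open_mode_check(unsigned mode)
{
    if (!(mode & (SK_FMODE_READ | SK_FMODE_WRITE)))
        return -EINVAL;

    /* The non-blocking bit is accepted but does not short-circuit
     * anything: the caller carries on exactly as for a blocking open. */
    return 0;
}
```

The point of the design is that the non-blocking flag no longer selects a separate, under-prepared code path.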
  8. 05 Feb, 2016 6 commits
    • nbd: ratelimit error msgs after socket close · da6ccaaa
      Committed by Dan Streetman
      Make the "Attempted send on closed socket" error messages generated in
      nbd_request_handler() ratelimited.
      
      When the nbd socket is shut down, the nbd_request_handler() function emits
      an error message for every request remaining in its queue.  If the queue
      is large, this floods the log with messages.  There's no need for a
      separate error message for each request, so this patch ratelimits it.
      
      In the specific case this was found, the system was virtual and the error
      messages were logged to the serial port, which overwhelmed it.
      
      Fixes: 4d48a542 ("nbd: fix I/O hang on disconnected nbds")
      Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
      Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
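The idea behind ratelimiting can be shown with a simple interval-and-burst limiter. This is a generic userspace sketch, not the kernel's ratelimit implementation: `struct ratelimit_model` and `ratelimit_ok` are hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```c
/* Hypothetical limiter: at most `burst` messages per `interval` ticks. */
struct ratelimit_model {
    long interval; /* window length in ticks */
    int burst;     /* messages allowed per window */
    long begin;    /* start tick of the current window */
    int printed;   /* messages allowed so far in this window */
    int missed;    /* messages suppressed in total */
};

/* Returns 1 if a message may be emitted at tick `now`, 0 if suppressed. */
static int ratelimit_ok(struct ratelimit_model *rl, long now)
{
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;   /* start a new window */
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }
    rl->missed++;          /* over budget: drop and count it */
    return 0;
}
```

A caller would wrap each error message in `if (ratelimit_ok(&rl, now))`, so a queue with thousands of pending requests produces only a bounded number of log lines per window.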
    • nbd: Move flag parsing to a function · d02cf531
      Committed by Markus Pargmann
      nbd changes properties of the blockdevice depending on flags that were
      received. This patch moves this flag parsing into a separate function
      nbd_parse_flags().
      Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
    • nbd: Cleanup reset of nbd and bdev after a disconnect · 0e4f0f6f
      Committed by Markus Pargmann
      Group all variables that are reset after a disconnect into reset
      functions. This patch adds two of these functions, nbd_reset() and
      nbd_bdev_reset().
      Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
    • nbd: Timeouts are not user requested disconnects · 1f7b5cf1
      Committed by Markus Pargmann
      It may be useful to know in the client that a connection timed out. The
      current code returns success for a timeout.
      
      This patch reports the error code -ETIMEDOUT for a timeout.
      Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
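The distinction this commit draws can be expressed as a small decision function. The sketch below is an assumption about how such a result could be derived, not the driver's code: `session_result` and its parameters are hypothetical, and the precedence of a user-requested disconnect over a timeout is a choice made for this illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical: compute the value reported to the client when a
 * session ends. */
static int session_result(bool user_disconnect, bool timed_out, int error)
{
    if (user_disconnect)
        return 0;           /* a requested disconnect is a success */
    if (timed_out)
        return -ETIMEDOUT;  /* a timeout is reported as an error */
    return error;           /* otherwise pass the error through */
}
```

A client can then tell a deliberate disconnect (0) from a connection that timed out (-ETIMEDOUT).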
    • nbd: Remove signal usage · 23272a67
      Committed by Markus Pargmann
      As discussed on the mailing list, the usage of signals for timeout
      handling has a lot of potential issues. The nbd driver has used signals
      for timeouts for some time. These signals were able to get the threads
      out of the blocking socket operations.
      
      This patch removes all signal usage and uses a socket shutdown instead.
      The socket descriptor itself is cleared later when the whole nbd device
      is closed.
      
      The tasks_lock is removed as we do not depend on this anymore. Instead
      a new lock for the socket is introduced so we can safely work with the
      socket in the timeout handler outside of the two main threads.
      
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • lightnvm: allow to force mm initialization · bf643185
      Committed by Matias Bjørling
      The system block allows the device to initialize with its configured
      media manager. The system block is written to disk and read again when
      the media manager is determined. For this to work, the backend must
      store the data. Device drivers such as null_blk do not have any backend
      storage. This patch allows the media manager to be initialized without a
      storage backend.
      
      It also fixes the incorrect configuration of capabilities in null_blk,
      as it does not support the get/set bad block interface.
      Signed-off-by: Matias Bjørling <m@bjorling.me>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  9. 03 Feb, 2016 1 commit
    • nbd: Fix debugfs error handling · 27ea43fe
      Committed by Markus Pargmann
      A static checker complains about the implemented error handling. It is
      indeed wrong. We don't care about the return values of created debugfs
      files.
      
      We only have to check the return values of created dirs for a NULL
      pointer. If we use a NULL pointer as the parent directory for files,
      the debugfs files may end up in the wrong places.
      Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
  10. 01 Feb, 2016 1 commit
    • floppy: fix lock_fdc() signal handling · a0c80efe
      Committed by Jiri Kosina
      floppy_revalidate() doesn't perform any error handling on the lock_fdc()
      result. lock_fdc() might actually be interrupted by a signal (it waits
      interruptibly for the fdc to become non-busy). In that case,
      floppy_revalidate() proceeds as if it had claimed the lock, but in fact
      it hasn't.
      
      In case of multiple threads trying to open("/dev/fdX"), this leads to
      serious corruptions all over the place, because all of a sudden there is
      no critical section protection (that'd otherwise be guaranteed by locked
      fd) whatsoever.
      
      While at it, fix the fact that the 'interruptible' parameter to
      lock_fdc() doesn't make any sense whatsoever, because we always wait
      interruptibly anyway.
      
      Most of the lock_fdc() callsites do properly handle error (and propagate
      EINTR), but floppy_revalidate() and floppy_check_events() don't. Fix this.
      
      Spotted by 'syzkaller' tool.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
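The bug pattern here is a caller that ignores the result of an interruptible lock attempt. The following userspace sketch models the corrected pattern only. `struct fdc_model` and the `model_*` functions are hypothetical, and the `interrupted` flag stands in for a signal arriving while waiting.

```c
#include <errno.h>

/* Hypothetical model of a controller guarded by an interruptible lock. */
struct fdc_model {
    int locked;       /* 1 while the lock is held */
    int revalidated;  /* number of completed revalidations */
};

/* Lock attempt that can fail with -EINTR instead of taking the lock. */
static int model_lock(struct fdc_model *f, int interrupted)
{
    if (interrupted)
        return -EINTR;
    f->locked = 1;
    return 0;
}

static void model_unlock(struct fdc_model *f)
{
    f->locked = 0;
}

/* Corrected caller: bail out and propagate the error if the lock was
 * not obtained, so the critical section never runs unprotected. */
static int model_revalidate(struct fdc_model *f, int interrupted)
{
    int ret = model_lock(f, interrupted);

    if (ret)
        return ret;
    f->revalidated++;  /* critical section, lock is held here */
    model_unlock(f);
    return 0;
}
```

The essential line is the early return: without it, the critical section would execute even though the lock was never taken.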
  11. 30 Jan, 2016 1 commit
  12. 27 Jan, 2016 2 commits
  13. 23 Jan, 2016 2 commits
  14. 22 Jan, 2016 1 commit
  15. 16 Jan, 2016 5 commits
    • mm, dax, pmem: introduce pfn_t · 34c0fd54
      Committed by Dan Williams
      For the purpose of communicating the optional presence of a 'struct
      page' for the pfn returned from ->direct_access(), introduce a type that
      encapsulates a page-frame-number plus flags.  These flags contain the
      historical "page_link" encoding for a scatterlist entry, but can also
      denote "device memory", where "device memory" is a set of pfns that are
      not part of the kernel's linear mapping by default, but are accessed via
      the same memory controller as ram.
      
      The motivation for this new type is large capacity persistent memory
      that needs struct page entries in the 'memmap' to support 3rd party DMA
      (i.e.  O_DIRECT I/O with a persistent memory source/target).  However,
      we also need it in support of maintaining a list of mapped inodes which
      need to be unmapped at driver teardown or freeze_bdev() time.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
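A type that carries a page-frame-number together with flags can be sketched as a wrapped 64-bit value with the flags in the top bits. The layout below is an assumption made for illustration: `pfn_sketch_t`, the `SK_PFN_*` bit positions and the helper names are hypothetical and are not claimed to match the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper: a struct instead of a bare integer, so a plain
 * pfn cannot be passed by accident where flags are expected. */
typedef struct { uint64_t val; } pfn_sketch_t;

/* Illustrative flag bits in the top nibble of the value. */
#define SK_PFN_DEV        (1ULL << 61)  /* pfn refers to device memory */
#define SK_PFN_MAP        (1ULL << 60)  /* device memory with struct page */
#define SK_PFN_FLAGS_MASK (0xFULL << 60)

/* Combine a frame number and flags into one value. */
static pfn_sketch_t pfn_make(uint64_t pfn, uint64_t flags)
{
    pfn_sketch_t p = {
        (pfn & ~SK_PFN_FLAGS_MASK) | (flags & SK_PFN_FLAGS_MASK)
    };
    return p;
}

/* Strip the flags to recover the frame number. */
static uint64_t pfn_number(pfn_sketch_t p)
{
    return p.val & ~SK_PFN_FLAGS_MASK;
}

/* In this model, ordinary memory always has a struct page, while
 * device memory has one only when it is flagged as mapped. */
static bool pfn_has_page(pfn_sketch_t p)
{
    return (p.val & SK_PFN_MAP) || !(p.val & SK_PFN_DEV);
}
```

A consumer can then ask `pfn_has_page()` before touching page metadata, which is the "optional presence" the commit message refers to.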
    • zram: don't call idr_remove() from zram_remove() · 17ec4cd9
      Committed by Jerome Marchand
      The use of idr_remove() is forbidden in the callback functions of
      idr_for_each().  It is therefore unsafe to call idr_remove in
      zram_remove().
      
      This patch moves the call to idr_remove() from zram_remove() to
      hot_remove_store().  In the destroy_devices() path, idrs are removed by
      idr_destroy().  This solves a use-after-free detected by KASan.
      
      [akpm@linux-foundation.org: fix coding style, per Sergey]
      Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
      Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: <stable@vger.kernel.org>	[4.2+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram/zcomp: do not zero out zcomp private pages · e02d238c
      Committed by Sergey Senozhatsky
      Do not use __GFP_ZERO when allocating zcomp ->private pages.  We keep
      allocated streams around and use them for read/write requests, so we
      supply a zeroed out ->private to the compression algorithm as a scratch
      buffer only once -- the first time we use that stream.  For the rest of
      the IO requests served by this stream, ->private usually contains some
      temporary data from the previous requests.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: pass gfp from zcomp frontend to backend · 75d8947a
      Committed by Minchan Kim
      Each zcomp backend uses its own gfp flags, but that is pointless because
      the context in which the backends are called is determined by the upper
      layer (i.e., the zcomp frontend).  Moreover, the zcomp frontend may call
      them in different contexts.  In one context (the zram init path) the
      allocation should be as likely to succeed as possible, while in another
      (allocating further streams to accelerate I/O) it is merely optional.
      So let's pass gfp down from the driver (i.e., the zcomp frontend),
      following the normal MM convention.
      
      [sergey.senozhatsky@gmail.com: add missing __vmalloc zero and highmem gfps]
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: try vmalloc() after kmalloc() · d913897a
      Committed by Kyeongdon Kim
      When using LZ4 multi compression streams for zram swap, we saw page
      allocation failure messages while running system tests.  This happened
      not just once, but a few times (2 - 5 times per test).  Some of the
      failures occurred repeatedly when trying order-3 allocations.
      
      To create the private data for parallel compression, kzalloc() has to
      be called at runtime with order 2/3 (lzo/lz4).  If no order-2/3 memory
      is available at that time, the page allocation fails.  This patch uses
      vmalloc() as a fallback for kmalloc(), which prevents the page
      allocation failure warning.
      
      With this patch applied, we no longer saw the warning message while
      running the tests, and process startup latency dropped by about
      60-120ms in each case.
      
      For reference, a call trace:
      
          Binder_1: page allocation failure: order:3, mode:0x10c0d0
          CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20
          Call trace:
            dump_backtrace+0x0/0x270
            show_stack+0x10/0x1c
            dump_stack+0x1c/0x28
            warn_alloc_failed+0xfc/0x11c
            __alloc_pages_nodemask+0x724/0x7f0
            __get_free_pages+0x14/0x5c
            kmalloc_order_trace+0x38/0xd8
            zcomp_lz4_create+0x2c/0x38
            zcomp_strm_alloc+0x34/0x78
            zcomp_strm_multi_find+0x124/0x1ec
            zcomp_strm_find+0xc/0x18
            zram_bvec_rw+0x2fc/0x780
            zram_make_request+0x25c/0x2d4
            generic_make_request+0x80/0xbc
            submit_bio+0xa4/0x15c
            __swap_writepage+0x218/0x230
            swap_writepage+0x3c/0x4c
            shrink_page_list+0x51c/0x8d0
            shrink_inactive_list+0x3f8/0x60c
            shrink_lruvec+0x33c/0x4cc
            shrink_zone+0x3c/0x100
            try_to_free_pages+0x2b8/0x54c
            __alloc_pages_nodemask+0x514/0x7f0
            __get_free_pages+0x14/0x5c
            proc_info_read+0x50/0xe4
            vfs_read+0xa0/0x12c
            SyS_read+0x44/0x74
          DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
               0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
      
      [minchan@kernel.org: change vmalloc gfp and adding comment about gfp]
      [sergey.senozhatsky@gmail.com: tweak comments and styles]
      Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
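The fallback strategy described in this commit message, trying a second allocator when the preferred one fails, can be sketched generically. This is a userspace illustration, not the zram code: `alloc_with_fallback`, `alloc_fn` and `always_fail` are hypothetical, with `always_fail` simulating a high-order allocation that cannot be satisfied.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator signature shared by both strategies. */
typedef void *(*alloc_fn)(size_t size);

/* Try the preferred allocator first and fall back only on failure. */
static void *alloc_with_fallback(size_t size, alloc_fn primary,
                                 alloc_fn fallback)
{
    void *p = primary(size);

    if (!p)
        p = fallback(size); /* may still return NULL */
    return p;
}

/* Simulates an allocator that cannot satisfy the request, e.g. because
 * no physically contiguous range of the required order is free. */
static void *always_fail(size_t size)
{
    (void)size;
    return NULL;
}
```

For example, `alloc_with_fallback(64, always_fail, malloc)` returns a usable buffer even though the primary allocator failed. One design consequence is that the release path must work no matter which allocator produced the buffer. In this sketch both real allocations come from `malloc`, so `free` is enough.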