1. 11 11月, 2017 4 次提交
  2. 04 11月, 2017 3 次提交
  3. 13 10月, 2017 1 次提交
  4. 06 10月, 2017 1 次提交
  5. 25 9月, 2017 1 次提交
    • W
      blktrace: Fix potential deadlock between delete & sysfs ops · 5acb3cc2
      Waiman Long 提交于
      The lockdep code had reported the following unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(s_active#228);
                                     lock(&bdev->bd_mutex/1);
                                     lock(s_active#228);
        lock(&bdev->bd_mutex);
      
       *** DEADLOCK ***
      
      The deadlock may happen when one task (CPU1) is trying to delete a
      partition in a block device and another task (CPU0) is accessing
      tracing sysfs file (e.g. /sys/block/dm-1/trace/act_mask) in that
      partition.
      
      The s_active isn't an actual lock. It is a reference count (kn->count)
      on the sysfs (kernfs) file. Removal of a sysfs file, however, require
      a wait until all the references are gone. The reference count is
      treated like a rwsem using lockdep instrumentation code.
      
      The fact that a thread is in the sysfs callback method or in the
      ioctl call means there is a reference to the opended sysfs or device
      file. That should prevent the underlying block structure from being
      removed.
      
      Instead of using bd_mutex in the block_device structure, a new
      blk_trace_mutex is now added to the request_queue structure to protect
      access to the blk_trace structure.
      Suggested-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      
      Fix typo in patch subject line, and prune a comment detailing how
      the code used to work.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      5acb3cc2
  6. 29 8月, 2017 1 次提交
    • Y
      smp: Avoid using two cache lines for struct call_single_data · 966a9671
      Ying Huang 提交于
      struct call_single_data is used in IPIs to transfer information between
      CPUs.  Its size is bigger than sizeof(unsigned long) and less than
      cache line size.  Currently it is not allocated with any explicit alignment
      requirements.  This makes it possible for allocated call_single_data to
      cross two cache lines, which results in double the number of the cache lines
      that need to be transferred among CPUs.
      
      This can be fixed by requiring call_single_data to be aligned with the
      size of call_single_data. Currently the size of call_single_data is the
      power of 2.  If we add new fields to call_single_data, we may need to
      add padding to make sure the size of new definition is the power of 2
      as well.
      
      Fortunately, this is enforced by GCC, which will report bad sizes.
      
      To set alignment requirements of call_single_data to the size of
      call_single_data, a struct definition and a typedef is used.
      
      To test the effect of the patch, I used the vm-scalability multiple
      thread swap test case (swap-w-seq-mt).  The test will create multiple
      threads and each thread will eat memory until all RAM and part of swap
      is used, so that huge number of IPIs are triggered when unmapping
      memory.  In the test, the throughput of memory writing improves ~5%
      compared with misaligned call_single_data, because of faster IPIs.
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NHuang, Ying <ying.huang@intel.com>
      [ Add call_single_data_t and align with size of call_single_data. ]
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      966a9671
  7. 24 8月, 2017 1 次提交
    • B
      bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer · 50b4d485
      Benjamin Block 提交于
      Since we split the scsi_request out of struct request bsg fails to
      provide a reply-buffer for the drivers. This was done via the pointer
      for sense-data, that is not preallocated anymore.
      
      Failing to allocate/assign it results in illegal dereferences because
      LLDs use this pointer unquestioned.
      
      An example panic on s390x, using the zFCP driver, looks like this (I had
      debugging on, otherwise NULL-pointer dereferences wouldn't even panic on
      s390x):
      
      Unable to handle kernel pointer dereference in virtual kernel address space
      Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6403
      Fault in home space mode while using kernel ASCE.
      AS:0000000001590007 R3:0000000000000024
      Oops: 0038 ilc:2 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      Modules linked in: <Long List>
      CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-bsg-regression+ #3
      Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
      task: 0000000065cb0100 task.stack: 0000000065cb4000
      Krnl PSW : 0704e00180000000 000003ff801e4156 (zfcp_fc_ct_els_job_handler+0x16/0x58 [zfcp])
                 R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
      Krnl GPRS: 0000000000000001 000000005fa9d0d0 000000005fa9d078 0000000000e16866
                 000003ff00000290 6b6b6b6b6b6b6b6b 0000000059f78f00 000000000000000f
                 00000000593a0958 00000000593a0958 0000000060d88800 000000005ddd4c38
                 0000000058b50100 07000000659cba08 000003ff801e8556 00000000659cb9a8
      Krnl Code: 000003ff801e4146: e31020500004        lg      %r1,80(%r2)
                 000003ff801e414c: 58402040           l       %r4,64(%r2)
                #000003ff801e4150: e35020200004       lg      %r5,32(%r2)
                >000003ff801e4156: 50405004           st      %r4,4(%r5)
                 000003ff801e415a: e54c50080000       mvhi    8(%r5),0
                 000003ff801e4160: e33010280012       lt      %r3,40(%r1)
                 000003ff801e4166: a718fffb           lhi     %r1,-5
                 000003ff801e416a: 1803               lr      %r0,%r3
      Call Trace:
      ([<000003ff801e8556>] zfcp_fsf_req_complete+0x726/0x768 [zfcp])
       [<000003ff801ea82a>] zfcp_fsf_reqid_check+0x102/0x180 [zfcp]
       [<000003ff801eb980>] zfcp_qdio_int_resp+0x230/0x278 [zfcp]
       [<00000000009b91b6>] qdio_kick_handler+0x2ae/0x2c8
       [<00000000009b9e3e>] __tiqdio_inbound_processing+0x406/0xc10
       [<00000000001684c2>] tasklet_action+0x15a/0x1d8
       [<0000000000bd28ec>] __do_softirq+0x3ec/0x848
       [<00000000001675a4>] irq_exit+0x74/0xf8
       [<000000000010dd6a>] do_IRQ+0xba/0xf0
       [<0000000000bd19e8>] io_int_handler+0x104/0x2d4
       [<00000000001033b6>] enabled_wait+0xb6/0x188
      ([<000000000010339e>] enabled_wait+0x9e/0x188)
       [<000000000010396a>] arch_cpu_idle+0x32/0x50
       [<0000000000bd0112>] default_idle_call+0x52/0x68
       [<00000000001cd0fa>] do_idle+0x102/0x188
       [<00000000001cd41e>] cpu_startup_entry+0x3e/0x48
       [<0000000000118c64>] smp_start_secondary+0x11c/0x130
       [<0000000000bd2016>] restart_int_handler+0x62/0x78
       [<0000000000000000>]           (null)
      INFO: lockdep is turned off.
      Last Breaking-Event-Address:
       [<000003ff801e41d6>] zfcp_fc_ct_job_handler+0x3e/0x48 [zfcp]
      
      Kernel panic - not syncing: Fatal exception in interrupt
      
      This patch moves bsg-lib to allocate and setup struct bsg_job ahead of
      time, including the allocation of a buffer for the reply-data.
      
      This means, struct bsg_job is not allocated separately anymore, but as part
      of struct request allocation - similar to struct scsi_cmd. Reflect this in
      the function names that used to handle creation/destruction of struct
      bsg_job.
      Reported-by: NSteffen Maier <maier@linux.vnet.ibm.com>
      Suggested-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBenjamin Block <bblock@linux.vnet.ibm.com>
      Fixes: 82ed4db4 ("block: split scsi_request out of struct request")
      Cc: <stable@vger.kernel.org> #4.11+
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      50b4d485
  8. 10 8月, 2017 1 次提交
  9. 28 6月, 2017 4 次提交
  10. 22 6月, 2017 1 次提交
  11. 21 6月, 2017 4 次提交
  12. 19 6月, 2017 3 次提交
  13. 15 6月, 2017 1 次提交
    • B
      block: Fix a blk_exit_rl() regression · dc9edc44
      Bart Van Assche 提交于
      Avoid that the following complaint is reported:
      
       BUG: sleeping function called from invalid context at kernel/workqueue.c:2790
       in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3
       1 lock held by rcuop/3/41:
        #0:  (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500
       Call Trace:
        dump_stack+0x86/0xcf
        ___might_sleep+0x174/0x260
        __might_sleep+0x4a/0x80
        flush_work+0x7e/0x2e0
        __cancel_work_timer+0x143/0x1c0
        cancel_work_sync+0x10/0x20
        blk_throtl_exit+0x25/0x60
        blkcg_exit_queue+0x35/0x40
        blk_release_queue+0x42/0x130
        kobject_put+0xa9/0x190
      
      This happens since we invoke callbacks that need to block from the
      queue release handler. Fix this by pushing the final release to
      a workqueue.
      Reported-by: NRoss Zwisler <zwisler@gmail.com>
      Fixes: commit b425e504 ("block: Avoid that blk_exit_rl() triggers a use-after-free")
      Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      
      Updated changelog
      Signed-off-by: NJens Axboe <axboe@fb.com>
      dc9edc44
  14. 09 6月, 2017 2 次提交
    • C
      block: switch bios to blk_status_t · 4e4cbee9
      Christoph Hellwig 提交于
      Replace bi_error with a new bi_status to allow for a clear conversion.
      Note that device mapper overloaded bi_error with a private value, which
      we'll have to keep arround at least for now and thus propagate to a
      proper blk_status_t value.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      4e4cbee9
    • C
      block: introduce new block status code type · 2a842aca
      Christoph Hellwig 提交于
      Currently we use nornal Linux errno values in the block layer, and while
      we accept any error a few have overloaded magic meanings.  This patch
      instead introduces a new  blk_status_t value that holds block layer specific
      status codes and explicitly explains their meaning.  Helpers to convert from
      and to the previous special meanings are provided for now, but I suspect
      we want to get rid of them in the long run - those drivers that have a
      errno input (e.g. networking) usually get errnos that don't know about
      the special block layer overloads, and similarly returning them to userspace
      will usually return somethings that strictly speaking isn't correct
      for file system operations, but that's left as an exercise for later.
      
      For now the set of errors is a very limited set that closely corresponds
      to the previous overloaded errno values, but there is some low hanging
      fruite to improve it.
      
      blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
      typechecking, so that we can easily catch places passing the wrong values.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      2a842aca
  15. 02 6月, 2017 1 次提交
  16. 09 5月, 2017 1 次提交
    • D
      block, dax: move "select DAX" from BLOCK to FS_DAX · ef510424
      Dan Williams 提交于
      For configurations that do not enable DAX filesystems or drivers, do not
      require the DAX core to be built.
      
      Given that the 'direct_access' method has been removed from
      'block_device_operations', we can also go ahead and remove the
      block-related dax helper functions from fs/block_dev.c to
      drivers/dax/super.c. This keeps dax details out of the block layer and
      lets the DAX core be built as a module in the FS_DAX=n case.
      
      Filesystems need to include dax.h to call bdev_dax_supported().
      
      Cc: linux-xfs@vger.kernel.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: NJan Kara <jack@suse.com>
      Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      ef510424
  17. 04 5月, 2017 2 次提交
  18. 28 4月, 2017 1 次提交
  19. 26 4月, 2017 1 次提交
  20. 21 4月, 2017 6 次提交