1. 05 Oct, 2017 4 commits
    • sysctl: remove /proc/sys/vm/nr_pdflush_threads · b35bd0d9
      Jens Axboe committed
      This tunable has been obsolete since 2.6.32, and writes to the
      file have been failing and complaining in dmesg since then:
      
      nr_pdflush_threads exported in /proc is scheduled for removal
      
      That was 8 years ago. Remove the ABI obsolete notice for the file,
      and the /proc/sys file itself.
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • writeback: eliminate work item allocation in bd_start_writeback() · 85009b4f
      Jens Axboe committed
      Handle start-all writeback like we do periodic or kupdate
      style writeback - by marking the bdi_writeback as needing a full
      flush, and simply waking the thread. This eliminates the need to
      allocate and queue a specific work item just for this purpose.
      
      After this change, we truly only ever have one of them running at
      any point in time. We mark the need to start all flushes, and the
      writeback thread will clear it once it has processed the request.
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
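The flag-and-wake approach described in this commit can be sketched in userspace C11. This is only an illustration under stated assumptions, not the kernel code: `demo_wb`, `DEMO_WB_START_ALL` and both functions are hypothetical names, and the flush itself is reduced to a counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state bit: "a full flush has been requested". */
#define DEMO_WB_START_ALL 1UL

struct demo_wb {
    atomic_ulong state;     /* request bits, shared with the worker */
    unsigned long flushes;  /* counts full flushes the worker performed */
};

/*
 * Request a full flush without allocating a work item.
 * Returns true if the caller set the bit and should wake the worker,
 * false if a request was already pending.
 */
bool demo_request_start_all(struct demo_wb *wb)
{
    assert(wb != NULL);
    unsigned long old = atomic_fetch_or(&wb->state, DEMO_WB_START_ALL);
    return !(old & DEMO_WB_START_ALL);
}

/*
 * One pass of the worker: if a full flush is pending, perform it and
 * clear the bit only after the request has been processed.
 */
bool demo_worker_run_once(struct demo_wb *wb)
{
    assert(wb != NULL);
    if (!(atomic_load(&wb->state) & DEMO_WB_START_ALL))
        return false;
    wb->flushes++;  /* stand-in for the actual writeback */
    atomic_fetch_and(&wb->state, ~DEMO_WB_START_ALL);
    return true;
}
```

Because repeated requests collapse into a single pending bit, at most one start-all flush is outstanding at any time, which matches the behaviour the commit message describes.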
    • blk-mq: document the need to have STARTED and COMPLETED share a byte · fc13457f
      Jens Axboe committed
      For memory ordering guarantees on stores, we need to ensure that
      these two bits share the same byte of storage in the unsigned
      long. Add a comment as to why, and a BUILD_BUG_ON() to ensure that
      we don't violate this requirement.
      Suggested-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
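A compile-time check of the kind this commit describes might look like the following in plain C11. The bit numbers and the `DEMO_` names are assumptions made for illustration, and `static_assert` stands in for the kernel's BUILD_BUG_ON().

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical bit positions inside one unsigned long of flags. */
enum {
    DEMO_ATOM_COMPLETE = 0,
    DEMO_ATOM_STARTED  = 1,
};

/* Index of the byte that contains a given bit number. */
#define DEMO_BYTE_OF(bit) ((bit) / CHAR_BIT)

/* Refuse to build if the two bits ever end up in different bytes. */
static_assert(DEMO_BYTE_OF(DEMO_ATOM_STARTED) ==
              DEMO_BYTE_OF(DEMO_ATOM_COMPLETE),
              "STARTED and COMPLETE must share a byte");

/* Run-time form of the same check, usable for arbitrary bit numbers. */
int demo_same_byte(unsigned int a, unsigned int b)
{
    return DEMO_BYTE_OF(a) == DEMO_BYTE_OF(b);
}
```

With 8-bit bytes, moving either flag to bit 8 or above while the other stays below it would break the build instead of silently weakening the ordering guarantee.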
    • blk-mq: attempt to fix atomic flag memory ordering · a7af0af3
      Peter Zijlstra committed
      Attempt to untangle the ordering in blk-mq. The patch introducing the
      single smp_mb__before_atomic() is obviously broken in that it doesn't
      clearly specify a pairing barrier and an obtained guarantee.
      
      The comment is further misleading in that it hints that the
      deadline store and the COMPLETE store also need to be ordered, but
      AFAICT there is no such dependency. However what does appear to be
      important is the clear happening _after_ the store, and that worked by
      pure accident.
      
      This clarifies blk_mq_start_request() -- we should not get there with
      STARTING set -- this simplifies the code and makes the barrier usage
      sane (the old code could be read to allow not having _any_ atomic after
      the barrier, in which case the barrier hasn't got anything to order). We
      then also introduce the missing pairing barrier for it.
      
      Also down-grade the barrier to smp_wmb(), this is cheaper for
      PowerPC/ARM and doesn't cost anything extra on x86.
      
      And it documents the STARTING vs COMPLETE ordering. Although I've not
      been entirely successful in reverse engineering the blk-mq state
      machine so there might still be more funnies around timeout vs
      requeue.
      
      If I got anything wrong, feel free to educate me by adding comments to
      clarify things ;-)
      
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ming Lei <tom.leiming@gmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Andrea Parri <parri.andrea@gmail.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Fixes: 538b7534 ("blk-mq: request deadline must be visible before marking rq as started")
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
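The idea of a write barrier that needs a pairing barrier on the reader side can be illustrated with C11 atomics. This is a rough userspace analogy, not the blk-mq code: `demo_request` and both functions are hypothetical, and C11 release and acquire fences are used as an approximation of the kernel's smp_wmb() and smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request with a payload and a "started" flag. */
struct demo_request {
    unsigned long deadline;  /* plain data published by the writer */
    atomic_bool started;     /* flag the reader polls */
};

/* Writer: store the deadline, then the flag, with a fence in between. */
void demo_start(struct demo_request *rq, unsigned long deadline)
{
    assert(rq != NULL);
    rq->deadline = deadline;
    atomic_thread_fence(memory_order_release);  /* roughly smp_wmb() */
    atomic_store_explicit(&rq->started, true, memory_order_relaxed);
}

/*
 * Reader: the pairing barrier. Only after seeing the flag and issuing
 * the acquire fence is it safe to read the deadline.
 */
bool demo_read_deadline(struct demo_request *rq, unsigned long *out)
{
    assert(rq != NULL && out != NULL);
    if (!atomic_load_explicit(&rq->started, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);  /* roughly smp_rmb() */
    *out = rq->deadline;
    return true;
}
```

A barrier on the writer side alone gives no guarantee: without the reader's fence, the load of `deadline` could be satisfied before the load of `started`.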
  2. 03 Oct, 2017 17 commits
  3. 01 Oct, 2017 1 commit
  4. 30 Sep, 2017 2 commits
  5. 27 Sep, 2017 1 commit
  6. 26 Sep, 2017 15 commits
    • block: cryptoloop - Fix build warning · 9979d545
      Corentin Labbe committed
      This patch fixes the following build warning:
      drivers/block/cryptoloop.c:46:8: warning: variable 'cipher' set but not used [-Wunused-but-set-variable]
      Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block/loop: make loop cgroup aware · d4478e92
      Shaohua Li committed
      The loop block device handles IO in a separate thread. The IO it
      actually dispatches isn't cloned from the IO the loop device received,
      so the dispatched IO loses the cgroup context.

      I'm ignoring the buffered IO case for now, which is quite complicated.
      Making the loop thread aware of the cgroup context doesn't really help
      there: the loop device only writes to a single file, and in the current
      writeback cgroup implementation the file can only belong to one cgroup.

      For the direct IO case, we could work around the issue in theory. For
      example, say we assign cgroup1 5M/s of bandwidth for the loop device and
      cgroup2 10M/s. We can create a special cgroup for the loop thread and
      assign at least 15M/s to the underlying disk. In this way, we correctly
      throttle the two cgroups. But this is tricky to set up.

      This patch tries to address the issue. We record the bio's css in the
      loop command. When the loop thread handles the command, we use the API
      provided in patch 1 to set the css for the current task. The bio layer
      will then use that css for new IO (from patch 3).
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: make blkcg aware of kthread stored original cgroup info · 902ec5b6
      Shaohua Li committed
      bio_blkcg is the only API to get cgroup info for a bio right now. If
      bio_blkcg finds that the current task is a kthread with an original
      blkcg associated, it uses that css instead of associating the bio with
      the current task. This makes it possible for a kthread to dispatch bios
      on behalf of other threads.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
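The lookup rule described in this commit (prefer the css stored for a kthread, otherwise fall back to the task's own) can be sketched as follows. All types and function names here are hypothetical stand-ins chosen for illustration; they are not the kernel's structures or the real bio_blkcg implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup subsystem state. */
struct demo_css {
    int id;
};

/* Hypothetical task: kthreads may carry a css borrowed from the submitter. */
struct demo_task {
    int is_kthread;
    struct demo_css *own_css;   /* cgroup the task itself belongs to */
    struct demo_css *orig_css;  /* css of the thread the job came from */
};

/* Record (or clear, with NULL) the originating css on a kthread. */
void demo_kthread_associate(struct demo_task *t, struct demo_css *css)
{
    assert(t != NULL && t->is_kthread);
    t->orig_css = css;
}

/* Pick the css that new IO issued by this task should be charged to. */
struct demo_css *demo_effective_css(const struct demo_task *t)
{
    assert(t != NULL);
    if (t->is_kthread && t->orig_css)
        return t->orig_css;
    return t->own_css;
}
```

In this sketch a worker would call `demo_kthread_associate()` before handling a job and clear it afterwards, so IO issued in between is charged to the submitter.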
    • blkcg: delete unused APIs · af551fb3
      Shaohua Li committed
      Nobody uses the APIs right now.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • kthread: add a mechanism to store cgroup info · 05e3db95
      Shaohua Li committed
      A kthread usually runs jobs on behalf of other threads. The jobs should
      be charged to the cgroup of the original threads. But the jobs run in a
      kthread, where we lose the cgroup context of the original threads. This
      patch adds a mechanism to record the cgroup info of the original threads
      in the kthread context. Later we can retrieve the cgroup info and attach
      it to jobs.

      Since this mechanism is only required by kthreads, we store the cgroup
      info in the kthread data instead of the generic task_struct.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · e365806a
      Linus Torvalds committed
      Pull compat fix from Al Viro:
       "I really wish gcc warned about conversions from pointer to function
        into void *..."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix a typo in put_compat_shm_info()
    • fix a typo in put_compat_shm_info() · b776e4b1
      Al Viro committed
      "uip" misspelled as "up"; unfortunately, the latter happens to be
      a function and gcc is happy to convert it to void *...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
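The class of typo described here can be reproduced in a few lines of C. The names below are hypothetical (`demo_up` stands in for an unrelated function that happens to share the misspelled name, `demo_dest` for a routine taking a `void *` destination). The example relies on GCC's default behaviour as a GNU extension; ISO C does not permit this conversion, and `-Wpedantic` would flag it.

```c
#include <assert.h>
#include <stddef.h>

/* An unrelated function whose name matches the misspelling. */
void demo_up(void)
{
}

/* Takes a generic destination pointer, as a copy routine would. */
const void *demo_dest(const void *dst)
{
    return dst;
}

/* Returns 1 when the typo variant yields a different address. */
int demo_typo_compiles(void)
{
    int demo_uip = 0;

    /* What was meant: the address of the caller's buffer. */
    const void *intended = demo_dest(&demo_uip);

    /* The typo: a function name quietly converted to void *. */
    const void *typo = demo_dest(demo_up);

    assert(intended != NULL);
    return intended != typo;
}
```

Both calls compile without a diagnostic under default GCC settings, which is why the bug went unnoticed at build time.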
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 19240e6b
      Linus Torvalds committed
      Pull block fixes from Jens Axboe:
      
       - Two sets of NVMe pull requests from Christoph:
            - Fixes for the Fibre Channel host/target to fix spec compliance
            - Allow a zero keep alive timeout
            - Make the debug printk for broken SGLs work better
            - Fix queue zeroing during initialization
            - Set of RDMA and FC fixes
            - Target div-by-zero fix
      
       - bsg double-free fix.
      
       - nbd unknown ioctl fix from Josef.
      
       - Buffered vs O_DIRECT page cache inconsistency fix. Has been floating
         around for a long time, well reviewed. From Lukas.
      
       - brd overflow fix from Mikulas.
      
       - Fix for a loop regression in this merge window, where using a union
         for two members of the loop_cmd turned out to be a really bad idea.
         From Omar.
      
       - Fix for an iostat regression in this series, caused by using the
         wrong API to get at the block queue. From Shaohua.
      
       - Fix for a potential blktrace deletion deadlock. From Waiman.
      
      * 'for-linus' of git://git.kernel.dk/linux-block: (30 commits)
        nvme-fcloop: fix port deletes and callbacks
        nvmet-fc: sync header templates with comments
        nvmet-fc: ensure target queue id within range.
        nvmet-fc: on port remove call put outside lock
        nvme-rdma: don't fully stop the controller in error recovery
        nvme-rdma: give up reconnect if state change fails
        nvme-core: Use nvme_wq to queue async events and fw activation
        nvme: fix sqhd reference when admin queue connect fails
        block: fix a crash caused by wrong API
        fs: Fix page cache inconsistency when mixing buffered and AIO DIO
        nvmet: implement valid sqhd values in completions
        nvme-fabrics: Allow 0 as KATO value
        nvme: allow timed-out ios to retry
        nvme: stop aer posting if controller state not live
        nvme-pci: Print invalid SGL only once
        nvme-pci: initialize queue memory before interrupts
        nvmet-fc: fix failing max io queue connections
        nvme-fc: use transport-specific sgl format
        nvme: add transport SGL definitions
        nvme.h: remove FC transport-specific error values
        ...
    • Merge tag 'gfs2-for-linus-4.14-rc3' of... · 17763641
      Linus Torvalds committed
      Merge tag 'gfs2-for-linus-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
      
      Pull gfs2 fix from Bob Peterson:
       "GFS2: Fix an old regression in GFS2's debugfs interface
      
       This fixes a regression introduced by commit 88ffbf3e ("GFS2: Use
       resizable hash table for glocks"). The regression caused the glock dump
       in debugfs to not report all the glocks, which makes debugging
       extremely difficult"
      
      * tag 'gfs2-for-linus-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        gfs2: Fix debugfs glocks dump
    • Merge tag 'microblaze-4.14-rc3' of git://git.monstr.eu/linux-2.6-microblaze · cf034616
      Linus Torvalds committed
      Pull Microblaze fixes from Michal Simek:
      
       - Kbuild fix
      
       - use vma_pages
      
       - set the default endianness to little endian
      
      * tag 'microblaze-4.14-rc3' of git://git.monstr.eu/linux-2.6-microblaze:
        arch: change default endian for microblaze
        microblaze: Cocci spatch "vma_pages"
        microblaze: Add missing kvm_para.h to Kbuild
    • Merge tag 'trace-v4.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · ac0a3646
      Linus Torvalds committed
      Pull tracing fixes from Steven Rostedt:
       "Stack tracing and RCU have been having issues with each other and
        lockdep has been pointing out constant problems.
      
        The changes have been going into the stack tracer, but it has been
        discovered that the problem isn't with the stack tracer itself, but it
        is with calling save_stack_trace() from within the internals of RCU.
      
        The stack tracer is the one that can trigger the issue the easiest,
        but examining the problem further, it could also happen from a WARN()
        in the wrong place, or even if an NMI happened in this area and it did
        an rcu_read_lock().
      
        The critical area is where RCU is not watching, which can happen while
        going to and from idle, or bringing up or taking down a CPU.
      
        The final fix was to put the protection in kernel_text_address() as it
        is the one that requires RCU to be watching while doing the stack
        trace.
      
        To make this work properly, Paul had to allow rcu_irq_enter() to happen
        after rcu_nmi_enter(). This should have been done anyway, since an NMI
        can page fault (reading vmalloc area), and a page fault triggers
        rcu_irq_enter().
      
        One patch is just a consolidation of code so that the fix only needed
        to be done in one location"
      
      * tag 'trace-v4.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Remove RCU work arounds from stack tracer
        extable: Enable RCU if it is not watching in kernel_text_address()
        extable: Consolidate *kernel_text_address() functions
        rcu: Allow for page faults in NMI handlers
    • nvme-fcloop: fix port deletes and callbacks · fddc9923
      James Smart committed
      Now that there are potentially long delays between when a remoteport or
      targetport delete call is made and when the callback occurs (dev_loss_tmo
      timeout), no longer block in the delete routines and move the final nport
      puts to the callbacks.

      Moved the fcloop_nport_get/put/free routines to avoid forward declarations.

      Ensure port_info structs used in registrations are nulled in case fields
      are not set (e.g. devloss_tmo values).
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvmet-fc: sync header templates with comments · 6b71f9e1
      James Smart committed
      Comments were incorrect:
      - defer_rcv was in the host port template; moved it to the target port template
      - Added Mandatory statements for target port template items
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvmet-fc: ensure target queue id within range. · 0c319d3a
      James Smart committed
      When searching for queue ids, ensure they are within the expected range.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
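A bounds check of this kind, done before the id is used as an array index, could look like the sketch below. The structures, the `DEMO_MAX_QUEUES` limit and the function name are assumptions for illustration and do not reflect the actual nvmet-fc data layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on valid queue ids (0..DEMO_MAX_QUEUES). */
#define DEMO_MAX_QUEUES 4

struct demo_queue {
    unsigned int qid;
};

/* Hypothetical association holding one slot per possible queue id. */
struct demo_assoc {
    struct demo_queue *queues[DEMO_MAX_QUEUES + 1];
};

/* Look up a queue by id, rejecting ids outside the table. */
struct demo_queue *demo_find_queue(const struct demo_assoc *assoc,
                                   unsigned int qid)
{
    assert(assoc != NULL);
    if (qid > DEMO_MAX_QUEUES)
        return NULL;            /* out of range: never index the array */
    return assoc->queues[qid];  /* may still be NULL if unused */
}
```

An id that arrives from the wire is untrusted input, so the range check has to come before the array access rather than after it.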
    • nvmet-fc: on port remove call put outside lock · 3688feb5
      James Smart committed
      Avoid calling the put routine while holding the target lock, as it
      may traverse into the free routines.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
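The pattern of detaching an object under the lock and dropping the reference only after unlocking can be sketched in userspace C11. Everything here is hypothetical: the `demo_` types, the spinlock built on `atomic_flag`, and the `lock_held` field, which exists only so the example can show where the final put happened.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical refcounted object; "freed" stands in for the free path. */
struct demo_assoc {
    int refs;
    int freed;
    int freed_under_lock;  /* records whether the lock was held at free */
};

struct demo_port {
    atomic_flag lock;      /* stand-in for the target lock */
    int lock_held;         /* bookkeeping for the demonstration only */
    struct demo_assoc *assoc;
};

void demo_lock(struct demo_port *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;  /* spin until the flag is released */
    p->lock_held = 1;
}

void demo_unlock(struct demo_port *p)
{
    p->lock_held = 0;
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Drop one reference; the last put runs the (simulated) free routine. */
void demo_put(struct demo_port *p, struct demo_assoc *a)
{
    assert(a != NULL && a->refs > 0);
    if (--a->refs == 0) {
        a->freed = 1;
        a->freed_under_lock = p->lock_held;
    }
}

/* Detach under the lock, but call put only after unlocking. */
void demo_port_remove(struct demo_port *p)
{
    struct demo_assoc *a;

    demo_lock(p);
    a = p->assoc;
    p->assoc = NULL;
    demo_unlock(p);

    if (a)
        demo_put(p, a);
}
```

Keeping the put outside the critical section means a free routine that itself needs the lock, or that may sleep, cannot deadlock against the caller.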