1. 14 Aug, 2015 8 commits
    • K
      md/raid5: get rid of bio_fits_rdev() · 7140aafc
      Kent Overstreet authored
      Remove bio_fits_rdev() as sufficient merge_bvec_fn() handling is now
      performed by blk_queue_split() in md_make_request().
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: linux-raid@vger.kernel.org
      Acked-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      [dpark: add more description in commit message]
      Signed-off-by: Dongsu Park <dpark@posteo.net>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      7140aafc
    • M
      md/raid5: split bio for chunk_aligned_read · 7ef6b12a
      Ming Lin authored
      If a read request fits entirely in a chunk, it will be passed directly to the
      underlying device (providing it hasn't failed of course).  If it doesn't fit,
      the slightly less efficient path that uses the stripe_cache is used.
      Requests that get to the stripe cache are always completely split up as
      necessary.
      
      So with RAID5, ripping out the merge_bvec_fn doesn't cause it to stop working,
      but could cause it to take the less efficient path more often.

      All that is needed to manage this is for 'chunk_aligned_read' to do some bio
      splitting, much like the RAID0 code does.
      
      Cc: Neil Brown <neilb@suse.de>
      Cc: linux-raid@vger.kernel.org
      Acked-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      7ef6b12a
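The chunk-boundary test behind "fits entirely in a chunk" can be sketched in userspace C. This is a simplified model, not the raid5 code: the helper names are hypothetical, and the chunk size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sectors left between `sector` and the end of its
 * chunk. Assumes chunk_sectors is a power of two. */
static uint32_t sectors_until_chunk_end(uint64_t sector, uint32_t chunk_sectors)
{
    return chunk_sectors - (uint32_t)(sector & (chunk_sectors - 1));
}

/* A read fits in one chunk when it ends at or before the chunk boundary. */
static int fits_in_chunk(uint64_t sector, uint32_t nr_sectors,
                         uint32_t chunk_sectors)
{
    return nr_sectors <= sectors_until_chunk_end(sector, chunk_sectors);
}

/* Length of the first piece when a read has to be split at the boundary. */
static uint32_t first_split_len(uint64_t sector, uint32_t nr_sectors,
                                uint32_t chunk_sectors)
{
    uint32_t room = sectors_until_chunk_end(sector, chunk_sectors);
    return nr_sectors < room ? nr_sectors : room;
}
```

A read that does not fit would be cut at `first_split_len()` sectors, and the remainder would be checked again from the start of the next chunk.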
    • M
      block: remove split code in blkdev_issue_{discard,write_same} · b49a0871
      Ming Lin authored
      The split code in blkdev_issue_{discard,write_same} can go away
      now that any driver that cares does the split. We have to make
      sure bio size doesn't overflow.
      
      For discard, we set max discard sectors to (1<<31)>>9 to ensure
      it doesn't overflow bi_size and hopefully it is of the proper
      granularity as long as the granularity is a power of two.
      Acked-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      b49a0871
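The arithmetic behind the (1<<31)>>9 limit can be checked with a short userspace sketch. It assumes 512-byte sectors and a 32-bit unsigned byte count; `SECTOR_SHIFT` and the helper names are defined locally for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9  /* 512-byte sectors, defined locally for this sketch */

/* Largest discard, in sectors, per the commit message: (1 << 31) >> 9. */
static uint32_t max_discard_sectors(void)
{
    return (UINT32_C(1) << 31) >> SECTOR_SHIFT;
}

/* Byte length of a discard of `sectors` sectors, computed in 64 bits. */
static uint64_t sectors_to_bytes(uint32_t sectors)
{
    return (uint64_t)sectors << SECTOR_SHIFT;
}
```

The limit works out to 4194304 sectors, or 2 GiB, which still fits in a 32-bit byte count and is a multiple of any power-of-two granularity up to that size.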
    • K
      btrfs: remove bio splitting and merge_bvec_fn() calls · 0e28997e
      Kent Overstreet authored
      Btrfs has been doing bio splitting from btrfs_map_bio(), by checking
      device limits as well as calling ->merge_bvec_fn() etc. That is not
      necessary any more, because generic_make_request() is now able to
      handle arbitrarily sized bios. So clean up unnecessary code paths.
      
      Cc: Chris Mason <clm@fb.com>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: linux-btrfs@vger.kernel.org
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      [dpark: add more description in commit message]
      Signed-off-by: Dongsu Park <dpark@posteo.net>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      0e28997e
    • K
      bcache: remove driver private bio splitting code · 749b61da
      Kent Overstreet authored
      The bcache driver has always accepted arbitrarily large bios and split
      them internally.  Now that every driver must accept arbitrarily large
      bios, this code isn't necessary anymore.
      
      Cc: linux-bcache@vger.kernel.org
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      [dpark: add more description in commit message]
      Signed-off-by: Dongsu Park <dpark@posteo.net>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      749b61da
    • K
      block: simplify bio_add_page() · c66a14d0
      Kent Overstreet authored
      Since generic_make_request() can now handle arbitrary size bios, all we
      have to do is make sure the bvec array doesn't overflow.
      __bio_add_page() doesn't need to call ->merge_bvec_fn(), so
      we can get rid of unnecessary code paths.
      
      Removing the call to ->merge_bvec_fn() is also fine, as no driver that
      implements support for BLOCK_PC commands even has a ->merge_bvec_fn()
      method.
      
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      [dpark: rebase and resolve merge conflicts, change a couple of comments,
       make bio_add_page() warn once upon a cloned bio.]
      Signed-off-by: Dongsu Park <dpark@posteo.net>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      c66a14d0
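Once device limits no longer matter at this level, the only remaining check is the capacity of the vector array. The toy model below illustrates that idea in userspace C. The `toy_` types and function are hypothetical and leave out everything the real code does beyond the capacity check, such as merging contiguous pages.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_VECS 4  /* hypothetical capacity of the vector array */

struct toy_vec {
    const void *page;
    unsigned int len;
    unsigned int offset;
};

struct toy_bio {
    unsigned int size;        /* total bytes queued so far */
    unsigned short vcnt;      /* vectors in use */
    unsigned short max_vecs;  /* capacity of vecs[] */
    struct toy_vec vecs[TOY_MAX_VECS];
};

/* Returns the number of bytes added: len on success, 0 if the array is full.
 * No device limit is consulted; splitting is left to a lower layer. */
static unsigned int toy_bio_add_page(struct toy_bio *bio, const void *page,
                                     unsigned int len, unsigned int offset)
{
    if (bio->vcnt >= bio->max_vecs)
        return 0;

    bio->vecs[bio->vcnt].page = page;
    bio->vecs[bio->vcnt].len = len;
    bio->vecs[bio->vcnt].offset = offset;
    bio->vcnt++;
    bio->size += len;
    return len;
}
```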
    • K
      block: make generic_make_request handle arbitrarily sized bios · 54efd50b
      Kent Overstreet authored
      The way the block layer is currently written, it goes to great lengths
      to avoid having to split bios; upper layer code (such as bio_add_page())
      checks what the underlying device can handle and tries to always create
      bios that don't need to be split.
      
      But this approach becomes unwieldy and eventually breaks down with
      stacked devices and devices with dynamic limits, and it adds a lot of
      complexity. If the block layer could split bios as needed, we could
      eliminate a lot of complexity elsewhere - particularly in stacked
      drivers. Code that creates bios can then create whatever size bios are
      convenient, and more importantly stacked drivers don't have to deal with
      both their own bio size limitations and the limitations of the
      (potentially multiple) devices underneath them.  In the future this will
      let us delete merge_bvec_fn and a bunch of other code.
      
      We do this by adding calls to blk_queue_split() to the various
      make_request functions that need it - a few can already handle arbitrary
      size bios. Note that we add the call _after_ any call to
      blk_queue_bounce(); this means that blk_queue_split() and
      blk_recalc_rq_segments() don't need to be concerned with bouncing
      affecting segment merging.
      
      Some make_request_fn() callbacks were simple enough to audit and verify
      they don't need blk_queue_split() calls. The skipped ones are:
      
       * nfhd_make_request (arch/m68k/emu/nfblock.c)
       * axon_ram_make_request (arch/powerpc/sysdev/axonram.c)
       * simdisk_make_request (arch/xtensa/platforms/iss/simdisk.c)
       * brd_make_request (ramdisk - drivers/block/brd.c)
       * mtip_submit_request (drivers/block/mtip32xx/mtip32xx.c)
       * loop_make_request
       * null_queue_bio
       * bcache's make_request fns
      
      Some others are almost certainly safe to remove now, but will be left
      for future patches.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Ming Lei <ming.lei@canonical.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: drbd-user@lists.linbit.com
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Jim Paris <jim@jtan.com>
      Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: Andreas Dilger <andreas.dilger@intel.com>
      Acked-by: NeilBrown <neilb@suse.de> (for the 'md/md.c' bits)
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      [dpark: skip more mq-based drivers, resolve merge conflicts, etc.]
      Signed-off-by: Dongsu Park <dpark@posteo.net>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      54efd50b
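Splitting an oversized request at the point of submission can be modelled as a simple loop. The sketch below is only a userspace illustration under the assumption of a single per-device sector limit; `toy_split_count` is a hypothetical name, and the real blk_queue_split() considers more limits than this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: number of pieces produced when a request of
 * nr_sectors is cut into pieces of at most max_sectors each. */
static uint32_t toy_split_count(uint32_t nr_sectors, uint32_t max_sectors)
{
    uint32_t pieces = 0;

    if (max_sectors == 0)
        return 0;  /* no valid limit: nothing can be issued */

    while (nr_sectors > 0) {
        uint32_t piece = nr_sectors < max_sectors ? nr_sectors : max_sectors;

        nr_sectors -= piece;  /* "submit" this piece, keep the remainder */
        pieces++;
    }
    return pieces;
}
```

With the split done at this level, code that builds requests can choose whatever size is convenient without querying the devices underneath.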
    • V
      blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) · 41609892
      Viresh Kumar authored
      IS_ERR(_OR_NULL) already contains an 'unlikely' compiler hint, so there
      is no need to add it again in its callers. Drop it.
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      41609892
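Why the extra unlikely() is redundant can be seen from a userspace imitation of the error-pointer helpers. The `toy_` names are hypothetical stand-ins rather than the kernel definitions, and `__builtin_expect` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095
#define toy_unlikely(x) __builtin_expect(!!(x), 0)  /* GCC/Clang builtin */

/* Encode a negative errno value as a pointer. */
static inline void *toy_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* The branch hint lives inside the helper, so callers need not repeat it. */
static inline int toy_is_err(const void *ptr)
{
    return toy_unlikely((uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO);
}

static inline int toy_is_err_or_null(const void *ptr)
{
    return toy_unlikely(!ptr) || toy_is_err(ptr);
}
```

A caller can then write `if (toy_is_err(p))` directly; wrapping it in another `toy_unlikely()` would add nothing.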
  2. 12 Aug, 2015 1 commit
    • S
      block: don't access bio->bi_error after bio_put() · 9b81c842
      Sasha Levin authored
      Commit 4246a0b6 ("block: add a bi_error field to struct bio") has added a few
      dereferences of 'bio' after a call to bio_put(). This causes use-after-frees
      such as:
      
      [521120.719695] BUG: KASan: use after free in dio_bio_complete+0x2b3/0x320 at addr ffff880f36b38714
      [521120.720638] Read of size 4 by task mount.ocfs2/9644
      [521120.721212] =============================================================================
      [521120.722056] BUG kmalloc-256 (Not tainted): kasan: bad access detected
      [521120.722968] -----------------------------------------------------------------------------
      [521120.722968]
      [521120.723915] Disabling lock debugging due to kernel taint
      [521120.724539] INFO: Slab 0xffffea003cdace00 objects=32 used=25 fp=0xffff880f36b38600 flags=0x46fffff80004080
      [521120.726037] INFO: Object 0xffff880f36b38700 @offset=1792 fp=0xffff880f36b38800
      [521120.726037]
      [521120.726974] Bytes b4 ffff880f36b386f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.727898] Object ffff880f36b38700: 00 88 b3 36 0f 88 ff ff 00 00 d8 de 0b 88 ff ff  ...6............
      [521120.728822] Object ffff880f36b38710: 02 00 00 f0 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.729705] Object ffff880f36b38720: 01 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00  ................
      [521120.730623] Object ffff880f36b38730: 00 00 00 00 00 00 00 00 01 00 00 00 00 02 00 00  ................
      [521120.731621] Object ffff880f36b38740: 00 02 00 00 01 00 00 00 d0 f7 87 ad ff ff ff ff  ................
      [521120.732776] Object ffff880f36b38750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.733640] Object ffff880f36b38760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.734508] Object ffff880f36b38770: 01 00 03 00 01 00 00 00 88 87 b3 36 0f 88 ff ff  ...........6....
      [521120.735385] Object ffff880f36b38780: 00 73 22 ad 02 88 ff ff 40 13 e0 3c 00 ea ff ff  .s".....@..<....
      [521120.736667] Object ffff880f36b38790: 00 02 00 00 00 04 00 00 00 00 00 00 00 00 00 00  ................
      [521120.737596] Object ffff880f36b387a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.738524] Object ffff880f36b387b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.739388] Object ffff880f36b387c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.740277] Object ffff880f36b387d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.741187] Object ffff880f36b387e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.742233] Object ffff880f36b387f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.743229] CPU: 41 PID: 9644 Comm: mount.ocfs2 Tainted: G    B           4.2.0-rc6-next-20150810-sasha-00039-gf909086 #2420
      [521120.744274]  ffff880f36b38000 ffff880d89c8f638 ffffffffb6e9ba8a ffff880101c0e5c0
      [521120.745025]  ffff880d89c8f668 ffffffffad76a313 ffff880101c0e5c0 ffffea003cdace00
      [521120.745908]  ffff880f36b38700 ffff880f36b38798 ffff880d89c8f690 ffffffffad772854
      [521120.747063] Call Trace:
      [521120.747520] dump_stack (lib/dump_stack.c:52)
      [521120.748053] print_trailer (mm/slub.c:653)
      [521120.748582] object_err (mm/slub.c:660)
      [521120.749079] kasan_report_error (include/linux/kasan.h:20 mm/kasan/report.c:152 mm/kasan/report.c:194)
      [521120.750834] __asan_report_load4_noabort (mm/kasan/report.c:250)
      [521120.753580] dio_bio_complete (fs/direct-io.c:478)
      [521120.755752] do_blockdev_direct_IO (fs/direct-io.c:494 fs/direct-io.c:1291)
      [521120.759765] __blockdev_direct_IO (fs/direct-io.c:1322)
      [521120.761658] blkdev_direct_IO (fs/block_dev.c:162)
      [521120.762993] generic_file_read_iter (mm/filemap.c:1738)
      [521120.767405] blkdev_read_iter (fs/block_dev.c:1649)
      [521120.768556] __vfs_read (fs/read_write.c:423 fs/read_write.c:434)
      [521120.772126] vfs_read (fs/read_write.c:454)
      [521120.773118] SyS_pread64 (fs/read_write.c:607 fs/read_write.c:594)
      [521120.776062] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
      [521120.777375] Memory state around the buggy address:
      [521120.778118]  ffff880f36b38600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [521120.779211]  ffff880f36b38680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [521120.780315] >ffff880f36b38700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [521120.781465]                          ^
      [521120.782083]  ffff880f36b38780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [521120.783717]  ffff880f36b38800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [521120.784818] ==================================================================
      
      This patch fixes a few of those places that I caught while auditing the patch, but the
      original patch should be audited further for more occurrences of this issue since I'm
      not too familiar with the code.
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      9b81c842
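The general shape of such a fix is to copy the error out while a reference is still held and only then drop the reference. The userspace sketch below uses a hypothetical refcounted `toy_rbio` type; it is not the fs/direct-io.c code.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_rbio {
    int refcount;
    int error;
};

static struct toy_rbio *toy_rbio_alloc(int error)
{
    struct toy_rbio *bio = malloc(sizeof(*bio));

    if (!bio)
        return NULL;
    bio->refcount = 1;
    bio->error = error;
    return bio;
}

/* Drop one reference and free the object when the last one goes away. */
static void toy_rbio_put(struct toy_rbio *bio)
{
    if (--bio->refcount == 0)
        free(bio);
}

/* Correct ordering: read the field first, then drop the reference. */
static int toy_rbio_complete(struct toy_rbio *bio)
{
    int err = bio->error;  /* read while we still hold a reference */

    toy_rbio_put(bio);     /* bio may be freed here */
    return err;            /* bio must not be touched after this point */
}
```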
  3. 29 Jul, 2015 3 commits
    • J
      block: shrink struct bio down to 2 cache lines again · 2c68f6dc
      Jens Axboe authored
      Commit bcf2843b3f8f added ->bi_error to clean up the error passing
      for struct bio, but that ended up adding 4 bytes and a 4 byte hole
      to the size of struct bio. For a clean config, that bumped it from
      128 bytes to 136 bytes on x86-64.

      The ->bi_flags member is currently an unsigned long, but it fits
      easily within an int. Change it to an unsigned int, adjust the
      pool offset code, and move ->bi_error into the new hole. Then
      we end up with a 128 byte bio again.

      Change the bio flag set/clear to use cmpxchg to ensure we don't
      lose any flags when manipulating them.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      2c68f6dc
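Both halves of this change can be illustrated in userspace C: a narrower flags word lets a 4-byte error field share its slot, and a compare-and-exchange loop updates flag bits without losing concurrent changes. The `toy_` structs are hypothetical three-field stand-ins, not struct bio, and C11 atomics stand in for the kernel's cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Before: a long-sized flags word followed by a 4-byte error field leaves
 * padding in front of the next pointer on LP64 targets. */
struct toy_before {
    unsigned long flags;
    int error;
    void *next;
};

/* After: flags narrowed to unsigned int, so error fills the freed 4 bytes. */
struct toy_after {
    unsigned int flags;
    int error;
    void *next;
};

/* Set one flag bit with a compare-and-exchange retry loop. */
static void toy_set_flag(_Atomic unsigned int *flags, unsigned int bit)
{
    unsigned int old = atomic_load(flags);

    while (!atomic_compare_exchange_weak(flags, &old, old | (1u << bit)))
        ;  /* on failure `old` holds the current value; retry with it */
}

/* Clear one flag bit the same way. */
static void toy_clear_flag(_Atomic unsigned int *flags, unsigned int bit)
{
    unsigned int old = atomic_load(flags);

    while (!atomic_compare_exchange_weak(flags, &old, old & ~(1u << bit)))
        ;
}
```

On a typical x86-64 build, `sizeof(struct toy_before)` would be 24 and `sizeof(struct toy_after)` 16, mirroring the 8 bytes recovered in the commit.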
    • J
      block: manipulate bio->bi_flags through helpers · b7c44ed9
      Jens Axboe authored
      Some places use helpers now, others don't. We only have the 'is set'
      helper, add helpers for setting and clearing flags too.
      
      It was a bit of a mess of atomic vs non-atomic access. With
      BIO_UPTODATE gone, we don't have any risk of concurrent access to the
      flags. So relax the restriction and don't make any of them atomic. The
      flags that do have serialization issues (reffed and chained) are
      already handled separately.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      b7c44ed9
    • C
      block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig authored
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up, and of not being passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      4246a0b6
  4. 17 Jul, 2015 10 commits
    • J
      block: make /sys/block/<dev>/queue/discard_max_bytes writeable · 0034af03
      Jens Axboe authored
      Lots of devices support huge discard sizes these days. Depending
      on how the device handles them internally, huge discards can
      introduce massive latencies (hundreds of msec) on the device side.
      
      We have a sysfs file, discard_max_bytes, that advertises the max
      hardware supported discard size. Make this writeable, and split
      the settings into a soft and hard limit. This can be set from
      'discard_granularity' and up to the hardware limit.
      
      Add a new sysfs file, 'discard_max_hw_bytes', that shows the hw
      set limit.
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      0034af03
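The soft/hard split can be pictured as a range check on the value written to the sysfs file. The sketch below is a userspace model with hypothetical names. Rejecting out-of-range values is an assumption of this sketch; the actual handler may clamp or validate differently.

```c
#include <assert.h>
#include <stdint.h>

struct toy_limits {
    uint64_t discard_granularity;   /* bytes, lower bound for the soft limit */
    uint64_t max_hw_discard_bytes;  /* hard limit reported by the hardware */
    uint64_t max_discard_bytes;     /* soft limit, adjustable by the user */
};

/* Returns 0 on success, -1 if `bytes` is below the granularity or above the
 * hardware limit (assumed policy; the soft limit is left unchanged). */
static int toy_set_discard_max(struct toy_limits *lim, uint64_t bytes)
{
    if (bytes < lim->discard_granularity || bytes > lim->max_hw_discard_bytes)
        return -1;

    lim->max_discard_bytes = bytes;
    return 0;
}
```

Lowering the soft limit this way would let an administrator cap discard sizes, and with them the latency of a single discard, without changing what the hardware advertises.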
    • J
      block: have drivers use blk_queue_max_discard_sectors() · 2bb4cd5c
      Jens Axboe authored
      Some drivers use it now, others just set the limits field manually.
      But in preparation for splitting this into a hard and soft limit,
      ensure that they all call the proper function for setting the hw
      limit for discards.
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      2bb4cd5c
    • M
      block: partition: convert percpu ref · 6c71013e
      Ming Lei authored
      Percpu refcount is the perfect match for partition's case,
      and the conversion is quite straightforward.

      With the conversion, one pair of atomic inc/dec can be saved
      for accounting block I/O, which is run in the hot path of block I/O.
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      6c71013e
    • M
      block: partition: introduce hd_free_part() · b54e5ed8
      Ming Lei authored
      So the helper can be used in both generic partition
      case and part0 case.
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      b54e5ed8
    • L
      Merge tag 'pm+acpi-4.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 21bdb584
      Linus Torvalds authored
      Pull power management and ACPI fixes from Rafael Wysocki:
       "These fix two bugs in the cpufreq core (including one recent
        regression), fix a 4.0 PCI regression related to the ACPI resources
        management and quieten an RCU-related lockdep complaint about a
        tracepoint in the suspend-to-idle code.
      
        Specifics:
      
         - Fix a recently introduced issue in the cpufreq policy object
           reinitialization that leads to CPU offline/online breakage (Viresh
           Kumar)
      
         - Make it possible to access frequency tables of offline CPUs which
           is needed by thermal management code among other things (Viresh
           Kumar)
      
         - Fix an ACPI resource management regression introduced during the
           4.0 cycle that may cause incorrect resource validation results to
           appear in 32-bit x86 kernels due to silent truncation of 64-bit
           values to 32-bit (Jiang Liu)
      
         - Fix up an RCU-related lockdep complaint about suspicious RCU usage
           in idle caused by using a suspend tracepoint in the core suspend-
           to-idle code (Rafael J Wysocki)"
      
      * tag 'pm+acpi-4.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / PCI: Fix regressions caused by resource_size_t overflow with 32-bit kernel
        cpufreq: Allow freq_table to be obtained for offline CPUs
        cpufreq: Initialize the governor again while restoring policy
        suspend-to-idle: Prevent RCU from complaining about tick_freeze()
      21bdb584
    • L
      Merge tag 'platform-drivers-x86-v4.2-3' of... · 3e87ee06
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v4.2-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
      
      Pull x86 platform driver fixes from Darren Hart:
       "Fix SMBIOS call handling and hwswitch state coherency in the
        dell-laptop driver.  Cleanups for intel_*_ipc drivers.  Details:
      
        dell-laptop:
         - Do not cache hwswitch state
         - Check return value of each SMBIOS call
         - Clear buffer before each SMBIOS call
      
        intel_scu_ipc:
         - Move local memory initialization out of a mutex
      
        intel_pmc_ipc:
         - Update kerneldoc formatting
         - Fix compiler casting warnings"
      
      * tag 'platform-drivers-x86-v4.2-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
        intel_scu_ipc: move local memory initialization out of a mutex
        intel_pmc_ipc: Update kerneldoc formatting
        dell-laptop: Do not cache hwswitch state
        dell-laptop: Check return value of each SMBIOS call
        dell-laptop: Clear buffer before each SMBIOS call
        intel_pmc_ipc: Fix compiler casting warnings
      3e87ee06
    • L
      Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu · f85c7124
      Linus Torvalds authored
      Pull m68knommu/coldfire fixes from Greg Ungerer:
       "Contains build fixes and updates for the ColdFire defconfigs.
      
        Specifically there is a couple of fixes that address problems building
        allnoconfig.  Also fix for enabling PCI bus on the M54xx family of
        ColdFire"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
        m68k: enable PCI support for m5475evb defconfig
        m68k: fix io functions for ColdFire/MMU/PCI case
        m68knommu: update defconfig for ColdFire m5475evb
        m68knommu: update defconfig for ColdFire m5407c3
        m68knommu: update defconfig for ColdFire m5307c3
        m68knommu: update defconfig for ColdFire m5275evb
        m68knommu: update defconfig for ColdFire m5272c3
        m68knommu: update defconfig for ColdFire m5249evb
        m68knommu: update defconfig for m5208evb
        m68knommu: make ColdFire SoC selection a choice
        m68knommu: improve the clock configuration defaults
        m68knommu: force setting of CONFIG_CLOCK_FREQ for ColdFire
      f85c7124
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 761ab766
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A collection of fixes from the last few weeks that should go into the
        current series.  This contains:
      
         - Various fixes for the per-blkcg policy data, fixing regressions
           since 4.1.  From Arianna and Tejun
      
         - Code cleanup for bcache closure macros from me.  Really just
           flushing this out, it's been sitting in another branch for months
      
         - FIELD_SIZEOF cleanup from Maninder Singh
      
         - bio integrity oops fix from Mike
      
         - Timeout regression fix for blk-mq from Ming Lei"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        blk-mq: set default timeout as 30 seconds
        NVMe: Reread partitions on metadata formats
        bcache: don't embed 'return' statements in closure macros
        blkcg: fix blkcg_policy_data allocation bug
        blkcg: implement all_blkcgs list
        blkcg: blkcg_css_alloc() should grab blkcg_pol_mutex while iterating blkcg_policy[]
        blkcg: allow blkcg_pol_mutex to be grabbed from cgroup [file] methods
        block/blk-cgroup.c: free per-blkcg data when freeing the blkcg
        block: use FIELD_SIZEOF to calculate size of a field
        bio integrity: do not assume bio_integrity_pool exists if bioset exists
      761ab766
    • L
      Merge tag 'jfs-4.2' of git://github.com/kleikamp/linux-shaggy · f76d94de
      Linus Torvalds authored
      Pull jfs fixes from David Kleikamp:
       "A couple trivial fixes and an error path fix"
      
      * tag 'jfs-4.2' of git://github.com/kleikamp/linux-shaggy:
        jfs: clean up jfs_rename and fix out of order unlock
        jfs: fix indentation on if statement
        jfs: removed a prohibited space after opening parenthesis
      f76d94de
    • R
      Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'acpi-resources' · 17ffc8b0
      Rafael J. Wysocki authored
      * pm-cpuidle:
        suspend-to-idle: Prevent RCU from complaining about tick_freeze()
      
      * pm-cpufreq:
        cpufreq: Allow freq_table to be obtained for offline CPUs
        cpufreq: Initialize the governor again while restoring policy
      
      * acpi-resources:
        ACPI / PCI: Fix regressions caused by resource_size_t overflow with 32-bit kernel
      17ffc8b0
  5. 16 Jul, 2015 11 commits
    • M
      blk-mq: set default timeout as 30 seconds · e56f698b
      Ming Lei authored
      It is reasonable to set the default request timeout to 30 seconds instead of
      30000 ticks, which may be 300 seconds if HZ is 100. For example, some arm64
      based systems may choose 100 HZ.
      Signed-off-by: Ming Lei <ming.lei@canonical.com>
      Fixes: c76cbbcf ("blk-mq: put blk_queue_rq_timeout together in blk_mq_init_queue()")
      Signed-off-by: Jens Axboe <axboe@fb.com>
      e56f698b
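The tick arithmetic behind this fix is easy to check in userspace. The sketch below treats HZ as a plain parameter, and both helper names are hypothetical.

```c
#include <assert.h>

/* Seconds represented by `ticks` at a tick rate of `hz` ticks per second. */
static unsigned long toy_ticks_to_secs(unsigned long ticks, unsigned long hz)
{
    return ticks / hz;
}

/* Tick count for a 30-second timeout, whatever the tick rate is. */
static unsigned long toy_default_timeout(unsigned long hz)
{
    return 30 * hz;
}
```

A fixed 30000 ticks is 30 seconds only at HZ=1000 and becomes 300 seconds at HZ=100, whereas `30 * hz` stays at 30 seconds for every tick rate.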
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 3aa20508
      Linus Torvalds authored
      Pull TPM bugfixes from James Morris.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        tpm, tpm_crb: fail when TPM2 ACPI table contents look corrupted
        tpm: Fix initialization of the cdev
      3aa20508
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 9090fdb9
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       "Mainly fix-ups for the various 4.2 items"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (24 commits)
        IB/core: Destroy ocrdma_dev_id IDR on module exit
        IB/core: Destroy multcast_idr on module exit
        IB/mlx4: Optimize do_slave_init
        IB/mlx4: Fix memory leak in do_slave_init
        IB/mlx4: Optimize freeing of items on error unwind
        IB/mlx4: Fix use of flow-counters for process_mad
        IB/ipath: Convert use of __constant_<foo> to <foo>
        IB/ipoib: Set MTU to max allowed by mode when mode changes
        IB/ipoib: Scatter-Gather support in connected mode
        IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES
        IB/ipoib: Prevent lockdep warning in __ipoib_ib_dev_flush
        IB/ucma: Fix lockdep warning in ucma_lock_files
        rds: rds_ib_device.refcount overflow
        RDMA/nes: Fix for incorrect recording of the MAC address
        RDMA/nes: Fix for resolving the neigh
        RDMA/core: Fixes for port mapper client registration
        IB/IPoIB: Fix bad error flow in ipoib_add_port()
        IB/mlx4: Do not attemp to report HCA clock offset on VFs
        IB/cm: Do not queue work to a device that's going away
        IB/srp: Avoid using uninitialized variable
        ...
      9090fdb9
    • K
      NVMe: Reread partitions on metadata formats · 7bee6074
      Keith Busch authored
      This patch has the driver automatically reread partitions if a namespace
      has a separate metadata format. Previously revalidating a disk was
      sufficient to get the correct capacity set on such formatted drives,
      but partitions that may exist would not have been surfaced.
      Reported-by: Paul Grabinar <paul.grabinar@ranbarg.com>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Tested-by: Paul Grabinar <paul.grabinar@ranbarg.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      7bee6074
    • L
      Merge tag 'locks-v4.2-1' of git://git.samba.org/jlayton/linux · 16ff49a0
      Linus Torvalds authored
      Pull file locking updates from Jeff Layton:
       "I had thought that I was going to get away without a pull request this
        cycle.  There was a NFSv4 file locking problem that cropped up that I
        tried to fix in the NFSv4 code alone, but that fix has turned out to
        be problematic.  These patches fix this in the correct way.
      
        Note that this touches some NFSv4 code as well.  Ordinarily I'd wait
        for Trond to ACK this, but he's on holiday right now and the bug is
        rather nasty.  So I suggest we merge this and if he raises issues with
        it we can sort it out when he gets back"
      Acked-by: Bruce Fields <bfields@fieldses.org>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
       [ +1 to this series fixing a 100% reproducible slab corruption +
         general protection fault in my nfs-root test environment. - Dan ]
      Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      
      * tag 'locks-v4.2-1' of git://git.samba.org/jlayton/linux:
        locks: inline posix_lock_file_wait and flock_lock_file_wait
        nfs4: have do_vfs_lock take an inode pointer
        locks: new helpers - flock_lock_inode_wait and posix_lock_inode_wait
        locks: have flock_lock_file take an inode pointer instead of a filp
        Revert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"
      16ff49a0
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · df14a68d
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
      
       - Fix FPU refactoring ("kvm: x86: fix load xsave feature warning")
      
       - Fix eager FPU mode (Cc stable)
      
       - AMD bits of MTRR virtualization
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvm: x86: fix load xsave feature warning
        KVM: x86: apply guest MTRR virtualization on host reserved pages
        KVM: SVM: Sync g_pat with guest-written PAT value
        KVM: SVM: use NPT page attributes
        KVM: count number of assigned devices
        KVM: VMX: fix vmwrite to invalid VMCS
        KVM: x86: reintroduce kvm_is_mmio_pfn
        x86: hyperv: add CPUID bit for crash handlers
      df14a68d
    • L
      Merge tag 'arc-v4.2-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · bec33cd2
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
       - Makefile changes (top-level+ARC) reinstates -O3 builds (regression
         since 3.16)
       - IDU intc related fixes, IRQ affinity
       - patch to make bitops safer for ARC
       - perf fix from Alexey to remove signed PC braino
       - Futex backend gets llock/scond support
      
      * tag 'arc-v4.2-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARCv2: support HS38 releases
        ARC: make sure instruction_pointer() returns unsigned value
        ARC: slightly refactor macros for boot logging
        ARC: Add llock/scond to futex backend
        arc:irqchip: prepare for drivers/irqchip/irqchip.h removal
        ARC: Make ARC bitops "safer" (add anti-optimization)
        ARCv2: [axs103] bump CPU frequency from 75 to 90 MHZ
        ARCv2: intc: IDU: Fix potential race in installing a chained IRQ handler
        ARCv2: intc: IDU: support irq affinity
        ARC: fix unused var wanring
        ARC: Don't memzero twice in dma_alloc_coherent for __GFP_ZERO
        ARC: Override toplevel default -O2 with -O3
        kbuild: Allow arch Makefiles to override {cpp,ld,c}flags
        ARCv2: guard SLC DMA ops with spinlock
        ARC: Kconfig: better way to disable ARC_HAS_LLSC for ARC_CPU_750D
      bec33cd2
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 9c69481e
      Linus Torvalds committed
      Pull s390 fixes from Martin Schwidefsky:
       "One improvement for the zcrypt driver, the quality attribute for the
        hwrng device has been missing.  Without it the kernel entropy seeding
        will not happen automatically.
      
        And six bug fixes, the most important one is the fix for the vector
        register corruption due to machine checks"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/nmi: fix vector register corruption
        s390/process: fix sfpc inline assembly
        s390/dasd: fix kernel panic when alias is set offline
        s390/sclp: clear upper register halves in _sclp_print_early
        s390/oprofile: fix compile error
        s390/sclp: fix compile error
        s390/zcrypt: enable s390 hwrng to seed kernel entropy
      9c69481e
    • D
      jfs: clean up jfs_rename and fix out of order unlock · 26456955
      Dave Kleikamp committed
      The end of jfs_rename(), which is also used by the error paths,
      included a call to IWRITE_UNLOCK(new_ip) after labels out1, out2
      and out3. If we come in through these labels, IWRITE_LOCK() has not
      been called yet.
      
      In moving that call to the correct spot, I also moved some
      exceptional truncate code earlier as well, since the early error
      paths don't need to deal with it, and I renamed out4: to out_tx: so
      a future patch by Jan Kara doesn't need to deal with renumbering or
      confusing out-of-order labels.
      Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      26456955
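The unlock-ordering problem described in the jfs_rename commit above can be illustrated with a minimal sketch. It is not the JFS code: `toy_inode`, `toy_lock`, `toy_unlock` and `toy_rename` are hypothetical names, and the lock is modelled as a plain counter. The point is only that a label reached before the lock is taken must sit below the unlock call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode with a write lock; not the JFS type. */
struct toy_inode {
    int lock_depth;   /* 1 while write-locked, 0 otherwise */
};

static void toy_lock(struct toy_inode *ip)
{
    ip->lock_depth++;
}

static void toy_unlock(struct toy_inode *ip)
{
    /* Unlocking something that was never locked is the bug class at issue. */
    assert(ip->lock_depth > 0);
    ip->lock_depth--;
}

/*
 * fail_early: simulate an error before the lock is taken.
 * fail_late:  simulate an error after the lock is taken.
 * Returns 0 on success, -1 on failure.
 */
static int toy_rename(struct toy_inode *new_ip, bool fail_early, bool fail_late)
{
    int rc = 0;

    if (fail_early) {
        rc = -1;
        goto out;            /* lock not held: must skip the unlock */
    }

    toy_lock(new_ip);

    if (fail_late) {
        rc = -1;
        goto out_unlock;     /* lock held: release it on the way out */
    }

    /* ... the actual work would go here ... */

out_unlock:
    toy_unlock(new_ip);
out:
    return rc;
}
```

Naming the labels after what they undo (`out_unlock` rather than a number) is the same idea as the `out4:` to `out_tx:` rename mentioned in the commit message: later patches can add labels without renumbering.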
    • L
      Merge tag 'module-final-v4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux · 97d6e2b6
      Linus Torvalds committed
      Pull final init.h/module.h code relocation from Paul Gortmaker:
       "With the release of 4.2-rc2 done, we should not be seeing any new code
        added that gets upset by this small code move, and we've banked yet
        another complete week of testing with this move in place on top of
        4.2-rc1 via linux-next to ensure that remained true.
      
        Given that, I'd like to put it in now so that people formulating new
        work for 4.3-rc1 will be exposed to the ever so slightly stricter (but
        sensible) requirements wrt.  whether they are needing init.h vs.
        module.h macros, even if they are not using linux-next.
      
        The diffstat of the move is slightly asymmetrical due to needing to
        leave behind a couple #ifdef in the old location and add the same ones
        to the new location, but other than that, it is a 1:1 move, complete
        with the module_init/exit trailing semicolon that we can't fix.  That
        is, until/unless someone does a tree-wide sed fix of all the
        approximately 800 currently in tree users relying on it"
      
      * tag 'module-final-v4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
        module: relocate module_init from init.h to module.h
      97d6e2b6
    • L
      Merge tag 'trace-v4.2-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 75580097
      Linus Torvalds committed
      Pull tracing fix from Steven Rostedt:
       "Fengguang Wu discovered a crash that happened to be because of the
        branch tracer (traces unlikely and likely branches) when enabled with
        certain debug options.
      
        What happened was that various debug options like lockdep and
        DEBUG_PREEMPT can cause parts of the branch tracer to recurse outside
        its recursion protection.  In fact, part of its recursion protection
        used these features that caused the lockup.  This cleans up the code a
        little and makes the recursion protection a bit more robust"
      
      * tag 'trace-v4.2-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Have branch tracer use recursive field of task struct
      75580097
  6. 15 Jul, 2015 7 commits
    • J
    • C
      intel_scu_ipc: move local memory initialization out of a mutex · 8642d7f8
      Christophe JAILLET committed
      Both the '{ }' initializer and memset reset the cbuf buffer.
      Once is enough, and it can be done outside of the mutex.
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: Darren Hart <dvhart@linux.intel.com>
      8642d7f8
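As a rough illustration of the intel_scu_ipc change above, the sketch below zeroes a local buffer once, through its initializer, before the lock is taken. Everything here is hypothetical (`toy_command`, `toy_mutex_lock`, `CBUF_LEN`), the mutex is modelled as a flag, and `{ 0 }` is used instead of the empty `{ }` form for portability across C standards.

```c
#include <assert.h>
#include <stddef.h>

#define CBUF_LEN 16   /* arbitrary size chosen for the sketch */

/* Hypothetical lock, modelled as a flag so the sketch has no dependencies. */
static int toy_mutex_held;

static void toy_mutex_lock(void)
{
    assert(!toy_mutex_held);
    toy_mutex_held = 1;
}

static void toy_mutex_unlock(void)
{
    assert(toy_mutex_held);
    toy_mutex_held = 0;
}

/* Writes one byte into a zeroed buffer under the lock; returns the byte sum. */
static unsigned int toy_command(unsigned char value)
{
    /* The initializer zeroes every element, and it runs before the lock. */
    unsigned char cbuf[CBUF_LEN] = { 0 };
    unsigned int sum = 0;
    size_t i;

    toy_mutex_lock();
    /* No memset(cbuf, 0, sizeof(cbuf)) here: the buffer is already zero. */
    cbuf[0] = value;
    for (i = 0; i < CBUF_LEN; i++)
        sum += cbuf[i];
    toy_mutex_unlock();

    return sum;
}
```

Because the buffer is local to the function, no other thread can see it, so its initialization does not need the lock's protection.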
    • J
      IB/core: Destroy ocrdma_dev_id IDR on module exit · d8b2ba7c
      Johannes Thumshirn committed
      Destroy ocrdma_dev_id IDR on module exit, reclaiming the allocated memory.
      
      This was detected by the following semantic patch (written by Luis Rodriguez
      <mcgrof@suse.com>)
      <SmPL>
      @ defines_module_init @
      declarer name module_init, module_exit;
      declarer name DEFINE_IDR;
      identifier init;
      @@
      
      module_init(init);
      
      @ defines_module_exit @
      identifier exit;
      @@
      
      module_exit(exit);
      
      @ declares_idr depends on defines_module_init && defines_module_exit @
      identifier idr;
      @@
      
      DEFINE_IDR(idr);
      
      @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       idr_destroy(&idr);
       ...
      }
      
      @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       +idr_destroy(&idr);
      }
      
      </SmPL>
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      d8b2ba7c
    • J
      IB/core: Destroy multcast_idr on module exit · 45d25420
      Johannes Thumshirn committed
      Destroy multcast_idr on module exit, reclaiming the allocated memory.
      
      This was detected by the following semantic patch (written by Luis Rodriguez
      <mcgrof@suse.com>)
      <SmPL>
      @ defines_module_init @
      declarer name module_init, module_exit;
      declarer name DEFINE_IDR;
      identifier init;
      @@
      
      module_init(init);
      
      @ defines_module_exit @
      identifier exit;
      @@
      
      module_exit(exit);
      
      @ declares_idr depends on defines_module_init && defines_module_exit @
      identifier idr;
      @@
      
      DEFINE_IDR(idr);
      
      @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       idr_destroy(&idr);
       ...
      }
      
      @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       +idr_destroy(&idr);
      }
      
      </SmPL>
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      45d25420
    • D
      IB/mlx4: Optimize do_slave_init · d9a047ae
      Doug Ledford committed
      There is little chance our memory allocation will fail, so we can
      combine initializing the work structs with allocating them instead of
      looping through all of them once to allocate and again to initialize.
      Then when we need to actually find out if our device is up or in the
      process of going down, have all of our work structs batched up, take the
      spin_lock once and only once, and do all of the batch under the one
      spin_lock invocation instead of incurring all of the locked memory cycles
      we would otherwise incur to take/release the spin_lock over and over
      again.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      d9a047ae
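The batching idea in the "Optimize do_slave_init" commit above might look roughly like the following sketch. It is not the mlx4 code: `toy_work`, `toy_spin_lock` and `toy_slave_init` are hypothetical, the lock only counts acquisitions, and "queueing" is reduced to recording a port number.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical work item; not the driver's structure. */
struct toy_work {
    int port;
    int initialized;
};

/* Counts lock acquisitions so the effect of batching is observable. */
static int toy_lock_acquisitions;

static void toy_spin_lock(void)   { toy_lock_acquisitions++; }
static void toy_spin_unlock(void) { }

/*
 * Allocate and initialize n work items in a single pass, then hand all of
 * them over under one lock acquisition. Returns the number of items handed
 * over, or -1 on allocation failure.
 */
static int toy_slave_init(int n, int *queued_ports)
{
    struct toy_work **dm;
    int i, queued = 0;

    assert(n > 0);
    dm = calloc((size_t)n, sizeof(*dm));
    if (!dm)
        return -1;

    /* One loop: allocate and initialize together. */
    for (i = 0; i < n; i++) {
        dm[i] = malloc(sizeof(*dm[i]));
        if (!dm[i]) {
            while (--i >= 0)
                free(dm[i]);
            free(dm);
            return -1;
        }
        dm[i]->port = i + 1;
        dm[i]->initialized = 1;
    }

    /* One lock acquisition for the whole batch. */
    toy_spin_lock();
    for (i = 0; i < n; i++) {
        assert(dm[i]->initialized);
        queued_ports[queued++] = dm[i]->port;
        free(dm[i]);   /* stands in for the handler freeing its item */
    }
    toy_spin_unlock();

    free(dm);
    return queued;
}
```

With three items, the lock counter ends at one rather than three, which is the saving the commit message describes.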
    • D
      IB/mlx4: Fix memory leak in do_slave_init · 9bbf282d
      Doug Ledford committed
      We create a number of work structs to be queued up to a workqueue, and
      on completion of the workqueue handler, the workqueue handler frees the
      allocated memory.  If, however, we don't queue the work struct because
      the device is going down, then we need to free the memory ourselves.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      9bbf282d
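The ownership rule behind the leak fix above can be sketched as follows, under loud assumptions: `toy_alloc`, `toy_submit` and `toy_handler` are hypothetical names, the handler runs synchronously instead of on a workqueue, and a counter tracks live allocations so a leak would be visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Number of work items currently allocated; nonzero at the end means a leak. */
static int toy_live_allocs;

/* Hypothetical work item; not the driver's structure. */
struct toy_work {
    int port;
};

static struct toy_work *toy_alloc(int port)
{
    struct toy_work *w = malloc(sizeof(*w));

    if (w) {
        w->port = port;
        toy_live_allocs++;
    }
    return w;
}

static void toy_free(struct toy_work *w)
{
    assert(w != NULL);
    free(w);
    toy_live_allocs--;
}

/* Handler stand-in: it owns the item once queued and frees it when done. */
static void toy_handler(struct toy_work *w)
{
    toy_free(w);
}

/* Returns true if the item was handed to the handler, false otherwise. */
static bool toy_submit(struct toy_work *w, bool going_down)
{
    if (going_down) {
        /* The handler will never run, so the submitter must free the item. */
        toy_free(w);
        return false;
    }
    toy_handler(w);   /* "queued"; run synchronously in this sketch */
    return true;
}
```

Whoever holds the item last is responsible for freeing it: the handler when the item is queued, the submitter when it is not.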
    • M
      IB/mlx4: Optimize freeing of items on error unwind · a39a98ff
      Maninder Singh committed
      On failure, we loop through all possible pointers and test them before
      calling kfree.  But really, why even attempt to free items we didn't
      allocate when we can easily loop through exactly and only the devices
      for which the original memory allocation succeeded and free just those.
      Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      a39a98ff
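The error-unwind pattern described in the commit above, freeing only the entries that were actually allocated, can be sketched like this. The names `toy_alloc_all` and `toy_frees` and the `fail_at` parameter are hypothetical and exist only to make the failure path easy to exercise.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts frees performed during unwinding, for illustration only. */
static int toy_frees;

/*
 * Allocate n items. fail_at is the index at which allocation is forced to
 * fail (use a negative value for "never"). Returns 0 on success, -1 on
 * failure, in which case every item allocated so far has been freed.
 */
static int toy_alloc_all(void **items, int n, int fail_at)
{
    int i;

    for (i = 0; i < n; i++) {
        items[i] = (i == fail_at) ? NULL : malloc(8);
        if (!items[i])
            goto unwind;
    }
    return 0;

unwind:
    /* i is the index that failed; entries 0..i-1 are known to be valid. */
    while (--i >= 0) {
        assert(items[i] != NULL);
        free(items[i]);
        items[i] = NULL;
        toy_frees++;
    }
    return -1;
}
```

Walking backwards from the failing index means no pointer has to be tested before it is freed, and entries past the failure are never touched.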