1. 07 3月, 2014 4 次提交
    • D
      xfs: xfs_check_page_type buffer checks need help · a49935f2
      Dave Chinner 提交于
      xfs_aops_discard_page() was introduced in the following commit:
      
        xfs: truncate delalloc extents when IO fails in writeback
      
      ... to clean up left over delalloc ranges after I/O failure in
      ->writepage(). generic/224 tests for this scenario and occasionally
      reproduces panics on sub-4k blocksize filesystems.
      
      The cause of this is failure to clean up the delalloc range on a
      page where the first buffer does not match one of the expected
      states of xfs_check_page_type(). If a buffer is not unwritten,
      delayed or dirty&mapped, xfs_check_page_type() stops and
      immediately returns 0.
      
      The stress test of generic/224 creates a scenario where the first
      several buffers of a page with delayed buffers are mapped & uptodate
      and some subsequent buffer is delayed. If the ->writepage() happens
      to fail for this page, xfs_aops_discard_page() incorrectly skips
      the entire page.
      
      This then causes later failures either when direct IO maps the range
      and finds the stale delayed buffer, or we evict the inode and find
      that the inode still has a delayed block reservation accounted to
      it.
      
      We can easily fix this xfs_aops_discard_page() failure by making
      xfs_check_page_type() check all buffers, but this breaks
      xfs_convert_page() more than it is already broken. Indeed,
      xfs_convert_page() wants xfs_check_page_type() to tell it if the
      first buffers on the pages are of a type that can be aggregated into
      the contiguous IO that is already being built.
      
      xfs_convert_page() should not be writing random buffers out of a
      page, but the current behaviour will cause it to do so if there are
      buffers that don't match the current specification on the page.
      Hence for xfs_convert_page() we need to:
      
      	a) return "not ok" if the first buffer on the page does not
      	match the specification provided to we don't write anything;
      	and
      	b) abort it's buffer-add-to-io loop the moment we come
      	across a buffer that does not match the specification.
      
      Hence we need to fix both xfs_check_page_type() and
      xfs_convert_page() to work correctly with pages that have mixed
      buffer types, whilst allowing xfs_aops_discard_page() to scan all
      buffers on the page for a type match.
      Reported-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      a49935f2
    • B
      xfs: avoid AGI/AGF deadlock scenario for inode chunk allocation · e480a723
      Brian Foster 提交于
      The inode chunk allocation path can lead to deadlock conditions if
      a transaction is dirtied with an AGF (to fix up the freelist) for
      an AG that cannot satisfy the actual allocation request. This code
      path is written to try and avoid this scenario, but it can be
      reproduced by running xfstests generic/270 in a loop on a 512b fs.
      
      An example situation is:
      - process A attempts an inode allocation on AG 3, modifies
        the freelist, fails the allocation and ultimately moves on to
        AG 0 with the AG 3 AGF held
      - process B is doing a free space operation (i.e., truncate) and
        acquires the AG 0 AGF, waits on the AG 3 AGF
      - process A acquires the AG 0 AGI, waits on the AG 0 AGF (deadlock)
      
      The problem here is that process A acquired the AG 3 AGF while
      moving on to AG 0 (and releasing the AG 3 AGI with the AG 3 AGF
      held). xfs_dialloc() makes one pass through each of the AGs when
      attempting to allocate an inode chunk. The expectation is a clean
      transaction if a particular AG cannot satisfy the allocation
      request. xfs_ialloc_ag_alloc() is written to support this through
      use of the minalignslop allocation args field.
      
      When using the agi->agi_newino optimization, we attempt an exact
      bno allocation request based on the location of the previously
      allocated chunk. minalignslop is set to inform the allocator that
      we will require alignment on this chunk, and thus to not allow the
      request for this AG if the extra space is not available. Suppose
      that the AG in question has just enough space for this request, but
      not at the requested bno. xfs_alloc_fix_freelist() will proceed as
      normal as it determines the request should succeed, and thus it is
      allowed to modify the agf. xfs_alloc_ag_vextent() ultimately fails
      because the requested bno is not available. In response, the caller
      moves on to a NEAR_BNO allocation request for the same AG. The
      alignment is set, but the minalignslop field is never reset. This
      increases the overall requirement of the request from the first
      attempt. If this delta is the difference between allocation success
      and failure for the AG, xfs_alloc_fix_freelist() rejects this
      request outright the second time around and causes the allocation
      request to unnecessarily fail for this AG.
      
      To address this situation, reset the minalignslop field immediately
      after use and prevent it from leaking into subsequent requests.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      e480a723
    • D
      xfs: use NOIO contexts for vm_map_ram · ae687e58
      Dave Chinner 提交于
      When we map pages in the buffer cache, we can do so in GFP_NOFS
      contexts. However, the vmap interfaces do not provide any method of
      communicating this information to memory reclaim, and hence we get
      lockdep complaining about it regularly and occassionally see hangs
      that may be vmap related reclaim deadlocks. We can also see these
      same problems from anywhere where we use vmalloc for a large buffer
      (e.g. attribute code) inside a transaction context.
      
      A typical lockdep report shows up as a reclaim state warning like so:
      
      [14046.101458] =================================
      [14046.102850] [ INFO: inconsistent lock state ]
      [14046.102850] 3.14.0-rc4+ #2 Not tainted
      [14046.102850] ---------------------------------
      [14046.102850] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
      [14046.102850] kswapd0/14 [HC0[0]:SC0[0]:HE1:SE1] takes:
      [14046.102850]  (&xfs_dir_ilock_class){++++?+}, at: [<791a04bb>] xfs_ilock+0xff/0x16a
      [14046.102850] {RECLAIM_FS-ON-W} state was registered at:
      [14046.102850]   [<7904cdb1>] mark_held_locks+0x81/0xe7
      [14046.102850]   [<7904d390>] lockdep_trace_alloc+0x5c/0xb4
      [14046.102850]   [<790c2c28>] kmem_cache_alloc_trace+0x2b/0x11e
      [14046.102850]   [<790ba7f4>] vm_map_ram+0x119/0x3e6
      [14046.102850]   [<7914e124>] _xfs_buf_map_pages+0x5b/0xcf
      [14046.102850]   [<7914ed74>] xfs_buf_get_map+0x67/0x13f
      [14046.102850]   [<7917506f>] xfs_attr_rmtval_set+0x396/0x4d5
      [14046.102850]   [<7916e8bb>] xfs_attr_leaf_addname+0x18f/0x37d
      [14046.102850]   [<7916ed9e>] xfs_attr_set_int+0x2f5/0x3e8
      [14046.102850]   [<7916eefc>] xfs_attr_set+0x6b/0x74
      [14046.102850]   [<79168355>] xfs_xattr_set+0x61/0x81
      [14046.102850]   [<790e5b10>] generic_setxattr+0x59/0x68
      [14046.102850]   [<790e4c06>] __vfs_setxattr_noperm+0x58/0xce
      [14046.102850]   [<790e4d0a>] vfs_setxattr+0x8e/0x92
      [14046.102850]   [<790e4ddd>] setxattr+0xcf/0x159
      [14046.102850]   [<790e5423>] SyS_lsetxattr+0x88/0xbb
      [14046.102850]   [<79268438>] sysenter_do_call+0x12/0x36
      
      Now, we can't completely remove these traces - mainly because
      vm_map_ram() will do GFP_KERNEL allocation and that generates the
      above warning before we get into the reclaim code, but we can turn
      them all into false positive warnings.
      
      To do that, use the method that DM and other IO context code uses to
      avoid this problem: there is a process flag to tell memory reclaim
      not to do IO that we can set appropriately. That prevents GFP_KERNEL
      context reclaim being done from deep inside the vmalloc code in
      places we can't directly pass a GFP_NOFS context to. That interface
      has a pair of wrapper functions: memalloc_noio_save() and
      memalloc_noio_restore().
      
      Adding them around vm_map_ram and the vzalloc call in
      kmem_alloc_large() will prevent deadlocks and most lockdep reports
      for this issue. Also, convert the vzalloc() call in
      kmem_alloc_large() to use __vmalloc() so that we can pass the
      correct gfp context to the data page allocation routine inside
      __vmalloc() so that it is clear that GFP_NOFS context is important
      to this vmalloc call.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      ae687e58
    • D
      xfs: don't leak EFSBADCRC to userspace · ac75a1f7
      Dave Chinner 提交于
      While the verifier routines may return EFSBADCRC when a buffer has
      a bad CRC, we need to translate that to EFSCORRUPTED so that the
      higher layers treat the error appropriately and we return a
      consistent error to userspace. This fixes a xfs/005 regression.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      ac75a1f7
  2. 03 2月, 2014 16 次提交
    • L
      Linus 3.14-rc1 · 38dbfb59
      Linus Torvalds 提交于
      38dbfb59
    • L
      Merge branch 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 69048e01
      Linus Torvalds 提交于
      Pull parisc updates from Helge Deller:
       "The three major changes in this patchset is a implementation for
        flexible userspace memory maps, cache-flushing fixes (again), and a
        long-discussed ABI change to make EWOULDBLOCK the same value as
        EAGAIN.
      
        parisc has been the only platform where we had EWOULDBLOCK != EAGAIN
        to keep HP-UX compatibility.  Since we will probably never implement
        full HP-UX support, we prefer to drop this compatibility to make it
        easier for us with Linux userspace programs which mostly never checked
        for both values.  We don't expect major fall-outs because of this
        change, and if we face some, we will simply rebuild the necessary
        applications in the debian archives"
      
      * 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: add flexible mmap memory layout support
        parisc: Make EWOULDBLOCK be equal to EAGAIN on parisc
        parisc: convert uapi/asm/stat.h to use native types only
        parisc: wire up sched_setattr and sched_getattr
        parisc: fix cache-flushing
        parisc/sti_console: prefer Linux fonts over built-in ROM fonts
      69048e01
    • M
      hpfs: optimize quad buffer loading · 1c0b8a7a
      Mikulas Patocka 提交于
      HPFS needs to load 4 consecutive 512-byte sectors when accessing the
      directory nodes or bitmaps.  We can't switch to 2048-byte block size
      because files are allocated in the units of 512-byte sectors.
      
      Previously, the driver would allocate a 2048-byte area using kmalloc,
      copy the data from four buffers to this area and eventually copy them
      back if they were modified.
      
      In the current implementation of the buffer cache, buffers are allocated
      in the pagecache.  That means that 4 consecutive 512-byte buffers are
      stored in consecutive areas in the kernel address space.  So, we don't
      need to allocate extra memory and copy the content of the buffers there.
      
      This patch optimizes the code to avoid copying the buffers.  It checks
      if the four buffers are stored in contiguous memory - if they are not,
      it falls back to allocating a 2048-byte area and copying data there.
      Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1c0b8a7a
    • M
      hpfs: remember free space · 2cbe5c76
      Mikulas Patocka 提交于
      Previously, hpfs scanned all bitmaps each time the user asked for free
      space using statfs.  This patch changes it so that hpfs scans the
      bitmaps only once, remembes the free space and on next invocation of
      statfs it returns the value instantly.
      
      New versions of wine are hammering on the statfs syscall very heavily,
      making some games unplayable when they're stored on hpfs, with load
      times in minutes.
      
      This should be backported to the stable kernels because it fixes
      user-visible problem (excessive level load times in wine).
      Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2cbe5c76
    • H
      parisc: add flexible mmap memory layout support · 9dabf60d
      Helge Deller 提交于
      Add support for the flexible mmap memory layout (as described in
      http://lwn.net/Articles/91829). This is especially very interesting on
      parisc since we currently only support 32bit userspace (even with a
      64bit Linux kernel).
      Signed-off-by: NHelge Deller <deller@gmx.de>
      9dabf60d
    • G
      parisc: Make EWOULDBLOCK be equal to EAGAIN on parisc · f5a408d5
      Guy Martin 提交于
      On Linux, only parisc uses a different value for EWOULDBLOCK which
      causes a lot of troubles for applications not checking for both values.
      Since the hpux compat is long dead, make EWOULDBLOCK behave the same as
      all other architectures.
      Signed-off-by: NGuy Martin <gmsoft@tuxicoman.be>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      f5a408d5
    • H
      parisc: convert uapi/asm/stat.h to use native types only · 9391bc77
      Helge Deller 提交于
      The stat.h header file is exported to userspace. Some userspace
      applications failed to compile due to missing/unknown types, so we
      better convert it to use native types only (like it's done on other
      architectures too).
      Signed-off-by: NHelge Deller <deller@gmx.de>
      9391bc77
    • H
      parisc: wire up sched_setattr and sched_getattr · 998bbb2f
      Helge Deller 提交于
      Signed-off-by: NHelge Deller <deller@gmx.de>
      998bbb2f
    • H
      parisc: fix cache-flushing · 57737c49
      Helge Deller 提交于
      This commit:
      f8dae006: parisc: Ensure full cache coherency for kmap/kunmap
      caused negative caching side-effects, e.g. hanging processes with expect and
      too many inequivalent alias messages from flush_dcache_page() on Debian 5 systems.
      
      This patch now partly reverts it and has been in production use on our debian buildd
      makeservers since a week without any major problems.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org # v3.9+
      Signed-off-by: NHelge Deller <deller@gmx.de>
      57737c49
    • H
      parisc/sti_console: prefer Linux fonts over built-in ROM fonts · 8a10bc9d
      Helge Deller 提交于
      The built-in ROM fonts lack many necessary ASCII characters, which is
      why it makes sens to prefer the Linux fonts instead if they are
      available.  This makes consoles on STI graphics cards which are not
      supported by the stifb driver (e.g. Visualize FXe) looks much nicer.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # v3.13
      8a10bc9d
    • L
      Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · 602456bf
      Linus Torvalds 提交于
      Pull hwmon kconfig fixes from Jean Delvare.
      
      * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        hwmon: Fix SENSORS_TMP102 dependencies to eliminate build errors
        hwmon: Fix SENSORS_LM75 dependencies to eliminate build errors
      602456bf
    • L
      Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux · 7b383bef
      Linus Torvalds 提交于
      Pull SLAB changes from Pekka Enberg:
       "Random bug fixes that have accumulated in my inbox over the past few
        months"
      
      * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
        mm: Fix warning on make htmldocs caused by slab.c
        mm: slub: work around unneeded lockdep warning
        mm: sl[uo]b: fix misleading comments
        slub: Fix possible format string bug.
        slub: use lockdep_assert_held
        slub: Fix calculation of cpu slabs
        slab.h: remove duplicate kmalloc declaration and fix kernel-doc warnings
      7b383bef
    • L
      Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux · 87af5e5c
      Linus Torvalds 提交于
      Pull turbostat updates from Len Brown.
      
      * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
        tools/power turbostat: introduce -s to dump counters
        tools/power turbostat: remove unused command line option
        turbostat: Add option to report joules consumed per sample
        turbostat: run on HSX
        turbostat: Add a .gitignore to ignore the compiled turbostat binary
        turbostat: Clean up error handling; disambiguate error messages; use err and errx
        turbostat: Factor out common function to open file and exit on failure
        turbostat: Add a helper to parse a single int out of a file
        turbostat: Check return value of fscanf
        turbostat: Use GCC's CPUID functions to support PIC
        turbostat: Don't attempt to printf an off_t with %zx
        turbostat: Don't put unprocessed uapi headers in the include path
      87af5e5c
    • L
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · e4c0da21
      Linus Torvalds 提交于
      Pull ARM SoC fixes from Olof Johansson:
       "Here's a set of patches for (hopefully) -rc1.  Some of them are fixes,
        but a good number of them also do things such as enable new drivers in
        the defconfigs for platforms that have such devices, increases
        coverage of the multiplatform defconfig and some DTS changes that
        plumbs up some of the devices that now have bindings and driver
        support.
      
        The commit dates are recent; we've mostly collected these fixes in the
        last few days but I also had to rebuild the branch yesterday to sort
        out some internal conflicts which reset the timestamps.  The changes
        should have been tested by each platform maintainer already (and few
        of them have cross-platform impact) so I'm personally not too
        concerned by it at this time"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits)
        ARM: multi_v7_defconfig: remove redundant entries and re-enable TI_EDMA
        ARM: multi_v7_defconfig: add mvebu drivers
        clocksource: kona: Add basic use of external clock
        drivers: bus: fix CCI driver kcalloc call parameters swap
        ARM: dts: bcm28155-ap: Fix Card Detection GPIO
        ARM: multi_v7_defconfig: Select CONFIG_AT803X_PHY
        ARM: keystone: config: fix build warning when CONFIG_DMADEVICES is not set
        MAINTAINERS: ARM: SiRF: use regex patterns to involve all SiRF drivers
        ARM: dts: zynq: Add SDHCI nodes
        ARM: hisi: don't select SMP
        ARM: tegra: rebuild tegra_defconfig to add DEBUG_FS
        ARM: multi_v7: copy most options from tegra_defconfig
        ARM: iop32x: fix power off handling for the EM7210 board
        ARM: integrator: restore static map on the CP
        ARM: msm_defconfig: Enable MSM clock drivers
        ARM: dts: msm: Add clock controller nodes and hook into uart
        ARM: OMAP4+: move errata initialization to omap4_pm_init_early
        ARM: OMAP4460: cpuidle: Extend PM_OMAP4_ROM_SMP_BOOT_ERRATUM_GICD on cpuidle
        ARM: mvebu: fix compilation warning on Armada 370 (i.e. non-SMP)
        ARM: shmobile: r8a7790.dtsi: ficx i2c[0-3] clock reference
        ...
      e4c0da21
    • J
      hwmon: Fix SENSORS_TMP102 dependencies to eliminate build errors · 632007e2
      Jean Delvare 提交于
      Similar to what was done for the lm75 driver.
      
      Add depends on THERMAL since that is what provides the
      register/unregister functions above, but only if THERMAL_OF was
      selected as this is an optional feature of the driver.
      Signed-off-by: NJean Delvare <khali@linux-fr.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: NEduardo Valentin <eduardo.valentin@ti.com>
      Reviewed-by: NGuenter Roeck <linux@roeck-us.net>
      632007e2
    • J
      hwmon: Fix SENSORS_LM75 dependencies to eliminate build errors · 920130a9
      Jean Delvare 提交于
      Based on an earlier attempt by Randy Dunlap.
      
      Fix SENSORS_LM75 dependencies to eliminate build errors:
      
      drivers/built-in.o: In function `lm75_remove':
      lm75.c:(.text+0x12bd8c): undefined reference to `thermal_zone_of_sensor_unregister'
      drivers/built-in.o: In function `lm75_probe':
      lm75.c:(.text+0x12c123): undefined reference to `thermal_zone_of_sensor_register'
      
      Add depends on THERMAL since that is what provides the
      register/unregister functions above, but only if THERMAL_OF was
      selected as this is an optional feature of the driver.
      Signed-off-by: NJean Delvare <khali@linux-fr.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: NEduardo Valentin <eduardo.valentin@ti.com>
      Reviewed-by: NGuenter Roeck <linux@roeck-us.net>
      920130a9
  3. 02 2月, 2014 9 次提交
  4. 01 2月, 2014 11 次提交