1. 05 Apr, 2014 12 commits
    • H
      mm: get_user_pages(write,force) refuse to COW in shared areas · cda540ac
      Hugh Dickins committed
      get_user_pages(write=1, force=1) has always had odd behaviour on write-
      protected shared mappings: although it demands FMODE_WRITE-access to the
      underlying object (do_mmap_pgoff sets neither VM_SHARED nor VM_MAYWRITE
      without that), it ends up with do_wp_page substituting private anonymous
      Copied-On-Write pages for the shared file pages in the area.
      
      That was long ago intentional, as a safety measure to prevent ptrace
      setting a breakpoint (or POKETEXT or POKEDATA) from inadvertently
      corrupting the underlying executable.  Yet exec and dynamic loaders open
      the file read-only, and use MAP_PRIVATE rather than MAP_SHARED.
      
      The traditional odd behaviour still causes surprises and bugs in mm, and
      is probably not what any caller wants - even the comment on the flag
      says "You do not want this" (although it's undoubtedly necessary for
      overriding userspace protections in some contexts, and good when !write).
      
      Let's stop doing that.  But it would be dangerous to remove the long-
      standing safety at this stage, so just make get_user_pages(write,force)
      fail with EFAULT when applied to a write-protected shared area.
      Infiniband may in future want to force write through to underlying
      object: we can add another FOLL_flag later to enable that if required.
      
      Odd though the old behaviour was, there is no doubt that we may turn out
      to break userspace with this change, and have to revert it quickly.
      Issue a WARN_ON_ONCE to help debug the changed case (easily triggered by
      userspace, so only once to prevent spamming the logs); and delay a few
      associated cleanups until this change is proved.
      
      get_user_pages callers who might see trouble from this change:
        ptrace poking, or writing to /proc/<pid>/mem
        drivers/infiniband/
        drivers/media/v4l2-core/
        drivers/gpu/drm/exynos/exynos_drm_gem.c
        drivers/staging/tidspbridge/core/tiomap3430.c
      if they ever apply get_user_pages to write-protected shared mappings
      of an object which was opened for writing.
      
      I went to apply the same change to mm/nommu.c, but retreated.  NOMMU has
      no place for COW, and its VM_flags conventions are not the same: I'd be
      more likely to screw up NOMMU than make an improvement there.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cda540ac
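
The decision described in the commit above can be modelled in a few lines of userspace C. This is only a sketch of the logic as the message states it: the `MODEL_*` constants and the names `model_is_cow_mapping` and `gup_write_check` are hypothetical stand-ins, not the kernel's definitions, and read accesses are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_* mapping flags. */
enum {
    MODEL_VM_WRITE    = 1 << 0,  /* mapping is currently writable */
    MODEL_VM_SHARED   = 1 << 1,  /* MAP_SHARED mapping */
    MODEL_VM_MAYWRITE = 1 << 2,  /* mapping could be made writable */
};

/* Hypothetical stand-ins for the write and force arguments. */
enum {
    MODEL_FOLL_WRITE = 1 << 0,
    MODEL_FOLL_FORCE = 1 << 1,
};

/* A private mapping that may become writable is a COW mapping. */
static bool model_is_cow_mapping(unsigned vm_flags)
{
    return (vm_flags & (MODEL_VM_SHARED | MODEL_VM_MAYWRITE)) ==
           MODEL_VM_MAYWRITE;
}

/* Returns 0 if the write may proceed, -EFAULT otherwise. */
static int gup_write_check(unsigned vm_flags, unsigned gup_flags)
{
    if (!(gup_flags & MODEL_FOLL_WRITE))
        return 0;               /* reads are not modelled here */
    if (vm_flags & MODEL_VM_WRITE)
        return 0;               /* ordinary writable mapping */
    if (!(gup_flags & MODEL_FOLL_FORCE))
        return -EFAULT;         /* write-protected and not forced */
    if (!(vm_flags & MODEL_VM_MAYWRITE))
        return -EFAULT;         /* object was not opened for writing */
    if (!model_is_cow_mapping(vm_flags))
        return -EFAULT;         /* shared area: refuse to COW */
    return 0;                   /* private area: COW remains fine */
}

/* Usage example: the cases the commit message distinguishes. */
static void gup_model_examples(void)
{
    unsigned forced = MODEL_FOLL_WRITE | MODEL_FOLL_FORCE;

    /* Write-protected shared mapping of a writable object: now EFAULT. */
    assert(gup_write_check(MODEL_VM_SHARED | MODEL_VM_MAYWRITE, forced)
           == -EFAULT);
    /* Write-protected private mapping (e.g. ptrace poking text): allowed. */
    assert(gup_write_check(MODEL_VM_MAYWRITE, forced) == 0);
}
```

In this model the only case whose outcome changes is the forced write to a write-protected shared area, which matches the callers the commit message lists as possibly affected.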
    • L
      Merge tag 'xfs-for-linus-3.15-rc1' of git://oss.sgi.com/xfs/xfs · d15e0310
      Linus Torvalds committed
      Pull xfs update from Dave Chinner:
       "There are a couple of new fallocate features in this request - it was
        decided that it was easiest to push them through the XFS tree using
        topic branches and have the ext4 support be based on those branches.
        Hence you may see some overlap with the ext4 tree merge depending on
        how they include those topic branches into their tree.  Other than
        that, there is O_TMPFILE support, some cleanups and bug fixes.
      
        The main changes in the XFS tree for 3.15-rc1 are:
      
         - O_TMPFILE support
         - allowing AIO+DIO writes beyond EOF
         - FALLOC_FL_COLLAPSE_RANGE support for fallocate syscall and XFS
           implementation
         - FALLOC_FL_ZERO_RANGE support for fallocate syscall and XFS
           implementation
         - IO verifier cleanup and rework
         - stack usage reduction changes
         - vm_map_ram NOIO context fixes to remove lockdep warnings
         - various bug fixes and cleanups"
      
      * tag 'xfs-for-linus-3.15-rc1' of git://oss.sgi.com/xfs/xfs: (34 commits)
        xfs: fix directory hash ordering bug
        xfs: extra semi-colon breaks a condition
        xfs: Add support for FALLOC_FL_ZERO_RANGE
        fs: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
        xfs: inode log reservations are still too small
        xfs: xfs_check_page_type buffer checks need help
        xfs: avoid AGI/AGF deadlock scenario for inode chunk allocation
        xfs: use NOIO contexts for vm_map_ram
        xfs: don't leak EFSBADCRC to userspace
        xfs: fix directory inode iolock lockdep false positive
        xfs: allocate xfs_da_args to reduce stack footprint
        xfs: always do log forces via the workqueue
        xfs: modify verifiers to differentiate CRC from other errors
        xfs: print useful caller information in xfs_error_report
        xfs: add xfs_verifier_error()
        xfs: add helper for updating checksums on xfs_bufs
        xfs: add helper for verifying checksums on xfs_bufs
        xfs: Use defines for CRC offsets in all cases
        xfs: skip pointless CRC updates after verifier failures
        xfs: Add support FALLOC_FL_COLLAPSE_RANGE for fallocate
        ...
      d15e0310
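
The two fallocate modes named in the pull message above can be exercised from userspace roughly as follows. This is a minimal sketch, assuming a Linux system whose headers define `FALLOC_FL_ZERO_RANGE` and `FALLOC_FL_COLLAPSE_RANGE`; the helper names `zero_range`, `collapse_range` and `demo_zero_range` are invented for illustration, and whether a given filesystem accepts either mode has to be checked at run time.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the filesystem to zero [off, off + len). Returns 0 or -errno. */
static int zero_range(int fd, off_t off, off_t len)
{
    assert(len > 0);
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == -1)
        return -errno;
    return 0;
}

/*
 * Remove [off, off + len) from the file and shift the rest down.
 * Filesystems typically require block-aligned off and len.
 * Returns 0 or -errno.
 */
static int collapse_range(int fd, off_t off, off_t len)
{
    assert(len > 0);
    if (fallocate(fd, FALLOC_FL_COLLAPSE_RANGE, off, len) == -1)
        return -errno;
    return 0;
}

/*
 * Returns 0 if a range was zeroed and verified, 1 if the mode is not
 * supported here, -1 on any other failure.
 */
static int demo_zero_range(void)
{
    char path[] = "/tmp/zero_range_XXXXXX";
    char buf[8];
    int result = -1;
    int rc;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                       /* file lives until close() */
    if (write(fd, "abcdefgh", 8) != 8)
        goto out;
    rc = zero_range(fd, 0, 4);
    if (rc == -EOPNOTSUPP || rc == -ENOSYS || rc == -EINVAL) {
        result = 1;                     /* filesystem or kernel lacks it */
        goto out;
    }
    if (rc != 0)
        goto out;
    if (pread(fd, buf, sizeof(buf), 0) != 8)
        goto out;
    /* First four bytes should now read as zeroes, the rest unchanged. */
    result = memcmp(buf, "\0\0\0\0efgh", 8) == 0 ? 0 : -1;
out:
    close(fd);
    return result;
}
```

Treating `EOPNOTSUPP` as a normal outcome is deliberate: support is per filesystem, so callers generally need a fallback such as writing zeroes explicitly.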
    • L
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 24e7ea3b
      Linus Torvalds committed
      Pull ext4 updates from Ted Ts'o:
       "Major changes for 3.14 include support for the newly added ZERO_RANGE
        and COLLAPSE_RANGE fallocate operations, and scalability improvements
        in the jbd2 layer and in xattr handling when the extended attributes
        spill over into an external block.
      
        Other than that, the usual clean ups and minor bug fixes"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
        ext4: fix premature freeing of partial clusters split across leaf blocks
        ext4: remove unneeded test of ret variable
        ext4: fix comment typo
        ext4: make ext4_block_zero_page_range static
        ext4: atomically set inode->i_flags in ext4_set_inode_flags()
        ext4: optimize Hurd tests when reading/writing inodes
        ext4: kill i_version support for Hurd-castrated file systems
        ext4: each filesystem creates and uses its own mb_cache
        fs/mbcache.c: doucple the locking of local from global data
        fs/mbcache.c: change block and index hash chain to hlist_bl_node
        ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
        ext4: refactor ext4_fallocate code
        ext4: Update inode i_size after the preallocation
        ext4: fix partial cluster handling for bigalloc file systems
        ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
        ext4: only call sync_filesystm() when remounting read-only
        fs: push sync_filesystem() down to the file system's remount_fs()
        jbd2: improve error messages for inconsistent journal heads
        jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
        jbd2: minimize region locked by j_list_lock in journal_get_create_access()
        ...
      24e7ea3b
    • L
      Merge tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux · 8e343c8b
      Linus Torvalds committed
      Pull pstore fixes from Tony Luck:
       "Series of small bug fixes for pstore"
      
      * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
        pstore: Fix memory leak when decompress using big_oops_buf
        pstore: Fix buffer overflow while write offset equal to buffer size
        pstore: Correct the max_dump_cnt clearing of ramoops
        pstore: Fix NULL pointer fault if get NULL prz in ramoops_get_next_prz
        pstore: skip zero size persistent ram buffer in traverse
        pstore: clarify clearing of _read_cnt in ramoops_context
      8e343c8b
    • L
      Merge tag 'upstream-3.15-rc1' of git://git.infradead.org/linux-ubifs · 370d2662
      Linus Torvalds committed
      Pull ubifs updates from Artem Bityutskiy:
       "This pull request includes the 'ubiblock' driver which provides R/O
        block access to UBI volumes.  It is useful for those who want to use
        squashfs on top of raw flash devices.  UBI will provide bit-flip
        handling and wear-levelling in this case (e.g., if there are other UBI
        volumes with R/W UBIFS too).
      
        The driver is actually pretty small and it is part of the UBI kernel
        subsystem.  Delivered by Ezequiel Garcia, along with a piece of
        documentation on the MTD web site and the user-space tool for creating
        and removing block devices"
      
      * tag 'upstream-3.15-rc1' of git://git.infradead.org/linux-ubifs:
        UBI: block: Remove __initdata from ubiblock_param_ops
        UBI: make UBI_IOCVOLCRBLK take a parameter for future usage
        UBI: rename block device ioctls
        UBI: block: Use ENOSYS as return value when CONFIG_UBIBLOCK=n
        UBI: block: Add CONFIG_BLOCK dependency
        UBI: block: Use 'u64' for the 64-bit dividend
        UBI: block: Mark init-only symbol as __initdata
        UBI: block: do not use term "attach"
        UBI: R/O block driver on top of UBI volumes
      370d2662
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · d15fee81
      Linus Torvalds committed
      Pull fuse update from Miklos Szeredi:
       "This series adds cached writeback support to fuse, improving write
        throughput"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: fix "uninitialized variable" warning
        fuse: Turn writeback cache on
        fuse: Fix O_DIRECT operations vs cached writeback misorder
        fuse: fuse_flush() should wait on writeback
        fuse: Implement write_begin/write_end callbacks
        fuse: restructure fuse_readpage()
        fuse: Flush files on wb close
        fuse: Trust kernel i_mtime only
        fuse: Trust kernel i_size only
        fuse: Connection bit for enabling writeback
        fuse: Prepare to handle short reads
        fuse: Linking file to inode helper
      d15fee81
    • L
      Merge tag 'dlm-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · 56c225fe
      Linus Torvalds committed
      Pull dlm updates from David Teigland:
       "This set includes a couple trivial cleanups and changes recovery log
        messages from DEBUG to INFO"
      
      * tag 'dlm-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
        dlm: use INFO for recovery messages
        fs: Include appropriate header file in dlm/ast.c
        dlm: silence a harmless use after free warning
      56c225fe
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 53c56662
      Linus Torvalds committed
      Pull btrfs changes from Chris Mason:
       "This is a pretty long stream of bug fixes and performance fixes.
      
        Qu Wenruo has replaced the btrfs async threads with regular kernel
        workqueues.  We'll keep an eye out for performance differences, but
        it's nice to be using more generic code for this.
      
        We still have some corruption fixes and other patches coming in for
        the merge window, but this batch is tested and ready to go"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (108 commits)
        Btrfs: fix a crash of clone with inline extents's split
        btrfs: fix uninit variable warning
        Btrfs: take into account total references when doing backref lookup
        Btrfs: part 2, fix incremental send's decision to delay a dir move/rename
        Btrfs: fix incremental send's decision to delay a dir move/rename
        Btrfs: remove unnecessary inode generation lookup in send
        Btrfs: fix race when updating existing ref head
        btrfs: Add trace for btrfs_workqueue alloc/destroy
        Btrfs: less fs tree lock contention when using autodefrag
        Btrfs: return EPERM when deleting a default subvolume
        Btrfs: add missing kfree in btrfs_destroy_workqueue
        Btrfs: cache extent states in defrag code path
        Btrfs: fix deadlock with nested trans handles
        Btrfs: fix possible empty list access when flushing the delalloc inodes
        Btrfs: split the global ordered extents mutex
        Btrfs: don't flush all delalloc inodes when we doesn't get s_umount lock
        Btrfs: reclaim delalloc metadata more aggressively
        Btrfs: remove unnecessary lock in may_commit_transaction()
        Btrfs: remove the unnecessary flush when preparing the pages
        Btrfs: just do dirty page flush for the inode with compression before direct IO
        ...
      53c56662
    • L
      Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw · 34917f97
      Linus Torvalds committed
      Pull GFS2 updates from Steven Whitehouse:
       "One of the main highlights this time, is not the patches themselves
        but instead the widening contributor base.  It is good to see that
        interest is increasing in GFS2, and I'd like to thank all the
        contributors to this patch set.
      
        In addition to the usual set of bug fixes and clean ups, there are
        patches to improve inode creation performance when xattrs are required
        and some improvements to the transaction code which is intended to
        help improve scalability after further changes in due course.
      
        Journal extent mapping is also updated to make it more efficient and
        again, this is a foundation for future work in this area.
      
        The maximum number of ACLs has been increased to 300 (for a 4k block
        size) which means that even with a few additional xattrs from selinux,
        everything should fit within a single fs block.
      
        There is also a patch to bring GFS2's own copy of the writepages code
        up to the same level as the core VFS.  Eventually we may be able to
        merge some of this code, since it is fairly similar.
      
        The other major change this time, is bringing consistency to the
        printing of messages via fs_<level>, pr_<level> macros"
      
      * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw: (29 commits)
        GFS2: Fix address space from page function
        GFS2: Fix uninitialized VFS inode in gfs2_create_inode
        GFS2: Fix return value in slot_get()
        GFS2: inline function gfs2_set_mode
        GFS2: Remove extraneous function gfs2_security_init
        GFS2: Increase the max number of ACLs
        GFS2: Re-add a call to log_flush_wait when flushing the journal
        GFS2: Ensure workqueue is scheduled after noexp request
        GFS2: check NULL return value in gfs2_ok_to_move
        GFS2: Convert gfs2_lm_withdraw to use fs_err
        GFS2: Use fs_<level> more often
        GFS2: Use pr_<level> more consistently
        GFS2: Move recovery variables to journal structure in memory
        GFS2: global conversion to pr_foo()
        GFS2: return -E2BIG if hit the maximum limits of ACLs
        GFS2: Clean up journal extent mapping
        GFS2: replace kmalloc - __vmalloc / memset 0
        GFS2: Remove extra "if" in gfs2_log_flush()
        fs: NULL dereference in posix_acl_to_xattr()
        GFS2: Move log buffer accounting to transaction
        ...
      34917f97
    • L
      Merge branch 'locks-3.15' of git://git.samba.org/jlayton/linux · f7789dc0
      Linus Torvalds committed
      Pull file locking updates from Jeff Layton:
       "Highlights:
      
         - maintainership change for fs/locks.c.  Willy's not interested in
           maintaining it these days, and is OK with Bruce and I taking it.
         - fix for open vs setlease race that Al ID'ed
         - cleanup and consolidation of file locking code
         - eliminate unneeded BUG() call
         - merge of file-private lock implementation"
      
      * 'locks-3.15' of git://git.samba.org/jlayton/linux:
        locks: make locks_mandatory_area check for file-private locks
        locks: fix locks_mandatory_locked to respect file-private locks
        locks: require that flock->l_pid be set to 0 for file-private locks
        locks: add new fcntl cmd values for handling file private locks
        locks: skip deadlock detection on FL_FILE_PVT locks
        locks: pass the cmd value to fcntl_getlk/getlk64
        locks: report l_pid as -1 for FL_FILE_PVT locks
        locks: make /proc/locks show IS_FILE_PVT locks as type "FLPVT"
        locks: rename locks_remove_flock to locks_remove_file
        locks: consolidate checks for compatible filp->f_mode values in setlk handlers
        locks: fix posix lock range overflow handling
        locks: eliminate BUG() call when there's an unexpected lock on file close
        locks: add __acquires and __releases annotations to locks_start and locks_stop
        locks: remove "inline" qualifier from fl_link manipulation functions
        locks: clean up comment typo
        locks: close potential race between setlease and open
        MAINTAINERS: update entry for fs/locks.c
      f7789dc0
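
A rough illustration of what the file-private locks merged above change for userspace. Two assumptions are worth stating loudly: the sketch uses the `F_OFD_SETLK` command name under which this feature is exposed in released kernels and glibc (the series listed above introduced different command names), and the fallback value 37 is only used if the headers do not provide the constant. The helper names are invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37  /* assumption: Linux generic value, for old headers */
#endif

/* Try to take a whole-file write lock owned by fd's open file description. */
static int try_write_lock(int fd)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof(fl));     /* l_pid must be 0 for these locks */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                   /* 0 means "to end of file" */
    if (fcntl(fd, F_OFD_SETLK, &fl) == -1)
        return -errno;
    return 0;
}

/*
 * Opens the same file twice in one process and locks through both
 * descriptors. Returns 0 if the second attempt conflicts, 1 if the
 * kernel does not support the command, -1 on any other outcome.
 */
static int demo_ofd_conflict(void)
{
    char path[] = "/tmp/ofd_lock_XXXXXX";
    int fd1 = mkstemp(path);
    int fd2, rc1, rc2;

    if (fd1 < 0)
        return -1;
    fd2 = open(path, O_RDWR);
    unlink(path);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    rc1 = try_write_lock(fd1);
    rc2 = try_write_lock(fd2);
    close(fd2);
    close(fd1);
    if (rc1 == -EINVAL)
        return 1;                   /* command unknown to this kernel */
    return (rc1 == 0 && (rc2 == -EAGAIN || rc2 == -EACCES)) ? 0 : -1;
}
```

With classic POSIX locks (`F_SETLK`) the second request would succeed, because those locks belong to the process rather than to the open file; the conflict here is the point of the new lock type.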
    • L
      Merge branch 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 7df93452
      Linus Torvalds committed
      Pull renameat2 system call from Miklos Szeredi:
       "This adds a new syscall, renameat2(), which is the same as renameat()
        but with a flags argument.
      
        The purpose of extending rename is to add cross-rename, a symmetric
        variant of rename, which exchanges the two files.  This allows
        interesting things, which were not possible before, for example
        atomically replacing a directory tree with a symlink, etc...  This
        also allows overlayfs and friends to operate on whiteouts atomically.
      
        Andy Lutomirski also suggested a "noreplace" flag, which disables the
        overwriting behavior of rename.
      
        These two flags, RENAME_EXCHANGE and RENAME_NOREPLACE are only
        implemented for ext4 as an example and for testing"
      
      * 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ext4: add cross rename support
        ext4: rename: split out helper functions
        ext4: rename: move EMLINK check up
        ext4: rename: create ext4_renament structure for local vars
        vfs: add cross-rename
        vfs: lock_two_nondirectories: allow directory args
        security: add flags to rename hooks
        vfs: add RENAME_NOREPLACE flag
        vfs: add renameat2 syscall
        vfs: rename: use common code for dir and non-dir
        vfs: rename: move d_move() up
        vfs: add d_is_dir()
      7df93452
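
The new syscall described above can be tried from userspace along these lines. This is a sketch under assumptions: it calls the raw syscall via `syscall(SYS_renameat2, ...)`, defines the two flag values itself when the headers do not, and uses invented helper names (`rename_flags`, `demo_noreplace`). Support depends on the kernel and filesystem, so unsupported results are reported rather than treated as bugs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef RENAME_NOREPLACE
#define RENAME_NOREPLACE (1 << 0)   /* fail instead of overwriting */
#endif
#ifndef RENAME_EXCHANGE
#define RENAME_EXCHANGE  (1 << 1)   /* atomically swap the two names */
#endif

/* Thin wrapper around renameat2() relative to the current directory. */
static int rename_flags(const char *from, const char *to, unsigned int flags)
{
    assert(from && to);
    if (syscall(SYS_renameat2, AT_FDCWD, from, AT_FDCWD, to, flags) == -1)
        return -errno;
    return 0;
}

/* Create an empty file; returns 0 on success, -1 on failure. */
static int touch(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/*
 * Returns 0 if RENAME_NOREPLACE refused to overwrite an existing file,
 * 1 if the flag is not supported here, -1 on any other outcome.
 */
static int demo_noreplace(void)
{
    char dir[] = "/tmp/renameat2_XXXXXX";
    char a[64], b[64];
    int rc;

    if (!mkdtemp(dir))
        return -1;
    snprintf(a, sizeof(a), "%s/a", dir);
    snprintf(b, sizeof(b), "%s/b", dir);
    if (touch(a) != 0 || touch(b) != 0)
        rc = -EIO;                  /* setup failed */
    else
        rc = rename_flags(a, b, RENAME_NOREPLACE);
    unlink(a);
    unlink(b);
    rmdir(dir);
    if (rc == -EINVAL || rc == -ENOSYS)
        return 1;
    return rc == -EEXIST ? 0 : -1;
}
```

Swapping two existing paths would use the same wrapper with `RENAME_EXCHANGE`, for example `rename_flags("current", "next", RENAME_EXCHANGE)`, where both path names are placeholders.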
    • L
      Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 3c83e61e
      Linus Torvalds committed
      Pull media updates from Mauro Carvalho Chehab:
       "The main set of series of patches for media subsystem, including:
         - document RC sysfs class
         - added an API to setup scancode to allow waking up systems using the
           Remote Controller
         - add API for SDR devices.  Drivers are still on staging
         - some API improvements for getting EDID data from media
           inputs/outputs
         - new DVB frontend driver for drx-j (ATSC)
         - one driver (it913x/it9137) got removed, in favor of an improvement
           on another driver (af9035)
         - added a skeleton V4L2 PCI driver at documentation
         - added a dual flash driver (lm3646)
         - added a new IR driver (img-ir)
         - added an IR scancode decoder for the Sharp protocol
         - some improvements at the usbtv driver, to allow its core to be
           reused.
         - added a new SDR driver (rtl2832u_sdr)
         - added a new tuner driver (msi001)
         - several improvements at em28xx driver to fix PM support, device
           removal and to split the V4L2 specific bits into a separate
           sub-driver
         - one driver got converted to videobuf2 (s2255drv)
         - the e4000 tuner driver now follows an improved binding model
         - some fixes at V4L2 compat32 code
         - several fixes and enhancements at videobuf2 code
         - some cleanups at V4L2 API documentation
         - usual driver enhancements, new board additions and misc fixups"
      
      [ NOTE! This merge effectively drops commit 4329b93b ("of: Reduce
        indentation in of_graph_get_next_endpoint").
      
        The of_graph_get_next_endpoint() function was moved and renamed by
        commit fd9fdb78 ("[media] of: move graph helpers from
        drivers/media/v4l2-core to drivers/of").  It was originally called
        v4l2_of_get_next_endpoint() and lived in the file
        drivers/media/v4l2-core/v4l2-of.c.
      
        In that original location, it was then fixed to support empty port
        nodes by commit b9db140c ("[media] v4l: of: Support empty port
        nodes"), and that commit clashes badly with the dropped "Reduce
        indentation" commit.  I had to choose one or the other, and decided
        that the "Support empty port nodes" commit was more important ]
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (426 commits)
        [media] em28xx-dvb: fix PCTV 461e tuner I2C binding
        Revert "[media] em28xx-dvb: fix PCTV 461e tuner I2C binding"
        [media] em28xx: fix PCTV 290e LNA oops
        [media] em28xx-dvb: fix PCTV 461e tuner I2C binding
        [media] m88ds3103: fix bug on .set_tone()
        [media] saa7134: fix WARN_ON during resume
        [media] v4l2-dv-timings: add module name, description, license
        [media] videodev2.h: add parenthesis around macro arguments
        [media] saa6752hs: depends on CRC32
        [media] si4713: fix Kconfig dependencies
        [media] Sensoray 2255 uses videobuf2
        [media] adv7180: free an interrupt on failure paths in init_device()
        [media] e4000: make VIDEO_V4L2 dependency optional
        [media] af9033: Don't export functions for the hardware filter
        [media] af9035: use af9033 PID filters
        [media] af9033: implement PID filter
        [media] rtl2832_sdr: do not use dynamic stack allocation
        [media] e4000: fix 32-bit build error
        [media] em28xx-audio: make sure audio is unmuted on open()
        [media] DocBook media: v4l2_format_sdr was renamed to v4l2_sdr_format
        ...
      3c83e61e
  2. 04 Apr, 2014 28 commits