1. 09 Sep, 2015 12 commits
    • mm: add vmf_insert_pfn_pmd() · 5cad465d
      Authored by Matthew Wilcox
      Similar to vm_insert_pfn(), but for PMDs rather than PTEs.  The 'vmf_'
      prefix instead of 'vm_' prefix is intended to indicate that it returns a
      VMF_ value rather than an errno (which would only have to be converted
      into a VMF_ value anyway).
      Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
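The errno-to-VMF_ conversion that the 'vmf_' prefix spares callers can be sketched in userspace C. This is an illustration only, not the kernel code: the DEMO_VM_FAULT_* constants and demo_errno_to_vmf() are hypothetical names, and the mapping itself (out of memory becomes an OOM code, -EBUSY is treated as "already mapped", any other error becomes SIGBUS) is an assumption about how a caller of an errno-returning insert helper would typically translate the result.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* codes. */
enum demo_vm_fault {
	DEMO_VM_FAULT_OOM    = 0x0001,
	DEMO_VM_FAULT_SIGBUS = 0x0002,
	DEMO_VM_FAULT_NOPAGE = 0x0100,
};

/*
 * Translate a 0-or-negative-errno result into a fault code.
 * This is the boilerplate a caller needs when the insert helper
 * returns an errno instead of a fault code.
 */
int demo_errno_to_vmf(int err)
{
	assert(err <= 0);
	if (err == -ENOMEM)
		return DEMO_VM_FAULT_OOM;
	/* Assumed policy: -EBUSY means another thread already mapped it. */
	if (err < 0 && err != -EBUSY)
		return DEMO_VM_FAULT_SIGBUS;
	return DEMO_VM_FAULT_NOPAGE;
}
```

A helper that returns the fault code directly lets each fault handler drop this translation step.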
    • mm: export various functions for the benefit of DAX · fc437044
      Authored by Matthew Wilcox
      To use the huge zero page in DAX, we need these functions exported.
      Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: add a pmd_fault handler · b96375f7
      Authored by Matthew Wilcox
      Allow non-anonymous VMAs to provide huge pages in response to a page fault.
      Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: prepare for DAX huge pages · 4897c765
      Authored by Matthew Wilcox
      Add a vma_is_dax() helper macro to test whether the VMA is DAX, and use it
      in zap_huge_pmd() and __split_huge_page_pmd().
      
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dax: revert userfaultfd change · 7c414164
      Authored by Andrew Morton
      Undo the change which "userfaultfd: call handle_userfault() for
      userfaultfd_missing() faults" made to set_huge_zero_page().  DAX will
      need that return value.
      
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dax: move DAX-related functions to a new header · c94c2acf
      Authored by Matthew Wilcox
      In order to handle the !CONFIG_TRANSPARENT_HUGEPAGES case, we need to
      return VM_FAULT_FALLBACK from the inlined dax_pmd_fault(), which is
      defined in linux/mm.h.  Given that we don't want to include <linux/mm.h>
      in <linux/fs.h>, the easiest solution is to move the DAX-related
      functions to a new header, <linux/dax.h>.  We could also have moved
      VM_FAULT_* definitions to a new header, or a different header that isn't
      quite such a boil-the-ocean header as <linux/mm.h>, but this felt like
      the best option.
      Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: vma_adjust_trans_huge(): adjust file-backed VMA too · e1b9996b
      Authored by Kirill A. Shutemov
      This series of patches adds support for using PMD page table entries to
      map DAX files.  We expect NV-DIMMs to start showing up that are many
      gigabytes in size and the memory consumption of 4kB PTEs will be
      astronomical.
      
      The patch series leverages much of the Transparent Huge Pages
      infrastructure, going so far as to borrow one of Kirill's patches from
      his THP page cache series.
      
      This patch (of 10):
      
      Since we're going to have huge pages in the page cache, we need to call
      vma_adjust_trans_huge() for file-backed VMAs too, since they can
      potentially contain huge pages.
      
      For now we call it for all VMAs.
      
      Probably later we will need to introduce a flag to indicate that the VMA
      has huge pages.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Acked-by: Hillf Danton <dhillf@gmail.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mremap: fix the wrong !vma->vm_file check in copy_vma() · ce75799b
      Authored by Oleg Nesterov
      Test-case:
      
      	#define _GNU_SOURCE
      	#include <stdio.h>
      	#include <unistd.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <sys/mman.h>
      	#include <assert.h>
      
      	void *find_vdso_vaddr(void)
      	{
      		FILE *perl;
      		char buf[32] = {};
      
      		perl = popen("perl -e 'open STDIN,qq|/proc/@{[getppid]}/maps|;"
      				"/^(.*?)-.*vdso/ && print hex $1 while <>'", "r");
      		fread(buf, sizeof(buf), 1, perl);
      		pclose(perl);
      
      		return (void *)atol(buf);
      	}
      
      	#define PAGE_SIZE	4096
      
      	void *get_unmapped_area(void)
      	{
      		void *p = mmap(0, PAGE_SIZE, PROT_NONE,
      				MAP_PRIVATE|MAP_ANONYMOUS, -1,0);
      		assert(p != MAP_FAILED);
      		munmap(p, PAGE_SIZE);
      		return p;
      	}
      
      	char save[2][PAGE_SIZE];
      
      	int main(void)
      	{
      		void *vdso = find_vdso_vaddr();
      		void *page[2];
      
      		assert(vdso);
      		memcpy(save, vdso, sizeof (save));
      		// force another fault on the next check
      		assert(madvise(vdso, 2 * PAGE_SIZE, MADV_DONTNEED) == 0);
      
      		page[0] = mremap(vdso,
      				PAGE_SIZE, PAGE_SIZE, MREMAP_FIXED | MREMAP_MAYMOVE,
      				get_unmapped_area());
      		page[1] = mremap(vdso + PAGE_SIZE,
      				PAGE_SIZE, PAGE_SIZE, MREMAP_FIXED | MREMAP_MAYMOVE,
      				get_unmapped_area());
      
      		assert(page[0] != MAP_FAILED && page[1] != MAP_FAILED);
      		printf("match: %d %d\n",
      			!memcmp(save[0], page[0], PAGE_SIZE),
      			!memcmp(save[1], page[1], PAGE_SIZE));
      
      		return 0;
      	}
      
      fails without this patch. Before the previous commit it gets the wrong
      page, now it segfaults (which is imho better).
      
      This is because copy_vma() wrongly assumes that, if vma->vm_file == NULL,
      ->vm_pgoff is irrelevant until the first fault, which will use
      do_anonymous_page().  This is obviously wrong for the special mapping.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mmap: fix the usage of ->vm_pgoff in special_mapping paths · 8a9cc3b5
      Authored by Oleg Nesterov
      Test-case:
      
      	#include <stdio.h>
      	#include <unistd.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <sys/mman.h>
      	#include <assert.h>
      
      	void *find_vdso_vaddr(void)
      	{
      		FILE *perl;
      		char buf[32] = {};
      
      		perl = popen("perl -e 'open STDIN,qq|/proc/@{[getppid]}/maps|;"
      				"/^(.*?)-.*vdso/ && print hex $1 while <>'", "r");
      		fread(buf, sizeof(buf), 1, perl);
      		pclose(perl);
      
      		return (void *)atol(buf);
      	}
      
      	#define PAGE_SIZE	4096
      
      	int main(void)
      	{
      		void *vdso = find_vdso_vaddr();
      		assert(vdso);
      
      		// of course they should differ, and they do so far
      		printf("vdso pages differ: %d\n",
      			!!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));
      
      		// split into 2 vma's
      		assert(mprotect(vdso, PAGE_SIZE, PROT_READ) == 0);
      
      		// force another fault on the next check
      		assert(madvise(vdso, 2 * PAGE_SIZE, MADV_DONTNEED) == 0);
      
      		// now they no longer differ, the 2nd vm_pgoff is wrong
      		printf("vdso pages differ: %d\n",
      			!!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));
      
      		return 0;
      	}
      
      Output:
      
      	vdso pages differ: 1
      	vdso pages differ: 0
      
      This is because split_vma() correctly updates ->vm_pgoff, but the logic
      in insert_vm_struct() and special_mapping_fault() is absolutely broken,
      so the fault at vdso + PAGE_SIZE returns the 1st page. The same happens
      if you simply unmap the 1st page.
      
      special_mapping_fault() does:
      
      	pgoff = vmf->pgoff - vma->vm_pgoff;
      
      and this is _only_ correct if vma->vm_start maps the first page of the
      ->vm_private_data array.
      
      vdso or any other user of install_special_mapping() is not anonymous,
      it has the "backing storage" even if it is just the array of pages.
      So we actually need to make vm_pgoff work as an offset in this array.
      
      Note: this also makes it possible to fix another problem: currently gdb can't access
      "[vvar]" memory because in this case special_mapping_fault() doesn't work.
      Now that we can use ->vm_pgoff we can implement ->access() and fix this.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
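The index arithmetic behind this bug can be reproduced in a small userspace sketch. Everything here is hypothetical scaffolding (struct demo_vma, the demo_* functions, a fixed 4 KiB page size); it assumes the fault's page offset is computed as the page distance from vm_start plus vm_pgoff. demo_index_old() mirrors the subtraction quoted in the commit message, while demo_index_new() only illustrates the direction the message describes (vm_pgoff as an offset into the pages array), not necessarily the exact patch.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12	/* assumes 4 KiB pages, as in the test case */

/* Hypothetical, stripped-down stand-in for a VMA. */
struct demo_vma {
	unsigned long vm_start;	/* first mapped address */
	unsigned long vm_pgoff;	/* page offset of vm_start in the backing object */
};

/* Page offset of 'addr' in the backing object (what a fault sees as pgoff). */
unsigned long demo_fault_pgoff(const struct demo_vma *vma, unsigned long addr)
{
	assert(addr >= vma->vm_start);
	return ((addr - vma->vm_start) >> DEMO_PAGE_SHIFT) + vma->vm_pgoff;
}

/* Old lookup: subtracting vm_pgoff leaves an index relative to vm_start. */
unsigned long demo_index_old(const struct demo_vma *vma, unsigned long addr)
{
	return demo_fault_pgoff(vma, addr) - vma->vm_pgoff;
}

/* Lookup with vm_pgoff treated as an offset into the pages array. */
unsigned long demo_index_new(const struct demo_vma *vma, unsigned long addr)
{
	return demo_fault_pgoff(vma, addr);
}
```

For a VMA that was split off at the second page (vm_start one page higher, vm_pgoff == 1), the old lookup yields index 0 for its first address, i.e. the first page of the array, whereas the new lookup yields index 1.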
    • mm: introduce vma_is_anonymous(vma) helper · b5330628
      Authored by Oleg Nesterov
      special_mapping_fault() is absolutely broken.  It seems it was always
      wrong, but this didn't matter until vdso/vvar started to use more than
      one page.
      
      And after this change vma_is_anonymous() becomes really trivial, it
      simply checks vm_ops == NULL.  However, I do think the helper makes
      sense.  There are a lot of ->vm_ops != NULL checks, the helper makes the
      caller's code more understandable (self-documented) and this is more
      grep-friendly.
      
      This patch (of 3):
      
      Preparation.  Add the new simple helper, vma_is_anonymous(vma), and change
      handle_pte_fault() to use it.  It will have more users.
      
      The name is not accurate; for example, a hpet_mmap()'ed vma is not anonymous.
      Perhaps it should be named vma_has_fault() instead.  But it matches the
      logic in mmap.c/memory.c (see next changes).  "True" just means that a
      page fault will use do_anonymous_page().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
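The helper described above is small enough to sketch in userspace. The types below (struct demo_vm_ops, struct demo_mapping) and the demo_ prefix are hypothetical stand-ins; the only thing taken from the commit message is that the check reduces to vm_ops == NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a table of mapping callbacks. */
struct demo_vm_ops {
	int unused;
};

/* Hypothetical stand-in for a VMA; only the field the helper reads. */
struct demo_mapping {
	const struct demo_vm_ops *vm_ops;	/* NULL when no handlers are installed */
};

/*
 * "True" only means that no fault handlers are installed, so a page
 * fault would take the anonymous-page path.
 */
bool demo_vma_is_anonymous(const struct demo_mapping *vma)
{
	assert(vma != NULL);
	return vma->vm_ops == NULL;
}
```

Wrapping the one-line test in a named helper is mainly a readability choice: call sites state their intent, and the name is easy to grep for.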
    • selftests/userfaultfd: fix compiler warnings on 32-bit · af8713b7
      Authored by Geert Uytterhoeven
      On 32-bit:
      
          userfaultfd.c: In function 'locking_thread':
          userfaultfd.c:152: warning: left shift count >= width of type
          userfaultfd.c: In function 'uffd_poll_thread':
          userfaultfd.c:295: warning: cast to pointer from integer of different size
          userfaultfd.c: In function 'uffd_read_thread':
          userfaultfd.c:332: warning: cast to pointer from integer of different size
      
      Fix the shift warning by splitting the shift in two parts, and the
      integer/pointer warnings by adding intermediate casts to "unsigned long".
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
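Both techniques named in this commit message can be shown in isolation. The sketch below is not the selftest code; the function names are hypothetical, and it assumes a platform where "unsigned long" is as wide as a pointer (true on Linux, which is what the selftest targets).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shift left by 32 in two steps of 16. Each step is smaller than the
 * width of "unsigned long" even on 32-bit, so the compiler has nothing
 * to warn about; on 32-bit the result is simply 0.
 */
unsigned long demo_shift_left_32(unsigned long value)
{
	return (value << 16) << 16;
}

/*
 * Convert a 64-bit integer holding an address to a pointer. The
 * intermediate cast to "unsigned long" makes the integer pointer-sized
 * first, which avoids the "cast to pointer from integer of different
 * size" warning on 32-bit.
 */
void *demo_u64_to_ptr(uint64_t addr)
{
	/* The conversion is only lossless if the value fits. */
	assert((uint64_t)(unsigned long)addr == addr);
	return (void *)(unsigned long)addr;
}
```

On a 64-bit build the split shift is equivalent to a single shift by 32, so the change costs nothing where the original code was already correct.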
    • cgroup: fix seq_show_option merge with legacy_name · 61e57c0c
      Authored by Kees Cook
      When seq_show_option (commit a068acf2: "fs: create and use
      seq_show_option for escaping") was merged, it did not correctly collide
      with cgroup's addition of legacy_name (commit 3e1d2eed: "cgroup:
      introduce cgroup_subsys->legacy_name") changes.
      
      This fixes the reported name.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 08 Sep, 2015 4 commits
    • Merge tag 'nfs-for-4.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 4e4adb2f
      Authored by Linus Torvalds
      Pull NFS client updates from Trond Myklebust:
       "Highlights include:
      
        Stable patches:
         - Fix atomicity of pNFS commit list updates
         - Fix NFSv4 handling of open(O_CREAT|O_EXCL|O_RDONLY)
         - nfs_set_pgio_error sometimes misses errors
         - Fix a thinko in xs_connect()
         - Fix borkage in _same_data_server_addrs_locked()
         - Fix a NULL pointer dereference of migration recovery ops for v4.2
           client
         - Don't let the ctime override attribute barriers.
         - Revert "NFSv4: Remove incorrect check in can_open_delegated()"
         - Ensure flexfiles pNFS driver updates the inode after write finishes
         - flexfiles must not pollute the attribute cache with attributes from
           the DS
         - Fix a protocol error in layoutreturn
         - Fix a protocol issue with NFSv4.1 CLOSE stateids
      
        Bugfixes + cleanups
         - pNFS blocks bugfixes from Christoph
         - Various cleanups from Anna
         - More fixes for delegation corner cases
         - Don't fsync twice for O_SYNC/IS_SYNC files
         - Fix pNFS and flexfiles layoutstats bugs
         - pnfs/flexfiles: avoid duplicate tracking of mirror data
         - pnfs: Fix layoutget/layoutreturn/return-on-close serialisation
           issues
         - pnfs/flexfiles: error handling retries a layoutget before fallback
           to MDS
      
        Features:
         - Full support for the OPEN NFS4_CREATE_EXCLUSIVE4_1 mode from
           Kinglong
         - More RDMA client transport improvements from Chuck
         - Removal of the deprecated ib_reg_phys_mr() and ib_rereg_phys_mr()
           verbs from the SUNRPC, Lustre and core infiniband tree.
         - Optimise away the close-to-open getattr if there is no cached data"
      
      * tag 'nfs-for-4.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (108 commits)
        NFSv4: Respect the server imposed limit on how many changes we may cache
        NFSv4: Express delegation limit in units of pages
        Revert "NFS: Make close(2) asynchronous when closing NFS O_DIRECT files"
        NFS: Optimise away the close-to-open getattr if there is no cached data
        NFSv4.1/flexfiles: Clean up ff_layout_write_done_cb/ff_layout_commit_done_cb
        NFSv4.1/flexfiles: Mark the layout for return in ff_layout_io_track_ds_error()
        nfs: Remove unneeded checking of the return value from scnprintf
        nfs: Fix truncated client owner id without proto type
        NFSv4.1/flexfiles: Mark layout for return if the mirrors are invalid
        NFSv4.1/flexfiles: RW layouts are valid only if all mirrors are valid
        NFSv4.1/flexfiles: Fix incorrect usage of pnfs_generic_mark_devid_invalid()
        NFSv4.1/flexfiles: Fix freeing of mirrors
        NFSv4.1/pNFS: Don't request a minimal read layout beyond the end of file
        NFSv4.1/pnfs: Handle LAYOUTGET return values correctly
        NFSv4.1/pnfs: Don't ask for a read layout for an empty file.
        NFSv4.1: Fix a protocol issue with CLOSE stateids
        NFSv4.1/flexfiles: Don't mark the entire deviceid as bad for file errors
        SUNRPC: Prevent SYN+SYNACK+RST storms
        SUNRPC: xs_reset_transport must mark the connection as disconnected
        NFSv4.1/pnfs: Ensure layoutreturn reserves space for the opaque payload
        ...
    • Merge tag 'xfs-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs · 77a78806
      Authored by Linus Torvalds
      Pull xfs updates from Dave Chinner:
       "There isn't a whole lot to this update - it's mostly bug fixes and
        they are spread pretty much all over XFS.  There are some corruption
        fixes, some fixes for log recovery, some fixes that prevent unmount
        from hanging, a lockdep annotation rework for inode locking to prevent
        false positives and the usual random bunch of cleanups and minor
        improvements.
      
        Details:
      
         - large rework of EFI/EFD lifecycle handling to fix log recovery
           corruption issues, crashes and unmount hangs
      
         - separate metadata UUID on disk to enable changing boot label UUID
           for v5 filesystems
      
         - fixes for gcc miscompilation on certain platforms and optimisation
           levels
      
         - remote attribute allocation and recovery corruption fixes
      
         - inode lockdep annotation rework to fix bugs with too many
           subclasses
      
         - directory inode locking changes to prevent lockdep false positives
      
         - a handful of minor corruption fixes
      
         - various other small cleanups and bug fixes"
      
      * tag 'xfs-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (42 commits)
        xfs: fix error gotos in xfs_setattr_nonsize
        xfs: add mssing inode cache attempts counter increment
        xfs: return errors from partial I/O failures to files
        libxfs: bad magic number should set da block buffer error
        xfs: fix non-debug build warnings
        xfs: collapse allocsize and biosize mount option handling
        xfs: Fix file type directory corruption for btree directories
        xfs: lockdep annotations throw warnings on non-debug builds
        xfs: Fix uninitialized return value in xfs_alloc_fix_freelist()
        xfs: inode lockdep annotations broke non-lockdep build
        xfs: flush entire file on dio read/write to cached file
        xfs: Fix xfs_attr_leafblock definition
        libxfs: readahead of dir3 data blocks should use the read verifier
        xfs: stop holding ILOCK over filldir callbacks
        xfs: clean up inode lockdep annotations
        xfs: swap leaf buffer into path struct atomically during path shift
        xfs: relocate sparse inode mount warning
        xfs: dquots should be stamped with sb_meta_uuid
        xfs: log recovery needs to validate against sb_meta_uuid
        xfs: growfs not aware of sb_meta_uuid
        ...
    • NFSv4: Respect the server imposed limit on how many changes we may cache · 5445b1fb
      Authored by Trond Myklebust
      The NFSv4 delegation spec allows the server to tell a client to limit how
      much data it caches after the file is closed. In return, the server
      guarantees enough free space to avoid ENOSPC situations, etc.
      Prior to this patch, we assumed we could always cache aggressively after
      close. Unfortunately, this causes problems with servers that set the
      limit to 0 and therefore do not offer any ENOSPC guarantees.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • NFSv4: Express delegation limit in units of pages · 7d160a6c
      Authored by Trond Myklebust
      Since we're tracking modifications to the page cache on a per-page
      basis, it makes sense to express the limit to how much we may cache
      in units of pages.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
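The two NFSv4 delegation commits above can be illustrated together with a rough sketch: convert the server's byte limit into pages, then compare it against the number of modified pages at close time. All names are hypothetical, and both the round-down conversion and the "at most the limit" comparison are assumptions made for illustration, not the client's actual policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert a byte limit to whole pages, rounding down (assumed policy). */
uint64_t demo_limit_in_pages(uint64_t limit_bytes, unsigned int page_shift)
{
	assert(page_shift < 64);	/* a wider shift would be undefined */
	return limit_bytes >> page_shift;
}

/*
 * True if 'dirty_pages' modified pages may stay cached after close.
 * A server limit of 0 means nothing modified may be kept.
 */
bool demo_may_cache_after_close(uint64_t dirty_pages, uint64_t limit_bytes,
				unsigned int page_shift)
{
	return dirty_pages <= demo_limit_in_pages(limit_bytes, page_shift);
}
```

With 4 KiB pages (page_shift of 12), a limit of 8192 bytes allows two modified pages to remain cached, and a limit of 0 forces every modified page to be written back before the close completes.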
  3. 06 Sep, 2015 9 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 7d9071a0
      Authored by Linus Torvalds
      Pull vfs updates from Al Viro:
       "In this one:
      
         - d_move fixes (Eric Biederman)
      
         - UFS fixes (me; locking is mostly sane now, a bunch of bugs in error
           handling ought to be fixed)
      
         - switch of sb_writers to percpu rwsem (Oleg Nesterov)
      
         - superblock scalability (Josef Bacik and Dave Chinner)
      
         - swapon(2) race fix (Hugh Dickins)"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (65 commits)
        vfs: Test for and handle paths that are unreachable from their mnt_root
        dcache: Reduce the scope of i_lock in d_splice_alias
        dcache: Handle escaped paths in prepend_path
        mm: fix potential data race in SyS_swapon
        inode: don't softlockup when evicting inodes
        inode: rename i_wb_list to i_io_list
        sync: serialise per-superblock sync operations
        inode: convert inode_sb_list_lock to per-sb
        inode: add hlist_fake to avoid the inode hash lock in evict
        writeback: plug writeback at a high level
        change sb_writers to use percpu_rw_semaphore
        shift percpu_counter_destroy() into destroy_super_work()
        percpu-rwsem: kill CONFIG_PERCPU_RWSEM
        percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()
        percpu-rwsem: introduce percpu_down_read_trylock()
        document rwsem_release() in sb_wait_write()
        fix the broken lockdep logic in __sb_start_write()
        introduce __sb_writers_{acquired,release}() helpers
        ufs_inode_get{frag,block}(): get rid of 'phys' argument
        ufs_getfrag_block(): tidy up a bit
        ...
    • Merge tag 'for-linus-4.3-merge-window-part-1' of... · bd779669
      Authored by Linus Torvalds
      Merge tag 'for-linus-4.3-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
      
      Pull 9p updates from Eric Van Hensbergen:
       "Just a few cleanups for 4.3 merge window for the 9p file system.  I've
        gotten several more over the past week, but this group has been in
        for-next for at least a couple of weeks so I figured I'd push them
        first while I test the rest.
      
        Most of the ones not in this set are bug-fixes anyways so I could hold
        them for rc1"
      
      * tag 'for-linus-4.3-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
        9p: fix return code of read() when count is 0
        9p: remove unused option Opt_trans
    • Merge tag 'media/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 9cfcc658
      Authored by Linus Torvalds
      Pull media updates from Mauro Carvalho Chehab:
       - new DVB frontend drivers: ascot2e, cxd2841er, horus3a, lnbh25
       - new HDMI capture driver: tc358743
       - new driver for new NetUP DVB boards (netup_unidvb)
       - IR support for DVBSky cards (smipcie-ir)
       - Coda driver has gained macroblock tiling support
       - Renesas R-Car gains JPEG codec driver
       - new DVB platform driver for STi boards: c8sectpfe
       - added documentation for the media core kABI to device-drivers DocBook
       - lots of driver fixups, cleanups and improvements
      
      * tag 'media/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (297 commits)
        [media] c8sectpfe: Remove select on undefined LIBELF_32
        [media] i2c: fix platform_no_drv_owner.cocci warnings
        [media] cx231xx: Use wake_up_interruptible() instead of wake_up_interruptible_nr()
        [media] tc358743: only queue subdev notifications if devnode is set
        [media] tc358743: add missing Kconfig dependency/select
        [media] c8sectpfe: Use %pad to print 'dma_addr_t'
        [media] DocBook media: Fix typo "the the" in xml files
        [media] tc358743: make reset gpio optional
        [media] tc358743: set direction of reset gpio using devm_gpiod_get
        [media] dvbdev: document most of the functions/data structs
        [media] dvb_frontend.h: document the struct dvb_frontend
        [media] dvb-frontend.h: document struct dtv_frontend_properties
        [media] dvb-frontend.h: document struct dvb_frontend_ops
        [media] dvb: Use DVBFE_ALGO_HW where applicable
        [media] dvb_frontend.h: document struct analog_demod_ops
        [media] dvb_frontend.h: Document struct dvb_tuner_ops
        [media] Docbook: Document struct analog_parameters
        [media] dvb_frontend.h: get rid of dvbfe_modcod
        [media] add documentation for struct dvb_tuner_info
        [media] dvb_frontend: document dvb_frontend_tune_settings
        ...
    • Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration · e3a98ac4
      Authored by Linus Torvalds
      Pull mailbox updates from Jassi Brar:
       "Mainly we move from jiffy based timer to HRTIMER for finer control
        over polling.  Then a controller reduces its polling period from 10 to
        1ms"
      
      * 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
        mailbox: arm_mhu: reduce txpoll_period from 10ms to 1 ms
        mailbox: switch to hrtimer for tx_complete polling
        mailbox: Drop owner assignment from platform_driver
    • Merge tag 'md/4.3' of git://neil.brown.name/md · 2a013e37
      Authored by Linus Torvalds
      Pull md updates from Neil Brown:
      
       - an assortment of little fixes, several for minor races only likely to
         be hit during testing
      
       - further cluster-md-raid1 development, not ready for real use yet.
      
       - new RAID6 syndrome code for ARM NEON
      
       - fix a race where a write can return before failure of one device is
         properly recorded in metadata, so an immediate crash might result in
         that write being lost.
      
      * tag 'md/4.3' of git://neil.brown.name/md: (33 commits)
        md/raid5: ensure device failure recorded before write request returns.
        md/raid5: use bio_list for the list of bios to return.
        md/raid10: ensure device failure recorded before write request returns.
        md/raid1: ensure device failure recorded before write request returns.
        md-cluster: remove inappropriate try_module_get from join()
        md: extend spinlock protection in register_md_cluster_operations
        md-cluster: Read the disk bitmap sb and check if it needs recovery
        md-cluster: only call complete(&cinfo->completion) when node join cluster
        md-cluster: add missed lockres_free
        md-cluster: remove the unused sb_lock
        md-cluster: init suspend_list and suspend_lock early in join
        md-cluster: add the error check if failed to get dlm lock
        md-cluster: init completion within lockres_init
        md-cluster: fix deadlock issue on message lock
        md-cluster: transfer the resync ownership to another node
        md-cluster: split recover_slot for future code reuse
        md-cluster: use %pU to print UUIDs
        md: setup safemode_timer before it's being used
        md/raid5: handle possible race as reshape completes.
        md: sync sync_completed has correct value as recovery finishes.
        ...
    • Merge tag 'nfsd-4.3' of git://linux-nfs.org/~bfields/linux · 17447717
      Authored by Linus Torvalds
      Pull nfsd updates from Bruce Fields:
       "Nothing major, but:
      
         - Add Jeff Layton as an nfsd co-maintainer: no change to existing
           practice, just an acknowledgement of the status quo.
      
         - Two patches ("nfsd: ensure that...") for a race overlooked by the
           state locking rewrite, causing a crash noticed by multiple users.
      
         - Lots of smaller bugfixes all over from Kinglong Mee.
      
         - From Jeff, some cleanup of server rpc code in preparation for
           possible shift of nfsd threads to workqueues"
      
      * tag 'nfsd-4.3' of git://linux-nfs.org/~bfields/linux: (52 commits)
        nfsd: deal with DELEGRETURN racing with CB_RECALL
        nfsd: return CLID_INUSE for unexpected SETCLIENTID_CONFIRM case
        nfsd: ensure that delegation stateid hash references are only put once
        nfsd: ensure that the ol stateid hash reference is only put once
        net: sunrpc: fix tracepoint Warning: unknown op '->'
        nfsd: allow more than one laundry job to run at a time
        nfsd: don't WARN/backtrace for invalid container deployment.
        fs: fix fs/locks.c kernel-doc warning
        nfsd: Add Jeff Layton as co-maintainer
        NFSD: Return word2 bitmask if setting security label in OPEN/CREATE
        NFSD: Set the attributes used to store the verifier for EXCLUSIVE4_1
        nfsd: SUPPATTR_EXCLCREAT must be encoded before SECURITY_LABEL.
        nfsd: Fix an FS_LAYOUT_TYPES/LAYOUT_TYPES encode bug
        NFSD: Store parent's stat in a separate value
        nfsd: Fix two typos in comments
        lockd: NLM grace period shouldn't block NFSv4 opens
        nfsd: include linux/nfs4.h in export.h
        sunrpc: Switch to using hash list instead single list
        sunrpc/nfsd: Remove redundant code by exports seq_operations functions
        sunrpc: Store cache_detail in seq_file's private directly
        ...
    • Merge branch 'for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 22365979
      Authored by Linus Torvalds
      Pull btrfs updates from Chris Mason:
       "This has Jeff Mahoney's long standing trim patch that fixes corners
        where trims were missing.  Omar has some raid5/6 fixes, especially for
        using scrub and device replace when devices are missing.
      
        Zhao Lie continues cleaning and fixing things, this series fixes some
        really hard to hit corners in xfstests.  I had to pull it last merge
        window due to some deadlocks, but those are now resolved.
      
        I added support for Tejun's new blkio controllers.  It seems to work
        well for single devices, we'll expand to multi-device as well"
      
      * 'for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (47 commits)
        btrfs: fix compile when block cgroups are not enabled
        Btrfs: fix file read corruption after extent cloning and fsync
        Btrfs: check if previous transaction aborted to avoid fs corruption
        btrfs: use __GFP_NOFAIL in alloc_btrfs_bio
        btrfs: Prevent from early transaction abort
        btrfs: Remove unused arguments in tree-log.c
        btrfs: Remove useless condition in start_log_trans()
        Btrfs: add support for blkio controllers
        Btrfs: remove unused mutex from struct 'btrfs_fs_info'
        Btrfs: fix parity scrub of RAID 5/6 with missing device
        Btrfs: fix device replace of a missing RAID 5/6 device
        Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
        Btrfs: count devices correctly in readahead during RAID 5/6 replace
        Btrfs: remove misleading handling of missing device scrub
        btrfs: fix clone / extent-same deadlocks
        Btrfs: fix defrag to merge tail file extent
        Btrfs: fix warning in backref walking
        btrfs: Add WARN_ON() for double lock in btrfs_tree_lock()
        btrfs: Remove root argument in extent_data_ref_count()
        btrfs: Fix wrong comment of btrfs_alloc_tree_block()
        ...
    • Merge branch 'akpm' (patches from Andrew) · 6c0f568e
      Authored by Linus Torvalds
      Merge patch-bomb from Andrew Morton:
      
       - a few misc things
      
       - Andy's "ambient capabilities"
      
       - fs/notify updates
      
       - the ocfs2 queue
      
       - kernel/watchdog.c updates and feature work.
      
       - some of MM.  Includes Andrea's userfaultfd feature.
      
      [ Hadn't noticed that userfaultfd was 'default y' when applying the
        patches, so that got fixed in this merge instead.  We do _not_ mark
        new features that nobody uses yet 'default y'   - Linus ]
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
        mm/hugetlb.c: make vma_has_reserves() return bool
        mm/madvise.c: make madvise_behaviour_valid() return bool
        mm/memory.c: make tlb_next_batch() return bool
        mm/dmapool.c: change is_page_busy() return from int to bool
        mm: remove struct node_active_region
        mremap: simplify the "overlap" check in mremap_to()
        mremap: don't do uneccesary checks if new_len == old_len
        mremap: don't do mm_populate(new_addr) on failure
        mm: move ->mremap() from file_operations to vm_operations_struct
        mremap: don't leak new_vma if f_op->mremap() fails
        mm/hugetlb.c: make vma_shareable() return bool
        mm: make GUP handle pfn mapping unless FOLL_GET is requested
        mm: fix status code which move_pages() returns for zero page
        mm: memcontrol: bring back the VM_BUG_ON() in mem_cgroup_swapout()
        genalloc: add support of multiple gen_pools per device
        genalloc: add name arg to gen_pool_get() and devm_gen_pool_create()
        mm/memblock: WARN_ON when nid differs from overlap region
        Documentation/features/vm: add feature description and arch support status for batched TLB flush after unmap
        mm: defer flush of writable TLB entries
        mm: send one IPI per CPU to TLB flush all entries after unmapping pages
        ...
    • task_work: remove fifo ordering guarantee · c8219906
      Authored by Eric Dumazet
      In commit f341861f ("task_work: add a scheduling point in
      task_work_run()") I fixed a latency problem adding a cond_resched()
      call.
      
      Later, commit ac3d0da8 added yet another loop to reverse a list,
      bringing back the latency spike:
      
      I've seen in some cases this loop taking 275 ms, if for example a
      process with 2,000,000 files is killed.
      
      We could add yet another cond_resched() in the reverse loop, or we
      can simply remove the reversal, as I do not think anything
      would depend on order of task_work_add() submitted works.
      
      Fixes: ac3d0da8 ("task_work: Make task_work_add() lockless")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
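The trade-off discussed in this commit message can be modelled with a plain singly-linked list. This is a single-threaded sketch with hypothetical names (struct demo_work and the demo_* functions); the real list is updated locklessly, which is not modelled here. Pushing at the head yields newest-first order, so keeping FIFO order needs an extra O(n) reversal pass before running the callbacks, and dropping that pass simply runs them in LIFO order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work item; 'id' stands in for the callback. */
struct demo_work {
	struct demo_work *next;
	int id;
};

/* Push at the head: the newest entry is always first (LIFO). */
void demo_work_add(struct demo_work **head, struct demo_work *work)
{
	work->next = *head;
	*head = work;
}

/* The extra pass: reverse the list to recover submission (FIFO) order. */
struct demo_work *demo_reverse(struct demo_work *head)
{
	struct demo_work *prev = NULL;

	while (head) {
		struct demo_work *next = head->next;

		head->next = prev;
		prev = head;
		head = next;
	}
	return prev;
}

/* Detach the list and record the ids in run order; returns the count. */
size_t demo_run(struct demo_work **head, int reverse_first,
		int *order, size_t cap)
{
	struct demo_work *work = *head;
	size_t n = 0;

	*head = NULL;
	if (reverse_first)
		work = demo_reverse(work);
	for (; work; work = work->next) {
		assert(n < cap);
		order[n++] = work->id;
	}
	return n;
}
```

With millions of queued entries, the reversal is a second full walk of the list with no scheduling point, which is the kind of loop the message identifies as the source of the latency spike.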
  4. 05 Sep, 2015 15 commits