1. 18 Nov, 2016 5 commits
    • L
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 62389867
      Linus Torvalds committed
      Pull block fixes from Jens Axboe:
       "A set of fixes, one for NVMe from Keith, and a set for nvme-{rdma,t,f}
        from the usual suspects, fixing actual problems that would be a shame
        to release 4.9 with"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        nvme/pci: Don't free queues on error
        nvmet-rdma: drain the queue-pair just before freeing it
        nvme-rdma: stop and free io queues on connect failure
        nvmet-rdma: don't forget to delete a queue from the list of connection failed
        nvmet: Don't queue fatal error work if csts.cfs is set
        nvme-rdma: reject non-connect commands before the queue is live
        nvmet-rdma: Fix possible NULL deref when handling rdma cm events
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 57400d30
      Linus Torvalds committed
      Pull rdma fixes from Doug Ledford:
       "First round of -rc fixes.
      
        Due to various issues, I've been away and couldn't send a pull request
        for about three weeks. There were a number of -rc patches that built
        up in the meantime (some were there already from the early -rc
        stages). Obviously, there were way too many to send now, so I tried to
        pare the list down to the more important patches for the -rc cycle.
      
        Most of the code has had plenty of soak time at the various vendors'
        testing setups, so I doubt there will be another -rc pull request this
        cycle. I also tried to limit the patches to those with smaller
        footprints, so even though a shortlog is longer than I would like, the
        actual diffstat is mostly very small with the exception of just three
        files that had more changes, and a couple files with pure removals.
      
        Summary:
         - Misc Intel hfi1 fixes
         - Misc Mellanox mlx4, mlx5, and rxe fixes
         - A couple cxgb4 fixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits)
        iw_cxgb4: invalidate the mr when posting a read_w_inv wr
        iw_cxgb4: set *bad_wr for post_send/post_recv errors
        IB/rxe: Update qp state for user query
        IB/rxe: Clear queue buffer when modifying QP to reset
        IB/rxe: Fix handling of erroneous WR
        IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum
        IB/mlx4: Fix create CQ error flow
        IB/mlx4: Check gid_index return value
        IB/mlx5: Fix NULL pointer dereference on debug print
        IB/mlx5: Fix fatal error dispatching
        IB/mlx5: Resolve soft lock on massive reg MRs
        IB/mlx5: Use cache line size to select CQE stride
        IB/mlx5: Validate requested RQT size
        IB/mlx5: Fix memory leak in query device
        IB/core: Avoid unsigned int overflow in sg_alloc_table
        IB/core: Add missing check for addr_resolve callback return value
        IB/core: Set routable RoCE gid type for ipv4/ipv6 networks
        IB/cm: Mark stale CM id's whenever the mad agent was unregistered
        IB/uverbs: Fix leak of XRC target QPs
        IB/hfi1: Remove incorrect IS_ERR check
        ...
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · bec1b089
      Linus Torvalds committed
      Pull vfs fixes from Al Viro:
       "A couple of regression fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix iov_iter_advance() for ITER_PIPE
        xattr: Fix setting security xattrs on sockfs
    • L
      Merge tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · d46bc34d
      Linus Torvalds committed
      Pull orangefs fix from Mike Marshall:
       "orangefs: add .owner to debugfs file_operations
      
        Without ".owner = THIS_MODULE" it is possible to crash the kernel by
        unloading the Orangefs module while someone is reading debugfs files"
      
      * tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
        orangefs: add .owner to debugfs file_operations
    • A
      mremap: fix race between mremap() and page cleanning · 5d190420
      Aaron Lu committed
      Prior to 3.15, there was a race between zap_pte_range() and
      page_mkclean() where writes to a page could be lost.  Dave Hansen
      discovered by inspection that there is a similar race between
      move_ptes() and page_mkclean().
      
      We've been able to reproduce the issue by enlarging the race window with
      a msleep(), but have not been able to hit it without modifying the code.
      So, we think it's a real issue, but one that is difficult or impossible
      to hit in practice.
      
      The zap_pte_range() issue was fixed by commit 1cf35d47 ("mm: split
      'tlb_flush_mmu()' into tlb flushing and memory freeing parts").  This
      patch fixes the race between page_mkclean() and mremap().
      
      Here is one possible way to hit the race: suppose a process has mmapped
      a file with READ | WRITE and SHARED, and has two threads bound to two
      different CPUs, e.g.  CPU1 and CPU2.  mmap returned X, then thread 1
      wrote to addr X so that CPU1 now has a writable TLB entry for addr X.
      Thread 2 starts mremapping from addr X to Y while thread 1 cleans the
      page and then writes to the old addr X again.  The second write from
      thread 1 can succeed, but the value will be lost.
      
              thread 1                           thread 2
           (bound to CPU1)                    (bound to CPU2)
      
        1: write 1 to addr X to get a
           writeable TLB on this CPU
      
                                              2: mremap starts
      
                                              3: move_ptes emptied PTE for addr X
                                                 and setup new PTE for addr Y and
                                                 then dropped PTL for X and Y
      
        4: page laundering for N by doing
           fadvise FADV_DONTNEED. When done,
           pageframe N is deemed clean.
      
        5: *write 2 to addr X
      
                                              6: tlb flush for addr X
      
        7: munmap (Y, pagesize) to make the
           page unmapped
      
        8: fadvise with FADV_DONTNEED again
           to kick the page off the pagecache
      
        9: pread the page from file to verify
           the value. If 1 is there, it means
           we have lost the written 2.
      
        * The write may or may not cause a segmentation fault, depending on
          whether the TLB entry is still on the CPU.
      
      Please note that this is only one specific way the race could occur.
      It does not mean that the race can only occur in exactly the above
      configuration, e.g. more than 2 threads could be involved and fadvise()
      could be done in another thread, etc.
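
The timeline above can be replayed with a small toy model. This is only an illustrative sketch in Python, not kernel code: the dictionaries standing in for the page, the page table and CPU1's TLB, as well as the function name `replay_race`, are assumptions made for this example. The flag selects between the two outcomes mentioned in the footnote (the stale TLB entry for X is still cached, or it is already gone).

```python
def replay_race(stale_tlb_entry_survives):
    """Replay steps 1-9 of the timeline (toy model, hypothetical names).

    Returns (value read back from the file, whether write 2 faulted).
    """
    page = {"memory": 0, "file": 0}        # pageframe N and its backing file data
    page_table = {"X": {"dirty": False}}   # toy page table: address -> PTE
    cpu1_tlb = set()                       # addresses CPU1 holds a writable entry for

    # 1: thread 1 writes 1 to X; the PTE turns dirty and CPU1 caches the entry.
    page["memory"] = 1
    page_table["X"]["dirty"] = True
    cpu1_tlb.add("X")

    # 2-3: move_ptes() moves the PTE from X to Y and drops the PTL
    # without having flushed CPU1's TLB entry for X.
    page_table["Y"] = page_table.pop("X")

    # 4: laundering writes the page back, then cleans every PTE that still
    # maps the page and flushes those addresses. Only Y is found, not X.
    page["file"] = page["memory"]
    for addr, pte in page_table.items():
        pte["dirty"] = False
        cpu1_tlb.discard(addr)

    # 5: thread 1 writes 2 to the old address X.
    if stale_tlb_entry_survives and "X" in cpu1_tlb:
        page["memory"] = 2   # store lands via the stale entry; no PTE is dirtied
        faulted = False
    else:
        faulted = True       # no translation left: the write faults instead

    # 6: the TLB flush for X finally happens, too late to matter.
    cpu1_tlb.discard("X")

    # 7-8: munmap(Y) and eviction. A page with no dirty PTE is treated as
    # clean and dropped without writeback.
    if any(pte["dirty"] for pte in page_table.values()):
        page["file"] = page["memory"]
    page_table.clear()

    # 9: pread returns whatever reached the file.
    return page["file"], faulted
```

In this model, `replay_race(True)` reads back 1 even though the write of 2 completed without a fault, which is the lost write; `replay_race(False)` also reads back 1, but the write faulted, so nothing was lost silently.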
      
      For anonymous pages, the race is between mremap() and page reclaim.
      THP: a huge PMD is moved by mremap to a new huge PMD, then the new huge
      PMD gets unmapped/split/paged out before the TLB flush happens for the
      old huge PMD in move_page_tables(), and we could still write data to
      it.  Normal anonymous pages are in a similar situation.
      
      To fix this, check for any dirty PTE in move_ptes()/move_huge_pmd() and,
      if there is one, do the flush before dropping the PTL.  If the flush was
      done in every move_ptes()/move_huge_pmd() call, then there is no need to
      flush the whole range in move_page_tables().  Otherwise, the whole
      range flush is still needed.
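
The flush decision described in the paragraph above can be sketched as a toy model. This is an assumption-laden illustration in Python, not the kernel implementation: the function names mirror the kernel ones only for readability, while the list-of-dicts page table, the `chunk` parameter and the `tlb_flushes` log are invented for this example.

```python
def move_ptes(ptes, tlb_flushes, start, end):
    """Toy move of the PTEs in [start, end).

    If any PTE in the chunk is dirty, record a flush for that chunk
    "before dropping the PTL". Returns True if the chunk was flushed.
    """
    force_flush = any(pte["dirty"] for pte in ptes[start:end])
    if force_flush:
        tlb_flushes.append((start, end))
    return force_flush


def move_page_tables(ptes, chunk):
    """Move all PTEs chunk by chunk; return the list of flush ranges issued."""
    tlb_flushes = []
    need_flush = False
    for start in range(0, len(ptes), chunk):
        end = min(start + chunk, len(ptes))
        if not move_ptes(ptes, tlb_flushes, start, end):
            need_flush = True  # at least one chunk left its flush pending
    if need_flush:
        tlb_flushes.append((0, len(ptes)))  # single flush for the whole range
    return tlb_flushes
```

With only clean PTEs the model issues one whole-range flush, as before the fix. With only dirty PTEs every chunk is flushed early and the final flush is skipped. A mix of both yields the early flushes plus the whole-range flush.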
      
      Alternatively, we can track which part of the range is flushed in
      move_ptes()/move_huge_pmd() and which is not, to avoid flushing the
      whole range in move_page_tables().  But that would require multiple tlb
      flushes for the different sub-ranges and should be less efficient than
      the single whole range flush.
      
      KBuild test on my Sandybridge desktop doesn't show any noticeable change.
      v4.9-rc4:
        real    5m14.048s
        user    32m19.800s
        sys     4m50.320s
      
      With this commit:
        real    5m13.888s
        user    32m19.330s
        sys     4m51.200s
      Reported-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Aaron Lu <aaron.lu@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 17 Nov, 2016 30 commits
  3. 16 Nov, 2016 5 commits