1. 11 Oct, 2013 5 commits
    • powerpc: Provide for giveup_fpu/altivec to save state in alternate location · 18461960
      Paul Mackerras committed
      This provides a facility which is intended for use by KVM, where the
      contents of the FP/VSX and VMX (Altivec) registers can be saved away
      to somewhere other than the thread_struct when kernel code wants to
      use floating point or VMX instructions.  This is done by providing a
      pointer in the thread_struct to indicate where the state should be
      saved to.  The giveup_fpu() and giveup_altivec() functions test these
      pointers and save state to the indicated location if they are non-NULL.
      Note that the MSR_FP/VEC bits in task->thread.regs->msr are still used
      to indicate whether the CPU register state is live, even when an
      alternate save location is being used.
      
      This also provides load_fp_state() and load_vr_state() functions, which
      load up FP/VSX and VMX state from memory into the CPU registers, and
      corresponding store_fp_state() and store_vr_state() functions, which
      store FP/VSX and VMX state into memory from the CPU registers.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
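As a rough illustration of the pointer test described in this commit, here is a minimal userspace sketch in C. It is not the kernel code: the real giveup_fpu() and giveup_altivec() are low-level routines, and the names `fp_state`, `thread`, `fp_save_area`, `fp_live`, `cpu_fp_regs` and `giveup_fp_sketch` are assumptions made up for this example. The `fp_live` flag stands in for the MSR_FP bit that marks the CPU register state as live.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct fp_state {
    uint64_t fpr[32];   /* register images kept as raw 64-bit values */
    uint64_t fpscr;
};

struct thread {
    struct fp_state fp_state;        /* default save area in the thread */
    struct fp_state *fp_save_area;   /* alternate save location, or NULL */
    int fp_live;                     /* models the "register state is live" bit */
};

/* Models the CPU's floating-point register file. */
static struct fp_state cpu_fp_regs;

/*
 * Save the live FP state. If an alternate save area is set, the state goes
 * there; otherwise it goes to the thread's own save area. The live flag is
 * consulted and cleared in both cases.
 */
static void giveup_fp_sketch(struct thread *t)
{
    struct fp_state *dst;

    assert(t != NULL);
    if (!t->fp_live)
        return;                      /* nothing live, nothing to save */

    dst = t->fp_save_area ? t->fp_save_area : &t->fp_state;
    memcpy(dst, &cpu_fp_regs, sizeof(*dst));
    t->fp_live = 0;
}
```

With `fp_save_area` left NULL the state lands in the thread's own save area; pointing it at another buffer redirects the save without changing how liveness is tracked.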
    • powerpc: Put FP/VSX and VR state into structures · de79f7b9
      Paul Mackerras committed
      This creates new 'thread_fp_state' and 'thread_vr_state' structures
      to store FP/VSX state (including FPSCR) and Altivec/VSX state
      (including VSCR), and uses them in the thread_struct.  In the
      thread_fp_state, the FPRs and VSRs are represented as u64 rather
      than double, since we rarely perform floating-point computations
      on the values, and this will enable the structures to be used
      in KVM code as well.  Similarly FPSCR is now a u64 rather than
      a structure of two 32-bit values.
      
      This takes the offsets out of the macros such as SAVE_32FPRS,
      REST_32FPRS, etc.  This enables the same macros to be used for normal
      and transactional state, enabling us to delete the transactional
      versions of the macros.   This also removes the unused do_load_up_fpu
      and do_load_up_altivec, which were in fact buggy since they didn't
      create large enough stack frames to account for the fact that
      load_up_fpu and load_up_altivec are not designed to be called from C
      and assume that their caller's stack frame is an interrupt frame.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: add real mode support for dma operations on powernv · 8e0a1611
      Alexey Kardashevskiy committed
      The existing TCE machine calls (tce_build and tce_free) only support
      virtual mode as they call __raw_writeq for TCE invalidation, which
      fails in real mode.
      
      This introduces tce_build_rm and tce_free_rm real mode versions
      which do mostly the same but use "Store Doubleword Caching Inhibited
      Indexed" instruction for TCE invalidation.
      
      This new feature is going to be utilized by real mode support of VFIO.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Prepare to support kernel handling of IOMMU map/unmap · 8e0861fa
      Alexey Kardashevskiy committed
      The current VFIO-on-POWER implementation supports only user mode
      driven mapping, i.e. QEMU is sending requests to map/unmap pages.
      However this approach is really slow, so we want to move that to KVM.
      Since H_PUT_TCE can be extremely performance sensitive (especially with
      network adapters where each packet needs to be mapped/unmapped) we chose
      to implement that as a "fast" hypercall directly in "real
      mode" (processor still in the guest context but MMU off).
      
      To be able to do that, we need to provide some facilities to
      access the struct page count within that real mode environment as things
      like the sparsemem vmemmap mappings aren't accessible.
      
      This adds an API function realmode_pfn_to_page() to get page struct when
      MMU is off.
      
      This adds to MM a new function put_page_unless_one() which drops a page
      if counter is bigger than 1. It is going to be used when MMU is off
      (for example, real mode on PPC64) and we want to make sure that page
      release will not happen in real mode as it may crash the kernel in
      a horrible way.
      
      CONFIG_SPARSEMEM_VMEMMAP and CONFIG_FLATMEM are supported.
      
      Cc: linux-mm@kvack.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
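The "drop a reference unless it is the last one" idea behind put_page_unless_one() can be sketched in portable C11. This is a userspace model under stated assumptions, not the kernel implementation: `put_ref_unless_one` and the bare `atomic_int` counter are hypothetical stand-ins for the new function and the page reference count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Drop one reference only if the counter is greater than 1, so that the
 * final release (and the freeing work that comes with it) never happens
 * in this context. Returns true if a reference was dropped.
 */
static bool put_ref_unless_one(atomic_int *count)
{
    int old;

    assert(count != NULL);
    old = atomic_load(count);
    while (old > 1) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return true;
    }
    return false;   /* counter was 1 (or lower): leave it to the caller */
}
```

A caller that gets `false` back would have to defer the final release to a context where freeing the page is safe.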
    • hashtable: add hash_for_each_possible_rcu_notrace() · 81fcfb81
      Alexey Kardashevskiy committed
      This adds hash_for_each_possible_rcu_notrace() which is basically
      a notrace clone of hash_for_each_possible_rcu() which cannot be
      used in real mode due to its tracing/debugging capability.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  2. 08 Oct, 2013 3 commits
  3. 07 Oct, 2013 6 commits
    • HID: wiimote: fix FF deadlock · f50f9aab
      David Herrmann committed
      The input core has an internal spinlock that is acquired during event
      injection via input_event() and friends but also held during FF callbacks.
      That means, there is no way to share a lock between event-injection and FF
      handling. Unfortunately, this is what is required for wiimote state
      tracking and what we do with state.lock and input->lock.
      
      This deadlock can be triggered when using continuous data reporting and FF
      on a wiimote device at the same time. It takes me at least 30 minutes of
      stress-testing to trigger it but users reported considerably shorter
      times (http://bpaste.net/show/132504/) when using some gaming-console
      emulators.
      
      The real problem is that we have two copies of internal state, one in the
      wiimote objects and the other in the input device. As the input-lock is
      not supposed to be accessed from outside of input-core, we have no other
      choice than to offload FF handling into a worker. This actually works
      pretty well and also allows fast rumble changes to be merged implicitly
      into a single request.
      
      Due to the 3-layered workers (rumble+queue+l2cap) this might reduce FF
      responsiveness. Initial tests were fine so let's fix the race first and if
      it turns out to be too slow we can always handle FF out-of-band and skip
      the queue-worker.
      
      Cc: <stable@vger.kernel.org> # 3.11+
      Reported-by: Thomas Schneider
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
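The way a worker can merge several fast rumble changes into one request can be modelled in a few lines of C. This is a single-threaded sketch under assumptions, not the wiimote driver: `rumble_dev`, `rumble_request` and `rumble_worker` are hypothetical names, and a real driver would hand the work to a workqueue and protect the shared fields appropriately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state shared between the FF callback and the worker. */
struct rumble_dev {
    bool pending;     /* a worker run has been requested */
    int  wanted;      /* latest requested rumble state (0 = off, 1 = on) */
    int  last_sent;   /* last state actually sent to the device */
    int  sent_count;  /* number of requests sent to the device */
};

/*
 * Called from the force-feedback callback. It only records the newest
 * request and marks work as pending; it never touches the device itself,
 * so it does not need the lock used for state tracking.
 */
static void rumble_request(struct rumble_dev *d, int on)
{
    assert(d != NULL);
    d->wanted = on;
    d->pending = true;   /* a kernel driver would schedule its worker here */
}

/* Worker body: sends at most one request, carrying the latest state. */
static void rumble_worker(struct rumble_dev *d)
{
    assert(d != NULL);
    if (!d->pending)
        return;
    d->pending = false;
    d->last_sent = d->wanted;
    d->sent_count++;
}
```

Three quick calls to `rumble_request()` followed by one worker run produce a single device request with the final state, which is the merging effect the commit message mentions.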
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 162bdafa
      Linus Torvalds committed
      Pull s390 fixes from Martin Schwidefsky:
       "A couple of bug fixes, notable are the regression with ptrace vs
        restarting system calls and the patch for kdump to be able to copy
        from virtual memory"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390: fix system call restart after inferior call
        s390: Allow vmalloc target buffers for copy_from_oldmem()
        s390/sclp: properly detect line mode console
        s390/kprobes: add exrl to list of prohibited opcodes
        s390/3270: fix return value check in tty3270_resize_work()
    • Linux 3.12-rc4 · d0e639c9
      Linus Torvalds committed
    • net: Update the sysctl permissions handler to test effective uid/gid · 2433c8f0
      Eric W. Biederman committed
      Modify the code to use current_euid() and in_egroup_p(), as is done
      in fs/proc/proc_sysctl.c:test_perm()
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Reported-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
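The owner/group/other selection that a permission test based on the effective uid and gid performs can be sketched as follows. This is an illustrative userspace model, not the code touched by this commit: `perm_check_sketch` is a hypothetical name, and the two boolean parameters stand in for the results of comparing current_euid() with the owner and of calling in_egroup_p() on the owning group.

```c
#include <assert.h>
#include <stdbool.h>

/* Access request bits, using the same values as the kernel's MAY_* flags. */
#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/*
 * Pick the owner, group or "other" permission triplet from a Unix-style
 * mode based on the caller's effective identity, then check that every
 * requested access bit is granted. Returns 0 on success, -1 on denial.
 */
static int perm_check_sketch(int mode, int op,
                             bool euid_is_owner, bool in_egroup)
{
    assert((op & ~(MAY_READ | MAY_WRITE | MAY_EXEC)) == 0);

    if (euid_is_owner)
        mode >>= 6;          /* use the owner bits */
    else if (in_egroup)
        mode >>= 3;          /* use the group bits */
    /* otherwise the low three "other" bits are used as they are */

    if ((op & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0)
        return 0;
    return -1;
}
```

Testing the effective rather than the real IDs matters for processes whose real and effective credentials differ, since the two can select different permission triplets.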
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 13caa8ed
      Linus Torvalds committed
      Pull SCSI target fixes from Nicholas Bellinger:
       "Here are the outstanding target fixes queued up for v3.12-rc4 code.
      
        The highlights include:
      
         - Make vhost/scsi tag percpu_ida_alloc() use GFP_ATOMIC
         - Allow sess_cmd_map allocation failure fallback to use vzalloc
         - Fix COMPARE_AND_WRITE se_cmd->data_length bug with FILEIO backends
         - Fixes for COMPARE_AND_WRITE callback recursive failure OOPs + non
           zero scsi_status bug
         - Make iscsi-target do acknowledgement tag release from RX context
         - Setup iscsi-target with extra (cmdsn_depth / 2) percpu_ida tags
      
        Also included is an iscsi-target patch CC'ed for v3.10+ that avoids
        legacy wait_for_task=true release during fast-path StatSN
        acknowledgement, and two other SRP target related patches that address
        long-standing issues that are CC'ed for v3.3+.
      
        Extra thanks to Thomas Glanzmann for his testing feedback with
        COMPARE_AND_WRITE + EXTENDED_COPY VAAI logic"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
        iscsi-target; Allow an extra tag_num / 2 number of percpu_ida tags
        iscsi-target: Perform release of acknowledged tags from RX context
        iscsi-target: Only perform wait_for_tasks when performing shutdown
        target: Fail on non zero scsi_status in compare_and_write_callback
        target: Fix recursive COMPARE_AND_WRITE callback failure
        target: Reset data_length for COMPARE_AND_WRITE to NoLB * block_size
        ib_srpt: always set response for task management
        target: Fall back to vzalloc upon ->sess_cmd_map kzalloc failure
        vhost/scsi: Use GFP_ATOMIC with percpu_ida_alloc for obtaining tag
        ib_srpt: Destroy cm_id before destroying QP.
        target: Fix xop->dbl assignment in target_xcopy_parse_segdesc_02
    • Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma · 831ae3c1
      Linus Torvalds committed
      Pull slave-dmaengine fixes from Vinod Koul:
       "Here are the slave dmaengine fixes.  We have the fix for deadlock issue
        on imx-dma by Michael and Josh's edma config fix along with author
        change"
      
      * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: imx-dma: fix callback path in tasklet
        dmaengine: imx-dma: fix lockdep issue between irqhandler and tasklet
        dmaengine: imx-dma: fix slow path issue in prep_dma_cyclic
        dma/Kconfig: Make TI_EDMA select TI_PRIV_EDMA
        edma: Update author email address
  4. 06 Oct, 2013 5 commits
  5. 05 Oct, 2013 21 commits
    • btrfs: Fix crash due to not allocating integrity data for a bioset · b208c2f7
      Darrick J. Wong committed
      When btrfs creates a bioset, we must also allocate the integrity data pool.
      Otherwise btrfs will crash when it tries to submit a bio to a checksumming
      disk:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
       IP: [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
       PGD 2305e4067 PUD 23063d067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP
       Modules linked in: btrfs scsi_debug xfs ext4 jbd2 ext3 jbd mbcache
      sch_fq_codel eeprom lpc_ich mfd_core nfsd exportfs auth_rpcgss af_packet
      raid6_pq xor zlib_deflate libcrc32c [last unloaded: scsi_debug]
       CPU: 1 PID: 4486 Comm: mount Not tainted 3.12.0-rc1-mcsum #2
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       task: ffff8802451c9720 ti: ffff880230698000 task.ti: ffff880230698000
       RIP: 0010:[<ffffffff8111e28a>]  [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
       RSP: 0018:ffff880230699688  EFLAGS: 00010286
       RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000005f8445
       RDX: 0000000000000001 RSI: 0000000000000010 RDI: 0000000000000000
       RBP: ffff8802306996f8 R08: 0000000000011200 R09: 0000000000000008
       R10: 0000000000000020 R11: ffff88009d6e8000 R12: 0000000000011210
       R13: 0000000000000030 R14: ffff8802306996b8 R15: ffff8802451c9720
       FS:  00007f25b8a16800(0000) GS:ffff88024fc80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: 0000000000000018 CR3: 0000000230576000 CR4: 00000000000007e0
       Stack:
        ffff8802451c9720 0000000000000002 ffffffff81a97100 0000000000281250
        ffffffff81a96480 ffff88024fc99150 ffff880228d18200 0000000000000000
        0000000000000000 0000000000000040 ffff880230e8c2e8 ffff8802459dc900
       Call Trace:
        [<ffffffff811b2208>] bio_integrity_alloc+0x48/0x1b0
        [<ffffffff811b26fc>] bio_integrity_prep+0xac/0x360
        [<ffffffff8111e298>] ? mempool_alloc+0x58/0x150
        [<ffffffffa03e8041>] ? alloc_extent_state+0x31/0x110 [btrfs]
        [<ffffffff81241579>] blk_queue_bio+0x1c9/0x460
        [<ffffffff8123e58a>] generic_make_request+0xca/0x100
        [<ffffffff8123e639>] submit_bio+0x79/0x160
        [<ffffffffa03f865e>] btrfs_map_bio+0x48e/0x5b0 [btrfs]
        [<ffffffffa03c821a>] btree_submit_bio_hook+0xda/0x110 [btrfs]
        [<ffffffffa03e7eba>] submit_one_bio+0x6a/0xa0 [btrfs]
        [<ffffffffa03ef450>] read_extent_buffer_pages+0x250/0x310 [btrfs]
        [<ffffffff8125eef6>] ? __radix_tree_preload+0x66/0xf0
        [<ffffffff8125f1c5>] ? radix_tree_insert+0x95/0x260
        [<ffffffffa03c66f6>] btree_read_extent_buffer_pages.constprop.128+0xb6/0x120
      [btrfs]
        [<ffffffffa03c8c1a>] read_tree_block+0x3a/0x60 [btrfs]
        [<ffffffffa03caefd>] open_ctree+0x139d/0x2030 [btrfs]
        [<ffffffffa03a282a>] btrfs_mount+0x53a/0x7d0 [btrfs]
        [<ffffffff8113ab0b>] ? pcpu_alloc+0x8eb/0x9f0
        [<ffffffff81167305>] ? __kmalloc_track_caller+0x35/0x1e0
        [<ffffffff81176ba0>] mount_fs+0x20/0xd0
        [<ffffffff81191096>] vfs_kern_mount+0x76/0x120
        [<ffffffff81193320>] do_mount+0x200/0xa40
        [<ffffffff81135cdb>] ? strndup_user+0x5b/0x80
        [<ffffffff81193bf0>] SyS_mount+0x90/0xe0
        [<ffffffff8156d31d>] system_call_fastpath+0x1a/0x1f
       Code: 4c 8d 75 a8 4c 89 6d e8 45 89 e0 4c 8d 6f 30 48 89 5d d8 41 83 e0 af 48
      89 fb 49 83 c6 18 4c 89 7d f8 65 4c 8b 3c 25 c0 b8 00 00 <48> 8b 73 18 44 89 c7
      44 89 45 98 ff 53 20 48 85 c0 48 89 c2 74
       RIP  [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
        RSP <ffff880230699688>
       CR2: 0000000000000018
       ---[ end trace 7a96042017ed21e2 ]---
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • Merge branch 'for-linus' into for-linus-3.12 · 1329dfc8
      Chris Mason committed
    • Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6 · a5c984cc
      Linus Torvalds committed
      Pull CIFS fixes from Steve French:
       "Small set of cifs fixes.  Most important is Jeff's fix that works
        around disconnection problems which can be caused by simultaneous use
        of user space tools (starting a long running smbclient backup then
        doing a cifs kernel mount) or multiple cifs mounts through a NAT, and
        Jim's fix to deal with reexport of cifs share.
      
        I expect to send two more cifs fixes next week (being tested now) -
        fixes to address an SMB2 unmount hang when server dies and a fix for
        cifs symlink handling of Windows "NFS" symlinks"
      
      * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
        [CIFS] update cifs.ko version
        [CIFS] Remove ext2 flags that have been moved to fs.h
        [CIFS] Provide sane values for nlink
        cifs: stop trying to use virtual circuits
        CIFS: FS-Cache: Uncache unread pages in cifs_readpages() before freeing them
    • Merge tag 'pci-v3.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 95167aad
      Linus Torvalds committed
      Pull PCI fix from Bjorn Helgaas:
       "We merged what was intended to be an MMCONFIG cleanup, but in fact,
        for systems without _CBA (which is almost everything), it broke
        extended config space for domain 0 and it broke all config space for
        other domains.
      
        This reverts the change"
      
      * tag 'pci-v3.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"
    • Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero" · 67d470e0
      Bjorn Helgaas committed
      This reverts commit 07f9b61c.
      
      07f9b61c was intended to be a cleanup that didn't change anything, but in
      fact, for systems without _CBA (which is almost everything), it broke
      extended config space for domain 0 and all config space for other domains.
      
      Reference: http://lkml.kernel.org/r/20131004011806.GE20450@dangermouse.emea.sgi.com
      Reported-by: Hedi Berriche <hedi@sgi.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • Merge tag 'pm+acpi-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7dee8dff
      Linus Torvalds committed
      Pull ACPI and power management fixes from Rafael Wysocki:
      
       - The resume part of user space driven hibernation (s2disk) is now
         broken after the change that moved the creation of memory bitmaps to
         after the freezing of tasks, because I forgot that the resume utility
         loaded the image before freezing tasks and needed the bitmaps for
         that.  The fix adds special handling for that case.
      
       - One of recent commits changed the export of acpi_bus_get_device() to
         EXPORT_SYMBOL_GPL(), which was technically correct but broke existing
         binary modules using that function including one in particularly
         widespread use.  Change it back to EXPORT_SYMBOL().
      
       - The intel_pstate driver sometimes fails to disable turbo if its
         no_turbo sysfs attribute is set.  Fix from Srinivas Pandruvada.
      
       - One of recent cpufreq fixes forgot to update a check in cpufreq-cpu0
         which still (incorrectly) treats non-NULL as non-error.  Fix from
         Philipp Zabel.
      
       - The SPEAr cpufreq driver uses a wrong variable type in one place
         preventing it from catching errors returned by one of the functions
         called by it.  Fix from Sachin Kamat.
      
      * tag 'pm+acpi-3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: Use EXPORT_SYMBOL() for acpi_bus_get_device()
        intel_pstate: fix no_turbo
        cpufreq: cpufreq-cpu0: NULL is a valid regulator, part 2
        cpufreq: SPEAr: Fix incorrect variable type
        PM / hibernate: Fix user space driven resume regression
    • Merge tag 'xfs-for-linus-v3.12-rc4' of git://oss.sgi.com/xfs/xfs · 3dbecf0a
      Linus Torvalds committed
      Pull xfs bugfixes from Ben Myers:
       "There are lockdep annotations for project quotas, a fix for dirent
        dtype support on v4 filesystems, a fix for a memory leak in recovery,
        and a fix for the build error that resulted from it.  D'oh"
      
      * tag 'xfs-for-linus-v3.12-rc4' of git://oss.sgi.com/xfs/xfs:
        xfs: Use kmem_free() instead of free()
        xfs: fix memory leak in xlog_recover_add_to_trans
        xfs: dirent dtype presence is dependent on directory magic numbers
        xfs: lockdep needs to know about 3 dquot-deep nesting
    • selinux: remove 'flags' parameter from avc_audit() · ab354062
      Linus Torvalds committed
      Now avc_audit() has no more users with that parameter. Remove it.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • selinux: avc_has_perm_flags has no more users · cb4fbe57
      Linus Torvalds committed
      .. so get rid of it.  The only indirect users were all the
      avc_has_perm() callers which just expanded to have a zero flags
      argument.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Btrfs: fix a use-after-free bug in btrfs_dev_replace_finishing · 1357272f
      Ilya Dryomov committed
      free_device rcu callback, scheduled from btrfs_rm_dev_replace_srcdev,
      can be processed before btrfs_scratch_superblock is called, which would
      result in a use-after-free on btrfs_device contents.  Fix this by
      zeroing the superblock before the rcu callback is registered.
      
      Cc: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: eliminate races in worker stopping code · 964fb15a
      Ilya Dryomov committed
      The current implementation of worker threads in Btrfs has races in
      worker stopping code, which cause all kinds of panics and lockups when
      running btrfs/011 xfstest in a loop.  The problem is that
      btrfs_stop_workers is unsynchronized with respect to check_idle_worker,
      check_busy_worker and __btrfs_start_workers.
      
      E.g., check_idle_worker race flow:
      
             btrfs_stop_workers():            check_idle_worker(aworker):
      - grabs the lock
      - splices the idle list into the
        working list
      - removes the first worker from the
        working list
      - releases the lock to wait for
        its kthread's completion
                                        - grabs the lock
                                        - if aworker is on the working list,
                                          moves aworker from the working list
                                          to the idle list
                                        - releases the lock
      - grabs the lock
      - puts the worker
      - removes the second worker from the
        working list
                                    ......
              btrfs_stop_workers returns, aworker is on the idle list
                       FS is umounted, memory is freed
                                    ......
                    aworker is woken up, fireworks ensue
      
      With this applied, I wasn't able to trigger the problem in 48 hours,
      whereas previously I could reliably reproduce at least one of these
      races within an hour.
      Reported-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix crash of compressed writes · 385fe0be
      Liu Bo committed
      The crash[1] was found by xfstests/generic/208 with "-o compress".
      It is not reproduced every time, but it does panic.
      
      The bug is quite interesting: it was actually introduced by a recent commit
      (573aecaf, "Btrfs: actually limit the size of delalloc range").
      
      Btrfs implements delayed allocation, so during writeback, we
      (1) get a page A and lock it
      (2) search the state tree for delalloc bytes and lock all pages within the range
      (3) process the delalloc range, including find disk space and create
          ordered extent and so on.
      (4) submit the page A.
      
      It runs well in normal cases, but in a racy case, e.g.
      buffered compressed writes and aio-dio writes,
      we may sometimes fail to lock all pages in the 'delalloc' range,
      in which case we need to fall back to searching the state tree again with
      a smaller range limit (max_bytes = PAGE_CACHE_SIZE - offset).
      
      The mentioned commit has a side effect, that is, in the fallback case,
      we can find delalloc bytes before the index of the page we already have locked,
      so we're in the case of (delalloc_end <= *start) and return with (found > 0).
      
      This ends with not locking delalloc pages but making ->writepage still
      process them, and the crash happens.
      
      This fixes it by treating this case as if nothing was found and returning
      to the caller, which knows how to deal with it properly.
      
      [1]:
      ------------[ cut here ]------------
      kernel BUG at mm/page-writeback.c:2170!
      [...]
      CPU: 2 PID: 11755 Comm: btrfs-delalloc- Tainted: G           O 3.11.0+ #8
      [...]
      RIP: 0010:[<ffffffff810f5093>]  [<ffffffff810f5093>] clear_page_dirty_for_io+0x1e/0x83
      [...]
      [ 4934.248731] Stack:
      [ 4934.248731]  ffff8801477e5dc8 ffffea00049b9f00 ffff8801869f9ce8 ffffffffa02b841a
      [ 4934.248731]  0000000000000000 0000000000000000 0000000000000fff 0000000000000620
      [ 4934.248731]  ffff88018db59c78 ffffea0005da8d40 ffffffffa02ff860 00000001810016c0
      [ 4934.248731] Call Trace:
      [ 4934.248731]  [<ffffffffa02b841a>] extent_range_clear_dirty_for_io+0xcf/0xf5 [btrfs]
      [ 4934.248731]  [<ffffffffa02a8889>] compress_file_range+0x1dc/0x4cb [btrfs]
      [ 4934.248731]  [<ffffffff8104f7af>] ? detach_if_pending+0x22/0x4b
      [ 4934.248731]  [<ffffffffa02a8bad>] async_cow_start+0x35/0x53 [btrfs]
      [ 4934.248731]  [<ffffffffa02c694b>] worker_loop+0x14b/0x48c [btrfs]
      [ 4934.248731]  [<ffffffffa02c6800>] ? btrfs_queue_worker+0x25c/0x25c [btrfs]
      [ 4934.248731]  [<ffffffff810608f5>] kthread+0x8d/0x95
      [ 4934.248731]  [<ffffffff81060868>] ? kthread_freezable_should_stop+0x43/0x43
      [ 4934.248731]  [<ffffffff814fe09c>] ret_from_fork+0x7c/0xb0
      [ 4934.248731]  [<ffffffff81060868>] ? kthread_freezable_should_stop+0x43/0x43
      [ 4934.248731] Code: ff 85 c0 0f 94 c0 0f b6 c0 59 5b 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 2c de 00 00 49 89 c4 48 8b 03 a8 01 75 02 <0f> 0b 4d 85 e4 74 52 49 8b 84 24 80 00 00 00 f6 40 20 01 75 44
      [ 4934.248731] RIP  [<ffffffff810f5093>] clear_page_dirty_for_io+0x1e/0x83
      [ 4934.248731]  RSP <ffff8801869f9c48>
      [ 4934.280307] ---[ end trace 36f06d3f8750236a ]---
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix transid verify errors when recovering log tree · 60e7cd3a
      Josef Bacik committed
      If we crash with a log, remount and recover that log, and then crash before we
      can commit another transaction we will get transid verify errors on the next
      mount.  This is because we were not zero'ing out the log when we committed the
      transaction after recovery.  This is ok as long as we commit another transaction
      at some point in the future, but if you abort or something else goes wrong you
      can end up in this weird state because the recovery stuff says that the tree log
      should have a generation+1 of the super generation, which won't be the case of
      the transaction that was started for recovery.  Fix this by removing the check
      and _always_ zero out the log portion of the super when we commit a transaction.
      This fixes the transid verify issues I was seeing with my force errors tests.
      Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • selinux: remove 'flags' parameter from inode_has_perm · 19e49834
      Linus Torvalds committed
      Every single user passes in '0'.  I think we had non-zero users back in
      some stone age when selinux_inode_permission() was implemented in terms
      of inode_has_perm(), but that complicated case got split up into a
      totally separate code-path so that we could optimize the much simpler
      special cases.
      
      See commit 2e334057 ("SELinux: delay initialization of audit data in
      selinux_inode_permission") for example.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • xfs: Use kmem_free() instead of free() · b2a42f78
      Thierry Reding committed
      This fixes a build failure caused by calling the free() function which
      does not exist in the Linux kernel.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit aaaae980)
    • xfs: fix memory leak in xlog_recover_add_to_trans · 9b3b77fe
      tinguely@sgi.com committed
      Free the memory in error path of xlog_recover_add_to_trans().
      Normally this memory is freed in recovery pass2, but is leaked
      in the error path.
      Signed-off-by: Mark Tinguely <tinguely@sgi.com>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit 519ccb81)
    • xfs: dirent dtype presence is dependent on directory magic numbers · 6d313498
      Dave Chinner committed
      The determination of whether a directory entry contains a dtype
      field originally was dependent on the filesystem having CRCs
      enabled. This meant that the format for dtype being enabled could be
      determined by checking the directory block magic number rather than
      doing a feature bit check. This was useful in that it meant that we
      didn't need to pass a struct xfs_mount around to functions that
      were already supplied with a directory block header.
      
      Unfortunately, the introduction of dtype fields into the v4
      structure via a feature bit meant this "use the directory block
      magic number" method of discriminating the dirent entry sizes is
      broken. Hence we need to convert the places that use magic number
      checks to use feature bit checks so that they work correctly and not
      by chance.
      
      The current code works on v4 filesystems only because the dirent
      size roundup covers the extra byte needed by the dtype field in the
      places where this problem occurs.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit 367993e7)
    • xfs: lockdep needs to know about 3 dquot-deep nesting · 89c6c89a
      Dave Chinner committed
      Michael Semon reported that xfs/299 generated this lockdep warning:
      
      =============================================
      [ INFO: possible recursive locking detected ]
      3.12.0-rc2+ #2 Not tainted
      ---------------------------------------------
      touch/21072 is trying to acquire lock:
       (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
      
      but task is already holding lock:
       (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&xfs_dquot_other_class);
        lock(&xfs_dquot_other_class);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      7 locks held by touch/21072:
       #0:  (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
       #1:  (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
       #2:  (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
       #3:  (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
       #4:  (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
       #5:  (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
       #6:  (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
      
      The lockdep annotation for dquot lock nesting only understands
      locking for user and "other" dquots, not user, group and quota
      dquots. Fix the annotations to match the locking hierarchy we now
      have.
      Reported-by: Michael L. Semon <mlsemon35@gmail.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit f112a049)
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 15c83d26
      Linus Torvalds committed
      Pull fuse bugfixes from Miklos Szeredi:
       "This contains two more fixes by Maxim for writeback/truncate races and
        fixes for RCU walk in fuse_dentry_revalidate()"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: no RCU mode in fuse_access()
        fuse: readdirplus: fix RCU walk
        fuse: don't check_submounts_and_drop() in RCU walk
        fuse: fix fallocate vs. ftruncate race
        fuse: wait for writeback in fuse_file_fallocate()
    • Merge tag 'iommu-fixes-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 8e1a2540
      Linus Torvalds committed
      Pull iommu fixes from Joerg Roedel:
       "A couple of fixes from the IOMMU side:
      
         - some small fixes for the new ARM-SMMU driver
         - a register offset correction for VT-d
         - add MAINTAINERS entry for drivers/iommu
      
        Overall no really big or intrusive changes"
      
      * tag 'iommu-fixes-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        x86/iommu: correct ICS register offset
        MAINTAINERS: add overall IOMMU section
        iommu/arm-smmu: don't enable SMMU device until probing has completed
        iommu/arm-smmu: fix iommu_present() test in init
        iommu/arm-smmu: fix a signedness bug
    • Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 · 0d45dab6
      Linus Torvalds committed
      Pull ARM64 fixes/updates from Catalin Marinas:
       - Bug-fixes (get_user/put_user, incorrect register width for ASID,
         FPSIMD initialisation)
       - Kconfig clean-up
       - defconfig update
      
      * tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
        arm64: Remove duplicate DEBUG_STACK_USAGE config
        arm64: include VIRTIO_{MMIO,BLK} in defconfig
        arm64: include EXT4 in defconfig
        arm64: fix possible invalid FPSIMD initialization state
        arm64: use correct register width when retrieving ASID
        arm64: avoid multiple evaluation of ptr in get_user/put_user()