1. 19 2月, 2016 2 次提交
    • M
      drm/atomic: Allow for holes in connector state, v2. · 5fff80bb
      Maarten Lankhorst 提交于
      Because we record connector_mask using 1 << drm_connector_index now
      the connector_mask should stay the same even when other connectors
      are removed. This was not the case with MST, in that case when removing
      a connector all other connectors may change their index.
      
      This is fixed by waiting until the first get_connector_state to allocate
      connector_state, and force reallocation when state is too small.
      
      As a side effect connector arrays no longer have to be preallocated,
      and can be allocated on first use which means a less allocations in
      the page flip only path.
      
      Changes since v1:
      - Whitespace. (Ville)
      - Call ida_remove when destroying the connector. (Ville)
      - u32 alloc -> int. (Ville)
      
      Fixes: 14de6c44 ("drm/atomic: Remove drm_atomic_connectors_for_crtc.")
      Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NLyude <cpaul@redhat.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      5fff80bb
    • J
      Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread" · 13d34ac6
      Jeff Layton 提交于
      This reverts commit c510eff6 ("fsnotify: destroy marks with
      call_srcu instead of dedicated thread").
      
      Eryu reported that he was seeing some OOM kills kick in when running a
      testcase that adds and removes inotify marks on a file in a tight loop.
      
      The above commit changed the code to use call_srcu to clean up the
      marks.  While that does (in principle) work, the srcu callback job is
      limited to cleaning up entries in small batches and only once per jiffy.
      It's easily possible to overwhelm that machinery with too many call_srcu
      callbacks, and Eryu's reproduer did just that.
      
      There's also another potential problem with using call_srcu here.  While
      you can obviously sleep while holding the srcu_read_lock, the callbacks
      run under local_bh_disable, so you can't sleep there.
      
      It's possible when putting the last reference to the fsnotify_mark that
      we'll end up putting a chain of references including the fsnotify_group,
      uid, and associated keys.  While I don't see any obvious ways that that
      could occurs, it's probably still best to avoid using call_srcu here
      after all.
      
      This patch reverts the above patch.  A later patch will take a different
      approach to eliminated the dedicated thread here.
      Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
      Reported-by: NEryu Guan <guaneryu@gmail.com>
      Tested-by: NEryu Guan <guaneryu@gmail.com>
      Cc: Jan Kara <jack@suse.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      13d34ac6
  2. 18 2月, 2016 1 次提交
  3. 16 2月, 2016 2 次提交
    • A
      tracing: Fix freak link error caused by branch tracer · b33c8ff4
      Arnd Bergmann 提交于
      In my randconfig tests, I came across a bug that involves several
      components:
      
      * gcc-4.9 through at least 5.3
      * CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files
      * CONFIG_PROFILE_ALL_BRANCHES overriding every if()
      * The optimized implementation of do_div() that tries to
        replace a library call with an division by multiplication
      * code in drivers/media/dvb-frontends/zl10353.c doing
      
              u32 adc_clock = 450560; /* 45.056 MHz */
              if (state->config.adc_clock)
                      adc_clock = state->config.adc_clock;
              do_div(value, adc_clock);
      
      In this case, gcc fails to determine whether the divisor
      in do_div() is __builtin_constant_p(). In particular, it
      concludes that __builtin_constant_p(adc_clock) is false, while
      __builtin_constant_p(!!adc_clock) is true.
      
      That in turn throws off the logic in do_div() that also uses
      __builtin_constant_p(), and instead of picking either the
      constant- optimized division, and the code in ilog2() that uses
      __builtin_constant_p() to figure out whether it knows the answer at
      compile time. The result is a link error from failing to find
      multiple symbols that should never have been called based on
      the __builtin_constant_p():
      
      dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN'
      dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod'
      ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined!
      ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined!
      
      This patch avoids the problem by changing __trace_if() to check
      whether the condition is known at compile-time to be nonzero, rather
      than checking whether it is actually a constant.
      
      I see this one link error in roughly one out of 1600 randconfig builds
      on ARM, and the patch fixes all known instances.
      
      Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.deAcked-by: NNicolas Pitre <nico@linaro.org>
      Fixes: ab3c9c68 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y")
      Cc: stable@vger.kernel.org # v2.6.30+
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b33c8ff4
    • S
      tracepoints: Do not trace when cpu is offline · f3775549
      Steven Rostedt (Red Hat) 提交于
      The tracepoint infrastructure uses RCU sched protection to enable and
      disable tracepoints safely. There are some instances where tracepoints are
      used in infrastructure code (like kfree()) that get called after a CPU is
      going offline, and perhaps when it is coming back online but hasn't been
      registered yet.
      
      This can probuce the following warning:
      
       [ INFO: suspicious RCU usage. ]
       4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
       -------------------------------
       include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
      
       other info that might help us debug this:
      
       RCU used illegally from offline CPU!  rcu_scheduler_active = 1, debug_locks = 1
       no locks held by swapper/8/0.
      
       stack backtrace:
        CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S              4.4.0-00006-g0fe53e8-dirty #34
        Call Trace:
        [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
        [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
        [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
        [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
        [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
        [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
        [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
        [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
        [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
        [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
        [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
        [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
      
      This warning is not a false positive either. RCU is not protecting code that
      is being executed while the CPU is offline.
      
      Instead of playing "whack-a-mole(TM)" and adding conditional statements to
      the tracepoints we find that are used in this instance, simply add a
      cpu_online() test to the tracepoint code where the tracepoint will be
      ignored if the CPU is offline.
      
      Use of raw_smp_processor_id() is fine, as there should never be a case where
      the tracepoint code goes from running on a CPU that is online and suddenly
      gets migrated to a CPU that is offline.
      
      Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.orgReported-by: NDenis Kirjanov <kda@linux-powerpc.org>
      Fixes: 97e1c18e ("tracing: Kernel Tracepoints")
      Cc: stable@vger.kernel.org # v2.6.28+
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f3775549
  4. 15 2月, 2016 2 次提交
    • D
      iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts · 46924008
      David Woodhouse 提交于
      According to the VT-d specification we need to clear the PPR bit in
      the Page Request Status register when handling page requests, or the
      hardware won't generate any more interrupts.
      
      This wasn't actually necessary on SKL/KBL (which may well be the
      subject of a hardware erratum, although it's harmless enough). But
      other implementations do appear to get it right, and we only ever get
      one interrupt unless we clear the PPR bit.
      Reported-by: NCQ Tang <cq.tang@intel.com>
      Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>
      Cc: stable@vger.kernel.org
      46924008
    • A
      powerpc/mm: Fix Multi hit ERAT cause by recent THP update · c777e2a8
      Aneesh Kumar K.V 提交于
      With ppc64 we use the deposited pgtable_t to store the hash pte slot
      information. We should not withdraw the deposited pgtable_t without
      marking the pmd none. This ensure that low level hash fault handling
      will skip this huge pte and we will handle them at upper levels.
      
      Recent change to pmd splitting changed the above in order to handle the
      race between pmd split and exit_mmap. The race is explained below.
      
      Consider following race:
      
      		CPU0				CPU1
      shrink_page_list()
        add_to_swap()
          split_huge_page_to_list()
            __split_huge_pmd_locked()
              pmdp_huge_clear_flush_notify()
      	// pmd_none() == true
      					exit_mmap()
      					  unmap_vmas()
      					    zap_pmd_range()
      					      // no action on pmd since pmd_none() == true
      	pmd_populate()
      
      As result the THP will not be freed. The leak is detected by check_mm():
      
      	BUG: Bad rss-counter state mm:ffff880058d2e580 idx:1 val:512
      
      The above required us to not mark pmd none during a pmd split.
      
      The fix for ppc is to clear the huge pte of _PAGE_USER, so that low
      level fault handling code skip this pte. At higher level we do take ptl
      lock. That should serialze us against the pmd split. Once the lock is
      acquired we do check the pmd again using pmd_same. That should always
      return false for us and hence we should retry the access. We do the
      pmd_same check in all case after taking plt with
      THP (do_huge_pmd_wp_page, do_huge_pmd_numa_page and
      huge_pmd_set_accessed)
      
      Also make sure we wait for irq disable section in other cpus to finish
      before flipping a huge pte entry with a regular pmd entry. Code paths
      like find_linux_pte_or_hugepte depend on irq disable to get
      a stable pte_t pointer. A parallel thp split need to make sure we
      don't convert a pmd pte to a regular pmd entry without waiting for the
      irq disable section to finish.
      
      Fixes: eef1b3ba ("thp: implement split_huge_pmd()")
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c777e2a8
  5. 12 2月, 2016 2 次提交
  6. 11 2月, 2016 3 次提交
  7. 10 2月, 2016 2 次提交
  8. 09 2月, 2016 2 次提交
  9. 08 2月, 2016 1 次提交
  10. 07 2月, 2016 2 次提交
    • H
      pty: make sure super_block is still valid in final /dev/tty close · 1f55c718
      Herton R. Krzesinski 提交于
      Considering current pty code and multiple devpts instances, it's possible
      to umount a devpts file system while a program still has /dev/tty opened
      pointing to a previosuly closed pty pair in that instance. In the case all
      ptmx and pts/N files are closed, umount can be done. If the program closes
      /dev/tty after umount is done, devpts_kill_index will use now an invalid
      super_block, which was already destroyed in the umount operation after
      running ->kill_sb. This is another "use after free" type of issue, but now
      related to the allocated super_block instance.
      
      To avoid the problem (warning at ida_remove and potential crashes) for
      this specific case, I added two functions in devpts which grabs additional
      references to the super_block, which pty code now uses so it makes sure
      the super block structure is still valid until pty shutdown is done.
      I also moved the additional inode references to the same functions, which
      also covered similar case with inode being freed before /dev/tty final
      close/shutdown.
      Signed-off-by: NHerton R. Krzesinski <herton@redhat.com>
      Cc: stable@vger.kernel.org # 2.6.29+
      Reviewed-by: NPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f55c718
    • N
      target: Drop legacy se_cmd->task_stop_comp + REQUEST_STOP usage · 57dae190
      Nicholas Bellinger 提交于
      With CMD_T_FABRIC_STOP + se_cmd->cmd_wait_set usage in place,
      go ahead and drop left-over CMD_T_REQUEST_STOP checks in
      target_complete_cmd() and unused target_stop_cmd().
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Cc: Quinn Tran <quinn.tran@qlogic.com>
      Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Andy Grover <agrover@redhat.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
      57dae190
  11. 06 2月, 2016 4 次提交
    • K
      radix-tree: fix oops after radix_tree_iter_retry · 73204282
      Konstantin Khlebnikov 提交于
      Helper radix_tree_iter_retry() resets next_index to the current index.
      In following radix_tree_next_slot current chunk size becomes zero.  This
      isn't checked and it tries to dereference null pointer in slot.
      
      Tagged iterator is fine because retry happens only at slot 0 where tag
      bitmask in iter->tags is filled with single bit.
      
      Fixes: 46437f9a ("radix-tree: fix race in gang lookup")
      Signed-off-by: NKonstantin Khlebnikov <koct9i@gmail.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ohad Ben-Cohen <ohad@wizery.com>
      Cc: Jeremiah Mahler <jmmahler@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      73204282
    • K
      mm: replace vma_lock_anon_vma with anon_vma_lock_read/write · 12352d3c
      Konstantin Khlebnikov 提交于
      Sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
      anon_vma appeared between lock and unlock.  We have to check anon_vma
      first or call anon_vma_prepare() to be sure that it's here.  There are
      only few users of these legacy helpers.  Let's get rid of them.
      
      This patch fixes anon_vma lock imbalance in validate_mm().  Write lock
      isn't required here, read lock is enough.
      
      And reorders expand_downwards/expand_upwards: security_mmap_addr() and
      wrapping-around check don't have to be under anon vma lock.
      
      Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.comSigned-off-by: NKonstantin Khlebnikov <koct9i@gmail.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      12352d3c
    • V
      mm, hugetlb: don't require CMA for runtime gigantic pages · 080fe206
      Vlastimil Babka 提交于
      Commit 944d9fec ("hugetlb: add support for gigantic page allocation
      at runtime") has added the runtime gigantic page allocation via
      alloc_contig_range(), making this support available only when CONFIG_CMA
      is enabled.  Because it doesn't depend on MIGRATE_CMA pageblocks and the
      associated infrastructure, it is possible with few simple adjustments to
      require only CONFIG_MEMORY_ISOLATION instead of full CONFIG_CMA.
      
      After this patch, alloc_contig_range() and related functions are
      available and used for gigantic pages with just CONFIG_MEMORY_ISOLATION
      enabled.  Note CONFIG_CMA selects CONFIG_MEMORY_ISOLATION.  This allows
      supporting runtime gigantic pages without the CMA-specific checks in
      page allocator fastpaths.
      Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      080fe206
    • N
      target: Fix remote-port TMR ABORT + se_cmd fabric stop · 0f4a9431
      Nicholas Bellinger 提交于
      To address the bug where fabric driver level shutdown
      of se_cmd occurs at the same time when TMR CMD_T_ABORTED
      is happening resulting in a -1 ->cmd_kref, this patch
      adds a CMD_T_FABRIC_STOP bit that is used to determine
      when TMR + driver I_T nexus shutdown is happening
      concurrently.
      
      It changes target_sess_cmd_list_set_waiting() to obtain
      se_cmd->cmd_kref + set CMD_T_FABRIC_STOP, and drop local
      reference in target_wait_for_sess_cmds() and invoke extra
      target_put_sess_cmd() during Task Aborted Status (TAS)
      when necessary.
      
      Also, it adds a new target_wait_free_cmd() wrapper around
      transport_wait_for_tasks() for the special case within
      transport_generic_free_cmd() to set CMD_T_FABRIC_STOP,
      and is now aware of CMD_T_ABORTED + CMD_T_TAS status
      bits to know when an extra transport_put_cmd() during
      TAS is required.
      
      Note transport_generic_free_cmd() is expected to block on
      cmd->cmd_wait_comp in order to follow what iscsi-target
      expects during iscsi_conn context se_cmd shutdown.
      
      Cc: Quinn Tran <quinn.tran@qlogic.com>
      Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Andy Grover <agrover@redhat.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: NNicholas Bellinger <nab@daterainc.com>
      0f4a9431
  12. 05 2月, 2016 7 次提交
  13. 04 2月, 2016 7 次提交
    • H
      [media] vb2: fix nasty vb2_thread regression · fac710e4
      Hans Verkuil 提交于
      The vb2_thread implementation was made generic and was moved from
      videobuf2-v4l2.c to videobuf2-core.c in commit af3bac1a. Unfortunately
      that clearly was never tested since it broke read() causing NULL address
      references.
      
      The root cause was confused handling of vb2_buffer vs v4l2_buffer (the pb
      pointer in various core functions).
      
      The v4l2_buffer no longer exists after moving the code into the core and
      it is no longer needed. However, the vb2_thread code passed a pointer to
      a vb2_buffer to the core functions were a v4l2_buffer pointer was expected
      and vb2_thread expected that the vb2_buffer fields would be filled in
      correctly.
      
      This is obviously wrong since v4l2_buffer != vb2_buffer. Note that the
      pb pointer is a void pointer, so no type-checking took place.
      
      This patch fixes this problem:
      
      1) allow pb to be NULL for vb2_core_(d)qbuf. The vb2_thread code will use
         a NULL pointer here since they don't care about v4l2_buffer anyway.
      2) let vb2_core_dqbuf pass back the index of the received buffer. This is
         all vb2_thread needs: this index is the index into the q->bufs array
         and vb2_thread just gets the vb2_buffer from there.
      3) the fileio->b pointer (that originally contained a v4l2_buffer) is
         removed altogether since it is no longer needed.
      
      Tested with vivid and the cobalt driver.
      
      Cc: stable@vger.kernel.org # Kernel >= 4.3
      Signed-off-by: NHans Verkuil <hans.verkuil@cisco.com>
      Reported-by: NMatthias Schwarzott <zzam@gentoo.org>
      Signed-off-by: NMauro Carvalho Chehab <mchehab@osg.samsung.com>
      fac710e4
    • N
      target: Fix LUN_RESET active I/O handling for ACK_KREF · febe562c
      Nicholas Bellinger 提交于
      This patch fixes a NULL pointer se_cmd->cmd_kref < 0
      refcount bug during TMR LUN_RESET with active se_cmd
      I/O, that can be triggered during se_cmd descriptor
      shutdown + release via core_tmr_drain_state_list() code.
      
      To address this bug, add common __target_check_io_state()
      helper for ABORT_TASK + LUN_RESET w/ CMD_T_COMPLETE
      checking, and set CMD_T_ABORTED + obtain ->cmd_kref for
      both cases ahead of last target_put_sess_cmd() after
      TFO->aborted_task() -> transport_cmd_finish_abort()
      callback has completed.
      
      It also introduces SCF_ACK_KREF to determine when
      transport_cmd_finish_abort() needs to drop the second
      extra reference, ahead of calling target_put_sess_cmd()
      for the final kref_put(&se_cmd->cmd_kref).
      
      It also updates transport_cmd_check_stop() to avoid
      holding se_cmd->t_state_lock while dropping se_cmd
      device state via target_remove_from_state_list(), now
      that core_tmr_drain_state_list() is holding the
      se_device lock while checking se_cmd state from
      within TMR logic.
      
      Finally, move transport_put_cmd() release of SGL +
      TMR + extended CDB memory into target_free_cmd_mem()
      in order to avoid potential resource leaks in TMR
      ABORT_TASK + LUN_RESET code-paths.  Also update
      target_release_cmd_kref() accordingly.
      Reviewed-by: NQuinn Tran <quinn.tran@qlogic.com>
      Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Andy Grover <agrover@redhat.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
      febe562c
    • M
      radix-tree: fix race in gang lookup · 46437f9a
      Matthew Wilcox 提交于
      If the indirect_ptr bit is set on a slot, that indicates we need to redo
      the lookup.  Introduce a new function radix_tree_iter_retry() which
      forces the loop to retry the lookup by setting 'slot' to NULL and
      turning the iterator back to point at the problematic entry.
      
      This is a pretty rare problem to hit at the moment; the lookup has to
      race with a grow of the radix tree from a height of 0.  The consequences
      of hitting this race are that gang lookup could return a pointer to a
      radix_tree_node instead of a pointer to whatever the user had inserted
      in the tree.
      
      Fixes: cebbd29e ("radix-tree: rewrite gang lookup using iterator")
      Signed-off-by: NMatthew Wilcox <willy@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ohad Ben-Cohen <ohad@wizery.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      46437f9a
    • K
      mm: polish virtual memory accounting · 30bdbb78
      Konstantin Khlebnikov 提交于
      * add VM_STACK as alias for VM_GROWSUP/DOWN depending on architecture
      * always account VMAs with flag VM_STACK as stack (as it was before)
      * cleanup classifying helpers
      * update comments and documentation
      Signed-off-by: NKonstantin Khlebnikov <koct9i@gmail.com>
      Tested-by: NSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      30bdbb78
    • J
      mm: memcontrol: drop superfluous entry in the per-memcg stats array · c792e824
      Johannes Weiner 提交于
      MEM_CGROUP_STAT_NSTATS is just a delimiter for cgroup1 statistics, not
      an actual array entry.  Reuse it for the first cgroup2 stat entry, like
      in the event array.
      
      Fixes: b2807f07 ("mm: memcontrol: add "sock" to cgroup2 memory.stat")
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c792e824
    • J
      proc: revert /proc/<pid>/maps [stack:TID] annotation · 65376df5
      Johannes Weiner 提交于
      Commit b7643757 ("procfs: mark thread stack correctly in
      proc/<pid>/maps") added [stack:TID] annotation to /proc/<pid>/maps.
      
      Finding the task of a stack VMA requires walking the entire thread list,
      turning this into quadratic behavior: a thousand threads means a
      thousand stacks, so the rendering of /proc/<pid>/maps needs to look at a
      million combinations.
      
      The cost is not in proportion to the usefulness as described in the
      patch.
      
      Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
      /proc/<pid>/numa_maps) usable again for higher thread counts.
      
      The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained, as
      identifying the stack VMA there is an O(1) operation.
      
      Siddesh said:
       "The end users needed a way to identify thread stacks programmatically and
        there wasn't a way to do that.  I'm afraid I no longer remember (or have
        access to the resources that would aid my memory since I changed
        employers) the details of their requirement.  However, I did do this on my
        own time because I thought it was an interesting project for me and nobody
        really gave any feedback then as to its utility, so as far as I am
        concerned you could roll back the main thread maps information since the
        information is available in the thread-specific files"
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
      Cc: Shaohua Li <shli@fb.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      65376df5
    • K
      thp: make split_queue per-node · a3d0a918
      Kirill A. Shutemov 提交于
      Andrea Arcangeli suggested to make split queue per-node to improve
      scalability.  Let's do it.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Suggested-by: NAndrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a3d0a918
  14. 03 2月, 2016 3 次提交
    • T
      ALSA: rawmidi: Make snd_rawmidi_transmit() race-free · 06ab3003
      Takashi Iwai 提交于
      A kernel WARNING in snd_rawmidi_transmit_ack() is triggered by
      syzkaller fuzzer:
        WARNING: CPU: 1 PID: 20739 at sound/core/rawmidi.c:1136
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff82999e2d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
       [<ffffffff81352089>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:482
       [<ffffffff813522b9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:515
       [<ffffffff84f80bd5>] snd_rawmidi_transmit_ack+0x275/0x400 sound/core/rawmidi.c:1136
       [<ffffffff84fdb3c1>] snd_virmidi_output_trigger+0x4b1/0x5a0 sound/core/seq/seq_virmidi.c:163
       [<     inline     >] snd_rawmidi_output_trigger sound/core/rawmidi.c:150
       [<ffffffff84f87ed9>] snd_rawmidi_kernel_write1+0x549/0x780 sound/core/rawmidi.c:1223
       [<ffffffff84f89fd3>] snd_rawmidi_write+0x543/0xb30 sound/core/rawmidi.c:1273
       [<ffffffff817b0323>] __vfs_write+0x113/0x480 fs/read_write.c:528
       [<ffffffff817b1db7>] vfs_write+0x167/0x4a0 fs/read_write.c:577
       [<     inline     >] SYSC_write fs/read_write.c:624
       [<ffffffff817b50a1>] SyS_write+0x111/0x220 fs/read_write.c:616
       [<ffffffff86336c36>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185
      
      Also a similar warning is found but in another path:
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff82be2c0d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
       [<ffffffff81355139>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:482
       [<ffffffff81355369>] warn_slowpath_null+0x29/0x30 kernel/panic.c:515
       [<ffffffff8527e69a>] rawmidi_transmit_ack+0x24a/0x3b0 sound/core/rawmidi.c:1133
       [<ffffffff8527e851>] snd_rawmidi_transmit_ack+0x51/0x80 sound/core/rawmidi.c:1163
       [<ffffffff852d9046>] snd_virmidi_output_trigger+0x2b6/0x570 sound/core/seq/seq_virmidi.c:185
       [<     inline     >] snd_rawmidi_output_trigger sound/core/rawmidi.c:150
       [<ffffffff85285a0b>] snd_rawmidi_kernel_write1+0x4bb/0x760 sound/core/rawmidi.c:1252
       [<ffffffff85287b73>] snd_rawmidi_write+0x543/0xb30 sound/core/rawmidi.c:1302
       [<ffffffff817ba5f3>] __vfs_write+0x113/0x480 fs/read_write.c:528
       [<ffffffff817bc087>] vfs_write+0x167/0x4a0 fs/read_write.c:577
       [<     inline     >] SYSC_write fs/read_write.c:624
       [<ffffffff817bf371>] SyS_write+0x111/0x220 fs/read_write.c:616
       [<ffffffff86660276>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185
      
      In the former case, the reason is that virmidi has an open code
      calling snd_rawmidi_transmit_ack() with the value calculated outside
      the spinlock.   We may use snd_rawmidi_transmit() in a loop just for
      consuming the input data, but even there, there is a race between
      snd_rawmidi_transmit_peek() and snd_rawmidi_tranmit_ack().
      
      Similarly in the latter case, it calls snd_rawmidi_transmit_peek() and
      snd_rawmidi_tranmit_ack() separately without protection, so they are
      racy as well.
      
      The patch tries to address these issues by the following ways:
      - Introduce the unlocked versions of snd_rawmidi_transmit_peek() and
        snd_rawmidi_transmit_ack() to be called inside the explicit lock.
      - Rewrite snd_rawmidi_transmit() to be race-free (the former case).
      - Make the split calls (the latter case) protected in the rawmidi spin
        lock.
      
      BugLink: http://lkml.kernel.org/r/CACT4Y+YPq1+cYLkadwjWa5XjzF1_Vki1eHnVn-Lm0hzhSpu5PA@mail.gmail.com
      BugLink: http://lkml.kernel.org/r/CACT4Y+acG4iyphdOZx47Nyq_VHGbpJQK-6xNpiqUjaZYqsXOGw@mail.gmail.comReported-by: NDmitry Vyukov <dvyukov@google.com>
      Tested-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      06ab3003
    • R
      efi: Add NV memory attribute · c016ca08
      Robert Elliott 提交于
      Add the NV memory attribute introduced in UEFI 2.5 and add a
      column for it in the types and attributes string used when
      printing the UEFI memory map.
      
      old:
        efi: mem61: [type=14            |   |  |  |  |  |  | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)
      
      new:
        efi: mem61: [type=14            |   |  |NV|  |  |  |  | |WB|WT|WC|UC] range=[0x0000000880000000-0x0000000c7fffffff) (16384MB)
      Signed-off-by: NRobert Elliott <elliott@hpe.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: NLaszlo Ersek <lersek@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1454364428-494-13-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c016ca08
    • A
      efi: Add nonblocking option to efi_query_variable_store() · ca0e30dc
      Ard Biesheuvel 提交于
      The function efi_query_variable_store() may be invoked by
      efivar_entry_set_nonblocking(), which itself takes care to only
      call a non-blocking version of the SetVariable() runtime
      wrapper. However, efi_query_variable_store() may call the
      SetVariable() wrapper directly, as well as the wrapper for
      QueryVariableInfo(), both of which could deadlock in the same
      way we are trying to prevent by calling
      efivar_entry_set_nonblocking() in the first place.
      
      So instead, modify efi_query_variable_store() to use the
      non-blocking variants of QueryVariableInfo() (and give up rather
      than free up space if the available space is below
      EFI_MIN_RESERVE) if invoked with the 'nonblocking' argument set
      to true.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1454364428-494-5-git-send-email-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ca0e30dc