1. 26 10月, 2017 2 次提交
  2. 24 10月, 2017 2 次提交
  3. 19 10月, 2017 1 次提交
  4. 17 10月, 2017 7 次提交
  5. 14 10月, 2017 1 次提交
  6. 11 10月, 2017 2 次提交
    • D
      drm/i915: Use rcu instead of stop_machine in set_wedged · af7a8ffa
      Daniel Vetter 提交于
      stop_machine is not really a locking primitive we should use, except
      when the hw folks tell us the hw is broken and that's the only way to
      work around it.
      
      This patch tries to address the locking abuse of stop_machine() from
      
      commit 20e4933c
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Tue Nov 22 14:41:21 2016 +0000
      
          drm/i915: Stop the machine as we install the wedged submit_request handler
      
      Chris said parts of the reasons for going with stop_machine() was that
      it's no overhead for the fast-path. But these callbacks use irqsave
      spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.
      
      To stay as close as possible to the stop_machine semantics we first
      update all the submit function pointers to the nop handler, then call
      synchronize_rcu() to make sure no new requests can be submitted. This
      should give us exactly the huge barrier we want.
      
      I pondered whether we should annotate engine->submit_request as __rcu
      and use rcu_assign_pointer and rcu_dereference on it. But the reason
      behind those is to make sure the compiler/cpu barriers are there for
      when you have an actual data structure you point at, to make sure all
      the writes are seen correctly on the read side. But we just have a
      function pointer, and .text isn't changed, so no need for these
      barriers and hence no need for annotations.
      
      Unfortunately there's a complication with the call to
      intel_engine_init_global_seqno:
      
      - Without stop_machine we must hold the corresponding spinlock.
      
      - Without stop_machine we must ensure that all requests are marked as
        having failed with dma_fence_set_error() before we call it. That
        means we need to split the nop request submission into two phases,
        both synchronized with rcu:
      
        1. Only stop submitting the requests to hw and mark them as failed.
      
        2. After all pending requests in the scheduler/ring are suitably
        marked up as failed and we can force complete them all, also force
        complete by calling intel_engine_init_global_seqno().
      
      This should fix the followwing lockdep splat:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.14.0-rc3-CI-CI_DRM_3179+ #1 Tainted: G     U
      ------------------------------------------------------
      kworker/3:4/562 is trying to acquire lock:
       (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8113d4bc>] stop_machine+0x1c/0x40
      
      but task is already holding lock:
       (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #6 (&dev->struct_mutex){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __mutex_lock+0x86/0x9b0
             mutex_lock_interruptible_nested+0x1b/0x20
             i915_mutex_lock_interruptible+0x51/0x130 [i915]
             i915_gem_fault+0x209/0x650 [i915]
             __do_fault+0x1e/0x80
             __handle_mm_fault+0xa08/0xed0
             handle_mm_fault+0x156/0x300
             __do_page_fault+0x2c5/0x570
             do_page_fault+0x28/0x250
             page_fault+0x22/0x30
      
      -> #5 (&mm->mmap_sem){++++}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __might_fault+0x68/0x90
             _copy_to_user+0x23/0x70
             filldir+0xa5/0x120
             dcache_readdir+0xf9/0x170
             iterate_dir+0x69/0x1a0
             SyS_getdents+0xa5/0x140
             entry_SYSCALL_64_fastpath+0x1c/0xb1
      
      -> #4 (&sb->s_type->i_mutex_key#5){++++}:
             down_write+0x3b/0x70
             handle_create+0xcb/0x1e0
             devtmpfsd+0x139/0x180
             kthread+0x152/0x190
             ret_from_fork+0x27/0x40
      
      -> #3 ((complete)&req.done){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             wait_for_common+0x58/0x210
             wait_for_completion+0x1d/0x20
             devtmpfs_create_node+0x13d/0x160
             device_add+0x5eb/0x620
             device_create_groups_vargs+0xe0/0xf0
             device_create+0x3a/0x40
             msr_device_create+0x2b/0x40
             cpuhp_invoke_callback+0xc9/0xbf0
             cpuhp_thread_fun+0x17b/0x240
             smpboot_thread_fn+0x18a/0x280
             kthread+0x152/0x190
             ret_from_fork+0x27/0x40
      
      -> #2 (cpuhp_state-up){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             cpuhp_issue_call+0x133/0x1c0
             __cpuhp_setup_state_cpuslocked+0x139/0x2a0
             __cpuhp_setup_state+0x46/0x60
             page_writeback_init+0x43/0x67
             pagecache_init+0x3d/0x42
             start_kernel+0x3a8/0x3fc
             x86_64_start_reservations+0x2a/0x2c
             x86_64_start_kernel+0x6d/0x70
             verify_cpu+0x0/0xfb
      
      -> #1 (cpuhp_state_mutex){+.+.}:
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             __mutex_lock+0x86/0x9b0
             mutex_lock_nested+0x1b/0x20
             __cpuhp_setup_state_cpuslocked+0x53/0x2a0
             __cpuhp_setup_state+0x46/0x60
             page_alloc_init+0x28/0x30
             start_kernel+0x145/0x3fc
             x86_64_start_reservations+0x2a/0x2c
             x86_64_start_kernel+0x6d/0x70
             verify_cpu+0x0/0xfb
      
      -> #0 (cpu_hotplug_lock.rw_sem){++++}:
             check_prev_add+0x430/0x840
             __lock_acquire+0x1420/0x15e0
             lock_acquire+0xb0/0x200
             cpus_read_lock+0x3d/0xb0
             stop_machine+0x1c/0x40
             i915_gem_set_wedged+0x1a/0x20 [i915]
             i915_reset+0xb9/0x230 [i915]
             i915_reset_device+0x1f6/0x260 [i915]
             i915_handle_error+0x2d8/0x430 [i915]
             hangcheck_declare_hang+0xd3/0xf0 [i915]
             i915_hangcheck_elapsed+0x262/0x2d0 [i915]
             process_one_work+0x233/0x660
             worker_thread+0x4e/0x3b0
             kthread+0x152/0x190
             ret_from_fork+0x27/0x40
      
      other info that might help us debug this:
      
      Chain exists of:
        cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev->struct_mutex
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&dev->struct_mutex);
                                     lock(&mm->mmap_sem);
                                     lock(&dev->struct_mutex);
        lock(cpu_hotplug_lock.rw_sem);
      
       *** DEADLOCK ***
      
      3 locks held by kworker/3:4/562:
       #0:  ("events_long"){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
       #1:  ((&(&i915->gpu_error.hangcheck_work)->work)){+.+.}, at: [<ffffffff8109c64a>] process_one_work+0x1aa/0x660
       #2:  (&dev->struct_mutex){+.+.}, at: [<ffffffffa0136588>] i915_reset_device+0x1e8/0x260 [i915]
      
      stack backtrace:
      CPU: 3 PID: 562 Comm: kworker/3:4 Tainted: G     U          4.14.0-rc3-CI-CI_DRM_3179+ #1
      Hardware name:                  /NUC7i5BNB, BIOS BNKBL357.86A.0048.2017.0704.1415 07/04/2017
      Workqueue: events_long i915_hangcheck_elapsed [i915]
      Call Trace:
       dump_stack+0x68/0x9f
       print_circular_bug+0x235/0x3c0
       ? lockdep_init_map_crosslock+0x20/0x20
       check_prev_add+0x430/0x840
       ? irq_work_queue+0x86/0xe0
       ? wake_up_klogd+0x53/0x70
       __lock_acquire+0x1420/0x15e0
       ? __lock_acquire+0x1420/0x15e0
       ? lockdep_init_map_crosslock+0x20/0x20
       lock_acquire+0xb0/0x200
       ? stop_machine+0x1c/0x40
       ? i915_gem_object_truncate+0x50/0x50 [i915]
       cpus_read_lock+0x3d/0xb0
       ? stop_machine+0x1c/0x40
       stop_machine+0x1c/0x40
       i915_gem_set_wedged+0x1a/0x20 [i915]
       i915_reset+0xb9/0x230 [i915]
       i915_reset_device+0x1f6/0x260 [i915]
       ? gen8_gt_irq_ack+0x170/0x170 [i915]
       ? work_on_cpu_safe+0x60/0x60
       i915_handle_error+0x2d8/0x430 [i915]
       ? vsnprintf+0xd1/0x4b0
       ? scnprintf+0x3a/0x70
       hangcheck_declare_hang+0xd3/0xf0 [i915]
       ? intel_runtime_pm_put+0x56/0xa0 [i915]
       i915_hangcheck_elapsed+0x262/0x2d0 [i915]
       process_one_work+0x233/0x660
       worker_thread+0x4e/0x3b0
       kthread+0x152/0x190
       ? process_one_work+0x660/0x660
       ? kthread_create_on_node+0x40/0x40
       ret_from_fork+0x27/0x40
      Setting dangerous option reset - tainting kernel
      i915 0000:00:02.0: Resetting chip after gpu hang
      Setting dangerous option reset - tainting kernel
      i915 0000:00:02.0: Resetting chip after gpu hang
      
      v2: Have 1 global synchronize_rcu() barrier across all engines, and
      improve commit message.
      
      v3: We need to protect the seqno update with the timeline spinlock (in
      set_wedged) to avoid racing with other updates of the seqno, like we
      already do in nop_submit_request (Chris).
      
      v4: Use two-phase sequence to plug the race Chris spotted where we can
      complete requests before they're marked up with -EIO.
      
      v5: Review from Chris:
      - simplify nop_submit_request.
      - Add comment to rcu_read_lock section.
      - Align comments with the new style.
      
      v6: Remove unused variable to appease CI.
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102886
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103096
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Marta Lofstedt <marta.lofstedt@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171011091019.1425-1-daniel.vetter@ffwll.ch
      af7a8ffa
    • S
      drm/i915: Name structure in dev_priv that contains RPS/RC6 state as "gt_pm" · 562d9bae
      Sagar Arun Kamble 提交于
      Prepared substructure rps for RPS related state. autoenable_work is
      used for RC6 too hence it is defined outside rps structure. As we do
      this lot many functions are refactored to use intel_rps *rps to access
      rps related members. Hence renamed intel_rps_client pointer variables
      to rps_client in various functions.
      
      v2: Rebase.
      
      v3: s/pm/gt_pm (Chris)
      Refactored access to rps structure by declaring struct intel_rps * in
      many functions.
      Signed-off-by: NSagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com> #1
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/1507360055-19948-9-git-send-email-sagar.a.kamble@intel.comAcked-by: NImre Deak <imre.deak@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171010213010.7415-8-chris@chris-wilson.co.uk
      562d9bae
  7. 10 10月, 2017 6 次提交
  8. 07 10月, 2017 7 次提交
  9. 29 9月, 2017 1 次提交
  10. 25 9月, 2017 1 次提交
  11. 22 9月, 2017 1 次提交
  12. 18 9月, 2017 1 次提交
  13. 14 9月, 2017 1 次提交
    • M
      mm: treewide: remove GFP_TEMPORARY allocation flag · 0ee931c4
      Michal Hocko 提交于
      GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived
      and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  It's
      primary motivation was to allow users to tell that an allocation is
      short lived and so the allocator can try to place such allocations close
      together and prevent long term fragmentation.  As much as this sounds
      like a reasonable semantic it becomes much less clear when to use the
      highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
      context holding that memory sleep? Can it take locks? It seems there is
      no good answer for those questions.
      
      The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
      __GFP_RECLAIMABLE which in itself is tricky because basically none of
      the existing caller provide a way to reclaim the allocated memory.  So
      this is rather misleading and hard to evaluate for any benefits.
      
      I have checked some random users and none of them has added the flag
      with a specific justification.  I suspect most of them just copied from
      other existing users and others just thought it might be a good idea to
      use without any measuring.  This suggests that GFP_TEMPORARY just
      motivates for cargo cult usage without any reasoning.
      
      I believe that our gfp flags are quite complex already and especially
      those with highlevel semantic should be clearly defined to prevent from
      confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
      replace all existing users to simply use GFP_KERNEL.  Please note that
      SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
      so they will be placed properly for memory fragmentation prevention.
      
      I can see reasons we might want some gfp flag to reflect shorterm
      allocations but I propose starting from a clear semantic definition and
      only then add users with proper justification.
      
      This was been brought up before LSF this year by Matthew [1] and it
      turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
      seems to be a heuristic without any measured advantage for most (if not
      all) its current users.  The follow up discussion has revealed that
      opinions on what might be temporary allocation differ a lot between
      developers.  So rather than trying to tweak existing users into a
      semantic which they haven't expected I propose to simply remove the flag
      and start from scratch if we really need a semantic for short term
      allocations.
      
      [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
      
      [akpm@linux-foundation.org: fix typo]
      [akpm@linux-foundation.org: coding-style fixes]
      [sfr@canb.auug.org.au: drm/i915: fix up]
        Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ee931c4
  14. 08 9月, 2017 2 次提交
  15. 07 9月, 2017 3 次提交
  16. 06 9月, 2017 1 次提交
  17. 05 9月, 2017 1 次提交
    • V
      drm/i915: io unmap functions want __iomem · afe722be
      Ville Syrjälä 提交于
      Don't cast away the __iomem from the io_mapping functions so that
      sparse won't be so unhappy when we pass the pointer to the unmap
      functions. Instead let's move the cast to where we actually use the
      pointer.
      
      Fixes the following sparse warnings:
      i915_gem.c:1022:33: warning: incorrect type in argument 1 (different address spaces)
      i915_gem.c:1022:33:    expected void [noderef] <asn:2>*vaddr
      i915_gem.c:1022:33:    got void *[assigned] vaddr
      i915_gem.c:1027:34: warning: incorrect type in argument 1 (different address spaces)
      i915_gem.c:1027:34:    expected void [noderef] <asn:2>*vaddr
      i915_gem.c:1027:34:    got void *[assigned] vaddr
      i915_gem.c:1199:33: warning: incorrect type in argument 1 (different address spaces)
      i915_gem.c:1199:33:    expected void [noderef] <asn:2>*vaddr
      i915_gem.c:1199:33:    got void *[assigned] vaddr
      i915_gem.c:1204:34: warning: incorrect type in argument 1 (different address spaces)
      i915_gem.c:1204:34:    expected void [noderef] <asn:2>*vaddr
      i915_gem.c:1204:34:    got void *[assigned] vaddr
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20170901171252.31025-2-ville.syrjala@linux.intel.comReviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      afe722be