1. 13 3月, 2018 1 次提交
  2. 09 3月, 2018 4 次提交
  3. 03 3月, 2018 2 次提交
    • M
      drm/i915/uc: Introduce intel_uc_suspend|resume · 7cfca4af
      Michal Wajdeczko 提交于
      We want to use higher level 'uc' functions as the main entry points to
      the GuC/HuC code to hide some details and keep code layered.
      
      While here, move call to disable_guc_interrupts after sending suspend
      action to the GuC to allow it work also with CTB as comm mechanism.
      
      v2: update commit msg (Sagar)
      Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NSagar Arun Kamble <sagar.a.kamble@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180302111550.21328-1-michal.wajdeczko@intel.com
      7cfca4af
    • C
      drm/i915: Suspend submission tasklets around wedging · 963ddd63
      Chris Wilson 提交于
      After staring hard at sequences like
      
      [   28.199013]  systemd-1       2..s. 26062228us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0?], tail=1 [1?]
      [   28.199095]  systemd-1       2..s. 26062229us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000, active=0x1
      [   28.199177]  systemd-1       2..s. 26062230us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.1, seqno=3, prio=-1024
      [   28.199258]  systemd-1       2..s. 26062231us : execlists_submission_tasklet: rcs0 completed ctx=0
      [   28.199340]  gem_eio-829     1..s1 26066853us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.1, seqno=1, prio=0
      [   28.199421]   <idle>-0       2..s. 26066863us : execlists_submission_tasklet: rcs0 cs-irq head=1 [1?], tail=2 [2?]
      [   28.199503]   <idle>-0       2..s. 26066865us : execlists_submission_tasklet: rcs0 csb[2]: status=0x00000001:0x00000000, active=0x1
      [   28.199585]  gem_eio-829     1..s1 26067077us : execlists_submission_tasklet: rcs0 in[1]:  ctx=3.1, seqno=2, prio=0
      [   28.199667]  gem_eio-829     1..s1 26067078us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.2, seqno=1, prio=0
      [   28.199749]   <idle>-0       2..s. 26067084us : execlists_submission_tasklet: rcs0 cs-irq head=2 [2?], tail=3 [3?]
      [   28.199830]   <idle>-0       2..s. 26067085us : execlists_submission_tasklet: rcs0 csb[3]: status=0x00008002:0x00000001, active=0x1
      [   28.199912]   <idle>-0       2..s. 26067086us : execlists_submission_tasklet: rcs0 out[0]: ctx=1.2, seqno=1, prio=0
      [   28.199994]  gem_eio-829     2..s. 28246084us : execlists_submission_tasklet: rcs0 cs-irq head=3 [3?], tail=4 [4?]
      [   28.200096]  gem_eio-829     2..s. 28246088us : execlists_submission_tasklet: rcs0 csb[4]: status=0x00000014:0x00000001, active=0x5
      [   28.200178]  gem_eio-829     2..s. 28246089us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0, prio=0
      [   28.200260]  gem_eio-829     2..s. 28246127us : execlists_submission_tasklet: execlists_submission_tasklet:886 GEM_BUG_ON(buf[2 * head + 1] != port->context_id)
      
      the conclusion is that the only place where the ports are reset to zero,
      is from engine->cancel_requests called during i915_gem_set_wedged().
      
      The race is horrible as it results from calling set-wedged on active HW
      (the GPU reset failed) and as such we need to be careful as the HW state
      changes beneath us. Fortunately, it's the same scary conditions as
      affect normal reset, so we can reuse the same machinery to disable state
      tracking as we clobber it.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104945Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Michel Thierry <michel.thierry@intel.com>
      Fixes: af7a8ffa ("drm/i915: Use rcu instead of stop_machine in set_wedged")
      Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180302113324.23189-2-chris@chris-wilson.co.uk
      963ddd63
  4. 02 3月, 2018 1 次提交
  5. 22 2月, 2018 1 次提交
  6. 21 2月, 2018 2 次提交
    • C
      drm/i915: Move the policy for placement of the GGTT vma into the caller · 5935485f
      Chris Wilson 提交于
      Currently we make the unilateral decision inside
      i915_gem_object_pin_to_display() where the VMA should resided (inside
      the fence and mappable region or above?). This is not our decision to
      make as it impacts on how the display engine can use the resulting
      scanout object, and it would rather instruct us where to place the VMA so
      that it can enable the features it wants. As such, make the pin flags an
      argument to i915_gem_object_pin_to_display() and control them from
      intel_pin_and_fence_fb_obj()
      
      Whilst taking control of the mapping for ourselves, start tracking how
      we use it to avoid trying to free a fence we never claimed:
      
      <3>[  227.151869] GEM_BUG_ON(vma->fence->pin_count <= 0)
      <4>[  227.152064] ------------[ cut here ]------------
      <2>[  227.152068] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:391!
      <4>[  227.152084] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      <0>[  227.152092] Dumping ftrace buffer:
      <0>[  227.152099]    (ftrace buffer empty)
      <4>[  227.152102] Modules linked in: i915 snd_hda_codec_analog snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm lpc_ich e1000e mei_me mei prime_numbers
      <4>[  227.152131] CPU: 1 PID: 1587 Comm: kworker/u16:49 Tainted: G     U           4.16.0-rc1-gbab67b2f6177-kasan_7+ #1
      <4>[  227.152134] Hardware name: Dell Inc. OptiPlex 755                 /0PU052, BIOS A08 02/19/2008
      <4>[  227.152236] Workqueue: events_unbound intel_atomic_commit_work [i915]
      <4>[  227.152292] RIP: 0010:intel_unpin_fb_vma+0x23a/0x2a0 [i915]
      <4>[  227.152295] RSP: 0018:ffff88005aad7b68 EFLAGS: 00010286
      <4>[  227.152300] RAX: 0000000000000026 RBX: ffff88005c359580 RCX: 0000000000000000
      <4>[  227.152304] RDX: 0000000000000026 RSI: ffffffff8707d840 RDI: ffffed000b55af63
      <4>[  227.152307] RBP: ffff880056817e58 R08: 0000000000000001 R09: 0000000000000000
      <4>[  227.152311] R10: ffff88005aad7b88 R11: 0000000000000000 R12: ffff8800568184d0
      <4>[  227.152314] R13: ffff880065b5ab08 R14: 0000000000000000 R15: dffffc0000000000
      <4>[  227.152318] FS:  0000000000000000(0000) GS:ffff88006ac40000(0000) knlGS:0000000000000000
      <4>[  227.152322] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[  227.152325] CR2: 00007f5fb25550a8 CR3: 0000000068c78000 CR4: 00000000000006e0
      <4>[  227.152328] Call Trace:
      <4>[  227.152385]  intel_cleanup_plane_fb+0x6b/0xd0 [i915]
      <4>[  227.152395]  drm_atomic_helper_cleanup_planes+0x166/0x280
      <4>[  227.152452]  intel_atomic_commit_tail+0x159d/0x3380 [i915]
      <4>[  227.152463]  ? process_one_work+0x66e/0x1460
      <4>[  227.152516]  ? skl_update_crtcs+0x9c0/0x9c0 [i915]
      <4>[  227.152523]  ? lock_acquire+0x13d/0x390
      <4>[  227.152527]  ? lock_acquire+0x13d/0x390
      <4>[  227.152534]  process_one_work+0x71a/0x1460
      <4>[  227.152540]  ? __schedule+0x815/0x1e20
      <4>[  227.152547]  ? pwq_dec_nr_in_flight+0x2b0/0x2b0
      <4>[  227.152553]  ? _raw_spin_lock_irq+0xa/0x40
      <4>[  227.152559]  worker_thread+0xdf/0xf60
      <4>[  227.152569]  ? process_one_work+0x1460/0x1460
      <4>[  227.152573]  kthread+0x2cf/0x3c0
      <4>[  227.152578]  ? _kthread_create_on_node+0xa0/0xa0
      <4>[  227.152583]  ret_from_fork+0x3a/0x50
      <4>[  227.152591] Code: c6 00 11 86 c0 48 c7 c7 e0 bd 85 c0 e8 60 e7 a9 c4 0f ff e9 1f fe ff ff 48 c7 c6 40 10 86 c0 48 c7 c7 e0 ca 85 c0 e8 2b 95 bd c4 <0f> 0b 48 89 ef e8 4c 44 e8 c4 e9 ef fd ff ff e8 42 44 e8 c4 e9
      <1>[  227.152720] RIP: intel_unpin_fb_vma+0x23a/0x2a0 [i915] RSP: ffff88005aad7b68
      
      v2: i915_vma_pin_fence() is a no-op if a fence isn't required, so check
      vma->fence as well.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180220134208.24988-2-chris@chris-wilson.co.uk
      5935485f
    • C
      drm/i915: Also check view->type for a normal GGTT view · ac87a6fd
      Chris Wilson 提交于
      We cannot simply use !view as shorthand for all normal GGTT views as a
      few callers will always populate a i915_ggtt_view struct and set the
      type to NORMAL instead. So check for (!view || view->type == NORMAL)
      inside i915_gem_object_ggtt_pin().
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180220134208.24988-1-chris@chris-wilson.co.uk
      ac87a6fd
  7. 20 2月, 2018 1 次提交
    • C
      drm/i915: Track number of pending freed objects · c9c70471
      Chris Wilson 提交于
      During igt, we frequently call into the driver to reset both HW and
      driver state (idling the device, waiting for it to become idle and
      freeing off old objects) to ensure that we start each test/subtest/pass
      from known state. This process incurs an RCU barrier or two to ensure
      that any such pending frees are indeed flushed before we return.
      However, unconditionally waiting on the RCU barrier adds needless delay
      to many callers, which adds up to several seconds when repeated thousands
      of times. We can skip the rcu_barrier() if by tracking how many outstanding
      frees we have, we know there are none.
      
      The same path is used along suspend, where we may be able to save the
      unconditional RCU barrier.
      
      To put it into perspective with a completely meaningless
      microbenchmark, igt/gem_sync/idle is improved from 50ms to 30us on bdw.
      
      v2: Remove the extra synchronize_rcu() inside i915_drop_caches_set()
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180219220631.25001-1-chris@chris-wilson.co.uk
      c9c70471
  8. 16 2月, 2018 1 次提交
  9. 10 2月, 2018 1 次提交
  10. 08 2月, 2018 3 次提交
  11. 07 2月, 2018 1 次提交
  12. 05 2月, 2018 4 次提交
  13. 01 2月, 2018 2 次提交
  14. 31 1月, 2018 1 次提交
    • C
      drm/i915: Always run hangcheck while the GPU is busy · 88923048
      Chris Wilson 提交于
      Previously, we relied on only running the hangcheck while somebody was
      waiting on the GPU, in order to minimise the amount of time hangcheck
      had to run. (If nobody was watching the GPU, nobody would notice if the
      GPU wasn't responding -- eventually somebody would care and so kick
      hangcheck into action.) However, this falls apart from around commit
      4680816b ("drm/i915: Wait first for submission, before waiting for
      request completion"), as not all waiters declare themselves to hangcheck
      and so we could switch off hangcheck and miss GPU hangs even when
      waiting under the struct_mutex.
      
      If we enable hangcheck from the first request submission, and let it run
      until the GPU is idle again, we forgo all the complexity involved with
      only enabling around waiters. We just have to remember to be careful that
      we do not declare a GPU hang when idly waiting for the next request to
      be come ready, as we will run hangcheck continuously even when the
      engines are stalled waiting for external events. This should be true
      already as we should only be tracking requests submitted to hardware for
      execution as an indicator that the engine is busy.
      
      Fixes: 4680816b ("drm/i915: Wait first for submission, before waiting for request completion"
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104840Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180129144104.3921-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      88923048
  15. 25 1月, 2018 1 次提交
    • S
      drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex · 70deeadd
      Sagar Arun Kamble 提交于
      This patch fixes lockdep issue due to circular locking dependency of
      struct_mutex, i_mutex_key, mmap_sem, relay_channels_mutex.
      For GuC log relay channel we create debugfs file that requires i_mutex_key
      lock and we are doing that under struct_mutex. So we introduced newer
      dependency as:
          &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm->mmap_sem
      However, there is dependency from mmap_sem to struct_mutex. Hence we
      separate the relay create/destroy operation from under struct_mutex.
      Also added runtime check of relay buffer status.
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.15.0-rc6-CI-Patchwork_7614+ #1 Not tainted
      ------------------------------------------------------
      debugfs_test/1388 is trying to acquire lock:
       (&dev->struct_mutex){+.+.}, at: [<00000000d5e1d915>] i915_mutex_lock_interruptible+0x47/0x130 [i915]
      
      but task is already holding lock:
       (&mm->mmap_sem){++++}, at: [<0000000029a9c131>] __do_page_fault+0x106/0x560
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #3 (&mm->mmap_sem){++++}:
             _copy_to_user+0x1e/0x70
             filldir+0x8c/0xf0
             dcache_readdir+0xeb/0x160
             iterate_dir+0xdc/0x140
             SyS_getdents+0xa0/0x130
             entry_SYSCALL_64_fastpath+0x1c/0x89
      
      -> #2 (&sb->s_type->i_mutex_key#3){++++}:
             start_creating+0x59/0x110
             __debugfs_create_file+0x2e/0xe0
             relay_create_buf_file+0x62/0x80
             relay_late_setup_files+0x84/0x250
             guc_log_late_setup+0x4f/0x110 [i915]
             i915_guc_log_register+0x32/0x40 [i915]
             i915_driver_load+0x7b6/0x1720 [i915]
             i915_pci_probe+0x2e/0x90 [i915]
             pci_device_probe+0x9c/0x120
             driver_probe_device+0x2a3/0x480
             __driver_attach+0xd9/0xe0
             bus_for_each_dev+0x57/0x90
             bus_add_driver+0x168/0x260
             driver_register+0x52/0xc0
             do_one_initcall+0x39/0x150
             do_init_module+0x56/0x1ef
             load_module+0x231c/0x2d70
             SyS_finit_module+0xa5/0xe0
             entry_SYSCALL_64_fastpath+0x1c/0x89
      
      -> #1 (relay_channels_mutex){+.+.}:
             relay_open+0x12c/0x2b0
             intel_guc_log_runtime_create+0xab/0x230 [i915]
             intel_guc_init+0x81/0x120 [i915]
             intel_uc_init+0x29/0xa0 [i915]
             i915_gem_init+0x182/0x530 [i915]
             i915_driver_load+0xaa9/0x1720 [i915]
             i915_pci_probe+0x2e/0x90 [i915]
             pci_device_probe+0x9c/0x120
             driver_probe_device+0x2a3/0x480
             __driver_attach+0xd9/0xe0
             bus_for_each_dev+0x57/0x90
             bus_add_driver+0x168/0x260
             driver_register+0x52/0xc0
             do_one_initcall+0x39/0x150
             do_init_module+0x56/0x1ef
             load_module+0x231c/0x2d70
             SyS_finit_module+0xa5/0xe0
             entry_SYSCALL_64_fastpath+0x1c/0x89
      
      -> #0 (&dev->struct_mutex){+.+.}:
             __mutex_lock+0x81/0x9b0
             i915_mutex_lock_interruptible+0x47/0x130 [i915]
             i915_gem_fault+0x201/0x790 [i915]
             __do_fault+0x15/0x70
             __handle_mm_fault+0x677/0xdc0
             handle_mm_fault+0x14f/0x2f0
             __do_page_fault+0x2d1/0x560
             page_fault+0x4c/0x60
      
      other info that might help us debug this:
      
      Chain exists of:
        &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm->mmap_sem
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&mm->mmap_sem);
                                     lock(&sb->s_type->i_mutex_key#3);
                                     lock(&mm->mmap_sem);
        lock(&dev->struct_mutex);
      
       *** DEADLOCK ***
      
      1 lock held by debugfs_test/1388:
       #0:  (&mm->mmap_sem){++++}, at: [<0000000029a9c131>] __do_page_fault+0x106/0x560
      
      stack backtrace:
      CPU: 2 PID: 1388 Comm: debugfs_test Not tainted 4.15.0-rc6-CI-Patchwork_7614+ #1
      Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.10 09/29/2016
      Call Trace:
       dump_stack+0x5f/0x86
       print_circular_bug.isra.18+0x1d0/0x2c0
       __lock_acquire+0x14ae/0x1b60
       ? lock_acquire+0xaf/0x200
       lock_acquire+0xaf/0x200
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       __mutex_lock+0x81/0x9b0
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? __pm_runtime_resume+0x4f/0x80
       i915_gem_fault+0x201/0x790 [i915]
       __do_fault+0x15/0x70
       ? _raw_spin_unlock+0x29/0x40
       __handle_mm_fault+0x677/0xdc0
       handle_mm_fault+0x14f/0x2f0
       __do_page_fault+0x2d1/0x560
       ? page_fault+0x36/0x60
       page_fault+0x4c/0x60
      
      v2: Added lock protection to guc->log.runtime.relay_chan (Chris)
          Fixed locking inside guc_flush_logs uncovered by new lockdep.
      
      v3: Locking guc_read_update_log_buffer entirely with relay_lock. (Chris)
          Prepared intel_guc_init_early. Moved relay_lock inside relay_create
          relay_destroy, relay_file_create, guc_read_update_log_buffer. (Michal)
          Removed struct_mutex lock around guc_log_flush and removed usage
          of guc_log_has_relay() from runtime_create path as it needs
          struct_mutex lock.
      
      v4: Handle NULL relay sub buffer pointer earlier in read_update_log_buffer
          (Chris). Fixed comment suffix **/. (Michal)
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104693
      Testcase: igt/debugfs_test/read_all_entries # with enable_guc=1 and guc_log_level=1
      Signed-off-by: NSagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Marta Lofstedt <marta.lofstedt@intel.com>
      Cc: Michal Winiarski <michal.winiarski@intel.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/1516808821-3638-3-git-send-email-sagar.a.kamble@intel.com
      70deeadd
  16. 24 1月, 2018 1 次提交
  17. 19 1月, 2018 1 次提交
    • C
      drm/i915: Avoid waitboosting on the active request · e9af4ea2
      Chris Wilson 提交于
      Watching a light workload on Baytrail (running glxgears and a 1080p
      decode), instead of the system remaining at low frequency, the glxgears
      would regularly trigger waitboosting after which it would have to spend
      a few seconds throttling back down. In this case, the waitboosting is
      counter productive as the minimal wait for glxgears doesn't prevent it
      from functioning correctly and delivering frames on time. In this case,
      glxgears happens to almost always be waiting on the current request,
      which we already expect to complete quickly (see i915_spin_request) and
      so avoiding the waitboost on the active request and spinning instead
      provides the best latency without overcommitting to upclocking.
      However, if the system falls behind we still force the waitboost.
      Similarly, we will also trigger upclocking if we detect the system is
      not delivering frames on time - again using a mechanism that tries to
      detect a miss and not preemptively upclock.
      
      v2: Also skip boosting for after missed vblank if the desired request is
      already active.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180118131609.16574-1-chris@chris-wilson.co.uk
      e9af4ea2
  18. 16 1月, 2018 2 次提交
  19. 11 1月, 2018 1 次提交
    • C
      drm/i915: Don't adjust priority on an already signaled fence · 5005c851
      Chris Wilson 提交于
      When we retire a signaled fence, we free the dependency tree. However,
      we skip clearing the list so that if we then try to adjust the priority
      of the signaled fence, we may walk the list of freed dependencies.
      
      [ 3083.156757] ==================================================================
      [ 3083.156806] BUG: KASAN: use-after-free in execlists_schedule+0x199/0x660 [i915]
      [ 3083.156810] Read of size 8 at addr ffff8806bf20f400 by task Xorg/831
      
      [ 3083.156815] CPU: 0 PID: 831 Comm: Xorg Not tainted 4.15.0-rc6-no-psn+ #1
      [ 3083.156817] Hardware name: Notebook                         N24_25BU/N24_25BU, BIOS 5.12 02/17/2017
      [ 3083.156818] Call Trace:
      [ 3083.156823]  dump_stack+0x5c/0x7a
      [ 3083.156827]  print_address_description+0x6b/0x290
      [ 3083.156830]  kasan_report+0x28f/0x380
      [ 3083.156872]  ? execlists_schedule+0x199/0x660 [i915]
      [ 3083.156914]  execlists_schedule+0x199/0x660 [i915]
      [ 3083.156956]  ? intel_crtc_atomic_check+0x146/0x4e0 [i915]
      [ 3083.156997]  ? execlists_submit_request+0xe0/0xe0 [i915]
      [ 3083.157038]  ? i915_vma_misplaced.part.4+0x25/0xb0 [i915]
      [ 3083.157079]  ? __i915_vma_do_pin+0x7c8/0xc80 [i915]
      [ 3083.157121]  ? intel_atomic_state_alloc+0x44/0x60 [i915]
      [ 3083.157130]  ? drm_atomic_helper_page_flip+0x3e/0xb0 [drm_kms_helper]
      [ 3083.157145]  ? drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
      [ 3083.157159]  ? drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157172]  ? drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157211]  i915_gem_object_wait_priority+0x14c/0x2c0 [i915]
      [ 3083.157251]  ? i915_gem_get_aperture_ioctl+0x150/0x150 [i915]
      [ 3083.157290]  ? i915_vma_pin_fence+0x1d8/0x320 [i915]
      [ 3083.157331]  ? intel_pin_and_fence_fb_obj+0x175/0x250 [i915]
      [ 3083.157372]  ? intel_rotation_info_size+0x60/0x60 [i915]
      [ 3083.157413]  ? intel_link_compute_m_n+0x80/0x80 [i915]
      [ 3083.157428]  ? drm_dev_printk+0x1b0/0x1b0 [drm]
      [ 3083.157443]  ? drm_dev_printk+0x1b0/0x1b0 [drm]
      [ 3083.157485]  intel_prepare_plane_fb+0x2f8/0x5a0 [i915]
      [ 3083.157527]  ? intel_crtc_get_vblank_counter+0x80/0x80 [i915]
      [ 3083.157536]  drm_atomic_helper_prepare_planes+0xa0/0x1c0 [drm_kms_helper]
      [ 3083.157587]  intel_atomic_commit+0x12e/0x4e0 [i915]
      [ 3083.157605]  drm_atomic_helper_page_flip+0xa2/0xb0 [drm_kms_helper]
      [ 3083.157621]  drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
      [ 3083.157638]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157652]  ? drm_lease_owner+0x1a/0x30 [drm]
      [ 3083.157668]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157681]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157696]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157711]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157725]  ? drm_getstats+0x20/0x20 [drm]
      [ 3083.157729]  ? timerqueue_del+0x49/0x80
      [ 3083.157732]  ? __remove_hrtimer+0x62/0xb0
      [ 3083.157735]  ? hrtimer_try_to_cancel+0x173/0x210
      [ 3083.157738]  do_vfs_ioctl+0x13b/0x880
      [ 3083.157741]  ? ioctl_preallocate+0x140/0x140
      [ 3083.157744]  ? _raw_spin_unlock_irq+0xe/0x30
      [ 3083.157746]  ? do_setitimer+0x234/0x370
      [ 3083.157750]  ? SyS_setitimer+0x19e/0x1b0
      [ 3083.157752]  ? SyS_alarm+0x140/0x140
      [ 3083.157755]  ? __rcu_read_unlock+0x66/0x80
      [ 3083.157757]  ? __fget+0xc4/0x100
      [ 3083.157760]  SyS_ioctl+0x74/0x80
      [ 3083.157763]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      [ 3083.157765] RIP: 0033:0x7f6135d0c6a7
      [ 3083.157767] RSP: 002b:00007fff01451888 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
      [ 3083.157769] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6135d0c6a7
      [ 3083.157771] RDX: 00007fff01451950 RSI: 00000000c01864b0 RDI: 000000000000000c
      [ 3083.157772] RBP: 00007f613076f600 R08: 0000000000000001 R09: 0000000000000000
      [ 3083.157773] R10: 0000000000000060 R11: 0000000000003246 R12: 0000000000000000
      [ 3083.157774] R13: 0000000000000060 R14: 000000000000001b R15: 0000000000000060
      
      [ 3083.157779] Allocated by task 831:
      [ 3083.157783]  kmem_cache_alloc+0xc0/0x200
      [ 3083.157822]  i915_gem_request_await_dma_fence+0x2c4/0x5d0 [i915]
      [ 3083.157861]  i915_gem_request_await_object+0x321/0x370 [i915]
      [ 3083.157900]  i915_gem_do_execbuffer+0x1165/0x19c0 [i915]
      [ 3083.157937]  i915_gem_execbuffer2+0x1ad/0x550 [i915]
      [ 3083.157950]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157962]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157964]  do_vfs_ioctl+0x13b/0x880
      [ 3083.157966]  SyS_ioctl+0x74/0x80
      [ 3083.157968]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      
      [ 3083.157971] Freed by task 831:
      [ 3083.157973]  kmem_cache_free+0x77/0x220
      [ 3083.158012]  i915_gem_request_retire+0x72c/0xa70 [i915]
      [ 3083.158051]  i915_gem_request_alloc+0x1e9/0x8b0 [i915]
      [ 3083.158089]  i915_gem_do_execbuffer+0xa96/0x19c0 [i915]
      [ 3083.158127]  i915_gem_execbuffer2+0x1ad/0x550 [i915]
      [ 3083.158140]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.158153]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.158155]  do_vfs_ioctl+0x13b/0x880
      [ 3083.158156]  SyS_ioctl+0x74/0x80
      [ 3083.158158]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      
      [ 3083.158162] The buggy address belongs to the object at ffff8806bf20f400
                      which belongs to the cache i915_dependency of size 64
      [ 3083.158166] The buggy address is located 0 bytes inside of
                      64-byte region [ffff8806bf20f400, ffff8806bf20f440)
      [ 3083.158168] The buggy address belongs to the page:
      [ 3083.158171] page:00000000d43decc4 count:1 mapcount:0 mapping:          (null) index:0x0
      [ 3083.158174] flags: 0x17ffe0000000100(slab)
      [ 3083.158179] raw: 017ffe0000000100 0000000000000000 0000000000000000 0000000180200020
      [ 3083.158182] raw: ffffea001afc16c0 0000000500000005 ffff880731b881c0 0000000000000000
      [ 3083.158184] page dumped because: kasan: bad access detected
      
      [ 3083.158187] Memory state around the buggy address:
      [ 3083.158190]  ffff8806bf20f300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158192]  ffff8806bf20f380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158195] >ffff8806bf20f400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158196]                    ^
      [ 3083.158199]  ffff8806bf20f480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158201]  ffff8806bf20f500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158203] ==================================================================
      Reported-by: NAlexandru Chirvasitu <achirvasub@gmail.com>
      Reported-by: NMike Keehan <mike@keehan.net>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104436
      Fixes: 1f181225 ("drm/i915/execlists: Keep request->priority for its lifetime")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Tested-by: NAlexandru Chirvasitu <achirvasub@gmail.com>
      Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180106105618.13532-1-chris@chris-wilson.co.uk
      (cherry picked from commit c218ee03)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      5005c851
  20. 10 1月, 2018 1 次提交
  21. 08 1月, 2018 1 次提交
    • C
      drm/i915: Don't adjust priority on an already signaled fence · c218ee03
      Chris Wilson 提交于
      When we retire a signaled fence, we free the dependency tree. However,
      we skip clearing the list so that if we then try to adjust the priority
      of the signaled fence, we may walk the list of freed dependencies.
      
      [ 3083.156757] ==================================================================
      [ 3083.156806] BUG: KASAN: use-after-free in execlists_schedule+0x199/0x660 [i915]
      [ 3083.156810] Read of size 8 at addr ffff8806bf20f400 by task Xorg/831
      
      [ 3083.156815] CPU: 0 PID: 831 Comm: Xorg Not tainted 4.15.0-rc6-no-psn+ #1
      [ 3083.156817] Hardware name: Notebook                         N24_25BU/N24_25BU, BIOS 5.12 02/17/2017
      [ 3083.156818] Call Trace:
      [ 3083.156823]  dump_stack+0x5c/0x7a
      [ 3083.156827]  print_address_description+0x6b/0x290
      [ 3083.156830]  kasan_report+0x28f/0x380
      [ 3083.156872]  ? execlists_schedule+0x199/0x660 [i915]
      [ 3083.156914]  execlists_schedule+0x199/0x660 [i915]
      [ 3083.156956]  ? intel_crtc_atomic_check+0x146/0x4e0 [i915]
      [ 3083.156997]  ? execlists_submit_request+0xe0/0xe0 [i915]
      [ 3083.157038]  ? i915_vma_misplaced.part.4+0x25/0xb0 [i915]
      [ 3083.157079]  ? __i915_vma_do_pin+0x7c8/0xc80 [i915]
      [ 3083.157121]  ? intel_atomic_state_alloc+0x44/0x60 [i915]
      [ 3083.157130]  ? drm_atomic_helper_page_flip+0x3e/0xb0 [drm_kms_helper]
      [ 3083.157145]  ? drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
      [ 3083.157159]  ? drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157172]  ? drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157211]  i915_gem_object_wait_priority+0x14c/0x2c0 [i915]
      [ 3083.157251]  ? i915_gem_get_aperture_ioctl+0x150/0x150 [i915]
      [ 3083.157290]  ? i915_vma_pin_fence+0x1d8/0x320 [i915]
      [ 3083.157331]  ? intel_pin_and_fence_fb_obj+0x175/0x250 [i915]
      [ 3083.157372]  ? intel_rotation_info_size+0x60/0x60 [i915]
      [ 3083.157413]  ? intel_link_compute_m_n+0x80/0x80 [i915]
      [ 3083.157428]  ? drm_dev_printk+0x1b0/0x1b0 [drm]
      [ 3083.157443]  ? drm_dev_printk+0x1b0/0x1b0 [drm]
      [ 3083.157485]  intel_prepare_plane_fb+0x2f8/0x5a0 [i915]
      [ 3083.157527]  ? intel_crtc_get_vblank_counter+0x80/0x80 [i915]
      [ 3083.157536]  drm_atomic_helper_prepare_planes+0xa0/0x1c0 [drm_kms_helper]
      [ 3083.157587]  intel_atomic_commit+0x12e/0x4e0 [i915]
      [ 3083.157605]  drm_atomic_helper_page_flip+0xa2/0xb0 [drm_kms_helper]
      [ 3083.157621]  drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
      [ 3083.157638]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157652]  ? drm_lease_owner+0x1a/0x30 [drm]
      [ 3083.157668]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157681]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157696]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157711]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
      [ 3083.157725]  ? drm_getstats+0x20/0x20 [drm]
      [ 3083.157729]  ? timerqueue_del+0x49/0x80
      [ 3083.157732]  ? __remove_hrtimer+0x62/0xb0
      [ 3083.157735]  ? hrtimer_try_to_cancel+0x173/0x210
      [ 3083.157738]  do_vfs_ioctl+0x13b/0x880
      [ 3083.157741]  ? ioctl_preallocate+0x140/0x140
      [ 3083.157744]  ? _raw_spin_unlock_irq+0xe/0x30
      [ 3083.157746]  ? do_setitimer+0x234/0x370
      [ 3083.157750]  ? SyS_setitimer+0x19e/0x1b0
      [ 3083.157752]  ? SyS_alarm+0x140/0x140
      [ 3083.157755]  ? __rcu_read_unlock+0x66/0x80
      [ 3083.157757]  ? __fget+0xc4/0x100
      [ 3083.157760]  SyS_ioctl+0x74/0x80
      [ 3083.157763]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      [ 3083.157765] RIP: 0033:0x7f6135d0c6a7
      [ 3083.157767] RSP: 002b:00007fff01451888 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
      [ 3083.157769] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6135d0c6a7
      [ 3083.157771] RDX: 00007fff01451950 RSI: 00000000c01864b0 RDI: 000000000000000c
      [ 3083.157772] RBP: 00007f613076f600 R08: 0000000000000001 R09: 0000000000000000
      [ 3083.157773] R10: 0000000000000060 R11: 0000000000003246 R12: 0000000000000000
      [ 3083.157774] R13: 0000000000000060 R14: 000000000000001b R15: 0000000000000060
      
      [ 3083.157779] Allocated by task 831:
      [ 3083.157783]  kmem_cache_alloc+0xc0/0x200
      [ 3083.157822]  i915_gem_request_await_dma_fence+0x2c4/0x5d0 [i915]
      [ 3083.157861]  i915_gem_request_await_object+0x321/0x370 [i915]
      [ 3083.157900]  i915_gem_do_execbuffer+0x1165/0x19c0 [i915]
      [ 3083.157937]  i915_gem_execbuffer2+0x1ad/0x550 [i915]
      [ 3083.157950]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.157962]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.157964]  do_vfs_ioctl+0x13b/0x880
      [ 3083.157966]  SyS_ioctl+0x74/0x80
      [ 3083.157968]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      
      [ 3083.157971] Freed by task 831:
      [ 3083.157973]  kmem_cache_free+0x77/0x220
      [ 3083.158012]  i915_gem_request_retire+0x72c/0xa70 [i915]
      [ 3083.158051]  i915_gem_request_alloc+0x1e9/0x8b0 [i915]
      [ 3083.158089]  i915_gem_do_execbuffer+0xa96/0x19c0 [i915]
      [ 3083.158127]  i915_gem_execbuffer2+0x1ad/0x550 [i915]
      [ 3083.158140]  drm_ioctl_kernel+0xa7/0xf0 [drm]
      [ 3083.158153]  drm_ioctl+0x45b/0x560 [drm]
      [ 3083.158155]  do_vfs_ioctl+0x13b/0x880
      [ 3083.158156]  SyS_ioctl+0x74/0x80
      [ 3083.158158]  entry_SYSCALL_64_fastpath+0x1a/0x7d
      
      [ 3083.158162] The buggy address belongs to the object at ffff8806bf20f400
                      which belongs to the cache i915_dependency of size 64
      [ 3083.158166] The buggy address is located 0 bytes inside of
                      64-byte region [ffff8806bf20f400, ffff8806bf20f440)
      [ 3083.158168] The buggy address belongs to the page:
      [ 3083.158171] page:00000000d43decc4 count:1 mapcount:0 mapping:          (null) index:0x0
      [ 3083.158174] flags: 0x17ffe0000000100(slab)
      [ 3083.158179] raw: 017ffe0000000100 0000000000000000 0000000000000000 0000000180200020
      [ 3083.158182] raw: ffffea001afc16c0 0000000500000005 ffff880731b881c0 0000000000000000
      [ 3083.158184] page dumped because: kasan: bad access detected
      
      [ 3083.158187] Memory state around the buggy address:
      [ 3083.158190]  ffff8806bf20f300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158192]  ffff8806bf20f380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158195] >ffff8806bf20f400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158196]                    ^
      [ 3083.158199]  ffff8806bf20f480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158201]  ffff8806bf20f500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [ 3083.158203] ==================================================================
      Reported-by: NAlexandru Chirvasitu <achirvasub@gmail.com>
      Reported-by: NMike Keehan <mike@keehan.net>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104436
      Fixes: 1f181225 ("drm/i915/execlists: Keep request->priority for its lifetime")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Tested-by: NAlexandru Chirvasitu <achirvasub@gmail.com>
      Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180106105618.13532-1-chris@chris-wilson.co.uk
      c218ee03
  22. 19 12月, 2017 1 次提交
  23. 18 12月, 2017 1 次提交
  24. 16 12月, 2017 1 次提交
  25. 14 12月, 2017 4 次提交
    • C
      drm/i915: Flush pending GTT writes before unbinding · 2797c4a1
      Chris Wilson 提交于
      From the shrinker paths, we want to relinquish the GPU and GGTT access to
      the object, releasing the backing storage back to the system for
      swapout. As a part of that process we would unpin the pages, marking
      them for access by the CPU (for the swapout/swapin). However, if that
      process was interrupted after unbind the vma, we missed a flush of the
      inflight GGTT writes before we made that GTT space available again for
      reuse, with the prospect that we would redirect them to another page.
      
      The bug dates back to the introduction of multiple GGTT vma, but the
      code itself dates to commit 02bef8f9 ("drm/i915: Unbind closed vma
      for i915_gem_object_unbind()").
      
      Fixes: 02bef8f9 ("drm/i915: Unbind closed vma for i915_gem_object_unbind()")
      Fixes: c5ad54cf ("drm/i915: Use partial view in mmap fault handler")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171204132513.7303-1-chris@chris-wilson.co.uk
      (cherry picked from commit 5888fc9e)
      Signed-off-by: NJani Nikula <jani.nikula@intel.com>
      2797c4a1
    • M
      drm/i915/guc: Extract guc_init from guc_init_hw · 61b5c158
      Michał Winiarski 提交于
      After GPU reset, GuC HW needs to be reinitialized (with FW reload).
      Unfortunately, we're doing some extra work there (mostly allocating stuff),
      work that can be moved to guc_init and called once at driver load time.
      
      As a side effect we're no longer hitting an assert in
      i915_ggtt_enable_guc on suspend/resume.
      
      v2: Do not duplicate disable_communication / reset_guc_interrupts
      v3: Add proper teardown after rebase
      
      References: 04f7b24e ("drm/i915/guc: Assert that we switch between known ggtt->invalidate functions")
      Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213221352.7173-3-michal.winiarski@intel.com
      61b5c158
    • M
      drm/i915/guc: Move GuC workqueue allocations outside of the mutex · 3176ff49
      Michał Winiarski 提交于
      This gets rid of the following lockdep splat:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.15.0-rc2-CI-Patchwork_7428+ #1 Not tainted
      ------------------------------------------------------
      debugfs_test/1351 is trying to acquire lock:
       (&dev->struct_mutex){+.+.}, at: [<000000009d90d1a3>] i915_mutex_lock_interruptible+0x47/0x130 [i915]
      
      but task is already holding lock:
       (&mm->mmap_sem){++++}, at: [<000000005df01c1e>] __do_page_fault+0x106/0x560
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #6 (&mm->mmap_sem){++++}:
             __might_fault+0x63/0x90
             _copy_to_user+0x1e/0x70
             filldir+0x8c/0xf0
             dcache_readdir+0xeb/0x160
             iterate_dir+0xe6/0x150
             SyS_getdents+0xa0/0x130
             entry_SYSCALL_64_fastpath+0x1c/0x89
      
      -> #5 (&sb->s_type->i_mutex_key#5){++++}:
             lockref_get+0x9/0x20
      
      -> #4 ((completion)&req.done){+.+.}:
             wait_for_common+0x54/0x210
             devtmpfs_create_node+0x130/0x150
             device_add+0x5ad/0x5e0
             device_create_groups_vargs+0xd4/0xe0
             device_create+0x35/0x40
             msr_device_create+0x22/0x40
             cpuhp_invoke_callback+0xc5/0xbf0
             cpuhp_thread_fun+0x167/0x210
             smpboot_thread_fn+0x17f/0x270
             kthread+0x173/0x1b0
             ret_from_fork+0x24/0x30
      
      -> #3 (cpuhp_state-up){+.+.}:
             cpuhp_issue_call+0x132/0x1c0
             __cpuhp_setup_state_cpuslocked+0x12f/0x2a0
             __cpuhp_setup_state+0x3a/0x50
             page_writeback_init+0x3a/0x5c
             start_kernel+0x393/0x3e2
             secondary_startup_64+0xa5/0xb0
      
      -> #2 (cpuhp_state_mutex){+.+.}:
             __mutex_lock+0x81/0x9b0
             __cpuhp_setup_state_cpuslocked+0x4b/0x2a0
             __cpuhp_setup_state+0x3a/0x50
             page_alloc_init+0x1f/0x26
             start_kernel+0x139/0x3e2
             secondary_startup_64+0xa5/0xb0
      
      -> #1 (cpu_hotplug_lock.rw_sem){++++}:
             cpus_read_lock+0x34/0xa0
             apply_workqueue_attrs+0xd/0x40
             __alloc_workqueue_key+0x2c7/0x4e1
             intel_guc_submission_init+0x10c/0x650 [i915]
             intel_uc_init_hw+0x29e/0x460 [i915]
             i915_gem_init_hw+0xca/0x290 [i915]
             i915_gem_init+0x115/0x3a0 [i915]
             i915_driver_load+0x9a8/0x16c0 [i915]
             i915_pci_probe+0x2e/0x90 [i915]
             pci_device_probe+0x9c/0x120
             driver_probe_device+0x2a3/0x480
             __driver_attach+0xd9/0xe0
             bus_for_each_dev+0x57/0x90
             bus_add_driver+0x168/0x260
             driver_register+0x52/0xc0
             do_one_initcall+0x39/0x150
             do_init_module+0x56/0x1ef
             load_module+0x231c/0x2d70
             SyS_finit_module+0xa5/0xe0
             entry_SYSCALL_64_fastpath+0x1c/0x89
      
      -> #0 (&dev->struct_mutex){+.+.}:
             lock_acquire+0xaf/0x200
             __mutex_lock+0x81/0x9b0
             i915_mutex_lock_interruptible+0x47/0x130 [i915]
             i915_gem_fault+0x201/0x760 [i915]
             __do_fault+0x15/0x70
             __handle_mm_fault+0x85b/0xe40
             handle_mm_fault+0x14f/0x2f0
             __do_page_fault+0x2d1/0x560
             page_fault+0x22/0x30
      
      other info that might help us debug this:
      
      Chain exists of:
        &dev->struct_mutex --> &sb->s_type->i_mutex_key#5 --> &mm->mmap_sem
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&mm->mmap_sem);
                                     lock(&sb->s_type->i_mutex_key#5);
                                     lock(&mm->mmap_sem);
        lock(&dev->struct_mutex);
      
       *** DEADLOCK ***
      
      1 lock held by debugfs_test/1351:
       #0:  (&mm->mmap_sem){++++}, at: [<000000005df01c1e>] __do_page_fault+0x106/0x560
      
      stack backtrace:
      CPU: 2 PID: 1351 Comm: debugfs_test Not tainted 4.15.0-rc2-CI-Patchwork_7428+ #1
      Hardware name:                  /NUC6i5SYB, BIOS SYSKLi35.86A.0057.2017.0119.1758 01/19/2017
      Call Trace:
       dump_stack+0x5f/0x86
       print_circular_bug+0x230/0x3b0
       check_prev_add+0x439/0x7b0
       ? lockdep_init_map_crosslock+0x20/0x20
       ? unwind_get_return_address+0x16/0x30
       ? __lock_acquire+0x1385/0x15a0
       __lock_acquire+0x1385/0x15a0
       lock_acquire+0xaf/0x200
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       __mutex_lock+0x81/0x9b0
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? i915_mutex_lock_interruptible+0x47/0x130 [i915]
       i915_mutex_lock_interruptible+0x47/0x130 [i915]
       ? __pm_runtime_resume+0x4f/0x80
       i915_gem_fault+0x201/0x760 [i915]
       __do_fault+0x15/0x70
       __handle_mm_fault+0x85b/0xe40
       handle_mm_fault+0x14f/0x2f0
       __do_page_fault+0x2d1/0x560
       page_fault+0x22/0x30
      RIP: 0033:0x7f98d6f49116
      RSP: 002b:00007ffd6ffc3278 EFLAGS: 00010283
      RAX: 00007f98d39a2bc0 RBX: 0000000000000000 RCX: 0000000000001680
      RDX: 0000000000001680 RSI: 00007ffd6ffc3400 RDI: 00007f98d39a2bc0
      RBP: 00007ffd6ffc33a0 R08: 0000000000000000 R09: 00000000000005a0
      R10: 000055e847c2a830 R11: 0000000000000002 R12: 0000000000000001
      R13: 000055e847c1d040 R14: 00007ffd6ffc3400 R15: 00007f98d6752ba0
      
      v2: Init preempt_work unconditionally (Chris)
      v3: Mention that we need the enable_guc=1 for lockdep splat (Chris)
      
      Testcase: igt/debugfs_test/read_all_entries # with i915.enable_guc=1
      Signed-off-by: NMichał Winiarski <michal.winiarski@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213221352.7173-2-michal.winiarski@intel.com
      3176ff49
    • C
      drm/i915: Unwind i915_gem_init() failure · 6ca9a2be
      Chris Wilson 提交于
      Since Michal introduced new user controllable errors other than -EIO
      during i915_gem_init(), we need to actually unwind on the error path as
      we have to abort the module load (and we expect to do so cleanly!).
      
      As we now teardown key state and then mark the driver as wedged (on
      EIO), we have to be careful to not allow ourselves to resume and
      unwedge, thus attempting to use the uninitialised driver.
      
      v2: Try not to free driver state for the suppressed EIO
      v3: Use load-fault-injection to test both error/recovery paths.
      
      References: 8620eb1d ("drm/i915/uc: Don't use -EIO to report missing firmware")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213134347.4608-1-chris@chris-wilson.co.uk
      6ca9a2be