1. 07 Dec, 2016 1 commit
  2. 06 Dec, 2016 1 commit
  3. 02 Dec, 2016 4 commits
  4. 30 Nov, 2016 1 commit
  5. 29 Nov, 2016 1 commit
  6. 23 Nov, 2016 3 commits
  7. 21 Nov, 2016 5 commits
    • M
      drm/i915: Wipe hang stats as an embedded struct · bc1d53c6
      Mika Kuoppala committed
      Bannable property, banned status, guilty and active counts are
      properties of i915_gem_context. Make them so.
      
      v2: rebase
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1479309634-28574-1-git-send-email-mika.kuoppala@intel.com
      bc1d53c6
    • M
      drm/i915: Add per client max context ban limit · b083a087
      Mika Kuoppala committed
      If a bad client submits unfavourably across different contexts,
      creating new ones as it goes, the per-context scoring of badness
      doesn't remove the root cause: the offending client.
      To counter this, keep track of per-client context bans. Deny access
      if a client is responsible for more than 3 context bans in
      its lifetime.
      
      v2: move ban check to context create ioctl (Chris)
      v3: add commentary about hangs needed to reach client ban (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      b083a087
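The per-client limit described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `struct client`, `client_note_context_ban()`, `client_is_banned()` and `context_create()` are hypothetical names, and only the limit of 3 comes from the commit message.

```c
#include <stdbool.h>

/* Limit taken from the commit message; all names here are hypothetical. */
#define MAX_CLIENT_CONTEXT_BANS 3

struct client {
	unsigned int context_bans; /* contexts banned over the client's lifetime */
};

/* Called whenever one of the client's contexts gets banned. */
static void client_note_context_ban(struct client *c)
{
	c->context_bans++;
}

/* A client is banned once it is responsible for more than 3 context bans. */
static bool client_is_banned(const struct client *c)
{
	return c->context_bans > MAX_CLIENT_CONTEXT_BANS;
}

/* The check sits at context creation, as in v2 of the patch. */
static int context_create(struct client *c)
{
	if (client_is_banned(c))
		return -1; /* stand-in for a real error code */
	return 0;
}
```

Placing the check at context creation means a banned client cannot sidestep per-context bans simply by opening a fresh context.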
    • M
      drm/i915: Add bannable context parameter · 84102171
      Mika Kuoppala committed
      Now that the driver has per-context scoring of 'hanging badness',
      and subsequent hangs during short windows are allowed
      if progress is made in between, it no longer makes sense
      to expose a ban timing window as a context parameter.
      
      Let the scoring be the sole indicator for ban policy and replace
      the ban period context parameter with a boolean to get/set the
      context's bannable property.
      
      v2: allow non root to opt into being banned (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      84102171
    • M
      drm/i915: Use request retirement as context progress · e5e1fc47
      Mika Kuoppala committed
      When the hangcheck score was removed, the active decay of the score
      was removed as well. This took away hangcheck's ability to detect
      whether a GPU client was accidentally or maliciously causing
      intermittent hangs. Reinstate the scoring as a per-context property,
      so that if one context starts to act unfavourably, it gets banned.
      
      v2: ban_period_secs as a gate to score check (Chris)
      v3: decay in proper spot. scores as tunables (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      e5e1fc47
    • M
      drm/i915: Decouple hang detection from hangcheck period · 3fe3b030
      Mika Kuoppala committed
      Hangcheck state accumulation has gained more steps
      over the years, like head movement and more recently the
      subunit inactivity check. As the subunit sampling is only
      done if the previous state check showed inactivity, we
      have added more stages (and time) to reach a hang verdict.
      
      Asymmetric engine states led to a different actual weight of
      'one hangcheck unit', and it was demonstrated in some
      hangs that, due to the difference in stages, simpler engines
      were falsely accused of a hang as their score accumulated
      above the hang threshold much more quickly.
      
      To completely decouple the hangcheck guilty score
      from the hangcheck period, convert hangcheck score to a
      rough period of inactivity measurement. As these are
      tracked as jiffies, they are meaningful also across
      reset boundaries. This makes finding a guilty engine
      more accurate across multi engine activity scenarios,
      especially across asymmetric engines.
      
      We lose the ability to detect cross-batch malicious attempts
      to hinder progress. The plan is to move this functionality
      into context banning, which is a more natural fit,
      later in the series.
      
      v2: use time_before macros (Chris)
          reinstate the pardoning of moving engine after hc (Chris)
      v3: avoid global state for per engine stall detection (Chris)
      v4: take timeline last retirement into account (Chris)
      v5: do debug print on pardoning, split out retirement timestamp (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      3fe3b030
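As a rough illustration of measuring inactivity as a time period rather than as a count of hangcheck runs, the sketch below records the tick at which an engine last made progress and declares a stall only when that tick is older than a threshold. The comparison is wraparound-safe in the same spirit as the kernel's `time_after()` macro. `struct engine_hc`, `engine_sample()`, `tick_after()` and the `HUNG_TICKS` value are all hypothetical and not taken from the patch.

```c
#include <stdbool.h>

/* Wraparound-safe "a is after b", the same idea as the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical threshold, in ticks, after which an idle engine counts as hung. */
#define HUNG_TICKS 100UL

struct engine_hc {
	unsigned long last_progress; /* tick at which progress was last seen */
	unsigned int last_seqno;     /* seqno observed at that time */
};

/*
 * Sample one engine. Any seqno movement resets the inactivity period, so the
 * verdict depends only on elapsed time, not on how often we sample.
 */
static bool engine_sample(struct engine_hc *hc, unsigned int seqno,
			  unsigned long now)
{
	if (seqno != hc->last_seqno) {
		hc->last_seqno = seqno;
		hc->last_progress = now;
		return false;
	}
	return tick_after(now, hc->last_progress + HUNG_TICKS);
}
```

Because every engine is judged against the same elapsed-time threshold, an engine with fewer check stages no longer reaches a hang verdict sooner than a more complex one.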
  8. 19 Nov, 2016 3 commits
  9. 18 Nov, 2016 2 commits
  10. 17 Nov, 2016 4 commits
  11. 15 Nov, 2016 5 commits
  12. 12 Nov, 2016 2 commits
    • C
      drm/i915: Stop skipping the final clflush back to system pages · 2b3c8317
      Chris Wilson committed
      When we release the shmem backing storage, we make sure that the pages
      are coherent with the cpu cache. However, our clflush routine was
      skipping the flush as the object had no pages at release time. Fix this by
      explicitly flushing the sg_table we are decoupling.
      
      Fixes: 03ac84f1 ("drm/i915: Pass around sg_table to get_pages/put_pages backend")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-2-chris@chris-wilson.co.uk
      2b3c8317
    • C
      drm/i915: Only wait upon the execution timeline when unlocked · 9caa34aa
      Chris Wilson committed
      In order to walk the list of all timelines, we currently require the
      struct_mutex. We are sometimes called before the caller has taken the
      struct_mutex (i.e. !I915_WAIT_LOCKED), in which case we can only
      trust the global execution timelines (as these are owned by the device).
      This means that in the unlocked phase we can only wait upon the
      currently executing requests and not on all queued requests.
      
      [  175.743243] general protection fault: 0000 [#1] SMP
      [  175.743263] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iwlwifi aesni_intel aes_x86_64 lrw snd_soc_rt5640 gf128mul snd_soc_rl6231 snd_soc_core glue_helper snd_compress snd_pcm_dmaengine snd_hda_codec_hdmi ablk_helper snd_hda_codec_realtek cryptd snd_hda_codec_generic serio_raw cfg80211 snd_hda_intel snd_hda_codec ir_lirc_codec snd_hda_core lirc_dev snd_hwdep snd_pcm lpc_ich mei_me mei snd_seq_midi shpchp snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer rc_rc6_mce acpi_als nuvoton_cir kfifo_buf rc_core snd industrialio snd_soc_sst_acpi soundcore snd_soc_sst_match i2c_designware_platform 8250_dw i2c_designware_core dw_dmac spi_pxa2xx_platform mac_hid acpi_pad parport_pc ppdev lp parport
      [  175.743509]  autofs4 i915 e1000e psmouse ptp pps_core xhci_pci ehci_pci ahci xhci_hcd ehci_hcd libahci video sdhci_acpi sdhci i2c_hid hid
      [  175.743560] CPU: 2 PID: 2386 Comm: wtdg_monitor.sh Tainted: G     U          4.9.0-rc4-nightly+ #2
      [  175.743581] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0358.2016.0606.1423 06/06/2016
      [  175.743603] task: ffff88024509ba80 task.stack: ffffc9007bd18000
      [  175.743618] RIP: 0010:[<ffffffffa01af29b>]  [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915]
      [  175.743660] RSP: 0000:ffffc9007bd1b9b8  EFLAGS: 00010297
      [  175.743674] RAX: ffff88024489d248 RBX: 0000000000000000 RCX: 0000000000000000
      [  175.743691] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880244898000
      [  175.743708] RBP: ffffc9007bd1b9f0 R08: 0000000000000000 R09: 0000000000000001
      [  175.743724] R10: 00000028eaf42792 R11: 0000000000000001 R12: dead000000000100
      [  175.743741] R13: dead000000000148 R14: ffffc9007bd1ba5f R15: 0000000000000005
      [  175.743758] FS:  00007f2638330700(0000) GS:ffff880256d00000(0000) knlGS:0000000000000000
      [  175.743777] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  175.743791] CR2: 00007f885c8cea40 CR3: 00000002416b5000 CR4: 00000000003406e0
      [  175.743808] Stack:
      [  175.743816]  ffff88024489d248 000000004509ba80 ffff880244898000 ffff88024509ba80
      [  175.743840]  00000000ffff8b69 ffffc9007bd1ba5f ffffc9007bd1ba5e ffffc9007bd1ba28
      [  175.743863]  ffffffffa01b661d 00000000ffffffff 0000000000000000 ffff880244898000
      [  175.743886] Call Trace:
      [  175.743906]  [<ffffffffa01b661d>] i915_gem_shrinker_lock_uninterruptible.constprop.5+0x5d/0xc0 [i915]
      [  175.743937]  [<ffffffffa01b6cd0>] i915_gem_shrinker_oom+0x30/0x1b0 [i915]
      [  175.743955]  [<ffffffff8109ca79>] notifier_call_chain+0x49/0x70
      [  175.743971]  [<ffffffff8109cd9d>] __blocking_notifier_call_chain+0x4d/0x70
      [  175.743988]  [<ffffffff8109cdd6>] blocking_notifier_call_chain+0x16/0x20
      [  175.744005]  [<ffffffff811885dc>] out_of_memory+0x22c/0x480
      [  175.744020]  [<ffffffff81205542>] __alloc_pages_slowpath+0x851/0x8ec
      [  175.744037]  [<ffffffff8118ca51>] __alloc_pages_nodemask+0x2c1/0x310
      [  175.744054]  [<ffffffff811d8ea8>] alloc_pages_current+0x88/0x120
      [  175.744070]  [<ffffffff811833a4>] __page_cache_alloc+0xb4/0xc0
      [  175.744086]  [<ffffffff811865ca>] filemap_fault+0x29a/0x500
      [  175.744101]  [<ffffffff81299aa6>] ext4_filemap_fault+0x36/0x50
      [  175.744117]  [<ffffffff811b3d4a>] __do_fault+0x6a/0xe0
      [  175.744131]  [<ffffffff811b97ee>] handle_mm_fault+0xd0e/0x1330
      [  175.744147]  [<ffffffff8106738c>] __do_page_fault+0x23c/0x4d0
      [  175.744162]  [<ffffffff81067650>] do_page_fault+0x30/0x80
      [  175.744177]  [<ffffffff817ffbe8>] page_fault+0x28/0x30
      [  175.744191] Code: 41 57 41 56 41 55 41 54 53 48 83 ec 10 4c 8b a7 48 52 00 00 89 75 d4 48 89 45 c8 49 39 c4 74 78 4d 8d 6c 24 48 41 bf 05 00 00 00 <49> 8b 5d 00 48 85 db 74 50 8b 83 20 01 00 00 85 c0 74 15 48 8b
      [  175.744320] RIP  [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915]
      [  175.744351]  RSP <ffffc9007bd1b9b8>
      
      Fixes: 80b204bc ("drm/i915: Enable multiple timelines")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-1-chris@chris-wilson.co.uk
      9caa34aa
  13. 11 Nov, 2016 2 commits
  14. 10 Nov, 2016 1 commit
    • T
      drm/i915: Trim the object sg table · 0c40ce13
      Tvrtko Ursulin committed
      At the moment we allocate enough sg table entries assuming we
      will not be able to do any coalescing. But since in practice
      we most often can, and more so very effectively, this ends up
      wasting a lot of memory.
      
      A simple and effective way of trimming the over-allocated
      entries is to copy the table over to a new one allocated to the
      exact size.
      
      Experiments on my freshly logged-in and idle desktop (KDE) showed
      that by doing this we can save approximately 1 MiB of RAM, and
      when running a typical benchmark like gl_manhattan I have
      even seen a 6 MiB saving.
      
      More complicated techniques such as only copying the last used
      page and freeing the rest are left to the reader.
      
      v2:
       * Update commit message.
       * Use temporary sg_table on stack. (Chris Wilson)
      
      v3:
       * Commit message update.
       * Comment added.
       * Replace memcpy with copy assignment.
         (Chris Wilson)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1478704423-7447-1-git-send-email-tvrtko.ursulin@linux.intel.com
      0c40ce13
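The trimming idea can be shown with a simplified userspace sketch: allocate for the worst case (one entry per page), coalesce physically contiguous pages while filling the table, then copy the used entries into an allocation of the exact size. It uses plain arrays instead of the kernel's `sg_table`, and `struct seg`, `struct seg_table` and `build_trimmed()` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* One coalesced run of contiguous pages (a stand-in for a scatterlist entry). */
struct seg {
	unsigned long start_pfn;
	unsigned long npages;
};

struct seg_table {
	struct seg *segs;
	size_t nents;
};

/* Build a table from n page frame numbers; returns 0 on success, -1 on OOM. */
static int build_trimmed(const unsigned long *pfns, size_t n,
			 struct seg_table *out)
{
	struct seg *tmp, *exact;
	size_t used = 0, i;

	out->segs = NULL;
	out->nents = 0;
	if (n == 0)
		return 0;

	/* Worst case: no coalescing possible, one entry per page. */
	tmp = malloc(n * sizeof(*tmp));
	if (!tmp)
		return -1;

	for (i = 0; i < n; i++) {
		if (used && tmp[used - 1].start_pfn + tmp[used - 1].npages == pfns[i]) {
			tmp[used - 1].npages++; /* extends the previous run */
		} else {
			tmp[used].start_pfn = pfns[i];
			tmp[used].npages = 1;
			used++;
		}
	}

	/* Trim: copy into an exact-size table and drop the big one. */
	exact = malloc(used * sizeof(*exact));
	if (!exact) {
		/* Trimming is only an optimisation; keep the large table. */
		out->segs = tmp;
		out->nents = used;
		return 0;
	}
	memcpy(exact, tmp, used * sizeof(*exact));
	free(tmp);

	out->segs = exact;
	out->nents = used;
	return 0;
}
```

Falling back to the over-allocated table when the second allocation fails keeps the trim a pure memory optimisation in this sketch.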
  15. 08 Nov, 2016 3 commits
  16. 07 Nov, 2016 2 commits
    • I
      drm/i915: Add assert for no pending GPU requests during suspend/resume in LR mode · 31ab49ab
      Imre Deak committed
      During resume we will reset the SW/HW tracking for each ring head/tail
      pointers and so are not prepared to replay any pending requests (as
      opposed to GPU reset time). Add an assert for this both to the suspend
      and the resume code.
      
      v2:
      - Check for ELSP port idle already during suspend and check !gt.awake
        during resume. (Chris)
      v3:
      - Move the !gt.awake check to i915_gem_resume().
      v4:
      - s/intel_lr_engines_idle/intel_execlists_idle/ (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-4-git-send-email-imre.deak@intel.com
      31ab49ab
    • I
      drm/i915: Make sure engines are idle during GPU idling in LR mode · 0cb5670b
      Imre Deak committed
      We assume that the GPU is idle once receiving the seqno via the last
      request's user interrupt. In execlist mode the corresponding context
      completed interrupt can be delayed though and until this latter
      interrupt arrives we consider the request to be pending on the ELSP
      submit port. This can cause a problem during system suspend where this
      last request will be seen by the resume code as still pending. Such
      pending requests are normally replayed after a GPU reset, but during
      resume we reset both SW and HW tracking of the ring head/tail pointers,
      so replaying the pending request with its stale tail pointer will leave
      the ring in an inconsistent state. A subsequent request submission can
      then lead to the GPU executing from an uninitialized area in the ring
      behind the above stale tail pointer.
      
      Fix this by making sure any pending request on the ELSP port is
      completed before suspending. I used a polling wait since the completion
      time I measured was <1ms and since normally we only need to wait during
      system suspend. GPU idling during runtime suspend is scheduled with a
      delay (currently 50-100ms) after the retirement of the last request at
      which point the context completed interrupt must have arrived already.
      
      The chance of this bug was increased by
      
      commit 1c777c5d
      Author: Imre Deak <imre.deak@intel.com>
      Date:   Wed Oct 12 17:46:37 2016 +0300
      
          drm/i915/hsw: Fix GPU hang during resume from S3-devices state
      
      but it could happen even without the explicit GPU reset, since we
      disable interrupts afterwards during the suspend sequence.
      
      v2:
      - Do an unlocked poll-wait first. (Chris)
      v3-4:
      - s/intel_lr_engines_idle/intel_execlists_idle/ and move
        i915.enable_execlists check to the new helper. (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98470
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-3-git-send-email-imre.deak@intel.com
      0cb5670b
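The polling wait mentioned above can be sketched as a bounded poll on an "is the port idle yet" predicate. This is a toy model only: `poll_for_idle()`, `struct fake_port` and `fake_port_idle()` are hypothetical, the bound is a number of attempts rather than a real timeout, and no hardware state is read.

```c
#include <stdbool.h>

typedef bool (*idle_fn)(void *data);

/*
 * Poll until idle() reports true or max_polls attempts are used up.
 * Returns 0 when idle was observed, -1 if the attempts ran out.
 */
static int poll_for_idle(idle_fn idle, void *data, unsigned int max_polls)
{
	while (max_polls--) {
		if (idle(data))
			return 0;
		/* A real driver would relax the CPU or sleep briefly here. */
	}
	return -1;
}

/* Toy submit port: reports busy for a fixed number of polls, then idle. */
struct fake_port {
	unsigned int pending_polls;
};

static bool fake_port_idle(void *data)
{
	struct fake_port *p = data;

	if (p->pending_polls == 0)
		return true;
	p->pending_polls--;
	return false;
}
```

A short bounded poll suits the situation the commit describes, where the measured completion time was under 1 ms and the wait is normally needed only during system suspend.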