1. 14 6月, 2016 2 次提交
    • D
      drm/i915/guc: add doorbell map to debugfs/i915_guc_info · 9636f6db
      Dave Gordon 提交于
      To properly verify the driver->doorbell->GuC functionality, validation
      needs to know how the driver has assigned the doorbell cache lines and
      registers, so make them visible through debugfs.
      
      v2: use kernel bitmap-printing format (%pb) rather than %x.
      Signed-off-by: NDave Gordon <david.s.gordon@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      9636f6db
    • A
      drm/i915:bxt: Enable Pooled EU support · 33e141ed
      arun.siluvery@linux.intel.com 提交于
      This mode allows to assign EUs to pools which can process work collectively.
      The command to enable this mode should be issued as part of context initialization.
      
      The pooled mode is global, once enabled it has to stay the same across all
      contexts until HW reset hence this is sent in auxiliary golden context batch.
      Thanks to Mika for the preliminary review and comments.
      
      v2: explain why this is enabled in golden context, use feature flag while
      enabling the support (Chris)
      
      v3: Include only kernel support as userspace support is not available yet.
      
      User space clients need to know when the pooled EU feature is present
      and enabled on the hardware so that they can adapt work submissions.
      Create a new device info flag for this purpose.
      
      Set has_pooled_eu to true in the Broxton static device info - Broxton
      supports the feature in hardware and the driver will enable it by
      default.
      
      We need to add getparam ioctls to enable userspace to query availability of
      this feature and to retrieve min. no of eus in a pool but we will expose
      them once userspace support is available. Opensource users for this feature
      are mesa, libva and beignet.
      
      Beignet team is currently working on adding userspace support.
      
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
      Cc: Winiarski, Michal <michal.winiarski@intel.com>
      Cc: Zou, Nanhai <nanhai.zou@intel.com>
      Cc: Yang, Rong R <rong.r.yang@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Armin Reese <armin.c.reese@intel.com>
      Cc: Tim Gore <tim.gore@intel.com>
      Signed-off-by: NJeff McGee <jeff.mcgee@intel.com>
      Signed-off-by: NArun Siluvery <arun.siluvery@linux.intel.com>
      Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      33e141ed
  2. 01 6月, 2016 2 次提交
    • D
      drm/i915: Revert async unpin and nonblocking atomic commit · e42aeef1
      Daniel Vetter 提交于
      This reverts the following patches:
      
      d55dbd06 drm/i915: Allow nonblocking update of pageflips.
      15c86bdb drm/i915: Check for unpin correctness.
      95c2ccdc Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
      a6747b73 drm/i915: Make unpin async.
      03f476e1 drm/i915: Prepare connectors for nonblocking checks.
      2099deff drm/i915: Pass atomic states to fbc update functions.
      ee7171af drm/i915: Remove reset_counter from intel_crtc.
      2ee004f7 drm/i915: Remove queue_flip pointer.
      b8d2afae drm/i915: Remove use_mmio_flip kernel parameter.
      8dd634d9 drm/i915: Remove cs based page flip support.
      143f73b3 drm/i915: Rework intel_crtc_page_flip to be almost atomic, v3.
      84fc494b drm/i915: Add the exclusive fence to plane_state.
      6885843a drm/i915: Convert flip_work to a list.
      aa420ddd drm/i915: Allow mmio updates on all platforms, v2.
      afee4d87 Revert "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
      
      "drm/i915: Allow nonblocking update of pageflips" should have been
      split up, misses a proper commit message and seems to cause issues in
      the legacy page_flip path as demonstrated by kms_flip.
      
      "drm/i915: Make unpin async" doesn't handle the unthrottled cursor
      updates correctly, leading to an apparent pin count leak. This is
      caught by the WARN_ON in i915_gem_object_do_pin which screams if we
      have more than DRM_I915_GEM_OBJECT_MAX_PIN_COUNT pins.
      
      Unfortuantely we can't just revert these two because this patch series
      came with a built-in bisect breakage in the form of temporarily
      removing the unthrottled cursor update hack for legacy cursor ioctl.
      Therefore there's no other option than to revert the entire pile :(
      
      There's one tiny conflict in intel_drv.h due to other patches, nothing
      serious.
      
      Normally I'd wait a bit longer with doing a maintainer revert, but
      since the minimal set of patches we need to revert (due to the bisect
      breakage) is so big, time is running out fast. And very soon
      (especially after a few attempts at fixing issues) it'll be really
      hard to revert things cleanly.
      
      Lessons learned:
      - Not a good idea to rush the review (done by someone fairly new to
        the area) and not make sure domain experts had a chance to read it.
      
      - Patches should be properly split up. I only looked at the two
        patches that should be reverted in detail, but both look like the
        mix up different things in one patch.
      
      - Patches really should have proper commit messages. Especially when
        doing more than one thing, and especially when touching critical and
        tricky core code.
      
      - Building a patch series and r-b stamping it when it has a built-in
        bisect breakage is not a good idea.
      
      - I also think we need to stop building up technical debt by
        postponing atomic igt testcases even longer. I think it's clear that
        there's enough corner cases in this beast that we really need to
        have the testcases _before_ the next step lands.
      
      (cherry picked from commit 5a21b665
      from drm-intel-next-queeud)
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
      Cc: John Harrison <John.C.Harrison@Intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Acked-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Acked-by: NDave Airlie <airlied@redhat.com>
      Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
      e42aeef1
    • S
      drm/i915: Update GEN6_PMINTRMSK setup with GuC enabled · 1800ad25
      Sagar Arun Kamble 提交于
      On Loading, GuC sets PM interrupts routing (bit 31) and clears ARAT
      expired interrupt (bit 9). Host turbo also updates this register
      in RPS flows. This patch ensures bit 31 and bit 9 setup by GuC persists.
      ARAT timer interrupt is needed in GuC for various features. It also
      facilitates halting GuC and hence achieving RC6. PM interrupt routing
      will not impact RPS interrupt reception by host as GuC will redirect
      them.
      This patch fixes igt test pm_rc6_residency that was failing with guc
      load/submission enabled. Tested with SKL GuC v6.1 and BXT GuC v5.1 and v8.7.
      
      v2: i915_irq/i915_pm decoupling from intel_guc. (ChrisW)
      
      v3: restructuring the mask update and rebase w.r.t Ville's patch. (ChrisW)
      
      v4: Updating the pm_intr_keep during direct_interrupts_to_guc. (Sagar)
      
      Cc: Chris Harris <chris.harris@intel.com>
      Cc: Zhe Wang <zhe1.wang@intel.com>
      Cc: Deepak S <deepak.s@intel.com>
      Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
      Cc: Akash Goel <akash.goel@intel.com>
      Testcase: igt/pm_rc6_residency
      Signed-off-by: NSagar Arun Kamble <sagar.a.kamble@intel.com>
      Tested-by: NMatt Roper <matthew.d.roper@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NMatt Roper <matthew.d.roper@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1464683307-19475-1-git-send-email-sagar.a.kamble@intel.com
      1800ad25
  3. 25 5月, 2016 1 次提交
    • D
      drm/i915: Revert async unpin and nonblocking atomic commit · 5a21b665
      Daniel Vetter 提交于
      This reverts the following patches:
      
      d55dbd06 drm/i915: Allow nonblocking update of pageflips.
      15c86bdb drm/i915: Check for unpin correctness.
      95c2ccdc Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
      a6747b73 drm/i915: Make unpin async.
      03f476e1 drm/i915: Prepare connectors for nonblocking checks.
      2099deff drm/i915: Pass atomic states to fbc update functions.
      ee7171af drm/i915: Remove reset_counter from intel_crtc.
      2ee004f7 drm/i915: Remove queue_flip pointer.
      b8d2afae drm/i915: Remove use_mmio_flip kernel parameter.
      8dd634d9 drm/i915: Remove cs based page flip support.
      143f73b3 drm/i915: Rework intel_crtc_page_flip to be almost atomic, v3.
      84fc494b drm/i915: Add the exclusive fence to plane_state.
      6885843a drm/i915: Convert flip_work to a list.
      aa420ddd drm/i915: Allow mmio updates on all platforms, v2.
      afee4d87 Revert "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
      
      "drm/i915: Allow nonblocking update of pageflips" should have been
      split up, misses a proper commit message and seems to cause issues in
      the legacy page_flip path as demonstrated by kms_flip.
      
      "drm/i915: Make unpin async" doesn't handle the unthrottled cursor
      updates correctly, leading to an apparent pin count leak. This is
      caught by the WARN_ON in i915_gem_object_do_pin which screams if we
      have more than DRM_I915_GEM_OBJECT_MAX_PIN_COUNT pins.
      
      Unfortuantely we can't just revert these two because this patch series
      came with a built-in bisect breakage in the form of temporarily
      removing the unthrottled cursor update hack for legacy cursor ioctl.
      Therefore there's no other option than to revert the entire pile :(
      
      There's one tiny conflict in intel_drv.h due to other patches, nothing
      serious.
      
      Normally I'd wait a bit longer with doing a maintainer revert, but
      since the minimal set of patches we need to revert (due to the bisect
      breakage) is so big, time is running out fast. And very soon
      (especially after a few attempts at fixing issues) it'll be really
      hard to revert things cleanly.
      
      Lessons learned:
      - Not a good idea to rush the review (done by someone fairly new to
        the area) and not make sure domain experts had a chance to read it.
      
      - Patches should be properly split up. I only looked at the two
        patches that should be reverted in detail, but both look like the
        mix up different things in one patch.
      
      - Patches really should have proper commit messages. Especially when
        doing more than one thing, and especially when touching critical and
        tricky core code.
      
      - Building a patch series and r-b stamping it when it has a built-in
        bisect breakage is not a good idea.
      
      - I also think we need to stop building up technical debt by
        postponing atomic igt testcases even longer. I think it's clear that
        there's enough corner cases in this beast that we really need to
        have the testcases _before_ the next step lands.
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
      Cc: John Harrison <John.C.Harrison@Intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Acked-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Acked-by: NDave Airlie <airlied@redhat.com>
      Acked-by: NJani Nikula <jani.nikula@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
      5a21b665
  4. 24 5月, 2016 4 次提交
  5. 23 5月, 2016 1 次提交
  6. 19 5月, 2016 6 次提交
  7. 13 5月, 2016 1 次提交
  8. 11 5月, 2016 2 次提交
  9. 09 5月, 2016 1 次提交
  10. 04 5月, 2016 2 次提交
  11. 28 4月, 2016 3 次提交
  12. 27 4月, 2016 1 次提交
  13. 24 4月, 2016 1 次提交
  14. 22 4月, 2016 1 次提交
  15. 15 4月, 2016 3 次提交
  16. 14 4月, 2016 3 次提交
  17. 12 4月, 2016 1 次提交
    • T
      drm/i915: Simplify for_each_fw_domain iterators · 33c582c1
      Tvrtko Ursulin 提交于
      As the vast majority of users do not use the domain id variable,
      we can eliminate it from the iterator and also change the latter
      using the same principle as was recently done for for_each_engine.
      
      For a couple of callers which do need the domain mask, store it
      in the domain array (which already has the domain id), then both
      can be retrieved thence.
      
      Result is clearer code and smaller generated binary, especially
      in the tight fw get/put loops. Also, relationship between domain
      id and mask is no longer assumed in the macro.
      
      v2: Improve grammar in the commit message and rename the
          iterator to for_each_fw_domain_masked for consistency.
          (Dave Gordon)
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NDave Gordon <david.s.gordon@intel.com>
      33c582c1
  18. 09 4月, 2016 2 次提交
  19. 08 4月, 2016 1 次提交
  20. 07 4月, 2016 1 次提交
  21. 04 4月, 2016 1 次提交
    • T
      drm/i915: Move execlists irq handler to a bottom half · 27af5eea
      Tvrtko Ursulin 提交于
      Doing a lot of work in the interrupt handler introduces huge
      latencies to the system as a whole.
      
      Most dramatic effect can be seen by running an all engine
      stress test like igt/gem_exec_nop/all where, when the kernel
      config is lean enough, the whole system can be brought into
      multi-second periods of complete non-interactivty. That can
      look for example like this:
      
       NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u8:3:143]
       Modules linked in: [redacted for brevity]
       CPU: 0 PID: 143 Comm: kworker/u8:3 Tainted: G     U       L  4.5.0-160321+ #183
       Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1
       Workqueue: i915 gen6_pm_rps_work [i915]
       task: ffff8800aae88000 ti: ffff8800aae90000 task.ti: ffff8800aae90000
       RIP: 0010:[<ffffffff8104a3c2>]  [<ffffffff8104a3c2>] __do_softirq+0x72/0x1d0
       RSP: 0000:ffff88014f403f38  EFLAGS: 00000206
       RAX: ffff8800aae94000 RBX: 0000000000000000 RCX: 00000000000006e0
       RDX: 0000000000000020 RSI: 0000000004208060 RDI: 0000000000215d80
       RBP: ffff88014f403f80 R08: 0000000b1b42c180 R09: 0000000000000022
       R10: 0000000000000004 R11: 00000000ffffffff R12: 000000000000a030
       R13: 0000000000000082 R14: ffff8800aa4d0080 R15: 0000000000000082
       FS:  0000000000000000(0000) GS:ffff88014f400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007fa53b90c000 CR3: 0000000001a0a000 CR4: 00000000001406f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Stack:
        042080601b33869f ffff8800aae94000 00000000fffc2678 ffff88010000000a
        0000000000000000 000000000000a030 0000000000005302 ffff8800aa4d0080
        0000000000000206 ffff88014f403f90 ffffffff8104a716 ffff88014f403fa8
       Call Trace:
        <IRQ>
        [<ffffffff8104a716>] irq_exit+0x86/0x90
        [<ffffffff81031e7d>] smp_apic_timer_interrupt+0x3d/0x50
        [<ffffffff814f3eac>] apic_timer_interrupt+0x7c/0x90
        <EOI>
        [<ffffffffa01c5b40>] ? gen8_write64+0x1a0/0x1a0 [i915]
        [<ffffffff814f2b39>] ? _raw_spin_unlock_irqrestore+0x9/0x20
        [<ffffffffa01c5c44>] gen8_write32+0x104/0x1a0 [i915]
        [<ffffffff8132c6a2>] ? n_tty_receive_buf_common+0x372/0xae0
        [<ffffffffa017cc9e>] gen6_set_rps_thresholds+0x1be/0x330 [i915]
        [<ffffffffa017eaf0>] gen6_set_rps+0x70/0x200 [i915]
        [<ffffffffa0185375>] intel_set_rps+0x25/0x30 [i915]
        [<ffffffffa01768fd>] gen6_pm_rps_work+0x10d/0x2e0 [i915]
        [<ffffffff81063852>] ? finish_task_switch+0x72/0x1c0
        [<ffffffff8105ab29>] process_one_work+0x139/0x350
        [<ffffffff8105b186>] worker_thread+0x126/0x490
        [<ffffffff8105b060>] ? rescuer_thread+0x320/0x320
        [<ffffffff8105fa64>] kthread+0xc4/0xe0
        [<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170
        [<ffffffff814f351f>] ret_from_fork+0x3f/0x70
        [<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170
      
      I could not explain, or find a code path, which would explain
      a +20 second lockup, but from some instrumentation it was
      apparent the interrupts off proportion of time was between
      10-25% under heavy load which is quite bad.
      
      When a interrupt "cliff" is reached, which was >~320k irq/s on
      my machine, the whole system goes into a terrible state of the
      above described multi-second lockups.
      
      By moving the GT interrupt handling to a tasklet in a most
      simple way, the problem above disappears completely.
      
      Testing the effect on sytem-wide latencies using
      igt/gem_syslatency shows the following before this patch:
      
      gem_syslatency: cycles=1532739, latency mean=416531.829us max=2499237us
      gem_syslatency: cycles=1839434, latency mean=1458099.157us max=4998944us
      gem_syslatency: cycles=1432570, latency mean=2688.451us max=1201185us
      gem_syslatency: cycles=1533543, latency mean=416520.499us max=2498886us
      
      This shows that the unrelated process is experiencing huge
      delays in its wake-up latency. After the patch the results
      look like this:
      
      gem_syslatency: cycles=808907, latency mean=53.133us max=1640us
      gem_syslatency: cycles=862154, latency mean=62.778us max=2117us
      gem_syslatency: cycles=856039, latency mean=58.079us max=2123us
      gem_syslatency: cycles=841683, latency mean=56.914us max=1667us
      
      Showing a huge improvement in the unrelated process wake-up
      latency. It also shows an approximate halving in the number
      of total empty batches submitted during the test. This may
      not be worrying since the test puts the driver under
      a very unrealistic load with ncpu threads doing empty batch
      submission to all GPU engines each.
      
      Another benefit compared to the hard-irq handling is that now
      work on all engines can be dispatched in parallel since we can
      have up to number of CPUs active tasklets. (While previously
      a single hard-irq would serially dispatch on one engine after
      another.)
      
      More interesting scenario with regards to throughput is
      "gem_latency -n 100" which  shows 25% better throughput and
      CPU usage, and 14% better dispatch latencies.
      
      I did not find any gains or regressions with Synmark2 or
      GLbench under light testing. More benchmarking is certainly
      required.
      
      v2:
         * execlists_lock should be taken as spin_lock_bh when
           queuing work from userspace now. (Chris Wilson)
         * uncore.lock must be taken with spin_lock_irq when
           submitting requests since that now runs from either
           softirq or process context.
      
      v3:
         * Expanded commit message with more testing data;
         * converted missed locking sites to _bh;
         * added execlist_lock comment. (Chris Wilson)
      
      v4:
         * Mention dispatch parallelism in commit. (Chris Wilson)
         * Do not hold uncore.lock over MMIO reads since the block
           is already serialised per-engine via the tasklet itself.
           (Chris Wilson)
         * intel_lrc_irq_handler should be static. (Chris Wilson)
         * Cancel/sync the tasklet on GPU reset. (Chris Wilson)
         * Document and WARN that tasklet cannot be active/pending
           on engine cleanup. (Chris Wilson/Imre Deak)
      Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Imre Deak <imre.deak@intel.com>
      Testcase: igt/gem_exec_nop/all
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94350Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Link: http://patchwork.freedesktop.org/patch/msgid/1459768316-6670-1-git-send-email-tvrtko.ursulin@linux.intel.com
      27af5eea