1. 24 Mar, 2018 1 commit
  2. 20 Mar, 2018 1 commit
  3. 14 Mar, 2018 2 commits
    • drm/i915: Implement dynamic GuC WOPCM offset and size calculation · 6b0478fb
      Committed by Jackie Li
      Hardware may have specific restrictions on GuC WOPCM offset and size. On
      Gen9, the value of the GuC WOPCM size register needs to be larger than the
      value of the GuC WOPCM offset register + a Gen9 specific offset (144KB) for
      reserved GuC WOPCM. Failing to enforce this restriction on GuC WOPCM size
      will lead to GuC firmware execution failures. However, with the current
      static GuC WOPCM offset and size values (512KB for both offset and size),
      the GuC WOPCM size verification will fail on Gen9. This can be fixed by
      lowering the GuC WOPCM offset, calculating its value from the HuC firmware
      size (which is likely less than 200KB on Gen9), so that the GuC WOPCM size
      value is large enough to pass the GuC WOPCM size check.
      
      This patch updates the reserved GuC WOPCM size for RC6 context on Gen9 to
      24KB to strictly align with the Gen9 GuC WOPCM layout. It also adds support
      for verifying the GuC WOPCM size against the Gen9 hardware restrictions. To
      meet all of the above requirements, let's provide dynamic partitioning of
      the WOPCM based on platform specific HuC/GuC firmware sizes.
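
The partitioning described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the 144KB offset, the 24KB RC6 reservation and the 4K alignment come from the commit message, while the 1MB total WOPCM size, the rule that the GuC region starts at the HuC firmware size rounded up to 4K, and every `example_`/`EXAMPLE_` name are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define KB(x) ((uint32_t)(x) * 1024u)

/* Figures quoted in the commit message. */
#define GEN9_GUC_WOPCM_OFFSET KB(144) /* Gen9 specific offset */
#define GEN9_RC6_CTX_RESERVED KB(24)  /* reserved for the RC6 context */
#define GUC_WOPCM_ALIGN       KB(4)   /* GuC WOPCM size is 4K aligned */

/* Assumption for illustration only: total WOPCM size. */
#define EXAMPLE_WOPCM_TOTAL   KB(1024)

struct example_wopcm {
	uint32_t guc_base; /* GuC WOPCM offset */
	uint32_t guc_size; /* GuC WOPCM size */
};

static inline uint32_t example_align_up(uint32_t v, uint32_t a)
{
	return (v + a - 1) & ~(a - 1);
}

static inline uint32_t example_align_down(uint32_t v, uint32_t a)
{
	return v & ~(a - 1);
}

/* Gen9 rule from the commit message: size > offset + 144KB. */
static inline bool example_gen9_size_ok(uint32_t base, uint32_t size)
{
	return size > base + GEN9_GUC_WOPCM_OFFSET;
}

/*
 * Hypothetical partitioning: place the GuC region right above the HuC
 * firmware (assumption), keep 24KB at the top for the RC6 context, and
 * fail if the Gen9 size rule cannot be satisfied.
 */
static inline bool example_wopcm_partition(uint32_t huc_fw_size,
					   struct example_wopcm *out)
{
	uint32_t base, size;

	if (huc_fw_size > EXAMPLE_WOPCM_TOTAL)
		return false;

	base = example_align_up(huc_fw_size, GUC_WOPCM_ALIGN);
	if (base + GEN9_RC6_CTX_RESERVED >= EXAMPLE_WOPCM_TOTAL)
		return false;

	size = example_align_down(EXAMPLE_WOPCM_TOTAL - base -
				  GEN9_RC6_CTX_RESERVED, GUC_WOPCM_ALIGN);
	if (!example_gen9_size_ok(base, size))
		return false;

	out->guc_base = base;
	out->guc_size = size;
	return true;
}
```

With these assumptions, the static 512KB/512KB split fails the Gen9 check (512KB is not larger than 512KB + 144KB), while a 200KB HuC firmware leaves an 800KB GuC region that passes it.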
      
      v2:
       - Removed intel_wopcm_init (Ville/Sagar/Joonas)
       - Renamed and Moved the intel_wopcm_partition into intel_guc (Sagar)
       - Removed unnecessary function calls (Joonas)
       - Init GuC WOPCM partition as soon as firmware fetching is completed
      
      v3:
       - Fixed indentation issues (Chris)
       - Removed layering violation code (Chris/Michal)
       - Created separate files for GuC wopcm code (Michal)
       - Used inline function to avoid code duplication (Michal)
      
      v4:
       - Preset the GuC WOPCM top during early GuC init (Chris)
       - Fail intel_uc_init_hw() as soon as GuC WOPCM partitioning failed
      
      v5:
       - Moved GuC DMA WOPCM register updating code into intel_wopcm.c
       - Took care of the locking status before writing to GuC DMA
         Write-Once registers. (Joonas)
      
      v6:
       - Made sure the GuC WOPCM size is a multiple of 4K (4K aligned)
      
      v8:
       - Updated comments and fixed naming issues (Sagar/Joonas)
       - Updated commit message to include more description about the hardware
         restriction on GuC WOPCM size (Sagar)
      
      v9:
       - Minor changes to variable names and code comments (Sagar)
       - Added detailed GuC WOPCM layout drawing (Sagar/Michal)
       - Refined macro definitions to be reader friendly (Michal)
       - Removed redundant check of the valid flag (Michal)
       - Unified first parameter for exported GuC WOPCM functions (Michal)
       - Refined the name and parameter list of hardware restriction checking
         functions (Michal)
      
      v10:
       - Used shorter function name for internal functions (Joonas)
       - Moved init-early function into c file (Joonas)
       - Consolidated and removed redundant size checks (Joonas/Michal)
       - Removed unnecessary unlikely() from code which is only called once
         during boot (Joonas)
       - More fixes to kernel-doc format and content (Michal)
       - Avoided the use of PAGE_MASK for 4K pages (Michal)
       - Added error log messages to error paths (Michal)
      
      v11:
       - Replaced intel_guc_wopcm with more generic intel_wopcm and attached
         intel_wopcm to drm_i915_private instead of intel_guc (Michal)
       - Dynamic calculation of GuC non-wopcm memory start (a.k.a. WOPCM Top
         offset from GuC WOPCM base) (Michal)
       - Moved WOPCM macro definitions into .c source file (Michal)
       - Exported WOPCM layout diagram as kernel-doc (Michal)
      
      v12:
       - Updated naming, function kernel-doc to align with new changes (Michal)
      
      v13:
       - Updated the ordering of s-o-b/cc/r-b tags (Sagar)
       - Corrected one tense error in comment (Sagar)
       - Corrected typos and removed spurious comments (Joonas)
      
      Bspec: 12690
      Signed-off-by: Jackie Li <yaodong.li@intel.com>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: John Spotswood <john.a.spotswood@intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v8)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9)
      Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12)
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-2-git-send-email-yaodong.li@intel.com
    • drm/i915/psr: Use more PSR HW tracking. · 5baf63cc
      Committed by Rodrigo Vivi
      So far we have been using frontbuffer tracking for everything,
      ignoring that PSR has capable HW tracking for many modern uses
      of the GPU on Core platforms and newer Atom ones.
      
      One reason is that we were trying to keep the same infrastructure
      in place for VLV/CHV as for the rest of the platforms. Another is
      that when this infrastructure was created, frontbuffer tracking
      wasn't as good and stable as it is today, after Paulo reworked it
      to handle FBC cases.
      
      However, this PSR implementation without HW tracking died
      on gen8LP, and newer platforms are starting to demand more HW
      tracking, especially with PSR2 cases in mind.
      
      By fully disabling and re-enabling PSR every time we believe
      someone is going to change the frontbuffer content, we don't
      allow PSR HW tracking to do its job. This especially compromises
      the whole idea of PSR2, where HW tracking detects only the
      damaged area and does a partial screen update.
      
      So, from now on, on the platforms that have hw_tracking, let's
      rely more on HW tracking.
      
      This is also the approach used by other drivers and the one more
      validated by SV teams, so I hope it will lead to fewer mysterious
      bugs.
      
      v2: Only do this for platforms that actually have hw tracking.
      
      v3 from DK
      Do this only for flips, small gradual changes are better.
      
      Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Cc: Jim Bride <jim.bride@linux.intel.com>
      Cc: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
      Reviewed-by: Jose Roberto de Souza <jose.souza@intel.com>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180307033420.3086-3-dhinakaran.pandiyan@intel.com
  4. 10 Mar, 2018 1 commit
  5. 08 Mar, 2018 1 commit
  6. 07 Mar, 2018 2 commits
    • drm/i915/icl: Enhanced execution list support · 05f0addd
      Committed by Thomas Daniel
      Enhanced Execlists is an upgraded version of execlists which supports
      up to 8 ports. The lrcs to be submitted are written to a submit queue
      (the ExecLists Submission Queue - ELSQ), which is then loaded on the
      HW. When writing to the ELSP register, the lrcs are written cyclically
      in the queue from position 0 to position 7. Alternatively, it is
      possible to write directly to the individual positions of the queue
      using the ELSQC registers. To be able to re-use all the existing code
      we're using the latter method, and we're currently limiting ourselves
      to using only 2 elements.
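
The two ways of filling the submit queue can be modelled in plain C as below. This is only a software model of the behaviour described in the message: no real register is touched, and the `example_` names, the struct layout and the choice of mapping port n to queue position n are assumptions made for the sketch.

```c
#include <stdint.h>

#define EXAMPLE_ELSQ_SLOTS   8 /* queue positions 0..7, per the message */
#define EXAMPLE_PORTS_IN_USE 2 /* the driver currently uses only 2 */

/* Software model of the submit queue; not a register map. */
struct example_elsq {
	uint64_t slot[EXAMPLE_ELSQ_SLOTS];
	unsigned int next; /* next position for ELSP-style writes */
};

/* ELSP-style write: each write lands in the next position, cyclically. */
static inline void example_elsp_write(struct example_elsq *q, uint64_t desc)
{
	q->slot[q->next] = desc;
	q->next = (q->next + 1) % EXAMPLE_ELSQ_SLOTS;
}

/* ELSQC-style write: address one queue position directly. */
static inline int example_elsq_write_slot(struct example_elsq *q,
					  unsigned int pos, uint64_t desc)
{
	if (pos >= EXAMPLE_ELSQ_SLOTS)
		return -1;
	q->slot[pos] = desc;
	return 0;
}

/* Fill only the first two positions, leaving the rest untouched. */
static inline void example_submit_two(struct example_elsq *q,
				      const uint64_t desc[EXAMPLE_PORTS_IN_USE])
{
	unsigned int n;

	for (n = 0; n < EXAMPLE_PORTS_IN_USE; n++)
		example_elsq_write_slot(q, n, desc[n]);
}
```

Direct addressing keeps the position of each element explicit, which fits the stated goal of re-using the existing two-port code.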
      
      v2: Rebase.
      v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
      v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
      v5: Reword commit, rename regs to be closer to specs, turn off
          preemption (Daniele), reuse engine->execlists.elsp (Chris)
      v6: use has_logical_ring_elsq to differentiate the new paths
      v7: add preemption support, rename els to submit_reg (Chris)
      v8: save the ctrl register inside the execlists struct, drop CSB
          handling updates (superseded by preempt_complete_status) (Chris)
      v9: s/drm_i915_gem_request/i915_request (Mika)
      v10: resolved conflict in inject_preempt_context (Mika)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-4-mika.kuoppala@linux.intel.com
    • drm/i915/icl: new context descriptor support · ac52da6a
      Committed by Daniele Ceraolo Spurio
      Starting from Gen11 the context descriptor format has been updated in
      the HW. The hw_id field has been considerably reduced in size and engine
      class and instance fields have been added.
      
      There is a slight name clashing issue because the field that we call
      hw_id is actually called SW Context ID in the specs for Gen11+.
      
      With the current size of the hw_id field we can have a maximum of 2k
      contexts at any time, but we could use the sw_counter field (which is sw
      defined) to increase that because the HW requirement is that
      engine_id + sw id + sw_counter is a unique number.
      GuC uses a similar method to support more contexts but does its tracking
      at lrc level. To avoid doing an implementation that will need to be
      reworked once GuC support lands, defer it for now and mark it as TODO.
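
To make the field arithmetic concrete, here is a small packing sketch. The 11-bit width of the SW context ID follows from the 2k context limit mentioned above. The bit positions, the other field widths and all `EXAMPLE_`/`example_` names are assumptions for illustration and should be checked against the hardware documentation before being relied on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for illustration; only the 11-bit ID width is implied
 * by the "2k contexts" limit in the commit message. */
#define EXAMPLE_SW_CTX_ID_SHIFT       37
#define EXAMPLE_SW_CTX_ID_WIDTH       11
#define EXAMPLE_ENGINE_INSTANCE_SHIFT 48
#define EXAMPLE_ENGINE_INSTANCE_WIDTH 6
#define EXAMPLE_SW_COUNTER_SHIFT      55
#define EXAMPLE_SW_COUNTER_WIDTH      6
#define EXAMPLE_ENGINE_CLASS_SHIFT    61
#define EXAMPLE_ENGINE_CLASS_WIDTH    3

static inline bool example_fits(uint64_t v, unsigned int width)
{
	return v < (1ull << width);
}

/* Extract one field from a packed descriptor. */
static inline uint64_t example_field(uint64_t desc, unsigned int shift,
				     unsigned int width)
{
	return (desc >> shift) & ((1ull << width) - 1);
}

/* Pack the upper descriptor fields; reject values that would overflow. */
static inline bool example_pack_desc(uint32_t sw_ctx_id, uint32_t instance,
				     uint32_t sw_counter,
				     uint32_t engine_class, uint64_t *desc)
{
	if (!example_fits(sw_ctx_id, EXAMPLE_SW_CTX_ID_WIDTH) ||
	    !example_fits(instance, EXAMPLE_ENGINE_INSTANCE_WIDTH) ||
	    !example_fits(sw_counter, EXAMPLE_SW_COUNTER_WIDTH) ||
	    !example_fits(engine_class, EXAMPLE_ENGINE_CLASS_WIDTH))
		return false;

	*desc = ((uint64_t)sw_ctx_id << EXAMPLE_SW_CTX_ID_SHIFT) |
		((uint64_t)instance << EXAMPLE_ENGINE_INSTANCE_SHIFT) |
		((uint64_t)sw_counter << EXAMPLE_SW_COUNTER_SHIFT) |
		((uint64_t)engine_class << EXAMPLE_ENGINE_CLASS_SHIFT);
	return true;
}
```

Two descriptors that share an engine and a SW context ID but differ in sw_counter still pack to different values, which is the property the message says could be used to support more than 2k contexts.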
      
      v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT
      v3: rebased, bring back lost code from i915_gem_context.c
      v4: make TODO comment more generic
      v5: be consistent with bit ordering, add extra checks (Chris)
      
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.com
      Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
  7. 05 Mar, 2018 1 commit
  8. 01 Mar, 2018 1 commit
  9. 28 Feb, 2018 1 commit
  10. 22 Feb, 2018 1 commit
  11. 21 Feb, 2018 4 commits
    • fed81658
    • drm/i915/fbc: Use PLANE_HAS_FENCE to determine if the plane is fenced · 1c9b6b13
      Committed by Chris Wilson
      Rather than trusting the cached value of plane_state->vma->fence to
      imply whether the plane_state itself holds a reference on the
      framebuffer's fence, use the information provided in the
      plane_state->flags (PLANE_HAS_FENCE). Note that we still assume that FBC
      is entirely bounded by the plane_state active life span; it's not clear
      if that is a safe assumption.
      Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180220134208.24988-4-chris@chris-wilson.co.uk
    • drm/i915: Move the policy for placement of the GGTT vma into the caller · 5935485f
      Committed by Chris Wilson
      Currently we make the unilateral decision inside
      i915_gem_object_pin_to_display() where the VMA should reside (inside
      the fence and mappable region or above?). This is not our decision to
      make as it impacts how the display engine can use the resulting
      scanout object, and it would rather instruct us where to place the VMA so
      that it can enable the features it wants. As such, make the pin flags an
      argument to i915_gem_object_pin_to_display() and control them from
      intel_pin_and_fence_fb_obj().
      
      Whilst taking control of the mapping for ourselves, start tracking how
      we use it to avoid trying to free a fence we never claimed:
      
      <3>[  227.151869] GEM_BUG_ON(vma->fence->pin_count <= 0)
      <4>[  227.152064] ------------[ cut here ]------------
      <2>[  227.152068] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:391!
      <4>[  227.152084] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      <0>[  227.152092] Dumping ftrace buffer:
      <0>[  227.152099]    (ftrace buffer empty)
      <4>[  227.152102] Modules linked in: i915 snd_hda_codec_analog snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm lpc_ich e1000e mei_me mei prime_numbers
      <4>[  227.152131] CPU: 1 PID: 1587 Comm: kworker/u16:49 Tainted: G     U           4.16.0-rc1-gbab67b2f6177-kasan_7+ #1
      <4>[  227.152134] Hardware name: Dell Inc. OptiPlex 755                 /0PU052, BIOS A08 02/19/2008
      <4>[  227.152236] Workqueue: events_unbound intel_atomic_commit_work [i915]
      <4>[  227.152292] RIP: 0010:intel_unpin_fb_vma+0x23a/0x2a0 [i915]
      <4>[  227.152295] RSP: 0018:ffff88005aad7b68 EFLAGS: 00010286
      <4>[  227.152300] RAX: 0000000000000026 RBX: ffff88005c359580 RCX: 0000000000000000
      <4>[  227.152304] RDX: 0000000000000026 RSI: ffffffff8707d840 RDI: ffffed000b55af63
      <4>[  227.152307] RBP: ffff880056817e58 R08: 0000000000000001 R09: 0000000000000000
      <4>[  227.152311] R10: ffff88005aad7b88 R11: 0000000000000000 R12: ffff8800568184d0
      <4>[  227.152314] R13: ffff880065b5ab08 R14: 0000000000000000 R15: dffffc0000000000
      <4>[  227.152318] FS:  0000000000000000(0000) GS:ffff88006ac40000(0000) knlGS:0000000000000000
      <4>[  227.152322] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[  227.152325] CR2: 00007f5fb25550a8 CR3: 0000000068c78000 CR4: 00000000000006e0
      <4>[  227.152328] Call Trace:
      <4>[  227.152385]  intel_cleanup_plane_fb+0x6b/0xd0 [i915]
      <4>[  227.152395]  drm_atomic_helper_cleanup_planes+0x166/0x280
      <4>[  227.152452]  intel_atomic_commit_tail+0x159d/0x3380 [i915]
      <4>[  227.152463]  ? process_one_work+0x66e/0x1460
      <4>[  227.152516]  ? skl_update_crtcs+0x9c0/0x9c0 [i915]
      <4>[  227.152523]  ? lock_acquire+0x13d/0x390
      <4>[  227.152527]  ? lock_acquire+0x13d/0x390
      <4>[  227.152534]  process_one_work+0x71a/0x1460
      <4>[  227.152540]  ? __schedule+0x815/0x1e20
      <4>[  227.152547]  ? pwq_dec_nr_in_flight+0x2b0/0x2b0
      <4>[  227.152553]  ? _raw_spin_lock_irq+0xa/0x40
      <4>[  227.152559]  worker_thread+0xdf/0xf60
      <4>[  227.152569]  ? process_one_work+0x1460/0x1460
      <4>[  227.152573]  kthread+0x2cf/0x3c0
      <4>[  227.152578]  ? _kthread_create_on_node+0xa0/0xa0
      <4>[  227.152583]  ret_from_fork+0x3a/0x50
      <4>[  227.152591] Code: c6 00 11 86 c0 48 c7 c7 e0 bd 85 c0 e8 60 e7 a9 c4 0f ff e9 1f fe ff ff 48 c7 c6 40 10 86 c0 48 c7 c7 e0 ca 85 c0 e8 2b 95 bd c4 <0f> 0b 48 89 ef e8 4c 44 e8 c4 e9 ef fd ff ff e8 42 44 e8 c4 e9
      <1>[  227.152720] RIP: intel_unpin_fb_vma+0x23a/0x2a0 [i915] RSP: ffff88005aad7b68
      
      v2: i915_vma_pin_fence() is a no-op if a fence isn't required, so check
      vma->fence as well.
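
The idea of tracking whether a fence was actually claimed, so that unpin never releases one it does not own, can be sketched as follows. This is a simplified model rather than the driver's code: PLANE_HAS_FENCE is the flag name used in this series, but its bit value, the `example_` types and the function signatures are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

#define PLANE_HAS_FENCE (1u << 0) /* bit value is an assumption */

struct example_fence {
	int pin_count;
};

struct example_vma {
	struct example_fence *fence; /* NULL when no fence is attached */
};

/*
 * The caller decides whether it wants a fence. The returned flags record
 * whether one was really pinned: asking for a fence is a no-op when the
 * vma has none, so vma->fence is checked as well (the v2 note above).
 */
static inline unsigned int example_pin_fb(struct example_vma *vma,
					  bool want_fence)
{
	unsigned int flags = 0;

	if (want_fence && vma->fence) {
		vma->fence->pin_count++;
		flags |= PLANE_HAS_FENCE;
	}
	return flags;
}

/* Only drop the fence pin if this user recorded that it holds one. */
static inline void example_unpin_fb(struct example_vma *vma,
				    unsigned int flags)
{
	if (flags & PLANE_HAS_FENCE)
		vma->fence->pin_count--;
}
```

Because the decision is taken from the recorded flags instead of from the current state of the fence, an unpin without a matching pin cannot drive pin_count to or below zero, which is the condition the GEM_BUG_ON in the trace above checks for.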
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180220134208.24988-2-chris@chris-wilson.co.uk
    • drm/i915: Assert that we don't overflow frontbuffer tracking bits · aa81e2c3
      Committed by Ville Syrjälä
      Add some compile time asserts to the frontbuffer tracking to make sure
      that we have enough bits per pipe to cover all the planes, and that we
      have enough total bits to cover all the planes across all pipes.
      
      We'll ignore any potential clash between the overlay bit and the
      plane bits because that will allow us to keep using a total of 32
      bits for the foreseeable future.
      
      While at it change the macros to use BIT() and GENMASK(). The latter
      gets rid of the hardcoded 0xff and thus means we can change the
      number of bits per pipe by just changing
      INTEL_FRONTBUFFER_BITS_PER_PIPE.
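
A userspace sketch of the kind of compile time checks described above is shown here. INTEL_FRONTBUFFER_BITS_PER_PIPE is named in the message and the value 8 matches the old hardcoded 0xff. The BIT() and GENMASK() definitions are 32-bit stand-ins for the kernel macros, and the `EXAMPLE_` names and limits are hypothetical.

```c
/* 32-bit userspace stand-ins for the kernel's BIT() and GENMASK(). */
#define BIT(n)        (1u << (n))
#define GENMASK(h, l) ((~0u << (l)) & (~0u >> (31 - (h))))

#define INTEL_FRONTBUFFER_BITS_PER_PIPE 8

/* Hypothetical limits, chosen only to exercise the asserts. */
#define EXAMPLE_MAX_PIPES  4
#define EXAMPLE_MAX_PLANES 8

/* One bit per plane, grouped by pipe. */
#define EXAMPLE_FRONTBUFFER(pipe, plane) \
	BIT((plane) + INTEL_FRONTBUFFER_BITS_PER_PIPE * (pipe))

/* All bits belonging to one pipe; no hardcoded 0xff needed. */
#define EXAMPLE_FRONTBUFFER_ALL_MASK(pipe) \
	GENMASK(INTEL_FRONTBUFFER_BITS_PER_PIPE * ((pipe) + 1) - 1, \
		INTEL_FRONTBUFFER_BITS_PER_PIPE * (pipe))

/* Enough bits per pipe to cover all the planes. */
_Static_assert(EXAMPLE_MAX_PLANES <= INTEL_FRONTBUFFER_BITS_PER_PIPE,
	       "not enough frontbuffer bits per pipe");

/* Enough total bits to cover all the planes across all pipes. */
_Static_assert(INTEL_FRONTBUFFER_BITS_PER_PIPE * EXAMPLE_MAX_PIPES <= 32,
	       "frontbuffer bits do not fit in 32 bits");
```

Raising EXAMPLE_MAX_PIPES to 5 in this sketch makes the second assert fail at build time, which is the kind of overflow the commit wants to catch early.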
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180124183642.32549-1-ville.syrjala@linux.intel.com
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
  12. 20 Feb, 2018 1 commit
    • drm/i915: Track number of pending freed objects · c9c70471
      Committed by Chris Wilson
      During igt, we frequently call into the driver to reset both HW and
      driver state (idling the device, waiting for it to become idle and
      freeing off old objects) to ensure that we start each test/subtest/pass
      from a known state. This process incurs an RCU barrier or two to ensure
      that any such pending frees are indeed flushed before we return.
      However, unconditionally waiting on the RCU barrier adds needless delay
      to many callers, which adds up to several seconds when repeated thousands
      of times. By tracking how many outstanding frees we have, we can skip the
      rcu_barrier() when we know there are none.
      
      The same path is used during suspend, where we may be able to save the
      unconditional RCU barrier.
      
      To put it into perspective with a completely meaningless
      microbenchmark, igt/gem_sync/idle is improved from 50ms to 30us on bdw.
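
The counting scheme described above can be modelled in userspace C11 as below. This is a sketch under assumptions: the `example_` names are invented, and example_barrier() merely counts calls where the driver would issue rcu_barrier().

```c
#include <stdatomic.h>
#include <stdbool.h>

struct example_mm {
	atomic_uint free_count;       /* frees queued but not yet completed */
	unsigned int barriers_issued; /* bookkeeping for this example only */
};

/* Called when an object free is deferred. */
static inline void example_queue_free(struct example_mm *mm)
{
	atomic_fetch_add(&mm->free_count, 1);
}

/* Called once the deferred free has actually run. */
static inline void example_free_done(struct example_mm *mm)
{
	atomic_fetch_sub(&mm->free_count, 1);
}

/* Stand-in for rcu_barrier(); here it only records that it was called. */
static inline void example_barrier(struct example_mm *mm)
{
	mm->barriers_issued++;
}

/* Pay for the barrier only when some free is still outstanding. */
static inline bool example_drain_freed_objects(struct example_mm *mm)
{
	if (atomic_load(&mm->free_count) == 0)
		return false; /* nothing pending, skip the barrier */

	example_barrier(mm);
	return true;
}
```

In the common idle case the drain reduces to a single atomic load, which is where the saving reported above would come from.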
      
      v2: Remove the extra synchronize_rcu() inside i915_drop_caches_set()
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180219220631.25001-1-chris@chris-wilson.co.uk
  13. 16 Feb, 2018 1 commit
  14. 14 Feb, 2018 4 commits
  15. 13 Feb, 2018 1 commit
  16. 10 Feb, 2018 2 commits
  17. 08 Feb, 2018 1 commit
  18. 07 Feb, 2018 2 commits
  19. 05 Feb, 2018 1 commit
  20. 02 Feb, 2018 4 commits
  21. 01 Feb, 2018 1 commit
  22. 31 Jan, 2018 3 commits
  23. 25 Jan, 2018 1 commit
  24. 24 Jan, 2018 1 commit
  25. 20 Jan, 2018 1 commit