1. 26 7月, 2012 2 次提交
  2. 20 6月, 2012 1 次提交
    • D
      drm/i915: disable flushing_list/gpu_write_list · cc889e0f
      Daniel Vetter 提交于
      This is just the minimal patch to disable all this code so that we can
      do decent amounts of QA before we rip it all out.
      
      The complicating thing is that we need to flush the gpu caches after
      the batchbuffer is emitted. Which is past the point of no return where
      execbuffer can't fail any more (otherwise we risk submitting the same
      batch multiple times).
      
      Hence we need to add a flag to track whether any caches associated
      with that ring are dirty. And emit the flush in add_request if that's
      the case.
      
      Note that this has a quite a few behaviour changes:
      - Caches get flushed/invalidated unconditionally.
      - Invalidation now happens after potential inter-ring sync.
      
      I've bantered around a bit with Chris on irc whether this fixes
      anything, and it might or might not. The only thing clear is that with
      these changes it's much easier to reason about correctness.
      
      Also rip out a lone get_next_request_seqno in the execbuffer
      retire_commands function. I've dug around and I couldn't figure out
      why that is still there, with the outstanding lazy request stuff it
      shouldn't be necessary.
      
      v2: Chris Wilson complained that I also invalidate the read caches
      when flushing after a batchbuffer. Now optimized.
      
      v3: Added some comments to explain the new flushing behaviour.
      
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      cc889e0f
  3. 14 6月, 2012 3 次提交
    • B
      drm/i915: possibly invalidate TLB before context switch · 12b0286f
      Ben Widawsky 提交于
      From http://intellinuxgraphics.org/documentation/SNB/IHD_OS_Vol1_Part3.pdf
      
      [DevSNB] If Flush TLB invalidation Mode is enabled it's the driver's
      responsibility to invalidate the TLBs at least once after the previous
      context switch after any GTT mappings changed (including new GTT
      entries).  This can be done by a pipelined PIPE_CONTROL with TLB inv bit
      set immediately before MI_SET_CONTEXT.
      
      On GEN7 the invalidation mode is explicitly set, but this appears to be
      lacking for GEN6. Since I don't know the history on this, I've decided
      to dynamically read the value at ring init time, and use that value
      throughout.
      
      v2: better comment (daniel)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      12b0286f
    • B
      drm/i915: context switch implementation · e0556841
      Ben Widawsky 提交于
      Implement the context switch code as well as the interfaces to do the
      context switch. This patch also doesn't match 1:1 with the RFC patches.
      The main difference is that from Daniel's responses the last context
      object is now stored instead of the last context. This aids in allows us
      to free the context data structure, and context object independently.
      
      There is room for optimization: this code will pin the context object
      until the next context is active. The optimal way to do it is to
      actually pin the object, move it to the active list, do the context
      switch, and then unpin it. This allows the eviction code to actually
      evict the context object if needed.
      
      The context switch code is missing workarounds, they will be implemented
      in future patches.
      
      v2: actually do obj->dirty=1 in switch (daniel)
      Modified comment around above
      Remove flags to context switch (daniel)
      Move mi_set_context code to i915_gem_context.c (daniel)
      Remove seqno , use lazy request instead (daniel)
      
      v3: use i915_gem_request_next_seqno instead of
            outstanding_lazy_request (Daniel)
      remove id's from trace events (Daniel)
      Put the context BO in the instruction domain (Daniel)
      Don't unref the BO is context switch fails (Chris)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      e0556841
    • B
      drm/i915: context basic create & destroy · 40521054
      Ben Widawsky 提交于
      Invent an abstraction for a hw context which is passed around through
      the core functions. The main bit a hw context holds is the buffer object
      which backs the context. The rest of the members are just helper
      functions. Specifically the ring member, which could likely go away if
      we decide to never implement whatever other hw context support exists.
      
      Of note here is the introduction of the 64k alignment constraint for the
      BO. If contexts become heavily used, we should consider tweaking this
      down to 4k. Until the contexts are merged and tested a bit though, I
      think 64k is a nice start (based on docs).
      
      Since we don't yet switch contexts, there is really not much complexity
      here. Creation/destruction works pretty much as one would expect. An idr
      is used to generate the context id numbers which are unique per file
      descriptor.
      
      v2: add DRM_DEBUG_DRIVERS to distinguish ENOMEM failures (ben)
      convert a BUG_ON to WARN_ON, default destruction is still fatal (ben)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      40521054
  4. 20 5月, 2012 1 次提交
    • C
      drm/i915: Introduce for_each_ring() macro · b4519513
      Chris Wilson 提交于
      In many places we wish to iterate over the rings associated with the
      GPU, so refactor them to use a common macro.
      
      Along the way, there are a few code removals that should be side-effect
      free and some rearrangement which should only have a cosmetic impact,
      such as error-state.
      
      Note that this slightly changes the semantics in the hangcheck code:
      We now always cycle through all enabled rings instead of
      short-circuiting the logic.
      
      v2: Pull in a couple of suggestions from Ben and Daniel for
      intel_ring_initialized() and not removing the warning (just moving them
      to a new home, closer to the error).
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
      [danvet: Added note to commit message about the small behaviour
      change, suggested by Ben Widawsky.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      b4519513
  5. 03 5月, 2012 5 次提交
  6. 13 4月, 2012 1 次提交
  7. 10 4月, 2012 1 次提交
  8. 15 2月, 2012 1 次提交
    • C
      drm/i915: Record the tail at each request and use it to estimate the head · a71d8d94
      Chris Wilson 提交于
      By recording the location of every request in the ringbuffer, we know
      that in order to retire the request the GPU must have finished reading
      it and so the GPU head is now beyond the tail of the request. We can
      therefore provide a conservative estimate of where the GPU is reading
      from in order to avoid having to read back the ring buffer registers
      when polling for space upon starting a new write into the ringbuffer.
      
      A secondary effect is that this allows us to convert
      intel_ring_buffer_wait() to use i915_wait_request() and so consolidate
      upon the single function to handle the complicated task of waiting upon
      the GPU. A necessary precaution is that we need to make that wait
      uninterruptible to match the existing conditions as all the callers of
      intel_ring_begin() have not been audited to handle ERESTARTSYS
      correctly.
      
      By using a conservative estimate for the head, and always processing all
      outstanding requests first, we prevent a race condition between using
      the estimate and direct reads of I915_RING_HEAD which could result in
      the value of the head going backwards, and the tail overflowing once
      again. We are also careful to mark any request that we skip over in
      order to free space in ring as consumed which provides a
      self-consistency check.
      
      Given sufficient abuse, such as a set of unthrottled GPU bound
      cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on
      Sandy Bridge (i5-2520m):
        firefox-paintball  18927ms -> 15646ms: 1.21x speedup
        firefox-fishtank   12563ms -> 11278ms: 1.11x speedup
      which is a mild consolation for the performance those traces achieved from
      exploiting the buggy autoreported head.
      
      v2: Add a few more comments and make request->tail a conservative
      estimate as suggested by Daniel Vetter.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: resolve conflicts with retirement defering and the lack of
      the autoreport head removal (that will go in through -fixes).]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      a71d8d94
  9. 30 1月, 2012 1 次提交
  10. 22 9月, 2011 1 次提交
    • B
      drm/i915: Dumb down the semaphore logic · c8c99b0f
      Ben Widawsky 提交于
      While I think the previous code is correct, it was hard to follow and
      hard to debug. Since we already have a ring abstraction, might as well
      use it to handle the semaphore updates and compares.
      
      I don't expect this code to make semaphores better or worse, but you
      never know...
      
      v2:
      Remove magic per Keith's suggestions.
      Ran Daniel's gem_ring_sync_loop test on this.
      
      v3:
      Ignored one of Keith's suggestions.
      
      v4:
      Removed some bloat per Daniel's recommendation.
      
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Keith Packard <keithp@keithp.com>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NKeith Packard <keithp@keithp.com>
      c8c99b0f
  11. 20 9月, 2011 1 次提交
  12. 13 7月, 2011 1 次提交
  13. 11 5月, 2011 2 次提交
  14. 06 3月, 2011 1 次提交
  15. 07 2月, 2011 1 次提交
  16. 21 1月, 2011 2 次提交
    • L
      i915: Fix i915 suspend delay · d7b9935a
      Linus Torvalds 提交于
      During system suspend, the "wait for ring buffer to empty" loop would
      always time out after three seconds, because the faster cached ring
      buffer head read would always return zero.  Force the slow-and-careful
      PIO read on all but the first iterations of the loop to fix it.
      
      This also removes the unused (and useless) 'actual_head' variable that
      tried to approximate doing this, but did it incorrectly.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: DRI mailing list <dri-devel@lists.freedesktop.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d7b9935a
    • C
      drm/i915/ringbuffer: Fix use of stale HEAD position whilst polling for space · c7dca47b
      Chris Wilson 提交于
      During suspend, Linus found that his machine would hang for 3 seconds,
      and identified that intel_ring_buffer_wait() was the culprit:
      
      "Because from looking at the code, I get the notion that
      "intel_read_status_page()" may not be exact. But what happens if that
      inexact value matches our cached ring->actual_head, so we never even
      try to read the exact case? Does it _stay_ inexact for arbitrarily
      long times? If so, we might wait for the ring to empty forever (well,
      until the timeout - the behavior I see), even though the ring really
      _is_ empty."
      
      As the reported HEAD position is only updated every time it crosses a
      64k boundary, whilst draining the ring it is indeed likely to remain one
      value. If that value matches the last known HEAD position, we never read
      the true value from the register and so trigger a timeout.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      c7dca47b
  17. 20 1月, 2011 1 次提交
    • C
      drm/i915: Initialise ring vfuncs for old DRI paths · e8616b6c
      Chris Wilson 提交于
      We weren't setting up the vfunc table when initialising the old DRI
      ringbuffer, leading to such OOPSes as:
      
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<(null)>] (null)
      PGD 10c441067 PUD 1185e5067 PMD 0
      Oops: 0010 [#1] PREEMPT SMP
      last sysfs file: /sys/class/dmi/id/chassis_asset_tag
      CPU 3
      Modules linked in: i915 drm_kms_helper drm fb fbdev i2c_algo_bit
      cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6
      nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mousedev
      usbhid hid option usb_wwan snd_hda_codec_via asus_atk0110 atl1e
      usbserial snd_hda_intel snd_hda_codec firmware_class snd_hwdep snd_pcm
      snd_seq snd_timer snd_seq_device processor parport_pc thermal snd
      thermal_sys parport 8250_pnp button rng_core rtc_cmos shpchp hwmon
      rtc_core ehci_hcd pci_hotplug uhci_hcd soundcore tpm_tis i2c_i801
      rtc_lib tpm serio_raw snd_page_alloc tpm_bios i2c_core usbcore psmouse
      intel_agp sg pcspkr sr_mod evdev cdrom ext3 jbd mbcache dm_mod sd_mod
      ata_piix libata scsi_mod unix
      Jan 18 15:49:29 lithui kernel:
      Pid: 3605, comm: Xorg Not tainted 2.6.36.2 #5 P5KPL-CM/System Product
      Name
      RIP: 0010:[<0000000000000000>]  [<(null)>] (null)
      RSP: 0018:ffff8801150d1d40  EFLAGS: 00010202
      RAX: 000000000001ffff RBX: ffff88011a011b00 RCX: 000000000001a704
      RDX: ffff880118566028 RSI: ffff880118566028 RDI: ffff880117876800
      RBP: ffff8801150d1d48 R08: ffff8801195fe300 R09: 00000000c0086444
      R10: 0000000000000001 R11: 0000000000003206 R12: ffff880117876800
      R13: ffff880118566000 R14: ffff880117876820 R15: ffff8801150d1df8
      FS:  00007f1038d456e0(0000) GS:ffff880001780000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 00000001187e7000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process Xorg (pid: 3605, threadinfo ffff8801150d0000, task
      ffff88011b016e40)
      Stack:
      ffffffffa043b8e6 ffff8801150d1d98 ffffffffa041768b dead000000000000
      <0> 0000000000000048 00007f1023f2a000 0000000000000044 0000000000000008
      <0> ffff88010d26bd80 ffff880117876800 ffff8801150d1df8 ffff8801150d1ea8
      Call Trace:
      [<ffffffffa043b8e6>] ? intel_ring_advance+0x16/0x20 [i915]
      [<ffffffffa041768b>] i915_irq_emit+0x15b/0x240 [i915]
      [<ffffffffa03ea7b1>] drm_ioctl+0x1f1/0x460 [drm]
      [<ffffffffa0417530>] ? i915_irq_emit+0x0/0x240 [i915]
      [<ffffffff810dd8f1>] ? do_sync_read+0xd1/0x120
      [<ffffffff81025b1f>] ? do_page_fault+0x1df/0x3d0
      [<ffffffff810ed5c7>] do_vfs_ioctl+0x97/0x550
      [<ffffffff8115c2ea>] ? security_file_permission+0x7a/0x90
      [<ffffffff810edb19>] sys_ioctl+0x99/0xa0
      [<ffffffff810024ab>] system_call_fastpath+0x16/0x1b
      Code:  Bad RIP value.
      RIP  [<(null)>] (null)
      RSP <ffff8801150d1d40>
      CR2: 0000000000000000
      Reported-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Tested-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29153
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23172Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: stable@kernel.org
      e8616b6c
  18. 19 1月, 2011 1 次提交
  19. 12 1月, 2011 6 次提交
  20. 14 12月, 2010 1 次提交
  21. 09 12月, 2010 1 次提交
  22. 05 12月, 2010 1 次提交
  23. 30 11月, 2010 1 次提交
  24. 24 11月, 2010 1 次提交
  25. 12 11月, 2010 1 次提交
  26. 11 11月, 2010 1 次提交