1. 22 Jan, 2013 3 commits
    • D
      drm/i915: clarify concurrent hang detect/gpu reset consistency · 7db0ba24
      Daniel Vetter committed
      Damien Lespiau wondered how racy the gpu reset/hang detection code is
      against concurrent gpu resets/hang detections or combinations thereof.
      Luckily the single work item is guaranteed to never run concurrently,
      so reset handling is already single-threaded.
      
      Hence we only have to worry about concurrent hang detections, or a
      hang detection firing off while we're still processing an older gpu
      reset request. Due to the new mechanism of setting the reset in
      progress flag and the ordering guaranteed by the schedule_work
      function there's nothing to do but add a comment explaining why we're
      safe.
      
      The only thing I've noticed is that we still try to reset the gpu now,
      even when it is declared terminally wedged. Add a check for that to
      avoid continuous warnings about failed resets, in case the hangcheck
      timer ever gets stuck.
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/i915: create a race-free reset detection · f69061be
      Daniel Vetter committed
      With the previous patch the state transition handling of the reset
      code itself is now (hopefully) race free and solid. But that still
      leaves out everyone else - with the various lock-free wait paths
      we have there's the possibility that the reset happens between the
      point where we read the seqno we should wait on and the actual wait.
      
      And if __wait_seqno then never sees the RESET_IN_PROGRESS state, we'll
      happily wait for a seqno which will in all likelihood never signal.
      
      In practice this is not a big problem since the X server gets
      constantly interrupted, and can then submit more work (hopefully) to
      unblock everyone else: As soon as a new seqno write lands, all waiters
      will unblock. But running the i-g-t reset testcase ZZ_hangman can
      expose this race, especially on slower hw with fewer cpu cores.
      
      Now looking forward to ARB_robustness and friends that's not the best
      possible behaviour, hence this patch adds a reset_counter to be able
      to detect any reset, even if a given thread never observed the
      in-progress state.
      
      The important part is to correctly order things:
      - The write side needs to increment the counter after any seqno gets
        reset.  Hence we need to do that at the end of the reset work, and
        again wake everyone up. We also need to place a barrier in between
        any possible seqno changes and the counter increment, since any
        unlock operations only guarantee that nothing leaks out, but do not
        prevent a later load operation from being moved ahead.
      - On the read side we need to ensure that no reset can sneak in and
        invalidate the seqno. In all cases we can use the one-sided barrier
        that unlock operations guarantee (of the lock protecting the
        respective seqno/ring pair) to ensure correct ordering. Hence it is
        sufficient to place the atomic read before the mutex/spin_unlock and
        no additional barriers are required.
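
The ordering rules above can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: `struct gpu_error` and the four helper names are hypothetical, bit 0 as the in-progress flag is assumed, and the sequentially consistent fence merely stands in for the kernel barrier the message talks about.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: bit 0 marks a reset in progress, the rest counts resets. */
#define RESET_IN_PROGRESS_FLAG 1u

struct gpu_error {
    atomic_uint reset_counter;
};

/* Read side: take the snapshot while still holding the lock that protects
 * the seqno/ring pair, i.e. before the mutex/spin_unlock. */
static unsigned int snapshot_reset_counter(struct gpu_error *err)
{
    return atomic_load(&err->reset_counter);
}

/* Waiter: any change since the snapshot means the seqno may be stale. */
static bool reset_seen_since(struct gpu_error *err, unsigned int snapshot)
{
    return atomic_load(&err->reset_counter) != snapshot;
}

/* Write side, step 1: flag the reset so lock holders back off. */
static void begin_reset(struct gpu_error *err)
{
    atomic_fetch_or(&err->reset_counter, RESET_IN_PROGRESS_FLAG);
}

/* Write side, step 2: after every seqno change, fence, then bump the
 * counter. Adding 1 to an odd value clears the flag and advances the count. */
static void finish_reset(struct gpu_error *err)
{
    atomic_thread_fence(memory_order_seq_cst);
    unsigned int prev = atomic_fetch_add(&err->reset_counter, 1u);
    assert(prev & RESET_IN_PROGRESS_FLAG); /* finish must follow begin */
    (void)prev;
}
```

A waiter that snapshots the counter under the lock and later sees a different value knows a reset intervened, even if it never observed the in-progress flag itself.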
      
      The end-result of all this is that we need to wake up everyone twice
      in a reset operation:
      - First, before the reset starts, to get any lockholders off the locks,
        so that the reset can proceed.
      - Second, after the reset is completed, to allow waiters to properly
        and reliably detect the reset condition and bail out.
      
      I admit that this entire reset_counter thing smells a bit like
      overkill, but I think it's justified since it makes it really explicit
      what the bail-out condition is. And we need a reset counter anyway to
      implement ARB_robustness, and imo with finer-grained locking on the
      horizon this is the most resilient scheme I could think of.
      
      v2: Drop spurious change in the wait_for_error EXIT_COND - we only
      need to wait until we leave the reset-in-progress wedged state.
      
      v3: Don't play tricks with barriers in the throttle ioctl, the
      spin_unlock is barrier enough.
      
      I've also considered using a little helper to grab the current
      reset_counter, but then decided that hiding the atomic_read isn't a
      great idea, since having it explicitly show up in the code is a nice
      reminder to reviewers to check the memory barriers.
      
      v4: Add a comment to explain why we need to fall through in
      __wait_seqno in the end variable assignments.
      
      v5: Review from Damien:
      - s/smb/smp/ in a comment
      - don't increment the reset counter after we've set it to WEDGED. Now
        we (again) properly wedge the gpu when the reset fails.
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  2. 20 Jan, 2013 14 commits
    • C
      drm/i915: Only apply the mb() when flushing the GTT domain during a finish · 97c809fd
      Chris Wilson committed
      Now that we seem to have brought order to the GTT barriers, the last one
      to review is the terminal barrier before we unbind the buffer from the
      GTT. This needs to only be performed if the buffer still resides in the
      GTT domain, and so we can skip some needless barriers otherwise.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • C
      drm/i915: Only insert the mb() before updating the fence parameter · d0a57789
      Chris Wilson committed
      With a fence, we only need to insert a memory barrier around the actual
      fence alteration for CPU accesses through the GTT. Performing the
      barrier in flush-fence was inserting unnecessary and expensive barriers
      for never fenced objects.
      
      Note that removing the barriers from flush-fence, which was effectively a
      barrier before every direct access through the GTT, revealed that we
      were missing a barrier before the first access through the GTT. Lack of
      that barrier was sufficient to cause GPU hangs.
      
      v2: Add a couple more comments to explain the new barriers
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/i915: clear up wedged transitions · 1f83fee0
      Daniel Vetter committed
      We have two important transitions of the wedged state in the current
      code:
      
      - 0 -> 1: This means a hang has been detected, and signals to everyone
        that they please get off any locks, so that the reset work item can
        do its job.
      
      - 1 -> 0: The reset handler has completed.
      
      Now the last transition mixes up two states: "Reset completed and
      successful" and "Reset failed". To distinguish these two we do some
      tricks with the reset completion, but I simply could not convince
      myself that this doesn't race under odd circumstances.
      
      Hence split this up, and add a new terminal state indicating that the
      hw is gone for good.
      
      Also add explicit #defines for both states, update comments.
      
      v2: Split out the reset handling bugfix for the throttle ioctl.
      
      v3: s/tmp/wedged/ suggested by Chris Wilson. Also fix up a rebase
      error which prevented this patch from actually compiling.
      
      v4: To unify the wedged state with the reset counter, keep the
      reset-in-progress state just as a flag. The terminally-wedged state is
      now denoted with a big number.
      
      v5: Add a comment to the reset_counter special values explaining that
      WEDGED & RESET_IN_PROGRESS needs to be true for the code to be
      correct.
      
      v6: Fixup logic errors introduced with the wedged+reset_counter
      unification. Since WEDGED implies reset-in-progress (in a way we're
      terminally stuck in the dead-but-reset-not-completed state), we need to
      ensure that we check for this everywhere. The specific bug was in
      wait_for_error, which would simply have timed out.
      
      v7: Extract an inline i915_reset_in_progress helper to make the code
      more readable. Also annotate the reset-in-progress case with an
      unlikely, to help the compiler optimize the fastpath. Do the same for
      the terminally wedged case with i915_terminally_wedged.
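
The state encoding described in v4 to v7 can be sketched as follows. This is a minimal illustration, not the driver's code: the constant values (bit 0 for the flag, all-ones for the wedged value) and the helper names without the `i915_` prefix are assumptions chosen to satisfy the "WEDGED & RESET_IN_PROGRESS needs to be true" requirement from v5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: the flag lives in bit 0, "terminally wedged" is a big
 * number that also has bit 0 set. */
#define RESET_IN_PROGRESS_FLAG 1u
#define WEDGED                 0xffffffffu

/* v5: the code is only correct if WEDGED implies reset-in-progress. */
static_assert((WEDGED & RESET_IN_PROGRESS_FLAG) != 0,
              "WEDGED must imply reset-in-progress");

/* True while a reset is pending and also once the hw is terminally wedged. */
static bool reset_in_progress(uint32_t reset_counter)
{
    return (reset_counter & RESET_IN_PROGRESS_FLAG) != 0;
}

/* True only in the terminal state, i.e. the hw is gone for good. */
static bool terminally_wedged(uint32_t reset_counter)
{
    return reset_counter == WEDGED;
}
```

Because the wedged value has the flag bit set, any path that only tests for reset-in-progress also stops on a dead GPU, which is the property the v6 fix restores.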
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/i915: fix reset handling in the throttle ioctl · 308887aa
      Daniel Vetter committed
      While auditing the code I've noticed one place (the throttle ioctl)
      which does not yet wait for the reset handler to complete and doesn't
      properly decode the wedge state into -EAGAIN/-EIO. Fix this up by
      calling the right helpers. This might explain the oddball "my
      compositor just died in a successful gpu reset" reports. Or maybe not, since
      current mesa doesn't use this ioctl to throttle command submission.
      
      The throttle ioctl doesn't take the struct_mutex, so to avoid busy-looping
      with -EAGAIN while a reset is in progress, check for errors first and wait
      for the handler to complete if a reset is pending by calling
      i915_gem_wait_for_error.
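
Decoding the wedge state into -EAGAIN/-EIO might look like the sketch below. The function name `check_wedge` and the constant values are assumptions for illustration; the only points taken from the message are that a pending reset maps to -EAGAIN and a dead GPU to -EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed encoding: flag in bit 0, all-ones means terminally wedged. */
#define RESET_IN_PROGRESS_FLAG 1u
#define WEDGED                 0xffffffffu

static_assert((WEDGED & RESET_IN_PROGRESS_FLAG) != 0,
              "WEDGED must imply reset-in-progress");

/* Translate the reset state into the error code a caller should see. */
static int check_wedge(uint32_t reset_counter)
{
    /* Test the terminal state first: WEDGED also has the flag bit set. */
    if (reset_counter == WEDGED)
        return -EIO;     /* hw is gone, retrying cannot help */
    if (reset_counter & RESET_IN_PROGRESS_FLAG)
        return -EAGAIN;  /* reset pending, caller may retry later */
    return 0;
}
```

A lock-free caller such as the throttle path would wait for the reset handler before retrying instead of spinning on -EAGAIN.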
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/i915: move wedged to the other gpu error handling stuff · 33196ded
      Daniel Vetter committed
      And to make Ben Widawsky happier, use the gpu_error instead of
      the entire device as the argument in some functions.
      
      Drop the outdated comment on ->wedged for now, a follow-up patch will
      change the semantics and add a proper comment again.
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/i915: extract hangcheck/reset/error_state state into substruct · 99584db3
      Daniel Vetter committed
      This has been sprinkled all over the place in dev_priv. I think
      it'd be good to also move all the code into a separate file like
      i915_gem_error.c, but that's for another patch.
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/i915: move dev_priv->mm out of line · 4b5aed62
      Daniel Vetter committed
      That one is really big, since it contains tons of comments explaining
      how things work. Which is nice ;-)
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • B
      drm/i915: Needs_dmar, not · 8d2e6308
      Ben Widawsky committed
      The reasoning behind our code taking two paths depending upon whether or
      not we may have been configured for IOMMU isn't clear to me. It should
      always be safe to use the pci mapping functions as they are designed to
      abstract the decision we were handling in i915.
      
      Aside from simpler code, removing another member for the intel_gtt
      struct is a nice motivation.
      
      I ran this by Chris, and he wasn't concerned about the extra kzalloc,
      and memory references vs. page_to_phys calculation in the case without
      IOMMU.
      
      v2: Update commit message
      
      v3: Remove needs_dmar addition from Zhenyu upstream
      
      This reverts (and then other stuff)
      commit 20652097
      Author: Zhenyu Wang <zhenyuw@linux.intel.com>
      Date:   Thu Dec 13 23:47:47 2012 +0800
      
          drm/i915: Fix missed needs_dmar setting
      
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
      Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      [danvet: Squash in follow-up fix to remove the bogus hunk which
      deleted the dma_mask configuration for gen6+.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • B
      drm/i915: Remove scratch page from shared · 9c61a32d
      Ben Widawsky committed
      We already had a mapping in both (minus the phys_addr in AGP).
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • B
      drm/i915: Cut out the infamous ILK w/a from AGP layer · a81cc00c
      Ben Widawsky committed
      And, move it to where the rest of the logic is.
      
      There are some slight functionality changes. There were extra paranoid
      checks in AGP code making sure we never do idle maps on gen2 parts. That
      was not duplicated as the simple PCI id check should do the right thing.
      
      v2: use IS_GEN5 && IS_MOBILE check instead. For now, this is the same as
      IS_IRONLAKE_M but is more future proof. The workaround docs hint that
      more than one platform may be affected, but we've never seen such a
      platform in the wild. (Rodrigo, Daniel)
      
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v1)
      Cc: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • V
      drm/i915: Provide the quantization range in the AVI infoframe · abedc077
      Ville Syrjälä committed
      The AVI infoframe is able to inform the display whether the source is
      sending full or limited range RGB data.
      
      As per CEA-861 [1] we must first check whether the display reports the
      quantization range as selectable, and if so we can set the appropriate
      bits in the AVI infoframe.
      
      [1] CEA-861-E - 6.4 Format of Version 2 AVI InfoFrame
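
The check-then-set logic can be sketched like this. The field position (Q1:Q0 in bits 3:2 of AVI data byte 3) and the value encoding reflect one reading of CEA-861-E section 6.4 and should be verified against the spec; the function and enum names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding of the two Q bits. */
enum quant_range {
    QUANT_DEFAULT = 0, /* sink decides based on the video format */
    QUANT_LIMITED = 1,
    QUANT_FULL    = 2
};

#define AVI_Q_SHIFT 2
#define AVI_Q_MASK  (0x3u << AVI_Q_SHIFT)

static_assert(QUANT_FULL <= 3, "the Q field is two bits wide");

/* Return data byte 3 with the Q bits updated. The range is only announced
 * when the sink reports the quantization range as selectable. */
static uint8_t avi_set_quant_range(uint8_t byte3, bool sink_selectable,
                                   bool limited)
{
    enum quant_range q = QUANT_DEFAULT;

    if (sink_selectable)
        q = limited ? QUANT_LIMITED : QUANT_FULL;

    return (uint8_t)((byte3 & ~AVI_Q_MASK) | ((unsigned int)q << AVI_Q_SHIFT));
}
```

When the sink does not advertise a selectable range, the Q bits stay at the default value and the other bits of the byte are left untouched.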
      
      v2: Give the Q bits better names, add spec chapter information
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • V
      drm/i915: Add "Automatic" mode for the "Broadcast RGB" property · 55bc60db
      Ville Syrjälä committed
      Add a new "Automatic" mode to the "Broadcast RGB" range property.
      When selected the driver automagically selects between full range and
      limited range output.
      
      Based on CEA-861 [1] guidelines, limited range output is selected if the
      mode is a CEA mode, except 640x480. Otherwise full range output is used.
      Additionally DVI monitors should most likely default to full range
      always.
      
      As per DP1.2a [2] DisplayPort should always use full range for 18bpp, and
      otherwise will follow CEA-861 rules.
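
The selection rules above can be condensed into a small decision function. This is a sketch under assumptions: the `sink_type` enum, the function name, and the convention that `cea_vic` is 0 for non-CEA modes and 1 for 640x480 are illustrative choices, not the driver's interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the connected sink. */
enum sink_type { SINK_DVI, SINK_HDMI, SINK_DP };

/* Decide whether "Automatic" should pick limited range output.
 * cea_vic: 0 if the mode is not a CEA mode, otherwise its CEA video
 *          identification code (640x480 is assumed to be code 1).
 * bpp:     bits per pixel on the link. */
static bool auto_limited_range(enum sink_type sink, int cea_vic, int bpp)
{
    assert(bpp > 0);

    if (sink == SINK_DVI)
        return false;           /* DVI monitors default to full range */
    if (sink == SINK_DP && bpp == 18)
        return false;           /* DP at 18bpp always uses full range */

    /* CEA-861 rule: limited range for CEA modes, except 640x480. */
    return cea_vic > 1;
}
```

For example, an HDMI sink running a CEA mode other than 640x480 gets limited range, while the same sink at a non-CEA mode stays at full range.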
      
      NOTE: The default value for the property will now be "Automatic"
      so some people may be affected in case they're relying on the
      current full range default.
      
      [1] CEA-861-E - 5.1 Default Encoding Parameters
      [2] VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry
      
      v2: Use has_hdmi_sink to check if a HDMI monitor is present
      v3: Add information about relevant spec chapters
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • V
      drm/i915: Fix RGB color range property for PCH platforms · 3685a8f3
      Ville Syrjälä committed
      The RGB color range select bit on the DP/SDVO/HDMI registers
      disappeared when PCH was introduced, and instead a new PIPECONF bit
      was added that performs the same function.
      
      Add a new INTEL_MODE_LIMITED_COLOR_RANGE private mode flag, and set
      it in the encoder mode_fixup if limited color range is requested.
      Set the PIPECONF bit 13 based on the flag.
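
A minimal sketch of the flag-to-register step, assuming a hypothetical flag value and helper name. Only the bit position (13) comes from the message; everything else is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value for the private mode flag. */
#define MODE_LIMITED_COLOR_RANGE    (1u << 0)
/* PIPECONF bit 13 selects limited color range on PCH platforms. */
#define PIPECONF_COLOR_RANGE_SELECT (1u << 13)

static_assert(PIPECONF_COLOR_RANGE_SELECT == 0x2000u, "bit 13");

/* Compute the PIPECONF value to program from the mode's private flags.
 * The result only takes effect when the pipe is restarted. */
static uint32_t pipeconf_apply_color_range(uint32_t pipeconf,
                                           uint32_t private_flags)
{
    pipeconf &= ~PIPECONF_COLOR_RANGE_SELECT;
    if (private_flags & MODE_LIMITED_COLOR_RANGE)
        pipeconf |= PIPECONF_COLOR_RANGE_SELECT;
    return pipeconf;
}
```

Clearing the bit first keeps the function idempotent, so switching back to full range does not leave a stale limited-range setting behind.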
      
      Experimentation showed that simply toggling the bit while the pipe is
      active doesn't work. We need to restart the pipe, which luckily already
      happens.
      
      The DP/SDVO/HDMI bit 8 is marked MBZ in the docs, so avoid setting it,
      although it doesn't seem to do any harm in practice.
      
      TODO:
      - the PIPECONF bit too seems to have disappeared from HSW. Need a
        volunteer to test if it's just a documentation issue or if it's really
        gone. If the bit is gone and no easy replacement is found, then I suppose
        we may need to use the pipe CSC unit to perform the range compression.
      
      v2: Use mode private_flags instead of intel_encoder virtual functions
      v3: Moved the intel_dp color_range handling after bpc check to help
          later patches
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46800
      Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • B
      drm/i915: Remove use of gtt_mappable_entries · 93d18799
      Ben Widawsky committed
      Mappable_end, i.e. size, is almost always what you want as opposed to the
      number of entries. Since we already have that information, we can scrap
      the number of entries and only calculate it when needed.
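
Deriving the entry count from the size on demand is a one-line computation. The sketch below assumes 4 KiB pages and uses hypothetical helper names; the 64MB and 512MB bounds are the limits named later in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages, one entry per page */
#define MiB(x)     ((uint64_t)(x) << 20)

static_assert(PAGE_SHIFT < 32, "sane page shift");

/* Number of GTT entries covering the mappable region, computed only
 * when needed instead of being stored separately. */
static uint64_t gtt_entries(uint64_t mappable_end)
{
    return mappable_end >> PAGE_SHIFT;
}

/* Init-time sanity check against the known lower and upper limits. */
static bool mappable_size_ok(uint64_t mappable_end)
{
    return mappable_end >= MiB(64) && mappable_end <= MiB(512);
}
```

With these assumptions a 256MB mappable region corresponds to 65536 entries.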
      
      If gtt_start is !0, this will have slightly different behavior. This
      difference can only occur in DRI1, and exists when we try to kick out
      the firmware fb. The new code seems like a bugfix to me.
      
      The other case where we've changed the behavior is during init we check
      the mappable region against our current known upper and lower limits
      (64MB, and 512MB). This now matches the comment, and makes things more
      convenient after removing gtt_mappable_entries.
      
      Also worth noting is the setting of mappable_end is taken out of setup
      because we do it earlier now in the DRI2 case and therefore need to add
      that tiny hunk to support the DRI1 IOCTL.
      
      v2: Move up mappable end to before legacy AGP init
      
      v3: Add the dev_priv inclusion here from previous rebase error in patch
      5
      
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      [danvet: squash in fix for a printk format flag mismatch warning.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  3. 18 Jan, 2013 18 commits
  4. 08 Jan, 2013 1 commit
    • S
      drm/prime: drop reference on imported dma-buf come from gem · be8a42ae
      Seung-Woo Kim committed
      Taking references on both the dma-buf and the gem object when importing a
      dma-buf that was exported from gem causes a memory leak. The release
      function of the dma-buf can never be called because the import raised the
      dma-buf's f_count, and the gem ref count can never be decreased because of
      the exported dma-buf.
      
      So add dma_buf_put() for a gem object imported from the driver's own gem to
      each driver that has prime_import and prime_export capabilities. With this,
      only the gem ref count is increased when importing a dma-buf exported from
      a gem object of the same driver.
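
The reference accounting can be modelled with a toy sketch. All types and functions here carry a `toy_` prefix to make clear they are stand-ins and not the kernel's dma-buf or gem API; the model assumes the caller already took a dma-buf reference before calling the import function.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a gem object and the dma-buf that exports it. */
struct toy_gem_object {
    int refcount;
    const void *owner;          /* device that created the object */
};

struct toy_dma_buf {
    int f_count;                /* file reference count of the dma-buf */
    struct toy_gem_object *priv;
};

static void toy_dma_buf_get(struct toy_dma_buf *buf)
{
    buf->f_count++;
}

static void toy_dma_buf_put(struct toy_dma_buf *buf)
{
    assert(buf->f_count > 0);
    buf->f_count--;
}

/* Import a dma-buf on which the caller already took a reference.
 * Returns the backing gem object for a self-import, NULL otherwise. */
static struct toy_gem_object *toy_prime_import(const void *dev,
                                               struct toy_dma_buf *buf)
{
    struct toy_gem_object *obj = buf->priv;

    if (obj != NULL && obj->owner == dev) {
        obj->refcount++;        /* keep only a gem reference */
        toy_dma_buf_put(buf);   /* the fix: drop the extra dma-buf ref */
        return obj;
    }
    return NULL;                /* foreign buffer: not modelled here */
}
```

Without the put, the dma-buf count would stay elevated forever and neither object could be released, which is the leak the message describes.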
      Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
      Signed-off-by: Kyungmin.park <kyungmin.park@samsung.com>
      Cc: Inki Dae <inki.dae@samsung.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Rob Clark <rob.clark@linaro.org>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
  5. 04 Jan, 2013 1 commit
    • G
      Drivers: gpu: remove __dev* attributes. · 56550d94
      Greg Kroah-Hartman committed
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, and __devexit
      from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: David Airlie <airlied@linux.ie>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  6. 21 Dec, 2012 1 commit
  7. 20 Dec, 2012 2 commits
    • B
      drm/i915: Make GSM void · 1c45140d
      Ben Widawsky committed
      The iomapping of the register region has historically been a uint32_t
      for the obvious reason that our PTE size was always 4 bytes. In the future
      however, we cannot make this assumption.
      
      By making the type void, it makes the upcoming pointer math we will do
      much easier, and hopefully gives the compiler opportunities to warn us
      when we do stupid things.
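
Why an untyped base pointer helps with variable PTE sizes can be shown with a small sketch. The helper name `pte_addr` is hypothetical, and plain memory stands in for the real I/O mapping; the cast to `unsigned char *` keeps the arithmetic valid in standard C, which does not define arithmetic on `void *` directly.

```c
#include <assert.h>
#include <stddef.h>

/* Address of entry `index` in a page table whose entries are `pte_size`
 * bytes wide. With a uint32_t base the stride would be fixed at 4 bytes;
 * an untyped base lets the caller choose the entry size. */
static void *pte_addr(void *gsm, size_t index, size_t pte_size)
{
    assert(gsm != NULL);
    return (unsigned char *)gsm + index * pte_size;
}
```

The same helper then serves both 4-byte and 8-byte entries: entry 3 sits at byte offset 12 in the first case and 24 in the second.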
      
      v2: Cast to __iomem, caught by Ville
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      [danvet: Fixup __iomem issue for real.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • B
      drm/i915: Move GSM mapping into dev_priv · 06e5598f
      Ben Widawsky committed
      This removes an unused field from the AGP structure and moves it into
      the dev_priv structure (with a slightly better name). This builds upon
      the kill-agp series already merged.
      
      GSM is a well defined term in the bspec:
      GSM: Graphics Stolen Memory
      
      GTT stolen space is defined for storage of the GFX GTT entries in
      physical memory. IA cannot access GSM directly; it can only access it via
      GTTMMADR. GT can access GSM directly or through GTTMMADR.
      
      This is not the entire stolen space.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>