1. 13 Jan, 2015 2 commits
    • M
      drm/i915: Refactor work that can sleep out of commit (v7) · 32b7eeec
      Matt Roper committed
      Once we integrate our work into the atomic pipeline, plane commit
      operations will need to happen with interrupts disabled, due to vblank
      evasion.  Our commit functions today include sleepable work, so those
      operations need to be split out and run either before or after the
      atomic register programming.
      
      The solution here calculates which of those operations will need to be
      performed during the 'check' phase and sets flags in an intel_crtc
      sub-struct.  New intel_begin_crtc_commit() and
      intel_finish_crtc_commit() functions are added before and after the
      actual register programming; these will eventually be called from the
      atomic plane helper's .atomic_begin() and .atomic_end() entrypoints.
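      
      The split described above can be sketched as a small userspace model. This is
      only an illustration under assumptions: every `sketch_` name is hypothetical,
      the flag names are modelled on the 'wait_for_flips' flag mentioned in the
      changelog, and the `irqs_disabled` field merely stands in for the
      vblank-evasion critical section rather than the driver's real mechanism.
      
      ```c
      #include <stdbool.h>
      
      /* Hypothetical stand-in for the flags sub-struct kept in intel_crtc. */
      struct sketch_commit_flags {
          bool wait_for_flips;    /* sleepable work to run before programming */
          bool update_watermarks; /* sleepable work to run after programming */
      };
      
      struct sketch_crtc {
          struct sketch_commit_flags flags;
          bool irqs_disabled;     /* models the critical section */
          int sleeps_in_atomic;   /* counts sleepable work done inside it */
          int regs_written;
      };
      
      /* Any sleepable operation funnels through here so violations are counted. */
      static void sketch_sleepable_work(struct sketch_crtc *crtc)
      {
          if (crtc->irqs_disabled)
              crtc->sleeps_in_atomic++;
      }
      
      /* 'check' phase: only record what will be needed; do not wait here. */
      void sketch_check_plane(struct sketch_crtc *crtc, bool hides_primary)
      {
          crtc->flags.wait_for_flips = hides_primary;
          crtc->flags.update_watermarks = true;
      }
      
      /* Runs before register programming; sleeping is still allowed. */
      void sketch_begin_crtc_commit(struct sketch_crtc *crtc)
      {
          if (crtc->flags.wait_for_flips)
              sketch_sleepable_work(crtc);
          crtc->irqs_disabled = true;
      }
      
      /* Register programming: must not sleep. */
      void sketch_commit_plane(struct sketch_crtc *crtc)
      {
          crtc->regs_written++;
      }
      
      /* Runs after register programming; sleeping is allowed again. */
      void sketch_finish_crtc_commit(struct sketch_crtc *crtc)
      {
          crtc->irqs_disabled = false;
          if (crtc->flags.update_watermarks)
              sketch_sleepable_work(crtc);
          crtc->flags = (struct sketch_commit_flags){ false, false };
      }
      ```
      
      Calling check, begin, commit and finish in that order leaves
      `sleeps_in_atomic` at zero, which is the property the refactoring is after.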
      
      v2: Fix broken sprite code split
      
      v3: Make the pre/post commit work crtc-based to match how we eventually
          want this to be called from the atomic plane helpers.
      
      v4: Some platforms that haven't had their watermark code reworked were
          waiting for vblank, then calling update_sprite_watermarks in their
          platform-specific disable code.  These also need to be flagged out
          of the critical section.
      
      v5: Sprite plane test for primary show/hide should just set the flag to
          wait for pending flips, not actually perform the wait.  (Ander)
      
      v6:
       - Rebase onto latest di-nightly; picks up an important runtime PM fix.
       - Handle 'wait_for_flips' flag in intel_begin_crtc_commit(). (Ander)
       - Use wait_for_flips flag for primary plane update rather than
         performing the wait in the check routine.
       - Added kerneldoc to pre_disable/post_enable functions that are no
         longer static.  (Ander)
       - Replace assert_pipe_enabled() in intel_disable_primary_hw_plane()
         with an intel_crtc->active test; it turns out assert_pipe_enabled()
         grabs some mutexes and can sleep, which we can't do with interrupts
         disabled.
      
      v7:
       - Check for fb != NULL when deciding whether the sprite plane hides the
         primary plane during a sprite update.  (PRTS)
      
      Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
      Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • J
      drm/i915: fix build for CONFIG_BUG=n · 2f3408c7
      Jani Nikula committed
      If CONFIG_BUG=n __WARN_printf won't be defined leading to the below
      build failure. The double underscores should have told us to steer clear
      of it anyway.
      
      drivers/gpu/drm/i915/intel_display.c: In function ‘assert_pll’:
      drivers/gpu/drm/i915/intel_display.c:1027:2: error: implicit declaration
      of function ‘__WARN_printf’ [-Werror=implicit-function-declaration]
        I915_STATE_WARN(cur_state != state,
      
      Use WARN(1, ...) instead. It handles CONFIG_BUG=n gracefully and, with
      the constant condition, a sane compiler should reduce it to
      __WARN_printf.
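      
      The idea of building on the public macro rather than on its internals can be
      sketched in userspace. This is a simplified model, not the kernel's
      definitions: `SKETCH_CONFIG_BUG`, `SKETCH_WARN` and `sketch_state_warn` are
      hypothetical names chosen for illustration.
      
      ```c
      #include <stdio.h>
      
      /* Counts warnings that actually produced output (sketch-only bookkeeping). */
      int sketch_warnings_printed;
      
      #ifdef SKETCH_CONFIG_BUG
      /* Option on: a true condition prints a message and evaluates to 1. */
      #define SKETCH_WARN(cond, msg) \
          ((cond) ? (fprintf(stderr, "WARNING: %s\n", (msg)), \
                     sketch_warnings_printed++, 1) : 0)
      #else
      /* Option off: the macro still exists and still yields the condition. */
      #define SKETCH_WARN(cond, msg) ((void)(msg), !!(cond))
      #endif
      
      /* State check that relies only on the public macro in both configurations. */
      int sketch_state_warn(int condition, const char *msg)
      {
          if (condition)
              (void)SKETCH_WARN(1, msg);
          return !!condition;
      }
      ```
      
      Because the wrapper never names an internal helper, it compiles whether or
      not `SKETCH_CONFIG_BUG` is defined.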
      
      This is a regression introduced by
      
      commit e2c719b7
      Author: Rob Clark <robdclark@gmail.com>
      Date:   Mon Dec 15 13:56:32 2014 -0500
      
          drm/i915: tame the chattermouth (v2)
      
      Reported-by: Jim Davis <jim.epost@gmail.com>
      Reference: http://mid.gmane.org/CA+r1ZhgHTi7bS2irhtuSUs9aO=Br1dumN8=oAOeaMJDZ_ZhwBw@mail.gmail.com
      Cc: Rob Clark <robdclark@gmail.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  2. 12 Jan, 2015 3 commits
  3. 08 Jan, 2015 3 commits
    • T
      drm/i915: Reserve shadow batch VMA analogue to others · 7226572d
      Tvrtko Ursulin committed
      If it is not pinned, the VMA can become an eviction target just before it
      needs to be executed, which breaks the internal object lifetime rules.
      
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87399
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • C
      drm/i915: Add ioctl to set per-context parameters · c9dc0f35
      Chris Wilson committed
      Sometimes we wish to tweak how an individual context behaves. Since we
      always create a context for every filp, this means that individual
      processes can fine tune their behaviour even if they do not explicitly
      create a context.
      
      The first example parameter here is to enable multi-process GPU testing,
      but the interface should be able to cope with passing arbitrarily complex
      parameters.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Testcase: igt/gem_reset_stats/ban-period-*
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/i915: Push vblank enable/disable past encoder->enable/disable · f9b61ff6
      Daniel Vetter committed
      It is platform/output dependent when exactly the pipe will start
      running. Sometimes we just need the (cpu) pipe enabled, in other cases
      the pch transcoder is enough and in yet other cases the (DP) port is
      sending the frame start signal.
      
      In a perfect world we'd put the drm_crtc_vblank_on call exactly where
      the pipe starts running, but due to cloning and similar things this
      will get messy. And the current approach of picking the most
      conservative place for all combinations also doesn't work since that
      results in legit vblank waits (in encoder->enable hooks, e.g. the 2
      vblank waits for sdvo) failing.
      
      Completely going back to the old world before
      
      commit 51e31d49
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Mon Sep 15 12:36:02 2014 +0200
      
          drm/i915: Use generic vblank wait
      
      isn't great either since screaming when the vblank wait fails because
      the pipe is off is kinda nice.
      
      Pick a compromise and move the drm_crtc_vblank_on right before the
      encoder->enable call. This is a lie on some outputs/platforms, but
      after the ->enable callback the pipe is guaranteed to run everywhere.
      So not that bad really. Suggested by Ville.
      
      v2: Same treatment for drm_crtc_vblank_off and encoder->disable: I've
      missed the ibx pipe B select w/a, which also has a vblank wait in the
      disable function (while the pipe is obviously still running).
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
  4. 07 Jan, 2015 5 commits
  5. 06 Jan, 2015 10 commits
    • A
      drm/i915: Support creation of unbound wc user mappings for objects · 1816f923
      Akash Goel committed
      This patch adds support for creating write-combining virtual mappings of
      GEM objects. It intends to provide the same functionality as the 'mmap_gtt'
      interface without the constraints and contention of a limited aperture
      space, but requires clients to handle the linear to tile conversion on their
      own. This improves CPU write performance: with such a mapping, writes and
      reads are almost 50% faster than with mmap_gtt. Similar to the GTT mmapping,
      and unlike the regular CPU mmapping, it avoids the cache flush after an
      update from the CPU side when the object is passed on to the GPU.  This
      type of mapping is especially useful in case of sub-region update,
      i.e. when only a portion of the object is to be updated. Using a CPU mmap
      in such cases would normally incur a clflush of the whole object, and
      using a GTT mmapping would likely require eviction of an active object or
      fence and thus stall. The write-combining CPU mmap avoids both.
      
      To ensure the cache coherency, before using this mapping, the GTT domain
      has been reused here. This provides the required cache flush if the object
      is in CPU domain or synchronization against the concurrent rendering.
      Although the access through an uncached mmap should automatically
      invalidate the cache lines, this may not be true for non-temporal write
      instructions and also not all pages of the object may be updated at any
      given point of time through this mapping.  Having a call to get_pages in
      set_to_gtt_domain function, as added in the earlier patch 'drm/i915:
      Broaden application of set-domain(GTT)', would guarantee the clflush and
      so there will be no cachelines holding the data for the object before it
      is accessed through this map.
      
      The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been
      extended with a new flags field (defaulting to 0 for existing users). In
      order for userspace to detect the extended ioctl, a new parameter
      I915_PARAM_MMAP_VERSION has been added for versioning the ioctl interface.
      
      v2: Fix error handling, invalid flag detection, renaming (ickle)
      
      v3: Rebase to latest drm-intel-nightly codebase
      
      The new mmapping is exercised by igt/gem_mmap_wc,
      igt/gem_concurrent_blit and igt/gem_gtt_speed.
      
      Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a
      Signed-off-by: Akash Goel <akash.goel@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • C
      drm/i915: Broaden application of set-domain(GTT) · 43566ded
      Chris Wilson committed
      Previously, this was restricted to only operate on bound objects - to
      make pointer access through the GTT to the object coherent with writes
      to and from the GPU. A second usecase is drm_intel_bo_wait_rendering()
      which at present does not function unless the object also happens to
      be bound into the GGTT (on current systems that is becoming increasingly
      rare, especially for the typical requests from mesa). A third usecase is
      a future patch wishing to extend the coverage of the GTT domain to
      include objects not bound into the GGTT but still in its coherent cache
      domain. For the latter pair of requests, we need to operate on the
      object regardless of its bind state.
      
      v2: After discussion with Akash, we came to the conclusion that the
      get-pages was required in order for accurate domain tracking in the
      corner cases (like the shrinker) and also useful for ensuring memory
      coherency with earlier cached CPU mmaps in case userspace uses exotic
      cache bypass (non-temporal) instructions.
      
      v3: Fix the inactive object check.
      
      v4: Rebase to latest drm-intel-nightly codebase
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Akash Goel <akash.goel@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • B
      drm/i915: Add some extra guards in evict_vm · b9b5dce5
      Ben Widawsky committed
      v2: Use WARN_ONs (Daniel)
      
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/i915: Include i915_gem_evict.c kerneldoc into the drm docbook · 7838a63a
      Daniel Vetter committed
      I've written these long before we've had a reasonable docbook
      structure, and naturally they've gone stale. Fix this up asap.
      
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
    • K
      drm/i915: Make sample_c messages go faster on Haswell. · 94411593
      Kenneth Graunke committed
      Haswell significantly improved the performance of sample_c messages,
      but the optimization appears to be off by default.  Later platforms
      remove this bit, and apparently always enable the optimization.
      
      Improves performance in "Counter Strike: Global Offensive" by 18%
      at default settings on Iris Pro.
      
      This may break sampling of paletted formats (P8/A8P8/P8A8).  It's
      unclear whether it affects sampling of paletted formats in general,
      or just the sample_c message (which is never used).
      
      While libva does have support for using paletted formats (primarily
      for OSDs), that support appears to have been broken for at least a
      year, so I couldn't observe a regression from this:
      
      I tried to get libva-intel to use paletted formats, and observe a
      regression...but the only thing I found that used it was mplayer's OSD
      (on screen display).  Even without my patch, the colors were totally
      wrong with that, and according to a few distro wikis, that's been
      the case for over a year.
      
      If libva's code for paletted formats /is/ broken, they could always
      add code to disable this bit using the command validator when fixing
      it.
      
      Further investigation from Haihao shows that libva mplayer OSD seems
      to work at least on his setup (still unclear what's wrong with Ken's),
      and that it's not affected by this patch. Quoting the discussion
      between Haihao and Ken:
      
      > > > If you use "-vo gl" or "-vo xv", the OSD is solid white text with a black
      > > > border around it.  I presume that it's supposed to be white with vaapi as
      > > > well, but I guess I'm not entirely sure.
      > > >
      > > > It's possible that the optimization doesn't affect the palette as long as
      > > > you never use sample_c with the paletted textures.
      > >
      > > I verified the palette takes effect in the following way:
      > >
      > > 1. Only support P8A8 format in the driver
      > >
      > > 2. ran the above command and I saw white OSD text
      > >
      > > 3. Only support P4A4 format in the driver and don't use
      > > 3DSTATE_SAMPLER_PALETTE_LOAD0 to load the value to the texture palette,
      > > so the palette keeps unchanged.
      > >
      > > 4. ran the above command and I saw black OSD text.
      > >
      > > 5. Load the right value to the texture palette and ran the above command
      > > again, I saw white OSD text.
      > >
      > > Hence I think sample_c with the paletted textures is used in the driver.
      >
      > That sounds like the palette is actually working, then.  Great :)
      >
      > I doubt that libva would use sample_c - sampling with a shadow comparison?
      > It looks like it just uses sample and sample+killpix.
      
      You are right, libva driver doesn't use sample_c message.
      
      > I'm pretty sure the sample_c optimization just uses the palette memory as
      > storage for some stuff, so it's quite possible it just works if you're
      > only using sample and sample+killpix.
      
      Thanks for the explanation, it makes sense to me.
      
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      [danvet: Add wa name from Ville's review to the comment and copypaste
      the explanation why we don't care about libva (already broken) from
      Ken. Also add conclusion from libva devs that&why this is all fine.]
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: "Xiang, Haihao" <haihao.xiang@intel.com>
      Cc: libva@lists.freedesktop.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • D
      drm/radeon: integer underflow in radeon_cp_dispatch_texture() · dd5a74f2
      Dan Carpenter committed
      The test:
      
      	if (size > RADEON_MAX_TEXTURE_SIZE) {
      
      "size" is an integer and it's controlled by the user so it can be
      negative and the test can underflow.  Later we use "size" in:
      
      	dwords = size / 4;
      	...
      	RADEON_COPY_MT(buffer, data, (int)(dwords * sizeof(u32)));
      
      It causes memory corruption to copy a negative size buffer.
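      
      A minimal sketch of why an upper-bound-only test is not enough for a signed
      size. The limit value and the `sketch_` names are assumptions made for
      illustration; this shows one possible guard, and the actual patch may address
      the problem differently (for example by changing the variable's type).
      
      ```c
      /* Hypothetical limit; the real RADEON_MAX_TEXTURE_SIZE comes from the driver. */
      #define SKETCH_MAX_TEXTURE_SIZE (64 * 1024)
      
      /* Upper-bound-only test: a negative size slips through. */
      int sketch_size_ok_unsafe(int size)
      {
          return !(size > SKETCH_MAX_TEXTURE_SIZE);
      }
      
      /* Reject negative values as well, before the size reaches any copy. */
      int sketch_size_ok_checked(int size)
      {
          return size >= 0 && size <= SKETCH_MAX_TEXTURE_SIZE;
      }
      ```
      
      With a user-supplied size of -4, the first helper accepts the value while
      the second rejects it.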
      
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • A
      drm/radeon: adjust default bapm settings for KV · 02ae7af5
      Alex Deucher committed
      Enabling bapm seems to cause clocking problems on some
      KV configurations.  Disable it by default for now.
      
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
    • A
      drm/radeon: properly filter DP1.2 4k modes on non-DP1.2 hw · 410cce2a
      Alex Deucher committed
      The check was already in place in the dp mode_valid check, but
      radeon_dp_get_dp_link_clock() never returned the high clock
      mode_valid was checking for because that function clipped the
      clock based on the hw capabilities.  Add an explicit check
      in the mode_valid function.
      
      bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=87172
      
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc:stable@vge.kernel.org
    • A
      drm/radeon: fix sad_count check for dce3 · 5665c3eb
      Alex Deucher committed
      Make it consistent with the sad code for other asics to deal
      with monitors that don't report sads.
      
      bug:
      https://bugzilla.kernel.org/show_bug.cgi?id=89461
      
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
    • A
      drm/radeon: KV has three PPLLs (v2) · fbedf1c3
      Alex Deucher committed
      Enable all three in the driver.  Early documentation
      indicated the 3rd one was used for something else, but
      that is not the case.
      
      v2: handle disable as well
      
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
  6. 05 Jan, 2015 1 commit
  7. 29 Dec, 2014 2 commits
    • O
      drm/radeon: Init amdkfd only if it was compiled · 38c2adfb
      Oded Gabbay committed
      This patch changes radeon_kfd_init(), which is used to initialize the
      interface between radeon and amdkfd, so the interface will be initialized only
      if amdkfd was built, either as a module or inside the kernel image.
      
      In the modules case, the symbol_request() will be used (same as old code). In
      the in-image compilation case, a direct call to kgd2kfd_init() will be done.
      For other cases, radeon_kfd_init() will just return false.
      
      This patch is necessary because in case of the following specific
      configuration: kernel 32-bit, no modules support, random kernel base and no
      hibernation, the symbol_request() doesn't work as expected - it doesn't return
      NULL if the symbol doesn't exist - which makes the kernel panic.
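      
      The three cases can be sketched with compile-time selection. This is a
      userspace illustration under assumptions: `SKETCH_KFD_AS_MODULE`,
      `SKETCH_KFD_BUILT_IN` and all `sketch_` functions are hypothetical stand-ins,
      not the kernel's configuration symbols or the driver's actual code.
      
      ```c
      #include <stdbool.h>
      #include <stddef.h>
      
      typedef bool (*sketch_kfd_init_fn)(void);
      
      /* Hypothetical stand-in for kgd2kfd_init(). */
      bool sketch_kgd2kfd_init(void)
      {
          return true;
      }
      
      /* Hypothetical stand-in for symbol_request(); a real lookup may yield NULL. */
      sketch_kfd_init_fn sketch_symbol_request(void)
      {
          return sketch_kgd2kfd_init;
      }
      
      bool sketch_radeon_kfd_init(void)
      {
      #if defined(SKETCH_KFD_AS_MODULE)
          /* Module case: resolve the entry point at run time. */
          sketch_kfd_init_fn init = sketch_symbol_request();
          return init != NULL && init();
      #elif defined(SKETCH_KFD_BUILT_IN)
          /* Built-in case: the symbol is known at link time, call it directly. */
          return sketch_kgd2kfd_init();
      #else
          /* Not built at all: there is nothing to initialize. */
          return false;
      #endif
      }
      ```
      
      With neither macro defined, the function never attempts a symbol lookup,
      which is the case the problematic configuration needs.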
      
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
    • S
      amdkfd: actually allocate longs for the pasid bitmask · 68d0cb49
      Sasha Levin committed
      Commit "amdkfd: use sizeof(long) granularity for the pasid bitmask" calculated
      the number of longs it will need, but ended up allocating that number of
      bytes rather than longs.
      
      Fix that silly error and allocate the amount of data really required.
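      
      The difference between a count of longs and a count of bytes can be sketched
      as follows. All `sketch_` names are hypothetical; the helpers only mirror the
      arithmetic described in the message and are not the driver's code.
      
      ```c
      #include <limits.h>
      #include <stddef.h>
      #include <stdlib.h>
      
      #define SKETCH_BITS_PER_LONG (sizeof(long) * CHAR_BIT)
      
      /* Number of longs needed to hold nbits bits, rounded up. */
      size_t sketch_bits_to_longs(size_t nbits)
      {
          return (nbits + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
      }
      
      /* The mistake: the long count is passed on as if it were a byte count. */
      size_t sketch_bitmap_bytes_wrong(size_t nbits)
      {
          return sketch_bits_to_longs(nbits);
      }
      
      /* The intended size: long count multiplied by the size of a long. */
      size_t sketch_bitmap_bytes_right(size_t nbits)
      {
          return sketch_bits_to_longs(nbits) * sizeof(long);
      }
      
      /* Zeroed bitmap with room for nbits bits; the caller releases it with free(). */
      unsigned long *sketch_alloc_bitmap(size_t nbits)
      {
          return calloc(sketch_bits_to_longs(nbits), sizeof(long));
      }
      ```
      
      For a 64-bit bitmap the wrong size covers fewer than 64 bits, so setting a
      high bit would write past the end of the allocation.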
      
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
  8. 24 Dec, 2014 1 commit
  9. 23 Dec, 2014 1 commit
  10. 22 Dec, 2014 9 commits
  11. 21 Dec, 2014 1 commit
    • O
      drm: Put amdkfd before radeon in drm Makefile · 611a03d7
      Oded Gabbay committed
      When amdkfd and radeon are compiled inside the kernel image (not as modules),
      radeon will load before amdkfd, which causes a bug when radeon probes
      the GPUs.
      
      When the two drivers are compiled as modules, amdkfd is loaded after radeon is
      loaded but before radeon starts probing the GPUs. This is done because radeon
      loads the amdkfd module through symbol_request function.
      
      This patch makes amdkfd load before radeon when they are both compiled inside
      the kernel image, which makes the behavior similar to the case when they are
      modules, and prevents the kernel bug.
      
      Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
  12. 19 Dec, 2014 2 commits