1. 17 Oct, 2013 1 commit
    • C
      drm/i915: Disable all GEM timers and work on unload · 45c5f202
      Chris Wilson authored
      We have two once very similar functions, i915_gpu_idle() and
      i915_gem_idle(). The former is used as the lower level operation to
      flush work on the GPU, whereas the latter is the high level interface to
      flush the GEM bookkeeping in addition to flushing the GPU. As such
      i915_gem_idle() also clears out the request and activity lists and
      cancels the delayed work. This is what we need for unloading the driver;
      unfortunately, we called i915_gpu_idle() instead.
      
      In the process, make sure that we hold no locks when cancelling the
      delayed work and timer: the cancellation is synchronous, so holding the
      mutex would deadlock if the work item is already waiting upon it. This
      requires us to push the mutex down from the caller to i915_gem_idle().
      
      v2: s/i915_gem_idle/i915_gem_suspend/
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70334
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Tested-by: xunx.fang@intel.com
      [danvet: Only set ums.suspended for !kms as discussed earlier. Chris
      noticed that this slipped through.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      45c5f202
  2. 16 Oct, 2013 1 commit
    • B
      drm/i915: Do a fuller init after reset · 3d57e5bd
      Ben Widawsky authored
      I had this lying around from the original PPGTT series, and thought we
      might try to get it in by itself.
      
      It's convenient to just call i915_gem_init_hw at reset because we'll be
      adding new things to that function, and having just one function to call
      instead of reimplementing it in two places is nice.
      
      To accommodate this, we clean up the ringbuffers so that they can be
      brought back up cleanly. Optionally, we could also tear down/reinitialize the
      default context but this was causing some problems on reset which I
      wasn't able to fully debug, and is unnecessary with the previous context
      init/enable split.
      
      This essentially reverts:
      commit 8e88a2bd
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Tue Jun 19 18:40:00 2012 +0200
      
          drm/i915: don't call modeset_init_hw in i915_reset
      
      It seems to work for me on ILK now. Perhaps it's due to:
      commit 8a5c2ae7
      Author: Jesse Barnes <jbarnes@virtuousgeek.org>
      Date:   Thu Mar 28 13:57:19 2013 -0700
      
          drm/i915: fix ILK GPU reset for render
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      3d57e5bd
  3. 10 Oct, 2013 3 commits
  4. 09 Oct, 2013 1 commit
    • D
      drm: kill ->gem_init_object() and friends · 16eb5f43
      David Herrmann authored
      All drivers embed gem-objects into their own buffer objects. There is no
      reason to keep drm_gem_object_alloc(), gem->driver_private and
      ->gem_init_object() anymore.
      
      New drivers are highly encouraged to do the same. There is no benefit in
      allocating gem-objects separately.
      
      Cc: Dave Airlie <airlied@gmail.com>
      Cc: Alex Deucher <alexdeucher@gmail.com>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: Inki Dae <inki.dae@samsung.com>
      Cc: Ben Skeggs <skeggsb@gmail.com>
      Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      16eb5f43
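The embedding pattern this commit recommends can be illustrated with a small userspace sketch. All names here (`struct gem_object`, `struct my_bo`, `to_my_bo()`) are hypothetical stand-ins rather than the real DRM types; the point is only that a single allocation holds both the core object and the driver state, and the driver recovers its own structure from a pointer to the embedded member.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the core GEM object; not the real kernel struct. */
struct gem_object {
    size_t size;
};

/* Hypothetical driver buffer object that embeds the core object by value. */
struct my_bo {
    struct gem_object base;
    unsigned int tiling_mode;   /* driver-private state lives alongside it */
};

/* Recover the containing structure from a pointer to an embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static struct my_bo *to_my_bo(struct gem_object *obj)
{
    return container_of(obj, struct my_bo, base);
}

/* One allocation covers both the core object and the driver state. */
static struct gem_object *my_bo_create(size_t size)
{
    struct my_bo *bo = calloc(1, sizeof(*bo));

    if (!bo)
        return NULL;
    bo->base.size = size;
    bo->tiling_mode = 1;
    return &bo->base;
}

/* Freeing the container frees the embedded core object with it. */
static void my_bo_free(struct gem_object *obj)
{
    free(to_my_bo(obj));
}
```

With this layout there is no separate private pointer to keep in sync: code that receives a `struct gem_object *` calls `to_my_bo()` to reach the driver fields.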
  5. 04 Oct, 2013 2 commits
    • C
      drm/i915: Boost RPS frequency for CPU stalls · b29c19b6
      Chris Wilson authored
      If we encounter a situation where the CPU blocks waiting for results
      from the GPU, give the GPU a kick to boost its frequency.
      
      This should work to reduce user interface stalls and to quickly promote
      mesa to high frequencies - but the cost is that our requested frequency
      stalls high (as we do not idle for long enough before rc6 to start
      reducing frequencies, nor are we aggressive at down clocking an
      underused GPU). However, this should be mitigated by rc6 itself powering
      off the GPU when idle, and that energy use is dependent upon the workload
      of the GPU in addition to its frequency (e.g. the math or sampler
      functions only consume power when used). Still, this is likely to
      adversely affect light workloads.
      
      In particular, this nearly eliminates the highly noticeable wake-up lag
      in animations from idle. For example, expose or workspace transitions.
      (However, given the situation where we fail to downclock, our requested
      frequency is almost always the maximum, except for Baytrail where we
      manually downclock upon idling. This often masks the latency of
      upclocking after being idle, so animations are typically smooth - at the
      cost of increased power consumption.)
      
      Stéphane raised the concern that this will punish good applications and
      reward bad applications - but due to the nature of how mesa performs its
      client throttling, I believe all mesa applications will be roughly
      equally affected. To address this concern, and to prevent applications
      like compositors from permanently boosting the RPS state, we ratelimit the
      frequency of the wait-boosts each client receives.
      
      Unfortunately, this technique is ineffective with Ironlake - which also
      has dynamic render power states and suffers just as dramatically. For
      Ironlake, the thermal/power headroom is shared with the CPU through
      Intelligent Power Sharing and the intel-ips module. This leaves us with
      no GPU boost frequencies available when coming out of idle, and due to
      hardware limitations we cannot change the arbitration between the CPU and
      GPU quickly enough to be effective.
      
      v2: Limit each client to receiving a single boost for each active period.
          Tested by QA to only marginally increase power, and to demonstrably
          increase throughput in games. No latency measurements yet.
      
      v3: Cater for front-buffer rendering with manual throttling.
      
      v4: Tidy up.
      
      v5: Sadly the compositor needs frequent boosts as it may never idle, but
      due to its picking mechanism (using ReadPixels) may require frequent
      waits. Those waits, along with the waits for the vrefresh swap, conspire
      to keep the GPU at low frequencies despite the interactive latency. To
      overcome this we ditch the one-boost-per-active-period and just ratelimit
      the number of wait-boosts each client can receive.
      Reported-and-tested-by: Paul Neumann <paul104x@yahoo.de>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68716
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Kenneth Graunke <kenneth@whitecape.org>
      Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
      Cc: Owen Taylor <otaylor@redhat.com>
      Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com>
      Cc: "Zhuang, Lena" <lena.zhuang@intel.com>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      [danvet: No extern for function prototypes in headers.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      b29c19b6
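The per-client ratelimiting described in v5 of this commit could look roughly like the sketch below. It is not the driver's code: `struct client`, `client_may_boost()` and the 100 ms interval are all assumptions chosen for illustration, and the real patch may track boosts differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical minimum spacing between two boosts for one client. */
#define BOOST_INTERVAL_MS 100u

/* Hypothetical per-client bookkeeping; not the driver's real structure. */
struct client {
    uint64_t last_boost_ms;   /* time of the most recent granted boost */
    bool boosted_once;        /* false until the first boost is granted */
};

/*
 * Decide whether a client that is about to block on the GPU may trigger
 * a frequency boost. A boost is granted at most once per interval, so a
 * client that waits constantly (such as a compositor) cannot pin the
 * frequency at maximum.
 */
static bool client_may_boost(struct client *c, uint64_t now_ms)
{
    if (c->boosted_once && now_ms - c->last_boost_ms < BOOST_INTERVAL_MS)
        return false;

    c->boosted_once = true;
    c->last_boost_ms = now_ms;
    return true;
}
```

Compared with the earlier one-boost-per-active-period rule, a time-based limit still lets a client that never idles receive boosts, just not on every wait.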
    • C
      drm/i915: Fix __wait_seqno to use true infinite timeouts · 094f9a54
      Chris Wilson authored
      When we switched to always using a timeout in conjunction with
      wait_seqno, we lost the ability to detect missed interrupts. Since we
      have had issues with interrupts on a number of generations, and they are
      required to be delivered in a timely fashion for a smooth UX, it is
      important that we do log errors found in the wild and prevent the
      display stalling for upwards of 1s every time the seqno interrupt is
      missed.
      
      Rather than continue to fix up the timeouts to work around the interface
      impedance in wait_event_*(), open code the combination of
      wait_event[_interruptible][_timeout], and use the exposed timer to
      poll for seqno should we detect a lost interrupt.
      
      v2: In order to satisfy the debug requirement of logging missed
      interrupts with the real world requirements of making machines work even
      if interrupts are hosed, we revert to polling after detecting a missed
      interrupt.
      
      v3: Throw in a debugfs interface to simulate broken hw not reporting
      interrupts.
      
      v4: s/EGAIN/EAGAIN/ (Imre)
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Imre Deak <imre.deak@intel.com>
      [danvet: Don't use the struct typedef in new code.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      094f9a54
  6. 01 Oct, 2013 5 commits
  7. 26 Sep, 2013 1 commit
    • D
      drm/i915: Fix up usage of SHRINK_STOP · d3227046
      Daniel Vetter authored
      In
      
      commit 81e49f81
      Author: Glauber Costa <glommer@openvz.org>
      Date:   Wed Aug 28 10:18:13 2013 +1000
      
          i915: bail out earlier when shrinker cannot acquire mutex
      
      SHRINK_STOP was added to tell the core shrinker code to bail out and
      go to the next shrinker since the i915 shrinker couldn't acquire
      required locks. But the SHRINK_STOP return code was added to the
      ->count_objects callback and not the ->scan_objects callback as it
      should have been, resulting in tons of dmesg noise like
      
      shrink_slab: i915_gem_inactive_scan+0x0/0x9c negative objects to delete nr=-xxxxxxxxx
      
      Fix discussed with Dave Chinner.
      
      References: http://www.spinics.net/lists/intel-gfx/msg33597.html
      Reported-by: Knut Petersen <Knut_Petersen@t-online.de>
      Cc: Knut Petersen <Knut_Petersen@t-online.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Glauber Costa <glommer@openvz.org>
      Cc: Glauber Costa <glommer@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      d3227046
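The division of labour between the two shrinker callbacks that this fix restores can be modelled in userspace as below. `struct cache`, `cache_count()` and `cache_scan()` are hypothetical, and the `locked` flag stands in for a failed mutex trylock. Returning 0 from the count callback when the lock is unavailable is an assumption of this sketch; only the `SHRINK_STOP` value mirrors the kernel's definition.

```c
#include <stdbool.h>

/* Same value as the kernel's definition in include/linux/shrinker.h. */
#define SHRINK_STOP (~0UL)

/* Hypothetical cache guarded by a lock that may be held elsewhere. */
struct cache {
    bool locked;                /* models a mutex we failed to trylock */
    unsigned long nr_objects;   /* objects that could be freed */
};

/*
 * Count callback: must return a plain object count. If the lock cannot
 * be taken, report 0 (an assumption of this sketch) rather than
 * SHRINK_STOP, which the caller would misread as a huge count.
 */
static unsigned long cache_count(struct cache *c)
{
    if (c->locked)
        return 0;
    return c->nr_objects;
}

/*
 * Scan callback: this is where SHRINK_STOP belongs. It tells the caller
 * that no progress is possible and it should move on to the next shrinker.
 */
static unsigned long cache_scan(struct cache *c, unsigned long nr_to_scan)
{
    unsigned long freed;

    if (c->locked)
        return SHRINK_STOP;

    freed = nr_to_scan < c->nr_objects ? nr_to_scan : c->nr_objects;
    c->nr_objects -= freed;
    return freed;
}
```

Because `SHRINK_STOP` is all ones, reading it as a signed count gives a negative number, which matches the "negative objects to delete" message quoted in the commit.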
  8. 20 Sep, 2013 4 commits
    • B
      drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF · 040d2baa
      Ben Widawsky authored
      We'd only ever used this define to denote whether or not we have the
      dynamic parity feature (DPF) and never to determine whether or not L3
      exists. Baytrail is a good example of where L3 exists, and not DPF.
      
      This patch provides clarity in the code for future use cases which might
      want to actually query whether or not L3 exists.
      
      v2: Add /* DPF == dynamic parity feature */
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      040d2baa
    • B
      drm/i915: Keep a list of all contexts · a33afea5
      Ben Widawsky authored
      I have implemented this patch before without creating a separate list
      (I'm having trouble finding the links, but the message IDs are:
      <1364942743-6041-2-git-send-email-ben@bwidawsk.net>
      <1365118914-15753-9-git-send-email-ben@bwidawsk.net>)
      
      However, the code is much simpler to just use a list and it makes the
      code from the next patch a lot more pretty.
      
      As you'll see in the next patch, the reason for this is to be able to
      specify when a context needs to get L3 remapping. More details there.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      a33afea5
    • B
      drm/i915: Make l3 remapping use the ring · c3787e2e
      Ben Widawsky authored
      Using LRI for setting the remapping registers allows us to stream l3
      remapping information. This is necessary to handle per context remaps as
      we'll see implemented in an upcoming patch.
      
      Using the ring also means we don't need to frob the DOP clock gating
      bits.
      
      v2: Add comment about lack of worry for concurrent register access
      (Daniel)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      [danvet: Bikeshed the comment a bit by doing a s/XXX/Note - there's
      nothing to fix.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      c3787e2e
    • B
      drm/i915: Add second slice l3 remapping · 35a85ac6
      Ben Widawsky authored
      Certain HSW SKUs have a second bank of L3. This L3 remapping has a
      separate register set, and interrupt from the first "slice". A slice is
      simply a term to define some subset of the GPU's l3 cache. This patch
      implements both the interrupt handler, and ability to communicate with
      userspace about this second slice.
      
      v2:  Remove redundant check about non-existent slice.
      Change warning about interrupts of unknown slices to WARN_ON_ONCE
      Handle the case where we get 2 slice interrupts concurrently, and switch
      the tracking of interrupts to be non-destructive (all Ville)
      Don't enable/mask the second slice parity interrupt for ivb/vlv (even
      though all docs I can find claim it's rsvd) (Ville + Bryan)
      Keep BYT excluded from L3 parity
      
      v3: Fix the slice = ffs to be decremented by one (found by Ville). When
      I initially did my testing on the series, I was using 1-based slice
      counting, so this code was correct. Not sure why my simpler tests that
      I've been running since then didn't pick it up sooner.
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      35a85ac6
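The off-by-one fixed in v3 of this commit comes from `ffs()` being 1-based: it returns 1 for bit 0 and 0 when no bit is set. A minimal userspace sketch of turning a pending-interrupt bitmask into zero-based slice indices follows; the helper names and the bitmask layout are assumptions for illustration only.

```c
#include <strings.h>   /* ffs(), POSIX */

/*
 * Return the zero-based index of the lowest pending slice, or -1 when
 * nothing is pending. ffs() itself is 1-based, hence the decrement.
 */
static int first_pending_slice(unsigned int pending)
{
    return ffs((int)pending) - 1;
}

/*
 * Visit every pending slice in ascending order, clearing each bit once
 * it has been handled. Returns how many slice indices were stored.
 */
static int collect_pending_slices(unsigned int pending, int *slices, int max)
{
    int n = 0;

    while (pending && n < max) {
        int slice = first_pending_slice(pending);

        slices[n++] = slice;
        pending &= ~(1u << slice);   /* mark this slice as handled */
    }
    return n;
}
```

Working on a local copy of the mask and clearing one bit at a time also covers the case, mentioned in v2, where both slices raise an interrupt concurrently.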
  9. 13 Sep, 2013 2 commits
  10. 11 Sep, 2013 2 commits
    • G
      i915: bail out earlier when shrinker cannot acquire mutex · 81e49f81
      Glauber Costa authored
      The main shrinker driver will keep trying for a while to free objects if
      the returned value from the shrink scan procedure is 0.  That means "no
      objects now", but a retry could very well succeed.
      
      But what we should say here is a different thing: that it is impossible to
      shrink, and we would better bail out soon.  We find this behavior more
      appropriate for the case where the lock cannot be taken.  Especially given
      the hammer behavior of the i915: if another thread is already shrinking,
      we are likely not to be able to shrink anything anyway when we finally
      acquire the mutex.
      Signed-off-by: Glauber Costa <glommer@openvz.org>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      81e49f81
    • D
      drivers: convert shrinkers to new count/scan API · 7dc19d5a
      Dave Chinner authored
      Convert the driver shrinkers to the new API.  Most changes are compile
      tested only because I either don't have the hardware or it's staging
      stuff.
      
      FWIW, the md and android code is pretty good, but the rest of it makes me
      want to claw my eyes out.  The amount of broken code I just encountered is
      mind boggling.  I've added comments explaining what is broken, but I fear
      that some of the code would be best dealt with by being dragged behind the
      bike shed, buried in mud up to its neck and then run over repeatedly
      with a blunt lawn mower.
      
      Special mention goes to the zcache/zcache2 drivers.  They can't co-exist
      in the build at the same time, they are under different menu options in
      menuconfig, they only show up when you've got the right set of mm
      subsystem options configured and so even compile testing is an exercise in
      pulling teeth.  And that doesn't even take into account the horrible,
      broken code...
      
      [glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache]
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Glauber Costa <glommer@openvz.org>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7dc19d5a
  11. 10 Sep, 2013 1 commit
  12. 06 Sep, 2013 1 commit
    • M
      drm/i915: ban badly behaving contexts · be62acb4
      Mika Kuoppala authored
      Now that we have a mechanism in place to track which context
      was guilty of hanging the gpu, it is possible to punish
      bad behaviour.
      
      If a context has recently submitted a faulty batchbuffer guilty of
      a gpu hang and submits another batch which hangs the gpu in quick
      succession, ban it permanently. If a ctx is banned, no more
      batchbuffers will be queued for execution.
      
      There is no need for global wedge machinery anymore and
      it would be unwise to wedge the whole gpu if we have multiple
      hanging batches queued for execution. Instead just ban
      the guilty ones and carry on.
      
      v2: Store guilty ban status bool in gpu_error instead of pointers
          that might become dangling before hang is declared.
      
      v3: Use return value for banned status instead of stashing state
          into gpu_error (Chris Wilson)
      
      v4: - rebase on top of fixed hang stats api
          - add define for ban period
          - rename commit and improve commit msg
      
      v5: - rely on context banning instead of wedging the gpu
          - beautification and fix for ban calculation (Chris)
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      be62acb4
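The ban rule described in this commit (two hangs from the same context in quick succession) can be sketched in userspace as follows. The structure, the function names and the 12 second period are hypothetical; the commit only says that a define for the ban period was added, not what its value is.

```c
#include <stdbool.h>

/* Hypothetical ban window in seconds; the real define may differ. */
#define BAN_PERIOD_SECS 12

/* Hypothetical per-context hang statistics. */
struct ctx_hang_stats {
    long guilty_ts;   /* time of the last hang blamed on this context; 0 = never */
    bool banned;      /* once set, stays set */
};

/*
 * Record that this context was found guilty of a hang at time 'now'.
 * A second guilty verdict inside the ban window bans the context
 * permanently. Returns the resulting banned state.
 */
static bool ctx_mark_guilty(struct ctx_hang_stats *hs, long now)
{
    if (hs->guilty_ts != 0 && now - hs->guilty_ts < BAN_PERIOD_SECS)
        hs->banned = true;

    hs->guilty_ts = now;
    return hs->banned;
}

/* Submission path check: banned contexts may not queue more batches. */
static bool ctx_may_submit(const struct ctx_hang_stats *hs)
{
    return !hs->banned;
}
```

Returning the banned state directly from the marking function follows the spirit of v3, which preferred a return value over stashing state in a global error structure.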
  13. 05 Sep, 2013 3 commits
    • C
      drm/i915: Hold an object reference whilst we shrink it · 57094f82
      Chris Wilson authored
      Whilst running the shrinker, we need to hold a reference as we unbind
      the objects, or else we may end up waiting for and retiring requests,
      which in turn may result in this object being freed.
      
      This is very similar to the eviction code which also has to be very
      careful to keep a reference to its objects as it retires and unbinds
      them.
      
      Another similarity, that Ben pointed out, is that as we may call
      retire-requests, the unbound_list is outside of our control. We must
      only process a single element of that list at a time, that is we can not
      rely on the "safe" next pointer being valid after a call to
      i915_vma_unbind().
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
        IP: [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915]
        PGD 758d3067 PUD ac0d6067 PMD 0
        Oops: 0000 [#1] SMP
        Modules linked in: dm_mod snd_hda_codec_realtek iTCO_wdt iTCO_vendor_support pcspkr snd_hda_intel i2c_i801 snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd lpc_ich mfd_core soundcore battery ac option usb_wwan usbserial uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev i915 video button drm_kms_helper drm acpi_cpufreq mperf freq_table
        CPU: 1 PID: 16835 Comm: fbo-maxsize Not tainted 3.11.0-rc7_nightlytop_8fdad4_20130902_+ #7977
        task: ffff8800712106d0 ti: ffff880028e4a000 task.ti: ffff880028e4a000
        RIP: 0010:[<ffffffffa0082892>]  [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915]
        RSP: 0018:ffff880028e4b9e8  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff880145734000 RCX: ffff880145735328
        RDX: ffff8801457353fc RSI: 0000000000000000 RDI: ffff88007597cc00
        RBP: ffff88007597cc00 R08: 0000000000000001 R09: ffff88014f257f00
        R10: ffffea0001d65f00 R11: 0000000000bba60b R12: ffff880149e5b000
        R13: ffff880145734001 R14: ffff88007597ccc8 R15: ffff88007597cc00
        FS:  00007ff5bc919740(0000) GS:ffff88014f240000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000008 CR3: 0000000028f4c000 CR4: 00000000001407e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Stack:
         0000000000000000 ffff88007597cc00 ffff8801440d6840 0000000000000000
         ffff880145734000 ffffffffa007c854 0000000000000010 ffff88007597c900
         0000000000018000 00000000004a1201 ffff88007597cc60 ffffffffa007d183
        Call Trace:
         [<ffffffffa007c854>] ? i915_vma_unbind+0xe2/0x1d1 [i915]
         [<ffffffffa007d183>] ? __i915_gem_shrink+0xf1/0x162 [i915]
         [<ffffffffa007d2ee>] ? i915_gem_object_get_pages_gtt+0xfa/0x303 [i915]
         [<ffffffffa00795f4>] ? i915_gem_object_get_pages+0x54/0x89 [i915]
         [<ffffffffa007cbda>] ? i915_gem_object_pin+0x238/0x5ce [i915]
         [<ffffffff812cba5f>] ? __sg_page_iter_next+0x2b/0x58
         [<ffffffffa0082056>] ? gen6_ppgtt_insert_entries+0xf2/0x114 [i915]
         [<ffffffffa007fe4b>] ? i915_gem_execbuffer_reserve_vma.isra.13+0x79/0x18d [i915]
         [<ffffffffa008017c>] ? i915_gem_execbuffer_reserve+0x21d/0x347 [i915]
         [<ffffffffa0080bfb>] ? i915_gem_do_execbuffer.isra.17+0x4f3/0xe61 [i915]
         [<ffffffffa00795f4>] ? i915_gem_object_get_pages+0x54/0x89 [i915]
         [<ffffffffa007e405>] ? i915_gem_pwrite_ioctl+0x743/0x7a5 [i915]
         [<ffffffffa0081a46>] ? i915_gem_execbuffer2+0x15e/0x1e4 [i915]
         [<ffffffffa000e20d>] ? drm_ioctl+0x2a5/0x3c4 [drm]
         [<ffffffffa00818e8>] ? i915_gem_execbuffer+0x37f/0x37f [i915]
         [<ffffffff816f64c0>] ? __do_page_fault+0x3ab/0x449
         [<ffffffff810be3da>] ? do_mmap_pgoff+0x2b2/0x341
         [<ffffffff810e49be>] ? vfs_ioctl+0x1e/0x31
         [<ffffffff810e5194>] ? do_vfs_ioctl+0x3ad/0x3ef
         [<ffffffff810e5224>] ? SyS_ioctl+0x4e/0x7e
         [<ffffffff816f88d2>] ? system_call_fastpath+0x16/0x1b
        Code: 52 0c a0 48 c7 c6 22 30 0d a0 31 c0 e8 ef 00 f9 ff bf c6 a7 00 00 e8 90 5d 24 e1 f6 85 13 01 00 00 10 75 44 48 8b 85 18 01 00 00 <8b> 50 08 48 8b 30 49 8b 84 24 88 02 00 00 48 89 c7 48 81 c7 98
        RIP  [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915]
        RSP <ffff880028e4b9e8>
        CR2: 0000000000000008
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: stable@vger.kernel.org
      [danvet: Bikeshed the comments a bit as discussed with Chris.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      57094f82
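The reason the shrinker must hold its own reference can be shown with a toy refcount model. Everything here is hypothetical: `struct object`, its `freed` flag (set instead of actually freeing, so the state can be inspected) and `object_unbind()`, which models an unbind that retires a request and thereby drops the reference that request held.

```c
#include <stdbool.h>

/* Hypothetical refcounted object; 'freed' replaces a real free(). */
struct object {
    int refcount;
    bool bound;
    bool freed;
};

static void object_get(struct object *o)
{
    o->refcount++;
}

static void object_put(struct object *o)
{
    if (--o->refcount == 0)
        o->freed = true;
}

/*
 * Unbinding may wait for and retire requests. Retiring drops the
 * reference the request held, which can be the last one.
 */
static void object_unbind(struct object *o)
{
    o->bound = false;
    object_put(o);   /* models the retired request's reference */
}

/*
 * Shrink a single object. Taking our own reference first guarantees the
 * object is still valid after object_unbind() returns. Returns true if
 * the object was still alive at that point.
 */
static bool shrink_one(struct object *o)
{
    bool alive_after_unbind;

    object_get(o);
    object_unbind(o);
    alive_after_unbind = !o->freed;   /* safe: we still hold a reference */
    object_put(o);                    /* may now release the object */
    return alive_after_unbind;
}
```

Without the `object_get()`/`object_put()` pair, the object in this model would already be marked freed when `object_unbind()` returns, which corresponds to the use-after-free behind the oops quoted above.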
    • C
      drm/i915; Preallocate the lazy request · 3c0e234c
      Chris Wilson authored
      It is possible for us to be forced to perform an allocation for the lazy
      request whilst running the shrinker. This allocation may fail, leaving
      us unable to reclaim any memory leading to premature OOM. A neat
      solution to the problem is to preallocate the request at the same time
      as acquiring the seqno for the ring transaction. This means that we can
      report ENOMEM prior to touching the rings.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      3c0e234c
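The idea of preallocating the request when the seqno is acquired can be sketched as a two-step interface in userspace C. The names (`struct ring`, `ring_begin()`, `ring_add_request()`) are hypothetical and greatly simplified; the sketch only shows that the single failure point sits before anything is written to the ring.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical request; the real structure carries far more state. */
struct request {
    unsigned int seqno;
};

/* Hypothetical ring with one slot for a request reserved in advance. */
struct ring {
    unsigned int next_seqno;
    struct request *preallocated;   /* reserved before the ring is touched */
    unsigned int emitted;           /* number of requests emitted so far */
};

/*
 * Step 1: acquire the seqno and allocate the request together. This is
 * the only place that can fail with -ENOMEM, and it runs before any
 * commands are written.
 */
static int ring_begin(struct ring *ring)
{
    if (ring->preallocated)
        return 0;   /* already reserved for this transaction */

    ring->preallocated = malloc(sizeof(*ring->preallocated));
    if (!ring->preallocated)
        return -ENOMEM;

    ring->preallocated->seqno = ++ring->next_seqno;
    return 0;
}

/*
 * Step 2: emit the request. No allocation happens here, so this step
 * cannot fail for lack of memory, even when called from a shrinker.
 * Ownership of the request passes to the caller.
 */
static struct request *ring_add_request(struct ring *ring)
{
    struct request *rq = ring->preallocated;

    ring->preallocated = NULL;
    if (rq)
        ring->emitted++;
    return rq;
}
```

A caller would run `ring_begin()` first, bail out on `-ENOMEM`, and only then write commands and call `ring_add_request()`.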
    • C
      drm/i915: Rename ring->outstanding_lazy_request · 1823521d
      Chris Wilson authored
      Prior to preallocating a request for lazy emission, rename the existing
      field to make way (and differentiate the seqno from the request struct).
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      1823521d
  14. 04 Sep, 2013 10 commits
    • C
      drm/i915: Rearrange the comments in i915_add_request() · 9a7e0c2a
      Chris Wilson authored
      The comments were a little out-of-sequence with the code, forcing the
      reader to jump around whilst reading. Whilst moving the comments around,
      add one to explain the context reference.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      9a7e0c2a
    • C
      drm/i915: Do not add an interrupt for a context switch · c0321e2c
      Chris Wilson authored
      We use the request to ensure we hold a reference to the context for the
      duration that it remains in use by the ring. Each request only holds a
      reference to the current context, hence we emit a request after
      switching contexts with the final reference to the old context. However,
      the extra interrupt caused by that request is not useful (no timing
      critical function will wait for the context object), instead the overhead
      of servicing the IRQ shows up in some (lightweight) benchmarks. In order
      to keep the useful property of using the request to manage the context
      lifetime, we want to add a dummy request that is associated with the
      interrupt from the subsequent real request following the batch.
      
      The extra interrupt was added as a side-effect of using
      i915_add_request() in
      
      commit 112522f6
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Thu May 2 16:48:07 2013 +0300
      
          drm/i915: put context upon switching
      
      v2: Daniel convinced me that the request here was solely for context
      lifetime tracking and that we have the active ref to keep the object
      alive whilst the MI_SET_CONTEXT. So the only concern then is which
      context should get the blame for MI_SET_CONTEXT failing. The old scheme
      added a request for the old context so that any hang up to and including
      the switch away would mark the old context as guilty. Now any hang here
      implicates the new context. However since we have already gone through a
      complete flush with the last context in its last request, and all that
      lies in no-man's-land is an invalidate flush and the MI_SET_CONTEXT, we
      should be safe in not unduly placing blame on the new context.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      c0321e2c
    • D
      drm/i915: Fix list corruption in vma_unbind · 0ff501cb
      Daniel Vetter authored
      The saga around the breadcrumb vmas used by execbuf continues ...
      
      This time around we've managed to unconditionally move the object to
      the unbound list on the last vma unbind even though it might never
      have been on either the bound or unbound list. Hilarity ensued.
      
      Chris Wilson tracked this one down but compared to his patches I've
      simply opted to completely separate the unbound case for not-yet bound
      vmas. Otherwise we imo end up with semantically hard to parse checks
      around the list_move_tail(global_list, ...).
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68462
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      0ff501cb
    • R
      drm/i915: Report enabled slices on Haswell GT3 · 9435373e
      Rodrigo Vivi authored
      Batchbuffers constructed by userspace can conditionalise their URB
      allocations through the use of the MI_SET_PREDICATE command. This
      command can read the MI_PREDICATE_RESULT_2 register to see how many
      slices are enabled on GT3, and by virtue of the result, scale their
      memory allocations to fit enabled memory.
      
      Of course, this only works if the kernel sets the appropriate bit in the
      register first.
      
      v2: Better commit subject and message by Chris Wilson.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Credits-to: Yejun Guo <yejun.guo@intel.com>
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      9435373e
    • D
      drm/i915: More vma fixups around unbind/destroy · b93dab6e
      Daniel Vetter authored
      The important bugfix here is that we must not unlink the vma when
      we keep it around as a placeholder for the execbuf code. Otherwise we
      won't find it again when execbuf gets interrupted and restarted, and
      we create a second vma. And since the code as-is isn't fit yet to deal
      with more than one vma, hilarity ensues.
      
      Specifically the dma map/unmap of the sg table isn't adjusted for
      multiple vmas yet and will blow up like this:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915]
      PGD 56bb5067 PUD ad3dd067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: tcp_lp ppdev parport_pc lp parport ipv6 dm_mod dcdbas snd_hda_codec_hdmi pcspkr snd_hda_codec_realtek serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table
      CPU: 1 PID: 16650 Comm: fbo-maxsize Not tainted 3.11.0-rc4_nightlytop_d93f59_debug_20130814_+ #6957
      Hardware name: Dell Inc. OptiPlex 9010/03JR84, BIOS A01 05/04/2012
      task: ffff8800563b3f00 ti: ffff88004bdf4000 task.ti: ffff88004bdf4000
      RIP: 0010:[<ffffffffa008fb37>]  [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915]
      RSP: 0018:ffff88004bdf5958  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff8801135e0000 RCX: ffff8800ad3bf8e0
      RDX: ffff8800ad3bf8e0 RSI: 0000000000000000 RDI: ffff8801007ee780
      RBP: ffff88004bdf5978 R08: ffff8800ad3bf8e0 R09: 0000000000000000
      R10: ffffffff86ca1810 R11: ffff880036a17101 R12: ffff8801007ee780
      R13: 0000000000018001 R14: ffff880118c4e000 R15: ffff8801007ee780
      FS:  00007f401a0ce740(0000) GS:ffff88011e280000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 000000005635c000 CR4: 00000000001407e0
      Stack:
       ffff8801007ee780 ffff88005c253180 0000000000018000 ffff8801135e0000
       ffff88004bdf59a8 ffffffffa0088e55 0000000000000011 ffff8801007eec00
       0000000000018000 ffff880036a17101 ffff88004bdf5a08 ffffffffa0089026
      Call Trace:
       [<ffffffffa0088e55>] i915_vma_unbind+0xdf/0x1ab [i915]
       [<ffffffffa0089026>] __i915_gem_shrink+0x105/0x177 [i915]
       [<ffffffffa0089452>] i915_gem_object_get_pages_gtt+0x108/0x309 [i915]
       [<ffffffffa0085ba9>] i915_gem_object_get_pages+0x61/0x90 [i915]
       [<ffffffffa008f22b>] ? gen6_ppgtt_insert_entries+0x103/0x125 [i915]
       [<ffffffffa008a113>] i915_gem_object_pin+0x1fa/0x5df [i915]
       [<ffffffffa008cdfe>] i915_gem_execbuffer_reserve_object.isra.6+0x8d/0x1bc [i915]
       [<ffffffffa008d156>] i915_gem_execbuffer_reserve+0x229/0x367 [i915]
       [<ffffffffa008dbf6>] i915_gem_do_execbuffer.isra.12+0x4dc/0xf3a [i915]
       [<ffffffff810fc823>] ? might_fault+0x40/0x90
       [<ffffffffa008eb89>] i915_gem_execbuffer2+0x187/0x222 [i915]
       [<ffffffffa000971c>] drm_ioctl+0x308/0x442 [drm]
       [<ffffffffa008ea02>] ? i915_gem_execbuffer+0x3ae/0x3ae [i915]
       [<ffffffff817db156>] ? __do_page_fault+0x3dd/0x481
       [<ffffffff8112fdba>] vfs_ioctl+0x26/0x39
       [<ffffffff811306a2>] do_vfs_ioctl+0x40e/0x451
       [<ffffffff817deda7>] ? sysret_check+0x1b/0x56
       [<ffffffff8113073c>] SyS_ioctl+0x57/0x87
       [<ffffffff8135bbfe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff817ded82>] system_call_fastpath+0x16/0x1b
      Code: 48 c7 c6 84 30 0e a0 31 c0 e8 d0 e9 f7 ff bf c6 a7 00 00 e8 07 af 2c e1 41 f6 84 24 03 01 00 00 10 75 44 49 8b 84 24 08 01 00 00 <8b> 50 08 48 8b 30 49 8b 86 b0 04 00 00 48 89 c7 48 81 c7 98 00
      RIP  [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915]
       RSP <ffff88004bdf5958>
      CR2: 0000000000000008
      
      As a consequence we need to change the "only one vma for now" check in
      vma_unbind - since vma_destroy isn't always called the obj->vma_list
      might not be empty. Instead check that the vma list is singular at the
      beginning of vma_unbind. This is also more symmetric with bind_to_vm.
      
      This fixes the igt/gem_evict_everything|alignment testcases.
      
      v2:
      - Add a paranoid WARN to mark_free in the eviction code to make sure
        we never try to evict a vma used by the execbuf code right now.
      - Move the check for a temporary execbuf vma into vma_destroy -
        otherwise the failure path cleanup in bind_to_vm will blow up.
      
      Our first attempt at fixing this was
      
      commit 1be81a2f2cfd8789a627401d470423358fba2d76
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Tue Aug 20 12:56:40 2013 +0100
      
          drm/i915: Don't destroy the vma placeholder during execbuffer reservation
      
      Squash with this when merging!
      
      v3: Improvements suggested in Chris' review:
      - Move the WARN_ON in vma_destroy that checks for vmas with a drm_mm
        allocation before the early return.
      - Bail out if we hit the WARN in mark_free to hopefully make the
        kernel survive for long enough to capture it.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171
      Tested-by: lu hua <huax.lu@intel.com> (v2)
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      b93dab6e
    • C
      drm/i915: Don't destroy the vma placeholder during execbuffer reservation · aaa05667
      Chris Wilson committed
      The execbuffer handle and exec_link were moved from the object into the
      vma. As the vma may be unbound and destroyed whilst attempting to
      reserve the execbuffer objects (either through a forced unbind to fix up
      a misalignment or through an evict-everything call) we need to prevent
      the free of the i915_vma itself. Otherwise not only is the list of
      objects to reserve corrupt, but we continue to reference stale vma
      entries.
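
      A minimal userspace sketch of the idea, not the i915 code: destroying
      a vma becomes a no-op while the execbuffer reservation still links to
      it. The struct layout, the on_exec_list flag (a stand-in for a
      non-empty exec list link) and freed_count are assumptions made for
      illustration only.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vma; not the real struct i915_vma. */
struct vma {
    bool on_exec_list;  /* still referenced by an execbuffer reservation */
    bool bound;         /* currently bound into an address space */
};

static int freed_count; /* number of vmas actually freed (illustration only) */

/* Free the vma unless execbuf still uses it as a placeholder.
 * Returns true if the vma was really freed. */
static bool vma_destroy(struct vma *v)
{
    if (v->on_exec_list)
        return false;   /* keep the placeholder alive for execbuf */

    free(v);
    freed_count++;
    return true;
}

/* Unbinding drops the binding and then attempts to destroy the vma.
 * The attempt is harmless while the vma sits on the exec list. */
static void vma_unbind(struct vma *v)
{
    v->bound = false;
    (void)vma_destroy(v);
}
```

      With this shape a forced unbind during reservation leaves the entry
      valid, and the vma is only freed once the reservation has dropped it.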
      
      Fixes kernel crash with i-g-t/gem_evict_everything
      
      This regression has been introduced in
      
      commit 04038a515d6eda6dd0857c0ade0b3950d372f4c0
      Author:     Ben Widawsky <ben@bwidawsk.net>
      AuthorDate: Wed Aug 14 11:38:36 2013 +0200
      
          drm/i915: Convert execbuf code to use vmas
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      References: http://www.spinics.net/lists/intel-gfx/msg32038.html
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      aaa05667
    • D
      drm/i915: inline vma_create into lookup_or_create_vma · e656a6cb
      Daniel Vetter committed
      In the execbuf code we don't clean up any vmas which ended up not
      getting bound for code simplicity. To make sure that we don't end up
      creating multiple vmas for the same vm, kill the somewhat dangerous
      vma_create function and inline it into lookup_or_create.
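
      The lookup-or-create pattern can be sketched in userspace C as
      follows. This is an illustration under assumptions, not the driver
      code: the structs, raw_vma_create() and count_vmas() are hypothetical
      stand-ins, and a singly linked list replaces the kernel list helpers.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-ins; not the i915 structures. */
struct vm { int id; };

struct vma {
    struct vm *vm;
    struct vma *next;
};

struct object {
    struct vma *vma_list;   /* every vma created for this object */
};

/* Return the object's vma for this vm, or NULL if none exists yet. */
static struct vma *lookup_vma(struct object *obj, struct vm *vm)
{
    for (struct vma *v = obj->vma_list; v; v = v->next)
        if (v->vm == vm)
            return v;
    return NULL;
}

/* Unconditional creation. Calling this directly could create a second
 * vma for the same vm, so only lookup_or_create_vma() should use it. */
static struct vma *raw_vma_create(struct object *obj, struct vm *vm)
{
    struct vma *v = calloc(1, sizeof(*v));

    if (!v)
        return NULL;
    v->vm = vm;
    v->next = obj->vma_list;
    obj->vma_list = v;
    return v;
}

/* The only entry point callers use: reuse an existing vma if present. */
static struct vma *lookup_or_create_vma(struct object *obj, struct vm *vm)
{
    struct vma *v = lookup_vma(obj, vm);

    return v ? v : raw_vma_create(obj, vm);
}

/* Helper for inspection: number of vmas linked to the object. */
static int count_vmas(const struct object *obj)
{
    int n = 0;

    for (const struct vma *v = obj->vma_list; v; v = v->next)
        n++;
    return n;
}
```

      Funnelling every caller through one function means a vma left behind
      by an earlier, interrupted execbuf is found again instead of being
      duplicated.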
      
      This is just a safety measure to prevent surprises in the future.
      
      Also update the somewhat confused comment in the execbuf code and
      clarify what kind of magic is going on with a new one.
      
      v2: Keep the function separate as requested by Chris. But give it a __
      prefix for paranoia and move it tighter together with the other vma
      stuff.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      e656a6cb
    • B
      drm/i915: Convert execbuf code to use vmas · 27173f1f
      Ben Widawsky committed
      In order to transition more of our code over to using a VMA instead of
      an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up
      until now, we've only had a VMA when actually binding an object.
      
      The previous patch helped handle the distinction on bound vs. unbound.
      This patch will help us catch leaks, and other issues before we actually
      shuffle a bunch of stuff around.
      
      This attempts to convert all the execbuf code to speak in vmas. Since
      the execbuf code is very self contained it was a nice isolated
      conversion.
      
      The meat of the code is about turning eb_objects into eb_vma, and then
      wiring up the rest of the code to use vmas instead of obj, vm pairs.
      
      Unfortunately, to do this, we must move the exec_list link from the obj
      structure. This list is reused in the eviction code, so we must also
      modify the eviction code to make this work.
      
      WARNING: This patch makes an already hotly profiled path slower. The cost is
      unavoidable. In reply to this mail, I will attach the extra data.
      
      v2: Release table lock early, and do a 2 phase vma lookup to avoid
      having to use a GFP_ATOMIC. (Chris)
      
      v3: s/obj_exec_list/obj_exec_link/
      Updates to address
      commit 6d2b8885
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Wed Aug 7 18:30:54 2013 +0100
      
          drm/i915: List objects allocated from stolen memory in debugfs
      
      v4: Use obj = vma->obj for neatness in some places (Chris)
      need_reloc_mappable() should return false if ppgtt (Chris)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      [danvet: Split out prep patches. Also remove a FIXME comment which is
      now taken care of.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      27173f1f
    • D
      drm/i915: Don't call sg_free_table() if sg_alloc_table() fails · d2933a5b
      Damien Lespiau committed
      One needs to call __sg_free_table() if __sg_alloc_table() fails, but
      sg_alloc_table() does that for us already.
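
      The ownership rule can be modelled in userspace C. This is a sketch
      under assumptions, not the scatterlist implementation: low_alloc()
      and low_free() play the role of the double-underscore helpers,
      table_alloc() plays the role of the wrapper, and fail_at and
      live_chunks are test hooks invented for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_CHUNKS 4

/* Hypothetical table built from several separately allocated chunks. */
struct table {
    void *chunk[MAX_CHUNKS];
    int nchunks;
};

static int live_chunks;   /* allocations currently outstanding */
static int fail_at = -1;  /* test hook: chunk index whose allocation fails */

/* Low-level free: releases whatever chunks the table currently holds. */
static void low_free(struct table *t)
{
    for (int i = 0; i < t->nchunks; i++) {
        free(t->chunk[i]);
        live_chunks--;
    }
    memset(t, 0, sizeof(*t));
}

/* Low-level alloc: on failure the table is left partially populated,
 * so a direct caller has to call low_free() itself. */
static int low_alloc(struct table *t, int n)
{
    memset(t, 0, sizeof(*t));
    if (n > MAX_CHUNKS)
        return -1;

    for (int i = 0; i < n; i++) {
        void *p = (i == fail_at) ? NULL : malloc(64);

        if (!p)
            return -1;
        t->chunk[t->nchunks++] = p;
        live_chunks++;
    }
    return 0;
}

/* Wrapper: cleans up after a failed low_alloc(), so its callers must
 * not free the table again on the error path. */
static int table_alloc(struct table *t, int n)
{
    int ret = low_alloc(t, n);

    if (ret)
        low_free(t);
    return ret;
}
```

      A caller of the wrapper therefore just propagates the error; freeing
      again on that path is redundant at best.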
      Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      d2933a5b
    • J
      i915_gem: Convert kmem_cache_alloc(...GFP_ZERO) to kmem_cache_zalloc · fac15c10
      Joe Perches committed
      The helper exists, might as well use it instead of __GFP_ZERO.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      fac15c10
  15. 23 Aug, 2013 1 commit
    • P
      drm/i915: allow package C8+ states on Haswell (disabled) · c67a470b
      Paulo Zanoni committed
      This patch allows PC8+ states on Haswell. These states can only be
      reached when all the display outputs are disabled, and they allow some
      more power savings.
      
      The fact that the graphics device is allowing PC8+ doesn't mean that
      the machine will actually enter PC8+: all the other devices also need
      to allow PC8+.
      
      For now this option is disabled by default. You need i915.allow_pc8=1
      if you want it.
      
      This patch adds a big comment inside i915_drv.h explaining how it
      works and how it tracks things. Read it.
      
      v2: (this is not really v2, many previous versions were already sent,
           but they had different names)
          - Use the new functions to enable/disable GTIMR and GEN6_PMIMR
          - Rename almost all variables and functions to names suggested by
            Chris
          - More WARNs on the IRQ handling code
          - Also disable PC8 when there's GPU work to do (thanks to Ben for
            the help on this), so apps can run faster
          - Enable PC8 on a delayed work function that is delayed for 5
            seconds. This makes sure we only enable PC8+ if we're really
            idle
          - Make sure we're not in PC8+ when suspending
      v3: - WARN if IRQs are disabled on __wait_seqno
          - Replace some DRM_ERRORs with WARNs
          - Fix calls to restore GT and PM interrupts
          - Use intel_mark_busy instead of intel_ring_advance to disable PC8
      v4: - Use the force_wake, Luke!
      v5: - Remove the "IIR is not zero" WARNs
          - Move the force_wake chunk to its own patch
          - Only restore what's missing from RC6, not everything
      Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      c67a470b
  16. 22 Aug, 2013 2 commits