1. 05 1月, 2011 1 次提交
  2. 24 11月, 2010 1 次提交
  3. 22 11月, 2010 1 次提交
    • M
      drm/vblank: Add support for precise vblank timestamping. · 27641c3f
      Mario Kleiner 提交于
      The DRI2 swap & sync implementation needs precise
      vblank counts and precise timestamps corresponding
      to those vblank counts. For conformance to the OpenML
      OML_sync_control extension specification the DRM
      timestamp associated with a vblank count should
      correspond to the start of video scanout of the first
      scanline of the video frame following the vblank
      interval for that vblank count.
      
      Therefore we need to carry around precise timestamps
      for vblanks. Currently the DRM and KMS drivers generate
      timestamps ad-hoc via do_gettimeofday() in some
      places. The resulting timestamps are sometimes not
      very precise due to interrupt handling delays, they
      don't conform to OML_sync_control and some are wrong,
      as they aren't taken synchronized to the vblank.
      
      This patch implements support inside the drm core
      for precise and robust timestamping. It consists
      of the following interrelated pieces.
      
      1. Vblank timestamp caching:
      
      A per-crtc ringbuffer stores the most recent vblank
      timestamps corresponding to vblank counts.
      
      The ringbuffer can be read out lock-free via the
      accessor function:
      
      struct timeval timestamp;
      vblankcount = drm_vblank_count_and_time(dev, crtcid, &timestamp).
      
      The function returns the current vblank count and
      the corresponding timestamp for start of video
      scanout following the vblank interval. It can be
      used anywhere between enclosing drm_vblank_get(dev, crtcid)
      and drm_vblank_put(dev,crtcid) statements. It is used
      inside the drmWaitVblank ioctl and in the vblank event
      queueing and handling. It should be used by kms drivers for
      timestamping of bufferswap completion.
      
      The timestamp ringbuffer is reinitialized each time
      vblank irq's get reenabled in drm_vblank_get()/
      drm_update_vblank_count(). It is invalidated when
      vblank irq's get disabled.
      
      The ringbuffer is updated inside drm_handle_vblank()
      at each vblank irq.
      
      2. Calculation of precise vblank timestamps:
      
      drm_get_last_vbltimestamp() is used to compute the
      timestamp for the end of the most recent vblank (if
      inside active scanout), or the expected end of the
      current vblank interval (if called inside a vblank
      interval). The function calls into a new optional kms
      driver entry point dev->driver->get_vblank_timestamp()
      which is supposed to provide the precise timestamp.
      If a kms driver doesn't implement the entry point or
      if the call fails, a simple do_gettimeofday() timestamp
      is returned as crude approximation of the true vblank time.
      
      A new drm module parameter drm.timestamp_precision_usec
      allows to disable high precision timestamps (if set to
      zero) or to specify the maximum acceptable error in
      the timestamps in microseconds.
      
      Kms drivers could implement their get_vblank_timestamp()
      function in a gpu specific way, as long as returned
      timestamps conform to OML_sync_control, e.g., by use
      of gpu specific hardware timestamps.
      
      Optionally, kms drivers can simply wrap and use the new
      utility function drm_calc_vbltimestamp_from_scanoutpos().
      This function calls a new optional kms driver function
      dev->driver->get_scanout_position() which returns the
      current horizontal and vertical video scanout position
      of the crtc. The scanout position together with the
      drm_display_timing of the current video mode is used
      to calculate elapsed time relative to start of active scanout
      for the current video frame. This elapsed time is subtracted
      from the current do_gettimeofday() time to get the timestamp
      corresponding to start of video scanout. Currently
      non-interlaced, non-doublescan video modes, with or
      without panel scaling are handled correctly. Interlaced/
      doublescan modes are tbd in a future patch.
      
      3. Filtering of redundant vblank irq's and removal of
      some race-conditions in the vblank irq enable/disable path:
      
      Some gpu's (e.g., Radeon R500/R600) send spurious vblank
      irq's outside the vblank if vblank irq's get reenabled.
      These get detected by use of the vblank timestamps and
      filtered out to avoid miscounting of vblanks.
      
      Some race-conditions between the vblank irq enable/disable
      functions, the vblank irq handler and the gpu itself (updating
      its hardware vblank counter in the "wrong" moment) are
      fixed inside vblank_disable_and_save() and
      drm_update_vblank_count() by use of the vblank timestamps and
      a new spinlock dev->vblank_time_lock.
      
      The time until vblank irq disable is now configurable via
      a new drm module parameter drm.vblankoffdelay to allow
      experimentation with timeouts that are much shorter than
      the current 5 seconds and should allow longer vblank off
      periods for better power savings.
      
      Followup patches will use these new functions to
      implement precise timestamping for the intel and radeon
      kms drivers.
      Signed-off-by: NMario Kleiner <mario.kleiner@tuebingen.mpg.de>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      27641c3f
  4. 01 11月, 2010 1 次提交
  5. 01 10月, 2010 3 次提交
  6. 28 9月, 2010 1 次提交
  7. 30 8月, 2010 13 次提交
  8. 17 8月, 2010 1 次提交
    • D
      drm: block userspace under allocating buffer and having drivers overwrite it (v2) · 1b2f1489
      Dave Airlie 提交于
      With the current screwed but its ABI, ioctls for the drm, Linus pointed out that we could allow userspace to specify the allocation size, but we pass it to the driver which then uses it blindly to store a struct. Now if userspace specifies the allocation size as smaller than the driver needs, the driver can possibly overwrite memory.
      
      This patch restructures the driver ioctls so we store the structure size we are expecting, and make sure we allocate at least that size. The copy from/to userspace are still restricted to the size the user specifies, this allows ioctl structs to grow on both sides of the equation.
      
      Up until now we didn't really use the DRM_IOCTL defines in the kernel, so this cleans them up and adds them for nouveau.
      
      v2:
      fix nouveau pushbuf arg (thanks to Ben for pointing it out)
      Reported-by: NLinus Torvalds <torvalds@linuxfoundation.org>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      1b2f1489
  9. 10 8月, 2010 1 次提交
    • B
      drm: Fix support for PCI domains · c17c2f89
      Benjamin Herrenschmidt 提交于
      (For some reason I thought that went in ages ago ...)
      
      This fixes support for PCI domains in what should hopefully be a backward
      compatible way along with a change to libdrm.
      
      When the interface version is set to 1.4, we assume userspace understands
      domains and the world is at peace. We thus pass proper domain numbers
      instead of 0 to userspace.
      
      The newer libdrm will then try 1.4 first, and fallback to 1.1, along with
      ignoring domains in the later case (well, except on alpha of course)
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      c17c2f89
  10. 05 8月, 2010 1 次提交
    • A
      drm: kill BKL from common code · 58374713
      Arnd Bergmann 提交于
      This restricts the use of the big kernel lock to the i830 and i810
      device drivers. The three remaining users in common code (open, ioctl
      and release) get converted to a new mutex, the drm_global_mutex,
      making the locking stricter than the big kernel lock.
      
      This may have a performance impact, but only in those cases that
      currently don't use DRM_UNLOCKED flag in the ioctl list and would
      benefit from that anyway.
      
      The reason why i810 and i830 cannot use drm_global_mutex in their
      mmap functions is a lock-order inversion problem between the current
      use of the BKL and mmap_sem in these drivers. Since the BKL has
      release-on-sleep semantics, it's harmless but it would cause trouble
      if we replace the BKL with a mutex.
      
      Instead, these drivers get their own ioctl wrappers that take the
      BKL around every ioctl call and then set their own handlers as
      DRM_UNLOCKED.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: David Airlie <airlied@linux.ie>
      Cc: dri-devel@lists.freedesktop.org
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      58374713
  11. 04 8月, 2010 1 次提交
  12. 02 7月, 2010 1 次提交
  13. 01 6月, 2010 2 次提交
  14. 20 4月, 2010 2 次提交
  15. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  16. 15 3月, 2010 1 次提交
  17. 11 2月, 2010 1 次提交
  18. 07 1月, 2010 1 次提交
  19. 18 12月, 2009 1 次提交
    • A
      drm: convert drm_ioctl to unlocked_ioctl · ed8b6704
      Arnd Bergmann 提交于
      drm_ioctl is called with the Big Kernel Lock held,
      which shows up very high in statistics on vfs_ioctl.
      
      Moving the lock into the drm_ioctl function itself
      makes sure we blame the right subsystem and it gets
      us one step closer to eliminating the locked version
      of fops->ioctl.
      
      Since drm_ioctl does not require the lock itself,
      we only need to hold it while calling the specific
      handler. The 32 bit conversion handlers do not
      interact with any other code, so they don't need
      the BKL here either and can just call drm_ioctl.
      
      As a bonus, this cleans up all the other users
      of drm_ioctl which now no longer have to find
      the inode or call lock_kernel.
      
      [airlied: squashed the non-driver bits
      of the second patch in here, this provides
      the flag for drivers to use to select unlocked
      ioctls - but doesn't modify any drivers].
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: David Airlie <airlied@linux.ie>
      Cc: dri-devel@lists.sourceforge.net
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      ed8b6704
  20. 04 12月, 2009 2 次提交
  21. 02 12月, 2009 1 次提交
    • L
      drm/i915: Fix sync to vblank when VGA output is turned off · 778c9026
      Li Peng 提交于
      In current vblank-wait implementation, if we turn off VGA output,
      drm_wait_vblank will still wait on the disabled pipe until timeout,
      because vblank on the pipe is assumed be enabled. This would cause
      slow system response on some system such as moblin.
      
      This patch resolve the issue by adding a drm helper function
      drm_vblank_off which explicitly clear vblank_enabled[crtc], wake up
      any waiting queue and save last vblank counter before turning off
      crtc. It also slightly change drm_vblank_get to ensure that we will
      will return immediately if trying to wait on a disabled pipe.
      Signed-off-by: NLi Peng <peng.li@intel.com>
      Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      [anholt: hand-applied for conflicts with overlay changes]
      Signed-off-by: NEric Anholt <eric@anholt.net>
      778c9026
  22. 25 11月, 2009 1 次提交
    • E
      drm/i915: Replace a calloc followed by copying data over it with malloc. · c8e0f93a
      Eric Anholt 提交于
      Execbufs involve quite a bit of payload, to the extent that cache misses
      show up in the profiles here, and a suspicion that some of those cachelines
      may get evicted and then reloaded in the subsequent copy.
      
      This is still abstracted like drm_calloc_large since we want to check for
      size overflow, and because we want to choose between kmalloc and vmalloc
      on the fly.  cairo's interface for malloc-with-calloc's-args was used as
      the model.
      Signed-off-by: NEric Anholt <eric@anholt.net>
      c8e0f93a
  23. 18 11月, 2009 1 次提交