- 3 July 2017, 1 commit
-
-
Committed by Gustavo Padovan

In some cases, like cursor updates, it is desirable to update the plane asynchronously to avoid big delays. The currently queued update could still be waiting for a fence to signal, blocking any subsequent update until its scanout. In cases like this, updating the cursor synchronously through the atomic API causes significant delays that are noticeable to the end user.

This patch creates a fast path that jumps ahead of the currently queued state and performs single-plane updates without going through all the atomic steps in drm_atomic_helper_commit(). We take this path for legacy cursor updates. For now only single-plane updates are supported, but we plan to support multi-plane updates and async page flips through this interface as well in the near future.

v6: move check code to drm_atomic_helper.c (Daniel Vetter)
v5: improve comments (Eric Anholt)
v4: fix state->crtc NULL check (Archit Taneja)
v3: fix iteration on the wrong crtc state; put back code to forbid updates if there is a queued update for the same plane (Ville Syrjälä); move size checks back to drivers (Ville Syrjälä); move ASYNC_UPDATE flag addition to its own patch (Ville Syrjälä)
v2: allow updates even if there is a queued update for the same plane; fixes to the documentation (Emil Velikov); unconditionally call ->atomic_async_update (Emil Velikov); check for ->atomic_async_update earlier (Daniel Vetter); make ->atomic_async_check() the last step (Daniel Vetter); add ASYNC_UPDATE flag (Eric Anholt); update state in core after ->atomic_async_update (Eric Anholt); update docs (Eric Anholt)

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org> (v5)
Acked-by: Eric Anholt <eric@anholt.net> (v5)
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170630180322.29007-2-gustavo@padovan.org
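A minimal sketch of the fast-path gating described above, modeled on the drm_atomic helpers (names and exact conditions here are assumptions, not the verbatim upstream check): the update must touch exactly one plane, keep it on its current CRTC, and the driver must provide the async hooks.

```c
#include <drm/drm_atomic.h>
#include <drm/drm_modeset_helper_vtables.h>

/* Return true if 'state' may take the async (cursor-style) fast path. */
static bool can_take_async_path(struct drm_atomic_state *state)
{
	struct drm_plane *plane;
	struct drm_plane_state *new_state;
	const struct drm_plane_helper_funcs *funcs;
	int i, n_planes = 0;

	for_each_new_plane_in_state(state, plane, new_state, i)
		n_planes++;

	if (n_planes != 1)
		return false;		/* single-plane updates only, for now */

	if (!new_state->crtc || new_state->crtc != plane->state->crtc)
		return false;		/* no enable/disable or CRTC change */

	funcs = plane->helper_private;
	if (!funcs || !funcs->atomic_async_update)
		return false;		/* driver doesn't support async updates */

	return true;			/* ->atomic_async_check() runs last */
}
```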
-
- 30 June 2017, 2 commits
-
-
Committed by Laurent Pinchart

The old state is useful for drivers that need to perform operations at enable time that depend on the transition between the old and new states. While at it, rename the operation to .atomic_enable() to be consistent with .atomic_disable(), as the .enable() operation is used by the atomic helpers only.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com> # for sun4i
Acked-by: Philipp Zabel <p.zabel@pengutronix.de> # for imx-drm and mediatek
Acked-by: Alexey Brodkin <abrodkin@synopsys.com> # for arcpgu
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> # for atmel-hlcdc
Acked-by: Liviu Dudau <Liviu.Dudau@arm.com> # for hdlcd and mali-dp
Acked-by: Stefan Agner <stefan@agner.ch> # for fsl-dcu
Tested-by: Philippe Cornu <philippe.cornu@st.com> # for stm
Acked-by: Philippe Cornu <philippe.cornu@st.com> # for stm
Acked-by: Vincent Abriou <vincent.abriou@st.com> # for sti
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> # for vmwgfx
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170630093646.7928-2-laurent.pinchart+renesas@ideasonboard.com
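A sketch of the resulting hook pair as the commit describes it (other members of drm_crtc_helper_funcs elided; treat the exact placement as an assumption):

```c
/* .atomic_enable() now receives the old CRTC state, mirroring
 * .atomic_disable(); drivers can inspect the old->new transition. */
struct drm_crtc_helper_funcs_sketch {
	void (*atomic_enable)(struct drm_crtc *crtc,
			      struct drm_crtc_state *old_crtc_state);
	void (*atomic_disable)(struct drm_crtc *crtc,
			       struct drm_crtc_state *old_crtc_state);
};
```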
-
Committed by Chris Wilson

We often have the task of comparing two seqnos known to be on the same context, so provide a common __dma_fence_is_later().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170629125930.821-1-chris@chris-wilson.co.uk
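Once both seqnos are known to share a timeline, "later" reduces to a wrap-safe signed comparison. A sketch of the helper's likely shape (dma-fence seqnos were 32-bit at the time):

```c
/* True if f1 is a later seqno than f2 on the same fence context.
 * The signed cast keeps the comparison correct across u32 wraparound. */
static inline bool __dma_fence_is_later(u32 f1, u32 f2)
{
	return (int)(f1 - f2) > 0;
}
```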
-
- 28 June 2017, 1 commit
-
-
Committed by Daniel Vetter

There's no reason for drivers to call this, and all the ones I've removed looked very fishy:

- Proper quiescing of the vblank machinery should be done by calling drm_crtc_vblank_off(), which is best done by shutting down the entire display engine with drm_atomic_helper_shutdown.
- Releasing of allocated memory is done by the core already; it calls drm_vblank_cleanup as a fallback.
- drm_vblank_cleanup also has checks for drivers which forget to clean up vblank interrupts.

This essentially reverts

  commit e77cef9c
  Author: Jerome Glisse <jglisse@redhat.com>
  Date:   Thu Jan 7 15:39:13 2010 +0100

      drm: Avoid calling vblank function is vblank wasn't initialized

which was done to fix a bug in radeon code with msi interrupts:

  commit 003e69f9
  Author: Jerome Glisse <jglisse@redhat.com>
  Date:   Thu Jan 7 15:39:14 2010 +0100

      drm/radeon/kms: Don't try to enable IRQ if we have no handler installed

Afaict from digging around in old code, this was needed to avoid blowing up in the ums fallback, and has stopped serving its purpose long ago: if irq init fails, the driver fails to load, and there's really no way to blow up anymore.

Long story short, this was most likely a small ums compat/fallback hack that became a thing of its own and got cargo-cult duplicated all over the drm codebase for essentially no gain at all.

v2: Mention that for drivers with a ->release callback cleanup is handled by drm_dev_fini() (Thierry).

Cc: Thierry Reding <treding@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170626161949.25629-2-daniel.vetter@ffwll.ch
-
- 24 June 2017, 1 commit
-
-
Committed by Tejun Heo

Commit bf5eb3de ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") made slub sysfs file removals synchronous to kmem_cache shutdown. Unfortunately, this created a possible ABBA deadlock between slab_mutex and the sysfs draining mechanism, triggering the following lockdep warning.

  ======================================================
  [ INFO: possible circular locking dependency detected ]
  4.10.0-test+ #48 Not tainted
  -------------------------------------------------------
  rmmod/1211 is trying to acquire lock:
   (s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40

  but task is already holding lock:
   (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (slab_mutex){+.+.+.}:
         lock_acquire+0xf6/0x1f0
         __mutex_lock+0x75/0x950
         mutex_lock_nested+0x1b/0x20
         slab_attr_store+0x75/0xd0
         sysfs_kf_write+0x45/0x60
         kernfs_fop_write+0x13c/0x1c0
         __vfs_write+0x28/0x120
         vfs_write+0xc8/0x1e0
         SyS_write+0x49/0xa0
         entry_SYSCALL_64_fastpath+0x1f/0xc2

  -> #0 (s_active#120){++++.+}:
         __lock_acquire+0x10ed/0x1260
         lock_acquire+0xf6/0x1f0
         __kernfs_remove+0x254/0x320
         kernfs_remove+0x23/0x40
         sysfs_remove_dir+0x51/0x80
         kobject_del+0x18/0x50
         __kmem_cache_shutdown+0x3e6/0x460
         kmem_cache_destroy+0x1fb/0x2d0
         kvm_exit+0x2d/0x80 [kvm]
         vmx_exit+0x19/0xa1b [kvm_intel]
         SyS_delete_module+0x198/0x1f0
         entry_SYSCALL_64_fastpath+0x1f/0xc2

  other info that might help us debug this:

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(slab_mutex);
                                 lock(s_active#120);
                                 lock(slab_mutex);
    lock(s_active#120);

   *** DEADLOCK ***

  2 locks held by rmmod/1211:
   #0: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80
   #1: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0

  stack backtrace:
  CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48
  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
  Call Trace:
   print_circular_bug+0x1be/0x210
   __lock_acquire+0x10ed/0x1260
   lock_acquire+0xf6/0x1f0
   __kernfs_remove+0x254/0x320
   kernfs_remove+0x23/0x40
   sysfs_remove_dir+0x51/0x80
   kobject_del+0x18/0x50
   __kmem_cache_shutdown+0x3e6/0x460
   kmem_cache_destroy+0x1fb/0x2d0
   kvm_exit+0x2d/0x80 [kvm]
   vmx_exit+0x19/0xa1b [kvm_intel]
   SyS_delete_module+0x198/0x1f0
   ? SyS_delete_module+0x5/0x1f0
   entry_SYSCALL_64_fastpath+0x1f/0xc2

It'd be the cleanest to deal with the issue by removing the sysfs files without holding slab_mutex before the rest of shutdown; however, given the current code structure, it is pretty difficult to do so. This patch punts sysfs file removal to a work item. Before commit bf5eb3de, the removal was punted to an RCU delayed work item which is executed after release. Now, we're punting to a different work item on shutdown, which still maintains the goal of removing the sysfs files earlier when destroying kmem_caches.

Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org
Fixes: bf5eb3de ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
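The shape of the fix, sketched with a self-contained stand-in for the relevant kmem_cache members (the upstream field and function names may differ): sysfs teardown is queued to a work item so it never runs under slab_mutex.

```c
#include <linux/kobject.h>
#include <linux/workqueue.h>

/* Minimal stand-in for the sysfs-related members of struct kmem_cache. */
struct cache_sketch {
	struct kobject kobj;
	struct work_struct kobj_remove_work;
};

static void sysfs_slab_remove_workfn(struct work_struct *work)
{
	struct cache_sketch *s =
		container_of(work, struct cache_sketch, kobj_remove_work);

	/* Runs from a workqueue, so no slab_mutex is held here: draining
	 * the sysfs files can no longer deadlock with kmem_cache_destroy(). */
	kobject_del(&s->kobj);
	kobject_put(&s->kobj);
}

static void sysfs_slab_remove_sketch(struct cache_sketch *s)
{
	/* Called during shutdown while slab_mutex is held: just punt. */
	INIT_WORK(&s->kobj_remove_work, sysfs_slab_remove_workfn);
	schedule_work(&s->kobj_remove_work);
}
```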
-
- 23 June 2017, 1 commit
-
-
Committed by Gerd Hoffmann

Drop them from u64 fields and tag local variables correctly instead. While at it, switch the code to use u64_to_user_ptr().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620113916.6967-2-kraxel@redhat.com
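The pattern in question: ioctl structs carry user pointers as plain u64 fields (no __user annotation on the u64 itself), and the conversion happens at the point of use via u64_to_user_ptr(). A sketch with a hypothetical params struct:

```c
#include <linux/kernel.h>	/* u64_to_user_ptr() */
#include <linux/types.h>
#include <linux/uaccess.h>

struct my_ioctl_params {	/* hypothetical uapi struct */
	__u64 bo_handles;	/* user pointer carried as a plain u64 */
	__u32 count;
};

static int fetch_handles(const struct my_ioctl_params *params, u32 *buf)
{
	/* Tag the local variable, not the u64 field, as __user. */
	u32 __user *uptr = u64_to_user_ptr(params->bo_handles);

	if (copy_from_user(buf, uptr, params->count * sizeof(*buf)))
		return -EFAULT;
	return 0;
}
```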
-
- 22 June 2017, 3 commits
-
-
Committed by Boris Brezillon

Add a helper to wait for all page flips of an atomic state to be done.

v2: pimp kerneldoc as discussed with Boris on irc; add missing doc for @dev; use old_state for consistency with wait_for_vblanks

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> (v1)
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1496392332-8722-2-git-send-email-boris.brezillon@free-electrons.com
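Going by the upstream name, drm_atomic_helper_wait_for_flip_done() (an assumption; the changelog does not spell it out), the helper slots into a driver's commit tail roughly like this:

```c
#include <drm/drm_atomic_helper.h>

static void my_atomic_commit_tail(struct drm_atomic_state *old_state)
{
	struct drm_device *dev = old_state->dev;

	drm_atomic_helper_commit_modeset_disables(dev, old_state);
	drm_atomic_helper_commit_planes(dev, old_state, 0);
	drm_atomic_helper_commit_modeset_enables(dev, old_state);
	drm_atomic_helper_commit_hw_done(old_state);

	/* Block until every CRTC touched by old_state has signalled its
	 * page flip, instead of waiting for a full vblank. */
	drm_atomic_helper_wait_for_flip_done(dev, old_state);

	drm_atomic_helper_cleanup_planes(dev, old_state);
}
```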
-
Committed by Jarkko Nikula

Commit f406270b ("ACPI / scan: Set the visited flag for all enumerated devices") caused two groups of special SPI and I2C devices to no longer enumerate. SPI and I2C devices are expected to be enumerated by the SPI and I2C subsystems, but the change caused acpi_bus_attach() to mark those devices with acpi_device_set_enumerated().

The first group of devices is matched using the Device Tree compatible property with the special _HID "PRP0001". Those devices have a matched scan handler, acpi_scan_attach_handler() returns 1, and acpi_bus_attach() marks them with acpi_device_set_enumerated().

The second group of devices, without a valid _HID such as "LNXVIDEO", have device->pnp.type.platform_id set to zero, and the change again marks them with acpi_device_set_enumerated().

Fix this by flagging the SPI and I2C devices at struct acpi_device object initialization time and letting the code in acpi_bus_attach() go through the device_attach() and acpi_default_enumeration() path for all SPI and I2C devices.

Fixes: f406270b (ACPI / scan: Set the visited flag for all enumerated devices)
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Jens Axboe

If we have shared tags enabled, then every IO completion will trigger a full loop over every queue belonging to a tag set, and every hardware queue for each of those queues, even if nothing needs to be done. This causes a massive performance regression if you have a lot of shared devices.

Instead of doing this huge full scan on every IO, add an atomic counter to the main queue that tracks how many hardware queues have been marked as needing a restart. With that, we can avoid looking for restartable queues if we don't have to.

Max reports that this restores performance. Before this patch, 4K IOPS was limited to 22-23K IOPS. With the patch, we are running at 950-970K IOPS.

Fixes: 6d8c6c0f ("blk-mq: Restart a single queue if tag sets are shared")
Reported-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
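A sketch of the counting scheme (the queue field name is an assumption based on the upstream fix): the mark path keeps the counter in sync with the per-hctx restart bit, and the completion path bails out early when the counter is zero.

```c
#include <linux/atomic.h>
#include <linux/blk-mq.h>
#include <linux/blkdev.h>

/* Flagging a hctx for restart bumps the queue-wide counter exactly once. */
static void mark_restart(struct request_queue *q, struct blk_mq_hw_ctx *hctx)
{
	if (!test_and_set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
		atomic_inc(&q->shared_hctx_restart);
}

static void restart_shared_queues(struct request_queue *q)
{
	/* Fast path: nobody asked for a restart, so skip the
	 * O(queues x hctxs) walk that used to run on every completion. */
	if (!atomic_read(&q->shared_hctx_restart))
		return;

	/* ... walk all queues in the tag set and kick flagged hctxs ... */
}
```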
-
- 20 June 2017, 8 commits
-
-
Committed by Liviu Dudau

drm_fbdev_cma_set_suspend{,_unlocked} use an integer parameter to describe whether the intended state is a suspend or a resume, then pass the value to drm_fb_helper_set_suspend{,_unlocked}, which uses a boolean. Switch to using bool everywhere.

Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620102320.8849-1-Liviu.Dudau@arm.com
-
Committed by John Stultz

Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone unnoticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest.

This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable <stable@vger.kernel.org> # 4.8+
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
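The underlying technique, in miniature: accumulate in shifted fixed-point so the fractional nanoseconds carry across intervals instead of being truncated each time (the source of the 1ns discontinuity). Names here are illustrative, not the timekeeping core's own.

```c
#include <linux/types.h>

#define NSEC_PER_SEC_ULL 1000000000ULL

struct raw_accum {
	u64 sec;
	u64 xtime_nsec;	/* nanoseconds << shift; fraction lives here */
	u32 shift;
};

static void raw_accumulate(struct raw_accum *a, u64 delta_cycles, u32 mult)
{
	/* Add shifted nanoseconds; the low 'shift' bits keep the fraction. */
	a->xtime_nsec += delta_cycles * mult;

	while (a->xtime_nsec >= (NSEC_PER_SEC_ULL << a->shift)) {
		a->xtime_nsec -= (NSEC_PER_SEC_ULL << a->shift);
		a->sec++;
	}
}
```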
-
Committed by John Stultz

In tests which exercise switching of clocksources, a NULL pointer dereference can be observed on ARM64 platforms in the clocksource read() function:

  u64 clocksource_mmio_readl_down(struct clocksource *c)
  {
	return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
  }

This is called from the core timekeeping code via:

  cycle_now = tkr->read(tkr->clock);

tkr->read is the cached tkr->clock->read() function pointer. When the clocksource is changed, tkr->clock and tkr->read are updated sequentially. The code above results in a sequential load operation of tkr->read and tkr->clock as well. If the store to tkr->clock hits between the loads of tkr->read and tkr->clock, then the old read() function is called with the new clock pointer. As a consequence, the read() function dereferences a different data structure and the resulting 'reg' pointer can point anywhere, including NULL.

This problem was introduced when the timekeeping code was switched over to use struct tk_read_base. Before that, it was theoretically possible as well when the compiler decided to reload clock in the code sequence:

  now = tk->clock->read(tk->clock);

Add a helper function which avoids the issue by reading tk_read_base->clock once into a local variable clk and then issuing the read function via clk->read(clk). This guarantees that the read() function always gets the proper clocksource pointer handed in.

Since there is now no use for the tkr.read pointer, this patch also removes it, and to address stopping the fast timekeeper during suspend/resume, it introduces a dummy clocksource to use rather than just a dummy read function.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: stable <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Daniel Mentz <danielmentz@google.com>
Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
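The heart of the fix, sketched as the changelog describes it: load the clocksource pointer once, then call through the local copy, so a concurrent clocksource switch can never pair the old read() with the new clock. A stand-in struct is used here since the real tk_read_base has more members.

```c
#include <linux/clocksource.h>

/* tk_read_base as the changelog describes it after the patch: the
 * cached 'read' pointer is gone; only the clocksource pointer remains. */
struct tk_read_base_sketch {
	struct clocksource *clock;
	/* ... mult, shift, cycle_last, ... */
};

static inline u64 tk_clock_read(const struct tk_read_base_sketch *tkr)
{
	/* Single load: 'clock' cannot change between fetching the read()
	 * function pointer and passing it its own argument. */
	struct clocksource *clock = READ_ONCE(tkr->clock);

	return clock->read(clock);
}
```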
-
Committed by Daniel Vetter

I spotted a markup issue, plus added the descriptions in drm_driver, plus a few more links while at it. I'm still mildly unhappy with the split between fops and ioctls, but I still think having the ioctls in the uapi chapter makes more sense. Oh well ...

v2: rebase
v3: move misplaced hunk to the right patch

Cc: Stefan Agner <stefan@agner.ch>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170531092045.3950-1-daniel.vetter@ffwll.ch
-
Committed by Daniel Vetter

The magic switching between a proper pci driver and shadow-attach isn't useful anymore since there are no ums+kms drivers left. Let's split this up properly, calling pci_register_driver for kms drivers and renaming the shadow-attach init to drm_legacy_pci_init/exit.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-6-daniel.vetter@ffwll.ch
-
Committed by Daniel Vetter

The only special case is pci devices, and we can easily handle this in the core. Do so and drop a pile of boilerplate from drivers.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-5-daniel.vetter@ffwll.ch
-
Committed by Daniel Vetter

We use the drm_crtc_ prefix for all the new-style vblank functions which directly take a struct drm_crtc *. drm_accurate_vblank_count was the odd one out; correct this to appease my OCD.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-13-daniel.vetter@ffwll.ch
-
Committed by Daniel Vetter

Unify and review everything, plus make sure it's all correct markup. Drop the kernel-doc for internal functions. Also rework the overview section; it's become rather outdated. Unfortunately the kernel-doc in drm_driver isn't rendered yet, but that will change as soon as drm_driver is kernel-docified properly. Also document properly that drm_vblank_cleanup is optional; the core calls this already.

v2: Make it clear that cleanup happens in drm_dev_fini for drivers with their own ->release callback (Thierry).

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-11-daniel.vetter@ffwll.ch
-
- 19 June 2017, 1 commit
-
-
Committed by Hugh Dickins

The stack guard page is a useful feature to reduce the risk of the stack smashing into a different mapping. We have been using a single-page gap, which is sufficient to prevent having the stack adjacent to a different mapping. But this seems to be insufficient in light of the stack usage in userspace. E.g. glibc uses alloca() allocations as large as 64kB in many commonly used functions. Others use constructs like gid_t buffer[NGROUPS_MAX], which is 256kB, or stack strings with MAX_ARG_STRLEN.

This will become especially dangerous for suid binaries and the default no-limit stack size limit, because those applications can be tricked into consuming a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunately.

Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size, because systems with larger base pages might cap stack allocations in PAGE_SIZE units), which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix, because the problem is somewhat inherent, but it should reduce the attack space a lot.

One could argue that the gap size should be configurable from userspace, but that can be done later, when somebody finds that the new 1MB is wrong for some special-case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units).

Implementation-wise, first delete all the old code for the stack guard page: although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace. Case in point: a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode.

Instead of keeping the gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap, mainly arch_get_unmapped_area(), and the vma tree's subtree_gap support for that.

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
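A sketch of the gap accounting this describes, modeled on the upstream helpers: the guard gap lives between vmas, and the few places that must respect it ask for the gap-adjusted start instead of the raw vm_start.

```c
#include <linux/mm.h>

/* Default gap: 256 pages, i.e. 1MB with 4k pages; overridable via the
 * stack_guard_gap= command line option. */
static unsigned long stack_guard_gap = 256UL << PAGE_SHIFT;

static inline unsigned long vm_start_gap_sketch(struct vm_area_struct *vma)
{
	unsigned long vm_start = vma->vm_start;

	if (vma->vm_flags & VM_GROWSDOWN) {
		vm_start -= stack_guard_gap;
		if (vm_start > vma->vm_start)	/* underflow past 0: clamp */
			vm_start = 0;
	}
	return vm_start;
}
```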
-
- 17 June 2017, 1 commit
-
-
Committed by Dave Airlie

This creates a new command submission chunk for amdgpu to add in and out sync objects around the submission. Sync objects are managed via the drm syncobj ioctls.

The command submission interface is enhanced with two new chunks, one for syncobj pre-submission dependencies and one for post-submission sync obj signalling, and just takes a list of handles for each.

This is based on work originally done by David Zhou at AMD, with input from Christian König on what things should look like. In theory VkFences could be backed with sync objects and just get passed into the cs as syncobj handles as well.

NOTE: this interface addition needs a version bump to expose it to userspace.
TODO: update to dep_sync when rebasing onto amdgpu master. (with this - r-b from Christian)

v1.1: keep file reference on import
v2: move to using syncobjs
v2.1: change some APIs to just use p pointer
v3: make more robust against CS failures; we now add the wait sems but only remove them once the CS job has been submitted
v4: rewrite names of API and base on new syncobj code
v5: move post deps earlier, rename some apis
v6: lookup post deps earlier, and just replace fences in post deps stage (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
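In rough outline (struct and chunk names assumed from the upstream uapi; handle values are placeholders), userspace attaches arrays of syncobj handles as extra CS chunks:

```c
#include <stdint.h>

/* Per the upstream uapi, each chunk payload is an array of these. */
struct drm_amdgpu_cs_chunk_sem {
	uint32_t handle;	/* drm syncobj handle */
};

/* Example payloads for a submission that waits on one syncobj and
 * signals another:
 *   chunk_id AMDGPU_CHUNK_ID_SYNCOBJ_IN  -> in_deps (waited on first)
 *   chunk_id AMDGPU_CHUNK_ID_SYNCOBJ_OUT -> out_deps (signalled after) */
static const struct drm_amdgpu_cs_chunk_sem in_deps[]  = { { .handle = 3 } };
static const struct drm_amdgpu_cs_chunk_sem out_deps[] = { { .handle = 4 } };
```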
-
- 16 June 2017, 5 commits
-
-
Committed by Chris Wilson

Currently, the last object in the execlist is always the batch. However, when building the batch buffer we often know the batch object first, and if we can use the first slot in the execlist we can emit relocation instructions relative to it immediately and avoid a separate pass to adjust the relocations to point to the last execlist slot.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Jordan Crouse

Modify the 'pad' member of struct drm_msm_gem_info to 'flags'. If the user sets 'flags' to non-zero, it means that they want an IOVA for the GEM object instead of a mmap() offset. Return the iova in the 'offset' member.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[robclark: s/hint/flags in commit msg]
Signed-off-by: Rob Clark <robdclark@gmail.com>
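A hypothetical userspace call, assuming the flag name the msm uapi uses for this (MSM_INFO_IOVA) and libdrm's drmCommandWriteRead() wrapper:

```c
#include <stdint.h>
#include <xf86drm.h>
#include "msm_drm.h"	/* struct drm_msm_gem_info, MSM_INFO_IOVA */

static int get_bo_iova(int fd, uint32_t handle, uint64_t *iova)
{
	struct drm_msm_gem_info req = {
		.handle = handle,
		.flags  = MSM_INFO_IOVA,	/* non-zero: ask for the IOVA */
	};
	int ret = drmCommandWriteRead(fd, DRM_MSM_GEM_INFO,
				      &req, sizeof(req));

	if (ret == 0)
		*iova = req.offset;	/* IOVA returned in 'offset' */
	return ret;
}
```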
-
Committed by Jordan Crouse

The ioctl array is sparsely populated, but the compiler will make sure that it is sufficiently sized for all the values that we have, so we can safely use ARRAY_SIZE() instead of having a constantly changing #define in the uapi header.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
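The pattern being adopted, sketched with one representative entry (the handler stub is illustrative): designated initializers may leave gaps in the array, and ARRAY_SIZE() sizes it at compile time, so no hand-maintained count is needed.

```c
#include <drm/drmP.h>

static int msm_ioctl_get_param(struct drm_device *dev, void *data,
			       struct drm_file *file);	/* example handler */

static const struct drm_ioctl_desc msm_ioctls[] = {
	DRM_IOCTL_DEF_DRV(MSM_GET_PARAM, msm_ioctl_get_param,
			  DRM_AUTH | DRM_RENDER_ALLOW),
	/* ... remaining entries, possibly sparse ... */
};

static struct drm_driver msm_driver = {
	/* ... */
	.ioctls     = msm_ioctls,
	.num_ioctls = ARRAY_SIZE(msm_ioctls),	/* replaces the uapi #define */
};
```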
-
Committed by Eric Anholt

This allows mesa to set the tiling format for a BO and have that tiling format be respected by mesa on the other side of an import/export (and by vc4 scanout in the kernel), without defining a protocol to pass the tiling through userspace.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-2-eric@anholt.net
Acked-by: Dave Airlie <airlied@redhat.com>
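Hypothetical userspace usage, assuming the vc4 uapi names (struct drm_vc4_set_tiling, DRM_IOCTL_VC4_SET_TILING) and the Broadcom T-tiling modifier from drm_fourcc.h:

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm_fourcc.h>	/* DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED */
#include "vc4_drm.h"	/* struct drm_vc4_set_tiling */

static int bo_set_t_tiled(int fd, uint32_t handle)
{
	struct drm_vc4_set_tiling set = {
		.handle   = handle,
		.modifier = DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED,
	};

	/* The kernel remembers the modifier, so the importing side of a
	 * PRIME export can query it back instead of guessing the layout. */
	return ioctl(fd, DRM_IOCTL_VC4_SET_TILING, &set);
}
```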
-
Committed by Eric Anholt

The T tiling format is what V3D uses for textures, with no raster support at all until later revisions of the hardware (and always at a large 3D performance penalty). If we can't scan out V3D's format, then we often need to do a relayout at some stage of the pipeline: either right before texturing from the scanout buffer (common in X11 without a compositor) or between a tiled screen buffer right before scanout (an option I've considered in trying to resolve this inconsistency, but which means needing to use the dirty fb ioctl and having some update policy).

T-format scanout lets us avoid either of those shadow copies, for a massive, obvious performance improvement to X11 window dragging without a compositor. Unfortunately, enabling a compositor to work around the discrepancy has turned out to be too costly in memory consumption for the Raspbian distribution.

Because the HVS operates a scanline at a time, compositing from T does increase the memory bandwidth cost of scanout. On my 1920x1080@32bpp display on a RPi3, we go from about 15% of system memory bandwidth with linear to about 20% with tiled. However, for X11 this still ends up being a huge performance win in active usage.

This patch doesn't yet handle src_x/src_y offsetting within the tiled buffer. However, we fail to do so for untiled buffers already.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
-
- 15 June 2017, 12 commits
-
-
Committed by Mikko Perttunen

This is largely a rewrite of the Host1x channel allocation code, bringing several changes:

- The previous code could deadlock due to an interaction between the 'reflock' mutex and CDMA timeout handling. This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186.
- General refactoring, including better encapsulation of channel ownership handling into channel.c.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Dmitry Osipenko

The arguments of .is_addr_reg() are swapped in the definition of the function, which is quite confusing.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Dmitry Osipenko

Several channels could be made to write the same unit concurrently via the SETCLASS opcode, and trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add per-unit locking by inserting MLOCKs into the command stream to re-allow the SETCLASS opcode, but that will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Dmitry Osipenko

The waitchecks, along with multiple syncpoints per submit, are not ready for use yet; let's forbid them for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Thierry Reding

Improve the kerneldoc for the public parts of the host1x infrastructure in preparation for adding a driver-specific part to the GPU documentation.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Andy Lutomirski

Currently they return -1 on error, which will confuse callers if they try to interpret it as a normal negative error code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
-
Committed by Dawid Kurek

Forward declarations in C are great, but I'm pretty sure one is enough.

Signed-off-by: Dawid Kurek <dawikur@gmail.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170614213518.GA3554@gmail.com
-
Committed by Robert Bragg

Enables access to OA unit metrics for BDW, CHV, SKL and BXT, which all share (more or less) the same OA unit design.

Of particular note in comparison to Haswell: some OA unit HW config state has become per-context state, and as a consequence it is somewhat more complicated to manage synchronous state changes from the cpu while there's no guarantee of what context (if any) is currently actively running on the gpu.

The periodic sampling frequency, which can be particularly useful for system-wide analysis (as opposed to command stream synchronised MI_REPORT_PERF_COUNT commands), is perhaps the most surprising state to have become per-context save and restored (while the OABUFFER destination is still a shared, system-wide resource).

This support for gen8+ takes care to consider a number of timing challenges involved in synchronously updating per-context state, primarily by programming all config state from the cpu and updating all current and saved contexts synchronously while the OA unit is still disabled.

The driver intentionally avoids depending on command streamer programming to update OA state, considering the lack of synchronization between the automatic loading of OACTXCONTROL state (which includes the periodic sampling state and enable state) on context restore and the parsing of any general purpose BB the driver can control. I.e. this implementation is careful to avoid the possibility of a context restore temporarily enabling any out-of-date periodic sampling state. In addition to the risk of transiently-out-of-date state being loaded automatically, there are also internal HW latencies involved in the loading of MUX configurations which would be difficult to account for from the command streamer (and we only want to enable the unit once the MUX configuration is complete).

Since the Gen8+ OA unit design no longer supports clock gating the unit off for a single given context (which effectively stopped any progress of counters while any other context was running), and instead supports tagging OA reports with a context ID for filtering on the CPU, it means we can no longer hide the system-wide progress of counters from a non-privileged application only interested in metrics for its own context. Although we could theoretically try and subtract the progress of other contexts before forwarding reports via read(), we aren't in a position to filter reports captured via MI_REPORT_PERF_COUNT commands. As a result, for Gen8+, we always require dev.i915.perf_stream_paranoid to be unset for any access to OA metrics if not root.

v5: drain submitted requests when enabling metric set to ensure no lite-restore erases the context image we just updated (Lionel)
v6: in addition to drain, switch to kernel context & update all contexts in place (Chris)
v7: add missing mutex_unlock() if switching to kernel context fails (Matthew)
v8: simplify OA period/flex-eu-counters programming by using the batchbuffer instead of modifying ctx-image (Lionel)
v9: back to updating the context image (due to erroneous testing, batchbuffer programming the OA unit doesn't actually work) (Lionel); pin context before updating context image (Chris); drop MMIO programming now that we switch to a kernel context with right values in initial context image (Chris)
v10: just pin_map the contexts we want to modify or let the configuration happen on first use (Chris)
v11: update kernel context OA config through the batchbuffer rather than on-the-fly ctx-image update (Lionel)
v12: rework OA context registers update again by switching away from user contexts and reconfiguring the kernel context through the batchbuffer and updating all the other contexts' context image; also take care to lock slice/subslice configuration when OA is on (Lionel)
v13: request rpcs updates on all engines when updating the OA config (Lionel)
v14: drop any kind of rpcs management now that we monitor sseu configuration changes in a later patch (Lionel); remove usleep after programming the NOA configs on Gen8+, this doesn't seem to be needed (Lionel)
v15: respect coding style for block comments (Chris)
v16: add missing i915_add_request() in case we fail to emit OA configuration (Matthew)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> \o/
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
-
Committed by Robert Bragg

Assuming a uniform mask across all slices, this enables userspace to determine which specific subslices can be enabled. This information is required, for example, to be able to analyse some OA counter reports where the counter configuration depends on the HW subslice configuration.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
-
Committed by Robert Bragg

Enables userspace to determine the maximum number of slices that can be enabled on the device and also know which specific slices can be enabled. This information is required, for example, to be able to analyse some OA counter reports where the counter configuration depends on the HW slice configuration.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
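Together with the subslice-mask commit above, this lands as ordinary getparams. A hypothetical userspace query, assuming the parameter names are I915_PARAM_SLICE_MASK and I915_PARAM_SUBSLICE_MASK:

```c
#include <sys/ioctl.h>
#include <i915_drm.h>	/* drm_i915_getparam_t, DRM_IOCTL_I915_GETPARAM */

static int query_sseu_masks(int fd, int *slice_mask, int *subslice_mask)
{
	drm_i915_getparam_t gp = {
		.param = I915_PARAM_SLICE_MASK,		/* bit per slice */
		.value = slice_mask,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return -1;

	gp.param = I915_PARAM_SUBSLICE_MASK;	/* uniform across slices */
	gp.value = subslice_mask;
	return ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
}
```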
-
Committed by Bart Van Assche

Avoid that the following complaint is reported:

  BUG: sleeping function called from invalid context at kernel/workqueue.c:2790
  in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3
  1 lock held by rcuop/3/41:
   #0: (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500
  Call Trace:
   dump_stack+0x86/0xcf
   ___might_sleep+0x174/0x260
   __might_sleep+0x4a/0x80
   flush_work+0x7e/0x2e0
   __cancel_work_timer+0x143/0x1c0
   cancel_work_sync+0x10/0x20
   blk_throtl_exit+0x25/0x60
   blkcg_exit_queue+0x35/0x40
   blk_release_queue+0x42/0x130
   kobject_put+0xa9/0x190

This happens since we invoke callbacks that need to block from the queue release handler. Fix this by pushing the final release to a workqueue.

Reported-by: Ross Zwisler <zwisler@gmail.com>
Fixes: commit b425e504 ("block: Avoid that blk_exit_rl() triggers a use-after-free")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>

Updated changelog
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Magnus Damm

Update the ->ndo_change_mtu() callback comment to remove the text about returning an error in case of an undefined callback. This change makes the comment match the existing code behavior.

Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 14 June 2017, 3 commits
-
-
Committed by Johannes Berg

Unfortunately, struct iwreq isn't a proper subset of struct ifreq, but is still handled by the same code path. Robert reported that applications may then (randomly) fault if the struct iwreq they pass happens to land within 8 bytes of the end of a mapping (the struct is only 32 bytes, vs. struct ifreq's 40 bytes). To fix this, pull out the code handling wireless extension ioctls and copy only the smaller structure in this case.

This bug goes back a long time; I tracked that it was introduced into mainline in 2.1.15, over 20 years ago!

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=195869

Reported-by: Robert O'Callahan <robert@ocallahan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
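The crux of the fix, sketched (helper names here are assumptions): for ioctls in the wireless-extension range, copy only sizeof(struct iwreq) from userspace, so the kernel never reads the 8 trailing bytes that exist in struct ifreq but not in the caller's smaller struct.

```c
#include <linux/uaccess.h>
#include <linux/wireless.h>	/* struct iwreq */

struct net;

/* Assumed post-fix signature: the wext handler takes the already
 * copied-in struct iwreq instead of re-copying a struct ifreq. */
int wext_handle_ioctl(struct net *net, struct iwreq *iwr,
		      unsigned int cmd, void __user *arg);

static int dev_wext_ioctl(struct net *net, unsigned int cmd,
			  void __user *arg)
{
	struct iwreq iwr;

	/* 32 bytes, not struct ifreq's 40: never read past the end of
	 * the user's buffer. */
	if (copy_from_user(&iwr, arg, sizeof(iwr)))
		return -EFAULT;

	return wext_handle_ioctl(net, &iwr, cmd, arg);
}
```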
-
Committed by Dave Airlie

This interface allows importing the fence from a sync_file into an existing drm sync object, or exporting the fence attached to an existing drm sync object into a new sync_file object. This should only be used to interact with sync files where necessary.

v1.1: fence put fixes (Chris); drop fence from ioctl names (Chris); fixup for new fence replace API

Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
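A hypothetical round-trip from userspace, assuming the upstream uapi names (struct drm_syncobj_handle plus the EXPORT_SYNC_FILE/IMPORT_SYNC_FILE flags):

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm.h>	/* struct drm_syncobj_handle, DRM_IOCTL_SYNCOBJ_* */

/* Export the fence currently attached to 'syncobj' as a sync_file fd. */
static int syncobj_export_sync_file(int fd, uint32_t syncobj, int *sync_fd)
{
	struct drm_syncobj_handle args = {
		.handle = syncobj,
		.flags  = DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE,
	};
	int ret = ioctl(fd, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, &args);

	if (ret == 0)
		*sync_fd = args.fd;
	return ret;
}

/* Replace 'syncobj's fence with the one carried by a sync_file fd. */
static int syncobj_import_sync_file(int fd, uint32_t syncobj, int sync_fd)
{
	struct drm_syncobj_handle args = {
		.handle = syncobj,
		.fd     = sync_fd,
		.flags  = DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE,
	};

	return ioctl(fd, DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, &args);
}
```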
-
Committed by Dave Airlie

Sync objects are a new toplevel drm object that contains a pointer to a fence. This fence can be updated via command submission ioctls by drivers. There is also a generic wait obj API modelled on the vulkan wait API (with code modelled on some amdgpu code). These objects can be converted to an opaque fd that can be passed between processes.

v2: rename reference/unreference to put/get (Chris); fix leaked reference (David Zhou); drop mutex in favour of cmpxchg (Chris)
v3: cleanups from danvet; rebase on drm_fops rename; check fd_flags is 0 in ioctls
v4: export find/free; change replace fence to take a syncobj, in order to support lookup-first, replace-later semantics, which seem in the end to be cleaner

Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
-
- 12 June 2017, 1 commit
-
-
Committed by Lv Zheng

Consider this case:

1. A program opens a sysfs table file 65535 times; it can increase validation_count, and the first increment causes the table to be mapped: validation_count = 65535
2. AML execution causes "Load" to be executed on the same table; this time it cannot increase validation_count, so validation_count remains: validation_count = 65535
3. The program closes the sysfs table file 65535 times; it can decrease validation_count, and the last decrement causes the table to be unmapped: validation_count = 0
4. AML code is still accessing the loaded table; a kernel crash can be observed.

To prevent that from happening, add a validation_count threshold. When it is reached, validation_count can no longer be incremented or decremented to invalidate the table descriptor (meaning table unmappings are prevented).

Note that the code added in acpi_tb_put_table() is actually a no-op, but it changes the warning message into a "warn once" one.

Lv Zheng.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog, comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
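The saturating-counter idea, sketched with stand-in names (threshold name and value are assumptions): once the count pins at the maximum, neither get nor put moves it, so a table that ever reaches the threshold can never be unmapped out from under a still-loaded user.

```c
#include <linux/types.h>

#define MAX_TABLE_VALIDATIONS	0xFFFF	/* assumed saturation threshold */

struct table_desc_sketch {
	u16 validation_count;
	/* ... mapping state ... */
};

static void table_get_ref(struct table_desc_sketch *desc)
{
	if (desc->validation_count < MAX_TABLE_VALIDATIONS)
		desc->validation_count++;	/* saturates, never wraps */
}

static bool table_put_ref(struct table_desc_sketch *desc)
{
	/* A saturated count is pinned: skip the decrement entirely, so
	 * the table can never be invalidated (unmapped) again. */
	if (desc->validation_count >= MAX_TABLE_VALIDATIONS)
		return false;

	return --desc->validation_count == 0;	/* true: safe to unmap */
}
```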
-