Commits · 9ba41327d8d01df54be1e6f1c246b123b009fa55 · openanolis / cloud-kernel

14 Sep, 2015 6 commits

sysfs-tagging.txt: fix pre-kernfs references · 9ba41327

Authored by Ulf Magnusson on Sep 02, 2015

 - sysfs_dirent is now kernfs_node - see commit 324a56e1 ("kernfs:
   s/sysfs_dirent/kernfs_node/ and rename its friends accordingly")

 - sysfs_super_info is now kernfs_super_info - see commit c525aadd
   ("kernfs: s/sysfs/kernfs/ in various data structures")

 - the 's_' prefix was dropped from various fields - see
   commit adc5e8b5 ("kernfs: drop s_ prefix from kernfs_node members")

Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

9ba41327

sysfs.txt: fix pre-kernfs sysfs_dirent reference · 390b421c

Authored by Ulf Magnusson on Sep 02, 2015

sysfs_dirent went away when kernfs was extracted from sysfs. The reference
to the kobject now lives in a kernfs_node (in the 'priv' member).

See commit 324a56e1 ("kernfs: s/sysfs_dirent/kernfs_node/ and rename
its friends accordingly").

Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

390b421c

documentation: fix small typo in rbtree.txt · 121e0248

Authored by Alexey Klimov on Sep 06, 2015

Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

121e0248

Documentation: lockstat: Fix typo lokcing -> locking · df5f0b6e

Authored by Stephen Boyd on Sep 09, 2015

Cc: Jiri Kosina <trivial@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

df5f0b6e

Docs/DocBook: Add .db files to .gitignore · 47f16529

Authored by Jonathan Corbet on Sep 11, 2015

These files were added with the XML crossreference patch; they make git
complain.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>

47f16529

DocBook: ignore .proc files · 2d774bd0

Authored by Brian Norris on Aug 31, 2015

These are generated as part of 'make htmldocs'. If we don't ignore them,
then most of our generated subdirectories get treated as "untracked" by
git.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

2d774bd0

11 Sep, 2015 4 commits

proc: export idle flag via kpageflags · f074a8f4

Authored by Vladimir Davydov on Sep 09, 2015

As noted by Minchan, a benefit of reading idle flag from /proc/kpageflags
is that one can easily filter dirty and/or unevictable pages while
estimating the size of unused memory.

Note that idle flag read from /proc/kpageflags may be stale in case the
page was accessed via a PTE, because it would be too costly to iterate
over all page mappings on each /proc/kpageflags read to provide an
up-to-date value.  To make sure the flag is up-to-date one has to read
/sys/kernel/mm/page_idle/bitmap first.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f074a8f4

mm: introduce idle page tracking · 33c3fc71

Authored by Vladimir Davydov on Sep 09, 2015

Knowing the portion of memory that is not used by a certain application or
memory cgroup (idle memory) can be useful for partitioning the system
efficiently, e.g.  by setting memory cgroup limits appropriately.
Currently, the only means to estimate the amount of idle memory provided
by the kernel is /proc/PID/{clear_refs,smaps}: the user can clear the
access bit for all pages mapped to a particular process by writing 1 to
clear_refs, wait for some time, and then count smaps:Referenced.  However,
this method has two serious shortcomings:

 - it does not count unmapped file pages
 - it affects the reclaimer logic

To overcome these drawbacks, this patch introduces two new page flags,
Idle and Young, and a new sysfs file, /sys/kernel/mm/page_idle/bitmap.
A page's Idle flag can only be set from userspace by setting a bit in
/sys/kernel/mm/page_idle/bitmap at the offset corresponding to the page,
and it is cleared whenever the page is accessed either through page tables
(it is cleared in page_referenced() in this case) or using the read(2)
system call (mark_page_accessed()). Thus by setting the Idle flag for
pages of a particular workload, which can be found e.g.  by reading
/proc/PID/pagemap, waiting for some time to let the workload access its
working set, and then reading the bitmap file, one can estimate the amount
of pages that are not used by the workload.
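
The mapping from a page to its bit in the bitmap file can be sketched as follows. This is only an illustration: the layout assumed here (an array of 8-byte integers in native byte order, with PFN i stored in bit i % 64 of element i / 64) comes from the kernel's idle page tracking documentation rather than from this commit message, and the helper names are hypothetical.

```python
import struct

BITS_PER_WORD = 64  # assumption: each bitmap element is an 8-byte integer
WORD_BYTES = 8


def idle_bitmap_location(pfn):
    """Return (byte offset of the 8-byte word, bit index in that word) for a PFN."""
    return (pfn // BITS_PER_WORD) * WORD_BYTES, pfn % BITS_PER_WORD


def is_idle(word_bytes, bit):
    """Decode one 8-byte native-endian word and test the given bit."""
    (word,) = struct.unpack("=Q", word_bytes)
    return bool((word >> bit) & 1)
```

A reader would seek to the returned offset in /sys/kernel/mm/page_idle/bitmap, read 8 bytes, and pass them to is_idle() together with the bit index.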

The Young page flag is used to avoid interference with the memory
reclaimer.  A page's Young flag is set whenever the Access bit of a page
table entry pointing to the page is cleared by writing to the bitmap file.
If page_referenced() is called on a Young page, it will add 1 to its
return value, therefore concealing the fact that the Access bit was
cleared.

Note, since there is no room for extra page flags on 32 bit, this feature
uses extended page flags when compiled on 32 bit.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: kpageidle requires an MMU]
[akpm@linux-foundation.org: decouple from page-flags rework]
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

33c3fc71

proc: add kpagecgroup file · 80ae2fdc

Authored by Vladimir Davydov on Sep 09, 2015

/proc/kpagecgroup contains a 64-bit inode number of the memory cgroup each
page is charged to, indexed by PFN.  Having this information is useful for
estimating a cgroup working set size.
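
Because the file holds one 64-bit value per page frame, indexed by PFN, the entry for a given page can be located with simple arithmetic. The sketch below assumes native byte order for the stored value; the helper names are hypothetical and not part of the patch.

```python
import struct

ENTRY_BYTES = 8  # one 64-bit inode number per page frame


def kpagecgroup_offset(pfn):
    """Byte offset of the entry for the given PFN."""
    return pfn * ENTRY_BYTES


def decode_entry(raw):
    """Decode one 8-byte entry (assumed native-endian) into an inode number."""
    (inode,) = struct.unpack("=Q", raw)
    return inode
```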

The file is present if CONFIG_PROC_PAGE_MONITOR && CONFIG_MEMCG.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

80ae2fdc

zswap: update docs for runtime-changeable attributes · 9c4c5ef3

Authored by Dan Streetman on Sep 09, 2015

Change the Documentation/vm/zswap.txt doc to indicate that the "zpool" and
"compressor" params are now changeable at runtime.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9c4c5ef3

10 Sep, 2015 5 commits

soc: qcom: smd: Use correct remote processor ID · 93dbed91

Authored by Andy Gross on Aug 26, 2015

This patch fixes SMEM addressing issues when remote processors need to use
secure SMEM partitions.

Signed-off-by: Andy Gross <agross@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>

93dbed91

Documentation: dt: binding: atmel-sama5d4-wdt: for SAMA5D4 watchdog driver · f4fff94e

Authored by Wenyou Yang on Aug 06, 2015

The compatible "atmel,sama5d4-wdt" supports the SAMA5D4 watchdog driver
and the watchdog's WDT_MR register can be written more than once.

Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

f4fff94e

DT: watchdog: Add NXP LPC18xx Watchdog Timer binding documentation · cfde37e1

Authored by Ariel D'Alessandro on Aug 01, 2015

Add the devicetree binding document for NXP LPC18xx Watchdog Timer.

Signed-off-by: Ariel D'Alessandro <ariel@vanguardiasur.com.ar>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

cfde37e1

Documentation: watchdog: at91sam9_wdt: add clocks property · ab54d7f0

Authored by Alexandre Belloni on Jul 31, 2015

The watchdog has an input clock, the slow clock. The clock is required, as
the watchdog will not function without it.

Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

ab54d7f0

Documentation/watchdog: add timeout and ping rate control to watchdog-test.c · f15d7114

Authored by Timur Tabi on Jun 29, 2015

The watchdog test program is much more useful if it can configure the
timeout value and ping rate.  This will allow you to test actual timeouts.

Adds the -t parameter to set the timeout value (in seconds), and -p to set
the ping rate (number of seconds between pings).

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

f15d7114

09 Sep, 2015 10 commits

pwm: Add NXP LPC18xx PWM/SCT DT binding documentation · b0dabcc6

Authored by Ariel D'Alessandro on Aug 05, 2015

Add the devicetree binding document for NXP LPC18xx PWM/SCT.

Signed-off-by: Ariel D'Alessandro <ariel@vanguardiasur.com.ar>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

b0dabcc6

zsmalloc: account the number of compacted pages · 860c707d

Authored by Sergey Senozhatsky on Sep 08, 2015

Compaction returns to zram the number of migrated objects, which is
quite uninformative -- we have objects of different sizes, so user space
cannot obtain any valuable data from that number.  Change compaction to
operate in terms of pages and return to the compaction issuer the
number of pages that were freed during compaction.  So from now on we
will export a more meaningful value in zram<id>/mm_stat -- the number of
freed (compacted) pages.

This requires:
 (a) a rename of 'num_migrated' to 'pages_compacted'
 (b) an internal API change -- return first_page's fullness_group from
     putback_zspage(), so we know when putback_zspage() did
     free_zspage().  It helps us to account compaction stats correctly.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

860c707d

mm/page_alloc.c: fix a misleading comment · 013110a7

Authored by Yaowei Bai on Sep 08, 2015

The comment says that the per-cpu batchsize and zone watermarks are
determined by present_pages, which is definitely wrong: they are both
calculated from managed_pages.  Fix it.

Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

013110a7

Documentation: update libhugetlbfs location and use for testing · e6590740

Authored by Mike Kravetz on Sep 08, 2015

The URL for libhugetlbfs has changed.  Also, put a stronger emphasis on
using libhugetlbfs for hugetlb regression testing.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Joern Engel <joern@logfs.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e6590740

mm: add dma_pool_zalloc() call to DMA API · ad82362b

Authored by Sean O. Stalley on Sep 08, 2015

Add a wrapper function for dma_pool_alloc() to get zeroed memory.

Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Gilles Muller <Gilles.Muller@lip6.fr>
Cc: Nicolas Palix <nicolas.palix@imag.fr>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ad82362b

mm, oom: do not panic for oom kills triggered from sysrq · 071a4bef

Authored by David Rientjes on Sep 08, 2015

Sysrq+f is used to kill a process either for debug or when the VM is
otherwise unresponsive.

It is not intended to trigger a panic when no process may be killed.

Avoid panicking the system for sysrq+f when no processes are killed.

Signed-off-by: David Rientjes <rientjes@google.com>
Suggested-by: Michal Hocko <mhocko@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

071a4bef

mm: /proc/pid/smaps:: show proportional swap share of the mapping · 8334b962

Authored by Minchan Kim on Sep 08, 2015

We want to know the per-process working-set size for smart memory
management in userland, and we use swap (e.g. zram) heavily to maximize
memory efficiency, so the working set includes swap as well as RSS.

On such a system, if there are lots of shared anonymous pages (e.g. on
Android), it is really hard to figure out exactly how much memory each
process consumes (i.e. RSS + swap).

This patch introduces a SwapPss field in /proc/<pid>/smaps so we can get a
more exact working-set size per process.

Bongkyu tested it. Result is below.

1. 50M used swap
SwapTotal: 461976 kB
SwapFree: 411192 kB

$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
48236
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
141184

2. 240M used swap
SwapTotal: 461976 kB
SwapFree: 216808 kB

$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
230315
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
1387744
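
The grep/awk pipelines above simply add up one field across all smaps entries. A minimal Python equivalent is sketched below; the function name and the sample numbers are made up for illustration and are not taken from the test results in this commit.

```python
def sum_smaps_field(smaps_text, field):
    """Sum the kB values of every line whose first token is '<field>:'."""
    total = 0
    for line in smaps_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == field + ":":
            total += int(parts[1])
    return total


# Invented sample: two mappings whose swapped pages are shared with others.
sample = """\
Swap:                 40 kB
SwapPss:              10 kB
Swap:                 40 kB
SwapPss:              30 kB
"""

swap_total = sum_smaps_field(sample, "Swap")
swap_pss_total = sum_smaps_field(sample, "SwapPss")
```

Summing Swap counts a shared swap entry once per process that references it, while SwapPss divides it among the sharers, which is why the SwapPss totals in the results above stay close to the swap actually in use.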

[akpm@linux-foundation.org: simplify kunmap_atomic() call]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Bongkyu Kim <bongkyu.kim@lge.com>
Tested-by: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8334b962

pagemap: update documentation · 83b4b0bb

Authored by Konstantin Khlebnikov on Sep 08, 2015

Notes about recent changes.

[akpm@linux-foundation.org: various tweaks]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Mark Williamson <mwilliamson@undo-software.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

83b4b0bb

pagemap: add mmap-exclusive bit for marking pages mapped only here · 77bb499b

Authored by Konstantin Khlebnikov on Sep 08, 2015

This patch sets bit 56 in pagemap if the page is mapped only once.  This
makes it possible to detect exclusively used pages without exposing the PFN:

present file exclusive state
0       0    0         non-present
1       1    0         file page mapped somewhere else
1       1    1         file page mapped only here
1       0    0         anon non-CoWed page (shared with parent/child)
1       0    1         anon CoWed page (or never forked)

CoWed pages in (MAP_FILE | MAP_PRIVATE) areas are anon in this context.

MMap-exclusive bit doesn't reflect potential page-sharing via swapcache:
page could be mapped once but has several swap-ptes which point to it.
Application could detect that by swap bit in pagemap entry and touch that
pte via /proc/pid/mem to get real information.
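
The state table above can be expressed as a small decoder for a 64-bit pagemap entry. Only bit 56 is stated in this commit message; the positions assumed here for the present bit (63) and the file bit (61) follow the kernel's pagemap documentation, and the function name is hypothetical.

```python
PM_PRESENT = 1 << 63         # assumption: "page present" bit
PM_FILE = 1 << 61            # assumption: "file page or shared anon" bit
PM_MMAP_EXCLUSIVE = 1 << 56  # bit introduced by this patch


def classify(entry):
    """Map a pagemap entry to the state column of the table above."""
    if not entry & PM_PRESENT:
        return "non-present"
    exclusive = bool(entry & PM_MMAP_EXCLUSIVE)
    if entry & PM_FILE:
        if exclusive:
            return "file page mapped only here"
        return "file page mapped somewhere else"
    if exclusive:
        return "anon CoWed page (or never forked)"
    return "anon non-CoWed page (shared with parent/child)"
```

An entry would normally be obtained by reading 8 bytes from /proc/<pid>/pagemap at offset (virtual address / page size) * 8.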

See http://lkml.kernel.org/r/CAEVpBa+_RyACkhODZrRvQLs80iy0sqpdrd0AaP_-tgnX3Y9yNQ@mail.gmail.com

Requested by Mark Williamson.

[akpm@linux-foundation.org: fix spello]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Mark Williamson <mwilliamson@undo-software.com>
Tested-by: Mark Williamson <mwilliamson@undo-software.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

77bb499b

dax: add huge page fault support · 844f35db

Authored by Matthew Wilcox on Sep 08, 2015

This is the support code for DAX-enabled filesystems to allow them to
provide huge pages in response to faults.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

844f35db

07 Sep, 2015 1 commit

Documentation/features/vm: Meta2 is capable of THP · e7e98d76

Authored by James Hogan on Jul 30, 2015

Change metag Transparent Huge Pages (THP) support from .. to TODO. Meta2
has variable sized pages, between 4KB and 4MB, specified at the 1st
level page table level, and already supports hugetlbfs, so supporting
THP is theoretically possible too.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-metag@vger.kernel.org
Cc: linux-doc@vger.kernel.org

e7e98d76

06 Sep, 2015 4 commits

Input: touchscreen - add imx6ul_tsc driver support · 9a436d52

Authored by Haibo Chen on Sep 05, 2015

Freescale i.MX6UL contains an internal touchscreen controller;
this patch adds a driver to support this controller.

Signed-off-by: Haibo Chen <haibo.chen@freescale.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

9a436d52

Input: Add touchscreen support for Colibri VF50 · 48ead50c

Authored by Sanchayan Maity on Sep 05, 2015

The Colibri Vybrid VF50 module supports 4-wire touchscreens using
FETs and ADC inputs. This driver uses the IIO consumer interface
and relies on the vf610_adc driver based on the IIO framework.

Signed-off-by: Sanchayan Maity <maitysanchayan@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

48ead50c

ARM: dts: AM437x: Add the internal and external clock nodes for rtc · fff51e77

Authored by Keerthy on Aug 18, 2015

The RTC can be supplied either from the internal 32k clock or from an
external crystal-generated 32k clock. The internal clock is SoC specific and
the external clock is board dependent. Add the corresponding nodes.

Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

fff51e77

devicetree: bindings: rtc: add bindings for xilinx zynqmp rtc · 12ece40d

Authored by Suneel Garapati on Aug 19, 2015

Add a file describing the device node bindings for the RTC
found on Xilinx Zynq Ultrascale+ MPSoC.

Signed-off-by: Suneel Garapati <suneel.garapati@xilinx.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

12ece40d

05 Sep, 2015 5 commits

doc: dt: add documentation for nxp,lpc1788-rtc · dcb9372b

Authored by Joachim Eastwood on Jul 11, 2015

Document NXP LPC178x/18xx/408x/43xx bindings.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

dcb9372b

Documentation/features/vm: add feature description and arch support status for... · c7e1e3cc

Authored by Mel Gorman on Sep 04, 2015

Documentation/features/vm: add feature description and arch support status for batched TLB flush after unmap

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c7e1e3cc

userfaultfd: change the read API to return a uffd_msg · a9b85f94

Authored by Andrea Arcangeli on Sep 04, 2015

I had requests to return the full address (not the page aligned one) to
userland.

It's not entirely clear how the page offset could be relevant, because
userfaults aren't like SIGBUS, whose handler can sigjump to a different
place and actually skip resolving the fault depending on a page offset.
There's currently no real way to skip the fault, especially because after a
UFFDIO_COPY|ZEROPAGE the fault is optimized to be retried within the
kernel without having to return to userland first (not even self-modifying
code replacing the .text that touched the faulting address would prevent
the fault from being repeated).  Userland cannot skip repeating the fault,
even more so if the fault was triggered by a KVM secondary page fault, by
any get_user_pages, or by any copy-user inside some syscall which will
return to kernel code.  The second time, FAULT_FLAG_RETRY_NOWAIT won't be
set, leading to a SIGBUS being raised, because the userfault can't wait if
it cannot release the mmap_sem first (and FAULT_FLAG_RETRY_NOWAIT is
required for that).

Still, returning a proper structure to userland during the read() on the
uffd allows the current UFFD_API to be used for future non-cooperative
extensions too, and it looks cleaner as well.  Once we get additional
fields, there's no point in returning the fault address page-aligned
anymore in order to reuse the bits below PAGE_SHIFT.

The only downside is that the read() syscall will read 32 bytes instead of
8 bytes, but that's not going to be measurable overhead.
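
To illustrate the 32-byte message, the sketch below decodes a buffer as it might be returned by read() on the uffd. The field layout (a one-byte event, seven reserved bytes, then a page-fault payload of two 64-bit fields, flags and address, followed by padding) and the UFFD_EVENT_PAGEFAULT value are assumptions based on linux/userfaultfd.h, not something spelled out in this commit message; the function name is hypothetical.

```python
import struct

# Assumed packed layout: u8 event, u8/u16/u32 reserved, then three u64 slots
# of which the page-fault event uses the first two (flags, address).
UFFD_MSG_FORMAT = "=BBHIQQQ"
UFFD_MSG_SIZE = struct.calcsize(UFFD_MSG_FORMAT)
UFFD_EVENT_PAGEFAULT = 0x12  # assumed value from linux/userfaultfd.h


def parse_uffd_msg(buf):
    """Decode one message into its event, flags and full fault address."""
    event, _r1, _r2, _r3, flags, address, _unused = struct.unpack(
        UFFD_MSG_FORMAT, buf
    )
    return {"event": event, "flags": flags, "address": address}


# Synthetic message standing in for the result of read() on the uffd.
raw = struct.pack(UFFD_MSG_FORMAT, UFFD_EVENT_PAGEFAULT, 0, 0, 0, 0,
                  0x7F0000001234, 0)
msg = parse_uffd_msg(raw)
page_offset = msg["address"] & 0xFFF  # assumes 4096-byte pages
```

Since the address is no longer page-aligned, the offset within the page is available to userland by masking, as the last line shows.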

The total number of new events that can be extended or of new future bits
for already shipped events, is limited to 64 by the features field of the
uffdio_api structure.  If more will be needed a bump of UFFD_API will be
required.

[akpm@linux-foundation.org: use __packed]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a9b85f94

userfaultfd: uAPI · 1038628d

Authored by Andrea Arcangeli on Sep 04, 2015

Defines the uAPI of the userfaultfd, notably the ioctl numbers and protocol.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1038628d

userfaultfd: linux/Documentation/vm/userfaultfd.txt · 25edd8bf

Authored by Andrea Arcangeli on Sep 04, 2015

This is the latest userfaultfd patchset.  The postcopy live migration
feature on the qemu side is mostly ready to be merged and it entirely
depends on the userfaultfd syscall to be merged as well.  So it'd be great
if this patchset could be reviewed for merging in -mm.

Userfaults make it possible to implement on-demand paging from userland
and, more generally, they allow userland to take control of the behavior of
page faults more efficiently than what was available before (PROT_NONE +
SIGSEGV trap).

The use cases are:

1) KVM postcopy live migration (one form of cloud memory
   externalization).

   KVM postcopy live migration is the primary driver of this work:

    http://blog.zhaw.ch/icclab/setting-up-post-copy-live-migration-in-openstack/
    http://lists.gnu.org/archive/html/qemu-devel/2015-02/msg04873.html

2) postcopy live migration of binaries inside linux containers:

    http://thread.gmane.org/gmane.linux.kernel.mm/132662

3) KVM postcopy live snapshotting (allowing to limit/throttle the
   memory usage, unlike fork would, plus the avoidance of fork
   overhead in the first place).

   While the wrprotect tracking is not implemented yet, the syscall API is
   already contemplating the wrprotect fault tracking and it's generic enough
   to allow its later implementation in a backwards compatible fashion.

4) KVM userfaults on shared memory. The UFFDIO_COPY lowlevel method
   should be extended to work also on tmpfs and then the
   uffdio_register.ioctls will notify userland that UFFDIO_COPY is
   available even when the registered virtual memory range is tmpfs
   backed.

5) alternate mechanism to notify web browsers or apps on embedded
   devices that volatile pages have been reclaimed. This basically
   avoids the need to run a syscall before the app can access with the
   CPU the virtual regions marked volatile. This depends on point 4)
   to be fulfilled first, as volatile pages happily apply to tmpfs.

Even though there hasn't been a real use case requesting it yet, it also
makes it possible to implement distributed shared memory in a way that
readonly shared mappings can exist simultaneously in different hosts and
they can become exclusive at the first wrprotect fault.

This patch (of 22):

Add documentation.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

25edd8bf

04 Sep, 2015 2 commits

ipmi: Add device tree bindings information · 92e84721

Authored by Corey Minyard on Sep 03, 2015

Signed-off-by: Corey Minyard <cminyard@mvista.com>

92e84721

[media] DocBook media: Fix typo "the the" in xml files · 06268390

Authored by Masanari Iida on Aug 17, 2015

This patch fixes the typo "the the" found in controls.xml
and vidioc-g-param.xml.
These xml files aren't generated from any source files, so I have
to fix these xml files directly.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

06268390

03 Sep, 2015 3 commits

Documentation: dt: Add Pistachio SoC general purpose timer binding document · d6ed2b9b

Authored by Ezequiel Garcia on Jul 27, 2015

Add a device-tree binding document for the clocksource driver provided
by Pistachio SoC general purpose timers.

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Reviewed-by: Andrew Bresticker <abrestic@chromium.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: devicetree@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Hartley <James.Hartley@imgtec.com>
Cc: Govindraj Raja <Govindraj.Raja@imgtec.com>
Cc: Damien Horsley <Damien.Horsley@imgtec.com>
Cc: James Hogan <James.Hogan@imgtec.com>
Cc: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Patchwork: https://patchwork.linux-mips.org/patch/10783/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

d6ed2b9b

Documentation: MIPS now supports uprobes. · b7565cc3

Authored by Ralf Baechle on Jul 29, 2015

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

b7565cc3

Documentation/sysrq.txt: Mention MIPS TLB dump (x) · 0f6ce775

Authored by James Hogan on Jul 15, 2015

Commit d1e9a4f5 ("MIPS: Add SysRq operation to dump TLBs on all
CPUs") added the 'x' sysrq key for dumping MIPS TLB entries, but didn't
document it in Documentation/sysrq.txt.

Add mention of the MIPS use of the 'x' SysRq key.

Reported-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: linux-doc@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10720/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

0f6ce775

openanolis / cloud-kernel synced successfully more than 1 year ago