- 01 5月, 2019 2 次提交
-
-
由 KarimAllah Ahmed 提交于
In KVM, specially for nested guests, there is a dominant pattern of: => map guest memory -> do_something -> unmap guest memory In addition to all this unnecessarily noise in the code due to boiler plate code, most of the time the mapping function does not properly handle memory that is not backed by "struct page". This new guest mapping API encapsulate most of this boiler plate code and also handles guest memory that is not backed by "struct page". The current implementation of this API is using memremap for memory that is not backed by a "struct page" which would lead to a huge slow-down if it was used for high-frequency mapping operations. The API does not have any effect on current setups where guest memory is backed by a "struct page". Further patches are going to also introduce a pfn-cache which would significantly improve the performance of the memremap case. Signed-off-by: NKarimAllah Ahmed <karahmed@amazon.de> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Luwei Kang 提交于
Inject a PMI for KVM guest when Intel PT working in Host-Guest mode and Guest ToPA entry memory buffer was completely filled. Signed-off-by: NLuwei Kang <luwei.kang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 4月, 2019 1 次提交
-
-
由 Christian Borntraeger 提交于
There are cases where halt polling is unwanted. For example when running KVM on an over committed LPAR we rather want to give back the CPU to neighbour LPARs instead of polling. Let us provide a callback that allows architectures to disable polling. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 16 4月, 2019 1 次提交
-
-
由 Paolo Bonzini 提交于
These were found with smatch, and then generalized when applicable. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 4月, 2019 2 次提交
-
-
由 Michael Zhivich 提交于
When building C++ userspace code that includes ethtool.h with "-Werror -Wall", g++ complains about signed-unsigned comparison in ethtool_validate_speed() due to definition of SPEED_UNKNOWN as -1. Explicitly cast SPEED_UNKNOWN to __u32 to match type of ethtool_validate_speed() argument. Signed-off-by: NMichael Zhivich <mzhivich@akamai.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 ndesaulniers@google.com 提交于
Fixes the warning reported by Clang: security/keys/trusted.c:146:17: warning: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Wvarargs] va_start(argp, h3); ^ security/keys/trusted.c:126:37: note: parameter of type 'unsigned char' is declared here unsigned char *h2, unsigned char h3, ...) ^ Specifically, it seems that both the C90 (4.8.1.1) and C11 (7.16.1.4) standards explicitly call this out as undefined behavior: The parameter parmN is the identifier of the rightmost parameter in the variable parameter list in the function definition (the one just before the ...). If the parameter parmN is declared with ... or with a type that is not compatible with the type that results after application of the default argument promotions, the behavior is undefined. Link: https://github.com/ClangBuiltLinux/linux/issues/41 Link: https://www.eskimo.com/~scs/cclass/int/sx11c.htmlSuggested-by: NDavid Laight <David.Laight@aculab.com> Suggested-by: NDenis Kenzior <denkenz@gmail.com> Suggested-by: NJames Bottomley <jejb@linux.vnet.ibm.com> Suggested-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NNick Desaulniers <ndesaulniers@google.com> Reviewed-by: NNathan Chancellor <natechancellor@gmail.com> Tested-by: NNathan Chancellor <natechancellor@gmail.com> Reviewed-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: NJames Morris <james.morris@microsoft.com>
-
- 07 4月, 2019 2 次提交
-
-
由 Dan Carpenter 提交于
This is similar to commit e285d5bf ("NFC: Fix the number of pipes") where we changed NFC_HCI_MAX_PIPES from 127 to 128. As the comment next to the define explains, the pipe identifier is 7 bits long. The highest possible pipe is 127, but the number of possible pipes is 128. As the code is now, then there is potential for an out of bounds array access: net/nfc/nci/hci.c:297 nci_hci_cmd_received() warn: array off by one? 'ndev->hci_dev->pipes[pipe]' '0-127 == 127' Fixes: 11f54f22 ("NFC: nci: Add HCI over NCI protocol support") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Smelkov 提交于
fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock Commit 9c225f26 ("vfs: atomic f_pos accesses as per POSIX") added locking for file.f_pos access and in particular made concurrent read and write not possible - now both those functions take f_pos lock for the whole run, and so if e.g. a read is blocked waiting for data, write will deadlock waiting for that read to complete. This caused regression for stream-like files where previously read and write could run simultaneously, but after that patch could not do so anymore. See e.g. commit 581d21a2 ("xenbus: fix deadlock on writes to /proc/xen/xenbus") which fixes such regression for particular case of /proc/xen/xenbus. The patch that added f_pos lock in 2014 did so to guarantee POSIX thread safety for read/write/lseek and added the locking to file descriptors of all regular files. In 2014 that thread-safety problem was not new as it was already discussed earlier in 2006. However even though 2006'th version of Linus's patch was adding f_pos locking "only for files that are marked seekable with FMODE_LSEEK (thus avoiding the stream-like objects like pipes and sockets)", the 2014 version - the one that actually made it into the tree as 9c225f26 - is doing so irregardless of whether a file is seekable or not. See https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/ https://lwn.net/Articles/180387 https://lwn.net/Articles/180396 for historic context. The reason that it did so is, probably, that there are many files that are marked non-seekable, but e.g. their read implementation actually depends on knowing current position to correctly handle the read. Some examples: kernel/power/user.c snapshot_read fs/debugfs/file.c u32_array_read fs/fuse/control.c fuse_conn_waiting_read + ... drivers/hwmon/asus_atk0110.c atk_debugfs_ggrp_read arch/s390/hypfs/inode.c hypfs_read_iter ... Despite that, many nonseekable_open users implement read and write with pure stream semantics - they don't depend on passed ppos at all. And for those cases where read could wait for something inside, it creates a situation similar to xenbus - the write could be never made to go until read is done, and read is waiting for some, potentially external, event, for potentially unbounded time -> deadlock. Besides xenbus, there are 14 such places in the kernel that I've found with semantic patch (see below): drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write() drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write() drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write() drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write() net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write() drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write() drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write() drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write() net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write() drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write() drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write() drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write() drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write() drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write() In addition to the cases above another regression caused by f_pos locking is that now FUSE filesystems that implement open with FOPEN_NONSEEKABLE flag, can no longer implement bidirectional stream-like files - for the same reason as above e.g. read can deadlock write locking on file.f_pos in the kernel. FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990 ("fuse: implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and write routines not depending on current position at all, and with both read and write being potentially blocking operations: See https://github.com/libfuse/osspd https://lwn.net/Articles/308445 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510 Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as "somewhat pipe-like files ..." with read handler not using offset. However that test implements only read without write and cannot exercise the deadlock scenario: https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131 https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163 https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216 I've actually hit the read vs write deadlock for real while implementing my FUSE filesystem where there is /head/watch file, for which open creates separate bidirectional socket-like stream in between filesystem and its user with both read and write being later performed simultaneously. And there it is semantically not easy to split the stream into two separate read-only and write-only channels: https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169 Let's fix this regression. The plan is: 1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS - doing so would break many in-kernel nonseekable_open users which actually use ppos in read/write handlers. 2. Add stream_open() to kernel to open stream-like non-seekable file descriptors. Read and write on such file descriptors would never use nor change ppos. And with that property on stream-like files read and write will be running without taking f_pos lock - i.e. read and write could be running simultaneously. 3. With semantic patch search and convert to stream_open all in-kernel nonseekable_open users for which read and write actually do not depend on ppos and where there is no other methods in file_operations which assume @offset access. 4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via steam_open if that bit is present in filesystem open reply. It was tempting to change fs/fuse/ open handler to use stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE, and in particular GVFS which actually uses offset in its read and write handlers https://codesearch.debian.net/search?q=-%3Enonseekable+%3D https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080 https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346 https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481 so if we would do such a change it will break a real user. 5. Add stream_open and FOPEN_STREAM handling to stable kernels starting from v3.14+ (the kernel where 9c225f26 first appeared). This will allow to patch OSSPD and other FUSE filesystems that provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in their open handler and this way avoid the deadlock on all kernel versions. This should work because fs/fuse/ ignores unknown open flags returned from a filesystem and so passing FOPEN_STREAM to a kernel that is not aware of this flag cannot hurt. In turn the kernel that is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE is sufficient to implement streams without read vs write deadlock. This patch adds stream_open, converts /proc/xen/xenbus to it and adds semantic patch to automatically locate in-kernel places that are either required to be converted due to read vs write deadlock, or that are just safe to be converted because read and write do not use ppos and there are no other funky methods in file_operations. Regarding semantic patch I've verified each generated change manually - that it is correct to convert - and each other nonseekable_open instance left - that it is either not correct to convert there, or that it is not converted due to current stream_open.cocci limitations. The script also does not convert files that should be valid to convert, but that currently have .llseek = noop_llseek or generic_file_llseek for unknown reason despite file being opened with nonseekable_open (e.g. drivers/input/mousedev.c) Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Yongzhi Pan <panyongzhi@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Tejun Heo <tj@kernel.org> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Nikolaus Rath <Nikolaus@rath.org> Cc: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: NKirill Smelkov <kirr@nexedi.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 4月, 2019 4 次提交
-
-
由 Greg Thelen 提交于
Since commit a983b5eb ("mm: memcontrol: fix excessive complexity in memory.stat reporting") memcg dirty and writeback counters are managed as: 1) per-memcg per-cpu values in range of [-32..32] 2) per-memcg atomic counter When a per-cpu counter cannot fit in [-32..32] it's flushed to the atomic. Stat readers only check the atomic. Thus readers such as balance_dirty_pages() may see a nontrivial error margin: 32 pages per cpu. Assuming 100 cpus: 4k x86 page_size: 13 MiB error per memcg 64k ppc page_size: 200 MiB error per memcg Considering that dirty+writeback are used together for some decisions the errors double. This inaccuracy can lead to undeserved oom kills. One nasty case is when all per-cpu counters hold positive values offsetting an atomic negative value (i.e. per_cpu[*]=32, atomic=n_cpu*-32). balance_dirty_pages() only consults the atomic and does not consider throttling the next n_cpu*32 dirty pages. If the file_lru is in the 13..200 MiB range then there's absolutely no dirty throttling, which burdens vmscan with only dirty+writeback pages thus resorting to oom kill. It could be argued that tiny containers are not supported, but it's more subtle. It's the amount the space available for file lru that matters. If a container has memory.max-200MiB of non reclaimable memory, then it will also suffer such oom kills on a 100 cpu machine. The following test reliably ooms without this patch. This patch avoids oom kills. $ cat test mount -t cgroup2 none /dev/cgroup cd /dev/cgroup echo +io +memory > cgroup.subtree_control mkdir test cd test echo 10M > memory.max (echo $BASHPID > cgroup.procs && exec /memcg-writeback-stress /foo) (echo $BASHPID > cgroup.procs && exec dd if=/dev/zero of=/foo bs=2M count=100) $ cat memcg-writeback-stress.c /* * Dirty pages from all but one cpu. * Clean pages from the non dirtying cpu. * This is to stress per cpu counter imbalance. * On a 100 cpu machine: * - per memcg per cpu dirty count is 32 pages for each of 99 cpus * - per memcg atomic is -99*32 pages * - thus the complete dirty limit: sum of all counters 0 * - balance_dirty_pages() only sees atomic count -99*32 pages, which * it max()s to 0. * - So a workload can dirty -99*32 pages before balance_dirty_pages() * cares. */ #define _GNU_SOURCE #include <err.h> #include <fcntl.h> #include <sched.h> #include <stdlib.h> #include <stdio.h> #include <sys/stat.h> #include <sys/sysinfo.h> #include <sys/types.h> #include <unistd.h> static char *buf; static int bufSize; static void set_affinity(int cpu) { cpu_set_t affinity; CPU_ZERO(&affinity); CPU_SET(cpu, &affinity); if (sched_setaffinity(0, sizeof(affinity), &affinity)) err(1, "sched_setaffinity"); } static void dirty_on(int output_fd, int cpu) { int i, wrote; set_affinity(cpu); for (i = 0; i < 32; i++) { for (wrote = 0; wrote < bufSize; ) { int ret = write(output_fd, buf+wrote, bufSize-wrote); if (ret == -1) err(1, "write"); wrote += ret; } } } int main(int argc, char **argv) { int cpu, flush_cpu = 1, output_fd; const char *output; if (argc != 2) errx(1, "usage: output_file"); output = argv[1]; bufSize = getpagesize(); buf = malloc(getpagesize()); if (buf == NULL) errx(1, "malloc failed"); output_fd = open(output, O_CREAT|O_RDWR); if (output_fd == -1) err(1, "open(%s)", output); for (cpu = 0; cpu < get_nprocs(); cpu++) { if (cpu != flush_cpu) dirty_on(output_fd, cpu); } set_affinity(flush_cpu); if (fsync(output_fd)) err(1, "fsync(%s)", output); if (close(output_fd)) err(1, "close(%s)", output); free(buf); } Make balance_dirty_pages() and wb_over_bg_thresh() work harder to collect exact per memcg counters. This avoids the aforementioned oom kills. This does not affect the overhead of memory.stat, which still reads the single atomic counter. Why not use percpu_counter? memcg already handles cpus going offline, so no need for that overhead from percpu_counter. And the percpu_counter spinlocks are more heavyweight than is required. It probably also makes sense to use exact dirty and writeback counters in memcg oom reports. But that is saved for later. Link: http://lkml.kernel.org/r/20190329174609.164344-1-gthelen@google.comSigned-off-by: NGreg Thelen <gthelen@google.com> Reviewed-by: NRoman Gushchin <guro@fb.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> [4.16+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jann Horn 提交于
Symmetrically to VM_FAULT_SET_HINDEX(), we need a force-cast in VM_FAULT_GET_HINDEX() to tell sparse that this is intentional. Sparse complains about the current code when building a kernel with CONFIG_MEMORY_FAILURE: arch/x86/mm/fault.c:1058:53: warning: restricted vm_fault_t degrades to integer Link: http://lkml.kernel.org/r/20190327204117.35215-1-jannh@google.com Fixes: 3d353901 ("mm: create the new vm_fault_t type") Signed-off-by: NJann Horn <jannh@google.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Arnd Bergmann 提交于
clang points out with hundreds of warnings that the bitrev macros have a problem with constant input: drivers/hwmon/sht15.c:187:11: error: variable '__x' is uninitialized when used within its own initialization [-Werror,-Wuninitialized] u8 crc = bitrev8(data->val_status & 0x0F); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/bitrev.h:102:21: note: expanded from macro 'bitrev8' __constant_bitrev8(__x) : \ ~~~~~~~~~~~~~~~~~~~^~~~ include/linux/bitrev.h:67:11: note: expanded from macro '__constant_bitrev8' u8 __x = x; \ ~~~ ^ Both the bitrev and the __constant_bitrev macros use an internal variable named __x, which goes horribly wrong when passing one to the other. The obvious fix is to rename one of the variables, so this adds an extra '_'. It seems we got away with this because - there are only a few drivers using bitrev macros - usually there are no constant arguments to those - when they are constant, they tend to be either 0 or (unsigned)-1 (drivers/isdn/i4l/isdnhdlc.o, drivers/iio/amplifiers/ad8366.c) and give the correct result by pure chance. In fact, the only driver that I could find that gets different results with this is drivers/net/wan/slic_ds26522.c, which in turn is a driver for fairly rare hardware (adding the maintainer to Cc for testing). Link: http://lkml.kernel.org/r/20190322140503.123580-1-arnd@arndb.de Fixes: 556d2f05 ("ARM: 8187/1: add CONFIG_HAVE_ARCH_BITREVERSE to support rbit instruction") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Cc: Zhao Qiang <qiang.zhao@nxp.com> Cc: Yalin Wang <yalin.wang@sonymobile.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Desaulniers 提交于
A recent optimization in Clang (r355672) lowers comparisons of the return value of memcmp against zero to comparisons of the return value of bcmp against zero. This helps some platforms that implement bcmp more efficiently than memcmp. glibc simply aliases bcmp to memcmp, but an optimized implementation is in the works. This results in linkage failures for all targets with Clang due to the undefined symbol. For now, just implement bcmp as a tailcail to memcmp to unbreak the build. This routine can be further optimized in the future. Other ideas discussed: * A weak alias was discussed, but breaks for architectures that define their own implementations of memcmp since aliases to declarations are not permitted (only definitions). Arch-specific memcmp implementations typically declare memcmp in C headers, but implement them in assembly. * -ffreestanding also is used sporadically throughout the kernel. * -fno-builtin-bcmp doesn't work when doing LTO. Link: https://bugs.llvm.org/show_bug.cgi?id=41035 Link: https://code.woboq.org/userspace/glibc/string/memcmp.c.html#bcmp Link: https://github.com/llvm/llvm-project/commit/8e16d73346f8091461319a7dfc4ddd18eedcff13 Link: https://github.com/ClangBuiltLinux/linux/issues/416 Link: http://lkml.kernel.org/r/20190313211335.165605-1-ndesaulniers@google.comSigned-off-by: NNick Desaulniers <ndesaulniers@google.com> Reported-by: NNathan Chancellor <natechancellor@gmail.com> Reported-by: NAdhemerval Zanella <adhemerval.zanella@linaro.org> Suggested-by: NArnd Bergmann <arnd@arndb.de> Suggested-by: NJames Y Knight <jyknight@google.com> Suggested-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Suggested-by: NNathan Chancellor <natechancellor@gmail.com> Suggested-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: NNathan Chancellor <natechancellor@gmail.com> Tested-by: NNathan Chancellor <natechancellor@gmail.com> Reviewed-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 4月, 2019 2 次提交
-
-
由 Steven Rostedt (VMware) 提交于
After removing the start and count arguments of syscall_get_arguments() it seems reasonable to remove them from syscall_set_arguments(). Note, as of today, there are no users of syscall_set_arguments(). But we are told that there will be soon. But for now, at least make it consistent with syscall_get_arguments(). Link: http://lkml.kernel.org/r/20190327222014.GA32540@altlinux.org Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Dave Martin <dave.martin@arm.com> Cc: "Dmitry V. Levin" <ldv@altlinux.org> Cc: x86@kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: uclinux-h8-devel@lists.sourceforge.jp Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: nios2-dev@lists.rocketboards.org Cc: openrisc@lists.librecores.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-riscv@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86 Reviewed-by: NDmitry V. Levin <ldv@altlinux.org> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Steven Rostedt (Red Hat) 提交于
At Linux Plumbers, Andy Lutomirski approached me and pointed out that the function call syscall_get_arguments() implemented in x86 was horribly written and not optimized for the standard case of passing in 0 and 6 for the starting index and the number of system calls to get. When looking at all the users of this function, I discovered that all instances pass in only 0 and 6 for these arguments. Instead of having this function handle different cases that are never used, simply rewrite it to return the first 6 arguments of a system call. This should help out the performance of tracing system calls by ptrace, ftrace and perf. Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Dave Martin <dave.martin@arm.com> Cc: "Dmitry V. Levin" <ldv@altlinux.org> Cc: x86@kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: uclinux-h8-devel@lists.sourceforge.jp Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: nios2-dev@lists.rocketboards.org Cc: openrisc@lists.librecores.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-riscv@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86 Reviewed-by: NDmitry V. Levin <ldv@altlinux.org> Reported-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 04 4月, 2019 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
task_current_syscall() has a single user that passes in 6 for maxargs, which is the maximum arguments that can be used to get system calls from syscall_get_arguments(). Instead of passing in a number of arguments to grab, just get 6 arguments. The args argument even specifies that it's an array of 6 items. This will also allow changing syscall_get_arguments() to not get a variable number of arguments, but always grab 6. Linus also suggested not passing in a bunch of arguments to task_current_syscall() but to instead pass in a pointer to a structure, and just fill the structure. struct seccomp_data has almost all the parameters that is needed except for the stack pointer (sp). As seccomp_data is part of uapi, and I'm afraid to change it, a new structure was created "syscall_info", which includes seccomp_data and adds the "sp" field. Link: http://lkml.kernel.org/r/20161107213233.466776454@goodmis.org Cc: Andy Lutomirski <luto@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 02 4月, 2019 3 次提交
-
-
由 Paolo Abeni 提交于
The same code to flush qdisc tree and purge the qdisc queue is duplicated in many places and in most cases it does not respect NOLOCK qdisc: the global backlog len is used and the per CPU values are ignored. This change addresses the above, factoring-out the relevant code and using the helpers introduced by the previous patch to fetch the correct backlog len. Fixes: c5ad119f ("net: sched: pfifo_fast use skb_array") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
Classful qdiscs can't access directly the child qdiscs backlog length: if such qdisc is NOLOCK, per CPU values should be accounted instead. Most qdiscs no not respect the above. As a result, qstats fetching for most classful qdisc is currently incorrect: if the child qdisc is NOLOCK, it always reports 0 len backlog. This change introduces a pair of helpers to safely fetch both backlog and qlen and use them in stats class dumping functions, fixing the above issue and cleaning a bit the code. DRR needs also to access the child qdisc queue length, so it needs custom handling. Fixes: c5ad119f ("net: sched: pfifo_fast use skb_array") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Suryaputra 提交于
Configuration check to accept source route IP options should be made on the incoming netdevice when the skb->dev is an l3mdev master. The route lookup for the source route next hop also needs the incoming netdev. v2->v3: - Simplify by passing the original netdevice down the stack (per David Ahern). Signed-off-by: NStephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 3月, 2019 6 次提交
-
-
由 Yuval Avnery 提交于
Refresh tirs is looping over a global list of tirs while netdevs are adding and removing tirs from that list. That is why a lock is required. Fixes: 724b2aa1 ("net/mlx5e: TIRs management refactoring") Signed-off-by: NYuval Avnery <yuvalav@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
-
由 Andrei Vagin 提交于
There are a few system calls (pselect, ppoll, etc) which replace a task sigmask while they are running in a kernel-space When a task calls one of these syscalls, the kernel saves a current sigmask in task->saved_sigmask and sets a syscall sigmask. On syscall-exit-stop, ptrace traps a task before restoring the saved_sigmask, so PTRACE_GETSIGMASK returns the syscall sigmask and PTRACE_SETSIGMASK does nothing, because its sigmask is replaced by saved_sigmask, when the task returns to user-space. This patch fixes this problem. PTRACE_GETSIGMASK returns saved_sigmask if it's set. PTRACE_SETSIGMASK drops the TIF_RESTORE_SIGMASK flag. Link: http://lkml.kernel.org/r/20181120060616.6043-1-avagin@gmail.com Fixes: 29000cae ("ptrace: add ability to get/set signal-blocked mask") Signed-off-by: NAndrei Vagin <avagin@gmail.com> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Randy Dunlap 提交于
Fix typo of kernel-doc parameter notation (there should be no space between '@' and the parameter name). Also fixes bogus kernel-doc notation output formatting. Link: http://lkml.kernel.org/r/ddce8b80-9a8a-d52d-3546-87b2211c089a@infradead.org Fixes: 70b44595 ("mm, compaction: use free lists to quickly locate a migration source") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Acked-by: NMel Gorman <mgorman@techsingularity.net> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Souptick Joarder 提交于
kbuild produces the below warning: tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 5453a3df commit 3d353901 ("mm: create the new vm_fault_t type") reproduce: # apt-get install sparse git checkout 3d353901 make ARCH=x86_64 allmodconfig make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' >> mm/memory.c:3968:21: sparse: incorrect type in assignment (different >> base types) @@ expected restricted vm_fault_t [usertype] ret @@ >> got e] ret @@ mm/memory.c:3968:21: expected restricted vm_fault_t [usertype] ret mm/memory.c:3968:21: got int This patch converts to return vm_fault_t type for hugetlb_fault() when CONFIG_HUGETLB_PAGE=n. Regarding the sparse warning, Luc said: : This is the expected behaviour. The constant 0 is magic regarding bitwise : types but ({ ...; 0; }) is not, it is just an ordinary expression of type : 'int'. : : So, IMHO, Souptick's patch is the right thing to do. Link: http://lkml.kernel.org/r/20190318162604.GA31553@jordon-HP-15-Notebook-PCSigned-off-by: NSouptick Joarder <jrdr.linux@gmail.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nicolas Boichat 提交于
Patch series "iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables", v6. This is a followup to the discussion in [1], [2]. IOMMUs using ARMv7 short-descriptor format require page tables (level 1 and 2) to be allocated within the first 4GB of RAM, even on 64-bit systems. For L1 tables that are bigger than a page, we can just use __get_free_pages with GFP_DMA32 (on arm64 systems only, arm would still use GFP_DMA). For L2 tables that only take 1KB, it would be a waste to allocate a full page, so we considered 3 approaches: 1. This series, adding support for GFP_DMA32 slab caches. 2. genalloc, which requires pre-allocating the maximum number of L2 page tables (4096, so 4MB of memory). 3. page_frag, which is not very memory-efficient as it is unable to reuse freed fragments until the whole page is freed. [3] This series is the most memory-efficient approach. stable@ note: We confirmed that this is a regression, and IOMMU errors happen on 4.19 and linux-next/master on MT8173 (elm, Acer Chromebook R13). The issue most likely starts from commit ad67f5a6 ("arm64: replace ZONE_DMA with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number of Mediatek platforms (and maybe others?). [1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html [2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html [3] https://patchwork.codeaurora.org/patch/671639/ This patch (of 3): IOMMUs using ARMv7 short-descriptor format require page tables to be allocated within the first 4GB of RAM, even on 64-bit systems. On arm64, this is done by passing GFP_DMA32 flag to memory allocation functions. For IOMMU L2 tables that only take 1KB, it would be a waste to allocate a full page using get_free_pages, so we considered 3 approaches: 1. This patch, adding support for GFP_DMA32 slab caches. 2. genalloc, which requires pre-allocating the maximum number of L2 page tables (4096, so 4MB of memory). 3. page_frag, which is not very memory-efficient as it is unable to reuse freed fragments until the whole page is freed. This change makes it possible to create a custom cache in DMA32 zone using kmem_cache_create, then allocate memory using kmem_cache_alloc. We do not create a DMA32 kmalloc cache array, as there are currently no users of kmalloc(..., GFP_DMA32). These calls will continue to trigger a warning, as we keep GFP_DMA32 in GFP_SLAB_BUG_MASK. This implies that calls to kmem_cache_*alloc on a SLAB_CACHE_DMA32 kmem_cache must _not_ use GFP_DMA32 (it is anyway redundant and unnecessary). Link: http://lkml.kernel.org/r/20181210011504.122604-2-drinkcat@chromium.orgSigned-off-by: NNicolas Boichat <drinkcat@chromium.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NWill Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Sasha Levin <Alexander.Levin@microsoft.com> Cc: Huaisheng Ye <yehs1@lenovo.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Yong Wu <yong.wu@mediatek.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Tomasz Figa <tfiga@google.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Qian Cai 提交于
Commit f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online") introduced move_pfn_range_to_zone() which calls memmap_init_zone() during onlining a memory block. memmap_init_zone() will reset pagetype flags and makes migrate type to be MOVABLE. However, in __offline_pages(), it also call undo_isolate_page_range() after offline_isolated_pages() to do the same thing. Due to commit 2ce13640 ("mm: __first_valid_page skip over offline pages") changed __first_valid_page() to skip offline pages, undo_isolate_page_range() here just waste CPU cycles looping around the offlining PFN range while doing nothing, because __first_valid_page() will return NULL as offline_isolated_pages() has already marked all memory sections within the pfn range as offline via offline_mem_sections(). Also, after calling the "useless" undo_isolate_page_range() here, it reaches the point of no returning by notifying MEM_OFFLINE. Those pages will be marked as MIGRATE_MOVABLE again once onlining. The only thing left to do is to decrease the number of isolated pageblocks zone counter which would make some paths of the page allocation slower that the above commit introduced. Even if alloc_contig_range() can be used to isolate 16GB-hugetlb pages on ppc64, an "int" should still be enough to represent the number of pageblocks there. Fix an incorrect comment along the way. [cai@lca.pw: v4] Link: http://lkml.kernel.org/r/20190314150641.59358-1-cai@lca.pw Link: http://lkml.kernel.org/r/20190313143133.46200-1-cai@lca.pw Fixes: 2ce13640 ("mm: __first_valid_page skip over offline pages") Signed-off-by: NQian Cai <cai@lca.pw> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [4.13+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 3月, 2019 2 次提交
-
-
由 Eric Dumazet 提交于
net_hash_mix() currently uses kernel address of a struct net, and is used in many places that could be used to reveal this address to a patient attacker, thus defeating KASLR, for the typical case (initial net namespace, &init_net is not dynamically allocated) I believe the original implementation tried to avoid spending too many cycles in this function, but security comes first. Also provide entropy regardless of CONFIG_NET_NS. Fixes: 0b441916 ("netns: introduce the net_hash_mix "salt" for hashes") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NAmit Klein <aksecurity@gmail.com> Reported-by: NBenny Pinkas <benny@pinkas.net> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Masahiro Yamada 提交于
I do not see any consistency about headers_install of <linux/kvm_para.h> and <asm/kvm_para.h>. According to my analysis of Linux 5.1-rc1, there are 3 groups: [1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported alpha, arm, hexagon, mips, powerpc, s390, sparc, x86 [2] <asm/kvm_para.h> is exported, but <linux/kvm_para.h> is not arc, arm64, c6x, h8300, ia64, m68k, microblaze, nios2, openrisc, parisc, sh, unicore32, xtensa [3] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported csky, nds32, riscv This does not match to the actual KVM support. At least, [2] is half-baked. Nor do arch maintainers look like they care about this. For example, commit 0add5371 ("microblaze: Add missing kvm_para.h to Kbuild") exported <asm/kvm_para.h> to user-space in order to fix an in-kernel build error. We have two ways to make this consistent: [A] export both <linux/kvm_para.h> and <asm/kvm_para.h> for all architectures, irrespective of the KVM support [B] Match the header export of <linux/kvm_para.h> and <asm/kvm_para.h> to the KVM support My first attempt was [A] because the code looks cleaner, but Paolo suggested [B]. So, this commit goes with [B]. For most architectures, <asm/kvm_para.h> was moved to the kernel-space. I changed include/uapi/linux/Kbuild so that it checks generated asm/kvm_para.h as well as check-in ones. After this commit, there will be two groups: [1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported arm, arm64, mips, powerpc, s390, x86 [2] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported alpha, arc, c6x, csky, h8300, hexagon, ia64, m68k, microblaze, nds32, nios2, openrisc, parisc, riscv, sh, sparc, unicore32, xtensa Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Acked-by: NCornelia Huck <cohuck@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 28 3月, 2019 2 次提交
-
-
由 Claudiu Manoil 提交于
With a recent link mode advertisement code update this helper providing local pause capability translation used for flow control link mode negotiation got broken. For eth drivers using this helper, the issue is apparent only if either PAUSE or ASYM_PAUSE is being advertised. Fixes: 3c1bcc86 ("net: ethernet: Convert phydev advertize and supported from u32 to link mode") Signed-off-by: NClaudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hans de Goede 提交于
VirtualBox 6.0.x has a new feature where the guest kernel driver passes info about the origin of the request (e.g. userspace or kernelspace) to the hypervisor. If we do not pass this information then when running the 6.0.x userspace guest-additions tools on a 6.0.x host, some requests will get denied with a VERR_VERSION_MISMATCH error, breaking vboxservice.service and the mounting of shared folders marked to be auto-mounted. This commit implements passing the requestor info to the host, fixing this. Signed-off-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 27 3月, 2019 1 次提交
-
-
由 Vladimir Oltean 提交于
Previously the green and amber LEDs on this quad PHY were solid, to indicate an encoding of the link speed (10/100/1000). This keeps the LEDs always on just as before, but now they flash on Rx/Tx activity. Signed-off-by: NVladimir Oltean <olteanv@gmail.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 3月, 2019 2 次提交
-
-
由 Bhupesh Sharma 提交于
Commit bf904d27 ("x86/pti/64: Remove the SYSCALL64 entry trampoline") removed the sole usage of kclist_add_remap() but it did not remove the left-over definition from the include file. Fix the same. Signed-off-by: NBhupesh Sharma <bhsharma@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Anderson <anderson@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Kairui Song <kasong@redhat.com> Cc: kexec@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Omar Sandoval <osandov@fb.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/1553583028-17804-1-git-send-email-bhsharma@redhat.com
-
由 Linus Torvalds 提交于
This reverts commit 1aec4211. Steven Rostedt reports that it causes a hang at bootup and bisected it to this commit. The troigger is apparently a module alias for "parport_lowlevel" that points to "parport_pc", which causes a hang with modprobe -q -- parport_lowlevel blocking forever with a backtrace like this: wait_for_completion_killable+0x1c/0x28 call_usermodehelper_exec+0xa7/0x108 __request_module+0x351/0x3d8 get_lowlevel_driver+0x28/0x41 [parport] __parport_register_driver+0x39/0x1f4 [parport] daisy_drv_init+0x31/0x4f [parport] parport_bus_init+0x5d/0x7b [parport] parport_default_proc_register+0x26/0x1000 [parport] do_one_initcall+0xc2/0x1e0 do_init_module+0x50/0x1d4 load_module+0x1c2e/0x21b3 sys_init_module+0xef/0x117 Supid says: "Due to the new device model daisy driver will now try to find the parallel ports while trying to register its driver so that it can bind with them. Now, since daisy driver is loaded while parport bus is initialising the list of parport is still empty and it tries to load the lowlevel driver, which has an alias set to parport_pc, now causes a deadlock" But I don't think the daisy driver should be loaded by the parport initialization in the first place, so let's revert the whole change. If the daisy driver can just initialize separately on its own (like a driver should), instead of hooking into the parport init sequence directly, this issue probably would go away. Reported-and-bisected-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Reported-by: NMichal Kubecek <mkubecek@suse.cz> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 3月, 2019 2 次提交
-
-
由 Neil Armstrong 提交于
The G12A Documentation lacked these 2 reset lines, but they are present and used for each USB 2 PHYs. Add them to the dt-bindings for the upcoming USB support. Fixes: dbfc5453 ("dt-bindings: reset: meson: add g12a bindings") Signed-off-by: NNeil Armstrong <narmstrong@baylibre.com> Reviewed-by: NMartin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: NPhilipp Zabel <p.zabel@pengutronix.de>
-
由 Erik Schmauss 提交于
Rather than setting debug output flags during early init, its makes more sense to simply re-define ACPI_DEBUG_DEFAULT specifically for Linux. ACPICA commit 60903715711f4b00ca1831779a8a23279a66497d Link: https://github.com/acpica/acpica/commit/60903715 Fixes: ce5cbf53 ("ACPI: Set debug output flags independent of ACPICA") Reported-by: NAlexandru Gagniuc <mr.nuke.me@gmail.com> Tested-by: NAlexandru Gagniuc <mr.nuke.me@gmail.com> Signed-off-by: NErik Schmauss <erik.schmauss@intel.com> Signed-off-by: NBob Moore <robert.moore@intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 23 3月, 2019 2 次提交
-
-
由 Kairui Song 提交于
On machines where the GART aperture is mapped over physical RAM, /proc/kcore contains the GART aperture range. Accessing the GART range via /proc/kcore results in a kernel crash. vmcore used to have the same issue, until it was fixed with commit 2a3e83c6 ("x86/gart: Exclude GART aperture from vmcore")', leveraging existing hook infrastructure in vmcore to let /proc/vmcore return zeroes when attempting to read the aperture region, and so it won't read from the actual memory. Apply the same workaround for kcore. First implement the same hook infrastructure for kcore, then reuse the hook functions introduced in the previous vmcore fix. Just with some minor adjustment, rename some functions for more general usage, and simplify the hook infrastructure a bit as there is no module usage yet. Suggested-by: NBaoquan He <bhe@redhat.com> Signed-off-by: NKairui Song <kasong@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NJiri Bohac <jbohac@suse.cz> Acked-by: NBaoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Omar Sandoval <osandov@fb.com> Cc: Dave Young <dyoung@redhat.com> Link: https://lkml.kernel.org/r/20190308030508.13548-1-kasong@redhat.com
-
由 Shenghui Wang 提交于
"sbitmap_batch_clear" should be "sbitmap_deferred_clear" Acked-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NShenghui Wang <shhuiw@foxmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 22 3月, 2019 4 次提交
-
-
由 Thomas Gleixner 提交于
spdxcheck.py complains: include/linux/platform_data/gpio/gpio-amd-fch.h: 1:28 Invalid License ID: GPL+ which is correct because GPL+ is not a valid identifier. Of course this could have been caught by checkpatch.pl _before_ submitting or merging the patch. WARNING: 'SPDX-License-Identifier: GPL+ */' is not supported in LICENSES/... #271: FILE: include/linux/platform_data/gpio/gpio-amd-fch.h:1: +/* SPDX-License-Identifier: GPL+ */ Fix it under the assumption that the author meant GPL-2.0+, which makes sense as the corresponding C file is using that identifier. Fixes: e09d168f ("gpio: AMD G-Series PCH gpio driver") Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com>
-
由 Davide Caratti 提交于
use RCU when accessing the action chain, to avoid use after free in the traffic path when 'goto chain' is replaced on existing TC actions (see script below). Since the control action is read in the traffic path without holding the action spinlock, we need to explicitly ensure that a->goto_chain is not NULL before dereferencing (i.e it's not sufficient to rely on the value of TC_ACT_GOTO_CHAIN bits). Not doing so caused NULL dereferences in tcf_action_goto_chain_exec() when the following script: # tc chain add dev dd0 chain 42 ingress protocol ip flower \ > ip_proto udp action pass index 4 # tc filter add dev dd0 ingress protocol ip flower \ > ip_proto udp action csum udp goto chain 42 index 66 # tc chain del dev dd0 chain 42 ingress (start UDP traffic towards dd0) # tc action replace action csum udp pass index 66 was run repeatedly for several hours. Suggested-by: NCong Wang <xiyou.wangcong@gmail.com> Suggested-by: NVlad Buslov <vladbu@mellanox.com> Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Davide Caratti 提交于
callers of tcf_gact_goto_chain_index() can potentially read an old value of the chain index, or even dereference a NULL 'goto_chain' pointer, because 'goto_chain' and 'tcfa_action' are read in the traffic path without caring of concurrent write in the control path. The most recent value of chain index can be read also from a->tcfa_action (it's encoded there together with TC_ACT_GOTO_CHAIN bits), so we don't really need to dereference 'goto_chain': just read the chain id from the control action. Fixes: e457d86a ("net: sched: add couple of goto_chain helpers") Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Davide Caratti 提交于
- pass a pointer to struct tcf_proto in each actions's init() handler, to allow validating the control action, checking whether the chain exists and (eventually) refcounting it. - remove code that validates the control action after a successful call to the action's init() handler, and replace it with a test that forbids addition of actions having 'goto_chain' and NULL goto_chain pointer at the same time. - add tcf_action_check_ctrlact(), that will validate the control action and eventually allocate the action 'goto_chain' within the init() handler. - add tcf_action_set_ctrlact(), that will assign the control action and swap the current 'goto_chain' pointer with the new given one. This disallows 'goto_chain' on actions that don't initialize it properly in their init() handler, i.e. calling tcf_action_check_ctrlact() after successful IDR reservation and then calling tcf_action_set_ctrlact() to assign 'goto_chain' and 'tcf_action' consistently. By doing this, the kernel does not leak anymore refcounts when a valid 'goto chain' handle is replaced in TC actions, causing kmemleak splats like the following one: # tc chain add dev dd0 chain 42 ingress protocol ip flower \ > ip_proto tcp action drop # tc chain add dev dd0 chain 43 ingress protocol ip flower \ > ip_proto udp action drop # tc filter add dev dd0 ingress matchall \ > action gact goto chain 42 index 66 # tc filter replace dev dd0 ingress matchall \ > action gact goto chain 43 index 66 # echo scan >/sys/kernel/debug/kmemleak <...> unreferenced object 0xffff93c0ee09f000 (size 1024): comm "tc", pid 2565, jiffies 4295339808 (age 65.426s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 08 00 06 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000009b63f92d>] tc_ctl_chain+0x3d2/0x4c0 [<00000000683a8d72>] rtnetlink_rcv_msg+0x263/0x2d0 [<00000000ddd88f8e>] netlink_rcv_skb+0x4a/0x110 [<000000006126a348>] netlink_unicast+0x1a0/0x250 [<00000000b3340877>] netlink_sendmsg+0x2c1/0x3c0 [<00000000a25a2171>] sock_sendmsg+0x36/0x40 [<00000000f19ee1ec>] ___sys_sendmsg+0x280/0x2f0 [<00000000d0422042>] __sys_sendmsg+0x5e/0xa0 [<000000007a6c61f9>] do_syscall_64+0x5b/0x180 [<00000000ccd07542>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [<0000000013eaa334>] 0xffffffffffffffff Fixes: db50514f ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0 ("net_sched: reject unknown tcfa_action values") Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 3月, 2019 1 次提交
-
-
由 Peter Xu 提交于
Signed-off-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Dou Liyang <douliyangs@gmail.com> Cc: Julien Thierry <julien.thierry@arm.com> Link: https://lkml.kernel.org/r/20190318065123.11862-1-peterx@redhat.com
-