- 27 4月, 2016 1 次提交
-
-
由 Richard Guy Briggs 提交于
The tty field was missing from AUDIT_LOGIN events. Refactor code to create a new function audit_get_tty(), using it to replace the call in audit_log_task_info() and to add it to audit_log_set_loginuid(). Lock and bump the kref to protect it, adding audit_put_tty() alias to decrement it. Signed-off-by: NRichard Guy Briggs <rgb@redhat.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 13 3月, 2016 1 次提交
-
-
由 Ming Lei 提交于
For !BIO_CLONED bio, we can use .bi_vcnt safely, but it doesn't mean we can just simply return .bi_io_vec[.bi_vcnt - 1] because the start postion may have been moved in the middle of the bvec, such as splitting in the middle of bvec. Fixes: 7bcd79ac(block: bio: introduce helpers to get the 1st and last bvec) Cc: stable@vger.kernel.org Reported-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 10 3月, 2016 4 次提交
-
-
由 Zhen Lei 提交于
To keep consistent with kfree, which tolerate ptr is NULL. We do this because sometimes we may use goto statement, so that success and failure case can share parts of the code. But unfortunately, dma_free_coherent called with parameter cpu_addr is null will cause oops, such as showed below: Unable to handle kernel paging request at virtual address ffffffc020d3b2b8 pgd = ffffffc083a61000 [ffffffc020d3b2b8] *pgd=0000000000000000, *pud=0000000000000000 CPU: 4 PID: 1489 Comm: malloc_dma_1 Tainted: G O 4.1.12 #1 Hardware name: ARM64 (DT) PC is at __dma_free_coherent.isra.10+0x74/0xc8 LR is at __dma_free+0x9c/0xb0 Process malloc_dma_1 (pid: 1489, stack limit = 0xffffffc0837fc020) [...] Call trace: __dma_free_coherent.isra.10+0x74/0xc8 __dma_free+0x9c/0xb0 malloc_dma+0x104/0x158 [dma_alloc_coherent_mtmalloc] kthread+0xec/0xfc Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mark Rutland 提交于
Functions which the compiler has instrumented for ASAN place poison on the stack shadow upon entry and remove this poison prior to returning. In some cases (e.g. hotplug and idle), CPUs may exit the kernel a number of levels deep in C code. If there are any instrumented functions on this critical path, these will leave portions of the idle thread stack shadow poisoned. If a CPU returns to the kernel via a different path (e.g. a cold entry), then depending on stack frame layout subsequent calls to instrumented functions may use regions of the stack with stale poison, resulting in (spurious) KASAN splats to the console. Contemporary GCCs always add stack shadow poisoning when ASAN is enabled, even when asked to not instrument a function [1], so we can't simply annotate functions on the critical path to avoid poisoning. Instead, this series explicitly removes any stale poison before it can be hit. In the common hotplug case we clear the entire stack shadow in common code, before a CPU is brought online. On architectures which perform a cold return as part of cpu idle may retain an architecture-specific amount of stack contents. To retain the poison for this retained context, the arch code must call the core KASAN code, passing a "watermark" stack pointer value beyond which shadow will be cleared. Architectures which don't perform a cold return as part of idle do not need any additional code. This patch (of 3): Functions which the compiler has instrumented for KASAN place poison on the stack shadow upon entry and remove this poision prior to returning. In some cases (e.g. hotplug and idle), CPUs may exit the kernel a number of levels deep in C code. If there are any instrumented functions on this critical path, these will leave portions of the stack shadow poisoned. If a CPU returns to the kernel via a different path (e.g. a cold entry), then depending on stack frame layout subsequent calls to instrumented functions may use regions of the stack with stale poison, resulting in (spurious) KASAN splats to the console. To avoid this, we must clear stale poison from the stack prior to instrumented functions being called. This patch adds functions to the KASAN core for removing poison from (portions of) a task's stack. These will be used by subsequent patches to avoid problems with hotplug and idle. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Reviewed-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Williams 提交于
Given we have uninitialized list_heads being passed to list_add() it will always be the case that those uninitialized values randomly trigger the poison value. Especially since a list_add() operation will seed the stack with the poison value for later stack allocations to trip over. For example, see these two false positive reports: list_add attempted on force-poisoned entry WARNING: at lib/list_debug.c:34 [..] NIP [c00000000043c390] __list_add+0xb0/0x150 LR [c00000000043c38c] __list_add+0xac/0x150 Call Trace: __list_add+0xac/0x150 (unreliable) __down+0x4c/0xf8 down+0x68/0x70 xfs_buf_lock+0x4c/0x150 [xfs] list_add attempted on force-poisoned entry(0000000000000500), new->next == d0000000059ecdb0, new->prev == 0000000000000500 WARNING: at lib/list_debug.c:33 [..] NIP [c00000000042db78] __list_add+0xa8/0x140 LR [c00000000042db74] __list_add+0xa4/0x140 Call Trace: __list_add+0xa4/0x140 (unreliable) rwsem_down_read_failed+0x6c/0x1a0 down_read+0x58/0x60 xfs_log_commit_cil+0x7c/0x600 [xfs] Fixes: commit 5c2c2587 ("mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup") Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reported-by: NEryu Guan <eguan@redhat.com> Tested-by: NEryu Guan <eguan@redhat.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> -
由 Steven Rostedt (Red Hat) 提交于
Commit f3775549 ("tracepoints: Do not trace when cpu is offline") added a check to make sure that tracepoints only get called when the cpu is online, as it uses rcu_read_lock_sched() for protection. Commit 3a630178 ("tracing: generate RCU warnings even when tracepoints are disabled") added lockdep checks (including rcu checks) for events that are not enabled to catch possible RCU issues that would only be triggered if a trace event was enabled. Commit f3775549 only stopped the warnings when the trace event was enabled but did not prevent warnings if the trace event was called when disabled. To fix this, the cpu online check is moved to where the condition is added to the trace event. This will place the cpu online check in all places that it may be used now and in the future. Cc: stable@vger.kernel.org # v3.18+ Fixes: f3775549 ("tracepoints: Do not trace when cpu is offline") Fixes: 3a630178 ("tracing: generate RCU warnings even when tracepoints are disabled") Reported-by: NSudeep Holla <sudeep.holla@arm.com> Tested-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 05 3月, 2016 1 次提交
-
-
由 Yan, Zheng 提交于
Add support for the format change of MClientReply/MclientCaps. Also add code that denies access to inodes with pool_ns layouts. Signed-off-by: NYan, Zheng <zyan@redhat.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
- 04 3月, 2016 8 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
Commit 9f616680 "tracing: Allow triggers to filter for CPU ids and process names" added a 'comm' filter that will filter events based on the current tasks struct 'comm'. But this now hides the ability to filter events that have a 'comm' field too. For example, sched_migrate_task trace event. That has a 'comm' field of the task to be migrated. echo 'comm == "bash"' > events/sched_migrate_task/filter will now filter all sched_migrate_task events for tasks named "bash" that migrates other tasks (in interrupt context), instead of seeing when "bash" itself gets migrated. This fix requires a couple of changes. 1) Change the look up order for filter predicates to look at the events fields before looking at the generic filters. 2) Instead of basing the filter function off of the "comm" name, have the generic "comm" filter have its own filter_type (FILTER_COMM). Test against the type instead of the name to assign the filter function. 3) Add a new "COMM" filter that works just like "comm" but will filter based on the current task, even if the trace event contains a "comm" field. Do the same for "cpu" field, adding a FILTER_CPU and a filter "CPU". Cc: stable@vger.kernel.org # v4.3+ Fixes: 9f616680 "tracing: Allow triggers to filter for CPU ids and process names" Reported-by: NMatt Fleming <matt@codeblueprint.co.uk> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Christoph Hellwig 提交于
Driver private request types should not get the artifical cap for the FS requests. This is important to use the full device capabilities for internal command or NVMe pass through commands. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reported-by: NJeff Lien <Jeff.Lien@hgst.com> Tested-by: NJeff Lien <Jeff.Lien@hgst.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Updated by me to use an explicit check for the one command type that does support extended checking, instead of relying on the ordering of the enum command values - as suggested by Keith. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
If cgroup writeback is in use, inodes can be scheduled for asynchronous wb switching. Before 5ff8eaac ("writeback: keep superblock pinned during cgroup writeback association switches"), this could race with umount leading to super_block being destroyed while inodes are pinned for wb switching. 5ff8eaac fixed it by bumping s_active while wb switches are in flight; however, this allowed in-flight wb switches to make umounts asynchronous when the userland expected synchronosity - e.g. fsck immediately following umount may fail because the device is still busy. This patch removes the problematic super_block pinning and instead makes generic_shutdown_super() flush in-flight wb switches. wb switches are now executed on a dedicated isw_wq so that they can be flushed and isw_nr_in_flight keeps track of the number of in-flight wb switches so that flushing can be avoided in most cases. v2: Move cgroup_writeback_umount() further below and add MS_ACTIVE check in inode_switch_wbs() as Jan an Al suggested. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NTahsin Erdogan <tahsin@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@ZenIV.linux.org.uk> Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com Fixes: 5ff8eaac ("writeback: keep superblock pinned during cgroup writeback association switches") Cc: stable@vger.kernel.org #v4.5 Reviewed-by: NJan Kara <jack@suse.cz> Tested-by: NTahsin Erdogan <tahsin@google.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lei 提交于
This patch applies the two introduced helpers to figure out the 1st and last bvec, and fixes the original way after bio splitting. Cc: stable@vger.kernel.org Reported-by: NSagi Grimberg <sagig@dev.mellanox.co.il> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lei 提交于
In the following patch, the way for figuring out the last bvec will be changed with a bit cost introduced, so return immediately if the queue doesn't have virt boundary limit. Actually most of devices have not this limit. Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lei 提交于
The bio passed to bio_will_gap() may be fast cloned from upper layer(dm, md, bcache, fs, ...), or from bio splitting in block core. Unfortunately bio_will_gap() just figures out the last bvec via 'bi_io_vec[prev->bi_vcnt - 1]' directly, and this way is obviously wrong. This patch introduces two helpers for getting the first and last bvec of one bio for fixing the issue. Cc: stable@vger.kernel.org Reported-by: NSagi Grimberg <sagig@dev.mellanox.co.il> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Benjamin Poirier 提交于
The current reserved_tailroom calculation fails to take hlen and tlen into account. skb: [__hlen__|__data____________|__tlen___|__extra__] ^ ^ head skb_end_offset In this representation, hlen + data + tlen is the size passed to alloc_skb. "extra" is the extra space made available in __alloc_skb because of rounding up by kmalloc. We can reorder the representation like so: [__hlen__|__data____________|__extra__|__tlen___] ^ ^ head skb_end_offset The maximum space available for ip headers and payload without fragmentation is min(mtu, data + extra). Therefore, reserved_tailroom = data + extra + tlen - min(mtu, data + extra) = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen) = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen) Compare the second line to the current expression: reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset) and we can see that hlen and tlen are not taken into account. The min() in the third line can be expanded into: if mtu < skb_tailroom - tlen: reserved_tailroom = skb_tailroom - mtu else: reserved_tailroom = tlen Depending on hlen, tlen, mtu and the number of multicast address records, the current code may output skbs that have less tailroom than dev->needed_tailroom or it may output more skbs than needed because not all space available is used. Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs") Signed-off-by: NBenjamin Poirier <bpoirier@suse.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gabriel Fernandez 提交于
This patch manages the case when you have an Ethernet MAC with a "fixed link", and not connected to a normal MDIO-managed PHY device. The test of phy_bus_name was not helpful because it was never affected and replaced by the mdio test node. Signed-off-by: NGabriel Fernandez <gabriel.fernandez@linaro.org> Acked-by: NGiuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 3月, 2016 1 次提交
-
-
由 Tariq Toukan 提交于
We should modify TIRs explicitly to apply the new RSS configuration. The light ndo close/open calls do not "refresh" them. Fixes: 2d75b2bc ('net/mlx5e: Add ethtool RSS configuration options') Signed-off-by: NTariq Toukan <tariqt@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 3月, 2016 1 次提交
-
-
由 Al Viro 提交于
Games with ordering and barriers are way too brittle. Just bump ->d_seq before and after updating ->d_inode and ->d_flags type bits, so that verifying ->d_seq would guarantee they are coherent. Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 28 2月, 2016 3 次提交
-
-
由 Ross Zwisler 提交于
Previously calls to dax_writeback_mapping_range() for all DAX filesystems (ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range(). dax_writeback_mapping_range() needs a struct block_device, and it used to get that from inode->i_sb->s_bdev. This is correct for normal inodes mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time files. Instead, call dax_writeback_mapping_range() directly from the filesystem ->writepages function so that it can supply us with a valid block device. This also fixes DAX code to properly flush caches in response to sync(2). Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NJan Kara <jack@suse.cz> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ross Zwisler 提交于
dax_clear_blocks() needs a valid struct block_device and previously it was using inode->i_sb->s_bdev in all cases. This is correct for normal inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time devices. Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change its arguments to take a bdev and a sector instead of an inode and a block. This better reflects what the function does, and it allows the filesystem and raw block device code to pass in an appropriate struct block_device. Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Suggested-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daniel Cashman 提交于
Commit d07e2259 ("mm: mmap: add new /proc tunable for mmap_base ASLR") added the ability to choose from a range of values to use for entropy count in generating the random offset to the mmap_base address. The maximum value on this range was set to 32 bits for 64-bit x86 systems, but this value could be increased further, requiring more than the 32 bits of randomness provided by get_random_int(), as is already possible for arm64. Add a new function: get_random_long() which more naturally fits with the mmap usage of get_random_int() but operates exactly the same as get_random_int(). Also, fix the shifting constant in mmap_rnd() to be an unsigned long so that values greater than 31 bits generate an appropriate mask without overflow. This is especially important on x86, as its shift instruction uses a 5-bit mask for the shift operand, which meant that any value for mmap_rnd_bits over 31 acts as a no-op and effectively disables mmap_base randomization. Finally, replace calls to get_random_int() with get_random_long() where appropriate. This patch (of 2): Add get_random_long(). Signed-off-by: NDaniel Cashman <dcashman@android.com> Acked-by: NKees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Kralevich <nnk@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 2月, 2016 1 次提交
-
-
由 Harvey Hunt 提交于
The id buffer in ata_device is a DMA target, but it isn't explicitly cacheline aligned. Due to this, adjacent fields can be overwritten with stale data from memory on non coherent architectures. As a result, the kernel is sometimes unable to communicate with an ATA device. Fix this by ensuring that the id buffer is cacheline aligned. This issue is similar to that fixed by Commit 84bda12a ("libata: align ap->sector_buf"). Signed-off-by: NHarvey Hunt <harvey.hunt@imgtec.com> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # 2.6.18 Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 25 2月, 2016 2 次提交
-
-
由 Peter Zijlstra 提交于
perf_install_in_context() relies upon the context switch hooks to have scheduled in events when the IPI misses its target -- after all, if the task has moved from the CPU (or wasn't running at all), it will have to context switch to run elsewhere. This however doesn't appear to be happening. It is possible for the IPI to not happen (task wasn't running) only to later observe the task running with an inactive context. The only possible explanation is that the context switch hooks are not called. Therefore put in a sync_sched() after toggling the jump_label to guarantee all CPUs will have them enabled before we install an event. A simple if (0->1) sync_sched() will not in fact work, because any further increment can race and complete before the sync_sched(). Therefore we must jump through some hoops. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Alexander reported that when the 'original' context gets destroyed, no new clones happen. This can happen irrespective of the ctx switch optimization, any task can die, even the parent, and we want to continue monitoring the task hierarchy until we either close the event or no tasks are left in the hierarchy. perf_event_init_context() will attempt to pin the 'parent' context during clone(). At that point current is the parent, and since current cannot have exited while executing clone(), its context cannot have passed through perf_event_exit_task_context(). Therefore perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE. However, since inherit_event() does: if (parent_event->parent) parent_event = parent_event->parent; it looks at the 'original' event when it does: is_orphaned_event(). This can return true if the context that contains the this event has passed through perf_event_exit_task_context(). And thus we'll fail to clone the perf context. Fix this by adding a new state: STATE_DEAD, which is set by perf_release() to indicate that the filedesc (or kernel reference) is dead and there are no observers for our data left. Only for STATE_DEAD will is_orphaned_event() be true and inhibit cloning. STATE_EXIT is otherwise preserved such that is_event_hup() remains functional and will report when the observed task hierarchy becomes empty. Reported-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Fixes: c6e5b732 ("perf: Synchronously clean up child events") Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 24 2月, 2016 1 次提交
-
-
由 Dan Williams 提交于
The original format of these commands from the "NVDIMM DSM Interface Example" [1] are superseded by the ACPI 6.1 definition of the "NVDIMM Root Device _DSMs" [2]. [1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf [2]: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf "9.20.7 NVDIMM Root Device _DSMs" Changes include: 1/ New 'restart' fields in ars_status, unfortunately these are implemented in the middle of the existing definition so this change is not backwards compatible. The expectation is that shipping platforms will only ever support the ACPI 6.1 definition. 2/ New status values for ars_start ('busy') and ars_status ('overflow'). Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: <stable@vger.kernel.org> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 22 2月, 2016 2 次提交
-
-
由 Karicheri, Muralidharan 提交于
Rename the pad to sw_data as per description of this field in the hardware spec(refer sprugr9 from www.ti.com). Latest version of the document is at http://www.ti.com/lit/ug/sprugr9h/sprugr9h.pdf and section 3.1 Host Packet Descriptor describes this field. Define and use a constant for the size of sw_data field similar to other fields in the struct for desc and document the sw_data field in the header. As the sw_data is not touched by hw, it's type can be changed to u32. Rename the helpers to match with the updated dma desc field sw_data. Cc: Wingman Kwok <w-kwok2@ti.com> Cc: Mugunthan V N <mugunthanvnm@ti.com> CC: Arnd Bergmann <arnd@arndb.de> CC: Grygorii Strashko <grygorii.strashko@ti.com> CC: David Laight <David.Laight@ACULAB.COM> Signed-off-by: NMurali Karicheri <m-karicheri2@ti.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ivaylo Dimitrov 提交于
Patch <703df6c0> ("power: bq27xxx_battery: Reorganize I2C into a module") has removed the device name numbering from bq27xxx_battery_i2c_probe. Fix that by restoring the code. Fixes: 703df6c0Signed-off-by: NIvaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Reviewed-by: NPali Rohár <pali.rohar@gmail.com> Tested-by: NPali Rohár <pali.rohar@gmail.com> Signed-off-by: NSebastian Reichel <sre@kernel.org>
-
- 20 2月, 2016 2 次提交
-
-
由 Dan Williams 提交于
Use the output length specified in the command to size the receive buffer rather than the arbitrary 4K limit. This bug was hiding the fact that the ndctl implementation of ndctl_bus_cmd_new_ars_status() was not specifying an output buffer size. Cc: <stable@vger.kernel.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> -
由 Nikolay Aleksandrov 提交于
When I used netdev_for_each_lower_dev in commit bad53162 ("vrf: remove slave queue and private slave struct") I thought that it acts like netdev_for_each_lower_private and can be used to remove the current device from the list while walking, but unfortunately it acts more like netdev_for_each_lower_private_rcu and doesn't allow it. The difference is where the "iter" points to, right now it points to the current element and that makes it impossible to remove it. Change the logic to be similar to netdev_for_each_lower_private and make it point to the "next" element so we can safely delete the current one. VRF is the only such user right now, there's no change for the read-only users. Here's what can happen now: [98423.249858] general protection fault: 0000 [#1] SMP [98423.250175] Modules linked in: vrf bridge(O) stp llc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ppdev aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd evdev serio_raw pcspkr virtio_balloon parport_pc parport i2c_piix4 i2c_core virtio_console acpi_cpufreq button 9pnet_virtio 9p 9pnet fscache ipv6 autofs4 ext4 crc16 mbcache jbd2 sg virtio_blk virtio_net sr_mod cdrom e1000 ata_generic ehci_pci uhci_hcd ehci_hcd usbcore usb_common virtio_pci ata_piix libata floppy virtio_ring virtio scsi_mod [last unloaded: bridge] [98423.255040] CPU: 1 PID: 14173 Comm: ip Tainted: G O 4.5.0-rc2+ #81 [98423.255386] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [98423.255777] task: ffff8800547f5540 ti: ffff88003428c000 task.ti: ffff88003428c000 [98423.256123] RIP: 0010:[<ffffffff81514f3e>] [<ffffffff81514f3e>] netdev_lower_get_next+0x1e/0x30 [98423.256534] RSP: 0018:ffff88003428f940 EFLAGS: 00010207 [98423.256766] RAX: 0002000100000004 RBX: ffff880054ff9000 RCX: 0000000000000000 [98423.257039] RDX: ffff88003428f8b8 RSI: ffff88003428f950 RDI: ffff880054ff90c0 [98423.257287] RBP: ffff88003428f940 R08: 0000000000000000 R09: 0000000000000000 [98423.257537] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88003428f9e0 [98423.257802] R13: ffff880054a5fd00 R14: ffff88003428f970 R15: 0000000000000001 [98423.258055] FS: 00007f3d76881700(0000) GS:ffff88005d000000(0000) knlGS:0000000000000000 [98423.258418] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [98423.258650] CR2: 00007ffe5951ffa8 CR3: 0000000052077000 CR4: 00000000000406e0 [98423.258902] Stack: [98423.259075] ffff88003428f960 ffffffffa0442636 0002000100000004 ffff880054ff9000 [98423.259647] ffff88003428f9b0 ffffffff81518205 ffff880054ff9000 ffff88003428f978 [98423.260208] ffff88003428f978 ffff88003428f9e0 ffff88003428f9e0 ffff880035b35f00 [98423.260739] Call Trace: [98423.260920] [<ffffffffa0442636>] vrf_dev_uninit+0x76/0xa0 [vrf] [98423.261156] [<ffffffff81518205>] rollback_registered_many+0x205/0x390 [98423.261401] [<ffffffff815183ec>] unregister_netdevice_many+0x1c/0x70 [98423.261641] [<ffffffff8153223c>] rtnl_delete_link+0x3c/0x50 [98423.271557] [<ffffffff815335bb>] rtnl_dellink+0xcb/0x1d0 [98423.271800] [<ffffffff811cd7da>] ? __inc_zone_state+0x4a/0x90 [98423.272049] [<ffffffff815337b4>] rtnetlink_rcv_msg+0x84/0x200 [98423.272279] [<ffffffff810cfe7d>] ? trace_hardirqs_on+0xd/0x10 [98423.272513] [<ffffffff8153370b>] ? rtnetlink_rcv+0x1b/0x40 [98423.272755] [<ffffffff81533730>] ? rtnetlink_rcv+0x40/0x40 [98423.272983] [<ffffffff8155d6e7>] netlink_rcv_skb+0x97/0xb0 [98423.273209] [<ffffffff8153371a>] rtnetlink_rcv+0x2a/0x40 [98423.273476] [<ffffffff8155ce8b>] netlink_unicast+0x11b/0x1a0 [98423.273710] [<ffffffff8155d2f1>] netlink_sendmsg+0x3e1/0x610 [98423.273947] [<ffffffff814fbc98>] sock_sendmsg+0x38/0x70 [98423.274175] [<ffffffff814fc253>] ___sys_sendmsg+0x2e3/0x2f0 [98423.274416] [<ffffffff810d841e>] ? do_raw_spin_unlock+0xbe/0x140 [98423.274658] [<ffffffff811e1bec>] ? handle_mm_fault+0x26c/0x2210 [98423.274894] [<ffffffff811e19cd>] ? handle_mm_fault+0x4d/0x2210 [98423.275130] [<ffffffff81269611>] ? __fget_light+0x91/0xb0 [98423.275365] [<ffffffff814fcd42>] __sys_sendmsg+0x42/0x80 [98423.275595] [<ffffffff814fcd92>] SyS_sendmsg+0x12/0x20 [98423.275827] [<ffffffff81611bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a [98423.276073] Code: c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 06 55 48 81 c7 c0 00 00 00 48 89 e5 48 8b 00 48 39 f8 74 09 48 89 06 <48> 8b 40 e8 5d c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 [98423.279639] RIP [<ffffffff81514f3e>] netdev_lower_get_next+0x1e/0x30 [98423.279920] RSP <ffff88003428f940> CC: David Ahern <dsa@cumulusnetworks.com> CC: David S. Miller <davem@davemloft.net> CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Vlad Yasevich <vyasevic@redhat.com> Fixes: bad53162 ("vrf: remove slave queue and private slave struct") Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: NDavid Ahern <dsa@cumulusnetworks.com> Tested-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 2月, 2016 1 次提交
-
-
由 Jeff Layton 提交于
This reverts commit c510eff6 ("fsnotify: destroy marks with call_srcu instead of dedicated thread"). Eryu reported that he was seeing some OOM kills kick in when running a testcase that adds and removes inotify marks on a file in a tight loop. The above commit changed the code to use call_srcu to clean up the marks. While that does (in principle) work, the srcu callback job is limited to cleaning up entries in small batches and only once per jiffy. It's easily possible to overwhelm that machinery with too many call_srcu callbacks, and Eryu's reproduer did just that. There's also another potential problem with using call_srcu here. While you can obviously sleep while holding the srcu_read_lock, the callbacks run under local_bh_disable, so you can't sleep there. It's possible when putting the last reference to the fsnotify_mark that we'll end up putting a chain of references including the fsnotify_group, uid, and associated keys. While I don't see any obvious ways that that could occurs, it's probably still best to avoid using call_srcu here after all. This patch reverts the above patch. A later patch will take a different approach to eliminated the dedicated thread here. Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Reported-by: NEryu Guan <guaneryu@gmail.com> Tested-by: NEryu Guan <guaneryu@gmail.com> Cc: Jan Kara <jack@suse.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 2月, 2016 3 次提交
-
-
由 Bjorn Helgaas 提交于
Revert 811a4e6f ("PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed"). This is part of reverting 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") to fix regressions it introduced. Link: https://bugzilla.kernel.org/show_bug.cgi?id=111211 Fixes: 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NRafael J. Wysocki <rafael@kernel.org> CC: Jiang Liu <jiang.liu@linux.intel.com>
-
由 Jessica Yu 提交于
Remove the ftrace module notifier in favor of directly calling ftrace_module_enable() and ftrace_release_mod() in the module loader. Hard-coding the function calls directly in the module loader removes dependence on the module notifier call chain and provides better visibility and control over what gets called when, which is important to kernel utilities such as livepatch. This fixes a notifier ordering issue in which the ftrace module notifier (and hence ftrace_module_enable()) for coming modules was being called after klp_module_notify(), which caused livepatch modules to initialize incorrectly. This patch removes dependence on the module notifier call chain in favor of hard coding the corresponding function calls in the module loader. This ensures that ftrace and livepatch code get called in the correct order on patch module load and unload. Fixes: 5156dca3 ("ftrace: Fix the race between ftrace and insmod") Signed-off-by: NJessica Yu <jeyu@redhat.com> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NPetr Mladek <pmladek@suse.cz> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: NMiroslav Benes <mbenes@suse.cz> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Kinglong Mee 提交于
unreferenced object 0xffffc90000abf000 (size 16900): comm "fsync02", pid 15765, jiffies 4297431627 (age 423.772s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 a0 c2 19 00 88 ff ff ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8174d54e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811b9b91>] __vmalloc_node_range+0x231/0x280 [<ffffffff811b9c2a>] __vmalloc+0x4a/0x50 [<ffffffffa02c9ec1>] ext_tree_prepare_commit+0x231/0x2e0 [blocklayoutdriver] [<ffffffffa02c700e>] bl_prepare_layoutcommit+0xe/0x10 [blocklayoutdriver] [<ffffffffa0596a6c>] pnfs_layoutcommit_inode+0x29c/0x330 [nfsv4] [<ffffffffa0596b13>] pnfs_generic_sync+0x13/0x20 [nfsv4] [<ffffffffa0585188>] nfs4_file_fsync+0x58/0x150 [nfsv4] [<ffffffff81228e5b>] vfs_fsync_range+0x4b/0xb0 [<ffffffff81228f1d>] do_fsync+0x3d/0x70 [<ffffffff812291d0>] SyS_fsync+0x10/0x20 [<ffffffff81757def>] entry_SYSCALL_64_fastpath+0x12/0x76 [<ffffffffffffffff>] 0xffffffffffffffff v2, add missing include header Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 17 2月, 2016 2 次提交
-
-
由 Huy Nguyen 提交于
problem description: The current code sets UAR page size equal to system page size. The ConnectX-3 and ConnectX-3 Pro HWs require minimum 128 UAR pages. The mlx4 kernel drivers are not loaded if there is less than 128 UAR pages. solution: Always set UAR page to 4KB. This allows more UAR pages if the OS has PAGE_SIZE larger than 4KB. For example, PowerPC kernel use 64KB system page size, with 4MB uar region, there are 4MB/2/64KB = 32 uars (half for uar, half for blueflame). This does not meet minimum 128 UAR pages requirement. With 4KB UAR page, there are 4MB/2/4KB = 512 uars which meet the minimum requirement. Note that only codes in mlx4_core that deal with firmware know that uar page size is 4KB. Codes that deal with usr page in cq and qp context (mlx4_ib, mlx4_en and part of mlx4_core) still have the same assumption that uar page size equals to system page size. Note that with this implementation, on 64KB system page size kernel, there are 16 uars per system page but only one uars is used. The other 15 uars are ignored because of the above assumption. Regarding SR-IOV, mlx4_core in hypervisor will set the uar page size to 4KB and mlx4_core code in virtual OS will obtain the uar page size from firmware. Regarding backward compatibility in SR-IOV, if hypervisor has this new code, the virtual OS must be updated. If hypervisor has old code, and the virtual OS has this new code, the new code will be backward compatible with the old code. If the uar size is big enough, this new code in VF continues to work with 64 KB uar page size (on PowerPc kernel). If the uar size does not meet 128 uars requirement, this new code not loaded in VF and print the same error message as the old code in Hypervisor. Signed-off-by: NHuy Nguyen <huyn@mellanox.com> Reviewed-by: NYishai Hadas <yishaih@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matan Barak 提交于
mlx5_ifc.h is a header file representing the API and ABI between the driver to the firmware and hardware. This file is used from both the mlx5_ib and mlx5_core drivers. Previously, this file used incrementing counter to indicate reserved fields, for example: struct mlx5_ifc_odp_per_transport_service_cap_bits { u8 send[0x1]; u8 receive[0x1]; u8 write[0x1]; u8 read[0x1]; u8 reserved_0[0x1]; u8 srq_receive[0x1]; u8 reserved_1[0x1a]; }; If one developer implements through net-next feature A that uses reserved_0, they replace it with featureA and renames reserved_1 to reserved_0. In the same kernel cycle, a 2nd developer could implement feature B through the rdma tree, that uses reserved_1 and split it to featureB and a smaller reserved_1 field. This will cause a conflict when the two trees are merged. The source of this conflict is that the 1st developer changed *all* reserved fields. As Linus suggested, we change the layout of structs to: struct mlx5_ifc_odp_per_transport_service_cap_bits { u8 send[0x1]; u8 receive[0x1]; u8 write[0x1]; u8 read[0x1]; u8 reserved_at_4[0x1]; u8 srq_receive[0x1]; u8 reserved_at_6[0x1a]; }; This makes the conflicts much more rare and preserves the locality of changes. Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NAlaa Hleihel <alaa@mellanox.com> Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 2月, 2016 2 次提交
-
-
由 Arnd Bergmann 提交于
In my randconfig tests, I came across a bug that involves several components: * gcc-4.9 through at least 5.3 * CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files * CONFIG_PROFILE_ALL_BRANCHES overriding every if() * The optimized implementation of do_div() that tries to replace a library call with an division by multiplication * code in drivers/media/dvb-frontends/zl10353.c doing u32 adc_clock = 450560; /* 45.056 MHz */ if (state->config.adc_clock) adc_clock = state->config.adc_clock; do_div(value, adc_clock); In this case, gcc fails to determine whether the divisor in do_div() is __builtin_constant_p(). In particular, it concludes that __builtin_constant_p(adc_clock) is false, while __builtin_constant_p(!!adc_clock) is true. That in turn throws off the logic in do_div() that also uses __builtin_constant_p(), and instead of picking either the constant- optimized division, and the code in ilog2() that uses __builtin_constant_p() to figure out whether it knows the answer at compile time. The result is a link error from failing to find multiple symbols that should never have been called based on the __builtin_constant_p(): dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN' dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod' ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined! ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined! This patch avoids the problem by changing __trace_if() to check whether the condition is known at compile-time to be nonzero, rather than checking whether it is actually a constant. I see this one link error in roughly one out of 1600 randconfig builds on ARM, and the patch fixes all known instances. Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.deAcked-by: NNicolas Pitre <nico@linaro.org> Fixes: ab3c9c68 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y") Cc: stable@vger.kernel.org # v2.6.30+ Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> -
由 Steven Rostedt (Red Hat) 提交于
The tracepoint infrastructure uses RCU sched protection to enable and disable tracepoints safely. There are some instances where tracepoints are used in infrastructure code (like kfree()) that get called after a CPU is going offline, and perhaps when it is coming back online but hasn't been registered yet. This can probuce the following warning: [ INFO: suspicious RCU usage. ] 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S ------------------------------- include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/8/0. stack backtrace: CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34 Call Trace: [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable) [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170 [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440 [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100 [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150 [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140 [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310 [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60 [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40 [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560 [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360 [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14 This warning is not a false positive either. RCU is not protecting code that is being executed while the CPU is offline. Instead of playing "whack-a-mole(TM)" and adding conditional statements to the tracepoints we find that are used in this instance, simply add a cpu_online() test to the tracepoint code where the tracepoint will be ignored if the CPU is offline. Use of raw_smp_processor_id() is fine, as there should never be a case where the tracepoint code goes from running on a CPU that is online and suddenly gets migrated to a CPU that is offline. Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.orgReported-by: NDenis Kirjanov <kda@linux-powerpc.org> Fixes: 97e1c18e ("tracing: Kernel Tracepoints") Cc: stable@vger.kernel.org # v2.6.28+ Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 15 2月, 2016 1 次提交
-
-
由 David Woodhouse 提交于
According to the VT-d specification we need to clear the PPR bit in the Page Request Status register when handling page requests, or the hardware won't generate any more interrupts. This wasn't actually necessary on SKL/KBL (which may well be the subject of a hardware erratum, although it's harmless enough). But other implementations do appear to get it right, and we only ever get one interrupt unless we clear the PPR bit. Reported-by: NCQ Tang <cq.tang@intel.com> Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com> Cc: stable@vger.kernel.org
-
- 12 2月, 2016 2 次提交
-
-
由 Dan Williams 提交于
The pfn_t type uses an unsigned long to store a pfn + flags value. On a 64-bit platform the upper 12 bits of an unsigned long are never used for storing the value of a pfn. However, this is not true on highmem platforms, all 32-bits of a pfn value are used to address a 44-bit physical address space. A pfn_t needs to store a 64-bit value. Link: https://bugzilla.kernel.org/show_bug.cgi?id=112211 Fixes: 01c8f1c4 ("mm, dax, gpu: convert vm_insert_mixed to pfn_t") Signed-off-by: NDan Williams <dan.j.williams@intel.com> Reported-by: NStuart Foster <smf.linux@ntlworld.com> Reported-by: NJulian Margetson <runaway@candw.ms> Tested-by: NJulian Margetson <runaway@candw.ms> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrew Morton 提交于
Mike said: : CONFIG_UBSAN_ALIGNMENT breaks x86-64 kernel with lockdep enabled, i. e : kernel with CONFIG_UBSAN_ALIGNMENT fails to load without even any error : message. : : The problem is that ubsan callbacks use spinlocks and might be called : before lockdep is initialized. Particularly this line in the : reserve_ebda_region function causes problem: : : lowmem = *(unsigned short *)__va(BIOS_LOWMEM_KILOBYTES); : : If i put lockdep_init() before reserve_ebda_region call in : x86_64_start_reservations kernel loads well. Fix this ordering issue permanently: change lockdep so that it uses hlists for the hash tables. Unlike a list_head, an hlist_head is in its initialized state when it is all-zeroes, so lockdep is ready for operation immediately upon boot - lockdep_init() need not have run. The patch will also save some memory. lockdep_init() and lockdep_initialized can be done away with now - a 4.6 patch has been prepared to do this. Reported-by: NMike Krinkin <krinkin.m.u@gmail.com> Suggested-by: NMike Krinkin <krinkin.m.u@gmail.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 2月, 2016 1 次提交
-
-
由 Arnd Bergmann 提交于
As reported by Soohoon Lee, the HDIO_GET_32BIT ioctl does not work correctly in compat mode with libata. I have investigated the issue further and found multiple problems that all appeared with the same commit that originally introduced HDIO_GET_32BIT handling in libata back in linux-2.6.8 and presumably also linux-2.4, as the code uses "copy_to_user(arg, &val, 1)" to copy a 'long' variable containing either 0 or 1 to user space. The problems with this are: * On big-endian machines, this will always write a zero because it stores the wrong byte into user space. * In compat mode, the upper three bytes of the variable are updated by the compat_hdio_ioctl() function, but they now contain uninitialized stack data. * The hdparm tool calling this ioctl uses a 'static long' variable to store the result. This means at least the upper bytes are initialized to zero, but calling another ioctl like HDIO_GET_MULTCOUNT would fill them with data that remains stale when the low byte is overwritten. Fortunately libata doesn't implement any of the affected ioctl commands, so this would only happen when we query both an IDE and an ATA device in the same command such as "hdparm -N -c /dev/hda /dev/sda" * The libata code for unknown reasons started using ATA_IOC_GET_IO32 and ATA_IOC_SET_IO32 as aliases for HDIO_GET_32BIT and HDIO_SET_32BIT, while the ioctl commands that were added later use the normal HDIO_* names. This is harmless but rather confusing. This addresses all four issues by changing the code to use put_user() on an 'unsigned long' variable in HDIO_GET_32BIT, like the IDE subsystem does, and by clarifying the names of the ioctl commands. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Reported-by: NSoohoon Lee <Soohoon.Lee@f5.com> Tested-by: NSoohoon Lee <Soohoon.Lee@f5.com> Cc: stable@vger.kernel.org Signed-off-by: NTejun Heo <tj@kernel.org>
-