- 09 9月, 2014 1 次提交
-
-
由 Dmitry Kasatkin 提交于
Empty files and missing xattrs do not guarantee that a file was just created. This patch passes FILE_CREATED flag to IMA to reliably identify new files. Signed-off-by: NDmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: NMimi Zohar <zohar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> 3.14+
-
- 02 9月, 2014 1 次提交
-
-
由 Mark Rustad 提交于
Renaming an unused formal parameter in the static inline function security_inode_init_security eliminates many W=2 warnings. Signed-off-by: NMark Rustad <mark.d.rustad@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
- 30 7月, 2014 1 次提交
-
-
由 Jason Gunthorpe 提交于
Some Atmel TPMs provide completely wrong timeouts from their TPM_CAP_PROP_TIS_TIMEOUT query. This patch detects that and returns new correct values via a DID/VID table in the TIS driver. Tested on ARM using an AT97SC3204T FW version 37.16 Cc: <stable@vger.kernel.org> [PHuewe: without this fix these 'broken' Atmel TPMs won't function on older kernels] Signed-off-by: N"Berg, Christopher" <Christopher.Berg@atmel.com> Signed-off-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: NPeter Huewe <peterhuewe@gmx.de>
-
- 28 7月, 2014 1 次提交
-
-
由 Paul Moore 提交于
This reverts commit 4da6daf4. Unfortunately, the commit in question caused problems with Bluetooth devices, specifically it caused them to get caught in the newly created BUG_ON() check. The AF_ALG problem still exists, but will be addressed in a future patch. Cc: stable@vger.kernel.org Signed-off-by: NPaul Moore <pmoore@redhat.com>
-
- 26 7月, 2014 2 次提交
-
-
由 Mimi Zohar 提交于
The "security: introduce kernel_fw_from_file hook" patch defined a new security hook to evaluate any loaded firmware that wasn't built into the kernel. This patch defines ima_fw_from_file(), which is called from the new security hook, to measure and/or appraise the loaded firmware's integrity. Signed-off-by: NMimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: NKees Cook <keescook@chromium.org>
-
由 Kees Cook 提交于
In order to validate the contents of firmware being loaded, there must be a hook to evaluate any loaded firmware that wasn't built into the kernel itself. Without this, there is a risk that a root user could load malicious firmware designed to mount an attack against kernel memory (e.g. via DMA). Signed-off-by: NKees Cook <keescook@chromium.org> Reviewed-by: NTakashi Iwai <tiwai@suse.de>
-
- 24 7月, 2014 1 次提交
-
-
由 Eric Paris 提交于
This is effectively a revert of 7b9a7ec5 plus fixing it a different way... We found, when trying to run an application from an application which had dropped privs that the kernel does security checks on undefined capability bits. This was ESPECIALLY difficult to debug as those undefined bits are hidden from /proc/$PID/status. Consider a root application which drops all capabilities from ALL 4 capability sets. We assume, since the application is going to set eff/perm/inh from an array that it will clear not only the defined caps less than CAP_LAST_CAP, but also the higher 28ish bits which are undefined future capabilities. The BSET gets cleared differently. Instead it is cleared one bit at a time. The problem here is that in security/commoncap.c::cap_task_prctl() we actually check the validity of a capability being read. So any task which attempts to 'read all things set in bset' followed by 'unset all things set in bset' will not even attempt to unset the undefined bits higher than CAP_LAST_CAP. So the 'parent' will look something like: CapInh: 0000000000000000 CapPrm: 0000000000000000 CapEff: 0000000000000000 CapBnd: ffffffc000000000 All of this 'should' be fine. Given that these are undefined bits that aren't supposed to have anything to do with permissions. But they do... So lets now consider a task which cleared the eff/perm/inh completely and cleared all of the valid caps in the bset (but not the invalid caps it couldn't read out of the kernel). We know that this is exactly what the libcap-ng library does and what the go capabilities library does. They both leave you in that above situation if you try to clear all of you capapabilities from all 4 sets. If that root task calls execve() the child task will pick up all caps not blocked by the bset. The bset however does not block bits higher than CAP_LAST_CAP. So now the child task has bits in eff which are not in the parent. These are 'meaningless' undefined bits, but still bits which the parent doesn't have. The problem is now in cred_cap_issubset() (or any operation which does a subset test) as the child, while a subset for valid cap bits, is not a subset for invalid cap bits! So now we set durring commit creds that the child is not dumpable. Given it is 'more priv' than its parent. It also means the parent cannot ptrace the child and other stupidity. The solution here: 1) stop hiding capability bits in status This makes debugging easier! 2) stop giving any task undefined capability bits. it's simple, it you don't put those invalid bits in CAP_FULL_SET you won't get them in init and you won't get them in any other task either. This fixes the cap_issubset() tests and resulting fallout (which made the init task in a docker container untraceable among other things) 3) mask out undefined bits when sys_capset() is called as it might use ~0, ~0 to denote 'all capabilities' for backward/forward compatibility. This lets 'capsh --caps="all=eip" -- -c /bin/bash' run. 4) mask out undefined bit when we read a file capability off of disk as again likely all bits are set in the xattr for forward/backward compatibility. This lets 'setcap all+pe /bin/bash; /bin/bash' run Signed-off-by: NEric Paris <eparis@redhat.com> Reviewed-by: NKees Cook <keescook@chromium.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Andrew G. Morgan <morgan@kernel.org> Cc: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: Kees Cook <keescook@chromium.org> Cc: Steve Grubb <sgrubb@redhat.com> Cc: Dan Walsh <dwalsh@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
- 23 7月, 2014 2 次提交
-
-
由 David Howells 提交于
Allow a key type's preparsing routine to set the expiry time for a key. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NJeff Layton <jlayton@primarydata.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
由 David Howells 提交于
struct key_preparsed_payload should have two payload pointers to correspond with those in struct key. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NJeff Layton <jlayton@primarydata.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
- 19 7月, 2014 5 次提交
-
-
由 Kees Cook 提交于
Applying restrictive seccomp filter programs to large or diverse codebases often requires handling threads which may be started early in the process lifetime (e.g., by code that is linked in). While it is possible to apply permissive programs prior to process start up, it is difficult to further restrict the kernel ABI to those threads after that point. This change adds a new seccomp syscall flag to SECCOMP_SET_MODE_FILTER for synchronizing thread group seccomp filters at filter installation time. When calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, filter) an attempt will be made to synchronize all threads in current's threadgroup to its new seccomp filter program. This is possible iff all threads are using a filter that is an ancestor to the filter current is attempting to synchronize to. NULL filters (where the task is running as SECCOMP_MODE_NONE) are also treated as ancestors allowing threads to be transitioned into SECCOMP_MODE_FILTER. If prctrl(PR_SET_NO_NEW_PRIVS, ...) has been set on the calling thread, no_new_privs will be set for all synchronized threads too. On success, 0 is returned. On failure, the pid of one of the failing threads will be returned and no filters will have been applied. The race conditions against another thread are: - requesting TSYNC (already handled by sighand lock) - performing a clone (already handled by sighand lock) - changing its filter (already handled by sighand lock) - calling exec (handled by cred_guard_mutex) The clone case is assisted by the fact that new threads will have their seccomp state duplicated from their parent before appearing on the tasklist. Holding cred_guard_mutex means that seccomp filters cannot be assigned while in the middle of another thread's exec (potentially bypassing no_new_privs or similar). The call to de_thread() may kill threads waiting for the mutex. Changes across threads to the filter pointer includes a barrier. Based on patches by Will Drewry. Suggested-by: NJulien Tinnes <jln@chromium.org> Signed-off-by: NKees Cook <keescook@chromium.org> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
-
由 Kees Cook 提交于
Normally, task_struct.seccomp.filter is only ever read or modified by the task that owns it (current). This property aids in fast access during system call filtering as read access is lockless. Updating the pointer from another task, however, opens up race conditions. To allow cross-thread filter pointer updates, writes to the seccomp fields are now protected by the sighand spinlock (which is shared by all threads in the thread group). Read access remains lockless because pointer updates themselves are atomic. However, writes (or cloning) often entail additional checking (like maximum instruction counts) which require locking to perform safely. In the case of cloning threads, the child is invisible to the system until it enters the task list. To make sure a child can't be cloned from a thread and left in a prior state, seccomp duplication is additionally moved under the sighand lock. Then parent and child are certain have the same seccomp state when they exit the lock. Based on patches by Will Drewry and David Drysdale. Signed-off-by: NKees Cook <keescook@chromium.org> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
-
由 Kees Cook 提交于
Since seccomp transitions between threads requires updates to the no_new_privs flag to be atomic, the flag must be part of an atomic flag set. This moves the nnp flag into a separate task field, and introduces accessors. Signed-off-by: NKees Cook <keescook@chromium.org> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
-
由 Kees Cook 提交于
This adds the new "seccomp" syscall with both an "operation" and "flags" parameter for future expansion. The third argument is a pointer value, used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...). In addition to the TSYNC flag later in this patch series, there is a non-zero chance that this syscall could be used for configuring a fixed argument area for seccomp-tracer-aware processes to pass syscall arguments in the future. Hence, the use of "seccomp" not simply "seccomp_add_filter" for this syscall. Additionally, this syscall uses operation, flags, and user pointer for arguments because strictly passing arguments via a user pointer would mean seccomp itself would be unable to trivially filter the seccomp syscall itself. Signed-off-by: NKees Cook <keescook@chromium.org> Reviewed-by: NOleg Nesterov <oleg@redhat.com> Reviewed-by: NAndy Lutomirski <luto@amacapital.net>
-
由 David Howells 提交于
Provide a generic instantiation function for key types that use the preparse hook. This makes it easier to prereserve key quota before keyrings get locked to retain the new key. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NJeff Layton <jlayton@primarydata.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
- 18 7月, 2014 1 次提交
-
-
由 David Howells 提交于
Special kernel keys, such as those used to hold DNS results for AFS, CIFS and NFS and those used to hold idmapper results for NFS, used to be 'invalidateable' with key_revoke(). However, since the default permissions for keys were reduced: Commit: 96b5c8fe KEYS: Reduce initial permissions on keys it has become impossible to do this. Add a key flag (KEY_FLAG_ROOT_CAN_INVAL) that will permit a key to be invalidated by root. This should not be used for system keyrings as the garbage collector will try and remove any invalidate key. For system keyrings, KEY_FLAG_ROOT_CAN_CLEAR can be used instead. After this, from userspace, keyctl_invalidate() and "keyctl invalidate" can be used by any possessor of CAP_SYS_ADMIN (typically root) to invalidate DNS and idmapper keys. Invalidated keys are immediately garbage collected and will be immediately rerequested if needed again. Signed-off-by: NDavid Howells <dhowells@redhat.com> Tested-by: NSteve Dickson <steved@redhat.com>
-
- 17 7月, 2014 1 次提交
-
-
由 Dmitry Kasatkin 提交于
Instead of allowing public keys, with certificates signed by any key on the system trusted keyring, to be added to a trusted keyring, this patch further restricts the certificates to those signed only by builtin keys on the system keyring. This patch defines a new option 'builtin' for the kernel parameter 'keys_ownerid' to allow trust validation using builtin keys. Simplified Mimi's "KEYS: define an owner trusted keyring" patch Changelog v7: - rename builtin_keys to use_builtin_keys Signed-off-by: NDmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: NMimi Zohar <zohar@linux.vnet.ibm.com>
-
- 10 7月, 2014 1 次提交
-
-
由 Paul Moore 提交于
The sock_graft() hook has special handling for AF_INET, AF_INET, and AF_UNIX sockets as those address families have special hooks which label the sock before it is attached its associated socket. Unfortunately, the sock_graft() hook was missing a default approach to labeling sockets which meant that any other address family which made use of connections or the accept() syscall would find the returned socket to be in an "unlabeled" state. This was recently demonstrated by the kcrypto/AF_ALG subsystem and the newly released cryptsetup package (cryptsetup v1.6.5 and later). This patch preserves the special handling in selinux_sock_graft(), but adds a default behavior - setting the sock's label equal to the associated socket - which resolves the problem with AF_ALG and presumably any other address family which makes use of accept(). Cc: stable@vger.kernel.org Signed-off-by: NPaul Moore <pmoore@redhat.com> Tested-by: NMilan Broz <gmazyland@gmail.com>
-
- 09 7月, 2014 3 次提交
-
-
由 David Howells 提交于
The PKCS#7 certificate should contain a "Microsoft individual code signing" data blob as its signed content. This blob contains a digest of the signed content of the PE binary and the OID of the digest algorithm used (typically SHA256). Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Reviewed-by: NKees Cook <keescook@chromium.org>
-
由 David Howells 提交于
Parse a PE binary to find a key and a signature contained therein. Later patches will check the signature and add the key if the signature checks out. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Reviewed-by: NKees Cook <keescook@chromium.org>
-
由 David Howells 提交于
Provide some PE binary structural and constant definitions as taken from the pesign package sources. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Reviewed-by: NKees Cook <keescook@chromium.org>
-
- 08 7月, 2014 1 次提交
-
-
由 David Howells 提交于
Implement a parser for a PKCS#7 signed-data message as described in part of RFC 2315. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NVivek Goyal <vgoyal@redhat.com> Reviewed-by: NKees Cook <keescook@chromium.org>
-
- 04 7月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
The 'sysret' fastpath does not correctly restore even all regular registers, much less any segment registers or reflags values. That is very much part of why it's faster than 'iret'. Normally that isn't a problem, because the normal ptrace() interface catches the process using the signal handler infrastructure, which always returns with an iret. However, some paths can get caught using ptrace_event() instead of the signal path, and for those we need to make sure that we aren't going to return to user space using 'sysret'. Otherwise the modifications that may have been done to the register set by the tracer wouldn't necessarily take effect. Fix it by forcing IRET path by setting TIF_NOTIFY_RESUME from arch_ptrace_stop_needed() which is invoked from ptrace_stop(). Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NAndy Lutomirski <luto@amacapital.net> Acked-by: NOleg Nesterov <oleg@redhat.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 7月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
d911d987 ("kernfs: make kernfs_notify() trigger inotify events too") added fsnotify triggering to kernfs_notify() which requires a sleepable context. There are already existing users of kernfs_notify() which invoke it from an atomic context and in general it's silly to require a sleepable context for triggering a notification. The following is an invalid context bug triggerd by md invoking sysfs_notify() from IO completion path. BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1 2 locks held by swapper/1/0: #0: (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk] #1: (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240 irq event stamp: 33518 hardirqs last enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230 hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72 softirqs last enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50 softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c 0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000 0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2 Call Trace: <IRQ> [<ffffffff81807b4c>] dump_stack+0x4d/0x66 [<ffffffff810d4f14>] __might_sleep+0x184/0x240 [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440 [<ffffffff812d76a0>] kernfs_notify+0x90/0x150 [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240 [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1] [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1] [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1] [<ffffffff813acb8b>] bio_endio+0x6b/0xa0 [<ffffffff813b46c4>] blk_update_request+0x94/0x420 [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70 [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk] [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120 [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30 [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk] [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring] [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340 [<ffffffff8111647d>] handle_irq_event+0x3d/0x60 [<ffffffff81119436>] handle_edge_irq+0x66/0x130 [<ffffffff8101c3e4>] handle_irq+0x84/0x150 [<ffffffff818146ad>] do_IRQ+0x4d/0xe0 [<ffffffff818122f2>] common_interrupt+0x72/0x72 <EOI> [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10 [<ffffffff81025454>] default_idle+0x24/0x230 [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20 [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0 [<ffffffff8104df1b>] start_secondary+0x25b/0x300 This patch fixes it by punting the notification delivery through a work item. This ends up adding an extra pointer to kernfs_elem_attr enlarging kernfs_node by a pointer, which is not ideal but not a very big deal either. If this turns out to be an actual issue, we can move kernfs_elem_attr->size to kernfs_node->iattr later. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NJosh Boyer <jwboyer@fedoraproject.org> Cc: Jens Axboe <axboe@kernel.dk> Reviewed-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 02 7月, 2014 1 次提交
-
-
由 Zhengyu He 提交于
This fixes a typo that named the read_mostly section of percpu as readmostly. It works fine with SMP because the linker script specifies .data..percpu..readmostly. However, UP kernel builds don't have percpu sections defined and the non-percpu version of the section is called data..read_mostly, so .data..readmostly will float around and may break things unexpectedly. Looking at the original change that introduced data..percpu..readmostly (commit c957ef2c), it looks like this was the original intention. Tested: Built UP kernel and confirmed the sections got merged. - Before the patch: $ objdump -h vmlinux.o | grep '\.data\.\.read.*mostly' 38 .data..read_mostly 00004418 0000000000000000 0000000000000000 00431ac0 2**6 50 .data..readmostly 00000014 0000000000000000 0000000000000000 00444000 2**3 - After the patch: $ objdump -h vmlinux.o | grep '\.data\.\.read.*mostly' 38 .data..read_mostly 00004438 0000000000000000 0000000000000000 00431ac0 2**6 Signed-off-by: NZhengyu He <hzy@google.com> Signed-off-by: NFilipe Brandenburger <filbranden@google.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 01 7月, 2014 1 次提交
-
-
由 Alan Stern 提交于
Some buggy JMicron USB-ATA bridges don't know how to translate the FUA bit in READs or WRITEs. This patch adds an entry in unusual_devs.h and a blacklist flag to tell the sd driver not to use FUA. Signed-off-by: NAlan Stern <stern@rowland.harvard.edu> Reported-by: NMichael Büsch <m@bues.ch> Tested-by: NMichael Büsch <m@bues.ch> Acked-by: NJames Bottomley <James.Bottomley@HansenPartnership.com> CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net> CC: <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 30 6月, 2014 1 次提交
-
-
由 Li Zefan 提交于
kernfs_pin_sb() tries to get a refcnt of the superblock. This will be used by cgroupfs. v2: - make kernfs_pin_sb() return the superblock. - drop kernfs_drop_sb(). tj: Updated the comment a bit. [ This is a prerequisite for a bugfix. ] Cc: <stable@vger.kernel.org> # 3.15 Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 28 6月, 2014 1 次提交
-
-
由 Michael S. Tsirkin 提交于
ERROR: "memcpy_fromiovecend" [drivers/vhost/vhost_scsi.ko] undefined! commit 9f977ef7 vhost-scsi: Include prot_bytes into expected data transfer length in target-pending makes drivers/vhost/scsi.c call memcpy_fromiovecend(). This function is not available when CONFIG_NET is not enabled. socket.h already includes uio.h, so no callers need updating. Reported-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
- 27 6月, 2014 1 次提交
-
-
由 Al Viro 提交于
blkdev_read_iter() wants to cap the iov_iter by the amount of data remaining to the end of device. That's what iov_iter_truncate() is for (trim iter->count if it's above the given limit). So far, so good, but the argument of iov_iter_truncate() is size_t, so on 32bit boxen (in case of a large device) we end up with that upper limit truncated down to 32 bits *before* comparing it with iter->count. Easily fixed by making iov_iter_truncate() take 64bit argument - it does the right thing after such change (we only reach the assignment in there when the current value of iter->count is greater than the limit, i.e. for anything that would get truncated we don't reach the assignment at all) and that argument is not the new value of iter->count - it's an upper limit for such. The overhead of passing u64 is not an issue - the thing is inlined, so callers passing size_t won't pay any penalty. Reported-and-tested-by: NTheodore Tso <tytso@mit.edu> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NAlan Cox <gnomes@lxorguk.ukuu.org.uk> Tested-by: NBruno Wolff III <bruno@wolff.to> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 6月, 2014 2 次提交
-
-
由 Jens Axboe 提交于
Another restriction inherited for NVMe - those devices don't support SG lists that have "gaps" in them. Gaps refers to cases where the previous SG entry doesn't end on a page boundary. For NVMe, all SG entries must start at offset 0 (except the first) and end on a page boundary (except the last). Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Gu Zheng 提交于
Macro bip_vec_idx() was used by bio integrity originally, but no longer used now. So remove it. Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 24 6月, 2014 3 次提交
-
-
由 Aaron Tomlin 提交于
A 'softlockup' is defined as a bug that causes the kernel to loop in kernel mode for more than a predefined period to time, without giving other tasks a chance to run. Currently, upon detection of this condition by the per-cpu watchdog task, debug information (including a stack trace) is sent to the system log. On some occasions, we have observed that the "victim" rather than the actual "culprit" (i.e. the owner/holder of the contended resource) is reported to the user. Often this information has proven to be insufficient to assist debugging efforts. To avoid loss of useful debug information, for architectures which support NMI, this patch makes it possible to improve soft lockup reporting. This is accomplished by issuing an NMI to each cpu to obtain a stack trace. If NMI is not supported we just revert back to the old method. A sysctl and boot-time parameter is available to toggle this feature. [dzickus@redhat.com: add CONFIG_SMP in certain areas] [akpm@linux-foundation.org: additional CONFIG_SMP=n optimisations] [mq@suse.cz: fix warning] Signed-off-by: NAaron Tomlin <atomlin@redhat.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: NJan Moskyto Matejka <mq@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Aaron Tomlin 提交于
Sometimes it is preferred not to use the trigger_all_cpu_backtrace() routine when one wants to avoid capturing a back trace for current. For instance if one was previously captured recently. This patch provides a new routine namely trigger_allbutself_cpu_backtrace() which offers the flexibility to issue an NMI to every cpu but current and capture a back trace accordingly. Patch x86 and sparc to support new routine. [dzickus@redhat.com: add stub in #else clause] [dzickus@redhat.com: don't print message in single processor case, wrap with get/put_cpu based on Oleg's suggestion] [sfr@canb.auug.org.au: undo C99ism] Signed-off-by: NAaron Tomlin <atomlin@redhat.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Petr Tesarik 提交于
To allow filtering of huge pages, makedumpfile must be able to identify them in the dump. This can be done by checking the appropriate page flag, so communicate its value to makedumpfile through the VMCOREINFO interface. There's only one small catch. Depending on how many page flags are available on a given architecture, this bit can be called PG_head or PG_compound. I sent a similar patch back in 2012, but Eric Biederman did not like using an #ifdef. So, this time I'm adding a common symbol (PG_head_mask) instead. See https://lkml.org/lkml/2012/11/28/91 for the previous version. Signed-off-by: NPetr Tesarik <ptesarik@suse.cz> Acked-by: NVivek Goyal <vgoyal@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Shaohua Li <shli@kernel.org> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 6月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
This reverts commit b5097e95. The original commit is buggy, we do use the registration functions at runtime, for instance when loading IO schedulers through sysfs. Reported-by: NDamien Wyart <damien.wyart@gmail.com>
-
- 22 6月, 2014 1 次提交
-
-
由 Daniel Mack 提交于
Add a notify callback to inform phy drivers when the core is about to do its link adjustment. No change for drivers that do not implement this callback. Signed-off-by: NDaniel Mack <zonque@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 6月, 2014 2 次提交
-
-
由 Alexander Gordeev 提交于
Fix racy updates of shared blk_mq_bitmap_tags::wake_index and blk_mq_hw_ctx::wake_index fields. Cc: Ming Lei <tom.leiming@gmail.com> Signed-off-by: NAlexander Gordeev <agordeev@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Commit 762380ad inadvertently changed a check for max_sectors to max_hw_sectors. Revert that part, so we still compare against max_sectors. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 17 6月, 2014 1 次提交
-
-
由 Kees Cook 提交于
To support using kernel features that are not compatible with hibernation, this creates the "nohibernate" kernel boot parameter to disable both hibernation and resume. This allows hibernation support to be a boot-time choice instead of only a compile-time choice. Signed-off-by: NKees Cook <keescook@chromium.org> Acked-by: NPavel Machek <pavel@ucw.cz> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 15 6月, 2014 2 次提交
-
-
由 Tom Herbert 提交于
This function is used by UDP encapsulation protocols in RX when crossing encapsulation boundary. If ip_summed is set to CHECKSUM_UNNECESSARY and encapsulation is not set, change to CHECKSUM_NONE since the checksum has not been validated within the encapsulation. Clears csum_valid by the same rationale. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Joseph Gasparakis reported that VXLAN GSO offload stopped working with i40e device after recent UDP changes. The problem is that the SKB_GSO_* bits are out of sync with the corresponding NETIF flags. This patch fixes that. Also, we add BUILD_BUG_ONs in net_gso_ok for several GSO constants that were missing to avoid the problem in the future. Reported-by: NJoseph Gasparakis <joseph.gasparakis@intel.com> Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-