- 01 2月, 2018 1 次提交
-
-
由 Jeff Layton 提交于
As Linus points out: The inode_cmp_iversion{+raw}() functions are pure and utter crap. Why? You say that they return 0/negative/positive, but they do so in a completely broken manner. They return that ternary value as the sequence number difference in a 's64', which means that if you actually care about that ternary value, and do the *sane* thing that the kernel-doc of the function implies is the right thing, you would do int cmp = inode_cmp_iversion(inode, old); if (cmp < 0 ... and as a result you get code that looks sane, but that doesn't actually *WORK* right. Since none of the callers actually care about the ternary value here, convert the inode_cmp_iversion{+raw} functions to just return a boolean value (false for matching, true for non-matching). This matches the existing use of these functions just fine, and makes it simple to convert them to return a ternary value in the future if we grow callers that need it. With this change we can also reimplement inode_cmp_iversion in a simpler way using inode_peek_iversion. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 1月, 2018 3 次提交
-
-
由 Jeff Layton 提交于
Since i_version is mostly treated as an opaque value, we can exploit that fact to avoid incrementing it when no one is watching. With that change, we can avoid incrementing the counter on writes, unless someone has queried for it since it was last incremented. If the a/c/mtime don't change, and the i_version hasn't changed, then there's no need to dirty the inode metadata on a write. Convert the i_version counter to an atomic64_t, and use the lowest order bit to hold a flag that will tell whether anyone has queried the value since it was last incremented. When we go to maybe increment it, we fetch the value and check the flag bit. If it's clear then we don't need to do anything if the update isn't being forced. If we do need to update, then we increment the counter by 2, and clear the flag bit, and then use a CAS op to swap it into place. If that works, we return true. If it doesn't then do it again with the value that we fetch from the CAS operation. On the query side, if the flag is already set, then we just shift the value down by 1 bit and return it. Otherwise, we set the flag in our on-stack value and again use cmpxchg to swap it into place if it hasn't changed. If it has, then we use the value from the cmpxchg as the new "old" value and try again. This method allows us to avoid incrementing the counter on writes (and dirtying the metadata) under typical workloads. We only need to increment if it has been queried since it was last changed. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz> Acked-by: NDave Chinner <dchinner@redhat.com> Tested-by: NKrzysztof Kozlowski <krzk@kernel.org>
-
由 Jeff Layton 提交于
The rationale for taking the i_lock when incrementing this value is lost in antiquity. The readers of the field don't take it (at least not universally), so my assumption is that it was only done here to serialize incrementors. If that is indeed the case, then we can drop the i_lock from this codepath and treat it as a atomic64_t for the purposes of incrementing it. This allows us to use inode_inc_iversion without any danger of lock inversion. Note that the read side is not fetched atomically with this change. The assumption here is that that is not a critical issue since the i_version is not fully synchronized with anything else anyway. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Jeff Layton 提交于
Add a documentation blob that explains what the i_version field is, how it is expected to work, and how it is currently implemented by various filesystems. We already have inode_inc_iversion. Add several other functions for manipulating and accessing the i_version counter. For now, the implementation is trivial and basically works the way that all of the open-coded i_version accesses work today. Future patches will convert existing users of i_version to use the new API, and then convert the backend implementation to do things more efficiently. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 26 1月, 2018 6 次提交
-
-
由 Chunyan Zhang 提交于
In this patch, consumers are allowed to set suspend voltage, and this actually just set the "uV" in constraint::regulator_state, when the regulator_suspend_late() was called by PM core through callback when the system is entering into suspend, the regulator device would act suspend activity then. And it assumes that if any consumer set suspend voltage, the regulator device should be enabled in the suspend state. And if the suspend voltage of a regulator device for all consumers was set zero, the regulator device would be off in the suspend state. This patch also provides a new function hook to regulator devices for resuming from suspend states. Signed-off-by: NChunyan Zhang <zhang.chunyan@linaro.org> Signed-off-by: NMark Brown <broonie@kernel.org>
-
由 Chunyan Zhang 提交于
Regualtor suspend/resume functions should only be called by PM suspend core via registering dev_pm_ops, and regulator devices should implement the callback functions. Thus, any regulator consumer shouldn't call the regulator suspend/resume functions directly. In order to avoid compile errors, two empty functions with the same name still be left for the time being. Signed-off-by: NChunyan Zhang <zhang.chunyan@linaro.org> Signed-off-by: NMark Brown <broonie@kernel.org>
-
由 Chunyan Zhang 提交于
The items "disabled" and "enabled" are a little redundant, since only one of them would be set to record if the regulator device should keep on or be switched to off in suspend states. So in this patch, the "disabled" was removed, only leave the "enabled": - enabled == 1 for regulator-on-in-suspend - enabled == 0 for regulator-off-in-suspend - enabled == -1 means do nothing when entering suspend mode. Signed-off-by: NChunyan Zhang <zhang.chunyan@linaro.org> Signed-off-by: NMark Brown <broonie@kernel.org>
-
由 Andi Kleen 提交于
There's a risk that a kernel which has full retpoline mitigations becomes vulnerable when a module gets loaded that hasn't been compiled with the right compiler or the right option. To enable detection of that mismatch at module load time, add a module info string "retpoline" at build time when the module was compiled with retpoline support. This only covers compiled C source, but assembler source or prebuilt object files are not checked. If a retpoline enabled kernel detects a non retpoline protected module at load time, print a warning and report it in the sysfs vulnerability file. [ tglx: Massaged changelog ] Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: jeyu@kernel.org Cc: arjan@linux.intel.com Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org
-
由 Chao Yu 提交于
This patch adds creation time field in inode layout to support showing kstat.btime in ->statx. Signed-off-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Nicolas Dichtel 提交于
Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to: "BUG: unable to handle kernel NULL pointer dereference at (null)" Let's add a helper to check if update_pmtu is available before calling it. Fixes: 52a589d5 ("geneve: update skb dst pmtu on tx path") Fixes: a93bf0ff ("vxlan: update skb dst pmtu on tx path") CC: Roman Kapl <code@rkapl.cz> CC: Xin Long <lucien.xin@gmail.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 1月, 2018 3 次提交
-
-
由 Dan Streetman 提交于
When a tcp socket is closed, if it detects that its net namespace is exiting, close immediately and do not wait for FIN sequence. For normal sockets, a reference is taken to their net namespace, so it will never exit while the socket is open. However, kernel sockets do not take a reference to their net namespace, so it may begin exiting while the kernel socket is still open. In this case if the kernel socket is a tcp socket, it will stay open trying to complete its close sequence. The sock's dst(s) hold a reference to their interface, which are all transferred to the namespace's loopback interface when the real interfaces are taken down. When the namespace tries to take down its loopback interface, it hangs waiting for all references to the loopback interface to release, which results in messages like: unregister_netdevice: waiting for lo to become free. Usage count = 1 These messages continue until the socket finally times out and closes. Since the net namespace cleanup holds the net_mutex while calling its registered pernet callbacks, any new net namespace initialization is blocked until the current net namespace finishes exiting. After this change, the tcp socket notices the exiting net namespace, and closes immediately, releasing its dst(s) and their reference to the loopback interface, which lets the net namespace continue exiting. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811Signed-off-by: NDan Streetman <ddstreet@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wolfgang Bumiller 提交于
TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as skb->data points to the network header. Use skb_mac_header instead. Signed-off-by: NWolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Greg Kroah-Hartman 提交于
This reverts commit 6cfb521a. Turns out distros do not want to make retpoline as part of their "ABI", so this patch should not have been merged. Sorry Andi, this was my fault, I suggested it when your original patch was the "correct" way of doing this instead. Reported-by: NJiri Kosina <jikos@kernel.org> Fixes: 6cfb521a ("module: Add retpoline tag to VERMAGIC") Acked-by: NAndi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: rusty@rustcorp.com.au Cc: arjan.van.de.ven@intel.com Cc: jeyu@kernel.org Cc: stable <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 1月, 2018 5 次提交
-
-
由 Peter Zijlstra 提交于
Tejun reported the following cpu-hotplug lock (percpu-rwsem) read recursion: tg_set_cfs_bandwidth() get_online_cpus() cpus_read_lock() cfs_bandwidth_usage_inc() static_key_slow_inc() cpus_read_lock() Reported-by: NTejun Heo <tj@kernel.org> Tested-by: NTejun Heo <tj@kernel.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180122215328.GP3397@worktopSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ben Hutchings 提交于
Commit 513674b5 ("net: reevalulate autoflowlabel setting after sysctl setting") removed the initialisation of ipv6_pinfo::autoflowlabel and added a second flag to indicate whether this field or the net namespace default should be used. The getsockopt() handling for this case was not updated, so it currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is not explicitly enabled. Fix it to return the effective value, whether that has been set at the socket or net namespace level. Fixes: 513674b5 ("net: reevalulate autoflowlabel setting after sysctl ...") Signed-off-by: NBen Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steven Rostedt (VMware) 提交于
The function tracer can create a dynamically allocated trampoline that is called by the function mcount or fentry hook that is used to call the function callback that is registered. The problem is that the orc undwinder will bail if it encounters one of these trampolines. This breaks the stack trace of function callbacks, which include the stack tracer and setting the stack trace for individual functions. Since these dynamic trampolines are basically copies of the static ftrace trampolines defined in ftrace_*.S, we do not need to create new orc entries for the dynamic trampolines. Finding the return address on the stack will be identical as the functions that were copied to create the dynamic trampolines. When encountering a ftrace dynamic trampoline, we can just use the orc entry of the ftrace static function that was copied for that trampoline. Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Benjamin Coddington 提交于
Backchannel tasks will not have a reference to the rpc_clnt. Return -1 for cl_clid in that case. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NTrond Myklebust <trondmy@gmail.com>
-
由 Eric W. Biederman 提交于
Today 4 architectures set ARCH_SUPPORTS_MEMORY_FAILURE (arm64, parisc, powerpc, and x86), while 4 other architectures set __ARCH_SI_TRAPNO (alpha, metag, sparc, and tile). These two sets of architectures do not interesect so remove the trapno paramater to remove confusion. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 23 1月, 2018 21 次提交
-
-
由 Chuck Lever 提交于
Clean up: Make it easier to use text search when browsing a trace report. Other events use "status=%d". Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Track RPC timeouts: report the XID and the server address to match the content of network capture. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
This includes decoding Write and Reply chunks, and fixing up inline payloads. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
These can be shared with all kernel ULPs, and more can easily be added as needed. Note: checkpatch.pl has some heartburn with the TRACE_DEFINE_ENUM macros and the LIST macros. These follow the same style as other header files under include/tracing/events , thus should be considered acceptable exceptions. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Bob Peterson 提交于
This patch just adds the capability for GFS2 to track which function called gfs2_log_flush. This should make it easier to diagnose problems based on the sequence of events found in the journals. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Reviewed-by: NAndreas Gruenbacher <agruenba@redhat.com>
-
由 Bob Peterson 提交于
This patch adds a new structure called gfs2_log_header_v2 which is used to store expanded fields into previously unused areas of the log headers (i.e., this change is backwards compatible). Some of these are used for debug purposes so we can backtrack when problems occur. Others are reserved for future expansion. This patch is based on a prototype from Steve Whitehouse. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
-
由 Eric W. Biederman 提交于
There are so many places that build struct siginfo by hand that at least one of them is bound to get it wrong. A handful of cases in the kernel arguably did just that when using the errno field of siginfo to pass no errno values to userspace. The usage is limited to a single si_code so at least does not mess up anything else. Encapsulate this questionable pattern in a helper function so that the userspace ABI is preserved. Update all of the places that use this pattern to use the new helper function. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Eric W. Biederman 提交于
The helpers added are: send_sig_mceerr force_sig_mceerr force_sig_bnderr force_sig_pkuerr Filling out siginfo properly can ge tricky. Especially for these specialized cases where the temptation is to share code with other cases which use a different subset of siginfo fields. Unfortunately that code sharing frequently results in bugs with the wrong siginfo fields filled in, and makes it harder to verify that the siginfo structure was properly initialized. Provide these helpers instead that get all of the details right, and guarantee that siginfo is properly initialized. send_sig_mceerr and force_sig_mceer are a little special as two si codes BUS_MCEERR_AO and BUS_MCEER_AR both use the same extended signinfo layout. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Eric W. Biederman 提交于
The vast majority of signals sent from architecture specific code are simple faults. Encapsulate this reality with two helper functions so that the nit-picky implementation of preparing a siginfo does not need to be repeated many times on each architecture. As only some architectures support the trapno field, make the trapno arguement only present on those architectures. Similary as ia64 has three fields: imm, flags, and isr that are specific to it. Have those arguments always present on ia64 and no where else. This ensures the architecture specific code always remembers which fields it needs to pass into the siginfo structure. Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
由 Jaegeuk Kim 提交于
If fsck.f2fs changes crc, we have no way to recover some inode blocks by roll- forward recovery. Let's relax the condition to recover them. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
由 Jaegeuk Kim 提交于
This patch gives a flag to disable GC on given file, which would be useful, when user wants to keep its block map. It also conducts in-place-update for dontmove file. Reviewed-by: NChao Yu <yuchao0@huawei.com> Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
-
- 22 1月, 2018 1 次提交
-
-
由 Anand Jain 提交于
The UUID change by btrfstune sets SUPER_FLAG_CHANGING_FSID and resets it only when changing fsid is complete. Its not a good idea to mount the device anything in between, reading metadata blocks would fail with UUID mismatch. This patch doesn't add SUPER_FLAG_CHANGING_FSID into BTRFS_SUPER_FLAG_SUPP list, so mount will fail (along with the fix in the next patch) when SUPER_FLAG_CHANGING_FSID is set. Signed-off-by: NAnand Jain <anand.jain@oracle.com> Reviewed-by: NQu Wenruo <wqu@suse.com> Reviewed-by: NDavid Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: NDavid Sterba <dsterba@suse.com>
-