- 07 8月, 2020 1 次提交
-
-
由 Yonghong Song 提交于
Commit a5cbe05a ("bpf: Implement bpf iterator for map elements") added bpf iterator support for map elements. The map element bpf iterator requires info to identify a particular map. In the above commit, the attr->link_create.target_fd is used to carry map_fd and an enum bpf_iter_link_info is added to uapi to specify the target_fd actually representing a map_fd: enum bpf_iter_link_info { BPF_ITER_LINK_UNSPEC = 0, BPF_ITER_LINK_MAP_FD = 1, MAX_BPF_ITER_LINK_INFO, }; This is an extensible approach as we can grow enumerator for pid, cgroup_id, etc. and we can unionize target_fd for pid, cgroup_id, etc. But in the future, there are chances that more complex customization may happen, e.g., for tasks, it could be filtered based on both cgroup_id and user_id. This patch changed the uapi to have fields __aligned_u64 iter_info; __u32 iter_info_len; for additional iter_info for link_create. The iter_info is defined as union bpf_iter_link_info { struct { __u32 map_fd; } map; }; So future extension for additional customization will be easier. The bpf_iter_link_info will be passed to target callback to validate and generic bpf_iter framework does not need to deal it any more. Note that map_fd = 0 will be considered invalid and -EBADF will be returned to user space. Fixes: a5cbe05a ("bpf: Implement bpf iterator for map elements") Signed-off-by: NYonghong Song <yhs@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NAndrii Nakryiko <andriin@fb.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200805055056.1457463-1-yhs@fb.com
-
- 05 8月, 2020 1 次提交
-
-
由 Stefano Brivio 提交于
Currently, processes sending traffic to a local bridge with an encapsulation device as a port don't get ICMP errors if they exceed the PMTU of the encapsulated link. David Ahern suggested this as a hack, but it actually looks like the correct solution: when we update the PMTU for a given destination by means of updating or creating a route exception, the encapsulation might trigger this because of PMTU discovery happening either on the encapsulation device itself, or its lower layer. This happens on bridged encapsulations only. The output interface shouldn't matter, because we already have a valid destination. Drop the output interface restriction from the associated route lookup. For UDP tunnels, we will now have a route exception created for the encapsulation itself, with a MTU value reflecting its headroom, which allows a bridge forwarding IP packets originated locally to deliver errors back to the sending socket. The behaviour is now consistent with IPv6 and verified with selftests pmtu_ipv{4,6}_br_{geneve,vxlan}{4,6}_exception introduced later in this series. v2: - reset output interface only for bridge ports (David Ahern) - add and use netif_is_any_bridge_port() helper (David Ahern) Suggested-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NStefano Brivio <sbrivio@redhat.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 8月, 2020 6 次提交
-
-
由 Linus Torvalds 提交于
The addition of percpu.h to the list of includes in random.h revealed some circular dependencies on arm64 and possibly other platforms. This include was added solely for the pseudo-random definitions, which have nothing to do with the rest of the definitions in this file but are still there for legacy reasons. This patch moves the pseudo-random parts to linux/prandom.h and the percpu.h include with it, which is now guarded by _LINUX_PRANDOM_H and protected against recursive inclusion. A further cleanup step would be to remove this from <linux/random.h> entirely, and make people who use the prandom infrastructure include just the new header file. That's a bit of a churn patch, but grepping for "prandom_" and "next_pseudo_random32" "struct rnd_state" should catch most users. But it turns out that that nice cleanup step is fairly painful, because a _lot_ of code currently seems to depend on the implicit include of <linux/random.h>, which can currently come in a lot of ways, including such fairly core headfers as <linux/net.h>. So the "nice cleanup" part may or may never happen. Fixes: 1c9df907 ("random: fix circular include dependency on arm64 after addition of percpu.h") Tested-by: NGuenter Roeck <linux@roeck-us.net> Acked-by: NWilly Tarreau <w@1wt.eu> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Florian Fainelli 提交于
For now we simply store the port MTU into a per-port member. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
In preparation for adding support for a mockup data path, move the driver data structures to include/linux/dsa/loop.h such that we can share them between net/dsa/ and drivers/net/dsa/ later on. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ahmad Fatoum 提交于
Commit 959bc7b2 ("gpio: Automatically add lockdep keys") documents in its commits message its intention to "create a unique class key for each driver". It does so by having gpiochip_add_data add in-place the definition of two static lockdep classes for LOCKDEP use. That way, every caller of the macro adds their gpiochip with unique lockdep classes. There are many indirect callers of gpiochip_add_data, however, via use of devm_gpiochip_add_data. devm_gpiochip_add_data has external linkage and all its users will share the same lockdep classes, which probably is not intended. Fix this by replicating the gpio_chip_add_data statics-in-macro for the devm_ version as well. Fixes: 959bc7b2 ("gpio: Automatically add lockdep keys") Signed-off-by: NAhmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: NBartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/r/20200731123835.8003-1-a.fatoum@pengutronix.deSigned-off-by: NLinus Walleij <linus.walleij@linaro.org>
-
由 wenxu 提交于
When openvswitch conntrack offload with act_ct action. Fragment packets defrag in the ingress tc act_ct action and miss the next chain. Then the packet pass to the openvswitch datapath without the mru. The over mtu packet will be dropped in output action in openvswitch for over mtu. "kernel: net2: dropped over-mtu packet: 1528 > 1500" This patch add mru in the tc_skb_ext for adefrag and miss next chain situation. And also add mru in the qdisc_skb_cb. The act_ct set the mru to the qdisc_skb_cb when the packet defrag. And When the chain miss, The mru is set to tc_skb_ext which can be got by ovs datapath. Fixes: b57dc7c1 ("net/sched: Introduce action ct") Signed-off-by: Nwenxu <wenxu@ucloud.cn> Reviewed-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bruno Thomsen 提交于
Load new "reset-post-delay-us" value from MDIO properties, and if configured to a greater then zero delay do a flexible sleeping delay after MDIO bus reset deassert. This allows devices to exit reset state before start bus communication. Signed-off-by: NBruno Thomsen <bruno.thomsen@gmail.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 8月, 2020 2 次提交
-
-
由 Jouni Malinen 提交于
SAE authentication has been extended with H2E (IEEE 802.11 REVmd) and PK (WFA) options. Those extensions use special status code values in the SAE commit messages (Authentication frame with transaction sequence number 1) to identify which extension is in use. mac80211 was interpreting those new values as the AP denying authentication and that resulted in failure to complete SAE authentication in some cases. Fix this by adding exceptions for the new status code values 126 and 127. Signed-off-by: NJouni Malinen <jouni@codeaurora.org> Link: https://lore.kernel.org/r/20200731183830.18735-1-jouni@codeaurora.orgSigned-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Linus Torvalds 提交于
That gives us ordering guarantees around the pair. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 8月, 2020 3 次提交
-
-
由 Ajay Singh 提交于
Moved macros used for Vendor/Device ID from wilc1000 driver to common header file and changed macro name for consistency with other macros. Signed-off-by: NAjay Singh <ajay.kathat@microchip.com> Acked-by: NUlf Hansson <ulf.hansson@linaro.org> Acked-by: NPali Rohár <pali@kernel.org> Signed-off-by: NKalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200717051134.19160-1-ajay.kathat@microchip.com
-
由 Andrii Nakryiko 提交于
Add LINK_DETACH command to force-detach bpf_link without destroying it. It has the same behavior as auto-detaching of bpf_link due to cgroup dying for bpf_cgroup_link or net_device being destroyed for bpf_xdp_link. In such case, bpf_link is still a valid kernel object, but is defuncts and doesn't hold BPF program attached to corresponding BPF hook. This functionality allows users with enough access rights to manually force-detach attached bpf_link without killing respective owner process. This patch implements LINK_DETACH for cgroup, xdp, and netns links, mostly re-using existing link release handling code. Signed-off-by: NAndrii Nakryiko <andriin@fb.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Acked-by: NSong Liu <songliubraving@fb.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200731182830.286260-2-andriin@fb.com
-
由 Pavel Begunkov 提交于
Use a local var to collect flags in kiocb_set_rw_flags(). That spares some memory writes and allows to replace most of the jumps with MOVEcc. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 8月, 2020 3 次提交
-
-
由 Valentin Schneider 提交于
Rather that hide their purpose in some dark, damp corner of Documentation/, add some documentation to the default implementations. Signed-off-by: NValentin Schneider <valentin.schneider@arm.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200731192016.7484-2-valentin.schneider@arm.com
-
由 Roopa Prabhu 提交于
netdev protodown is a mechanism that allows protocols to hold an interface down. It was initially introduced in the kernel to hold links down by a multihoming protocol. There was also an attempt to introduce protodown reason at the time but was rejected. protodown and protodown reason is supported by almost every switching and routing platform. It was ok for a while to live without a protodown reason. But, its become more critical now given more than one protocol may need to keep a link down on a system at the same time. eg: vrrp peer node, port security, multihoming protocol. Its common for Network operators and protocol developers to look for such a reason on a networking box (Its also known as errDisable by most networking operators) This patch adds support for link protodown reason attribute. There are two ways to maintain protodown reasons. (a) enumerate every possible reason code in kernel - A protocol developer has to make a request and have that appear in a certain kernel version (b) provide the bits in the kernel, and allow user-space (sysadmin or NOS distributions) to manage the bit-to-reasonname map. - This makes extending reason codes easier (kind of like the iproute2 table to vrf-name map /etc/iproute2/rt_tables.d/) This patch takes approach (b). a few things about the patch: - It treats the protodown reason bits as counter to indicate active protodown users - Since protodown attribute is already an exposed UAPI, the reason is not enforced on a protodown set. Its a no-op if not used. the patch follows the below algorithm: - presence of reason bits set indicates protodown is in use - user can set protodown and protodown reason in a single or multiple setlink operations - setlink operation to clear protodown, will return -EBUSY if there are active protodown reason bits - reason is not included in link dumps if not used example with patched iproute2: $cat /etc/iproute2/protodown_reasons.d/r.conf 0 mlag 1 evpn 2 vrrp 3 psecurity $ip link set dev vxlan0 protodown on protodown_reason vrrp on $ip link set dev vxlan0 protodown_reason mlag on $ip link show 14: vxlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether f6:06:be:17:91:e7 brd ff:ff:ff:ff:ff:ff protodown on <mlag,vrrp> $ip link set dev vxlan0 protodown_reason mlag off $ip link set dev vxlan0 protodown off protodown_reason vrrp off Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> -
由 Yousuk Seung 提交于
This change adds TCP_NLA_EDT to SCM_TIMESTAMPING_OPT_STATS that reports the earliest departure time(EDT) of the timestamped skb. By tracking EDT values of the skb from different timestamps, we can observe when and how much the value changed. This allows to measure the precise delay injected on the sender host e.g. by a bpf-base throttler. Signed-off-by: NYousuk Seung <ysseung@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 7月, 2020 7 次提交
-
-
由 Marco Elver 提交于
To improve the general usefulness of the IRQ state trace events with KCSAN enabled, save and restore the trace information when entering and exiting the KCSAN runtime as well as when generating a KCSAN report. Without this, reporting the IRQ trace events (whether via a KCSAN report or outside of KCSAN via a lockdep report) is rather useless due to continuously being touched by KCSAN. This is because if KCSAN is enabled, every instrumented memory access causes changes to IRQ trace events (either by KCSAN disabling/enabling interrupts or taking report_lock when generating a report). Before "lockdep: Prepare for NMI IRQ state tracking", KCSAN avoided touching the IRQ trace events via raw_local_irq_save/restore() and lockdep_off/on(). Fixes: 248591f5 ("kcsan: Make KCSAN compatible with new IRQ state tracking") Signed-off-by: NMarco Elver <elver@google.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200729110916.3920464-2-elver@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Marco Elver 提交于
Refactor the IRQ trace events fields, used for printing information about the IRQ trace events, into a separate struct 'irqtrace_events'. This improves readability by separating the information only used in reporting, as well as enables (simplified) storing/restoring of irqtrace_events snapshots. No functional change intended. Signed-off-by: NMarco Elver <elver@google.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200729110916.3920464-1-elver@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Nick Terrell 提交于
- Add unzstd() and the zstd decompress interface. - Add zstd support to decompress_method(). The decompress_method() and unzstd() functions are used to decompress the initramfs and the initrd. The __decompress() function is used in the preboot environment to decompress a zstd compressed kernel. The zstd decompression function allows the input and output buffers to overlap because that is used by x86 kernel decompression. Signed-off-by: NNick Terrell <terrelln@fb.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Tested-by: NSedat Dilek <sedat.dilek@gmail.com> Reviewed-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20200730190841.2071656-3-nickrterrell@gmail.com
-
由 Marcelo Henrique Cerri 提交于
Add mpi_sub_ui() based on Gnu MP mpz_sub_ui() function from file mpz/aors_ui.h[1] from change id 510b83519d1c adapting the code to the kernel's data structures, helper functions and coding style and also removing the defines used to produce mpz_sub_ui() and mpz_add_ui() from the same code. [1] https://gmplib.org/repo/gmp-6.2/file/510b83519d1c/mpz/aors.hSigned-off-by: NMarcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: NStephan Mueller <smueller@chronox.de> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Romain Perier 提交于
Nowadays, modern kernel subsystems that use callbacks pass the data structure associated with a given callback as argument to the callback. The tasklet subsystem remains one which passes an arbitrary unsigned long to the callback function. This has several problems: - This keeps an extra field for storing the argument in each tasklet data structure, it bloats the tasklet_struct structure with a redundant .data field - No type checking can be performed on this argument. Instead of using container_of() like other callback subsystems, it forces callbacks to do explicit type cast of the unsigned long argument into the required object type. - Buffer overflows can overwrite the .func and the .data field, so an attacker can easily overwrite the function and its first argument to whatever it wants. Add a new tasklet initialization API, via DECLARE_TASKLET() and tasklet_setup(), which will replace the existing ones. This work is greatly inspired by the timer_struct conversion series, see commit e99e88a9 ("treewide: setup_timer() -> timer_setup()") To avoid problems with both -Wcast-function-type (which is enabled in the kernel via -Wextra is several subsystems), and with mismatched function prototypes when build with Control Flow Integrity enabled, this adds the "use_callback" member to let the tasklet caller choose which union member to call through. Once all old API uses are removed, this and the .data member will be removed as well. (On 64-bit this does not grow the struct size as the new member fills the hole after atomic_t, which is also "int" sized.) Signed-off-by: NRomain Perier <romain.perier@gmail.com> Co-developed-by: NAllen Pais <allen.lkml@gmail.com> Signed-off-by: NAllen Pais <allen.lkml@gmail.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Co-developed-by: NKees Cook <keescook@chromium.org> Signed-off-by: NKees Cook <keescook@chromium.org>
-
由 Kees Cook 提交于
This converts all the existing DECLARE_TASKLET() (and ...DISABLED) macros with DECLARE_TASKLET_OLD() in preparation for refactoring the tasklet callback type. All existing DECLARE_TASKLET() users had a "0" data argument, it has been removed here as well. Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NKees Cook <keescook@chromium.org>
-
由 Willy Tarreau 提交于
Daniel Díaz and Kees Cook independently reported that commit f227e3ec ("random32: update the net random state on interrupt and activity") broke arm64 due to a circular dependency on include files since the addition of percpu.h in random.h. The correct fix would definitely be to move all the prandom32 stuff out of random.h but for backporting, a smaller solution is preferred. This one replaces linux/percpu.h with asm/percpu.h, and this fixes the problem on x86_64, arm64, arm, and mips. Note that moving percpu.h around didn't change anything and that removing it entirely broke differently. When backporting, such options might still be considered if this patch fails to help. [ It turns out that an alternate fix seems to be to just remove the troublesome <asm/pointer_auth.h> remove from the arm64 <asm/smp.h> that causes the circular dependency. But we might as well do the whole belt-and-suspenders thing, and minimize inclusion in <linux/random.h> too. Either will fix the problem, and both are good changes. - Linus ] Reported-by: NDaniel Díaz <daniel.diaz@linaro.org> Reported-by: NKees Cook <keescook@chromium.org> Tested-by: NMarc Zyngier <maz@kernel.org> Fixes: f227e3ec Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NWilly Tarreau <w@1wt.eu> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 7月, 2020 5 次提交
-
-
由 Chanwoo Choi 提交于
Until now, the devfreq driver using polling mode like simple_ondemand governor have used only deferrable timer for reducing the redundant power consumption. It reduces the CPU wake-up from idle due to polling mode which check the status of Non-CPU device. But, it has a problem for Non-CPU device like DMC device with DMA operation. Some Non-CPU device need to do monitor continuously regardless of CPU state in order to decide the proper next status of Non-CPU device. So, add support the delayed timer for polling mode to support the repetitive monitoring. The devfreq driver and user can select the kind of timer on either deferrable and delayed timer. For example, change the timer type of DMC device based on Exynos5422-based Odroid-XU3 as following: - If want to use deferrable timer as following: echo deferrable > /sys/class/devfreq/10c20000.memory-controller/timer - If want to use delayed timer as following: echo delayed > /sys/class/devfreq/10c20000.memory-controller/timer Reviewed-by: NBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: NLukasz Luba <lukasz.luba@arm.com> Signed-off-by: NChanwoo Choi <cw00.choi@samsung.com>
-
由 Andrzej Hajda 提交于
During probe every time driver gets resource it should usually check for error printk some message if it is not -EPROBE_DEFER and return the error. This pattern is simple but requires adding few lines after any resource acquisition code, as a result it is often omitted or implemented only partially. dev_err_probe helps to replace such code sequences with simple call, so code: if (err != -EPROBE_DEFER) dev_err(dev, ...); return err; becomes: return dev_err_probe(dev, err, ...); Signed-off-by: NAndrzej Hajda <a.hajda@samsung.com> Reviewed-by: NRafael J. Wysocki <rafael@kernel.org> Reviewed-by: NMark Brown <broonie@kernel.org> Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20200713144324.23654-2-a.hajda@samsung.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Linus Torvalds 提交于
It turns out that the plugin right now ends up being really unhappy about the change from 'static' to 'extern' storage that happened in commit f227e3ec ("random32: update the net random state on interrupt and activity"). This is probably a trivial fix for the latent_entropy plugin, but for now, just remove net_rand_state from the list of things the plugin worries about. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Cc: Emese Revfy <re.emese@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Willy Tarreau 提交于
This modifies the first 32 bits out of the 128 bits of a random CPU's net_rand_state on interrupt or CPU activity to complicate remote observations that could lead to guessing the network RNG's internal state. Note that depending on some network devices' interrupt rate moderation or binding, this re-seeding might happen on every packet or even almost never. In addition, with NOHZ some CPUs might not even get timer interrupts, leaving their local state rarely updated, while they are running networked processes making use of the random state. For this reason, we also perform this update in update_process_times() in order to at least update the state when there is user or system activity, since it's the only case we care about. Reported-by: NAmit Klein <aksecurity@gmail.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: NWilly Tarreau <w@1wt.eu> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Neal Liu 提交于
Control Flow Integrity(CFI) is a security mechanism that disallows changes to the original control flow graph of a compiled binary, making it significantly harder to perform such attacks. init_state_node() assign same function callback to different function pointer declarations. static int init_state_node(struct cpuidle_state *idle_state, const struct of_device_id *matches, struct device_node *state_node) { ... idle_state->enter = match_id->data; ... idle_state->enter_s2idle = match_id->data; } Function declarations: struct cpuidle_state { ... int (*enter) (struct cpuidle_device *dev, struct cpuidle_driver *drv, int index); void (*enter_s2idle) (struct cpuidle_device *dev, struct cpuidle_driver *drv, int index); }; In this case, either enter() or enter_s2idle() would cause CFI check failed since they use same callee. Align function prototype of enter() since it needs return value for some use cases. The return value of enter_s2idle() is no need currently. Signed-off-by: NNeal Liu <neal.liu@mediatek.com> Reviewed-by: NSami Tolvanen <samitolvanen@google.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 29 7月, 2020 12 次提交
-
-
由 Srinivas Kandagatla 提交于
For nvmem providers which have multiple instances, it is required to suffix the provider name with proper id, so that they do not confict for the same name. Currently the core does not handle this case properly eventhough core already has logic to generate the id. This patch add new devid type NVMEM_DEVID_AUTO for providers to be able to allow core to assign id and append it to provier name. Reported-by: NShawn Guo <shawn.guo@linaro.org> Signed-off-by: NSrinivas Kandagatla <srinivas.kandagatla@linaro.org> Tested-by: NShawn Guo <shawn.guo@linaro.org> Link: https://lore.kernel.org/r/20200722100705.7772-8-srinivas.kandagatla@linaro.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Andreas Färber 提交于
Complement the u16, u32 and u64 helpers with a u8 variant to ease accessing byte-sized values. This helper will be useful for Realtek Digital Home Center platforms, which store some byte and sub-byte sized values in non-volatile memory. Signed-off-by: NAndreas Färber <afaerber@suse.de> Signed-off-by: NSrinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20200722100705.7772-7-srinivas.kandagatla@linaro.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Ahmed S. Darwish 提交于
Preemption must be disabled before entering a sequence count write side critical section. Failing to do so, the seqcount read side can preempt the write side section and spin for the entire scheduler tick. If that reader belongs to a real-time scheduling class, it can spin forever and the kernel will livelock. Assert through lockdep that preemption is disabled for seqcount writers. Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-9-a.darwish@linutronix.de
-
由 Ahmed S. Darwish 提交于
Asserting that preemption is enabled or disabled is a critical sanity check. Developers are usually reluctant to add such a check in a fastpath as reading the preemption count can be costly. Extend the lockdep API with macros asserting that preemption is disabled or enabled. If lockdep is disabled, or if the underlying architecture does not support kernel preemption, this assert has no runtime overhead. References: f54bb2ec ("locking/lockdep: Add IRQs disabled/enabled assertion APIs: ...") Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-8-a.darwish@linutronix.de
-
由 Ahmed S. Darwish 提交于
raw_seqcount_begin() has the same code as raw_read_seqcount(), with the exception of masking the sequence counter's LSB before returning it to the caller. Note, raw_seqcount_begin() masks the counter's LSB before returning it to the caller so that read_seqcount_retry() can fail if the counter is odd -- without the overhead of an extra branching instruction. Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-7-a.darwish@linutronix.de
-
由 Ahmed S. Darwish 提交于
seqlock.h is now included by kernel's RST documentation, but a small number of the the exported seqlock.h functions are kernel-doc annotated. Add kernel-doc for all seqlock.h exported APIs. Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-6-a.darwish@linutronix.de
-
由 Ahmed S. Darwish 提交于
The seqlock.h seqcount_t and seqlock_t API definitions are presented in the chronological order of their development rather than the order that makes most sense to readers. This makes it hard to follow and understand the header file code. Group and reorder all of the exported seqlock.h functions according to their function. First, group together the seqcount_t standard read path functions: - __read_seqcount_begin() - raw_read_seqcount_begin() - read_seqcount_begin() since each function is implemented exactly in terms of the one above it. Then, group the special-case seqcount_t readers on their own as: - raw_read_seqcount() - raw_seqcount_begin() since the only difference between the two functions is that the second one masks the sequence counter LSB while the first one does not. Note that raw_seqcount_begin() can actually be implemented in terms of raw_read_seqcount(), which will be done in a follow-up commit. Then, group the seqcount_t write path functions, instead of injecting unrelated seqcount_t latch functions between them, and order them as: - raw_write_seqcount_begin() - raw_write_seqcount_end() - write_seqcount_begin_nested() - write_seqcount_begin() - write_seqcount_end() - raw_write_seqcount_barrier() - write_seqcount_invalidate() which is the expected natural order. This also isolates the seqcount_t latch functions into their own area, at the end of the sequence counters section, and before jumping to the next one: sequential locks (seqlock_t). Do a similar grouping and reordering for seqlock_t "locking" readers vs. the "conditionally locking or lockless" ones. No implementation code was changed in any of the reordering above. Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-5-a.darwish@linutronix.de -
由 Ahmed S. Darwish 提交于
The seqcount_t latch reader example at the raw_write_seqcount_latch() kernel-doc comment ends the latch read section with a manual smp memory barrier and sequence counter comparison. This is technically correct, but it is suboptimal: read_seqcount_retry() already contains the same logic of an smp memory barrier and sequence counter comparison. End the latch read critical section example with read_seqcount_retry(). Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-4-a.darwish@linutronix.de
-
由 Ahmed S. Darwish 提交于
Align the code samples and note sections inside kernel-doc comments with tabs. This way they can be properly parsed and rendered by Sphinx. It also makes the code samples easier to read from text editors. Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-3-a.darwish@linutronix.de
-
由 Ahmed S. Darwish 提交于
Proper documentation for the design and usage of sequence counters and sequential locks does not exist. Complete the seqlock.h documentation as follows: - Divide all documentation on a seqcount_t vs. seqlock_t basis. The description for both mechanisms was intermingled, which is incorrect since the usage constrains for each type are vastly different. - Add an introductory paragraph describing the internal design of, and rationale for, sequence counters. - Document seqcount_t writer non-preemptibility requirement, which was not previously documented anywhere, and provide a clear rationale. - Provide template code for seqcount_t and seqlock_t initialization and reader/writer critical sections. - Recommend using seqlock_t by default. It implicitly handles the serialization and non-preemptibility requirements of writers. At seqlock.h: - Remove references to brlocks as they've long been removed from the kernel. - Remove references to gcc-3.x since the kernel's minimum supported gcc version is 4.9. References: 0f6ed63b ("no need to keep brlock macros anymore...") References: 6ec4476a ("Raise gcc version requirement to 4.9") Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200720155530.1173732-2-a.darwish@linutronix.de -
由 Herbert Xu 提交于
This patch moves ATOMIC_INIT from asm/atomic.h into linux/types.h. This allows users of atomic_t to use ATOMIC_INIT without having to include atomic.h as that way may lead to header loops. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NWaiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20200729123105.GB7047@gondor.apana.org.au
-
由 Qais Yousef 提交于
RT tasks by default run at the highest capacity/performance level. When uclamp is selected this default behavior is retained by enforcing the requested uclamp.min (p->uclamp_req[UCLAMP_MIN]) of the RT tasks to be uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum value. This is also referred to as 'the default boost value of RT tasks'. See commit 1a00d999 ("sched/uclamp: Set default clamps for RT tasks"). On battery powered devices, it is desired to control this default (currently hardcoded) behavior at runtime to reduce energy consumed by RT tasks. For example, a mobile device manufacturer where big.LITTLE architecture is dominant, the performance of the little cores varies across SoCs, and on high end ones the big cores could be too power hungry. Given the diversity of SoCs, the new knob allows manufactures to tune the best performance/power for RT tasks for the particular hardware they run on. They could opt to further tune the value when the user selects a different power saving mode or when the device is actively charging. The runtime aspect of it further helps in creating a single kernel image that can be run on multiple devices that require different tuning. Keep in mind that a lot of RT tasks in the system are created by the kernel. On Android for instance I can see over 50 RT tasks, only a handful of which created by the Android framework. To control the default behavior globally by system admins and device integrator, introduce the new sysctl_sched_uclamp_util_min_rt_default to change the default boost value of the RT tasks. I anticipate this to be mostly in the form of modifying the init script of a particular device. To avoid polluting the fast path with unnecessary code, the approach taken is to synchronously do the update by traversing all the existing tasks in the system. This could race with a concurrent fork(), which is dealt with by introducing sched_post_fork() function which will ensure the racy fork will get the right update applied. Tested on Juno-r2 in combination with the RT capacity awareness [1]. By default an RT task will go to the highest capacity CPU and run at the maximum frequency, which is particularly energy inefficient on high end mobile devices because the biggest core[s] are 'huge' and power hungry. With this patch the RT task can be controlled to run anywhere by default, and doesn't cause the frequency to be maximum all the time. Yet any task that really needs to be boosted can easily escape this default behavior by modifying its requested uclamp.min value (p->uclamp_req[UCLAMP_MIN]) via sched_setattr() syscall. [1] 804d402f: ("sched/rt: Make RT capacity-aware") Signed-off-by: NQais Yousef <qais.yousef@arm.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200716110347.19553-2-qais.yousef@arm.com
-