- 30 8月, 2018 11 次提交
-
-
由 Peng Li 提交于
If there are packets in hardware when changing the speed or duplex, it may cause hardware hang up. This patch adds the code for waiting chip to clean the all pkts(TX & RX) in chip when the driver uses the function named "adjust link". This patch cleans the pkts as follows: 1) close rx of chip, close tx of protocol stack. 2) wait rcb, ppe, mac to clean. 3) adjust link 4) open rx of chip, open tx of protocol stack. Signed-off-by: NPeng Li <lipeng321@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Azat Khuzhin 提交于
I have two Ethernet adapters: r8169 0000:03:01.0 eth0: RTL8169sb/8110sb, 00:14:d1:14:2d:49, XID 10000000, IRQ 18 r8169 0000:01:00.0 eth0: RTL8168e/8111e, 64:66:b3:11:14:5d, XID 2c200000, IRQ 30 And after upgrading from linux 4.15 [1] to linux 4.18+ [2] RTL8169sb failed to receive any packets. tcpdump shows a lot of checksum mismatch. [1]: a0f79386 [2]: 05193597 (4.19 merge window opened) I started bisecting and the found that [3] breaks it. According to [4]: "For 8110S, 8110SB, and 8110SC series, the initial value of RxConfig needs to be set after the tx/rx is enabled." So I moved rtl_init_rxcfg() after enabling tx/rs and now my adapter works (RTL8168e works too). [3]: 3559d81e [4]: e542a226 ("r8169: adjust the RxConfig settings.") Also drop "rx" from rtl_set_rx_tx_config_registers(), since it does nothing with it already. Fixes: 3559d81e ("r8169: simplify rtl_hw_start_8169") Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: Realtek linux nic maintainers <nic_swsd@realtek.com> Signed-off-by: NAzat Khuzhin <a3at.mail@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
syzbot reported a use-after-free in tipc_group_fill_sock_diag(), where tipc_group_fill_sock_diag() still reads tsk->group meanwhile tipc_group_delete() just deletes it in tipc_release(). tipc_nl_sk_walk() aims to lock this sock when walking each sock in the hash table to close race conditions with sock changes like this one, by acquiring tsk->sk.sk_lock.slock spinlock, unfortunately this doesn't work at all. All non-BH call path should take lock_sock() instead to make it work. tipc_nl_sk_walk() brutally iterates with raw rht_for_each_entry_rcu() where RCU read lock is required, this is the reason why lock_sock() can't be taken on this path. This could be resolved by switching to rhashtable iterator API's, where taking a sleepable lock is possible. Also, the iterator API's are friendly for restartable calls like diag dump, the last position is remembered behind the scence, all we need to do here is saving the iterator into cb->args[]. I tested this with parallel tipc diag dump and thousands of tipc socket creation and release, no crash or memory leak. Reported-by: syzbot+b9c8f3ab2994b7cd1625@syzkaller.appspotmail.com Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jerome Brunet 提交于
This reverts commit 4ae0169f. This change in the handling of the coalesce timer is causing regression on (at least) amlogic platforms. Network will break down very quickly (a few seconds) after starting a download. This can easily be reproduced using iperf3 for example. The problem has been reported on the S805, S905, S912 and A113 SoCs (Realtek and Micrel PHYs) and it is likely impacting all Amlogics platforms using Gbit ethernet No problem was seen with the platform using 10/100 only PHYs (GXL internal) Reverting change brings things back to normal and allows to use network again until we better understand the problem with the coalesce timer. Cc: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Vitor Soares <soares@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: NJerome Brunet <jbrunet@baylibre.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
rhashtable_walk_exit() must be paired with rhashtable_walk_enter(). Fixes: 40f9f439 ("tipc: Fix tipc_sk_reinit race conditions") Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexey Kodanev 提交于
Before the commit d6990976 ("vti6: fix PMTU caching and reporting on xmit") '!skb->ignore_df' check was always true because the function skb_scrub_packet() was called before it, resetting ignore_df to zero. In the commit, skb_scrub_packet() was moved below, and now this check can be false for the packet, e.g. when sending it in the two fragments, this prevents successful PMTU updates in such case. The next attempts to send the packet lead to the same tx error. Moreover, vti6 initial MTU value relies on PMTU adjustments. This issue can be reproduced with the following LTP test script: udp_ipsec_vti.sh -6 -p ah -m tunnel -s 2000 Fixes: ccd740cb ("vti6: Add pmtu handling to vti6_xmit.") Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Acked-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf由 David S. Miller 提交于
Daniel Borkmann says: ==================== pull-request: bpf 2018-08-29 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a build error in sk_reuseport_convert_ctx_access() when compiling with clang which cannot resolve hweight_long() at build time inside the BUILD_BUG_ON() assertion, from Stefan. 2) Several fixes for BPF sockmap, four of them in getting the bpf_msg_pull_data() helper to work, one use after free case in bpf_tcp_close() and one refcount leak in bpf_tcp_recvmsg(), from Daniel. 3) Another fix for BPF sockmap where we misaccount sk_mem_uncharge() in the socket redirect error case from unwinding scatterlist twice, from John. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexei Starovoitov 提交于
Daniel Borkmann says: ==================== This set contains three more fixes for the bpf_msg_pull_data() mainly for correcting scatterlist ring wrap-arounds as well as fixing up data pointers. For details please see individual patches. Thanks! ==================== Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Daniel Borkmann 提交于
When we perform the sg shift repair for the scatterlist ring, we currently start out at i = first_sg + 1. However, this is not correct since the first_sg could point to the sge sitting at slot MAX_SKB_FRAGS - 1, and a subsequent i = MAX_SKB_FRAGS will access the scatterlist ring (sg) out of bounds. Add the sk_msg_iter_var() helper for iterating through the ring, and apply the same rule for advancing to the next ring element as we do elsewhere. Later work will use this helper also in other places. Fixes: 015632bb ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Daniel Borkmann 提交于
If first_sg and last_sg wraps around in the scatterlist ring, then we need to account for that in the shift as well. E.g. crafting such msgs where this is the case leads to a hang as shift becomes negative. E.g. consider the following scenario: first_sg := 14 |=> shift := -12 msg->sg_start := 10 last_sg := 3 | msg->sg_end := 5 round 1: i := 15, move_from := 3, sg[15] := sg[ 3] round 2: i := 0, move_from := -12, sg[ 0] := sg[-12] round 3: i := 1, move_from := -11, sg[ 1] := sg[-11] round 4: i := 2, move_from := -10, sg[ 2] := sg[-10] [...] round 13: i := 11, move_from := -1, sg[ 2] := sg[ -1] round 14: i := 12, move_from := 0, sg[ 2] := sg[ 0] round 15: i := 13, move_from := 1, sg[ 2] := sg[ 1] round 16: i := 14, move_from := 2, sg[ 2] := sg[ 2] round 17: i := 15, move_from := 3, sg[ 2] := sg[ 3] [...] This means we will loop forever and never hit the msg->sg_end condition to break out of the loop. When we see that the ring wraps around, then the shift should be MAX_SKB_FRAGS - first_sg + last_sg - 1. Meaning, the remainder slots from the tail of the ring and the head until last_sg combined. Fixes: 015632bb ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Daniel Borkmann 提交于
In the current code, msg->data is set as sg_virt(&sg[i]) + start - offset and msg->data_end relative to it as msg->data + bytes. Using iterator i to point to the updated starting scatterlist element holds true for some cases, however not for all where we'd end up pointing out of bounds. It is /correct/ for these ones: 1) When first finding the starting scatterlist element (sge) where we find that the page is already privately owned by the msg and where the requested bytes and headroom fit into the sge's length. However, it's /incorrect/ for the following ones: 2) After we made the requested area private and updated the newly allocated page into first_sg slot of the scatterlist ring; when we find that no shift repair of the ring is needed where we bail out updating msg->data and msg->data_end. At that point i will point to last_sg, which in this case is the next elem of first_sg in the ring. The sge at that point might as well be invalid (e.g. i == msg->sg_end), which we use for setting the range of sg_virt(&sg[i]). The correct one would have been first_sg. 3) Similar as in 2) but when we find that a shift repair of the ring is needed. In this case we fix up all sges and stop once we've reached the end. In this case i will point to will point to the new msg->sg_end, and the sge at that point will be invalid. Again here the requested range sits in first_sg. Fixes: 015632bb ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 29 8月, 2018 1 次提交
-
-
由 Daniel Borkmann 提交于
While recently going over bpf_msg_pull_data(), I noticed three issues which are fixed in here: 1) When we attempt to find the first scatterlist element (sge) for the start offset, we add len to the offset before we check for start < offset + len, whereas it should come after when we iterate to the next sge to accumulate the offsets. For example, given a start offset of 12 with a sge length of 8 for the first sge in the list would lead us to determine this sge as the first sge thinking it covers first 16 bytes where start is located, whereas start sits in subsequent sges so we would end up pulling in the wrong data. 2) After figuring out the starting sge, we have a short-cut test in !msg->sg_copy[i] && bytes <= len. This checks whether it's not needed to make the page at the sge private where we can just exit by updating msg->data and msg->data_end. However, the length test is not fully correct. bytes <= len checks whether the requested bytes (end - start offsets) fit into the sge's length. The part that is missing is that start must not be sge length aligned. Meaning, the start offset into the sge needs to be accounted as well on top of the requested bytes as otherwise we can access the sge out of bounds. For example the sge could have length of 8, our requested bytes could have length of 8, but at a start offset of 4, so we also would need to pull in 4 bytes of the next sge, when we jump to the out label we do set msg->data to sg_virt(&sg[i]) + start - offset and msg->data_end to msg->data + bytes which would be oob. 3) The subsequent bytes < copy test for finding the last sge has the same issue as in point 2) but also it tests for less than rather than less or equal to. Meaning if the sge length is of 8 and requested bytes of 8 while having the start aligned with the sge, we would unnecessarily go and pull in the next sge as well to make it private. Fixes: 015632bb ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 28 8月, 2018 14 次提交
-
-
由 John Fastabend 提交于
Currently, when a redirect occurs in sockmap and an error occurs in the redirect call we unwind the scatterlist once in the error path of bpf_tcp_sendmsg_do_redirect() and then again in sendmsg(). Then in the error path of sendmsg we decrement the copied count by the send size. However, its possible we partially sent data before the error was generated. This can happen if do_tcp_sendpages() partially sends the scatterlist before encountering a memory pressure error. If this happens we need to decrement the copied value (the value tracking how many bytes were actually sent to TCP stack) by the number of remaining bytes _not_ the entire send size. Otherwise we risk confusing userspace. Also we don't need two calls to free the scatterlist one is good enough. So remove the one in bpf_tcp_sendmsg_do_redirect() and then properly reduce copied by the number of remaining bytes which may in fact be the entire send size if no bytes were sent. To do this use bool to indicate if free_start_sg() should do mem accounting or not. Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Stefan Agner 提交于
Building the newly introduced BPF_PROG_TYPE_SK_REUSEPORT leads to a compile time error when building with clang: net/core/filter.o: In function `sk_reuseport_convert_ctx_access': ../net/core/filter.c:7284: undefined reference to `__compiletime_assert_7284' It seems that clang has issues resolving hweight_long at compile time. Since SK_FL_PROTO_MASK is a constant, we can use the interface for known constant arguments which works fine with clang. Fixes: 2dbb9b9e ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") Signed-off-by: NStefan Agner <stefan@agner.ch> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Daniel Borkmann 提交于
In bpf_tcp_recvmsg() we first took a reference on the psock, however once we find that there are skbs in the normal socket's receive queue we return with processing them through tcp_recvmsg(). Problem is that we leak the taken reference on the psock in that path. Given we don't really do anything with the psock at this point, move the skb_queue_empty() test before we fetch the psock to fix this case. Fixes: 8934ce2f ("bpf: sockmap redirect ingress support") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Daniel Borkmann 提交于
bpf_tcp_close() we pop the psock linkage to a map via psock_map_pop(). A parallel update on the sock hash map can happen between psock_map_pop() and lookup_elem_raw() where we override the element under link->hash / link->key. In bpf_tcp_close()'s lookup_elem_raw() we subsequently only test whether an element is present, but we do not test whether the element is infact the element we were looking for. We lock the sock in bpf_tcp_close() during that time, so do we hold the lock in sock_hash_update_elem(). However, the latter locks the sock which is newly updated, not the one we're purging from the hash table. This means that while one CPU is doing the lookup from bpf_tcp_close(), another CPU is doing the map update in parallel, dropped our sock from the hlist and released the psock. Subsequently the first CPU will find the new sock and attempts to drop and release the old sock yet another time. Fix is that we need to check the elements for a match after lookup, similar as we do in the sock map. Note that the hash tab elems are freed via RCU, so access to their link->hash / link->key is fine since we're under RCU read side there. Fixes: e9db4ef6 ("bpf: sockhash fix omitted bucket lock in sock_close") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Zhu Yanjun 提交于
In IPv4, the newly introduced rdma_read_gids is used to read the SGID/DGID for the connection which returns GID correctly for RoCE transport as well. In IPv6, rdma_read_gids is also used. The following are why rdma_read_gids is introduced. rdma_addr_get_dgid() for RoCE for client side connections returns MAC address, instead of DGID. rdma_addr_get_sgid() for RoCE doesn't return correct SGID for IPv6 and when more than one IP address is assigned to the netdevice. So the transport agnostic rdma_read_gids() API is provided by rdma_cm module. Signed-off-by: NZhu Yanjun <yanjun.zhu@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Walleij 提交于
Commit 52638f71 ("dsa: Move gpio reset into switch driver") moved the GPIO handling into the switch drivers but forgot to remove the GPIO header includes. Signed-off-by: NLinus Walleij <linus.walleij@linaro.org> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Haiqing Bai 提交于
In function tipc_dest_push, the 32bit variables 'node' and 'port' are stored separately in uppper and lower part of 64bit 'value'. Then this value is assigned to dst->value which is a union like: union { struct { u32 port; u32 node; }; u64 value; } This works on little-endian machines like x86 but fails on big-endian machines. The fix remove the 'value' stack parameter and even the 'value' member of the union in tipc_dest, assign the 'node' and 'port' member directly with the input parameter to avoid the endian issue. Fixes: a80ae530 ("tipc: improve destination linked list") Signed-off-by: NZhenbo Gao <zhenbo.gao@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NHaiqing Bai <Haiqing.Bai@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Jiri Pirko says: ==================== net: sched: couple of small fixes Jiri Pirko (2): net: sched: fix extack error message when chain is failed to be created net: sched: return -ENOENT when trying to remove filter from non-existent chain ==================== Acked-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
When chain 0 was implicitly created, removal of non-existent filter from chain 0 gave -ENOENT. Once chain 0 became non-implicit, the same call is giving -EINVAL. Fix this by returning -ENOENT in that case. Reported-by: NRoman Mashak <mrv@mojatatu.com> Fixes: f71e0ca4 ("net: sched: Avoid implicit chain 0 creation") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Instead "Cannot find" say "Cannot create". Fixes: c35a4acc ("net: sched: cls_api: handle generic cls errors") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
After erspan_ver is introudced, if erspan_ver is not set in iproute, its value will be left 0 by default. Since Commit 02f99df1 ("erspan: fix invalid erspan version."), it has broken the traffic due to the version check in erspan_xmit if users are not aware of 'erspan_ver' param, like using an old version of iproute. To fix this compatibility problem, it sets erspan_ver to 1 by default when adding an erspan dev in erspan_setup. Note that we can't do it in ipgre_netlink_parms, as this function is also used by ipgre_changelink. Fixes: 02f99df1 ("erspan: fix invalid erspan version.") Reported-by: NJianlin Shi <jishi@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
After changing rhashtable_walk_start to return void, start_fail would never be set other value than 0, and the checking for start_fail is pointless, so remove it. Fixes: 97a6ec4a ("rhashtable: Change rhashtable_walk_start to return void") Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
As Marcelo noticed, in sctp_transport_get_next, it is iterating over transports but then also accessing the association directly, without checking any refcnts before that, which can cause an use-after-free Read. So fix it by holding transport before accessing the association. With that, sctp_transport_hold calls can be removed in the later places. Fixes: 626d16f5 ("sctp: export some apis or variables for sctp_diag and reuse some for proc") Reported-by: syzbot+fe62a0c9aa6a85c6de16@syzkaller.appspotmail.com Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net由 Linus Torvalds 提交于
Pull networking fixes from David Miller: 1) ICE, E1000, IGB, IXGBE, and I40E bug fixes from the Intel folks. 2) Better fix for AB-BA deadlock in packet scheduler code, from Cong Wang. 3) bpf sockmap fixes (zero sized key handling, etc.) from Daniel Borkmann. 4) Send zero IPID in TCP resets and SYN-RECV state ACKs, to prevent attackers using it as a side-channel. From Eric Dumazet. 5) Memory leak in mediatek bluetooth driver, from Gustavo A. R. Silva. 6) Hook up rt->dst.input of ipv6 anycast routes properly, from Hangbin Liu. 7) hns and hns3 bug fixes from Huazhong Tan. 8) Fix RIF leak in mlxsw driver, from Ido Schimmel. 9) iova range check fix in vhost, from Jason Wang. 10) Fix hang in do_tcp_sendpages() with tls, from John Fastabend. 11) More r8152 chips need to disable RX aggregation, from Kai-Heng Feng. 12) Memory exposure in TCA_U32_SEL handling, from Kees Cook. 13) TCP BBR congestion control fixes from Kevin Yang. 14) hv_netvsc, ignore non-PCI devices, from Stephen Hemminger. 15) qed driver fixes from Tomer Tayar. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits) net: sched: Fix memory exposure from short TCA_U32_SEL qed: fix spelling mistake "comparsion" -> "comparison" vhost: correctly check the iova range when waking virtqueue qlge: Fix netdev features configuration. net: macb: do not disable MDIO bus at open/close time Revert "net: stmmac: fix build failure due to missing COMMON_CLK dependency" net: macb: Fix regression breaking non-MDIO fixed-link PHYs mlxsw: spectrum_switchdev: Do not leak RIFs when removing bridge i40e: fix condition of WARN_ONCE for stat strings i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled ixgbe: fix driver behaviour after issuing VFLR ixgbe: Prevent unsupported configurations with XDP ixgbe: Replace GFP_ATOMIC with GFP_KERNEL igb: Replace mdelay() with msleep() in igb_integrated_phy_loopback() igb: Replace GFP_ATOMIC with GFP_KERNEL in igb_sw_init() igb: Use an advanced ctx descriptor for launchtime e1000: ensure to free old tx/rx rings in set_ringparam() e1000: check on netif_running() before calling e1000_up() ixgb: use dma_zalloc_coherent instead of allocator/memset ice: Trivial formatting fixes ...
-
- 27 8月, 2018 10 次提交
-
-
由 Jens Axboe 提交于
The email was botched in one entry, and I also forgot to update the location of the git tree. It'll be under the linux-block umbrella, just with different branches. Reported-by: NBaruch Siach <baruch@tkos.co.il> Fixes: 7634ccd2 ("libata: maintainership update") Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kees Cook 提交于
Via u32_change(), TCA_U32_SEL has an unspecified type in the netlink policy, so max length isn't enforced, only minimum. This means nkeys (from userspace) was being trusted without checking the actual size of nla_len(), which could lead to a memory over-read, and ultimately an exposure via a call to u32_dump(). Reachability is CAP_NET_ADMIN within a namespace. Reported-by: NAl Viro <viro@zeniv.linux.org.uk> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: NKees Cook <keescook@chromium.org> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Torvalds 提交于
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull timer update from Thomas Gleixner: "New defines for the compat time* types so they can be shared between 32bit and 64bit builds. Not used yet, but merging them now allows the actual conversions to be merged through different maintainer trees without dependencies We still have compat interfaces for 32bit on 64bit even with the new 2038 safe timespec/val variants because pointer size is different. And for the old style timespec/val interfaces we need yet another 'compat' interface for both 32bit native and 32bit on 64bit" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: y2038: Provide aliases for compat helpers
-
git://git.infradead.org/users/willy/linux-dax由 Linus Torvalds 提交于
Pull IDA updates from Matthew Wilcox: "A better IDA API: id = ida_alloc(ida, GFP_xxx); ida_free(ida, id); rather than the cumbersome ida_simple_get(), ida_simple_remove(). The new IDA API is similar to ida_simple_get() but better named. The internal restructuring of the IDA code removes the bitmap preallocation nonsense. I hope the net -200 lines of code is convincing" * 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax: (29 commits) ida: Change ida_get_new_above to return the id ida: Remove old API test_ida: check_ida_destroy and check_ida_alloc test_ida: Convert check_ida_conv to new API test_ida: Move ida_check_max test_ida: Move ida_check_leaf idr-test: Convert ida_check_nomem to new API ida: Start new test_ida module target/iscsi: Allocate session IDs from an IDA iscsi target: fix session creation failure handling drm/vmwgfx: Convert to new IDA API dmaengine: Convert to new IDA API ppc: Convert vas ID allocation to new IDA API media: Convert entity ID allocation to new IDA API ppc: Convert mmu context allocation to new IDA API Convert net_namespace to new IDA API cb710: Convert to new IDA API rsxx: Convert to new IDA API osd: Convert to new IDA API sd: Convert to new IDA API ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux由 Linus Torvalds 提交于
Pull gcc plugin fix from Kees Cook: "Lift gcc test into Kconfig. This is for better behavior when the kernel is built with Clang, reported by Stefan Agner" * tag 'gcc-plugins-v4.19-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-plugins: Disable when building under Clang
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull perf updates from Thomas Gleixner: "Kernel: - Improve kallsyms coverage - Add x86 entry trampolines to kcore - Fix ARM SPE handling - Correct PPC event post processing Tools: - Make the build system more robust - Small fixes and enhancements all over the place - Update kernel ABI header copies - Preparatory work for converting libtraceevnt to a shared library - License cleanups" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits) tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy' tools arch x86: Update tools's copy of cpufeatures.h perf python: Fix pyrf_evlist__read_on_cpu() interface perf mmap: Store real cpu number in 'struct perf_mmap' perf tools: Remove ext from struct kmod_path perf tools: Add gzip_is_compressed function perf tools: Add lzma_is_compressed function perf tools: Add is_compressed callback to compressions array perf tools: Move the temp file processing into decompress_kmodule perf tools: Use compression id in decompress_kmodule() perf tools: Store compression id into struct dso perf tools: Add compression id into 'struct kmod_path' perf tools: Make is_supported_compression() static perf tools: Make decompress_to_file() function static perf tools: Get rid of dso__needs_decompress() call in __open_dso() perf tools: Get rid of dso__needs_decompress() call in symbol__disassemble() perf tools: Get rid of dso__needs_decompress() call in read_object_code() tools lib traceevent: Change to SPDX License format perf llvm: Allow passing options to llc in addition to clang perf parser: Improve error message for PMU address filters ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull x86 fixes from Thomas Gleixner: - Correct the L1TF fallout on 32bit and the off by one in the 'too much RAM for protection' calculation. - Add a helpful kernel message for the 'too much RAM' case - Unbreak the VDSO in case that the compiler desides to use indirect jumps/calls and emits retpolines which cannot be resolved because the kernel uses its own thunks, which does not work for the VDSO. Make it use the builtin thunks. - Re-export start_thread() which was unexported when the 32/64bit implementation was unified. start_thread() is required by modular binfmt handlers. - Trivial cleanups * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation/l1tf: Suggest what to do on systems with too much RAM x86/speculation/l1tf: Fix off-by-one error when warning that system has too much RAM x86/kvm/vmx: Remove duplicate l1d flush definitions x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit x86/process: Re-export start_thread() x86/mce: Add notifier_block forward declaration x86/vdso: Fix vDSO build if a retpoline is emitted
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull irq update from Thomas Gleixner: "A small set of updats/fixes for the irq subsystem: - Allow GICv3 interrupts to be configured as wake-up sources to enable wakeup from suspend - Make the error handling of the STM32 irqchip init function work - A set of small cleanups and improvements" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3: Allow interrupt to be configured as wake-up sources irqchip/tango: Set irq handler and data in one go dt-bindings: irqchip: renesas-irqc: Document r8a774a1 support irqchip/s3c24xx: Remove unneeded comparison of unsigned long to 0 irqchip/stm32: Fix init error handling irqchip/bcm7038-l1: Hide cpu offline callback when building for !SMP
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull licking update from Thomas Gleixner: "Mark the switch cases which fall through to the next case with the proper comment so the fallthrough compiler checks can be enabled" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Mark expected switch fall-throughs
-
- 26 8月, 2018 4 次提交
-
-
由 Linus Torvalds 提交于
Merge tag 'libnvdimm-for-4.19_dax-memory-failure' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm memory-failure update from Dave Jiang: "As it stands, memory_failure() gets thoroughly confused by dev_pagemap backed mappings. The recovery code has specific enabling for several possible page states and needs new enabling to handle poison in dax mappings. In order to support reliable reverse mapping of user space addresses: 1/ Add new locking in the memory_failure() rmap path to prevent races that would typically be handled by the page lock. 2/ Since dev_pagemap pages are hidden from the page allocator and the "compound page" accounting machinery, add a mechanism to determine the size of the mapping that encompasses a given poisoned pfn. 3/ Given pmem errors can be repaired, change the speculatively accessed poison protection, mce_unmap_kpfn(), to be reversible and otherwise allow ongoing access from the kernel. A side effect of this enabling is that MADV_HWPOISON becomes usable for dax mappings, however the primary motivation is to allow the system to survive userspace consumption of hardware-poison via dax. Specifically the current behavior is: mce: Uncorrected hardware memory error in user-access at af34214200 {1}[Hardware Error]: It has been corrected by h/w and requires no further action mce: [Hardware Error]: Machine check events logged {1}[Hardware Error]: event severity: corrected Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users [..] Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed mce: Memory error not recovered <reboot> ...and with these changes: Injecting memory failure for pfn 0x20cb00 at process virtual address 0x7f763dd00000 Memory failure: 0x20cb00: Killing dax-pmd:5421 due to hardware memory corruption Memory failure: 0x20cb00: recovery action for dax page: Recovered Given all the cross dependencies I propose taking this through nvdimm.git with acks from Naoya, x86/core, x86/RAS, and of course dax folks" * tag 'libnvdimm-for-4.19_dax-memory-failure' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm, pmem: Restore page attributes when clearing errors x86/memory_failure: Introduce {set, clear}_mce_nospec() x86/mm/pat: Prepare {reserve, free}_memtype() for "decoy" addresses mm, memory_failure: Teach memory_failure() about dev_pagemap pages filesystem-dax: Introduce dax_lock_mapping_entry() mm, memory_failure: Collect mapping size in collect_procs() mm, madvise_inject_error: Let memory_failure() optionally take a page reference mm, dev_pagemap: Do not clear ->mapping on final put mm, madvise_inject_error: Disable MADV_SOFT_OFFLINE for ZONE_DEVICE pages filesystem-dax: Set page->index device-dax: Set page->index device-dax: Enable page_mapping() device-dax: Convert to vmf_insert_mixed and vm_fault_t
-
由 Linus Torvalds 提交于
Pull libnvdimm updates from Dave Jiang: "Collection of misc libnvdimm patches for 4.19 submission: - Adding support to read locked nvdimm capacity. - Change test code to make DSM failure code injection an override. - Add support for calculate maximum contiguous area for namespace. - Add support for queueing a short ARS when there is on going ARS for nvdimm. - Allow NULL to be passed in to ->direct_access() for kaddr and pfn params. - Improve smart injection support for nvdimm emulation testing. - Fix test code that supports for emulating controller temperature. - Fix hang on error before devm_memremap_pages() - Fix a bug that causes user memory corruption when data returned to user for ars_status. - Maintainer updates for Ross Zwisler emails and adding Jan Kara to fsdax" * tag 'libnvdimm-for-4.19_misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm: fix ars_status output length calculation device-dax: avoid hang on error before devm_memremap_pages() tools/testing/nvdimm: improve emulation of smart injection filesystem-dax: Do not request kaddr and pfn when not required md/dm-writecache: Don't request pointer dummy_addr when not required dax/super: Do not request a pointer kaddr when not required tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access() s390, dcssblk: kaddr and pfn can be NULL to ->direct_access() libnvdimm, pmem: kaddr and pfn can be NULL to ->direct_access() acpi/nfit: queue issuing of ars when an uc error notification comes in libnvdimm: Export max available extent libnvdimm: Use max contiguous area for namespace size MAINTAINERS: Add Jan Kara for filesystem DAX MAINTAINERS: update Ross Zwisler's email address tools/testing/nvdimm: Fix support for emulating controller temperature tools/testing/nvdimm: Make DSM failure code injection an override acpi, nfit: Prefer _DSM over _LSR for namespace label reads libnvdimm: Introduce locked DIMM capacity support
-
由 Colin Ian King 提交于
Trivial fix to spelling mistake in DP_ERR error message Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
We don't wakeup the virtqueue if the first byte of pending iova range is the last byte of the range we just got updated. This will lead a virtqueue to wait for IOTLB updating forever. Fixing by correct the check and wake up the virtqueue in this case. Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Reported-by: NPeter Xu <peterx@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NPeter Xu <peterx@redhat.com> Tested-by: NPeter Xu <peterx@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-