- 10 11月, 2016 7 次提交
-
-
由 Timur Tabi 提交于
Pause frames are used to enable flow control. A MAC can send and receive pause frames in order to throttle traffic. However, the PHY must be configured to allow those frames to pass through. Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NTimur Tabi <timur@codeaurora.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rafał Miłecki 提交于
This fixes regression introduced by patch adding feature flags. It was already reported and patch followed (it got accepted) but it appears it was incorrect. Instead of fixing reversed condition it broke a good one. This patch was verified to actually fix SoC hanges caused by bgmac on BCM47186B0. Fixes: db791eb2 ("net: ethernet: bgmac: convert to feature flags") Fixes: 4af1474e ("net: bgmac: Fix errant feature flag check") Cc: Jon Mason <jon.mason@broadcom.com> Signed-off-by: NRafał Miłecki <rafal@milecki.pl> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Benjamin Poirier 提交于
We received two reports of BUG_ON in bnad_txcmpl_process() where hw_consumer_index appeared to be ahead of producer_index. Out of order write/read of these variables could explain these reports. bnad_start_xmit(), as a producer of tx descriptors, has a few memory barriers sprinkled around writes to producer_index and the device's doorbell but they're not paired with anything in bnad_txcmpl_process(), a consumer. Since we are synchronizing with a device, we must use mandatory barriers, not smp_*. Also, I didn't see the purpose of the last smp_mb() in bnad_start_xmit(). Signed-off-by: NBenjamin Poirier <bpoirier@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tariq Toukan 提交于
This reverts commit 9d2afba0. The original issue would possibly exist if an external module tried calling our "ethtool_ops" without checking if it still exists. The right way of solving it is by simply doing the check in the caller side. Currently, no action is required as there's no such use case. Signed-off-by: NTariq Toukan <tariqt@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Maciej Żenczykowski 提交于
Routes can specify an mtu explicitly or inherit the mtu from the underlying device - this inheritance is implemented in dst->ops->mtu handlers ip6_mtu() and ip6_blackhole_mtu(). Currently changing the mtu of a device adds mtu explicitly to routes using that device. ie. # ip link set dev lo mtu 65536 # ip -6 route add local 2000::1 dev lo # ip -6 route get 2000::1 local 2000::1 dev lo table local src ... metric 1024 pref medium # ip link set dev lo mtu 65535 # ip -6 route get 2000::1 local 2000::1 dev lo table local src ... metric 1024 mtu 65535 pref medium # ip link set dev lo mtu 65536 # ip -6 route get 2000::1 local 2000::1 dev lo table local src ... metric 1024 mtu 65536 pref medium # ip -6 route del local 2000::1 After this patch the route entry no longer changes unless it already has an mtu. There is no need: this inheritance is already done in ip6_mtu() # ip link set dev lo mtu 65536 # ip -6 route add local 2000::1 dev lo # ip -6 route add local 2000::2 dev lo mtu 2000 # ip -6 route get 2000::1; ip -6 route get 2000::2 local 2000::1 dev lo table local src ... metric 1024 pref medium local 2000::2 dev lo table local src ... metric 1024 mtu 2000 pref medium # ip link set dev lo mtu 65535 # ip -6 route get 2000::1; ip -6 route get 2000::2 local 2000::1 dev lo table local src ... metric 1024 pref medium local 2000::2 dev lo table local src ... metric 1024 mtu 2000 pref medium # ip link set dev lo mtu 1501 # ip -6 route get 2000::1; ip -6 route get 2000::2 local 2000::1 dev lo table local src ... metric 1024 pref medium local 2000::2 dev lo table local src ... metric 1024 mtu 1501 pref medium # ip link set dev lo mtu 65536 # ip -6 route get 2000::1; ip -6 route get 2000::2 local 2000::1 dev lo table local src ... metric 1024 pref medium local 2000::2 dev lo table local src ... metric 1024 mtu 65536 pref medium # ip -6 route del local 2000::1 # ip -6 route del local 2000::2 This is desirable because changing device mtu and then resetting it to the previous value shouldn't change the user visible routing table. Signed-off-by: NMaciej Żenczykowski <maze@google.com> CC: Eric Dumazet <edumazet@google.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Soheil Hassas Yeganeh 提交于
Do not send the next message in sendmmsg for partial sendmsg invocations. sendmmsg assumes that it can continue sending the next message when the return value of the individual sendmsg invocations is positive. It results in corrupting the data for TCP, SCTP, and UNIX streams. For example, sendmmsg([["abcd"], ["efgh"]]) can result in a stream of "aefgh" if the first sendmsg invocation sends only the first byte while the second sendmsg goes through. Datagram sockets either send the entire datagram or fail, so this patch affects only sockets of type SOCK_STREAM and SOCK_SEQPACKET. Fixes: 228e548e ("net: Add sendmmsg socket system call") Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NMaciej Żenczykowski <maze@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gao Feng 提交于
When there is no existing macvlan port in lowdev, one new macvlan port would be created. But it doesn't be destoried when something failed later. It casues some memleak. Now add one flag to indicate if new macvlan port is created. Signed-off-by: NGao Feng <fgao@ikuai8.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 11月, 2016 12 次提交
-
-
由 Alexander Duyck 提交于
The display of /proc/net/route has had a couple issues due to the fact that when I originally rewrote most of fib_trie I made it so that the iterator was tracking the next value to use instead of the current. In addition it had an off by 1 error where I was tracking the first piece of data as position 0, even though in reality that belonged to the SEQ_START_TOKEN. This patch updates the code so the iterator tracks the last reported position and key instead of the next expected position and key. In addition it shifts things so that all of the leaves start at 1 instead of trying to report leaves starting with offset 0 as being valid. With these two issues addressed this should resolve any off by one errors that were present in the display of /proc/net/route. Fixes: 25b97c01 ("ipv4: off-by-one in continuation handling in /proc/net/route") Cc: Andy Whitcroft <apw@canonical.com> Reported-by: NJason Baron <jbaron@akamai.com> Tested-by: NJason Baron <jbaron@akamai.com> Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Fabian Mewes 提交于
Add Qualcomm QCA tagging introduced in cafdc45c to the list of supported protocols. Signed-off-by: NFabian Mewes <architekt@coding4coffee.org> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Acked-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael S. Tsirkin 提交于
Virtio 1.0 spec says VIRTIO_F_ANY_LAYOUT and VIRTIO_NET_F_GSO are legacy-only feature bits. Do not negotiate them in virtio 1 mode. Note this is a spec violation so we need to backport it to stable/downstream kernels. Cc: stable@vger.kernel.org Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
icmp6_send is called in response to some event. The skb may not have the device set (skb->dev is NULL), but it is expected to have a dst set. Update icmp6_send to use the dst on the skb to determine L3 domain. Fixes: ca254490 ("net: Add VRF support to IPv6 stack") Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
In map_create(), we first find and create the map, then once that suceeded, we charge it to the user's RLIMIT_MEMLOCK, and then fetch a new anon fd through anon_inode_getfd(). The problem is, once the latter fails f.e. due to RLIMIT_NOFILE limit, then we only destruct the map via map->ops->map_free(), but without uncharging the previously locked memory first. That means that the user_struct allocation is leaked as well as the accounted RLIMIT_MEMLOCK memory not released. Make the label names in the fix consistent with bpf_prog_load(). Fixes: aaac3ba9 ("bpf: charge user for creation of BPF maps and programs") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Commit a6ed3ea6 ("bpf: restore behavior of bpf_map_update_elem") added an extra per-cpu reserve to the hash table map to restore old behaviour from pre prealloc times. When non-prealloc is in use for a map, then problem is that once a hash table extra element has been linked into the hash-table, and the hash table is destroyed due to refcount dropping to zero, then htab_map_free() -> delete_all_elements() will walk the whole hash table and drop all elements via htab_elem_free(). The problem is that the element from the extra reserve is first fed to the wrong backend allocator and eventually freed twice. Fixes: a6ed3ea6 ("bpf: restore behavior of bpf_map_update_elem") Reported-by: NDmitry Vyukov <dvyukov@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Marcelo Ricardo Leitner 提交于
sctp_wait_for_connect() currently already holds the asoc to keep it alive during the sleep, in case another thread release it. But Andrey Konovalov and Dmitry Vyukov reported an use-after-free in such situation. Problem is that __sctp_connect() doesn't get a ref on the asoc and will do a read on the asoc after calling sctp_wait_for_connect(), but by then another thread may have closed it and the _put on sctp_wait_for_connect will actually release it, causing the use-after-free. Fix is, instead of doing the read after waiting for the connect, do it before so, and avoid this issue as the socket is still locked by then. There should be no issue on returning the asoc id in case of failure as the application shouldn't trust on that number in such situations anyway. This issue doesn't exist in sctp_sendmsg() path. Reported-by: NDmitry Vyukov <dvyukov@google.com> Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: NXin Long <lucien.xin@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Johan Hovold says: ==================== net: fix device reference leaks This series fixes a number of device reference leaks (and one of_node leak) due to failure to drop the references taken by bus_find_device() and friends. Note that the final two patches have been compile tested only. v2 - hold reference to cpsw-phy-sel device while accessing private data as requested by David. Also update the commit message. (patch 1/4) - add linux-omap on CC where appropriate ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johan Hovold 提交于
Make sure to drop the reference taken by class_find_device() in hnae_get_handle() on errors and when later releasing the handle. Fixes: 6fe6611f ("net: add Hisilicon Network Subsystem...") Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: NJohan Hovold <johan@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johan Hovold 提交于
Make sure to drop the references taken by bus_find_device() before returning from emac_dev_open(). Note that phy_connect still takes a reference to the phy device. Fixes: 5d69e007 ("net: davinci_emac: switch to new mdio") Cc: Mugunthan V N <mugunthanvnm@ti.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: linux-omap@vger.kernel.org Signed-off-by: NJohan Hovold <johan@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johan Hovold 提交于
Make sure to drop the references taken by of_get_child_by_name() and bus_find_device() before returning from cpsw_phy_sel(). Note that holding a reference to the cpsw-phy-sel device does not prevent the devres-managed private data from going away. Fixes: 5892cd13 ("drivers: net: cpsw-phy-sel: Add new driver...") Cc: Mugunthan V N <mugunthanvnm@ti.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: linux-omap@vger.kernel.org Signed-off-by: NJohan Hovold <johan@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johan Hovold 提交于
Make sure to drop the reference taken by bus_find_device_by_name() before returning from phy_connect() and phy_attach(). Note that both function still take a reference to the phy device through phy_attach_direct(). Fixes: e1393456 ("[PATCH] PHY Layer fixup") Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: NJohan Hovold <johan@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 11月, 2016 7 次提交
-
-
由 David S. Miller 提交于
Saeed Mahameed says: ==================== Mellanox 100G mlx5 fixes 2016-11-04 This series contains six hot fixes of the mlx5 core and mlx5e driver. Huy fixed an invalid pointer dereference on initialization flow for when the selected mlx5 load profile is out of range. Or provided three eswitch offloads related fixes - Prevent changing NS of a VF representor. - Handle matching on vlan priority for offloaded TC rules - Set the actions for offloaded rules properly On my part I here addressed the error flow related issues in mlx5e_open_channel reported by Jesper just this week. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Huy Nguyen 提交于
When prof_sel is invalid, mlx5_core_warn is called but the mlx5_core_dev is not initialized yet. Solution is moving the prof_sel code after dev->pdev assignment Fixes: 2974ab6e ('net/mlx5: Improve driver log messages') Signed-off-by: NHuy Nguyen <huyn@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Or Gerlitz 提交于
As for the current generation of the mlx5 HW (CX4/CX4-Lx) per flow vlan push/pop actions are emulated, we must not program them to the firmware. Fixes: f5f82476 ('net/mlx5: E-Switch, Support VLAN actions in the offloads mode') Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Reported-by: NPaul Blakey <paulb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Or Gerlitz 提交于
We ignored the vlan priority in offloaded TC rules matching part, fix that. Fixes: 095b6cfd ('net/mlx5e: Add TC vlan match parsing') Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Reported-by: NPaul Blakey <paulb@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Or Gerlitz 提交于
VF reps should be altogether on the same NS as they were created. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Saeed Mahameed 提交于
In mlx5e_open_channel CQs must be created before napi is enabled. Here we move the XDP CQ creation to satisfy that fact. mlx5e_close_channel is already working according to the right order. Fixes: b5503b99 ("net/mlx5e: XDP TX forwarding support") Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Reported-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Saeed Mahameed 提交于
In case of mlx5e_open_rq fails the error handling will jump to label err_close_xdp_sq and will try to close the xdp_sq unconditionally. xdp_sq is valid only in case of XDP use cases, i.e priv->xdp_prog is not null. To fix this in this patch we test xdp_sq validity prior to closing it. In addition we now close the xdp_sq.cq as well. Fixes: b5503b99 ("net/mlx5e: XDP TX forwarding support") Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Reported-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2016 14 次提交
-
-
由 WANG Cong 提交于
cgroupstats_cmd_get_policy is [CGROUPSTATS_CMD_ATTR_MAX+1], taskstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1], but their family.maxattr is TASKSTATS_CMD_ATTR_MAX. CGROUPSTATS_CMD_ATTR_MAX is less than TASKSTATS_CMD_ATTR_MAX, so we could end up accessing out-of-bound. Change cgroupstats_cmd_get_policy to TASKSTATS_CMD_ATTR_MAX+1, this is safe because the rest are initialized to 0's. Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
In __genl_register_family(), when genl_validate_assign_mc_groups() fails, we forget to free the memory we possibly allocate for family->attrbuf. Note, some callers call genl_unregister_family() to clean up on error path, it doesn't work because the family is inserted to the global list in the nearly last step. Cc: Jakub Kicinski <kubakici@wp.pl> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
While fuzzing kernel with syzkaller, Andrey reported a nasty crash in inet6_bind() caused by DCCP lacking a required method. Fixes: ab1e0a13 ("[SOCK] proto: Add hashinfo member to struct proto") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guilherme G. Piccoli 提交于
Currently the ehea driver is missing a call to netif_carrier_off() before the interface bring-up; this is necessary in order to initialize the __LINK_STATE_NOCARRIER bit in the net_device state field. Otherwise, we observe state UNKNOWN on "ip address" command output. This patch adds a call to netif_carrier_off() on ehea's net device open callback. Reported-by: NXiong Zhou <zhou@redhat.com> Reference-ID: IBM bz #137702, Red Hat bz #1089134 Signed-off-by: NGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: NDouglas Miller <dougmill@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
dccp_v6_err() does not use pskb_may_pull() and might access garbage. We only need 4 bytes at the beginning of the DCCP header, like TCP, so the 8 bytes pulled in icmpv6_notify() are more than enough. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
A recent commit removed locking from netlink_diag_dump() but forgot one error case. ===================================== [ BUG: bad unlock balance detected! ] 4.9.0-rc3+ #336 Not tainted ------------------------------------- syz-executor/4018 is trying to release lock ([ 36.220068] nl_table_lock ) at: [<ffffffff82dc8683>] netlink_diag_dump+0x1a3/0x250 net/netlink/diag.c:182 but there are no more locks to release! other info that might help us debug this: 3 locks held by syz-executor/4018: #0: [ 36.220068] ( sock_diag_mutex[ 36.220068] ){+.+.+.} , at: [ 36.220068] [<ffffffff82c3873b>] sock_diag_rcv+0x1b/0x40 #1: [ 36.220068] ( sock_diag_table_mutex[ 36.220068] ){+.+.+.} , at: [ 36.220068] [<ffffffff82c38e00>] sock_diag_rcv_msg+0x140/0x3a0 #2: [ 36.220068] ( nlk->cb_mutex[ 36.220068] ){+.+.+.} , at: [ 36.220068] [<ffffffff82db6600>] netlink_dump+0x50/0xac0 stack backtrace: CPU: 1 PID: 4018 Comm: syz-executor Not tainted 4.9.0-rc3+ #336 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffff8800645df688 ffffffff81b46934 ffffffff84eb3e78 ffff88006ad85800 ffffffff82dc8683 ffffffff84eb3e78 ffff8800645df6b8 ffffffff812043ca dffffc0000000000 ffff88006ad85ff8 ffff88006ad85fd0 00000000ffffffff Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff81b46934>] dump_stack+0xb3/0x10f lib/dump_stack.c:51 [<ffffffff812043ca>] print_unlock_imbalance_bug+0x17a/0x1a0 kernel/locking/lockdep.c:3388 [< inline >] __lock_release kernel/locking/lockdep.c:3512 [<ffffffff8120cfd8>] lock_release+0x8e8/0xc60 kernel/locking/lockdep.c:3765 [< inline >] __raw_read_unlock ./include/linux/rwlock_api_smp.h:225 [<ffffffff83fc001a>] _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255 [<ffffffff82dc8683>] netlink_diag_dump+0x1a3/0x250 net/netlink/diag.c:182 [<ffffffff82db6947>] netlink_dump+0x397/0xac0 net/netlink/af_netlink.c:2110 Fixes: ad202074 ("netlink: Use rhashtable walk interface in diag dump") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
dccp_v4_err() does not use pskb_may_pull() and might access garbage. We only need 4 bytes at the beginning of the DCCP header, like TCP, so the 8 bytes pulled in icmp_socket_deliver() are more than enough. This patch might allow to process more ICMP messages, as some routers are still limiting the size of reflected bytes to 28 (RFC 792), instead of extended lengths (RFC 1812 4.3.2.3) Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Andrey reported following warning while fuzzing with syzkaller WARNING: CPU: 1 PID: 21072 at net/dccp/proto.c:83 dccp_set_state+0x229/0x290 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 21072 Comm: syz-executor Not tainted 4.9.0-rc1+ #293 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffff88003d4c7738 ffffffff81b474f4 0000000000000003 dffffc0000000000 ffffffff844f8b00 ffff88003d4c7804 ffff88003d4c7800 ffffffff8140c06a 0000000041b58ab3 ffffffff8479ab7d ffffffff8140beae ffffffff8140cd00 Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff81b474f4>] dump_stack+0xb3/0x10f lib/dump_stack.c:51 [<ffffffff8140c06a>] panic+0x1bc/0x39d kernel/panic.c:179 [<ffffffff8111125c>] __warn+0x1cc/0x1f0 kernel/panic.c:542 [<ffffffff8111144c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585 [<ffffffff8389e5d9>] dccp_set_state+0x229/0x290 net/dccp/proto.c:83 [<ffffffff838a0aa2>] dccp_close+0x612/0xc10 net/dccp/proto.c:1016 [<ffffffff8316bf1f>] inet_release+0xef/0x1c0 net/ipv4/af_inet.c:415 [<ffffffff82b6e89e>] sock_release+0x8e/0x1d0 net/socket.c:570 [<ffffffff82b6e9f6>] sock_close+0x16/0x20 net/socket.c:1017 [<ffffffff815256ad>] __fput+0x29d/0x720 fs/file_table.c:208 [<ffffffff81525bb5>] ____fput+0x15/0x20 fs/file_table.c:244 [<ffffffff811727d8>] task_work_run+0xf8/0x170 kernel/task_work.c:116 [< inline >] exit_task_work include/linux/task_work.h:21 [<ffffffff8111bc53>] do_exit+0x883/0x2ac0 kernel/exit.c:828 [<ffffffff811221fe>] do_group_exit+0x10e/0x340 kernel/exit.c:931 [<ffffffff81143c94>] get_signal+0x634/0x15a0 kernel/signal.c:2307 [<ffffffff81054aad>] do_signal+0x8d/0x1a30 arch/x86/kernel/signal.c:807 [<ffffffff81003a05>] exit_to_usermode_loop+0xe5/0x130 arch/x86/entry/common.c:156 [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190 [<ffffffff81006298>] syscall_return_slowpath+0x1a8/0x1e0 arch/x86/entry/common.c:259 [<ffffffff83fc1a62>] entry_SYSCALL_64_fastpath+0xc0/0xc2 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Fix this the same way we did for TCP in commit 565b7b2d ("tcp: do not send reset to already closed sockets") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Andrey Konovalov reported following error while fuzzing with syzkaller : IPv4: Attempt to release alive inet socket ffff880068e98940 kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Modules linked in: CPU: 1 PID: 3905 Comm: a.out Not tainted 4.9.0-rc3+ #333 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88006b9e0000 task.stack: ffff880068770000 RIP: 0010:[<ffffffff819ead5f>] [<ffffffff819ead5f>] selinux_socket_sock_rcv_skb+0xff/0x6a0 security/selinux/hooks.c:4639 RSP: 0018:ffff8800687771c8 EFLAGS: 00010202 RAX: ffff88006b9e0000 RBX: 1ffff1000d0eee3f RCX: 1ffff1000d1d312a RDX: 1ffff1000d1d31a6 RSI: dffffc0000000000 RDI: 0000000000000010 RBP: ffff880068777360 R08: 0000000000000000 R09: 0000000000000002 R10: dffffc0000000000 R11: 0000000000000006 R12: ffff880068e98940 R13: 0000000000000002 R14: ffff880068777338 R15: 0000000000000000 FS: 00007f00ff760700(0000) GS:ffff88006cd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020008000 CR3: 000000006a308000 CR4: 00000000000006e0 Stack: ffff8800687771e0 ffffffff812508a5 ffff8800686f3168 0000000000000007 ffff88006ac8cdfc ffff8800665ea500 0000000041b58ab3 ffffffff847b5480 ffffffff819eac60 ffff88006b9e0860 ffff88006b9e0868 ffff88006b9e07f0 Call Trace: [<ffffffff819c8dd5>] security_sock_rcv_skb+0x75/0xb0 security/security.c:1317 [<ffffffff82c2a9e7>] sk_filter_trim_cap+0x67/0x10e0 net/core/filter.c:81 [<ffffffff82b81e60>] __sk_receive_skb+0x30/0xa00 net/core/sock.c:460 [<ffffffff838bbf12>] dccp_v4_rcv+0xdb2/0x1910 net/dccp/ipv4.c:873 [<ffffffff83069d22>] ip_local_deliver_finish+0x332/0xad0 net/ipv4/ip_input.c:216 [< inline >] NF_HOOK_THRESH ./include/linux/netfilter.h:232 [< inline >] NF_HOOK ./include/linux/netfilter.h:255 [<ffffffff8306abd2>] ip_local_deliver+0x1c2/0x4b0 net/ipv4/ip_input.c:257 [< inline >] dst_input ./include/net/dst.h:507 [<ffffffff83068500>] ip_rcv_finish+0x750/0x1c40 net/ipv4/ip_input.c:396 [< inline >] NF_HOOK_THRESH ./include/linux/netfilter.h:232 [< inline >] NF_HOOK ./include/linux/netfilter.h:255 [<ffffffff8306b82f>] ip_rcv+0x96f/0x12f0 net/ipv4/ip_input.c:487 [<ffffffff82bd9fb7>] __netif_receive_skb_core+0x1897/0x2a50 net/core/dev.c:4213 [<ffffffff82bdb19a>] __netif_receive_skb+0x2a/0x170 net/core/dev.c:4251 [<ffffffff82bdb493>] netif_receive_skb_internal+0x1b3/0x390 net/core/dev.c:4279 [<ffffffff82bdb6b8>] netif_receive_skb+0x48/0x250 net/core/dev.c:4303 [<ffffffff8241fc75>] tun_get_user+0xbd5/0x28a0 drivers/net/tun.c:1308 [<ffffffff82421b5a>] tun_chr_write_iter+0xda/0x190 drivers/net/tun.c:1332 [< inline >] new_sync_write fs/read_write.c:499 [<ffffffff8151bd44>] __vfs_write+0x334/0x570 fs/read_write.c:512 [<ffffffff8151f85b>] vfs_write+0x17b/0x500 fs/read_write.c:560 [< inline >] SYSC_write fs/read_write.c:607 [<ffffffff81523184>] SyS_write+0xd4/0x1a0 fs/read_write.c:599 [<ffffffff83fc02c1>] entry_SYSCALL_64_fastpath+0x1f/0xc2 It turns out DCCP calls __sk_receive_skb(), and this broke when lookups no longer took a reference on listeners. Fix this issue by adding a @refcounted parameter to __sk_receive_skb(), so that sock_put() is used only when needed. Fixes: 3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under synflood") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
After my commit, tcp_sendmsg() might restart its loop after processing socket backlog. If sk_err is set, we blindly return an error, even though we copied data to user space before. We should instead return number of bytes that could be copied, otherwise user space might resend data and corrupt the stream. This might happen if another thread is using recvmsg(MSG_ERRQUEUE) to process timestamps. Issue was diagnosed by Soheil and Willem, big kudos to them ! Fixes: d41a69f1 ("tcp: make tcp_sendmsg() aware of socket backlog") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Tested-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lance Richardson 提交于
Some configurations (e.g. geneve interface with default MTU of 1500 over an ethernet interface with 1500 MTU) result in the transmission of packets that exceed the configured MTU. While this should be considered to be a "bad" configuration, it is still allowed and should not result in the sending of packets that exceed the configured MTU. Fix by dropping the assumption in ip_finish_output_gso() that locally originated gso packets will never need fragmentation. Basic testing using iperf (observing CPU usage and bandwidth) have shown no measurable performance impact for traffic not requiring fragmentation. Fixes: c7ba65d7 ("net: ip: push gso skb forwarding handling down the stack") Reported-by: NJan Tluka <jtluka@redhat.com> Signed-off-by: NLance Richardson <lrichard@redhat.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Andrey reported the following error report while running the syzkaller fuzzer: general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 648 Comm: syz-executor Not tainted 4.9.0-rc3+ #333 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff8800398c4480 task.stack: ffff88003b468000 RIP: 0010:[<ffffffff83091106>] [< inline >] inet_exact_dif_match include/net/tcp.h:808 RIP: 0010:[<ffffffff83091106>] [<ffffffff83091106>] __inet_lookup_listener+0xb6/0x500 net/ipv4/inet_hashtables.c:219 RSP: 0018:ffff88003b46f270 EFLAGS: 00010202 RAX: 0000000000000004 RBX: 0000000000004242 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffc90000e3c000 RDI: 0000000000000054 RBP: ffff88003b46f2d8 R08: 0000000000004000 R09: ffffffff830910e7 R10: 0000000000000000 R11: 000000000000000a R12: ffffffff867fa0c0 R13: 0000000000004242 R14: 0000000000000003 R15: dffffc0000000000 FS: 00007fb135881700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020cc3000 CR3: 000000006d56a000 CR4: 00000000000006f0 Stack: 0000000000000000 000000000601a8c0 0000000000000000 ffffffff00004242 424200003b9083c2 ffff88003def4041 ffffffff84e7e040 0000000000000246 ffff88003a0911c0 0000000000000000 ffff88003a091298 ffff88003b9083ae Call Trace: [<ffffffff831100f4>] tcp_v4_send_reset+0x584/0x1700 net/ipv4/tcp_ipv4.c:643 [<ffffffff83115b1b>] tcp_v4_rcv+0x198b/0x2e50 net/ipv4/tcp_ipv4.c:1718 [<ffffffff83069d22>] ip_local_deliver_finish+0x332/0xad0 net/ipv4/ip_input.c:216 ... MD5 has a code path that calls __inet_lookup_listener with a null skb, so inet{6}_exact_dif_match needs to check skb against null before pulling the flag. Fixes: a04a480d ("net: Require exact match for TCP socket lookups if dif is l3mdev") Reported-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Imagine initial value of max_skb_frags is 17, and last skb in write queue has 15 frags. Then max_skb_frags is lowered to 14 or smaller value. tcp_sendmsg() will then be allowed to add additional page frags and eventually go past MAX_SKB_FRAGS, overflowing struct skb_shared_info. Fixes: 5f74f82e ("net:Add sysctl_max_skb_frags") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Cc: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mintz, Yuval 提交于
Driver allocates replacement buffers before-hand to make sure whenever an aggregation begins there would be a replacement for the Rx buffers, as we can't release the buffer until aggregation is terminated and driver logic assumes the Rx rings are always full. For every other Rx page that's being allocated [I.e., regular] the page is being completely mapped while for the replacement buffers only the first portion of the page is being mapped. This means that: a. Once replacement buffer replenishes the regular Rx ring, assuming there's more than a single packet on page we'd post unmapped memory toward HW [assuming mapping is actually done in granularity smaller than page]. b. Unmaps are being done for the entire page, which is incorrect. Fixes: 55482edc ("qede: Add slowpath/fastpath support and enable hardware GRO") Signed-off-by: NYuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-