- 31 10月, 2017 10 次提交
-
-
由 Stefano Stabellini 提交于
Implement recvmsg by copying data from the "in" ring. If not enough data is available and the recvmsg call is blocking, then wait on the inflight_conn_req waitqueue. Take the active socket in_mutex so that only one function can access the ring at any given time. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Send data to an active socket by copying data to the "out" ring. Take the active socket out_mutex so that only one function can access the ring at any given time. If not enough room is available on the ring, rather than returning immediately or sleep-waiting, spin for up to 5000 cycles. This small optimization turns out to improve performance significantly. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Introduce a waitqueue to allow only one outstanding accept command at any given time and to implement polling on the passive socket. Introduce a flags field to keep track of in-flight accept and poll commands. Send PVCALLS_ACCEPT to the backend. Allocate a new active socket. Make sure that only one accept command is executed at any given time by setting PVCALLS_FLAG_ACCEPT_INFLIGHT and waiting on the inflight_accept_req waitqueue. Convert the new struct sock_mapping pointer into an uintptr_t and use it as id for the new socket to pass to the backend. Check if the accept call is non-blocking: in that case after sending the ACCEPT command to the backend store the sock_mapping pointer of the new struct and the inflight req_id then return -EAGAIN (which will respond only when there is something to accept). Next time accept is called, we'll check if the ACCEPT command has been answered, if so we'll pick up where we left off, otherwise we return -EAGAIN again. Note that, differently from the other commands, we can use wait_event_interruptible (instead of wait_event) in the case of accept as we are able to track the req_id of the ACCEPT response that we are waiting. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Send PVCALLS_LISTEN to the backend. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Send PVCALLS_BIND to the backend. Introduce a new structure, part of struct sock_mapping, to store information specific to passive sockets. Introduce a status field to keep track of the status of the passive socket. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Send PVCALLS_CONNECT to the backend. Allocate a new ring and evtchn for the active socket. Introduce fields in struct sock_mapping to keep track of active sockets. Introduce a waitqueue to allow the frontend to wait on data coming from the backend on the active socket (recvmsg command). Two mutexes (one of reads and one for writes) will be used to protect the active socket in and out rings from concurrent accesses. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Send a PVCALLS_SOCKET command to the backend, use the masked req_prod_pvt as req_id. This way, req_id is guaranteed to be between 0 and PVCALLS_NR_REQ_PER_RING. We already have a slot in the rsp array ready for the response, and there cannot be two outstanding responses with the same req_id. Wait for the response by waiting on the inflight_req waitqueue and check for the req_id field in rsp[req_id]. Use atomic accesses and barriers to read the field. Note that the barriers are simple smp barriers (as opposed to virt barriers) because they are for internal frontend synchronization, not frontend<->backend communication. Once a response is received, clear the corresponding rsp slot by setting req_id to PVCALLS_INVALID_ID. Note that PVCALLS_INVALID_ID is invalid only from the frontend point of view. It is not part of the PVCalls protocol. pvcalls_front_event_handler is in charge of copying responses from the ring to the appropriate rsp slot. It is done by copying the body of the response first, then by copying req_id atomically. After the copies, wake up anybody waiting on waitqueue. socket_lock protects accesses to the ring. Convert the pointer to sock_mapping into an uintptr_t and use it as id for the new socket to pass to the backend. The struct will be fully initialized later on connect or bind. sock->sk->sk_send_head is not used for ip sockets: reuse the field to store a pointer to the struct sock_mapping corresponding to the socket. This way, we can easily get the struct sock_mapping from the struct socket. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Implement the probe function for the pvcalls frontend. Read the supported versions, max-page-order and function-calls nodes from xenstore. Only one frontend<->backend connection is supported at any given time for a guest. Store the active frontend device to a static pointer. Introduce a stub functions for the event handler. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Introduce a data structure named pvcalls_bedata. It contains pointers to the command ring, the event channel, a list of active sockets and a list of passive sockets. Lists accesses are protected by a spin_lock. Introduce a waitqueue to allow waiting for a response on commands sent to the backend. Introduce an array of struct xen_pvcalls_response to store commands responses. Introduce a new struct sock_mapping to keep track of sockets. In this patch the struct sock_mapping is minimal, the fields will be added by the next patches. pvcalls_refcount is used to keep count of the outstanding pvcalls users. Only remove connections once the refcount is zero. Implement pvcalls frontend removal function. Go through the list of active and passive sockets and free them all, one at a time. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Stefano Stabellini 提交于
Introduce a xenbus frontend for the pvcalls protocol, as defined by https://xenbits.xen.org/docs/unstable/misc/pvcalls.html. This patch only adds the stubs, the code will be added by the following patches. Signed-off-by: NStefano Stabellini <stefano@aporeto.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
- 30 10月, 2017 1 次提交
-
-
由 Linus Torvalds 提交于
-
- 29 10月, 2017 29 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net由 Linus Torvalds 提交于
Pull networking fixes from David Miller: 1) Fix route leak in xfrm_bundle_create(). 2) In mac80211, validate user rate mask before configuring it. From Johannes Berg. 3) Properly enforce memory limits in fair queueing code, from Toke Hoiland-Jorgensen. 4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet. 5) Fix TSO header allocation and management in mvpp2 driver, from Yan Markman. 6) Don't take socket lock in BH handler in strparser code, from Tom Herbert. 7) Don't show sockets from other namespaces in AF_UNIX code, from Andrei Vagin. 8) Fix double free in error path of tap_open(), from Girish Moodalbail. 9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker and Alexander Duyck. 10) Fix DCB mode programming in stmmac driver, from Jose Abreu. 11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin Long. 12) Properly align SKB head before building SKB in tuntap, from Jason Wang. 13) Avoid matching qdiscs with a zero handle during lookups, from Cong Wang. 14) Fix various endianness bugs in sctp, from Xin Long. 15) Fix tc filter callback races and add selftests which trigger the problem, from Cong Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits) selftests: Introduce a new test case to tc testsuite selftests: Introduce a new script to generate tc batch file net_sched: fix call_rcu() race on act_sample module removal net_sched: add rtnl assertion to tcf_exts_destroy() net_sched: use tcf_queue_work() in tcindex filter net_sched: use tcf_queue_work() in rsvp filter net_sched: use tcf_queue_work() in route filter net_sched: use tcf_queue_work() in u32 filter net_sched: use tcf_queue_work() in matchall filter net_sched: use tcf_queue_work() in fw filter net_sched: use tcf_queue_work() in flower filter net_sched: use tcf_queue_work() in flow filter net_sched: use tcf_queue_work() in cgroup filter net_sched: use tcf_queue_work() in bpf filter net_sched: use tcf_queue_work() in basic filter net_sched: introduce a workqueue for RCU callbacks of tc filter sctp: fix some type cast warnings introduced since very beginning sctp: fix a type cast warnings that causes a_rwnd gets the wrong value sctp: fix some type cast warnings introduced by transport rhashtable sctp: fix some type cast warnings introduced by stream reconf ...
-
由 David S. Miller 提交于
Cong Wang says: ==================== net_sched: fix races with RCU callbacks Recently, the RCU callbacks used in TC filters and TC actions keep drawing my attention, they introduce at least 4 race condition bugs: 1. A simple one fixed by Daniel: commit c78e1746 Author: Daniel Borkmann <daniel@iogearbox.net> Date: Wed May 20 17:13:33 2015 +0200 net: sched: fix call_rcu() race on classifier module unloads 2. A very nasty one fixed by me: commit 1697c4bb Author: Cong Wang <xiyou.wangcong@gmail.com> Date: Mon Sep 11 16:33:32 2017 -0700 net_sched: carefully handle tcf_block_put() 3. Two more bugs found by Chris: https://patchwork.ozlabs.org/patch/826696/ https://patchwork.ozlabs.org/patch/826695/ Usually RCU callbacks are simple, however for TC filters and actions, they are complex because at least TC actions could be destroyed together with the TC filter in one callback. And RCU callbacks are invoked in BH context, without locking they are parallel too. All of these contribute to the cause of these nasty bugs. Alternatively, we could also: a) Introduce a spinlock to serialize these RCU callbacks. But as I said in commit 1697c4bb ("net_sched: carefully handle tcf_block_put()"), it is very hard to do because of tcf_chain_dump(). Potentially we need to do a lot of work to make it possible (if not impossible). b) Just get rid of these RCU callbacks, because they are not necessary at all, callers of these call_rcu() are all on slow paths and holding RTNL lock, so blocking is allowed in their contexts. However, David and Eric dislike adding synchronize_rcu() here. As suggested by Paul, we could defer the work to a workqueue and gain the permission of holding RTNL again without any performance impact, however, in tcf_block_put() we could have a deadlock when flushing workqueue while hodling RTNL lock, the trick here is to defer the work itself in workqueue and make it queued after all other works so that we keep the same ordering to avoid any use-after-free. Please see the first patch for details. Patch 1 introduces the infrastructure, patch 2~12 move each tc filter to the new tc filter workqueue, patch 13 adds an assertion to catch potential bugs like this, patch 14 closes another rcu callback race, patch 15 and patch 16 add new test cases. ==================== Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Mi 提交于
In this patchset, we fixed a tc bug. This patch adds the test case that reproduces the bug. To run this test case, user should specify an existing NIC device: # sudo ./tdc.py -d enp4s0f0 This test case belongs to category "flower". If user doesn't specify a NIC device, the test cases belong to "flower" will not be run. In this test case, we create 1M filters and all filters share the same action. When destroying all filters, kernel should not panic. It takes about 18s to run it. Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NLucas Bates <lucasb@mojatatu.com> Signed-off-by: NChris Mi <chrism@mellanox.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Mi 提交于
# ./tdc_batch.py -h usage: tdc_batch.py [-h] [-n NUMBER] [-o] [-s] [-p] device file TC batch file generator positional arguments: device device name file batch file name optional arguments: -h, --help show this help message and exit -n NUMBER, --number NUMBER how many lines in batch file -o, --skip_sw skip_sw (offload), by default skip_hw -s, --share_action all filters share the same action -p, --prio all filters have different prio Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Acked-by: NLucas Bates <lucasb@mojatatu.com> Signed-off-by: NChris Mi <chrism@mellanox.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Similar to commit c78e1746 ("net: sched: fix call_rcu() race on classifier module unloads"), we need to wait for flying RCU callback tcf_sample_cleanup_rcu(). Cc: Yotam Gigi <yotamg@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
After previous patches, it is now safe to claim that tcf_exts_destroy() is always called with RTNL lock. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Defer the tcf_exts_destroy() in RCU callback to tc filter workqueue and get RTNL lock. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
This patch introduces a dedicated workqueue for tc filters so that each tc filter's RCU callback could defer their action destroy work to this workqueue. The helper tcf_queue_work() is introduced for them to use. Because we hold RTNL lock when calling tcf_block_put(), we can not simply flush works inside it, therefore we have to defer it again to this workqueue and make sure all flying RCU callbacks have already queued their work before this one, in other words, to ensure this is the last one to execute to prevent any use-after-free. On the other hand, this makes tcf_block_put() ugly and harder to understand. Since David and Eric strongly dislike adding synchronize_rcu(), this is probably the only solution that could make everyone happy. Please also see the code comments below. Reported-by: NChris Mi <chrism@mellanox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Xin Long says: ==================== sctp: a bunch of fixes for some sparse warnings As Eric noticed, when running 'make C=2 M=net/sctp/', a plenty of warnings or errors checked by sparse appear. They are all problems about Endian and type cast. Most of them are just warnings by which no issues could be caused while some might be bugs. This patchset fixes them with four patches basically according to how they are introduced. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
These warnings were found by running 'make C=2 M=net/sctp/'. They are there since very beginning. Note after this patch, there still one warning left in sctp_outq_flush(): sctp_chunk_fail(chunk, SCTP_ERROR_INV_STRM) Since it has been moved to sctp_stream_outq_migrate on net-next, to avoid the extra job when merging net-next to net, I will post the fix for it after the merging is done. Reported-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
These warnings were found by running 'make C=2 M=net/sctp/'. Commit d4d6fb57 ("sctp: Try not to change a_rwnd when faking a SACK from SHUTDOWN.") expected to use the peers old rwnd and add our flight size to the a_rwnd. But with the wrong Endian, it may not work as well as expected. So fix it by converting to the right value. Fixes: d4d6fb57 ("sctp: Try not to change a_rwnd when faking a SACK from SHUTDOWN.") Reported-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
These warnings were found by running 'make C=2 M=net/sctp/'. They are introduced by not aware of Endian for the port when coding transport rhashtable patches. Fixes: 7fda702f ("sctp: use new rhlist interface on sctp transport rhashtable") Reported-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
These warnings were found by running 'make C=2 M=net/sctp/'. They are introduced by not aware of Endian when coding stream reconf patches. Since commit c0d8bab6 ("sctp: add get and set sockopt for reconf_enable") enabled stream reconf feature for users, the Fixes tag below would use it. Fixes: c0d8bab6 ("sctp: add get and set sockopt for reconf_enable") Reported-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
Davide found the following script triggers a NULL pointer dereference: ip l a name eth0 type dummy tc q a dev eth0 parent :1 handle 1: htb This is because for a freshly created netdevice noop_qdisc is attached and when passing 'parent :1', kernel actually tries to match the major handle which is 0 and noop_qdisc has handle 0 so is matched by mistake. Commit 69012ae4 tries to fix a similar bug but still misses this case. Handle 0 is not a valid one, should be just skipped. In fact, kernel uses it as TC_H_UNSPEC. Fixes: 69012ae4 ("net: sched: fix handling of singleton qdiscs with qdisc_hash") Fixes: 59cc1f61 ("net: sched:convert qdisc linked list to hashtable") Reported-by: NDavide Caratti <dcaratti@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
Now when migrating sock to another one in sctp_sock_migrate(), it only resets owner sk for the data in receive queues, not the chunks on out queues. It would cause that data chunks length on the sock is not consistent with sk sk_wmem_alloc. When closing the sock or freeing these chunks, the old sk would never be freed, and the new sock may crash due to the overflow sk_wmem_alloc. syzbot found this issue with this series: r0 = socket$inet_sctp() sendto$inet(r0) listen(r0) accept4(r0) close(r0) Although listen() should have returned error when one TCP-style socket is in connecting (I may fix this one in another patch), it could also be reproduced by peeling off an assoc. This issue is there since very beginning. This patch is to reset owner sk for the chunks on out queues so that sk sk_wmem_alloc has correct value after accept one sock or peeloff an assoc to one sock. Note that when resetting owner sk for chunks on outqueue, it has to sctp_clear_owner_w/skb_orphan chunks before changing assoc->base.sk first and then sctp_set_owner_w them after changing assoc->base.sk, due to that sctp_wfree and it's callees are using assoc->base.sk. Reported-by: NDmitry Vyukov <dvyukov@google.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
John Fastabend says: ==================== net: sockmap fixes Last two fixes (as far as I know) for sockmap code this round. First, we are using the qdisc cb structure when making the data end calculation. This is really just wrong so, store it with the other metadata in the correct tcp_skb_cb sturct to avoid breaking things. Next, with recent work to attach multiple programs to a cgroup a specific enumeration of return codes was agreed upon. However, I wrote the sk_skb program types before seeing this work and used a different convention. Patch 2 in the series aligns the return codes to avoid breaking with this infrastructure and also aligns with other programming conventions to avoid being the odd duck out forcing programs to remember SK_SKB programs are different. Pusing to net because its a user visible change. With this SK_SKB program return codes are the same as other cgroup program types. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Fastabend 提交于
Recent additions to support multiple programs in cgroups impose a strict requirement, "all yes is yes, any no is no". To enforce this the infrastructure requires the 'no' return code, SK_DROP in this case, to be 0. To apply these rules to SK_SKB program types the sk_actions return codes need to be adjusted. This fix adds SK_PASS and makes 'SK_DROP = 0'. Finally, remove SK_ABORTED to remove any chance that the API may allow aborted program flows to be passed up the stack. This would be incorrect behavior and allow programs to break existing policies. Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Fastabend 提交于
SK_SKB program types use bpf_compute_data to store the end of the packet data. However, bpf_compute_data assumes the cb is stored in the qdisc layer format. But, for SK_SKB this is the wrong layer of the stack for this type. It happens to work (sort of!) because in most cases nothing happens to be overwritten today. This is very fragile and error prone. Fortunately, we have another hole in tcp_skb_cb we can use so lets put the data_end value there. Note, SK_SKB program types do not use data_meta, they are failed by sk_skb_is_valid_access(). Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Torvalds 提交于
Merge tag 'kbuild-fixes-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - fix O= building on dash - remove unused dependency in Makefile - fix default of a choice in Kconfig - fix typos and documentation style - fix command options unrecognized by sparse * tag 'kbuild-fixes-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: clang: fix build failures with sparse check kbuild doc: a bundle of fixes on makefiles.txt Makefile: kselftest: fix grammar typo kbuild: Fix optimization level choice default kbuild: drop unused symverfile in Makefile.modpost kbuild: revert $(realpath ...) to $(shell cd ... && /bin/pwd)
-