- 13 Dec, 2013 1 commit
-
-
Committed by Jerry Chu
This patch modifies the GRO stack to avoid the use of "network_header" and associated macros like ip_hdr() and ipv6_hdr() in order to allow an arbitrary number of IP hdrs (v4 or v6) to be used in the encapsulation chain. This lays the foundation for various IP tunneling support (IP-in-IP, GRE, VXLAN, SIT,...) to be added later.

With this patch, the GRO stack traversing now is mostly based on skb_gro_offset rather than special hdr offsets saved in skb (e.g., skb->network_header). As a result all but the top layer (i.e., the transport layer) must have hdrs of the same length in order for a pkt to be considered for aggregation. Therefore when adding a new encap layer (e.g., for tunneling), one must check and skip flows (e.g., by setting NAPI_GRO_CB(p)->same_flow to 0) that have a different hdr length.

Note that unlike the network header, the transport header can and will continue to be set by the GRO code since there will be at most one "transport layer" in the encap chain.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
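The header-length rule described in this commit can be illustrated with a small user-space sketch. The struct and function names below are hypothetical stand-ins and are not taken from the kernel; the sketch only shows the idea of clearing a same_flow marker when a held packet's header length at the current encapsulation layer differs from the incoming packet's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-packet GRO state; not the kernel's struct. */
struct gro_cb_sketch {
	bool same_flow;    /* true while the held packet may still be merged */
	size_t encap_hlen; /* header length at the layer currently being parsed */
};

/*
 * Clear same_flow on every held packet whose header length at this layer
 * differs from the incoming packet's, so it is skipped for aggregation.
 */
void gro_skip_mismatched_hlen(struct gro_cb_sketch *held, size_t n,
			      size_t new_hlen)
{
	assert(held != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!held[i].same_flow)
			continue; /* already ruled out by an earlier layer */
		if (held[i].encap_hlen != new_hlen)
			held[i].same_flow = false;
	}
}
```

A new encapsulation layer would run a check of this kind before handing the packet to the next layer's receive handler.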
-
- 12 Dec, 2013 5 commits
-
-
Committed by Jiri Benc
RFC 4191 states in 3.5:

  When a host avoids using any non-reachable router X and instead sends a data packet to another router Y, and the host would have used router X if router X were reachable, then the host SHOULD probe each such router X's reachability by sending a single Neighbor Solicitation to that router's address. A host MUST NOT probe a router's reachability in the absence of useful traffic that the host would have sent to the router if it were reachable. In any case, these probes MUST be rate-limited to no more than one per minute per router.

Currently, when the neighbour corresponding to a router falls into NUD_FAILED, it's never considered again. Introduce a new rt6_nud_state value, RT6_NUD_FAIL_PROBE, which suggests the route should not be used but should be probed with a single NS. The probe is ratelimited by the existing code.

To better distinguish meanings of the failure values, rename RT6_NUD_FAIL_SOFT to RT6_NUD_FAIL_DO_RR.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by wangweidong
In sctp_err_lookup(), the code only jumps to the out label while asoc is not NULL, so remove the NULL check. Likewise, sctp_err_finish() is called by sctp_v4_err() and sctp_v6_err(), which only pass it a non-NULL asoc, so remove the check there as well.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yang Yingliang
qdisc_put_rtab() already checks rtab for NULL. Remove the redundant NULL checks made by callers outside of qdisc_put_rtab().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
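The pattern behind this cleanup is a release function that tolerates NULL itself, in the same way free(NULL) does, so callers need no guard. The sketch below is a user-space analogue with hypothetical names and a hypothetical return value added for testing; it is not the body of qdisc_put_rtab().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted table. */
struct rate_table_sketch {
	int refcnt;
};

/*
 * Drop one reference. NULL is accepted, so callers do not need their own
 * "if (tab)" guard. Returns 1 if the table was freed, 0 otherwise.
 */
int rate_table_put(struct rate_table_sketch *tab)
{
	if (tab == NULL)
		return 0;
	assert(tab->refcnt > 0);
	if (--tab->refcnt > 0)
		return 0; /* other users still hold a reference */
	free(tab);
	return 1;
}
```

With the check inside the function, a call site shrinks from a guarded call to a plain `rate_table_put(tab);`.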
-
Committed by Nicolas Dichtel
The help of this function says: "in_dev: only on this interface, 0=any interface", but since commit 39a6d063 ("[NETNS]: Process inet_confirm_addr in the correct namespace."), the code supposes that it will never be NULL. This function is never called with in_dev == NULL, but it's exported and may be used by an external module.

Because this patch restores the ability to call inet_confirm_addr() with in_dev == NULL, I partially revert the above commit, as suggested by Julian.

CC: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yang Yingliang
SKIP_NONLOCAL hides the control flow. The control flow should be inlined and expanded explicitly in code so that someone who reads it can tell the control flow can be changed by the statement.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
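To show why a macro that hides control flow is hard to read, here is a sketch that contrasts a macro containing a hidden `continue` with the explicit form. SKIP_NEGATIVE and both functions are hypothetical examples and are not the SKIP_NONLOCAL macro itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro in the style being removed: the 'continue' is hidden. */
#define SKIP_NEGATIVE(x) if ((x) < 0) continue

long sum_with_macro(const int *v, size_t n)
{
	long sum = 0;

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		SKIP_NEGATIVE(v[i]); /* not obvious that this can skip the add */
		sum += v[i];
	}
	return sum;
}

long sum_explicit(const int *v, size_t n)
{
	long sum = 0;

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (v[i] < 0)
			continue; /* the jump is visible at the call site */
		sum += v[i];
	}
	return sum;
}
```

Both functions compute the same result; only the second makes the loop's early exit visible to the reader.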
-
- 11 Dec, 2013 18 commits
-
-
Committed by Ying Xue
In early versions of TIPC it was possible to administratively block individual links through the use of the member flag 'blocked'. This functionality was deemed redundant, and since commit 7368dd ("tipc: clean out all instances of #if 0'd unused code"), this flag has been unused.

In the current code, a link only needs to be blocked for sending and reception if it is subject to an ongoing link failover. In that case, it is sufficient to check if the number of expected failover packets is non-zero, something which is done via the function 'link_blocked()'.

This commit finally removes the redundant 'blocked' flag completely.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ying Xue
Currently TIPC supports two L2 media types, Ethernet and Infiniband. Because both these media are accessed through the common net_device API, several functions in the two media adaptation files turn out to be fully or almost identical, leading to unnecessary code duplication.

In this commit we extract this common code from the two media files and move it to the generic bearer.c. Additionally, we change the function names to reflect their real role: to access L2 media, irrespective of type.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Cc: Patrick McHardy <kaber@trash.net>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ying Xue
Currently, registering a TIPC stack handler in the network device layer is done twice, once each for Ethernet (eth_media) and Infiniband (ib_media). But, as this registration is not media specific, we can avoid some code duplication by moving the registering function to the generic bearer layer, to the file bearer.c, and calling it only once. The same is true for the network device event notifier.

As a side effect, the two workqueues we are using for setting up/cleaning up media can now be eliminated. Furthermore, the array for storing the specific media type structs, media_array[], can be entirely deleted.

Note that the eth_started and ib_started flags were removed during the code relocation. There is now only one call to bearer_setup and bearer_cleanup, and these can logically not race against each other.

Despite its size, this cleanup work incurs no functional changes in TIPC. In particular, it should be noted that the sequence ordering of received packets is unaffected by this change, since packet reception never was subject to any work queue handling in the first place.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ying Xue
TIPC is currently using the field 'af_packet_priv' in struct net_device as a handle to find the bearer instance associated to the given network device. But, by doing so it is blocking other networking cleanups, such as the one discussed here:

http://patchwork.ozlabs.org/patch/178044/

This commit removes this usage from TIPC. Instead, we introduce a new field, 'tipc_ptr', to the net_device structure, to serve this purpose. When a TIPC bearer is enabled, the bearer object is associated to 'tipc_ptr'. When a TIPC packet arrives in the recv_msg() upcall from a networking device, the bearer object can now be obtained from 'tipc_ptr'. When a bearer is disabled, the bearer object is detached from its underlying network device by setting 'tipc_ptr' to NULL.

Additionally, an RCU lock is used to protect the new pointer. Henceforth, the existing tipc_net_lock is used in write mode to serialize write accesses to this pointer, while the new RCU lock is applied on the read side to ensure that the pointer is 100% valid within its wrapped area for all readers.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Cc: Patrick McHardy <kaber@trash.net>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jon Paul Maloy
struct 'tipc_media' represents the specific info that the media layer adaptors (eth_media and ib_media) expose to the generic bearer layer. We clarify this by improved commenting, and by giving the 'media_list' array the more appropriate name 'media_info_array'.

There are no functional changes in this commit.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jon Paul Maloy
Communication media types are abstracted through the struct 'tipc_media', one per media type. These structs are allocated statically inside their respective media file.

Furthermore, in order to be able to reach all instances from a central location, we keep a static array with pointers to these structs. This array is currently initialized at runtime, under protection of tipc_net_lock. However, since the contents of the array itself never change after initialization, we can just as well initialize it at compile time and make it 'const', at the same time making it obvious that no lock protection is needed here.

This commit makes the array constant and removes the redundant lock protection.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
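A compile-time initialized, const array of pointers, as described in this commit, might look like the following user-space sketch. The struct layout, the type_id values and the lookup function are hypothetical and only illustrate why read-only data of this kind needs no lock.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified description of one media type. */
struct media_sketch {
	const char *name;
	int type_id; /* illustrative value, not a real TIPC constant */
};

static const struct media_sketch eth_media_sketch = { "eth", 1 };
static const struct media_sketch ib_media_sketch = { "ib", 2 };

/* Filled in at compile time and never written again: no lock is needed. */
static const struct media_sketch * const media_table[] = {
	&eth_media_sketch,
	&ib_media_sketch,
};

/* Look up a media description by name; returns NULL if unknown. */
const struct media_sketch *media_find(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(media_table) / sizeof(media_table[0]); i++) {
		if (strcmp(media_table[i]->name, name) == 0)
			return media_table[i];
	}
	return NULL;
}
```

Because both the array and the objects it points to are const, concurrent readers cannot observe a partially initialized table.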
-
Committed by Ying Xue
sk_buff lists are currently released by looping over the list and explicitly releasing each buffer.

We replace all occurrences of this loop with a call to kfree_skb_list().

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
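The open-coded loop that this commit replaces has roughly the shape below. This is a user-space analogue with a hypothetical node type and a hypothetical return value; it is not the kernel's kfree_skb_list(), only the pattern such a helper wraps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical singly linked buffer, standing in for a chained sk_buff. */
struct buf_sketch {
	struct buf_sketch *next;
};

/* Free every buffer in the list; returns how many were released. */
size_t buf_list_free(struct buf_sketch *head)
{
	size_t freed = 0;

	while (head != NULL) {
		struct buf_sketch *next = head->next; /* read before freeing */

		free(head);
		head = next;
		freed++;
	}
	return freed;
}
```

Centralizing the loop in one helper removes the risk of a caller reading `head->next` after the node has already been freed.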
-
Committed by Yang Yingliang
Macros with multiple statements should be enclosed in a do - while loop.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
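The reason for the do - while wrapper is that a multi-statement macro must behave as a single statement, for example in an unbraced if/else. The macro and function below are hypothetical illustrations, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Wrapped in do { } while (0) so that the expansion is exactly one
 * statement and the trailing semicolon at the call site is consumed.
 */
#define SWAP_INT(a, b)			\
	do {				\
		int tmp_ = (a);		\
		(a) = (b);		\
		(b) = tmp_;		\
	} while (0)

/* Put *lo and *hi in ascending order, or count an already sorted pair. */
void order_pair(int *lo, int *hi, int *already_sorted)
{
	assert(lo != NULL && hi != NULL && already_sorted != NULL);
	if (*lo > *hi)
		SWAP_INT(*lo, *hi); /* a bare { ... }; here would orphan the else */
	else
		(*already_sorted)++;
}
```

If SWAP_INT expanded to a bare brace block, the semicolon after the macro call would end the if statement and the following else would fail to compile.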
-
Committed by Yang Yingliang
Spaces required around that '>' (ctx:VxV) and before the open parenthesis '('.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yang Yingliang
"foo* bar" or "foo * bar" should be "foo *bar".

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yang Yingliang
Code indent should use tabs where possible.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yang Yingliang
return is not a function, parentheses are not required.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Yann Droneaud
This patch makes socketpair() use error paths which do not rely on a heavy-weight call to sys_close(): it's better to try to push the file descriptor to userspace before installing the socket file to the file descriptor, so that errors are caught earlier and are easier to handle.

Using sys_close() seems to be the exception, while writing the file descriptor before installing it looks like it's more or less the norm: e.g., except for code used in init/, error handling involves fput() and put_unused_fd(), but not sys_close().

This makes socketpair()'s usage of sys_close() quite unusual. So it deserves to be replaced by the common pattern relying on fput() and put_unused_fd() just like, for example, the one used in pipe(2) or recvmsg(2).

Three distinct error paths are still needed since calling fput() on the file structure returned by sock_alloc_file() will implicitly call sock_release() on the associated socket structure.

Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=1385979146-13825-1-git-send-email-ydroneaud@opteya.com
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by stephen hemminger
Various spelling fixes in networking stack.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
This fixes a compile error when CONFIG_NET_NS is not set.

Introduced by: commit 1d4c8c29 "neigh: restore old behaviour of default parms values"

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
It turned out that applications like ifconfig do not handle the change. So revert the ifa_flag format back to a 2-letter hex value.

Introduced by: commit 479840ff "ipv6 addrconf: extend ifa_flags to u32"

Reported-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Tested-by: FLorent Fourcot <florent.fourcot@enst-bretagne.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
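One way to keep a two-character hex field after the flags have grown to 32 bits is to print only the low 8 bits. The sketch below assumes that truncation approach; the function name is hypothetical and the code is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format address flags as exactly two hex digits by keeping only the low
 * 8 bits, so parsers that expect the old fixed-width field keep working.
 * Returns the number of characters snprintf() would have written.
 */
int format_ifa_flags(char *buf, size_t len, uint32_t flags)
{
	assert(buf != NULL && len >= 3);
	return snprintf(buf, len, "%02x", (unsigned int)(uint8_t)flags);
}
```

Any flag bits above the low byte are dropped from this textual form and would have to be reported through another interface.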
-
- 10 Dec, 2013 16 commits
-
-
Committed by Florent Fourcot
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Florent Fourcot
And use it if possible.

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Florent Fourcot
tclass information is now already stored in rcv_flowinfo. We do not need to store the same information twice.

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Florent Fourcot
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Florent Fourcot
The current implementation of IPV6_FLOWINFO only gives a result if pktoptions is available (thanks to the ip6_datagram_recv_ctl function). It gives inconsistent results to user space: sometimes there is a result for getsockopt(IPV6_FLOWINFO), sometimes not.

This patch adds rcv_flowinfo to store it, and returns it to userspace in the same way as other pkt_options.

Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Joe Perches
Use the newly added generic routine ether_addr_equal_unaligned to test whether Ethernet addresses that may not be aligned to u16 are equal.

This slightly improves comparison time for systems with CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
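A comparison that is safe for addresses at any alignment can be sketched in user space with memcmp(), which makes no alignment assumptions. The function and constant names are hypothetical; this illustrates the alignment-agnostic case only and is not the kernel routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN_SKETCH 6 /* an Ethernet address is 6 bytes long */

/* Compare two Ethernet addresses that may start at any byte offset. */
bool mac_equal_unaligned(const unsigned char *a, const unsigned char *b)
{
	assert(a != NULL && b != NULL);
	return memcmp(a, b, MAC_LEN_SKETCH) == 0;
}
```

On hardware that handles unaligned loads efficiently, a word-based comparison can be used instead, which is where the speed-up mentioned in the commit comes from.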
-
Committed by Jiri Pirko
Make the behaviour similar to ipv4. This will allow the user to set sysctl default neigh param values, and these values will be respected even by devices registered before (those that do not have an address set yet).

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
Previously inet devices were only constructed when addresses were added. Therefore the default neigh parms values they get are the ones at the time of these operations.

Now that we're creating inet devices earlier, this changes the behaviour of default neigh parms values in an incompatible way (see bug #8519).

This patch creates a compromise by setting the default values at the same point as before but only for those that have not been explicitly set by the user since the inet device's creation.

Introduced by:
commit 8030f544
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu Feb 22 01:53:47 2007 +0900

    [IPV4] devinet: Register inetdev earlier.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
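The compromise described here, applying defaults late but only to values the user has not touched, can be sketched with a per-parameter "explicitly set" bitmask. All names below are hypothetical, and the bitmask is an assumption about one possible way to track this state rather than the patch's actual data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter indexes. */
enum {
	NP_RETRANS_TIME,
	NP_REACHABLE_TIME,
	NP_GC_STALETIME,
	NP_COUNT
};

struct neigh_parms_sketch {
	int value[NP_COUNT];
	uint32_t user_set; /* bit i is set once the user wrote value[i] */
};

/* Record a value chosen by the user and remember that it was set. */
void parms_set_by_user(struct neigh_parms_sketch *p, int idx, int v)
{
	assert(p != NULL && idx >= 0 && idx < NP_COUNT);
	p->value[idx] = v;
	p->user_set |= UINT32_C(1) << idx;
}

/* Copy defaults only into entries the user has not explicitly set. */
void parms_apply_defaults(struct neigh_parms_sketch *p, const int *defaults)
{
	assert(p != NULL && defaults != NULL);
	for (int i = 0; i < NP_COUNT; i++) {
		if (!(p->user_set & (UINT32_C(1) << i)))
			p->value[i] = defaults[i];
	}
}
```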
-
Committed by Jiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
This will be needed later on to provide better management of default values.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
This patch converts the neigh param members to an array. This allows easier manipulation, which will be needed later on to provide better management of default values.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Erik Hugne
struct 'tipc_bearer' is a generic representation of the underlying media type, and exists in a one-to-one relationship to each interface TIPC is using. The struct contains a 'blocked' flag that mirrors the operational and execution state of the represented interface, and is updated through notification calls from the latter. The users of tipc_bearer are checking this flag before each attempt to send a packet via the interface.

This state mirroring serves no purpose in the current code base. TIPC links will not discover a media failure any faster through this mechanism, and in reality the flag only adds overhead at packet sending and reception.

Furthermore, the fact that the flag needs to be protected by a spinlock aggregated into tipc_bearer has turned out to cause a serious and completely unnecessary deadlock problem.

          CPU0                               CPU1
          ----                               ----
  Time 0: bearer_disable()                   link_timeout()
  Time 1: spin_lock_bh(&b_ptr->lock)         tipc_link_push_queue()
  Time 2: tipc_link_delete()                 tipc_bearer_blocked(b_ptr)
  Time 3: k_cancel_timer(&req->timer)        spin_lock_bh(&b_ptr->lock)
  Time 4: del_timer_sync(&req->timer)

I.e., del_timer_sync() on CPU0 never returns, because the timer handler on CPU1 is waiting for the bearer lock.

We eliminate the 'blocked' flag from struct tipc_bearer, along with all tests on this flag. This not only resolves the deadlock, but also simplifies and speeds up the data path execution of TIPC. It also fits well into our ongoing effort to make the locking policy simpler and more manageable.

An effect of this change is that we can get rid of functions such as tipc_bearer_blocked(), tipc_continue() and tipc_block_bearer(). We replace the latter with a new function, tipc_reset_bearer(), which resets all links associated to the bearer immediately after an interface goes down.

A user might notice one slight change in link behaviour after this change. When an interface goes down (e.g. through a NETDEV_DOWN event), all attached links will be reset immediately, instead of leaving it to each link to detect the failure through a timer-driven mechanism. We consider this an improvement, and see no obvious risks with the new behavior.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <Paul.Gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by wangweidong
Use pr_<level> instead of printk(LEVEL).

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Daniel Borkmann
This patch introduces a PACKET_QDISC_BYPASS socket option, that allows for using a similar xmit() function as in pktgen instead of taking the dev_queue_xmit() path. This can be very useful when PF_PACKET applications are required to be used in a similar scenario as pktgen, but with full, flexible packet payload that needs to be provided, for example.

By default, nothing changes in behaviour for normal PF_PACKET TX users, so everything stays as is for applications. New users, however, can now set PACKET_QDISC_BYPASS if needed to i) prevent their own packets from reentering packet_rcv() and ii) directly push the frame to the driver.

In doing so we can increase pps (here 64 byte packets) for PF_PACKET a bit:

  # CPUs  -- QDISC_BYPASS   -- qdisc path -- qdisc path[**]
  1 CPU   ==  1,509,628 pps --  1,208,708 --  1,247,436
  2 CPUs  ==  3,198,659 pps --  2,536,012 --  1,605,779
  3 CPUs  ==  4,787,992 pps --  3,788,740 --  1,735,610
  4 CPUs  ==  6,173,956 pps --  4,907,799 --  1,909,114
  5 CPUs  ==  7,495,676 pps --  5,956,499 --  2,014,422
  6 CPUs  ==  9,001,496 pps --  7,145,064 --  2,155,261
  7 CPUs  == 10,229,776 pps --  8,190,596 --  2,220,619
  8 CPUs  == 11,040,732 pps --  9,188,544 --  2,241,879
  9 CPUs  == 12,009,076 pps -- 10,275,936 --  2,068,447
  10 CPUs == 11,380,052 pps -- 11,265,337 --  1,578,689
  11 CPUs == 11,672,676 pps -- 11,845,344 --  1,297,412
  [...]
  20 CPUs == 11,363,192 pps -- 11,014,933 --  1,245,081

  [**]: qdisc path with packet_rcv(), how probably most people seem to use it (hopefully not anymore if not needed)

The test was done using a modified trafgen, sending a simple static 64 bytes packet, on all CPUs. The trick in the fast "qdisc path" case is to avoid reentering packet_rcv() by setting the RAW socket protocol to zero, like: socket(PF_PACKET, SOCK_RAW, 0);

Tradeoffs are documented as well in this patch: clearly, if queues are busy, we will drop more packets, tc disciplines are ignored, and these packets are not visible to taps anymore. For a pktgen like scenario, we argue that this is acceptable.

The pointer to the xmit function has been placed in the packet socket structure hole between cached_dev and prot_hook that is hot anyway as we're working on cached_dev in each send path.

Done in joint work together with Jesper Dangaard Brouer.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
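From user space, enabling the option described above might look like the following sketch. The helper names are hypothetical, the fallback value 20 for PACKET_QDISC_BYPASS is an assumption for older headers that lack the definition, and opening a PF_PACKET socket normally requires CAP_NET_RAW.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if_packet.h>

#ifndef PACKET_QDISC_BYPASS
#define PACKET_QDISC_BYPASS 20 /* assumed value for headers that lack it */
#endif

/* Ask the kernel to bypass the qdisc layer on this packet socket.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int enable_qdisc_bypass(int fd)
{
	int one = 1;

	return setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS,
			  &one, sizeof(one));
}

/* Open a TX-oriented packet socket with protocol 0, as in the commit
 * message, then enable the bypass. Returns the fd, or -1 on failure. */
int open_packet_socket_bypass(void)
{
	int fd = socket(PF_PACKET, SOCK_RAW, 0);

	if (fd < 0)
		return -1;
	if (enable_qdisc_bypass(fd) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

As the commit notes, frames sent this way skip tc disciplines and are not visible to taps, so the option suits pktgen-like workloads rather than general traffic.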
-
Committed by Daniel Borkmann
As we need it elsewhere, move the inline helper function of skb_needs_linearize() over to the skbuff.h include file. While at it, also convert the return type to 'bool' instead of 'int' and add a proper kernel doc.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Daniel Borkmann
Commit e40526cb introduced a cached dev pointer, that gets hooked into register_prot_hook(), __unregister_prot_hook() to update the device used for the send path.

We need to fix this up, as otherwise this will not work with sockets created with protocol = 0, plus with sll_protocol = 0 passed via sockaddr_ll when doing the bind.

So instead, assign the pointer directly. The compiler can inline these helper functions automagically.

While at it, also assume the cached dev fast-path as likely(), and document this variant of socket creation as it seems it is not widely used (seems not even the author of TX_RING was aware of that in his reference example [1]). Tested with reproducer from e40526cb.

[1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example

Fixes: e40526cb ("packet: fix use after free race in send path when dev is released")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Salam Noureddine <noureddine@aristanetworks.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-