- 12 6月, 2014 6 次提交
-
-
由 Daniel Borkmann 提交于
In function sctp_select_active_and_retran_path(), we walk the transport list in order to look for the two most recently used ACTIVE transports (trans_pri, trans_sec). In case we didn't find anything ACTIVE, we currently just camp on a possibly PF or INACTIVE transport that is primary path; this behavior actually dates back to linux-history tree of the very early days of lksctp, and can yield a behavior that chooses suboptimal transport paths. Instead, be a bit more clever by reusing and extending the recently introduced sctp_trans_elect_best() handler. In case both transports are evaluated to have the same score resulting from their states, break the tie by looking at: 1) transport patch error count 2) last_time_heard value from each transport. This is analogous to Nishida's Quick Failover draft [1], section 5.1, 3: The sender SHOULD avoid data transmission to PF destinations. When all destinations are in either PF or Inactive state, the sender MAY either move the destination from PF to active state (and transmit data to the active destination) or the sender MAY transmit data to a PF destination. In the former scenario, (i) the sender MUST NOT notify the ULP about the state transition, and (ii) MUST NOT clear the destination's error counter. It is recommended that the sender picks the PF destination with least error count (fewest consecutive timeouts) for data transmission. In case of a tie (multiple PF destinations with same error count), the sender MAY choose the last active destination. Thus for sctp_select_active_and_retran_path(), we keep track of the best, if any, transport that is in PF state and in case no ACTIVE transport has been found (hence trans_{pri,sec} is NULL), we select the best out of the three: current primary_path and retran_path as well as a possible PF transport. The secondary may still camp on the original primary_path as before. The change in sctp_trans_elect_best() with a more fine grained tie selection also improves at the same time path selection for sctp_assoc_update_retran_path() in case of non-ACTIVE states. [1] http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Be more precise in transport path selection and use ktime helpers instead of jiffies to compare and pick the better primary and secondary recently used transports. This also avoids any side-effects during a possible roll-over, and could lead to better path decision-making. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This patch just refactors and moves the code for the active path selection into its own helper function outside of sctp_assoc_control_transport() which is already big enough. No functional changes here. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Add two minimal helper functions analogous to time_before() and time_after() that will later on both be needed by SCTP code. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Phoebe Buckheister 提交于
Only one WPAN devices can be active at any given time, so only deliver packets to that one interface that is actually up. Multiple monitors may be up at any given time, but we don't have to deliver to monitors that are down either. Signed-off-by: NPhoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Phoebe Buckheister 提交于
mac802154 RX did not free skbs on decryption failure, assuming that the caller would when the local rx handler returned _DROP. This was false. Signed-off-by: NPhoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 6月, 2014 7 次提交
-
-
由 Alexei Starovoitov 提交于
The macro 'A' used in internal BPF interpreter: #define A regs[insn->a_reg] was easily confused with the name of classic BPF register 'A', since 'A' would mean two different things depending on context. This patch is trying to clean up the naming and clarify its usage in the following way: - A and X are names of two classic BPF registers - BPF_REG_A denotes internal BPF register R0 used to map classic register A in internal BPF programs generated from classic - BPF_REG_X denotes internal BPF register R7 used to map classic register X in internal BPF programs generated from classic - internal BPF instruction format: struct sock_filter_int { __u8 code; /* opcode */ __u8 dst_reg:4; /* dest register */ __u8 src_reg:4; /* source register */ __s16 off; /* signed offset */ __s32 imm; /* signed immediate constant */ }; - BPF_X/BPF_K is 1 bit used to encode source operand of instruction In classic: BPF_X - means use register X as source operand BPF_K - means use 32-bit immediate as source operand In internal: BPF_X - means use 'src_reg' register as source operand BPF_K - means use 32-bit immediate as source operand Suggested-by: NChema Gonzalez <chema@google.com> Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NChema Gonzalez <chema@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Lüssing 提交于
Adding bridge support to the batman-adv multicast optimization requires batman-adv knowing about the existence of bridged-in IGMP/MLD queriers to be able to reliably serve any multicast listener behind this same bridge. Signed-off-by: NLinus Lüssing <linus.luessing@web.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Lüssing 提交于
With this new, exported function br_multicast_list_adjacent(net_dev) a list of IPv4/6 addresses is returned. This list contains all multicast addresses sensed by the bridge multicast snooping feature on all bridge ports of the bridge interface of net_dev, excluding addresses from the specified net_device itself. Adding bridge support to the batman-adv multicast optimization requires batman-adv knowing about the existence of bridged-in multicast listeners to be able to reliably serve them with multicast packets. Signed-off-by: NLinus Lüssing <linus.luessing@web.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Lüssing 提交于
MLDv1 (RFC2710 section 6), MLDv2 (RFC3810 section 7.6.2), IGMPv2 (RFC2236 section 3) and IGMPv3 (RFC3376 section 6.6.2) specify that the querier with lowest source address shall become the selected querier. So far the bridge stopped its querier as soon as it heard another querier regardless of its source address. This results in the "wrong" querier potentially becoming the active querier or a potential, unnecessary querying delay. With this patch the bridge memorizes the source address of the currently selected querier and ignores queries from queriers with a higher source address than the currently selected one. This slight optimization is supposed to make it more RFC compliant (but is rather uncritical and therefore probably not necessary to be queued for stable kernels). Signed-off-by: NLinus Lüssing <linus.luessing@web.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Linus Lüssing 提交于
The current naming of these two structs is very random, in that reversing their naming would not make any semantical difference. This patch tries to make the naming less confusing by giving them a more specific, distinguishable naming. This is also useful for the upcoming patches reintroducing the "struct bridge_mcast_querier" but for storing information about the selected querier (no matter if our own or a foreign querier). Signed-off-by: NLinus Lüssing <linus.luessing@web.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 stephen hemminger 提交于
There is no need to require forcing device down on a Ethernet GRE (gretap) tunnel to change the MAC address. Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
tcp_fragment can be called from process context (from tso_fragment). Add a new gfp parameter to allow it to preserve atomic memory if possible. Signed-off-by: NOctavian Purdila <octavian.purdila@intel.com> Reviewed-by: NChristoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 6月, 2014 1 次提交
-
-
由 Phoebe Buckheister 提交于
During key removal, the key object is freed, but not taken out of the llsec key list properly. Fix that. Signed-off-by: NPhoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 6月, 2014 4 次提交
-
-
由 Sven Wegener 提交于
To avoid the confusion of having two variables, shrink the function to only use the parameter variable for looping. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexei Starovoitov 提交于
BPF classic->internal converter broke SKF_AD_PKTTYPE extension, since pkt_type_offset() was failing to find skb->pkt_type field which is defined as: __u8 pkt_type:3, fclone:2, ipvs_property:1, peeked:1, nf_trace:1; Fix it by searching for 3 most significant bits and shift them by 5 at run-time Fixes: bd4cf0ed ("net: filter: rework/optimize internal BPF interpreter's instruction set") Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com> Acked-by: NDaniel Borkmann <dborkman@redhat.com> Tested-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Toshiaki Makita 提交于
br_manage_promisc() incorrectly expects br_auto_port() to return only 0 or 1, while it actually returns flags, i.e., a subset of BR_AUTO_MASK. Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Simon Horman 提交于
If an MPLS packet requires segmentation then use mpls_features to determine if the software implementation should be used. As no driver advertises MPLS GSO segmentation this will always be the case. I had not noticed that this was necessary before as software MPLS GSO segmentation was already being used in my test environment. I believe that the reason for that is the skbs in question always had fragments and the driver I used does not advertise NETIF_F_FRAGLIST (which seems to be the case for most drivers). Thus software segmentation was activated by skb_gso_ok(). This introduces the overhead of an extra call to skb_network_protocol() in the case where where CONFIG_NET_MPLS_GSO is set and skb->ip_summed == CHECKSUM_NONE. Thanks to Jesse Gross for prompting me to investigate this. Signed-off-by: NSimon Horman <horms@verge.net.au> Acked-by: NYAMAMOTO Takashi <yamamoto@valinux.co.jp> Acked-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 6月, 2014 12 次提交
-
-
由 WANG Cong 提交于
Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
It is available since v3.15-rc5. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Manuel Schölling 提交于
According to RFC1035 "[...] the total length of a domain name (i.e., label octets and label length octets) is restricted to 255 octets or less." Signed-off-by: NManuel Schölling <manuel.schoelling@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Added VXLAN link configuration for sending UDP checksums, and allowing TX and RX of UDP6 checksums. Also, call common iptunnel_handle_offloads and added GSO support for checksums. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Call gso_make_checksum. This should have the benefit of using a checksum that may have been previously computed for the packet. This also adds NETIF_F_GSO_GRE_CSUM to differentiate devices that offload GRE GSO with and without the GRE checksum offloaed. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Added a new netif feature for GSO_UDP_TUNNEL_CSUM. This indicates that a device is capable of computing the UDP checksum in the encapsulating header of a UDP tunnel. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Call common gso_make_checksum when calculating checksum for a TCP GSO segment. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
When creating a GSO packet segment we may need to set more than one checksum in the packet (for instance a TCP checksum and UDP checksum for VXLAN encapsulation). To be efficient, we want to do checksum calculation for any part of the packet at most once. This patch adds csum_start offset to skb_gso_cb. This tracks the starting offset for skb->csum which is initially set in skb_segment. When a protocol needs to compute a transport checksum it calls gso_make_checksum which computes the checksum value from the start of transport header to csum_start and then adds in skb->csum to get the full checksum. skb->csum and csum_start are then updated to reflect the checksum of the resultant packet starting from the transport header. This patch also adds a flag to skbuff, encap_hdr_csum, which is set in *gso_segment fucntions to indicate that a tunnel protocol needs checksum calculation Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Call common functions to set checksum for UDP tunnel. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Added udp_set_csum and udp6_set_csum functions to set UDP checksums in packets. These are for simple UDP packets such as those that might be created in UDP tunnels. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sven Wegener 提交于
Commit efe4208f ("ipv6: make lookups simpler and faster") introduced a regression in udp_v6_mcast_next(), resulting in multicast packets not reaching the destination sockets under certain conditions. The packet's IPv6 addresses are wrongly compared to the IPv6 addresses from the function's socket argument, which indicates the starting point for looping, instead of the loop variable. If the addresses from the first socket do not match the packet's addresses, no socket in the list will match. Signed-off-by: NSven Wegener <sven.wegener@stealer.net> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sasha Levin 提交于
This reverts commit 30f38d2f. fib_triestat is surrounded by a big lie: while it claims that it's a seq_file (fib_triestat_seq_open, fib_triestat_seq_show), it isn't: static const struct file_operations fib_triestat_fops = { .owner = THIS_MODULE, .open = fib_triestat_seq_open, .read = seq_read, .llseek = seq_lseek, .release = single_release_net, }; Yes, fib_triestat is just a regular file. A small detail (assuming CONFIG_NET_NS=y) is that while for seq_files you could do seq_file_net() to get the net ptr, doing so for a regular file would be wrong and would dereference an invalid pointer. The fib_triestat lie claimed a victim, and trying to show the file would be bad for the kernel. This patch just reverts the issue and fixes fib_triestat, which still needs a rewrite to either be a seq_file or stop claiming it is. Signed-off-by: NSasha Levin <sasha.levin@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 6月, 2014 3 次提交
-
-
由 WANG Cong 提交于
When we jump to free_pcpu on failure in alloc_netdev_mqs() rx and tx queues are not yet allocated, so no need to free them. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
It is possible that ->newlink() fails before registering the device, in this case we should just free it, it's safe to call free_netdev(). Fixes: commit 0e0eee24 (net: correct error path in rtnl_newlink()) Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NCong Wang <cwang@twopensource.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michal Kubecek 提交于
The xfrm_user module registers its pernet init/exit after xfrm itself so that its net exit function xfrm_user_net_exit() is executed before xfrm_net_exit() which calls xfrm_state_fini() to cleanup the SA's (xfrm states). This opens a window between zeroing net->xfrm.nlsk pointer and deleting all xfrm_state instances which may access it (via the timer). If an xfrm state expires in this window, xfrm_exp_state_notify() will pass null pointer as socket to nlmsg_multicast(). As the notifications are called inside rcu_read_lock() block, it is sufficient to retrieve the nlsk socket with rcu_dereference() and check the it for null. Signed-off-by: NMichal Kubecek <mkubecek@suse.cz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2014 7 次提交
-
-
由 Ben Hutchings 提交于
We should fail rather than silently ignoring use of these extensions. Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
-
由 Ben Hutchings 提交于
ETHTOOL_{G,S}RXFHINDIR and ETHTOOL_{G,S}RSSH should work for drivers regardless of whether they expose the hash key, unless you try to set a hash key for a driver that doesn't expose it. Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Acked-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Roopa Prabhu 提交于
(This patch was previously posted as RFC at http://patchwork.ozlabs.org/patch/352677/) This patch adds NDA_MASTER attribute to neighbour attributes enum for bridge/master ifindex. And adds NDA_MASTER to bridge fdb notify msgs. Today bridge fdb notifications dont contain bridge information. Userspace can derive it from the port information in the fdb notification. However this is tricky in some scenarious. Example, bridge port delete notification comes before bridge fdb delete notifications. And we have seen problems in userspace when using libnl where, the bridge fdb delete notification handling code does not understand which bridge this fdb entry is part of because the bridge and port association has already been deleted. And these notifications (port membership and fdb) are generated on separate rtnl groups. Fixing the order of notifications could possibly solve the problem for some cases (I can submit a separate patch for that). This patch chooses to add NDA_MASTER to bridge fdb notify msgs because it not only solves the problem described above, but also helps userspace avoid another lookup into link msgs to derive the master index. Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Leon Yu 提交于
__sk_prepare_filter() was reworked in commit bd4cf0ed (net: filter: rework/optimize internal BPF interpreter's instruction set) so that it should have uncharged memory once things went wrong. However that work isn't complete. Error is handled only in __sk_migrate_filter() while memory can still leak in the error path right after sk_chk_filter(). Fixes: bd4cf0ed ("net: filter: rework/optimize internal BPF interpreter's instruction set") Signed-off-by: NLeon Yu <chianglungyu@gmail.com> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Tested-by: NAlexei Starovoitov <ast@plumgrid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
This bug is discovered by an recent F-RTO issue on tcpm list https://www.ietf.org/mail-archive/web/tcpm/current/msg08794.html The bug is that currently F-RTO does not use DSACK to undo cwnd in certain cases: upon receiving an ACK after the RTO retransmission in F-RTO, and the ACK has DSACK indicating the retransmission is spurious, the sender only calls tcp_try_undo_loss() if some never retransmisted data is sacked (FLAG_ORIG_DATA_SACKED). The correct behavior is to unconditionally call tcp_try_undo_loss so the DSACK information is used properly to undo the cwnd reduction. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Make fib_triestat_seq_show consistent with other /proc/net files and use seq_file_net. Signed-off-by: NDavid Ahern <dsahern@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric W. Biederman 提交于
It was possible to get a setuid root or setcap executable to write to it's stdout or stderr (which has been set made a netlink socket) and inadvertently reconfigure the networking stack. To prevent this we check that both the creator of the socket and the currentl applications has permission to reconfigure the network stack. Unfortunately this breaks Zebra which always uses sendto/sendmsg and creates it's socket without any privileges. To keep Zebra working don't bother checking if the creator of the socket has privilege when a destination address is specified. Instead rely exclusively on the privileges of the sender of the socket. Note from Andy: This is exactly Eric's code except for some comment clarifications and formatting fixes. Neither I nor, I think, anyone else is thrilled with this approach, but I'm hesitant to wait on a better fix since 3.15 is almost here. Note to stable maintainers: This is a mess. An earlier series of patches in 3.15 fix a rather serious security issue (CVE-2014-0181), but they did so in a way that breaks Zebra. The offending series includes: commit aa4cf945 Author: Eric W. Biederman <ebiederm@xmission.com> Date: Wed Apr 23 14:28:03 2014 -0700 net: Add variants of capable for use on netlink messages If a given kernel version is missing that series of fixes, it's probably worth backporting it and this patch. if that series is present, then this fix is critical if you care about Zebra. Cc: stable@vger.kernel.org Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-