- 26 2月, 2016 10 次提交
-
-
由 David Decotigny 提交于
This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: NDavid Decotigny <decot@googlers.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This patch add the SO_CNX_ADVICE socket option (setsockopt only). The purpose is to allow an application to give feedback to the kernel about the quality of the network path for a connected socket. The value argument indicates the type of quality report. For this initial patch the only supported advice is a value of 1 which indicates "bad path, please reroute"-- the action taken by the kernel is to call dst_negative_advice which will attempt to choose a different ECMP route, reset the TX hash for flow label and UDP source port in encapsulation, etc. This facility should be useful for connected UDP sockets where only the application can provide any feedback about path quality. It could also be useful for TCP applications that have additional knowledge about the path outside of the normal TCP control loop. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Currently, all ipv6 addresses are flushed when the interface is configured down, including global, static addresses: $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 inet6 2100:1::2/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ip link set dev eth1 down $ ip -6 addr show dev eth1 << nothing; all addresses have been flushed>> Add a new sysctl to make this behavior optional. The new setting defaults to flush all addresses to maintain backwards compatibility. When the set global addresses with no expire times are not flushed on an admin down. The sysctl is per-interface or system-wide for all interfaces $ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1 or $ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1 Will keep addresses on eth1 on an admin down. $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 inet6 2100:1::2/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ip link set dev eth1 down $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000 inet6 2100:1::2/120 scope global tentative valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative valid_lft forever preferred_lft forever Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
msg.dst_sk needs to be set up with a valid socket because some callbacks later derive the netns from it. Fixes: 263ea09084d172d ("Revert "genl: Add genlmsg_new_unicast() for unicast message allocation") Reported-by: NJon Maloy <maloy@donjonn.com> Bisected-by: NJon Maloy <maloy@donjonn.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Acked-by Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
When the TIPC module is unloaded, we have identified a race condition that allows a node reference counter to go to zero and the node instance being freed before the node timer is finished with accessing it. This leads to occasional crashes, especially in multi-namespace environments. The scenario goes as follows: CPU0:(node_stop) CPU1:(node_timeout) // ref == 2 1: if(!mod_timer()) 2: if (del_timer()) 3: tipc_node_put() // ref -> 1 4: tipc_node_put() // ref -> 0 5: kfree_rcu(node); 6: tipc_node_get(node) 7: // BOOM! We now clean up this functionality as follows: 1) We remove the node pointer from the node lookup table before we attempt deactivating the timer. This way, we reduce the risk that tipc_node_find() may obtain a valid pointer to an instance marked for deletion; a harmless but undesirable situation. 2) We use del_timer_sync() instead of del_timer() to safely deactivate the node timer without any risk that it might be reactivated by the timeout handler. There is no risk of deadlock here, since the two functions never touch the same spinlocks. 3: We remove a pointless tipc_node_get() + tipc_node_put() from the timeout handler. Reported-by: NZhijiang Hu <huzhijiang@gmail.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
Although we have never seen it happen, we have identified the following problematic scenario when nodes are stopped and deleted: CPU0: CPU1: tipc_node_xxx() //ref == 1 tipc_node_put() //ref -> 0 tipc_node_find() // node still in table tipc_node_delete() list_del_rcu(n. list) tipc_node_get() //ref -> 1, bad kfree_rcu() tipc_node_put() //ref to 0 again. kfree_rcu() // BOOM! We fix this by introducing use of the conditional kref_get_if_not_zero() instead of kref_get() in the function tipc_node_find(). This eliminates any risk of post-mortem access. Reported-by: NZhijiang Hu <huzhijiang@gmail.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vivien Didelot 提交于
The VLAN GetNext operation is specific to some switches, and thus can be complicated to implement for some drivers. Remove the support for the vlan_getnext/port_pvid_get approach in favor of the generic and simpler port_vlan_dump function. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vivien Didelot 提交于
Similar to port_fdb_dump, add a port_vlan_dump function to DSA drivers which gets passed the switchdev VLAN object and callback. This function, if implemented, takes precedence over the soon legacy vlan_getnext/port_pvid_get approach. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
Currently tc actions are stored in a per-module hashtable, therefore are visible to all network namespaces. This is probably the last part of the tc subsystem which is not aware of netns now. This patch makes them per-netns, several tc action API's need to be adjusted for this. The tc action API code is ugly due to historical reasons, we need to refactor that code in the future. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
We only release the memory of the hashtable itself, not its entries inside. This is not a problem yet since we only call it in module release path, and module is refcount'ed by actions. This would be a problem after we move the per module hinfo into per netns in the latter patch. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 2月, 2016 6 次提交
-
-
由 Alexander Duyck 提交于
We want to try and pull the L4 header in if it is available in the first fragment. As such add the flag to indicate we want to pull the headers on the first fragment in. Signed-off-by: NAlexander Duyck <aduyck@mirantis.com> Acked-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
The IPv6 parsing was using a local pointer when it could use the same pointer as the IPv4 portion of the code since the key_addrs can support both IPv4 and IPv6 as it is just a pointer. Signed-off-by: NAlexander Duyck <aduyck@mirantis.com> Acked-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
The flow dissector bits handling FCoE didn't bother to actually validate that the space there was enough for the FCoE header. So we need to update things so that if there is room we add the header and report a good result, otherwise we do not add the header, and report the bad result. Signed-off-by: NAlexander Duyck <aduyck@mirantis.com> Acked-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
It turns out that for IPv4 we were reporting the ip_proto of the fragment, and for IPv6 we were not. This patch updates that behavior so that we always report the IP protocol of the fragment. In addition it takes the steps of updating the payload offset code so that we will determine the start of the payload not including the L4 header for any fragment after the first. Signed-off-by: NAlexander Duyck <aduyck@mirantis.com> Acked-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Duyck 提交于
This patch corrects the logic for the IPv4 parsing so that it is consistent with how we handle IPv6. Specifically if we do not have the flow key indicating we want the addresses we still may need to take a look at the IP fragmentation bits and to see if we should stop after we have recognized the L3 header. Fixes: 807e165d ("flow_dissector: Add control/reporting of fragmentation") Signed-off-by: NAlexander Duyck <aduyck@mirantis.com> Acked-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Craig Gallek 提交于
One of the validation checks for the new array-based TCP SO_REUSEPORT validation was unintentionally dropped in ea8add2b. This adds it back. Lack of this check allows the user to allocate multiple sock_reuseport structures (leaking all but the first). Fixes: ea8add2b ("tcp/dccp: better use of ephemeral ports in bind()") Signed-off-by: NCraig Gallek <kraig@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 2月, 2016 2 次提交
-
-
由 Vivien Didelot 提交于
DSA drivers may support multiple bridge groups with the same hardware VLAN. The mv88e6xxx driver which cannot yet, already has its own check for overlapping bridges. Thus remove the check from the DSA layer. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vivien Didelot 提交于
Some DSA drivers may or may not support multiple software bridges on top of an hardware switch. It is more convenient for them to access the bridge's net_device for finer configuration. Removing the need to craft and access a bitmask also simplifies the code. This patch changes the signature of bridge related functions, update DSA drivers, and removes dsa_slave_br_port_mask. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 2月, 2016 20 次提交
-
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Sven Eckelmann 提交于
The batman-adv source code is the only place in the kernel which uses the *_free_ref naming scheme for the *_put functions. Changing it to *_put makes it more consistent and makes it easier to understand the connection to the *_get functions. Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch> Signed-off-by: NAntonio Quartulli <a@unstable.cc>
-
由 Antonio Quartulli 提交于
BATADV_BONDING_TQ_THRESHOLD is not used anymore since the implementation of the bat_neigh_is_similar_or_better() API function. Such function uses the more generic BATADV_TQ_SIMILARITY_THRESHOLD constant. Therefore, remove definition of the unused BATADV_BONDING_TQ_THRESHOLD constant. Signed-off-by: NAntonio Quartulli <a@unstable.cc> Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
-
- 22 2月, 2016 2 次提交
-
-
由 Daniel Borkmann 提交于
While debugging with bpf_jit_disasm I noticed emissions of 'mov %eax,%eax', and found that this comes from BPF_RET | BPF_A translations from classic BPF. Emitting this is unnecessary as BPF_REG_A is mapped into BPF_REG_0 already, therefore only emit a mov when immediates are used as return value. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
When using this helper for updating UDP checksums, we need to extend this in order to write CSUM_MANGLED_0 for csum computations that result into 0 as sum. Reason we need this is because packets with a checksum could otherwise become incorrectly marked as a packet without a checksum. Likewise, if the user indicates BPF_F_MARK_MANGLED_0, then we should not turn packets without a checksum into ones with a checksum. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-