- 07 12月, 2013 4 次提交
-
-
由 Jiri Pirko 提交于
There is no more space in u8 ifa_flags. So do what davem suffested and add another netlink attr called IFA_FLAGS for carry more flags. Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NThomas Haller <thaller@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
With the introduction of TCP Small Queues, TSO auto sizing, and TCP pacing, we can implement Automatic Corking in the kernel, to help applications doing small write()/sendmsg() to TCP sockets. Idea is to change tcp_push() to check if the current skb payload is under skb optimal size (a multiple of MSS bytes) If under 'size_goal', and at least one packet is still in Qdisc or NIC TX queues, set the TCP Small Queue Throttled bit, so that the push will be delayed up to TX completion time. This delay might allow the application to coalesce more bytes in the skb in following write()/sendmsg()/sendfile() system calls. The exact duration of the delay is depending on the dynamics of the system, and might be zero if no packet for this flow is actually held in Qdisc or NIC TX ring. Using FQ/pacing is a way to increase the probability of autocorking being triggered. Add a new sysctl (/proc/sys/net/ipv4/tcp_autocorking) to control this feature and default it to 1 (enabled) Add a new SNMP counter : nstat -a | grep TcpExtTCPAutoCorking This counter is incremented every time we detected skb was under used and its flush was deferred. Tested: Interesting effects when using line buffered commands under ssh. Excellent performance results in term of cpu usage and total throughput. lpq83:~# echo 1 >/proc/sys/net/ipv4/tcp_autocorking lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128 9410.39 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128': 35209.439626 task-clock # 2.901 CPUs utilized 2,294 context-switches # 0.065 K/sec 101 CPU-migrations # 0.003 K/sec 4,079 page-faults # 0.116 K/sec 97,923,241,298 cycles # 2.781 GHz [83.31%] 51,832,908,236 stalled-cycles-frontend # 52.93% frontend cycles idle [83.30%] 25,697,986,603 stalled-cycles-backend # 26.24% backend cycles idle [66.70%] 102,225,978,536 instructions # 1.04 insns per cycle # 0.51 stalled cycles per insn [83.38%] 18,657,696,819 branches # 529.906 M/sec [83.29%] 91,679,646 branch-misses # 0.49% of all branches [83.40%] 12.136204899 seconds time elapsed lpq83:~# echo 0 >/proc/sys/net/ipv4/tcp_autocorking lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128 6624.89 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128': 40045.864494 task-clock # 3.301 CPUs utilized 171 context-switches # 0.004 K/sec 53 CPU-migrations # 0.001 K/sec 4,080 page-faults # 0.102 K/sec 111,340,458,645 cycles # 2.780 GHz [83.34%] 61,778,039,277 stalled-cycles-frontend # 55.49% frontend cycles idle [83.31%] 29,295,522,759 stalled-cycles-backend # 26.31% backend cycles idle [66.67%] 108,654,349,355 instructions # 0.98 insns per cycle # 0.57 stalled cycles per insn [83.34%] 19,552,170,748 branches # 488.244 M/sec [83.34%] 157,875,417 branch-misses # 0.81% of all branches [83.34%] 12.130267788 seconds time elapsed Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jeff Kirsher 提交于
Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jeff Kirsher 提交于
Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Vlad Yasevich <vyasevich@gmail.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 12月, 2013 2 次提交
-
-
由 Simon Wunderlich 提交于
The channel switch notification should be sent under the wdev/sdata-lock, preferably in the same moment as the channel change happens, to avoid races by other callers (e.g. start/stop_ap). This also adds the previously missing sdata_lock protection in csa_finalize_work. Reported-by: NJohannes Berg <johannes@sipsolutions.net> Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Andrei Otcheretianski 提交于
Change cfg80211 and mac80211 to use cfg80211_mgmt_tx_params struct to aggregate parameters for mgmt_tx functions. This makes the functions' signatures less clumsy and allows less painful parameters extension. Signed-off-by: NAndrei Otcheretianski <andrei.otcheretianski@intel.com> [fix all other drivers] Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
- 29 11月, 2013 1 次提交
-
-
由 Xufeng Zhang 提交于
Currently retransmitted DATA chunks could also be used for RTT measurements since there are no flag to identify whether the transmitted DATA chunk is a new one or a retransmitted one. This problem is introduced by commit ae19c548 ("sctp: remove 'resent' bit from the chunk") which inappropriately removed the 'resent' bit completely, instead of doing this, we should set the resent bit only for the retransmitted DATA chunks. Signed-off-by: NXufeng Zhang <xufeng.zhang@windriver.com> Acked-by: NVlad Yasevich <vyasevich@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 11月, 2013 9 次提交
-
-
由 Luis R. Rodriguez 提交于
u8 was used in some other places, just stick to the enum, this forces us to express the values that are expected. Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Eliad Peller 提交于
Add a new field to ieee80211_chanctx_conf to indicate the min required channel configuration. Tuning to a narrower channel might help reducing the noise level and saving some power. The min required channel definition is the max of all min required channel definitions of the interfaces bound to this channel context. In AP mode, use 20MHz when there are no connected station. When a new station is added/removed, calculate the new max bandwidth supported by any of the stations (e.g. 80MHz when 80MHz and 40MHz stations are connected). In other cases, simply use bss_conf.chandef as the min required chandef. Notify drivers about changes to this field by calling drv_change_chanctx with a new CHANGE_MIN_WIDTH notification. Signed-off-by: NEliad Peller <eliad@wizery.com> Reviewed-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Luis R. Rodriguez 提交于
Certain vendors may want to disable the processing of country IEs so that they can continue using the regulatory domain the driver or user has set. Currently there is no way to stop the core from processing country IEs, so add support to the core to ignore country IE hints. Cc: Mihir Shete <smihir@qti.qualcomm.com> Cc: Henri Bahini <hbahini@qca.qualcomm.com> Cc: Tushnim Bhattacharyya <tushnimb@qca.qualcomm.com> Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Luis R. Rodriguez 提交于
802.11 cards may have different country IE parsing behavioural preferences and vendors may want to support these. These preferences were managed by the REGULATORY_CUSTOM_REG and the REGULATORY_STRICT_REG flags and their combination. Instead of using this existing notation, split out the country IE behavioural preferences as a new flag. This will allow us to add more customizations easily and make the code more maintainable. Cc: Mihir Shete <smihir@qti.qualcomm.com> Cc: Henri Bahini <hbahini@qca.qualcomm.com> Cc: Tushnim Bhattacharyya <tushnimb@qca.qualcomm.com> Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com> [fix up conflicts] Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Luis R. Rodriguez 提交于
We'll expand this later, this will make it easier to classify and review what things are related to regulatory or not. Coccinelle only missed 4 hits, which I had to do manually, supplying the SmPL in case of merge conflicts. @@ struct wiphy *wiphy; @@ -wiphy->flags |= WIPHY_FLAG_CUSTOM_REGULATORY +wiphy->regulatory_flags |= REGULATORY_CUSTOM_REG @@ expression e; @@ -e->flags |= WIPHY_FLAG_CUSTOM_REGULATORY +e->regulatory_flags |= REGULATORY_CUSTOM_REG @@ struct wiphy *wiphy; @@ -wiphy->flags &= ~WIPHY_FLAG_CUSTOM_REGULATORY +wiphy->regulatory_flags &= ~REGULATORY_CUSTOM_REG @@ struct wiphy *wiphy; @@ -wiphy->flags & WIPHY_FLAG_CUSTOM_REGULATORY +wiphy->regulatory_flags & REGULATORY_CUSTOM_REG @@ struct wiphy *wiphy; @@ -wiphy->flags |= WIPHY_FLAG_STRICT_REGULATORY +wiphy->regulatory_flags |= REGULATORY_STRICT_REG @@ expression e; @@ -e->flags |= WIPHY_FLAG_STRICT_REGULATORY +e->regulatory_flags |= REGULATORY_STRICT_REG @@ struct wiphy *wiphy; @@ -wiphy->flags &= ~WIPHY_FLAG_STRICT_REGULATORY +wiphy->regulatory_flags &= ~REGULATORY_STRICT_REG @@ struct wiphy *wiphy; @@ -wiphy->flags & WIPHY_FLAG_STRICT_REGULATORY +wiphy->regulatory_flags & REGULATORY_STRICT_REG @@ struct wiphy *wiphy; @@ -wiphy->flags |= WIPHY_FLAG_DISABLE_BEACON_HINTS +wiphy->regulatory_flags |= REGULATORY_DISABLE_BEACON_HINTS @@ expression e; @@ -e->flags |= WIPHY_FLAG_DISABLE_BEACON_HINTS +e->regulatory_flags |= REGULATORY_DISABLE_BEACON_HINTS @@ struct wiphy *wiphy; @@ -wiphy->flags &= ~WIPHY_FLAG_DISABLE_BEACON_HINTS +wiphy->regulatory_flags &= ~REGULATORY_DISABLE_BEACON_HINTS @@ struct wiphy *wiphy; @@ -wiphy->flags & WIPHY_FLAG_DISABLE_BEACON_HINTS +wiphy->regulatory_flags & REGULATORY_DISABLE_BEACON_HINTS Generated-by: Coccinelle SmPL Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Peter Senna Tschudin <peter.senna@gmail.com> Cc: Mihir Shete <smihir@qti.qualcomm.com> Cc: Henri Bahini <hbahini@qca.qualcomm.com> Cc: Tushnim Bhattacharyya <tushnimb@qca.qualcomm.com> Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com> [fix up whitespace damage, overly long lines] Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Max Stepanov 提交于
This adds generic cipher scheme support to mac80211, such schemes are fully under control by the driver. On hw registration drivers may specify additional HW ciphers with a scheme how these ciphers have to be handled by mac80211 TX/RR. A cipher scheme specifies a cipher suite value, a size of the security header to be added to or stripped from frames and how the PN is to be verified on RX. Signed-off-by: NMax Stepanov <Max.Stepanov@intel.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Janusz Dziedzic 提交于
To report channel width correctly we have to send correct channel parameters from mac80211 when calling cfg80211_cac_event(). This is required in case of using channel width higher than 20MHz and we have to set correct dfs channel state after CAC (NL80211_DFS_AVAILABLE). Signed-off-by: NJanusz Dziedzic <janusz.dziedzic@tieto.com> Reviewed-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Luis R. Rodriguez 提交于
wiphy_apply_custom_regulatory() implies WIPHY_FLAG_CUSTOM_REGULATORY but we never enforced it, do that now and warn if the driver didn't set it. All drivers should be following this today already. Having WIPHY_FLAG_CUSTOM_REGULATORY does not however mean you will use wiphy_apply_custom_regulatory() though, you may have your own _orig value set up tools / helpers. The intel drivers are examples of this type of driver. Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Luis R. Rodriguez 提交于
These two flags are used for the same purpose, just combine them into a no-ir flag to annotate no initiating radiation is allowed. Old userspace sending either flag will have it treated as the no-ir flag. To be considerate to older userspace we also send both the no-ir flag and the old no-ibss flags. Newer userspace will have to be aware of older kernels. Update all places in the tree using these flags with the following semantic patch: @@ @@ -NL80211_RRF_PASSIVE_SCAN +NL80211_RRF_NO_IR @@ @@ -NL80211_RRF_NO_IBSS +NL80211_RRF_NO_IR @@ @@ -IEEE80211_CHAN_PASSIVE_SCAN +IEEE80211_CHAN_NO_IR @@ @@ -IEEE80211_CHAN_NO_IBSS +IEEE80211_CHAN_NO_IR @@ @@ -NL80211_RRF_NO_IR | NL80211_RRF_NO_IR +NL80211_RRF_NO_IR @@ @@ -IEEE80211_CHAN_NO_IR | IEEE80211_CHAN_NO_IR +IEEE80211_CHAN_NO_IR @@ @@ -(NL80211_RRF_NO_IR) +NL80211_RRF_NO_IR @@ @@ -(IEEE80211_CHAN_NO_IR) +IEEE80211_CHAN_NO_IR Along with some hand-optimisations in documentation, to remove duplicates and to fix some indentation. Signed-off-by: NLuis R. Rodriguez <mcgrof@do-not-panic.com> [do all the driver updates in one go] Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
- 24 11月, 2013 1 次提交
-
-
由 Hannes Frederic Sowa 提交于
Commit bceaa902 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") conditionally updated addr_len if the msg_name is written to. The recv_error and rxpmtu functions relied on the recvmsg functions to set up addr_len before. As this does not happen any more we have to pass addr_len to those functions as well and set it to the size of the corresponding sockaddr length. This broke traceroute and such. Fixes: bceaa902 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") Reported-by: NBrad Spengler <spender@grsecurity.net> Reported-by: Tom Labanowski Cc: mpb <mpb.mail@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 11月, 2013 2 次提交
-
-
由 Johannes Berg 提交于
Fix another really stupid bug - I introduced genl_set_err() precisely to be able to adjust the group and reject invalid ones, but then forgot to do so. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Unfortunately, I introduced a tremendously stupid bug into genlmsg_multicast() when doing all those multicast group changes: it adjusts the group number, but then passes it to genlmsg_multicast_netns() which does that again. Somehow, my tests failed to catch this, so add a warning into genlmsg_multicast_netns() and remove the offending group ID adjustment. Also add a warning to the similar code in other functions so people who misuse them are more loudly warned. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 11月, 2013 6 次提交
-
-
由 Johannes Berg 提交于
Register generic netlink multicast groups as an array with the family and give them contiguous group IDs. Then instead of passing the global group ID to the various functions that send messages, pass the ID relative to the family - for most families that's just 0 because the only have one group. This avoids the list_head and ID in each group, adding a new field for the mcast group ID offset to the family. At the same time, this allows us to prevent abusing groups again like the quota and dropmon code did, since we can now check that a family only uses a group it owns. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
This doesn't really change anything, but prepares for the next patch that will change the APIs to pass the group ID within the family, rather than the global group ID. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Add a static inline to generic netlink to wrap netlink_set_err() to make it easier to use here - use it in openvswitch (the only generic netlink user of netlink_set_err()). Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
There's no reason to have the family pointer there since it can just be passed internally where needed, so remove it. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
There are no users of this API remaining, and we'll soon change group registration to be static (like ops are now) Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
As suggested by David Miller, make genl_register_family_with_ops() a macro and pass only the array, evaluating ARRAY_SIZE() in the macro, this is a little safer. The openvswitch has some indirection, assing ops/n_ops directly in that code. This might ultimately just assign the pointers in the family initializations, saving the struct genl_family_and_ops and code (once mcast groups are handled differently.) Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 11月, 2013 1 次提交
-
-
由 Johannes Berg 提交于
Now that the ops assignment is just two variables rather than a long list iteration etc., there's no reason to separately export __genl_register_family() and __genl_register_family_with_ops(). Unify the two functions into __genl_register_family() and make genl_register_family_with_ops() call it after assigning the ops. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 11月, 2013 4 次提交
-
-
由 Johannes Berg 提交于
To save some space in the struct on 32-bit systems, make the flags a u8 (only 4 bits are used) and also move them to the end of the struct. This has no impact on 64-bit systems as alignment of the struct in an array uses up the space anyway. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Allow making the ops array const by not modifying the ops flags on registration but rather only when ops are sent out in the family information. No users are updated yet except for the pre_doit/post_doit calls in wireless (the only ones that exist now.) Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Instead of using a linked list, use an array. This reduces the data size needed by the users of genetlink, for example in wireless (net/wireless/nl80211.c) on 64-bit it frees up over 1K of data space. Remove the attempted sending of CTRL_CMD_NEWOPS ctrl event since genl_ctrl_event(CTRL_CMD_NEWOPS, ...) only returns -EINVAL anyway, therefore no such event could ever be sent. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
genl_register_ops() is still needed for internal registration, but is no longer available to users of the API. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 11月, 2013 1 次提交
-
-
由 Jiri Pirko 提交于
Pushing original fragments through causes several problems. For example for matching, frags may not be matched correctly. Take following example: <example> On HOSTA do: ip6tables -I INPUT -p icmpv6 -j DROP ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT and on HOSTB you do: ping6 HOSTA -s2000 (MTU is 1500) Incoming echo requests will be filtered out on HOSTA. This issue does not occur with smaller packets than MTU (where fragmentation does not happen) </example> As was discussed previously, the only correct solution seems to be to use reassembled skb instead of separete frags. Doing this has positive side effects in reducing sk_buff by one pointer (nfct_reasm) and also the reams dances in ipvs and conntrack can be removed. Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c entirely and use code in net/ipv6/reassembly.c instead. Signed-off-by: NJiri Pirko <jiri@resnulli.us> Acked-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NMarcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 11月, 2013 1 次提交
-
-
由 Florent Fourcot 提交于
It is already possible to set/put/renew a label with IPV6_FLOWLABEL_MGR and setsockopt. This patch add the possibility to get information about this label (current value, time before expiration, etc). It helps application to take decision for a renew or a release of the label. v2: * Add spin_lock to prevent race condition * return -ENOENT if no result found * check if flr_action is GET v3: * move the spin_lock to protect only the relevant code Signed-off-by: NFlorent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 11月, 2013 1 次提交
-
-
由 Mathias Krause 提交于
This function has usage beside IPsec so move it to the core skbuff code. While doing so, give it some documentation and change its return type to 'unsigned char *' to be in line with skb_put(). Signed-off-by: NMathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 11月, 2013 1 次提交
-
-
由 Joe Perches 提交于
Just an unnecessary semicolon that should be removed... Whitespace neatening of macro too. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 11月, 2013 1 次提交
-
-
由 Hannes Frederic Sowa 提交于
Sockets marked with IP_PMTUDISC_INTERFACE won't do path mtu discovery, their sockets won't accept and install new path mtu information and they will always use the interface mtu for outgoing packets. It is guaranteed that the packet is not fragmented locally. But we won't set the DF-Flag on the outgoing frames. Florian Weimer had the idea to use this flag to ensure DNS servers are never generating outgoing fragments. They may well be fragmented on the path, but the server never stores or usees path mtu values, which could well be forged in an attack. (The root of the problem with path MTU discovery is that there is no reliable way to authenticate ICMP Fragmentation Needed But DF Set messages because they are sent from intermediate routers with their source addresses, and the IMCP payload will not always contain sufficient information to identify a flow.) Recent research in the DNS community showed that it is possible to implement an attack where DNS cache poisoning is feasible by spoofing fragments. This work was done by Amir Herzberg and Haya Shulman: <https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf> This issue was previously discussed among the DNS community, e.g. <http://www.ietf.org/mail-archive/web/dnsext/current/msg01204.html>, without leading to fixes. This patch depends on the patch "ipv4: fix DO and PROBE pmtu mode regarding local fragmentation with UFO/CORK" for the enforcement of the non-fragmentable checks. If other users than ip_append_page/data should use this semantic too, we have to add a new flag to IPCB(skb)->flags to suppress local fragmentation and check for this in ip_finish_output. Many thanks to Florian Weimer for the idea and feedback while implementing this patch. Cc: David S. Miller <davem@davemloft.net> Suggested-by: NFlorian Weimer <fweimer@redhat.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 11月, 2013 3 次提交
-
-
由 Jesper Dangaard Brouer 提交于
As described in commit 5a581b36 (jiffies: Avoid undefined behavior from signed overflow), according to the C standard 3.4.3p3, overflow of a signed integer results in undefined behavior. To fix this, do as the above commit, and do an unsigned subtraction, and interpreting the result as a signed two's-complement number. This is based on the theory from RFC 1982 and is nicely described in wikipedia here: https://en.wikipedia.org/wiki/Serial_number_arithmetic#General_Solution A side-note, I have seen practical issues with the previous logic when dealing with 16-bit, on a 64-bit machine (gcc version 4.4.5). This were 32-bit, which I have not observed issues with. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NJesper Dangaard Brouer <netoptimizer@brouer.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
Slow start now increases cwnd by 1 if an ACK acknowledges some packets, regardless the number of packets. Consequently slow start performance is highly dependent on the degree of the stretch ACKs caused by receiver or network ACK compression mechanisms (e.g., delayed-ACK, GRO, etc). But slow start algorithm is to send twice the amount of packets of packets left so it should process a stretch ACK of degree N as if N ACKs of degree 1, then exits when cwnd exceeds ssthresh. A follow up patch will use the remainder of the N (if greater than 1) to adjust cwnd in the congestion avoidance phase. In addition this patch retires the experimental limited slow start (LSS) feature. LSS has multiple drawbacks but questionable benefit. The fractional cwnd increase in LSS requires a loop in slow start even though it's rarely used. Configuring such an increase step via a global sysctl on different BDPS seems hard. Finally and most importantly the slow start overshoot concern is now better covered by the Hybrid slow start (hystart) enabled by default. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This patch fixes a build warning in skb_checksum() by wrapping the csum_partial() usage in skb_checksum(). The problem is that on a few architectures, csum_partial is used with prefix asmlinkage whereas on most architectures it's not. So fix this up generically as we did with csum_block_add_ext() to match the signature. Introduced by 2817a336 ("net: skb_checksum: allow custom update/combine for walking skb"). Reported-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2013 2 次提交
-
-
由 Daniel Borkmann 提交于
This fixes an outstanding bug found through IPVS, where SCTP packets with skb->data_len > 0 (non-linearized) and empty frag_list, but data accumulated in frags[] member, are forwarded with incorrect checksum letting SCTP initial handshake fail on some systems. Linearizing each SCTP skb in IPVS to prevent that would not be a good solution as this leads to an additional and unnecessary performance penalty on the load-balancer itself for no good reason (as we actually only want to update the checksum, and can do that in a different/better way presented here). The actual problem is elsewhere, namely, that SCTP's checksumming in sctp_compute_cksum() does not take frags[] into account like skb_checksum() does. So while we are fixing this up, we better reuse the existing code that we have anyway in __skb_checksum() and use it for walking through the data doing checksumming. This will not only fix this issue, but also consolidates some SCTP code with core sk_buff code, bringing it closer together and removing respectively avoiding reimplementation of skb_checksum() for no good reason. As crc32c() can use hardware implementation within the crypto layer, we leave that intact (it wraps around / falls back to e.g. slice-by-8 algorithm in __crc32c_le() otherwise); plus use the __crc32c_le_combine() combinator for crc32c blocks. Also, we remove all other SCTP checksumming code, so that we only have to use sctp_compute_cksum() from now on; for doing that, we need to transform SCTP checkumming in output path slightly, and can leave the rest intact. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Currently, skb_checksum walks over 1) linearized, 2) frags[], and 3) frag_list data and calculats the one's complement, a 32 bit result suitable for feeding into itself or csum_tcpudp_magic(), but unsuitable for SCTP as we're calculating CRC32c there. Hence, in order to not re-implement the very same function in SCTP (and maybe other protocols) over and over again, use an update() + combine() callback internally to allow for walking over the skb with different algorithms. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-