- 23 10月, 2015 5 次提交
-
-
由 Robert Shearman 提交于
Change the selection of a multipath route to use a flow-based hash. This more suitable for traffic sensitive to reordering within a flow (e.g. TCP, L2VPN) and whilst still allowing a good distribution of traffic given enough flows. Selection of the path for a multipath route is done using a hash of: 1. Label stack up to MAX_MP_SELECT_LABELS labels or up to and including entropy label, whichever is first. 2. 3-tuple of (L3 src, L3 dst, proto) from IPv4/IPv6 header in MPLS payload, if present. Naturally, a 5-tuple hash using L4 information in addition would be possible and be better in some scenarios, but there is a tradeoff between looking deeper into the packet to achieve good distribution, and packet forwarding performance, and I have erred on the side of the latter as the default. Signed-off-by: NRobert Shearman <rshearma@brocade.com> Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Roopa Prabhu 提交于
This patch adds support for MPLS multipath routes. Includes following changes to support multipath: - splits struct mpls_route into 'struct mpls_route + struct mpls_nh' - 'struct mpls_nh' represents a mpls nexthop label forwarding entry - moves mpls route and nexthop structures into internal.h - A mpls_route can point to multiple mpls_nh structs - the nexthops are maintained as a array (similar to ipv4 fib) - In the process of restructuring, this patch also consistently changes all labels to u8 - Adds support to parse/fill RTA_MULTIPATH netlink attribute for multipath routes similar to ipv4/v6 fib - In this patch, the multipath route nexthop selection algorithm simply returns the first nexthop. It is replaced by a hash based algorithm from Robert Shearman in the next patch - mpls_route_update cleanup: remove 'dev' handling in mpls_route_update. mpls_route_update though implemented to update based on dev, it was never used that way. And the dev handling gets tricky with multiple nexthops. Cannot match against any single nexthops dev. So, this patch removes the unused 'dev' handling in mpls_route_update. - dead route/path handling will be implemented in a subsequent patch Example: $ip -f mpls route add 100 nexthop as 200 via inet 10.1.1.2 dev swp1 \ nexthop as 700 via inet 10.1.1.6 dev swp2 \ nexthop as 800 via inet 40.1.1.2 dev swp3 $ip -f mpls route show 100 nexthop as to 200 via inet 10.1.1.2 dev swp1 nexthop as to 700 via inet 10.1.1.6 dev swp2 nexthop as to 800 via inet 40.1.1.2 dev swp3 Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Acked-by: NRobert Shearman <rshearma@brocade.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hiroshi Shimamoto 提交于
Add netlink directives and ndo entry to trust VF user. This controls the special permission of VF user. The administrator will dedicatedly trust VF user to use some features which impacts security and/or performance. The administrator never turn it on unless VF user is fully trusted. CC: Sy Jong Choi <sy.jong.choi@intel.com> Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Acked-by: NGreg Rose <gregory.v.rose@intel.com> Tested-by: NKrishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Eric Dumazet 提交于
Multiple cpus can process duplicates of incoming ACK messages matching a SYN_RECV request socket. This is a rare event under normal operations, but definitely can happen. Only one must win the race, otherwise corruption would occur. To fix this without adding new atomic ops, we use logic in inet_ehash_nolisten() to detect the request was present in the same ehash bucket where we try to insert the new child. If request socket was not found, we have to undo the child creation. This actually removes a spin_lock()/spin_unlock() pair in reqsk_queue_unlink() for the fast path. Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets") Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
Currently adding a new ipv4 address always cause the creation of the related network route, with default metric. When a host has multiple interfaces on the same network, multiple routes with the same metric are created. If the userspace wants to set specific metric on each routes, i.e. giving better metric to ethernet links in respect to Wi-Fi ones, the network routes must be deleted and recreated, which is error-prone. This patch implements the support for IFA_F_NOPREFIXROUTE for ipv4 address. When an address is added with such flag set, no associated network route is created, no network route is deleted when said IP is gone and it's up to the user space manage such route. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 10月, 2015 20 次提交
-
-
由 Vivien Didelot 提交于
No driver implements port_fdb_getnext anymore, and port_fdb_dump is preferred anyway, so remove this function from DSA. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vivien Didelot 提交于
Not all switch chips support a Get Next operation to iterate on its FDB. So add a more simple port_fdb_dump function for them. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pravin B Shelar 提交于
With use of lwtunnel, we can directly call dev_queue_xmit() rather than calling netdev vport send operation. Following change make tunnel vport code bit cleaner. Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Acked-by: NThomas Graf <tgraf@suug.ch> Acked-by: NJiri Benc <jbenc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pravin B Shelar 提交于
Patch fixes following sparse warning. net/openvswitch/flow_netlink.c:583:30: warning: incorrect type in assignment (different base types) net/openvswitch/flow_netlink.c:583:30: expected restricted __be16 [usertype] ipv4 net/openvswitch/flow_netlink.c:583:30: got int Fixes: 6b26ba3a ("openvswitch: netlink attributes for IPv6 tunneling") Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Acked-by: NThomas Graf <tgraf@suug.ch> Acked-by: NJiri Benc <jbenc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Marcel Holtmann 提交于
With the addition of support for diagnostic feature, it makes sense to increase the minor version of the Bluetooth core module. The module version is not used anywhere, but it gives a nice extra hint for debugging purposes. Signed-off-by: NMarcel Holtmann <marcel@holtmann.org> Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
-
由 Alexander Aring 提交于
Looking at current situation of memory management in 6lowpan receive function I detected some invalid handling. After calling lowpan_invoke_rx_handlers we will do a kfree_skb and then NET_RX_DROP on error handling. We don't do this before, also on skb_share_check/skb_unshare which might manipulate the reference counters. After running some 'grep -r "dev_add_pack" net/' to look how others packet-layer receive callbacks works I detected that every subsystem do a kfree_skb, then NET_RX_DROP without calling skb functions which might manipulate the skb reference counters. This is the reason why we should do the same here like all others subsystems. I didn't find any documentation how the packet-layer receive callbacks handle NET_RX_DROP return values either. This patch will add a kfree_skb, then NET_RX_DROP handling for the "trivial checks", in case of skb_share_check/skb_unshare the kfree_skb call will be done inside these functions. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
There are a few places that don't explicitly check the connection state before calling hci_disconnect(). To make this API do the right thing take advantage of the new hci_abort_conn() API and also make sure to only read the clock offset if we're really connected. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
Convert the various places mapping connection state to disconnect/cancel HCI command to use the new hci_abort_conn helper API. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
There are several different places needing to make sure that a connection gets disconnected or canceled. The exact action needed depends on the connection state, so centralizing this logic can save quite a lot of code duplication. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
When unpairing the keys stored in hci_dev are removed. If SMP is ongoing the SMP context will also have references to these keys, so removing them from the hci_dev lists will make the pointers invalid. This can result in the following type of crashes: BUG: unable to handle kernel paging request at 6b6b6b6b IP: [<c11f26be>] __list_del_entry+0x44/0x71 *pde = 00000000 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: hci_uart btqca btusb btintel btbcm btrtl hci_vhci rfcomm bluetooth_6lowpan bluetooth CPU: 0 PID: 723 Comm: kworker/u5:0 Not tainted 4.3.0-rc3+ #1379 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 Workqueue: hci0 hci_rx_work [bluetooth] task: f19da940 ti: f1a94000 task.ti: f1a94000 EIP: 0060:[<c11f26be>] EFLAGS: 00010202 CPU: 0 EIP is at __list_del_entry+0x44/0x71 EAX: c0088d20 EBX: f30fcac0 ECX: 6b6b6b6b EDX: 6b6b6b6b ESI: f4b60000 EDI: c0088d20 EBP: f1a95d90 ESP: f1a95d8c DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 8005003b CR2: 6b6b6b6b CR3: 319e5000 CR4: 00000690 Stack: f30fcac0 f1a95db0 f82dc3e1 f1bfc000 00000000 c106524f f1bfc000 f30fd020 f1a95dc0 f1a95dd0 f82dcbdb f1a95de0 f82dcbdb 00000067 f1bfc000 f30fd020 f1a95de0 f1a95df0 f82d1126 00000067 f82d1126 00000006 f30fd020 f1bfc000 Call Trace: [<f82dc3e1>] smp_chan_destroy+0x192/0x240 [bluetooth] [<c106524f>] ? trace_hardirqs_on_caller+0x14e/0x169 [<f82dcbdb>] smp_teardown_cb+0x47/0x64 [bluetooth] [<f82dcbdb>] ? smp_teardown_cb+0x47/0x64 [bluetooth] [<f82d1126>] l2cap_chan_del+0x5d/0x14d [bluetooth] [<f82d1126>] ? l2cap_chan_del+0x5d/0x14d [bluetooth] [<f82d40ef>] l2cap_conn_del+0x109/0x17b [bluetooth] [<f82d40ef>] ? l2cap_conn_del+0x109/0x17b [bluetooth] [<f82c0205>] ? hci_event_packet+0x5b1/0x2092 [bluetooth] [<f82d41aa>] l2cap_disconn_cfm+0x49/0x50 [bluetooth] [<f82d41aa>] ? l2cap_disconn_cfm+0x49/0x50 [bluetooth] [<f82c0228>] hci_event_packet+0x5d4/0x2092 [bluetooth] [<c1332c16>] ? skb_release_data+0x6a/0x95 [<f82ce5d4>] ? hci_send_to_monitor+0xe7/0xf4 [bluetooth] [<c1409708>] ? _raw_spin_unlock_irqrestore+0x44/0x57 [<f82b3bb0>] hci_rx_work+0xf1/0x28b [bluetooth] [<f82b3bb0>] ? hci_rx_work+0xf1/0x28b [bluetooth] [<c10635a0>] ? __lock_is_held+0x2e/0x44 [<c104772e>] process_one_work+0x232/0x432 [<c1071ddc>] ? rcu_read_lock_sched_held+0x50/0x5a [<c104772e>] ? process_one_work+0x232/0x432 [<c1047d48>] worker_thread+0x1b8/0x255 [<c1047b90>] ? rescuer_thread+0x23c/0x23c [<c104bb71>] kthread+0x91/0x96 [<c14096a7>] ? _raw_spin_unlock_irq+0x27/0x44 [<c1409d61>] ret_from_kernel_thread+0x21/0x30 [<c104bae0>] ? kthread_parkme+0x1e/0x1e To solve the issue, introduce a new smp_cancel_pairing() API that can be used to clean up the SMP state before touching the hci_dev lists. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
For connection parameters that are left around until a disconnection we should at least clear any auto-connection properties. This way a new Add Device call is required to re-set them after calling Unpair Device. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Eric Dumazet 提交于
Tom Herbert added SIT support to GRO with commit 19424e05 ("sit: Add gro callbacks to sit_offload"), later reverted by Herbert Xu. The problem came because Tom patch was building GRO packets without proper meta data : If packets were locally delivered, we would not care. But if packets needed to be forwarded, GSO engine was not able to segment individual segments. With the following patch, we correctly set skb->encapsulation and inner network header. We also update gso_type. Tested: Server : netserver modprobe dummy ifconfig dummy0 8.0.0.1 netmask 255.255.255.0 up arp -s 8.0.0.100 4e:32:51:04:47:e5 iptables -I INPUT -s 10.246.7.151 -j TEE --gateway 8.0.0.100 ifconfig sixtofour0 sixtofour0 Link encap:IPv6-in-IPv4 inet6 addr: 2002:af6:798::1/128 Scope:Global inet6 addr: 2002:af6:798::/128 Scope:Global UP RUNNING NOARP MTU:1480 Metric:1 RX packets:411169 errors:0 dropped:0 overruns:0 frame:0 TX packets:409414 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:20319631739 (20.3 GB) TX bytes:29529556 (29.5 MB) Client : netperf -H 2002:af6:798::1 -l 1000 & Checked on server traffic copied on dummy0 and verify segments were properly rebuilt, with proper IP headers, TCP checksums... tcpdump on eth0 shows proper GRO aggregation takes place. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arad, Ronen 提交于
if_nlmsg_size() overestimates the minimum allocation size of netlink dump request (when called from rtnl_calcit()) or the size of the message (when called from rtnl_getlink()). This is because ext_filter_mask is not supported by rtnl_link_get_af_size() and rtnl_link_get_size(). The over-estimation is significant when at least one netdev has many VLANs configured (8 bytes for each configured VLAN). This patch-set "rightsizes" the protocol specific attribute size calculation by propagating ext_filter_mask to rtnl_link_get_af_size() and adding this a argument to get_link_af_size op in rtnl_af_ops. Bridge module already used filtering aware sizing for notifications. br_get_link_af_size_filtered() is consistent with the modified get_link_af_size op so it replaces br_get_link_af_size() in br_af_ops. br_get_link_af_size() becomes unused and thus removed. Signed-off-by: NRonen Arad <ronen.arad@intel.com> Acked-by: NSridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johan Hedberg 提交于
There's only one user of this helper which can be replaces with a call to hci_pend_le_action_lookup() and a check for params->explicit_connect. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
There's no need to clear the HCI_CONN_ENCRYPT_PEND flag in smp_failure. In fact, this may cause the encryption tracking to get out of sync as this has nothing to do with HCI activity. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
The hci_le_create_connection_cancel() function needs to use the hdev pointer in many places so add a variable for it to avoid the need to dereference the hci_conn every time. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
Instead of doing all of the LE-specific handling in an else-branch in unpair_device() create a 'done' label for the BR/EDR branch to jump to and then remove the else-branch completely. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
Use the new hci_conn_hash_lookup_le() API to look up LE connections. This way we're guaranteed exact matches that also take into account the address type. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
Use the new hci_conn_hash_lookup_le() API to look up LE connections. This way we're guaranteed exact matches that also take into account the address type. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Johan Hedberg 提交于
The mgmt code needs to convert from mgmt/L2CAP address types to HCI in many places. Having a dedicated helper function for this simplifies code by shortening it and removing unnecessary 'addr_type' variables. Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
- 21 10月, 2015 15 次提交
-
-
由 Elad Raz 提交于
Configure ageing time to the HW for newly bridged device CC: Scott Feldman <sfeldma@gmail.com> CC: Jiri Pirko <jiri@resnulli.us> Signed-off-by: NElad Raz <eladr@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NScott Feldman <sfeldma@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
This patch implements the second half of RACK that uses the the most recent transmit time among all delivered packets to detect losses. tcp_rack_mark_lost() is called upon receiving a dubious ACK. It then checks if an not-yet-sacked packet was sent at least "reo_wnd" prior to the sent time of the most recently delivered. If so the packet is deemed lost. The "reo_wnd" reordering window starts with 1msec for fast loss detection and changes to min-RTT/4 when reordering is observed. We found 1msec accommodates well on tiny degree of reordering (<3 pkts) on faster links. We use min-RTT instead of SRTT because reordering is more of a path property but SRTT can be inflated by self-inflicated congestion. The factor of 4 is borrowed from the delayed early retransmit and seems to work reasonably well. Since RACK is still experimental, it is now used as a supplemental loss detection on top of existing algorithms. It is only effective after the fast recovery starts or after the timeout occurs. The fast recovery is still triggered by FACK and/or dupack threshold instead of RACK. We introduce a new sysctl net.ipv4.tcp_recovery for future experiments of loss recoveries. For now RACK can be disabled by setting it to 0. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
This patch is the first half of the RACK loss recovery. RACK loss recovery uses the notion of time instead of packet sequence (FACK) or counts (dupthresh). It's inspired by the previous FACK heuristic in tcp_mark_lost_retrans(): when a limited transmit (new data packet) is sacked, then current retransmitted sequence below the newly sacked sequence must been lost, since at least one round trip time has elapsed. But it has several limitations: 1) can't detect tail drops since it depends on limited transmit 2) is disabled upon reordering (assumes no reordering) 3) only enabled in fast recovery ut not timeout recovery RACK (Recently ACK) addresses these limitations with the notion of time instead: a packet P1 is lost if a later packet P2 is s/acked, as at least one round trip has passed. Since RACK cares about the time sequence instead of the data sequence of packets, it can detect tail drops when later retransmission is s/acked while FACK or dupthresh can't. For reordering RACK uses a dynamically adjusted reordering window ("reo_wnd") to reduce false positives on ever (small) degree of reordering. This patch implements tcp_advanced_rack() which tracks the most recent transmission time among the packets that have been delivered (ACKed or SACKed) in tp->rack.mstamp. This timestamp is the key to determine which packet has been lost. Consider an example that the sender sends six packets: T1: P1 (lost) T2: P2 T3: P3 T4: P4 T100: sack of P2. rack.mstamp = T2 T101: retransmit P1 T102: sack of P2,P3,P4. rack.mstamp = T4 T205: ACK of P4 since the hole is repaired. rack.mstamp = T101 We need to be careful about spurious retransmission because it may falsely advance tp->rack.mstamp by an RTT or an RTO, causing RACK to falsely mark all packets lost, just like a spurious timeout. We identify spurious retransmission by the ACK's TS echo value. If TS option is not applicable but the retransmission is acknowledged less than min-RTT ago, it is likely to be spurious. We refrain from using the transmission time of these spurious retransmissions. The second half is implemented in the next patch that marks packet lost using RACK timestamp. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
a helper to prepare the main RACK patch Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
Remove the existing lost retransmit detection because RACK subsumes it completely. This also stops the overloading the ack_seq field of the skb control block. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
Kathleen Nichols' algorithm for tracking the minimum RTT of a data stream over some measurement window. It uses constant space and constant time per update. Yet it almost always delivers the same minimum as an implementation that has to keep all the data in the window. The measurement window is tunable via sysctl.net.ipv4.tcp_min_rtt_wlen with a default value of 5 minutes. The algorithm keeps track of the best, 2nd best & 3rd best min values, maintaining an invariant that the measurement time of the n'th best >= n-1'th best. It also makes sure that the three values are widely separated in the time window since that bounds the worse case error when that data is monotonically increasing over the window. Upon getting a new min, we can forget everything earlier because it has no value - the new min is less than everything else in the window by definition and it's the most recent. So we restart fresh on every new min and overwrites the 2nd & 3rd choices. The same property holds for the 2nd & 3rd best. Therefore we have to maintain two invariants to maximize the information in the samples, one on values (1st.v <= 2nd.v <= 3rd.v) and the other on times (now-win <=1st.t <= 2nd.t <= 3rd.t <= now). These invariants determine the structure of the code The RTT input to the windowed filter is the minimum RTT measured from ACK or SACK, or as the last resort from TCP timestamps. The accessor tcp_min_rtt() returns the minimum RTT seen in the window. ~0U indicates it is not available. The minimum is 1usec even if the true RTT is below that. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
Currently ca_seq_rtt_us does not use Kern's check. Fix that by checking if any packet acked is a retransmit, for both RTT used for RTT estimation and congestion control. Fixes: 5b08e47c ("tcp: prefer packet timing to TS-ECR for RTT") Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johan Hedberg 提交于
The hci_conn objects don't have a dedicated lock themselves but rely on the caller to hold the hci_dev lock for most types of access. The hci_conn_timeout() function has so far sent certain HCI commands based on the hci_conn state which has been possible without holding the hci_dev lock. The recent changes to do LE scanning before connect attempts added even more operations to hci_conn and hci_dev from hci_conn_timeout, thereby exposing potential race conditions with the hci_dev and hci_conn states. As an example of such a race, here there's a timeout but an l2cap_sock_connect() call manages to race with the cleanup routine: [Oct21 08:14] l2cap_chan_timeout: chan ee4b12c0 state BT_CONNECT [ +0.000004] l2cap_chan_close: chan ee4b12c0 state BT_CONNECT [ +0.000002] l2cap_chan_del: chan ee4b12c0, conn f3141580, err 111, state BT_CONNECT [ +0.000002] l2cap_sock_teardown_cb: chan ee4b12c0 state BT_CONNECT [ +0.000005] l2cap_chan_put: chan ee4b12c0 orig refcnt 4 [ +0.000010] hci_conn_drop: hcon f53d56e0 orig refcnt 1 [ +0.000013] l2cap_chan_put: chan ee4b12c0 orig refcnt 3 [ +0.000063] hci_conn_timeout: hcon f53d56e0 state BT_CONNECT [ +0.000049] hci_conn_params_del: addr ee:0d:30:09:53:1f (type 1) [ +0.000002] hci_chan_list_flush: hcon f53d56e0 [ +0.000001] hci_chan_del: hci0 hcon f53d56e0 chan f4e7ccc0 [ +0.004528] l2cap_sock_create: sock e708fc00 [ +0.000023] l2cap_chan_create: chan ee4b1770 [ +0.000001] l2cap_chan_hold: chan ee4b1770 orig refcnt 1 [ +0.000002] l2cap_sock_init: sk ee4b3390 [ +0.000029] l2cap_sock_bind: sk ee4b3390 [ +0.000010] l2cap_sock_setsockopt: sk ee4b3390 [ +0.000037] l2cap_sock_connect: sk ee4b3390 [ +0.000002] l2cap_chan_connect: 00:02:72:d9:e5:8b -> ee:0d:30:09:53:1f (type 2) psm 0x00 [ +0.000002] hci_get_route: 00:02:72:d9:e5:8b -> ee:0d:30:09:53:1f [ +0.000001] hci_dev_hold: hci0 orig refcnt 8 [ +0.000003] hci_conn_hold: hcon f53d56e0 orig refcnt 0 Above the l2cap_chan_connect() shouldn't have been able to reach the hci_conn f53d56e0 anymore but since hci_conn_timeout didn't do proper locking that's not the case. The end result is a reference to hci_conn that's not in the conn_hash list, resulting in list corruption when trying to remove it later: [Oct21 08:15] l2cap_chan_timeout: chan ee4b1770 state BT_CONNECT [ +0.000004] l2cap_chan_close: chan ee4b1770 state BT_CONNECT [ +0.000003] l2cap_chan_del: chan ee4b1770, conn f3141580, err 111, state BT_CONNECT [ +0.000001] l2cap_sock_teardown_cb: chan ee4b1770 state BT_CONNECT [ +0.000005] l2cap_chan_put: chan ee4b1770 orig refcnt 4 [ +0.000002] hci_conn_drop: hcon f53d56e0 orig refcnt 1 [ +0.000015] l2cap_chan_put: chan ee4b1770 orig refcnt 3 [ +0.000038] hci_conn_timeout: hcon f53d56e0 state BT_CONNECT [ +0.000003] hci_chan_list_flush: hcon f53d56e0 [ +0.000002] hci_conn_hash_del: hci0 hcon f53d56e0 [ +0.000001] ------------[ cut here ]------------ [ +0.000461] WARNING: CPU: 0 PID: 1782 at lib/list_debug.c:56 __list_del_entry+0x3f/0x71() [ +0.000839] list_del corruption, f53d56e0->prev is LIST_POISON2 (00000200) The necessary fix is unfortunately more complicated than just adding hci_dev_lock/unlock calls to the hci_conn_timeout() call path. Particularly, the hci_conn_del() API, which expects the hci_dev lock to be held, performs a cancel_delayed_work_sync(&hcon->disc_work) which would lead to a deadlock if the hci_conn_timeout() call path tries to acquire the same lock. This patch solves the problem by deferring the cleanup work to a separate work callback. To protect against the hci_dev or hci_conn going away meanwhile temporary references are taken with the help of hci_dev_hold() and hci_conn_get(). Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 4.3
-
由 Johannes Berg 提交于
Group station statistics by where they're (mostly) updated (TX, RX and TX-status) and group them into sub-structs of the struct sta_info. Also rename the variables since the grouping now makes it obvious where they belong. This makes it easier to identify where the statistics are updated in the code, and thus easier to think about them. Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Johannes Berg 提交于
There's little point in keeping (and even sending to userspace) the beacon_loss_count value per station, since it can only apply to the AP on a managed-mode connection. Move the value to ifmgd, advertise it only in managed mode, and remove it from ethtool as it's available through better interfaces. Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Johannes Berg 提交于
This file only feeds a debugfs file that isn't very useful, so remove it. If necessary, we can add other ways to get this information, for example in the NL80211_CMD_PROBE_CLIENT response. Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Marcel Holtmann 提交于
Some drivers might have to restore certain settings after the init procedure has been completed. This driver callback allows them to hook into that stage. This callback is run just before the controller is declared as powered up. Signed-off-by: NMarcel Holtmann <marcel@holtmann.org> Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
-
由 Dean Jenkins 提交于
There is a L2CAP protocol race between the local peer and the remote peer demanding disconnection of the L2CAP link. When L2CAP ERTM is used, l2cap_sock_shutdown() can be called from userland to disconnect L2CAP. However, there can be a delay introduced by waiting for ACKs. During this waiting period, the remote peer may have sent a Disconnection Request. Therefore, recheck the shutdown status of the socket after waiting for ACKs because there is no need to do further processing if the connection has gone. Signed-off-by: NDean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: NHarish Jenny K N <harish_kandiga@mentor.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Dean Jenkins 提交于
This commit reorganizes the mutex lock and is now only protecting l2cap_chan_close(). This is now consistent with other places where l2cap_chan_close() is called. If a conn connection exists, call mutex_lock(&conn->chan_lock) before calling l2cap_chan_close() to ensure other L2CAP protocol operations do not interfere. Note that the conn structure has to be protected from being freed as it is possible for the connection to be disconnected whilst the locks are not held. This solution allows the mutex lock to be used even when the connection has just been disconnected. This commit also reduces the scope of chan locking. The only place where chan locking is needed is the call to l2cap_chan_close(chan, 0) which if necessary closes the channel. Therefore, move the l2cap_chan_lock(chan) and l2cap_chan_lock(chan) locking calls to around l2cap_chan_close(chan, 0). This allows __l2cap_wait_ack(sk, chan) to be called with no chan locks being held so L2CAP messaging over the ACL link can be done unimpaired. Signed-off-by: NDean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: NHarish Jenny K N <harish_kandiga@mentor.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-
由 Dean Jenkins 提交于
l2cap_sock_shutdown() is designed to only action shutdown of the channel when shutdown is not already in progress. Therefore, reorganise the code flow by adding a goto to jump to the end of function handling when shutdown is already being actioned. This removes one level of code indentation and make the code more readable. Signed-off-by: NDean Jenkins <Dean_Jenkins@mentor.com> Signed-off-by: NHarish Jenny K N <harish_kandiga@mentor.com> Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
-