- 16 6月, 2016 13 次提交
-
-
由 Alexander Aring 提交于
This patch introduces neighbour discovery ops callback structure. The idea is to separate the handling for 6LoWPAN into the 6lowpan module. These callback offers 6lowpan different handling, such as 802.15.4 short address handling or RFC6775 (Neighbor Discovery Optimization for IPv6 over 6LoWPANs). Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NAlexander Aring <aar@pengutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds __ndisc_opt_addr_data as low-level function for ndisc_opt_addr_data which doesn't depend on net_device parameter. Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Reviewed-by: NStefan Schmidt <stefan@osg.samsung.com> Signed-off-by: NAlexander Aring <aar@pengutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds __ndisc_opt_addr_space as low-level function for ndisc_opt_addr_space which doesn't depend on net_device parameter. Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Reviewed-by: NStefan Schmidt <stefan@osg.samsung.com> Signed-off-by: NAlexander Aring <aar@pengutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds the autoconfiguration if a valid 802.15.4 short address is available for 802.15.4 6LoWPAN interfaces. Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Reviewed-by: NStefan Schmidt <stefan@osg.samsung.com> Signed-off-by: NAlexander Aring <aar@pengutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch will introduce a 6lowpan neighbour private data. Like the interface private data we handle private data for generic 6lowpan and for link-layer specific 6lowpan. The current first use case if to save the short address for a 802.15.4 6lowpan neighbour. Cc: David S. Miller <davem@davemloft.net> Reviewed-by: NStefan Schmidt <stefan@osg.samsung.com> Acked-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NAlexander Aring <aar@pengutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
qdisc are changed under RTNL protection and often while blocking BH and root qdisc spinlock. When lots of skbs need to be dropped, we free them under these locks causing TX/RX freezes, and more generally latency spikes. This commit adds rtnl_kfree_skbs(), used to queue skbs for deferred freeing. Actual freeing happens right after RTNL is released, with appropriate scheduling points. rtnl_qdisc_drop() can also be used in place of disc_drop() when RTNL is held. qdisc_reset_queue() and __qdisc_reset_queue() get the new behavior, so standard qdiscs like pfifo, pfifo_fast... have their ->reset() method automatically handled. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael S. Tsirkin 提交于
Update skb_array after ptr_ring API changes. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Tested-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael S. Tsirkin 提交于
This adds ring resize support. Seems to be necessary as users such as tun allow userspace control over queue size. If resize is used, this costs us ability to peek at queue without consumer lock - should not be a big deal as peek and consumer are usually run on the same CPU. If ring is made bigger, ring contents is preserved. If ring is made smaller, extra pointers are passed to an optional destructor callback. Cleanup function also gains destructor callback such that all pointers in queue can be cleaned up. This changes some APIs but we don't have any users yet, so it won't break bisect. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael S. Tsirkin 提交于
A simple array based FIFO of pointers. Intended for net stack so uses skbs for type safety. Implemented as a set of wrappers around ptr_ring. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Tested-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael S. Tsirkin 提交于
A simple array based FIFO of pointers. Intended for net stack which commonly has a single consumer/producer. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
IPv6 multicast and link-local addresses require special handling by the VRF driver: 1. Rather than using the VRF device index and full FIB lookups, packets to/from these addresses should use direct FIB lookups based on the VRF device table. 2. fail sends/receives on a VRF device to/from a multicast address (e.g, make ping6 ff02::1%<vrf> fail) 3. move the setting of the flow oif to the first dst lookup and revert the change in icmpv6_echo_reply made in ca254490 ("net: Add VRF support to IPv6 stack"). Linklocal/mcast addresses require use of the skb->dev. With this change connections into and out of a VRF enslaved device work for multicast and link-local addresses work (icmp, tcp, and udp) e.g., 1. packets into VM with VRF config: ping6 -c3 fe80::e0:f9ff:fe1c:b974%br1 ping6 -c3 ff02::1%br1 ssh -6 fe80::e0:f9ff:fe1c:b974%br1 2. packets going out a VRF enslaved device: ping6 -c3 fe80::18f8:83ff:fe4b:7a2e%eth1 ping6 -c3 ff02::1%eth1 ssh -6 root@fe80::18f8:83ff:fe4b:7a2e%eth1 Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Allow drivers to pass flow arg to functions where the arg is not const and allow the driver to make updates as needed (eg., setting oif). Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 6月, 2016 1 次提交
-
-
由 Florian Westphal 提交于
sch_atm returns this when TC_ACT_SHOT classification occurs. But all other schedulers that use tc_classify (htb, hfsc, drr, fq_codel ...) return NET_XMIT_SUCCESS | __BYPASS in this case so just do that in atm. BATMAN uses it as an intermediate return value to signal forwarding vs. buffering, but it did not return POLICED to callers outside of BATMAN. Reviewed-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 6月, 2016 8 次提交
-
-
由 Eric Dumazet 提交于
__QDISC_STATE_THROTTLED bit manipulation is rather expensive for HTB and few others. I already removed it for sch_fq in commit f2600cf0 ("net: sched: avoid costly atomic operation in fq_dequeue()") and so far nobody complained. When one ore more packets are stuck in one or more throttled HTB class, a htb dequeue() performs two atomic operations to clear/set __QDISC_STATE_THROTTLED bit, while root qdisc lock is held. Removing this pair of atomic operations bring me a 8 % performance increase on 200 TCP_RR tests, in presence of throttled classes. This patch has no side effect, since nothing actually uses disc_is_throttled() anymore. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pramod Kumar 提交于
An integrated multiplexer uses same address space for "muxed bus selection" and "generation of mdio transaction" hence its good to register parent bus from mux driver. Hence added a mechanism where mux driver could register a parent bus and pass it down to framework via mdio_mux_init api. Signed-off-by: NPramod Kumar <pramod.kumar@broadcom.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Zi Shen Lim 提交于
Commit 0fc174de ("ebpf: make internal bpf API independent of CONFIG_BPF_SYSCALL ifdefs") introduced usage of ERR_PTR() in bpf_prog_get(), however did not include linux/err.h. Without this patch, when compiling arm64 BPF without CONFIG_BPF_SYSCALL: ... In file included from arch/arm64/net/bpf_jit_comp.c:21:0: include/linux/bpf.h: In function 'bpf_prog_get': include/linux/bpf.h:235:9: error: implicit declaration of function 'ERR_PTR' [-Werror=implicit-function-declaration] return ERR_PTR(-EOPNOTSUPP); ^ include/linux/bpf.h:235:9: warning: return makes pointer from integer without a cast [-Wint-conversion] In file included from include/linux/rwsem.h:17:0, from include/linux/mm_types.h:10, from include/linux/sched.h:27, from arch/arm64/include/asm/compat.h:25, from arch/arm64/include/asm/stat.h:23, from include/linux/stat.h:5, from include/linux/compat.h:12, from include/linux/filter.h:10, from arch/arm64/net/bpf_jit_comp.c:22: include/linux/err.h: At top level: include/linux/err.h:23:35: error: conflicting types for 'ERR_PTR' static inline void * __must_check ERR_PTR(long error) ^ In file included from arch/arm64/net/bpf_jit_comp.c:21:0: include/linux/bpf.h:235:9: note: previous implicit declaration of 'ERR_PTR' was here return ERR_PTR(-EOPNOTSUPP); ^ ... Fixes: 0fc174de ("ebpf: make internal bpf API independent of CONFIG_BPF_SYSCALL ifdefs") Suggested-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NZi Shen Lim <zlim.lnx@gmail.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lawrence Brakmo 提交于
Add in_flight (bytes in flight when packet was sent) field to tx component of tcp_skb_cb and make it available to congestion modules' pkts_acked() function through the ack_sample function argument. Signed-off-by: NLawrence Brakmo <brakmo@fb.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mike Rapoport 提交于
The code for conversion between virtio_net_hdr and skb GSO info is duplicated at several places. Let's put it to a common place to allow reuse. Signed-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mike Rapoport 提交于
This gives better namespacing and prevents conflicts with no-uapi version of virtio_net header that will be introduced in the following patch. Signed-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Respect the stack's xmit_recursion limit for calls into dev_queue_xmit(). Currently, they are not handeled by the limiter when attached to clsact's egress parent, for example, and a buggy program redirecting it to the same device again could run into stack overflow eventually. It would be good if we could notify an admin to give him a chance to react. We reuse xmit_recursion instead of having one private to eBPF, so that the stack's current recursion depth will be taken into account as well. Follow-up to commit 3896d655 ("bpf: introduce bpf_clone_redirect() helper") and 27b29f63 ("bpf: add bpf_redirect() helper"). Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 William Tu 提交于
The patch adds a new OVS action, OVS_ACTION_ATTR_TRUNC, in order to truncate packets. A 'max_len' is added for setting up the maximum packet size, and a 'cutlen' field is to record the number of bytes to trim the packet when the packet is outputting to a port, or when the packet is sent to userspace. Signed-off-by: NWilliam Tu <u9012063@gmail.com> Cc: Pravin Shelar <pshelar@nicira.com> Acked-by: NPravin B Shelar <pshelar@ovn.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 6月, 2016 11 次提交
-
-
由 Willem de Bruijn 提交于
Socket option PACKET_FANOUT_DATA takes a struct sock_fprog as argument if PACKET_FANOUT has mode PACKET_FANOUT_CBPF. This structure contains a pointer into user memory. If userland is 32-bit and kernel is 64-bit the two disagree about the layout of struct sock_fprog. Add compat setsockopt support to convert a 32-bit compat_sock_fprog to a 64-bit sock_fprog. This is analogous to compat_sock_fprog support for SO_REUSEPORT added in commit 19575988 ("soreuseport: add compat case for setsockopt SO_ATTACH_REUSEPORT_CBPF"). Reported-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NWillem de Bruijn <willemb@google.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Aaron Conole 提交于
A draft version of the MTU Advice feature bit was specified as 25. This bit is not within the allowed range for network device feature bits, and should be changed to be feature bit 3 to fully comply with the spec. Fixes 14de9d11 ('virtio-net: Add initial MTU advice feature') Signed-off-by: NAaron Conole <aconole@redhat.com> Suggested-by: N"Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Frank Kellermann reported a kernel crash with 4.5.0 when IPv6 is disabled at boot using the kernel option ipv6.disable=1. Using current net-next with the boot option: $ ip link add red type vrf table 1001 Generates: [12210.919584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000748 [12210.921341] IP: [<ffffffff814b30e3>] fib6_get_table+0x2c/0x5a [12210.922537] PGD b79e3067 PUD bb32b067 PMD 0 [12210.923479] Oops: 0000 [#1] SMP [12210.924001] Modules linked in: ipvlan 8021q garp mrp stp llc [12210.925130] CPU: 3 PID: 1177 Comm: ip Not tainted 4.7.0-rc1+ #235 [12210.926168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [12210.928065] task: ffff8800b9ac4640 ti: ffff8800bacac000 task.ti: ffff8800bacac000 [12210.929328] RIP: 0010:[<ffffffff814b30e3>] [<ffffffff814b30e3>] fib6_get_table+0x2c/0x5a [12210.930697] RSP: 0018:ffff8800bacaf888 EFLAGS: 00010202 [12210.931563] RAX: 0000000000000748 RBX: ffffffff81a9e280 RCX: ffff8800b9ac4e28 [12210.932688] RDX: 00000000000000e9 RSI: 0000000000000002 RDI: 0000000000000286 [12210.933820] RBP: ffff8800bacaf898 R08: ffff8800b9ac4df0 R09: 000000000052001b [12210.934941] R10: 00000000657c0000 R11: 000000000000c649 R12: 00000000000003e9 [12210.936032] R13: 00000000000003e9 R14: ffff8800bace7800 R15: ffff8800bb3ec000 [12210.937103] FS: 00007faa1766c700(0000) GS:ffff88013ac00000(0000) knlGS:0000000000000000 [12210.938321] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [12210.939166] CR2: 0000000000000748 CR3: 00000000b79d6000 CR4: 00000000000406e0 [12210.940278] Stack: [12210.940603] ffff8800bb3ec000 ffffffff81a9e280 ffff8800bacaf8c8 ffffffff814b3135 [12210.941818] ffff8800bb3ec000 ffffffff81a9e280 ffffffff81a9e280 ffff8800bace7800 [12210.943040] ffff8800bacaf8f0 ffffffff81397c88 ffff8800bb3ec000 ffffffff81a9e280 [12210.944288] Call Trace: [12210.944688] [<ffffffff814b3135>] fib6_new_table+0x24/0x8a [12210.945516] [<ffffffff81397c88>] vrf_dev_init+0xd4/0x162 [12210.946328] [<ffffffff814091e1>] register_netdevice+0x100/0x396 [12210.947209] [<ffffffff8139823d>] vrf_newlink+0x40/0xb3 [12210.948001] [<ffffffff814187f0>] rtnl_newlink+0x5d3/0x6d5 ... The problem above is due to the fact that the fib hash table is not allocated when IPv6 is disabled at boot. As for the VRF driver it should not do any IPv6 initializations if IPv6 is disabled, so it needs to know if IPv6 is disabled at boot. The disable parameter is private to the IPv6 module, so provide an accessor for modules to determine if IPv6 was disabled at boot time. Fixes: 35402e31 ("net: Add IPv6 support to VRF device") Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Howells 提交于
Simplify the RxRPC connect() implementation. It will just note the destination address it is given, and if a sendmsg() comes along with no address, this will be assigned as the address. No transport struct will be held internally, which will allow us to remove this later. Simplify sendmsg() also. Whilst a call is active, userspace refers to it by a private unique user ID specified in a control message. When sendmsg() sees a user ID that doesn't map to an extant call, it creates a new call for that user ID and attempts to add it. If, when we try to add it, the user ID is now registered, we now reject the message with -EEXIST. We should never see this situation unless two threads are racing, trying to create a call with the same ID - which would be an error. It also isn't required to provide sendmsg() with an address - provided the control message data holds a user ID that maps to a currently active call. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
This patch adds support for Broadcom's BCM53xx switch family, also known as RoboSwitch. Some of these switches are ubiquituous, found in home routers, Wi-Fi routers, DSL and cable modem gateways and other networking related products. This drivers adds the library driver (b53_common.c) as well as a few bus glue drivers for MDIO, SPI, Switch Register Access Block (SRAB) and memory-mapped I/O into a SoC's address space (Broadcom BCM63xx/33xx). Basic operations are supported to bring the Layer 1/2 up and running, but not much more at this point, subsequent patches add the remaining features. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Noa Osherovich 提交于
In RoCE, the RDMA-CM needs the node guid to establish connection between nodes. Today, the node guid exposed to mlx5 Ethernet VFs is zero, therefore RDMA-CM on the VF is broken. Whenever the administrator sets a MAC for a VF, derive the node guid from it and set it as well in the following way: MAC: e4:1d:2d:b3:f4:01 -> node_guid: e4:1d:2d:ff:fe:b3:f4:01 Fixes: 77256579 ('net/mlx5: E-Switch, Introduce Vport...') Signed-off-by: NNoa Osherovich <noaos@mellanox.com> Signed-off-by: NMajd Dibbiny <majd@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Maor Gottlieb 提交于
Flow steering infrastructure is currently used only on link layer ethernet, therefore the driver should initialize the flow steering when the device link layer is ethernet. In addition, add missing capability check before initializing the namespace of NIC RX flow tables. Fixes: 25302363 ('net/mlx5_core: Flow steering tree initialization') Signed-off-by: NMaor Gottlieb <maorg@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shahar Klein 提交于
Having MLX5_CMD_OP_MAX on another file causes us to repeatedly miss accounting new commands added to the driver and hence there're no entries for them in debugfs. To solve that, we integrate it into the commands enum as the last entry. Fixes: 34a40e68 ('net/mlx5_core: Introduce modify flow table command') Signed-off-by: NShahar Klein <shahark@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Majd Dibbiny 提交于
Add 16 reserved bytes at the end of mlx5_modify_qp_mbox_in to match the hardware spec definition. Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: NMajd Dibbiny <majd@mellanox.com> Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
It is time to add netdev_lockdep_set_classes() helper so that lockdep annotations per device type are easier to manage. This removes a lot of copies and missing annotations. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
1) qdisc_run_begin() is really using the equivalent of a trylock. Instead of using write_seqcount_begin(), use a combination of raw_write_seqcount_begin() and correct lockdep annotation. 2) sch_direct_xmit() should use regular spin_lock(root_lock) Fixes: f9eb8aea ("net_sched: transform qdisc running bit into a seqcount") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 6月, 2016 7 次提交
-
-
由 Michal Kazior 提交于
There is no other limit other than a global packet count limit when using software queuing. This means a single flow queue can grow insanely long. This is particularly bad for TCP congestion algorithms which requires a little more sophisticated frame dropping scheme than a mere headdrop on limit overflow. Hence apply (a slighly modified, to fit the knobs) CoDel5 on flow queues. This improves TCP convergence and stability when combined with wireless driver which keeps its own tx queue/fifo at a minimum fill level for given link conditions. Signed-off-by: NMichal Kazior <michal.kazior@tieto.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Michal Kazior 提交于
Qdiscs are designed with no regard to 802.11 aggregation requirements and hand out packet-by-packet with no guarantee they are destined to the same tid. This does more bad than good no matter how fairly a given qdisc may behave on an ethernet interface. Software queuing used per-AC netdev subqueue congestion control whenever a global AC limit was hit. This meant in practice a single station or tid queue could starve others rather easily. This could resonate with qdiscs in a bad way or could just end up with poor aggregation performance. Increasing the AC limit would increase induced latency which is also bad. Disabling qdiscs by default and performing taildrop instead of netdev subqueue congestion control on the other hand makes it possible for tid queues to fill up "in the meantime" while preventing stations starving each other. This increases aggregation opportunities and should allow software queuing based drivers achieve better performance by utilizing airtime more efficiently with big aggregates. Signed-off-by: NMichal Kazior <michal.kazior@tieto.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Johannes Berg 提交于
Everytime I need to look for these, my usual strategy fails because it assumes the right formatting. Fix the formatting here to make it consistent with the rest of the kernel. Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Florian Westphal 提交于
Earlier commits removed two members from struct Qdisc which places next_sched/gso_skb into a different cacheline than ->state. This restores the struct layout to what it was before the removal. Move the two members, then add an annotation so they all reside in the same cacheline. This adds a 16 byte hole after cpu_qstats. The hole could be closed but as it doesn't decrease total struct size just do it this way. Reported-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
after removal of TCA_CBQ_OVL_STRATEGY from cbq scheduler, there are no more callers of ->drop() outside of other ->drop functions, i.e. nothing calls them. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
After the removal of TCA_CBQ_POLICE in cbq scheduler qdisc->reshape_fail is always NULL, i.e. qdisc_rehape_fail is now the same as qdisc_drop. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
iproute2 doesn't implement any cbq option that results in this attribute being sent to kernel. To make use of it, user would have to - patch iproute2 - add a class - attach a qdisc to the class (default pfifo doesn't work as q->handle is 0 and cbq_set_police() is a no-op in this case) - re-'add' the same class (tc class change ...) again - user must also specifiy a defmap (e.g. 'split 1:0 defmap 3f'), since this 'police' feature relies on its presence - the added qdisc must be one of bfifo, pfifo or netem If all of these conditions are met and _some_ leaf qdiscs, namely p/bfifo, netem, plug or tbf would drop a packet, kernel calls back into cbq, which will attempt to re-queue the skb into a different class as indicated by the parents' defmap entry for TC_PRIO_BESTEFFORT. [ i.e. we behave as if tc_classify returned TC_ACT_RECLASSIFY ]. This feature, which isn't documented or implemented in iproute2, and isn't implemented consistently (most qdiscs like sfq, codel, etc drop right away instead of attempting this reclassification) is the sole reason for the reshape_fail and __parent member in Qdisc struct. So remove TCA_CBQ_POLICE support from the kernel, reject it via EOPNOTSUPP so userspace knows we don't support it, and then remove no-longer needed infrastructure in followup commit. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-