- 14 Sep 2018, 4 commits
-
-
Submitted by YueHaibing
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Jason Wang
This patch introduces a new tun/tap-specific msg_control:

    #define TUN_MSG_UBUF 1
    #define TUN_MSG_PTR  2

    struct tun_msg_ctl {
            int type;
            void *ptr;
    };

This allows us to pass different kinds of msg_control through sendmsg(). The first supported type is ubuf (TUN_MSG_UBUF), which will be used by the existing vhost_net zerocopy code. The second is an XDP buff, which allows vhost_net to pass XDP buffs to TUN. This could be used to implement accepting an array of XDP buffs from vhost_net in the following patches. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
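A rough sketch of how a sender such as vhost_net might hand a zerocopy ubuf to TUN through the new control structure; only struct tun_msg_ctl and TUN_MSG_UBUF come from the patch, while the wrapper function and the msghdr setup (payload/msg_iter omitted) are assumptions:

    static int tun_sendmsg_ubuf(struct socket *sock, struct ubuf_info *ubuf,
                                size_t len)
    {
            struct tun_msg_ctl ctl = {
                    .type = TUN_MSG_UBUF,   /* first supported control type */
                    .ptr  = ubuf,           /* consumed by TUN's sendmsg handler */
            };
            struct msghdr msg = {
                    .msg_control    = &ctl,
                    .msg_controllen = sizeof(ctl),
            };

            /* payload setup (msg.msg_iter) omitted in this sketch */
            return sock->ops->sendmsg(sock, &msg, len);
    }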
-
Submitted by Jason Wang
This patch introduces a new sock flag - SOCK_XDP. This will be used to notify the upper layer that an XDP program is attached to the lower socket and that extra headroom is required. TUN will be the first user. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
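A minimal sketch of both sides of the flag; SOCK_XDP and the generic sock_set_flag()/sock_reset_flag()/sock_flag() helpers are from the kernel, while the tun_file context and the function names below are assumptions:

    /* TUN side: reflect XDP program (de)attachment on the per-queue socket. */
    static void tun_update_xdp_flag(struct tun_file *tfile, struct bpf_prog *xdp_prog)
    {
            if (xdp_prog)
                    sock_set_flag(&tfile->sk, SOCK_XDP);    /* senders must add headroom */
            else
                    sock_reset_flag(&tfile->sk, SOCK_XDP);
    }

    /* Sender side (e.g. vhost_net): size buffers with XDP headroom if needed. */
    static size_t xdp_extra_headroom(const struct sock *sk)
    {
            return sock_flag(sk, SOCK_XDP) ? XDP_PACKET_HEADROOM : 0;
    }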
-
Submitted by Cong Wang
llc_sap_close() is called by llc_sap_put(), which can be called in BH context from llc_rcv(). We can't block in BH. There is no reason to block here either; kfree_rcu() should be sufficient. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
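A generic sketch of the conversion described above (the llc_sap layout and the free-helper name are assumptions; the point is the blocking-free to kfree_rcu() change):

    struct llc_sap {
            /* ... existing members ... */
            struct rcu_head rcu;            /* assumed: handle for kfree_rcu() */
    };

    static void llc_sap_free(struct llc_sap *sap)
    {
            /*
             * A synchronize_rcu(); kfree(sap); pattern may sleep and so cannot
             * run in BH context.  kfree_rcu() queues the free for after a
             * grace period without blocking the caller.
             */
            kfree_rcu(sap, rcu);
    }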
-
- 13 Sep 2018, 10 commits
-
-
Submitted by Andre Naujoks
The socket option is enabled by default so that current behaviour is not changed; this matches the IPv4 version. A socket bound to in6addr_any and a specific port receives all traffic on that port. Analogous to IP_MULTICAST_ALL, disabling this behaviour means that, once one or more multicast groups have been joined on the socket, only multicast traffic from groups explicitly joined via this socket is passed on. Without this option disabled, a socket (or even a system) joined to multiple multicast groups is very hard to get right: filtering by destination address has to take place in user space to avoid receiving multicast traffic from other multicast groups that might have traffic on the same port. Simply extending the existing IP_MULTICAST_ALL socket option to also apply to IPv6 was not done, to avoid changing the behaviour of current applications. Signed-off-by: Andre Naujoks <nautsch2@gmail.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
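A userspace sketch of the intended use; the option name below is assumed to mirror IP_MULTICAST_ALL (the commit text does not spell it out) and needs headers from a kernel carrying this patch. Error handling is omitted:

    #include <netinet/in.h>
    #include <sys/socket.h>

    static int open_filtered_mcast_socket(const struct ipv6_mreq *group, in_port_t port)
    {
            int fd = socket(AF_INET6, SOCK_DGRAM, 0);
            int off = 0;
            struct sockaddr_in6 addr = {
                    .sin6_family = AF_INET6,
                    .sin6_addr   = IN6ADDR_ANY_INIT,
                    .sin6_port   = port,
            };

            /* Assumed option name; enabled by default (receive everything on
             * the port), clearing it limits delivery to groups joined below. */
            setsockopt(fd, IPPROTO_IPV6, IPV6_MULTICAST_ALL, &off, sizeof(off));
            setsockopt(fd, IPPROTO_IPV6, IPV6_JOIN_GROUP, group, sizeof(*group));
            bind(fd, (struct sockaddr *)&addr, sizeof(addr));
            return fd;
    }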
-
Submitted by Hauke Mehrtens
This handles the tag added by the PMAC on the VRX200 SoC line. The GSWIP internally uses a GSWIP special tag which is located after the Ethernet header. The PMAC, which connects the GSWIP to the CPU, converts this special tag used by the GSWIP into the PMAC special tag which is added in front of the Ethernet header. This was tested with the GSWIP 2.1 found in the VRX200 SoCs; other GSWIP versions use slightly different PMAC special tags. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Hangbin Liu
Similar to commit 72f6d71e ("vxlan: add ttl inherit support"): currently ttl == 0 means "use whatever default value" on geneve instead of inheriting the inner ttl. To preserve compatibility with the old behaviour, add a new IFLA_GENEVE_TTL_INHERIT attribute for geneve ttl inherit support. Reported-by: Jianlin Shi <jishi@redhat.com> Suggested-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Nikolay Aleksandrov
Add support for entries which are "sticky", i.e. will not change their port if they show up from a different one. A new ndm flag, NTF_STICKY, is introduced for that purpose. We only allow setting it on non-local entries. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Andrew Lunn
Rather than have MAC drivers open-code the test, add a helper in phylib. This will help when we change the type of phydev->supported. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Andrew Lunn
ethtool can be used to enable/disable pause. Add a helper to configure the PHY when Pause is supported. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Andrew Lunn
ethtool can be used to enable/disable pause. Add a helper to configure the PHY when asym pause is supported. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
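A sketch of the kind of ethtool callback the helper is aimed at; the helper name phy_set_asym_pause() and its (phydev, rx, tx) arguments are taken from the final phylib API as an assumption, and the driver function around it is hypothetical:

    static int mydrv_set_pauseparam(struct net_device *ndev,
                                    struct ethtool_pauseparam *pause)
    {
            struct phy_device *phydev = ndev->phydev;

            /* Let phylib turn the ethtool rx/tx pause request into the right
             * Pause/Asym_Pause advertisement bits (helper name assumed). */
            phy_set_asym_pause(phydev, pause->rx_pause, pause->tx_pause);
            return 0;
    }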
-
Submitted by Andrew Lunn
Rather than have the MAC drivers manipulate phydev members, add a helper function for MACs supporting Pause, but not Asym Pause. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Andrew Lunn
Rather than have the MAC drivers manipulate phydev members to indicate they support Asym Pause, add a helper function. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
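A sketch of how a MAC driver's attach path might use the helper instead of editing phydev->supported directly; the name phy_support_asym_pause() is assumed from the final phylib API and the wrapper is hypothetical:

    static void mydrv_configure_pause(struct phy_device *phydev)
    {
            /* Replaces open-coded phydev->supported/advertising updates:
             * record that this MAC handles both Pause and Asym Pause
             * (helper name assumed). */
            phy_support_asym_pause(phydev);
    }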
-
Submitted by Andrew Lunn
Some MAC hardware cannot support a subset of link modes; e.g. often 1Gbps full duplex is supported, but half duplex is not. Add a helper to remove such a link mode. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
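A sketch matching the half-duplex example above; phy_remove_link_mode() and the ethtool link-mode bit are taken from the final API as assumptions, and the wrapper is hypothetical:

    #include <linux/phy.h>

    static void mydrv_limit_phy_modes(struct phy_device *phydev)
    {
            /* This MAC cannot do half duplex at gigabit speed, so stop the
             * PHY from advertising it (helper name assumed). */
            phy_remove_link_mode(phydev, ETHTOOL_LINK_MODE_1000baseT_Half_BIT);
    }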
-
- 12 Sep 2018, 1 commit
-
-
Submitted by David Ahern
The fib6_info reference in rt6_info is rcu protected. Add a helper to extract prefsrc from it and update cxgbi_check_route6 to use it. Fixes: 0153167a ("net/ipv6: Remove rt6i_prefsrc") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 11 Sep 2018, 7 commits
-
-
Submitted by Vlad Buslov
Change the flower in_hw_count type to a fixed-size u32 and dump it as TCA_FLOWER_IN_HW_COUNT. This change is necessary to properly test shared blocks and re-offload functionality. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by David S. Miller
It documents what is happening, and eliminates the spurious list pointer poisoning. In the long term, in order to get proper list head debugging, we might want to use the list poison value as the indicator that an SKB is a singleton and not on a list. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by David S. Miller
An SKB is not on a list if skb->next is NULL. Codify this convention into a helper function and use it where we are dequeuing an SKB and need to mark it as such. Signed-off-by: David S. Miller <davem@davemloft.net>
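A sketch of the convention; skb_mark_not_on_list() is the helper name assumed from the final tree, and the hand-rolled dequeue around it is illustrative only:

    static struct sk_buff *dequeue_head(struct sk_buff **listp)
    {
            struct sk_buff *skb = *listp;

            if (skb) {
                    *listp = skb->next;
                    /* skb is now a singleton; use the helper instead of
                     * open-coding skb->next = NULL. */
                    skb_mark_not_on_list(skb);
            }
            return skb;
    }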
-
Submitted by David S. Miller
Add a helper, __skb_peek(), and use it in ppp_mp_reconstruct(). Signed-off-by: David S. Miller <davem@davemloft.net>
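A sketch of where the lighter-weight peek fits; __skb_peek() is the helper named above, while the calling context is illustrative. It skips the empty-queue check that skb_peek() performs, so it is only safe once the caller already knows the queue is non-empty (e.g. under the queue lock):

    static void log_first_queued(struct sock *sk)
    {
            if (!skb_queue_empty(&sk->sk_receive_queue)) {
                    /* queue known non-empty, so the unchecked peek is fine */
                    struct sk_buff *head = __skb_peek(&sk->sk_receive_queue);

                    pr_debug("first queued skb has len=%u\n", head->len);
            }
    }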
-
Submitted by David S. Miller
Hand-rolled copies of the SKB list handlers do not belong in individual packet schedulers. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by David S. Miller
Adjust __qdisc_enqueue_tail() so that HTB can use it instead. The only other caller of __qdisc_enqueue_tail() is qdisc_enqueue_tail(), so we can move the backlog and return value handling (which HTB doesn't need/want) to the latter. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by David Ahern
After the conversion to fib6_info, rt6i_prefsrc has a single user that reads the value; otherwise it is only set. The one reader can be converted to use rt->from, so rt6i_prefsrc can be removed, reducing rt6_info by another 20 bytes. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Sep 2018, 13 commits
-
-
Submitted by Denis Bolotin
This patch adds a new qed firmware with fixes and support for new features.

Fixes:
- Fix a rare case of device crash with iWARP, iSCSI or FCoE offload.
- Fix GRE tunneled traffic when iWARP offload is enabled.
- Fix RoCE failure in ib_send_bw when using inline data.
- Fix latency optimization flow for inline WQEs.
- BigBear 100G fix

RDMA:
- Reduce task context size.
- Support for application page sizes above 2GB.
- Performance improvements.

ETH:
- Tenant DCB support.
- Replace RSS indirection table update interface.

Misc:
- Debug Tools changes.

Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Christian Brauner
This adds IFLA_TARGET_NETNSID as an alias for IFLA_IF_NETNSID for RTM_*LINK requests. The new name is clearer and also aligns with the newly introduced IFA_TARGET_NETNSID property for RTM_*ADDR requests. Signed-off-by: Christian Brauner <christian@brauner.io> Suggested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Christian Brauner
This adds a new IFA_TARGET_NETNSID property to be used by address families such as PF_INET and PF_INET6. The IFA_TARGET_NETNSID property can be used to send a network namespace identifier as part of a request. If an IFA_TARGET_NETNSID property is identified, it will be used to retrieve the target network namespace in which the request is to be made. Signed-off-by: Christian Brauner <christian@brauner.io> Cc: Jiri Benc <jbenc@redhat.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
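A userspace sketch (assuming libmnl and uapi headers that carry IFA_TARGET_NETNSID) of building an RTM_GETADDR dump that asks for addresses in the namespace identified by a netns id rather than the caller's own:

    #include <libmnl/libmnl.h>
    #include <linux/if_addr.h>
    #include <linux/rtnetlink.h>
    #include <sys/socket.h>

    static struct nlmsghdr *build_getaddr_req(char *buf, int32_t netnsid)
    {
            struct nlmsghdr *nlh = mnl_nlmsg_put_header(buf);
            struct ifaddrmsg *ifa;

            nlh->nlmsg_type  = RTM_GETADDR;
            nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;

            ifa = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifa));
            ifa->ifa_family = AF_INET6;

            /* The new attribute: which namespace the answer should describe. */
            mnl_attr_put_u32(nlh, IFA_TARGET_NETNSID, netnsid);
            return nlh;
    }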
-
Submitted by Christian Brauner
get_target_net() will be used in follow-up patches in the ipv{4,6} codepaths to retrieve network namespaces based on network namespace identifiers. So remove the static declaration and export it in the rtnetlink header. Also, rename it to rtnl_get_net_ns_capable() to make it obvious what this function is doing. Export rtnl_get_net_ns_capable() so it can be used when ipv6 is built as a module. Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Vincent Whitchurch
Currently, the only way to ignore outgoing packets on a packet socket is via the BPF filter. With MSG_ZEROCOPY, packets that are looped into AF_PACKET are copied in dev_queue_xmit_nit(), and this copy happens even if the filter run from packet_rcv() would reject them. So the presence of a packet socket on the interface takes away the benefits of MSG_ZEROCOPY, even if the packet socket is not interested in outgoing packets. (Even when MSG_ZEROCOPY is not used, the skb is unnecessarily cloned, but the cost of that is much lower.) Add a socket option that allows AF_PACKET sockets to ignore outgoing packets to solve this. Note that the *BSDs already have something similar: BIOCSSEESENT/BIOCSDIRECTION and BIOCSDIRFILT. The first intended user is lldpd. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
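A userspace sketch of the new option on a capture socket; the option name PACKET_IGNORE_OUTGOING is not in the commit text and is assumed here, as are the required kernel headers. Error handling is omitted:

    #include <arpa/inet.h>
    #include <linux/if_ether.h>
    #include <linux/if_packet.h>
    #include <sys/socket.h>

    static int open_rx_only_packet_socket(void)
    {
            int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
            int one = 1;

            /* Assumed option name: ask the kernel not to loop outgoing
             * packets back to this socket, so MSG_ZEROCOPY transmissions
             * are not copied on our behalf. */
            setsockopt(fd, SOL_PACKET, PACKET_IGNORE_OUTGOING, &one, sizeof(one));
            return fd;
    }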
-
Submitted by Shay Agroskin
Changed the "priv.clock.lock" lock from an rw_lock to a seq_lock in order to improve packet rate performance. Tested on an Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz. Sent 64-byte packets between two peers connected by ConnectX-5 and measured the receiver's packet rate in three modes: no time-stamping (base rate), time-stamping using the rw_lock (old lock) for the critical region, and time-stamping using the seq_lock (new lock) for the critical region. Only the receiver time-stamped its packets. The measured packet rate improvements are:

Single flow (multiple TX rings to a single RX ring):
- without timestamping: 4.26 M packets/sec
- with rw-lock (old lock): 4.1 M packets/sec
- with seq-lock (new lock): 4.16 M packets/sec (1.46% improvement)

Multiple flows (multiple TX rings to six RX rings):
- without timestamping: 22 M packets/sec
- with rw-lock (old lock): 11.7 M packets/sec
- with seq-lock (new lock): 21.3 M packets/sec (82.05% improvement)

The packet rate improvement is due to the seq-lock sparing the 'readers' any atomic operations. Since there are many more 'readers' than 'writers' contending on this lock, almost all atomic operations are saved; this results in a dramatic decrease in overall cache misses.

Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Submitted by Vlad Buslov
The previous patch in the series changed the flow counter storage structure from an rb_tree to a linked list in order to improve flow counter traversal performance. The drawback of that solution is that flow counter lookup by id becomes linear in complexity. Store pointers to flow counters in an idr in order to make lookup logarithmic again. Idr is a non-intrusive data structure and doesn't require extending the flow counter struct with new elements. This means that the idr can be used for lookup, while the linked list from the previous patch is used for traversal, and the struct mlx5_fc size stays <= 2 cache lines. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Amir Vadai <amir@vadai.me> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
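An illustrative sketch of the lookup side (generic idr usage, not the actual mlx5 code; the function names are hypothetical): counters stay on the linked list for traversal, while the idr maps a device-assigned id back to its counter.

    #include <linux/idr.h>

    static DEFINE_IDR(fc_idr);      /* hypothetical; callers serialize updates */

    static int fc_idr_insert(void *counter, u32 id)
    {
            /* idr_alloc() with a [id, id + 1) window pins the entry at 'id'. */
            return idr_alloc(&fc_idr, counter, id, id + 1, GFP_KERNEL);
    }

    static void *fc_idr_lookup(u32 id)
    {
            return idr_find(&fc_idr, id);
    }

    static void fc_idr_erase(u32 id)
    {
            idr_remove(&fc_idr, id);
    }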
-
Submitted by Vlad Buslov
In order to improve performance of the flow counter stats query loop that traverses all configured flow counters, replace the rb_tree with a doubly linked list. This change improves performance of traversing flow counters by removing the tree traversal (profiling data showed that the call to rb_next was the top CPU consumer). However, lookup of a flow counter in the list becomes linear instead of logarithmic. This problem is fixed by the next patch in the series, which adds an idr for fast lookup. Idr is used because it is not an intrusive data structure and doesn't require adding any new members to struct mlx5_fc, which allows its control data part to stay <= 1 cache line in size. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Amir Vadai <amir@vadai.me> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Submitted by Vlad Buslov
In order to prevent the flow counters stats work function from traversing the whole flow counters tree while searching for deleted flow counters, a new list to store deleted flow counters is added to struct mlx5_fc_stats. The lockless NULL-terminated singly linked list data type is used for the following reasons:
- This use case only needs to add a single element to the list and remove/iterate the whole list. A lockless list doesn't require any additional synchronization for these operations.
- The first cache line of the flow counter data structure only has space to store a single additional pointer, which precludes usage of a doubly linked list.
Remove the flow counter 'deleted' flag that is no longer needed. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Amir Vadai <amir@vadai.me> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Submitted by Vlad Buslov
In order to prevent the flow counters stats work function from traversing the whole flow counters tree while searching for deleted flow counters, a new list to store deleted flow counters will be added to struct mlx5_fc_stats. However, the flow counter structure itself has no space left to store any more data in its first cache line. To free the space needed to store the additional list node, convert the current addlist doubly linked list (two pointers per node) to an atomic singly linked list (one pointer per node). The lockless NULL-terminated singly linked list data type doesn't require any additional external synchronization for the operations used by the flow counters module (add a single new element, remove all elements from the list and traverse them). Remove the addlist_lock that is no longer needed. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Amir Vadai <amir@vadai.me> Reviewed-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
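An illustrative sketch of the lockless-list pattern described above (the structures are hypothetical, not the real mlx5 ones): one llist_node costs a single pointer, and neither the producer nor the consumer needs an external lock.

    #include <linux/llist.h>
    #include <linux/types.h>

    struct counter {
            struct llist_node addlist;      /* one pointer instead of a list_head */
            u32 id;
    };

    static LLIST_HEAD(added_counters);

    /* Producer: adding a single element needs no lock. */
    static void counter_created(struct counter *c)
    {
            llist_add(&c->addlist, &added_counters);
    }

    /* Consumer (the stats work): atomically take the whole batch and walk it. */
    static void absorb_added_counters(void)
    {
            struct llist_node *batch = llist_del_all(&added_counters);
            struct counter *c, *tmp;

            llist_for_each_entry_safe(c, tmp, batch, addlist) {
                    /* move c onto the main traversal list, etc. */
            }
    }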
-
Submitted by Tariq Toukan
The minimal stride size is 16. Hence, the number of strides in a fragment (of PAGE_SIZE) is <= PAGE_SIZE / 16 <= 4K. A u16 is sufficient to represent this. Fixes: d7037ad7 ("net/mlx5: Fix QP fragmented buffer allocation") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Submitted by Tariq Toukan
The minimal stride size is 16. Hence, the number of strides in a fragment (of PAGE_SIZE) is <= PAGE_SIZE / 16 <= 4K. A u16 is sufficient to represent this. Fixes: 388ca8be ("IB/mlx5: Implement fragmented completion queue (CQ)") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Submitted by Jack Morgenstein
When the mlx5 health mechanism detects a problem while the driver is in the middle of init_one or remove_one, the driver needs to prevent the health mechanism from scheduling future work; if future work is scheduled, there is a use-after-free problem: the system WQ tries to run the work item (which has been freed) at the scheduled future time. Prevent this by disabling work item scheduling in the health mechanism when the driver is in the middle of init_one() or remove_one(). Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
- 05 Sep 2018, 5 commits
-
-
Submitted by Steven Rostedt (VMware)
Borislav reported the following splat:

    =============================
    WARNING: suspicious RCU usage
    4.19.0-rc1+ #1 Not tainted
    -----------------------------
    ./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!

    other info that might help us debug this:

    RCU used illegally from idle CPU!
    rcu_scheduler_active = 2, debug_locks = 1
    RCU used illegally from extended quiescent state!
    1 lock held by swapper/0/0:
     #0: 000000004557ee0e (rcu_read_lock){....}, at: perf_event_output_forward+0x0/0x130

    stack backtrace:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc1+ #1
    Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
    Call Trace:
     dump_stack+0x85/0xcb
     perf_event_output_forward+0xf6/0x130
     __perf_event_overflow+0x52/0xe0
     perf_swevent_overflow+0x91/0xb0
     perf_tp_event+0x11a/0x350
     ? find_held_lock+0x2d/0x90
     ? __lock_acquire+0x2ce/0x1350
     ? __lock_acquire+0x2ce/0x1350
     ? retint_kernel+0x2d/0x2d
     ? find_held_lock+0x2d/0x90
     ? tick_nohz_get_sleep_length+0x83/0xb0
     ? perf_trace_cpu+0xbb/0xd0
     ? perf_trace_buf_alloc+0x5a/0xa0
     perf_trace_cpu+0xbb/0xd0
     cpuidle_enter_state+0x185/0x340
     do_idle+0x1eb/0x260
     cpu_startup_entry+0x5f/0x70
     start_kernel+0x49b/0x4a6
     secondary_startup_64+0xa4/0xb0

This is due to the tracepoints moving to SRCU usage, which does not require RCU to be "watching". But perf uses these tracepoints with RCU and expects it to be. Hence, we still need to add in the rcu_irq_enter/exit_irqson() calls for "rcuidle" tracepoints. This is a temporary fix until we have SRCU working in NMI context, and then perf can be converted to use that instead of normal RCU.

Link: http://lkml.kernel.org/r/20180904162611.6a120068@gandalf.local.home
Cc: x86-ml <x86@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Fixes: e6753f23 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Submitted by Sara Sharon
Some hardware has limitations on the packet types that can go into an AMSDU. Add an optional driver callback to determine whether two skbs can be used in the same AMSDU or not. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
Submitted by Sara Sharon
Some drivers may have an AMSDU size limitation per TID, due to HW constraints. Add an option to set this limit. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
Submitted by Sara Sharon
We have a TXQ abstraction for non-data packets that need powersave buffering. Since the AP cannot sleep, in the case of a station we can use this TXQ for all management frames, regardless of whether they are bufferable. Add a HW flag to allow that. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
Submitted by Shaul Triebitz
Align to the new 11ax draft D3.0. Change/add new MAC and PHY capabilities and update the drivers' 11ax capabilities and mac80211's debugfs accordingly. Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-