- 05 Jun, 2014 22 commits
-
-
Committed by Tom Herbert
Call gso_make_checksum. This should have the benefit of using a checksum that may have been previously computed for the packet. This also adds NETIF_F_GSO_GRE_CSUM to differentiate devices that offload GRE GSO with and without the GRE checksum offloaded. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
Added a new netif feature for GSO_UDP_TUNNEL_CSUM. This indicates that a device is capable of computing the UDP checksum in the encapsulating header of a UDP tunnel. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
Call common gso_make_checksum when calculating checksum for a TCP GSO segment. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
When creating a GSO packet segment we may need to set more than one checksum in the packet (for instance a TCP checksum and UDP checksum for VXLAN encapsulation). To be efficient, we want to do checksum calculation for any part of the packet at most once. This patch adds csum_start offset to skb_gso_cb. This tracks the starting offset for skb->csum which is initially set in skb_segment. When a protocol needs to compute a transport checksum it calls gso_make_checksum which computes the checksum value from the start of transport header to csum_start and then adds in skb->csum to get the full checksum. skb->csum and csum_start are then updated to reflect the checksum of the resultant packet starting from the transport header. This patch also adds a flag to skbuff, encap_hdr_csum, which is set in *gso_segment functions to indicate that a tunnel protocol needs checksum calculation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
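The "compute each part at most once" idea can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: `struct gso_state`, `csum_add_bytes` and `make_checksum` are hypothetical names standing in for skb->csum, csum_start and gso_make_checksum, and the sketch assumes all offsets are even and that checksum fields in the headers are left at zero (it does not store the result back into the packet).

```c
#include <stddef.h>
#include <stdint.h>

/* Add len bytes at buf to a running 16-bit one's-complement sum.
 * Assumes buf starts at an even offset within the packet. */
static uint32_t csum_add_bytes(const uint8_t *buf, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                        /* odd trailing byte, zero padded */
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)                   /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return sum;
}

/* Hypothetical stand-in for skb->csum plus the csum_start offset. */
struct gso_state {
    uint32_t csum;        /* sum of the bytes from csum_start to the end */
    size_t csum_start;    /* offset at which that coverage begins */
};

/* Extend the cached sum backwards to transport_off and return the
 * complemented checksum. Only the bytes between transport_off and
 * csum_start are read; everything after csum_start is reused. */
static uint16_t make_checksum(const uint8_t *pkt, size_t transport_off,
                              struct gso_state *st)
{
    uint32_t full = csum_add_bytes(pkt + transport_off,
                                   st->csum_start - transport_off,
                                   st->csum);

    st->csum = full;                    /* now covers transport_off..end */
    st->csum_start = transport_off;
    return (uint16_t)~full;
}
```

Calling `make_checksum` first for the inner transport header and then for the outer one touches the payload bytes only once, which is the saving the commit message describes.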
-
Committed by Tom Herbert
Call common functions to set checksum for UDP tunnel. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
Added udp_set_csum and udp6_set_csum functions to set UDP checksums in packets. These are for simple UDP packets such as those that might be created in UDP tunnels. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by David S. Miller
Vlad Yasevich says: ==================== Fix support for macvlan devices on top of bonding. Currently, macvlan devices do not work well over bond interfaces. Everything works well until a failover is triggered in the bond device, and then macvlan becomes unreachable until arp entries are flushed. This series adds needed functionality to handle correct notifications and update switches with mac addresses assigned to macvlans. The first patch simply adds the IFF_UNICAST_FLT flag to bonds since they already correctly manage the unicast filter list of the slaves, so we might as well prevent the bond from needlessly going into promiscuous mode. The second patch adds a notifier handler to macvlan to trigger correct ARP notifications. The third patch adds handling for TLB and RLB modes that use special ETH_P_LOOPBACK type packets to teach the switch about mac addresses. It also allows ARPs for the macvlan mac addresses to be handled by RLB mode. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Vlad Yasevich
To make TLB mode work, the patch allows learning packets to be sent using mac addresses assigned to macvlan devices, also taking into account vlans that may be between the bond and macvlan device. To make RLB work, all we have to do is accept ARP packets for addresses added to the bond dev->uc list. Since RLB mode will take care to update the peers directly with correct mac addresses, learning packets for these addresses do not have to be sent to the switch. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Vlad Yasevich
Bonding and team drivers generate specific events during failover that trigger switch updates. When a macvlan device is configured on top of bonding, we want switches to learn about the macvlan devices as well. This patch adds a handler to the macvlan driver to propagate these events to all macvlan devices. We let the generic inetdev event handler do the work. This allows macvlan to operate correctly over an active-backup mode bond. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Vlad Yasevich
Bonding devices manage the unicast filters of the underlying interfaces, but do not turn on the IFF_UNICAST_FLT flag. Thus any time a unicast address is added to the bond, the bond is placed in promiscuous mode. Turn on IFF_UNICAST_FLT on the bond device so that the bond does not go into promiscuous mode needlessly. If an underlying device does not support unicast filtering, that device will automatically enter promiscuous mode already. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Sasha Levin
This reverts commit 30f38d2f. fib_triestat is surrounded by a big lie: while it claims that it's a seq_file (fib_triestat_seq_open, fib_triestat_seq_show), it isn't: static const struct file_operations fib_triestat_fops = { .owner = THIS_MODULE, .open = fib_triestat_seq_open, .read = seq_read, .llseek = seq_lseek, .release = single_release_net, }; Yes, fib_triestat is just a regular file. A small detail (assuming CONFIG_NET_NS=y) is that while for seq_files you could do seq_file_net() to get the net ptr, doing so for a regular file would be wrong and would dereference an invalid pointer. The fib_triestat lie claimed a victim, and trying to show the file would be bad for the kernel. This patch just reverts the issue and fixes fib_triestat, which still needs a rewrite to either be a seq_file or stop claiming it is. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Antonio Ospite
Signed-off-by: Antonio Ospite <ao2@ao2.it> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Xiubo Li
When building with CONFIG_DEBUG_SECTION_MISMATCH enabled, the following warning occurs: LD drivers/net/built-in.o WARNING: drivers/net/built-in.o(.text+0xcd4c): Section mismatch in reference from the function gfar_probe() to the function .init.text:gfar_init_addr_hash_table() The function gfar_probe() references the function __init gfar_init_addr_hash_table(). This is often because gfar_probe lacks a __init annotation or the annotation of gfar_init_addr_hash_table is wrong. Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by David S. Miller
Wei Liu says: ==================== This is rebased version of Andrew's V8 patch series. The original cover letter: -------------------- xen-net{back, front}: Multiple transmit and receive queues This patch series implements multiple transmit and receive queues (i.e. multiple shared rings) for the xen virtual network interfaces. The series is split up as follows: - Patch 1 brings the 'grant_copy_op' array back into struct xenvif, in preparation for multi-queue support. See the patch itself for more details. - Patches 2 and 4 factor out the queue-specific data for netback and netfront respectively, and modify the rest of the code to use these as appropriate. - Patches 3 and 5 introduce new XenStore keys to negotiate and use multiple shared rings and event channels, and code to connect these as appropriate. - Patch 6 documents the XenStore keys required for the new feature in include/xen/interface/io/netif.h All other transmit and receive processing remains unchanged, i.e. there is a kthread per queue and a NAPI context per queue. The performance of these patches has been analysed in detail, with results available at: http://wiki.xenproject.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing To summarise: * Using multiple queues allows a VM to transmit at line rate on a 10 Gbit/s NIC, compared with a maximum aggregate throughput of 6 Gbit/s with a single queue. * For intra-host VM--VM traffic, eight queues provide 171% of the throughput of a single queue; almost 12 Gbit/s instead of 6 Gbit/s. * There is a corresponding increase in total CPU usage, i.e. this is a scaling out over available resources, not an efficiency improvement. * Results depend on the availability of sufficient CPUs, as well as the distribution of interrupts and the distribution of TCP streams across the queues. Queue selection is currently achieved via an L4 hash on the packet (i.e. 
TCP src/dst port, IP src/dst address) and is not negotiated between the frontend and backend, since only one option exists. Future patches to support other frontends (particularly Windows) will need to add some capability to negotiate not only the hash algorithm selection, but also allow the frontend to specify some parameters to this. Note that queue selection is a decision by the transmitting system about which queue to use for a particular packet. In general, the algorithm may differ between the frontend and the backend with no adverse effects. Queue-specific XenStore entries for ring references and event channels are stored hierarchically, i.e. under .../queue-N/... where N varies from 0 to one less than the requested number of queues (inclusive). If only one queue is requested, it falls back to the flat structure where the ring references and event channels are written at the same level as other vif information. V8: - Squash the queue error handling code into patch 3. - Update the documentation (patch 6) according to comments on the equivalent patch to Xen. V7: - Rebase on latest net-next, which includes the netback grant mapping patch series from Zoltan Kiss - Reduce QUEUE_NAME_SIZE by 1 to avoid double-counting the trailing '\0' - Simplify the queue hashing by using (hash % num_queues) instead of multiply & shift. - Add ratelimited warning for invalid queue selection. - Fix error handling to correctly tear down already setup queues. - Use dev->real_num_tx_queues instead of separately maintaining a count of the number of queues. V6: - Use 'max_queues' as the module param. name for both netback and netfront. V5: - Fix bug in xenvif_free() that could lead to an attempt to transmit an skb after the queue structures had been freed. - Improve the XenStore protocol documentation in netif.h. - Fix IRQ_NAME_SIZE double-accounting for null terminator. - Move rx_gso_checksum_fixup stat into struct xenvif_stats (per-queue). 
- Don't initialise a local variable that is set in both branches (xspath). V4: - Add MODULE_PARM_DESC() for the multi-queue parameters for netback and netfront modules. - Move del_timer_sync() in netfront to after unregister_netdev, which restores the order in which these functions were called before applying these patches. V3: - Further indentation and style fixups. V2: - Rebase onto net-next. - Change queue->number to queue->id. - Add atomic operations around the small number of stats variables that are not queue-specific or per-cpu. - Fixup formatting and style issues. - XenStore protocol changes documented in netif.h. - Default max. number of queues to num_online_cpus(). - Check requested number of queues does not exceed maximum. -------------------- I rebased this on top of net-next. No functional change is introduced. The patch that needed some extra care was "xen-netback: Factor queue-specific data into queue struct" because it clashed with a fix introduced in net. A simple test of creating a guest, running iperf, then shutting down the guest worked as expected. The last patch fixes a minor problem where the queue name is not initialised in xen-netfront, resulting in names like "-tx" "-rx" in /proc/interrupts. Changes since v9 (no functional change introduced): * include commit summary in the commit message of first patch * fold David Vrabel's Reviewed-by into last patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
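Two details of the cover letter lend themselves to a short illustration: queue selection by `hash % num_queues` (the V7 simplification) and the XenStore layout that is hierarchical under `.../queue-N/...` but flat when only one queue is requested. The sketch below is a userspace approximation under stated assumptions: `select_queue` and `queue_key_path` are hypothetical helper names, and the base path and key name used in the usage note are illustrative only.

```c
#include <stdint.h>
#include <stdio.h>

/* Pick a queue index from a packet hash with a plain modulo. */
static unsigned int select_queue(uint32_t hash, unsigned int num_queues)
{
    if (num_queues <= 1)
        return 0;                 /* single queue: always the first one */
    return hash % num_queues;
}

/* Build the store path for one per-queue key.
 * With a single queue the key sits at the same level as the other vif
 * information; otherwise it goes under queue-<id>/. Returns what
 * snprintf() returns (the length that was, or would have been, written). */
static int queue_key_path(char *buf, size_t len, const char *base,
                          unsigned int num_queues, unsigned int id,
                          const char *key)
{
    if (num_queues == 1)
        return snprintf(buf, len, "%s/%s", base, key);
    return snprintf(buf, len, "%s/queue-%u/%s", base, id, key);
}
```

For example, with an assumed base of `device/vif/0` and an assumed key of `tx-ring-ref`, four queues give `device/vif/0/queue-3/tx-ring-ref` for the last queue, while a single queue gives `device/vif/0/tx-ring-ref`.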
-
Committed by Wei Liu
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Andrew J. Bennieston
Document the multi-queue feature in terms of XenStore keys to be written by the backend and by the frontend. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Andrew J. Bennieston
Build on the refactoring of the previous patch to implement multiple queues between xen-netfront and xen-netback. Check XenStore for multi-queue support, and set up the rings and event channels accordingly. Write ring references and event channels to XenStore in a queue hierarchy if appropriate, or flat when using only one queue. Update the xennet_select_queue() function to choose the queue on which to transmit a packet based on the skb hash result. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Andrew J. Bennieston
In preparation for multi-queue support in xen-netfront, move the queue-specific data from struct netfront_info to struct netfront_queue, and update the rest of the code to use this. Also adds loops over queues where appropriate, even though only one is configured at this point, and uses alloc_etherdev_mq() and the corresponding multi-queue netif wake/start/stop functions in preparation for multiple active queues. Finally, implements a trivial queue selection function suitable for ndo_select_queue, which simply returns 0, selecting the first (and only) queue. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Andrew J. Bennieston
Builds on the refactoring of the previous patch to implement multiple queues between xen-netfront and xen-netback. Writes the maximum supported number of queues into XenStore, and reads the values written by the frontend to determine how many queues to use. Ring references and event channels are read from XenStore on a per-queue basis and rings are connected accordingly. Also adds code to handle the cleanup of any already initialised queues if the initialisation of a subsequent queue fails. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Wei Liu
In preparation for multi-queue support in xen-netback, move the queue-specific data from struct xenvif into struct xenvif_queue, and update the rest of the code to use this. Also adds loops over queues where appropriate, even though only one is configured at this point, and uses alloc_netdev_mq() and the corresponding multi-queue netif wake/start/stop functions in preparation for multiple active queues. Finally, implements a trivial queue selection function suitable for ndo_select_queue, which simply returns 0 for a single queue and uses skb_get_hash() to compute the queue index otherwise. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Andrew J. Bennieston
This array was allocated separately in commit ac3d5ac2 ("xen-netback: fix guest-receive-side array sizes") due to it being very large, and a struct xenvif is allocated as the netdev_priv part of a struct net_device, i.e. via kmalloc() but falling back to vmalloc() if the initial alloc. fails. In preparation for the multi-queue patches, where this array becomes part of struct xenvif_queue and is always allocated through vzalloc(), move this back into the struct xenvif. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Committed by David S. Miller
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to e1000, igb and ixgbe. Emil provides his version 2 fix for the detection of SFP+ capable interfaces. In cases where the driver is loaded while there are no SFP+ modules in the cage, the interface was not being detected as SFP capable. Resolve the issue by identifying interfaces with no PHY type set as SFP capable, which allows the driver to detect the SFP module when the interface is brought up. In this version 2 of the patch, the 82599 specific check was removed since we have 82598 devices that are SFP capable. Jacob removes the inclusion of the export header in the ixgbe PTP core, since it is not needed. He also renames igb_ptp_enable() to igb_ptp_feature_enable() to better reflect the function's actual purpose. Todd fixes the ethtool loopback test for i354 backplane devices: since we do not know what PHY is to be used for the devices, use MAC loopback for ethtool tests. Todd also sets the packet buffer size register defaults for i210 devices. Yongjian Xu removes the check for skb->len being negative or zero since there is never a case where it would be zero or negative for e1000. Manuel Schölling updates e1000 to use the time_after() helper function. v2: Fix indentation on wrapped line in patch 3 of the series based on feedback from David Miller ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Jun, 2014 18 commits
-
-
Committed by Manuel Schölling
To be future-proof and for better readability the time comparisons are modified to use time_after() instead of plain, error-prone math. Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
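Why plain comparisons on a tick counter are error-prone can be shown with a small userspace sketch. It mirrors the signed-difference trick that time_after() relies on, but on an explicit 32-bit counter; `tick_after` and `tick_after_naive` are hypothetical names, and the sketch assumes the usual two's-complement conversion from uint32_t to int32_t.

```c
#include <stdint.h>

/* Wrap-safe test: is tick a later than tick b?
 * The unsigned subtraction wraps, and the sign of the result tells
 * which value is ahead, as long as they are less than 2^31 apart. */
static int tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* The plain comparison, which gives the wrong answer once the
 * counter has wrapped between b and a. */
static int tick_after_naive(uint32_t a, uint32_t b)
{
    return a > b;
}
```

With `now = 0xFFFFFFF0` and a deadline 0x20 ticks later (which wraps to 0x10), `tick_after(deadline, now)` is true while `tick_after_naive(deadline, now)` is false.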
-
Committed by Yongjian Xu
There is no case where skb->len would be 0 or 'negative'. Remove the check. Signed-off-by: Yongjian Xu <xuyongjiande@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Todd Fujinaka
Set the defaults on probe for the packet buffer size registers for the i210. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Todd Fujinaka
We can't know what PHY is to be used for i354 backplane, so use MAC loopback for ethtool tests. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Jacob Keller
The name igb_ptp_enable does not reflect the purpose of this function, so rename it to better explain its purpose. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Jacob Keller
We don't need this header file, so we shouldn't be including it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Emil Tantilov
In cases where the driver is loaded while there are no SFP+ modules in the cage, the interface was not being detected as SFP capable. To account for this the driver called identify_sfp in ixgbe_get_settings to make sure the data is correct. However, when there is no SFP+ module in the cage the driver waits for the I2C reads to time out, which can take more than a second and will cause issues with tools (like net-snmp) that may poll for that information. This patch resolves the issue by identifying interfaces with no PHY type set as SFP capable, which allows the driver to detect the SFP module when the interface is brought up. As a result of this we can also remove the identify_sfp call from ixgbe_get_settings. v2: remove the 82599 specific check since we have 82598 devices that are SFP capable. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Committed by David S. Miller
Conflicts: include/net/inetpeer.h net/ipv6/output_core.c Changes in net were fixing bugs in code removed in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Sergei Shtylyov
Commit 4a55530f (net: sh_eth: modify the definitions of register) managed to leave out the E-DMAC register entries in sh_eth_offset_fast_sh3_sh2[], thus totally breaking SH7619/771x support. Add the missing entries using the data from before that commit. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ben Dooks
The current behaviour of the sh_eth driver is not to use the RNC bit for the receive ring. This means that every packet received not only generates an IRQ but also stops the receive ring DMA until the driver re-enables it after unloading the packet. As a result, a number of the following errors are generated because the receive packet FIFO overflows with nowhere to put packets: net eth0: Receive FIFO Overflow Since feedback from Yoshihiro Shimoda shows that every supported LSI for this driver should have the bit enabled, it seems the best way is to remove the RMCR default value from the per-system data and just write it when initialising the RMCR value. This is discussed in the message (http://www.spinics.net/lists/netdev/msg284912.html). I have tested the RMCR_RNC configuration with NFS root filesystem and the driver has not failed yet. There are further test reports from Sergei Shtylyov and others for both the R8A7790 and R8A7791. There is also feedback from Cao Minh Hiep[1], who reports the same issue in (http://comments.gmane.org/gmane.linux.network/316285), showing this fixes issues with losing UDP datagrams under iperf. Tested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Acked-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by WANG Cong
When we jump to free_pcpu on failure in alloc_netdev_mqs(), rx and tx queues are not yet allocated, so there is no need to free them. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Cong Wang
It is possible that ->newlink() fails before registering the device. In this case we should just free it; it is safe to call free_netdev(). Fixes: commit 0e0eee24 (net: correct error path in rtnl_newlink()) Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
A rmb() is required to ensure that the CQE is not read before it is written by the adapter DMA. PCI ordering rules will make sure the other fields are written before the marker at the end of struct eth_fast_path_rx_cqe, but without rmb() a weakly ordered processor can process stale data. Without the barrier we have observed various crashes, including bnx2x_tpa_start being called on queues not stopped (resulting in the message "start of bin not in stop") and NULL pointer exceptions from bnx2x_rx_int. Signed-off-by: Milton Miller <miltonm@us.ibm.com> Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
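The ordering requirement described above has a close userspace analogue in C11 atomics. The sketch below is only an analogy, since a DMA-writing adapter is not a C11 thread: `struct cqe`, `producer_publish` and `consumer_poll` are hypothetical names, the marker is written last by the producer, and the acquire fence in the consumer plays the role that rmb() plays in the driver.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical completion entry: payload fields followed by a marker
 * that the producer writes last. */
struct cqe {
    uint32_t len;
    uint32_t flags;
    _Atomic uint32_t marker;
};

/* Producer side: fill the fields, then publish the marker. The release
 * store keeps the field writes ordered before the marker write. */
static void producer_publish(struct cqe *e, uint32_t len, uint32_t flags,
                             uint32_t gen)
{
    e->len = len;
    e->flags = flags;
    atomic_store_explicit(&e->marker, gen, memory_order_release);
}

/* Consumer side: returns 1 and copies the fields out only if the marker
 * for generation gen is present. Without the fence, a weakly ordered
 * CPU could satisfy the field reads before the marker read and hand
 * back stale data. */
static int consumer_poll(struct cqe *e, uint32_t gen,
                         uint32_t *len, uint32_t *flags)
{
    if (atomic_load_explicit(&e->marker, memory_order_relaxed) != gen)
        return 0;
    atomic_thread_fence(memory_order_acquire);   /* stands in for rmb() */
    *len = e->len;
    *flags = e->flags;
    return 1;
}
```

The consumer checks the marker first and reads the remaining fields only after the fence, which is the same read order the commit enforces.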
-
When injecting an EEH error into a bnx2x adapter, the adapter could not be recovered and caused recursive EEH errors. The patch fixes the issue. Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Balakumaran Kannan
As the smsc driver supports carrier detection, it should unset the NOCARRIER flag only after carrier state determination. By default that flag is off, so the driver should set it before starting auto-negotiation. Signed-off-by: Balakumaran <Balakumaran.Kannan@ap.sony.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by stephen hemminger
The uuid structure could be managed as a const in several places. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Benoit Taine
This issue was reported by coccicheck using the semantic patch at scripts/coccinelle/api/resource_size.cocci Signed-off-by: Benoit Taine <benoit.taine@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Michal Kubecek
The xfrm_user module registers its pernet init/exit after xfrm itself so that its net exit function xfrm_user_net_exit() is executed before xfrm_net_exit(), which calls xfrm_state_fini() to clean up the SAs (xfrm states). This opens a window between zeroing the net->xfrm.nlsk pointer and deleting all xfrm_state instances which may access it (via the timer). If an xfrm state expires in this window, xfrm_exp_state_notify() will pass a null pointer as the socket to nlmsg_multicast(). As the notifications are called inside an rcu_read_lock() block, it is sufficient to retrieve the nlsk socket with rcu_dereference() and check it for null. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-