- 01 11月, 2020 11 次提交
-
-
由 Robert Hancock 提交于
Update the axienet driver to properly support the Xilinx PCS/PMA PHY component which is used for 1000BaseX and SGMII modes, including properly configuring the auto-negotiation mode of the PHY and reading the negotiated state from the PHY. Signed-off-by: NRobert Hancock <robert.hancock@calian.com> Reviewed-by: NRadhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Link: https://lore.kernel.org/r/20201028171429.1699922-1-robert.hancock@calian.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alex Elder 提交于
The previous commit added support for IPA having up to six source and destination resources. But currently nothing uses more than four. (Five of each are used in a newer version of the hardware.) I find that in one of my build environments the compiler complains about newly-added code in two spots. Inspection shows that the warnings have no merit, but this compiler does not recognize that. ipa_main.c:457:39: warning: array index 5 is past the end of the array (which contains 4 elements) [-Warray-bounds] (and the same warning at line 483) We can make this warning go away by changing the number of elements in the source and destination resource limit arrays--now rather than waiting until we need it to support the newer hardware. This change was coming soon anyway; make it now to get rid of the warning. Signed-off-by: NAlex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20201031151524.32132-1-elder@linaro.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Jakub Kicinski 提交于
Heiner Kallweit says: ==================== net: add functionality to net core byte/packet counters and use it in r8169 This series adds missing functionality to the net core handling of byte/packet counters and statistics. The extensions are then used to remove private rx/tx byte/packet counters in r8169 driver. ==================== Link: https://lore.kernel.org/r/1fdb8ecd-be0a-755d-1d92-c62ed8399e77@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Heiner Kallweit 提交于
After switching to the net core rx/tx byte/packet counters we can remove the now unused private version. Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Heiner Kallweit 提交于
Switch to the net core rx/tx byte/packet counter infrastructure. This simplifies the code, only small drawback is some memory overhead because we use just one queue, but allocate the counters per cpu. Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Heiner Kallweit 提交于
We have netdev_alloc_pcpu_stats(), and we have devm_alloc_percpu(). Add a managed version of netdev_alloc_pcpu_stats, e.g. for allocating the per-cpu stats in the probe() callback of a driver. It needs to be a macro for dealing properly with the type argument. Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Heiner Kallweit 提交于
Add dev_sw_netstats_tx_add(), complementing already existing dev_sw_netstats_rx_add(). Other than dev_sw_netstats_rx_add allow to pass the number of packets as function argument. Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Jakub Kicinski 提交于
Sebastian Andrzej Siewior says: ==================== in_interrupt() cleanup, part 2 in the discussion about preempt count consistency across kernel configurations: https://lore.kernel.org/r/20200914204209.256266093@linutronix.de/ Linus clearly requested that code in drivers and libraries which changes behaviour based on execution context should either be split up so that e.g. task context invocations and BH invocations have different interfaces or if that's not possible the context information has to be provided by the caller which knows in which context it is executing. This includes conditional locking, allocation mode (GFP_*) decisions and avoidance of code paths which might sleep. In the long run, usage of 'preemptible, in_*irq etc.' should be banned from driver code completely. This is part two addressing remaining drivers except for orinoco-usb. ==================== Cherry picking only Ethernet changes. Link: https://lore.kernel.org/r/20201027225454.3492351-1-bigeasy@linutronix.deSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
The driver uses in_irq() to determine if the tlan_priv::lock has to be acquired in tlan_mii_read_reg() and tlan_mii_write_reg(). The interrupt handler acquires the lock outside of these functions so the in_irq() check is meant to prevent a lock recursion deadlock. But this check is incorrect when interrupt force threading is enabled because then the handler runs in thread context and in_irq() correctly returns false. The usage of in_*() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be seperated or the context be conveyed in an argument passed by the caller, which usually knows the context. tlan_set_timer() has this conditional as well, but this function is only invoked from task context or the timer callback itself. So it always has to lock and the check can be removed. tlan_mii_read_reg(), tlan_mii_write_reg() and tlan_phy_print() are invoked from interrupt and other contexts. Split out the actual function body into helper variants which are called from interrupt context and make the original functions wrappers which acquire tlan_priv::lock unconditionally. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Samuel Chessman <chessman@tux.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
nv_update_stats() triggers a WARN_ON() when invoked from hard interrupt context because the locks in use are not hard interrupt safe. It also has an assert_spin_locked() which was the lock check before the lockdep era. Lockdep has way broader locking correctness checks and covers both issues, so replace the warning and the lock assert with lockdep_assert_held(). Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Rain River <rain.1986.08.12@gmail.com> Cc: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
wait_for_cmd_complete() uses in_interrupt() to detect whether it is safe to sleep or not. The usage of in_interrupt() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be seperated or the context be conveyed in an argument passed by the caller, which usually knows the context. in_interrupt() also is only partially correct because it fails to chose the correct code path when just preemption or interrupts are disabled. Add an argument 'may_block' to both functions and adjust the callers to pass the context information. The following call chains which end up invoking wait_for_cmd_complete() were analyzed to be safe to sleep: s2io_card_up() s2io_set_multicast() init_nic() init_tti() s2io_close() do_s2io_delete_unicast_mc() do_s2io_add_mac() s2io_set_mac_addr() do_s2io_prog_unicast() do_s2io_add_mac() s2io_reset() do_s2io_restore_unicast_mc() do_s2io_add_mc() do_s2io_add_mac() s2io_open() do_s2io_prog_unicast() do_s2io_add_mac() The following call chains which end up invoking wait_for_cmd_complete() were analyzed to be safe to sleep: __dev_set_rx_mode() s2io_set_multicast() s2io_txpic_intr_handle() s2io_link() init_tti() Add a may_sleep argument to wait_for_cmd_complete(), s2io_set_multicast() and init_tti() and hand the context information in from the call sites. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Jon Mason <jdmason@kudzu.us> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 31 10月, 2020 29 次提交
-
-
由 Jakub Kicinski 提交于
Vladimir Oltean says: ==================== L2 multicast forwarding for Ocelot switch This series enables the mscc_ocelot switch to forward raw L2 (non-IP) mdb entries as configured by the bridge driver after this patch: https://patchwork.ozlabs.org/project/netdev/patch/20201028233831.610076-1-vladimir.oltean@nxp.com/ ==================== Link: https://lore.kernel.org/r/20201029022738.722794-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
There is one main difference in mscc_ocelot between IP multicast and L2 multicast. With IP multicast, destination ports are encoded into the upper bytes of the multicast MAC address. Example: to deliver the address 01:00:5E:11:22:33 to ports 3, 8, and 9, one would need to program the address of 00:03:08:11:22:33 into hardware. Whereas for L2 multicast, the MAC table entry points to a Port Group ID (PGID), and that PGID contains the port mask that the packet will be forwarded to. As to why it is this way, no clue. My guess is that not all port combinations can be supported simultaneously with the limited number of PGIDs, and this was somehow an issue for IP multicast but not for L2 multicast. Anyway. Prior to this change, the raw L2 multicast code was bogus, due to the fact that there wasn't really any way to test it using the bridge code. There were 2 issues: - A multicast PGID was allocated for each MDB entry, but it wasn't in fact programmed to hardware. It was dummy. - In fact we don't want to reserve a multicast PGID for every single MDB entry. That would be odd because we can only have ~60 PGIDs, but thousands of MDB entries. So instead, we want to reserve a multicast PGID for every single port combination for multicast traffic. And since we can have 2 (or more) MDB entries delivered to the same port group (and therefore PGID), we need to reference-count the PGIDs. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
This saves a re-classification of the MDB address on deletion. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
It is Not Needed, a comment will suffice. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
Since a helper is available for copying Ethernet addresses, let's use it. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
ocelot.h says: /* MAC table entry types. * ENTRYTYPE_NORMAL is subject to aging. * ENTRYTYPE_LOCKED is not subject to aging. * ENTRYTYPE_MACv4 is not subject to aging. For IPv4 multicast. * ENTRYTYPE_MACv6 is not subject to aging. For IPv6 multicast. */ We don't want the permanent entries added with 'bridge mdb' to be subject to aging. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Vladimir Oltean 提交于
When creating a new multicast port group, there is implicit conversion between the __u8 state member of struct br_mdb_entry and the unsigned char flags member of struct net_bridge_port_group. This implicit conversion relies on the fact that MDB_PERMANENT is equal to MDB_PG_FLAGS_PERMANENT. Let's be more explicit and convert the state to flags manually. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20201028234815.613226-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Nikolay Aleksandrov 提交于
Extend the bridge multicast control and data path to configure routes for L2 (non-IP) multicast groups. The uapi struct br_mdb_entry union u is extended with another variant, mac_addr, which does not change the structure size, and which is valid when the proto field is zero. To be compatible with the forwarding code that is already in place, which acts as an IGMP/MLD snooping bridge with querier capabilities, we need to declare that for L2 MDB entries (for which there exists no such thing as IGMP/MLD snooping/querying), that there is always a querier. Otherwise, these entries would be flooded to all bridge ports and not just to those that are members of the L2 multicast group. Needless to say, only permanent L2 multicast groups can be installed on a bridge port. Signed-off-by: NNikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20201028233831.610076-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Jakub Kicinski 提交于
Edward Cree says: ==================== sfc: EF100 TSO enhancements Support TSO over encapsulation (with GSO_PARTIAL), and over VLANs (which the code already handled but we didn't advertise). Also correct our handling of IPID mangling. I couldn't find documentation of exactly what shaped SKBs we can get given, so patch #2 is slightly guesswork, but when I tested TSO over both underlay and (VxLAN) overlay, the checksums came out correctly, so at least in those cases the edits we're making must be the right ones. Similarly, I'm not 100% sure I've correctly understood how FIXEDID and MANGLEID are supposed to work in patch #3. ==================== Link: https://lore.kernel.org/r/6e1ea05f-faeb-18df-91ef-572445691d89@solarflare.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Edward Cree 提交于
Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Edward Cree 提交于
AIUI, the NETIF_F_TSO_MANGLEID flag is a signal to the stack that a driver may _need_ to mangle IDs in order to do TSO, and conversely a signal from the stack that the driver is permitted to do so. Since we support both fixed and incrementing IPIDs, we should rely on the SKB_GSO_FIXEDID flag on a per-skb basis, rather than using the MANGLEID feature to make all TSOs fixed-id. Includes other minor cleanups of ef100_make_tso_desc() coding style. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Edward Cree 提交于
The NIC only needs to know where the headers it has to edit (TCP and inner and outer IPv4) are, which fits GSO_PARTIAL nicely. It also supports non-PARTIAL offload of UDP tunnels, again just needing to be told the outer transport offset so that it can edit the UDP length field. (It's not clear to me whether the stack will ever use the non-PARTIAL version with the netdev feature flags we're setting here.) Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Edward Cree 提交于
We need EFX_POPULATE_OWORD_17 for an encap TSO descriptor on EF100. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Jakub Kicinski 提交于
Alex Elder says: ==================== net: ipa: minor bug fixes This series fixes several bugs. They are minor, in that the code currently works on supported platforms even without these patches applied, but they're bugs nevertheless and should be fixed. Version 2 improves the commit message for the fourth patch. It also fixes a bug in two spots in the last patch. Both of these changes were suggested by Willem de Bruijn. ==================== Link: https://lore.kernel.org/r/20201028194148.6659-1-elder@linaro.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alex Elder 提交于
The minimum and maximum limits for resources assigned to a given resource group are programmed in pairs, with the limits for two groups set in a single register. If the number of supported resource groups is odd, only half of the register that defines these limits is valid for the last group; that group has no second group in the pair. Currently we ignore this constraint, and it turns out to be harmless, but it is not guaranteed to be. This patch addresses that, and adds support for programming the 5th resource group's limits. Rework how the resource group limit registers are programmed by having a single function program all group pairs rather than having one function program each pair. Add the programming of the 4-5 resource group pair limits to this function. If a resource group is not supported, pass a null pointer to ipa_resource_config_common() for that group and have that function write zeroes in that case. Tested-by: NSujit Kautkar <sujitka@chromium.org> Signed-off-by: NAlex Elder <elder@linaro.org> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alex Elder 提交于
The number of resource groups supported by the hardware can be different for source and destination resources. Determine the number supported for each using separate functions. Make the functions inline end move their definitions into "ipa_reg.h", because they determine whether certain register definitions are valid. Pass just the IPA hardware version as argument. IPA_RESOURCE_GROUP_COUNT represents the maximum number of resource groups the driver supports for any hardware version. Change that symbol to be two separate constants, one for source and the other for destination resource groups. Rename them to end with "_MAX" rather than "_COUNT", to reflect their true purpose. Tested-by: NSujit Kautkar <sujitka@chromium.org> Signed-off-by: NAlex Elder <elder@linaro.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alex Elder 提交于
The IPA hardware manages various resources (e.g. descriptors) internally to perform its functions. The resources are grouped, allowing different endpoints to use separate resource pools. This way one group of endpoints can be configured to operate unaffected by the resource use of endpoints in a different group. Endpoints should be assigned to a resource group, but we currently don't do that. Define a new resource_group field in the endpoint configuration data, and use it to assign the proper resource group to use for each AP endpoint. Tested-by: NSujit Kautkar <sujitka@chromium.org> Signed-off-by: NAlex Elder <elder@linaro.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alex Elder 提交于
The mask for the RSRC_GRP field in the INIT_RSRC_GRP endpoint initialization register is incorrectly defined for IPA v4.2 (where it is only one bit wide). So we need to fix this. The fix is not straightforward, however. Field masks are passed to functions like u32_encode_bits(), and for that they must be constant. To address this, we define a new inline function that returns the *encoded* value to use for a given RSRC_GRP field, which depends on the IPA version. The caller can then use something like this, to assign a given endpoint resource id 1: u32 offset = IPA_REG_ENDP_INIT_RSRC_GRP_N_OFFSET(endpoint_id); u32 val = rsrc_grp_encoded(ipa->version, 1); iowrite32(val, ipa->reg_virt + offset); The next patch requires this fix. Tested-by: NSujit Kautkar <sujitka@chromium.org> Signed-off-by: NAlex Elder <elder@linaro.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Alex Elder 提交于
At the end of ipa_mem_setup() we write the local packet processing context base register to tell it where the processing context memory is. But we are writing the wrong value. The value written turns out to be the offset of the modem header memory region (assigned earlier in the function). Fix this bug. Tested-by: NSujit Kautkar <sujitka@chromium.org> Signed-off-by: NAlex Elder <elder@linaro.org> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Moritz Fischer 提交于
The driver does not implement a shutdown handler which leads to issues when using kexec in certain scenarios. The NIC keeps on fetching descriptors which gets flagged by the IOMMU with errors like this: DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 Signed-off-by: NMoritz Fischer <mdf@kernel.org> Link: https://lore.kernel.org/r/20201028172125.496942-1-mdf@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Robert Hancock 提交于
The Finisar FCLF8520P2BTL 1000BaseT SFP module uses a Marvel 88E1111 PHY with a modified PHY ID. Add support for this ID using the 88E1111 methods. By default these modules do not have 1000BaseX auto-negotiation enabled, which is not generally desirable with Linux networking drivers. Add handling to enable 1000BaseX auto-negotiation when these modules are used in 1000BaseX mode. Also, some special handling is required to ensure that 1000BaseT auto-negotiation is enabled properly when desired. Based on existing handling in the AMD xgbe driver and the information in the Finisar FAQ: https://www.finisar.com/sites/default/files/resources/an-2036_1000base-t_sfp_faqreve1.pdfSigned-off-by: NRobert Hancock <robert.hancock@calian.com> Reviewed-by: NRussell King <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20201028171540.1700032-1-robert.hancock@calian.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Jakub Kicinski 提交于
Xin Long says: ==================== sctp: Implement RFC6951: UDP Encapsulation of SCTP Description From the RFC: The Main Reasons: o To allow SCTP traffic to pass through legacy NATs, which do not provide native SCTP support as specified in [BEHAVE] and [NATSUPP]. o To allow SCTP to be implemented on hosts that do not provide direct access to the IP layer. In particular, applications can use their own SCTP implementation if the operating system does not provide one. Implementation Notes: UDP-encapsulated SCTP is normally communicated between SCTP stacks using the IANA-assigned UDP port number 9899 (sctp-tunneling) on both ends. There are circumstances where other ports may be used on either end, and it might be required to use ports other than the registered port. Each SCTP stack uses a single local UDP encapsulation port number as the destination port for all its incoming SCTP packets, this greatly simplifies implementation design. An SCTP implementation supporting UDP encapsulation MUST maintain a remote UDP encapsulation port number per destination address for each SCTP association. Again, because the remote stack may be using ports other than the well-known port, each port may be different from each stack. However, because of remapping of ports by NATs, the remote ports associated with different remote IP addresses may not be identical, even if they are associated with the same stack. Because the well-known port might not be used, implementations need to allow other port numbers to be specified as a local or remote UDP encapsulation port number through APIs. Patches: This patchset is using the udp4/6 tunnel APIs to implement the UDP Encapsulation of SCTP with not much change in SCTP protocol stack and with all current SCTP features keeped in Linux Kernel. 1 - 4: Fix some UDP issues that may be triggered by SCTP over UDP. 5 - 7: Process incoming UDP encapsulated packets and ICMP packets. 8 -10: Remote encap port's update by sysctl, sockopt and packets. 11-14: Process outgoing pakects with UDP encapsulated and its GSO. 15-16: Add the part from draft-tuexen-tsvwg-sctp-udp-encaps-cons-03. 17: Enable this feature. Tests: - lksctp-tools/src/func_tests with UDP Encapsulation enabled/disabled: Both make v4test and v6test passed. - sctp-tests with UDP Encapsulation enabled/disabled: repeatability/procdumps/sctpdiag/gsomtuchange/extoverflow/ sctphashtable passed. Others failed as expected due to those "iptables -p sctp" rules. - netperf on lo/netns/virtio_net, with gso enabled/disabled and with ip_checksum enabled/disabled, with UDP Encapsulation enabled/disabled: No clear performance dropped. v1->v2: - Fix some incorrect code in the patches 5,6,8,10,11,13,14,17, suggested by Marcelo. - Append two patches 15-16 to add the Additional Considerations for UDP Encapsulation of SCTP from draft-tuexen-tsvwg-sctp-udp-encaps-cons-03. v2->v3: - remove the cleanup code in patch 2, suggested by Willem. - remove the patch 3 and fix the checksum in the new patch 3 after talking with Paolo, Marcelo and Guillaume. - add 'select NET_UDP_TUNNEL' in patch 4 to solve a compiling error. - fix __be16 type cast warning in patch 8. - fix the wrong endian orders when setting values in 14,16. v3->v4: - add entries in ip-sysctl.rst in patch 7,16, as Marcelo Suggested. - not create udp socks when udp_port is set to 0 in patch 16, as Marcelo noticed. v4->v5: - improve the description for udp_port and encap_port entries in patch 7, 16. - use 0 as the default udp_port. ==================== Link: https://lore.kernel.org/r/cover.1603955040.git.lucien.xin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
This patch is to enable udp tunneling socks by calling sctp_udp_sock_start() in sctp_ctrlsock_init(), and sctp_udp_sock_stop() in sctp_ctrlsock_exit(). Also add sysctl udp_port to allow changing the listening sock's port by users. Wit this patch, the whole sctp over udp feature can be enabled and used. v1->v2: - Also update ctl_sock udp_port in proc_sctp_do_udp_port() where netns udp_port gets changed. v2->v3: - Call htons() when setting sk udp_port from netns udp_port. v3->v4: - Not call sctp_udp_sock_start() when new_value is 0. - Add udp_port entry in ip-sysctl.rst. v4->v5: - Not call sctp_udp_sock_start/stop() in sctp_ctrlsock_init/exit(). - Improve the description of udp_port in ip-sysctl.rst. Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
This is from Section 4 of draft-tuexen-tsvwg-sctp-udp-encaps-cons-03, and it requires responding with an abort chunk with an error cause when the udp source port of the received init chunk doesn't match the encap port of the transport. Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
This patch is to add the function to make the abort chunk with the error cause for new encapsulation port restart, defined on Section 4.4 in draft-tuexen-tsvwg-sctp-udp-encaps-cons-03. v1->v2: - no change. v2->v3: - no need to call htons() when setting nep.cur_port/new_port. Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
This one basically does the similar things in sctp_v6_xmit as does for udp4 sock in the last patch, just note that: 1. label needs to be calculated, as it's the param of udp_tunnel6_xmit_skb(). 2. The 'nocheck' param of udp_tunnel6_xmit_skb() is false, as required by RFC. v1->v2: - Use sp->udp_port instead in sctp_v6_xmit(), which is more safe. Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
This patch does what the rfc6951#section-5.3 says for ipv4: "Within the UDP header, the source port MUST be the local UDP encapsulation port number of the SCTP stack, and the destination port MUST be the remote UDP encapsulation port number maintained for the association and the destination address to which the packet is sent (see Section 5.1). Because the SCTP packet is the UDP payload, the length of the UDP packet MUST be the length of the SCTP packet plus the size of the UDP header. The SCTP checksum MUST be computed for IPv4 and IPv6, and the UDP checksum SHOULD be computed for IPv4 and IPv6." Some places need to be adjusted in sctp_packet_transmit(): 1. For non-gso packets, when transport's encap_port is set, sctp checksum has to be done in sctp_packet_pack(), as the outer udp will use ip_summed = CHECKSUM_PARTIAL to do the offload setting for checksum. 2. Delay calling dst_clone() and skb_dst_set() for non-udp packets until sctp_v4_xmit(), as for udp packets, skb_dst_set() is not needed before calling udp_tunnel_xmit_skb(). then in sctp_v4_xmit(): 1. Go to udp_tunnel_xmit_skb() only when transport->encap_port and net->sctp.udp_port both are set, as these are one for dst port and another for src port. 2. For gso packet, SKB_GSO_UDP_TUNNEL_CSUM is set for gso_type, and with this udp checksum can be done in __skb_udp_tunnel_segment() for each segments after the sctp gso. 3. inner_mac_header and inner_transport_header are set, as these will be needed in __skb_udp_tunnel_segment() to find the right headers. 4. df and ttl are calculated, as these are the required params by udp_tunnel_xmit_skb(). 5. nocheck param has to be false, as "the UDP checksum SHOULD be computed for IPv4 and IPv6", says in rfc6951#section-5.3. v1->v2: - Use sp->udp_port instead in sctp_v4_xmit(), which is more safe. Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
sk_setup_caps() was originally called in Commit 90017acc ("sctp: Add GSO support"), as: "We have to refresh this in case we are xmiting to more than one transport at a time" This actually happens in the loop of sctp_outq_flush_transports(), and it shouldn't be tied to gso, so move it out of gso part and before sctp_packet_pack(). Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Xin Long 提交于
sctp_mtu_payload() is for calculating the frag size before making chunks from a msg. So we should only add udphdr size to overhead when udp socks are listening, as only then sctp can handle the incoming sctp over udp packets and outgoing sctp over udp packets will be possible. Note that we can't do this according to transport->encap_port, as different transports may be set to different values, while the chunks were made before choosing the transport, we could not be able to meet all rfc6951#section-5.6 recommends. v1->v2: - Add udp_port for sctp_sock to avoid a potential race issue, it will be used in xmit path in the next patch. Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-