- 07 7月, 2018 5 次提交
-
-
由 Willem de Bruijn 提交于
skb_shinfo(skb)->tx_flags is derived from sk->sk_tsflags, possibly after modification by __sock_cmsg_send, by calling sock_tx_timestamp. The IPv4 and IPv6 paths do this conversion differently. In IPv4, the individual protocols that support tx timestamps call this function and store the result in ipc.tx_flags. In IPv6, sock_tx_timestamp is called in __ip6_append_data. There is no need to store both tx_flags and ts_flags in the cookie as one is derived from the other. Convert when setting up the cork and remove the redundant field. This is similar to IPv6, only have the conversion happen only once per datagram, in ip(6)_setup_cork. Also change __ip6_append_data to match __ip_append_data. Only update tskey if timestamping is enabled with OPT_ID. The SOCK_.. test is redundant: only valid protocols can have non-zero cork->tx_flags. After this change the IPv4 and IPv6 logic is the same. Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
ipcm_cookie includes sockcm_cookie. Do the same for ipcm6_cookie. This reduces the number of arguments that need to be passed around, applies ipcm6_init to all cookie fields at once and reduces code differentiation between ipv4 and ipv6. Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Initialize the cookie in one location to reduce code duplication and avoid bugs from inconsistent initialization, such as that fixed in commit 9887cba1 ("ip: limit use of gso_size to udp"). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Initialize the cookie in one location to reduce code duplication and avoid bugs from inconsistent initialization, such as that fixed in commit 9887cba1 ("ip: limit use of gso_size to udp"). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Initialize the cookie in one location to reduce code duplication and avoid bugs from inconsistent initialization, such as that fixed in commit 9887cba1 ("ip: limit use of gso_size to udp"). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 7月, 2018 2 次提交
-
-
由 Edward Cree 提交于
Essentially the same as the ipv4 equivalents. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
If we have an L3 master device, l3mdev_ip_rcv() will steal the skb, but we were returning NET_RX_SUCCESS from ip_rcv_finish_core() which meant that ip_list_rcv_finish() would keep it on the list. Instead let's move the l3mdev_ip_rcv() call into the caller, so that our response to a steal can be different in the single packet path (return NET_RX_SUCCESS) and the list path (forget this packet and continue). Fixes: 5fa12739 ("net: ipv4: listify ip_rcv_finish") Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 7月, 2018 12 次提交
-
-
由 Gustavo A. R. Silva 提交于
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Warning level 2 was used: -Wimplicit-fallthrough=2 Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gustavo A. R. Silva 提交于
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gustavo A. R. Silva 提交于
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Warning level 2 was used: -Wimplicit-fallthrough=2 Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vasundhara Volam 提交于
enable_sriov - Enables Single-Root Input/Output Virtualization(SR-IOV) characteristic of the device. Reviewed-by: NMichael Chan <michael.chan@broadcom.com> Signed-off-by: NVasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
Add 2 first generic parameters to devlink configuration parameters set: internal_err_reset - When set enables reset device on internal errors. max_macs - max number of MACs per ETH port. Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
Add devlink_param_notify() function to support devlink param notifications. Add notification call to devlink param set, register and unregister functions. Add devlink_param_value_changed() function to enable the driver notify devlink on value change. Driver should use this function after value was changed on any configuration mode part to driverinit. Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
"driverinit" configuration mode value is held by devlink to enable the driver query the value after reload. Two additional functions added to help the driver get/set the value from/to devlink: devlink_param_driverinit_value_set() and devlink_param_driverinit_value_get(). Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
Add param set command to set value for a parameter. Value can be set to any of the supported configuration modes. Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
Add param get command which gets data per parameter. Option to dump the parameters data per device. Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
Define configuration parameters data structure. Add functions to register and unregister the driver supported configuration parameters table. For each parameter registered, the driver should fill all the parameter's fields. In case the only supported configuration mode is "driverinit" the parameter's get()/set() functions are not required and should be set to NULL, for any other configuration mode, these functions are required and should be set by the driver. Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Li RongQing 提交于
After commit 07d78363 ("net: Convert NAPI gro list into a small hash table.")' there is 8 hash buckets, which allows more flows to be held for merging. but MAX_GRO_SKBS, the total held skb for merging, is 8 skb still, limit the hash table performance. keep MAX_GRO_SKBS as 8 skb, but limit each hash list length to 8 skb, not the total 8 skb Signed-off-by: NLi RongQing <lirongqing@baidu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
Since callees (ip_rcv_core() and ip_rcv_finish_core()) might free or steal the skb, we can't use the list_cut_before() method; we can't even do a list_del(&skb->list) in the drop case, because skb might have already been freed and reused. So instead, take each skb off the source list before processing, and add it to the sublist afterwards if it wasn't freed or stolen. Fixes: 5fa12739 net: ipv4: listify ip_rcv_finish Fixes: 17266ee9 net: ipv4: listified version of ip_rcv Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 7月, 2018 21 次提交
-
-
由 Jesus Sanchez-Palencia 提交于
Use the socket error queue for reporting dropped packets if the socket has enabled that feature through the SO_TXTIME API. Packets are dropped either on enqueue() if they aren't accepted by the qdisc or on dequeue() if the system misses their deadline. Those are reported as different errors so applications can react accordingly. Userspace can retrieve the errors through the socket error queue and the corresponding cmsg interfaces. A struct sock_extended_err* is used for returning the error data, and the packet's timestamp can be retrieved by adding both ee_data and ee_info fields as e.g.: ((__u64) serr->ee_data << 32) + serr->ee_info This feature is disabled by default and must be explicitly enabled by applications. Enabling it can bring some overhead for the Tx cycles of the application. Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesus Sanchez-Palencia 提交于
Add infra so etf qdisc supports HW offload of time-based transmission. For hw offload, the time sorted list is still used, so packets are dequeued always in order of txtime. Example: $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc add dev enp2s0 parent 100:1 etf offload delta 100000 \ clockid CLOCK_REALTIME In this example, the Qdisc will use HW offload for the control of the transmission time through the network adapter. The hrtimer used for packets scheduling inside the qdisc will use the clockid CLOCK_REALTIME as reference and packets leave the Qdisc "delta" (100000) nanoseconds before their transmission time. Because this will be using HW offload and since dynamic clocks are not supported by the hrtimer, the system clock and the PHC clock must be synchronized for this mode to behave as expected. Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vinicius Costa Gomes 提交于
The ETF (Earliest TxTime First) qdisc uses the information added earlier in this series (the socket option SO_TXTIME and the new role of sk_buff->tstamp) to schedule packets transmission based on absolute time. For some workloads, just bandwidth enforcement is not enough, and precise control of the transmission of packets is necessary. Example: $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc add dev enp2s0 parent 100:1 etf delta 100000 \ clockid CLOCK_TAI In this example, the Qdisc will provide SW best-effort for the control of the transmission time to the network adapter, the time stamp in the socket will be in reference to the clockid CLOCK_TAI and packets will leave the qdisc "delta" (100000) nanoseconds before its transmission time. The ETF qdisc will buffer packets sorted by their txtime. It will drop packets on enqueue() if their skbuff clockid does not match the clock reference of the Qdisc. Moreover, on dequeue(), a packet will be dropped if it expires while being enqueued. The qdisc also supports the SO_TXTIME deadline mode. For this mode, it will dequeue a packet as soon as possible and change the skb timestamp to 'now' during etf_dequeue(). Note that both the qdisc's and the SO_TXTIME ABIs allow for a clockid to be configured, but it's been decided that usage of CLOCK_TAI should be enforced until we decide to allow for other clockids to be used. The rationale here is that PTP times are usually in the TAI scale, thus no other clocks should be necessary. For now, the qdisc will return EINVAL if any clocks other than CLOCK_TAI are used. Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NVinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vinicius Costa Gomes 提交于
This adds 'qdisc_watchdog_init_clockid()' that allows a clockid to be passed, this allows other time references to be used when scheduling the Qdisc to run. Signed-off-by: NVinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Cochran 提交于
For raw layer-2 packets, copy the desired future transmit time from the CMSG cookie into the skb. Signed-off-by: NRichard Cochran <rcochran@linutronix.de> Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesus Sanchez-Palencia 提交于
Add a struct sockcm_cookie parameter to ip6_setup_cork() so we can easily re-use the transmit_time field from struct inet_cork for most paths, by copying the timestamp from the CMSG cookie. This is later copied into the skb during __ip6_make_skb(). For the raw fast path, also pass the sockcm_cookie as a parameter so we can just perform the copy at rawv6_send_hdrinc() directly. Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesus Sanchez-Palencia 提交于
Add a transmit_time field to struct inet_cork, then copy the timestamp from the CMSG cookie at ip_setup_cork() so we can safely copy it into the skb later during __ip_make_skb(). For the raw fast path, just perform the copy at raw_send_hdrinc(). Signed-off-by: NRichard Cochran <rcochran@linutronix.de> Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Cochran 提交于
This patch introduces SO_TXTIME. User space enables this option in order to pass a desired future transmit time in a CMSG when calling sendmsg(2). The argument to this socket option is a 8-bytes long struct provided by the uapi header net_tstamp.h defined as: struct sock_txtime { clockid_t clockid; u32 flags; }; Note that new fields were added to struct sock by filling a 2-bytes hole found in the struct. For that reason, neither the struct size or number of cachelines were altered. Signed-off-by: NRichard Cochran <rcochran@linutronix.de> Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesus Sanchez-Palencia 提交于
This is done in preparation for the upcoming time based transmission patchset. Now that skb->tstamp will be used to hold packet's txtime, we must ensure that it is being cleared when traversing namespaces. Also, doing that from skb_scrub_packet() before the early return would break our feature when tunnels are used. Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Yongjun 提交于
'keys_ex' is malloced by tcf_pedit_keys_ex_parse() in tcf_pedit_init() but not all of the error handle path free it, this may cause memory leak. This patch fix it. Fixes: 71d0ed70 ("net/act_pedit: Support using offset relative to the conventional network headers") Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com> Acked-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Qiaobin Fu 提交于
The new action inheritdsfield copies the field DS of IPv4 and IPv6 packets into skb->priority. This enables later classification of packets based on the DS field. v5: *Update the drop counter for TC_ACT_SHOT v4: *Not allow setting flags other than the expected ones. *Allow dumping the pure flags. v3: *Use optional flags, so that it won't break old versions of tc. *Allow users to set both SKBEDIT_F_PRIORITY and SKBEDIT_F_INHERITDSFIELD flags. v2: *Fix the style issue *Move the code from skbmod to skbedit Original idea by Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NQiaobin Fu <qiaobinf@bu.edu> Reviewed-by: NMichel Machado <michel@digirati.com.br> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
Generally the check should be very cheap, as the sk_buff_head is in cache. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
ip_rcv_finish_core(), if it does not drop, sets skb->dst by either early demux or route lookup. The last step, calling dst_input(skb), is left to the caller; in the listified case, we split to form sublists with a common dst, but then ip_sublist_rcv_finish() just calls dst_input(skb) in a loop. The next step in listification would thus be to add a list_input() method to struct dst_entry. Early demux is an indirect call based on iph->protocol; this is another opportunity for listification which is not taken here (it would require slicing up ip_rcv_finish_core() to allow splitting on protocol changes). Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
Also involved adding a way to run a netfilter hook over a list of packets. Rather than attempting to make netfilter know about lists (which would be a major project in itself) we just let it call the regular okfn (in this case ip_rcv_finish()) for any packets it steals, and have it give us back a list of packets it's synchronously accepted (which normally NF_HOOK would automatically call okfn() on, but we want to be able to potentially pass the list to a listified version of okfn().) The netfilter hooks themselves are indirect calls that still happen per- packet (see nf_hook_entry_hookfn()), but again, changing that can be left for future work. There is potential for out-of-order receives if the netfilter hook ends up synchronously stealing packets, as they will be processed before any accepts earlier in the list. However, it was already possible for an asynchronous accept to cause out-of-order receives, so presumably this is considered OK. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
__netif_receive_skb_core() does a depressingly large amount of per-packet work that can't easily be listified, because the another_round looping makes it nontrivial to slice up into smaller functions. Fortunately, most of that work disappears in the fast path: * Hardware devices generally don't have an rx_handler * Unless you're tcpdumping or something, there is usually only one ptype * VLAN processing comes before the protocol ptype lookup, so doesn't force a pt_prev deliver so normally, __netif_receive_skb_core() will run straight through and pass back the one ptype found in ptype_base[hash of skb->protocol]. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
First example of a layer splitting the list (rather than merely taking individual packets off it). Involves new list.h function, list_cut_before(), like list_cut_position() but cuts on the other side of the given entry. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
netif_receive_skb_list_internal() now processes a list and hands it on to the next function. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Edward Cree 提交于
Just calls netif_receive_skb() in a loop. Signed-off-by: NEdward Cree <ecree@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
The transport with illegal flowlabel should not be allowed to send packets. Other transport protocols already denies this. Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
Struct sockaddr_in6 has the member sin6_flowinfo that includes the ipv6 flowlabel, it should also support for setting flowlabel when adding a transport whose ipaddr is from userspace. Note that addrinfo in sctp_sendmsg is using struct in6_addr for the secondary addrs, which doesn't contain sin6_flowinfo, and it needs to copy sin6_flowinfo from the primary addr. Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-