- 29 Nov 2018, 1 commit
-
-
Committed by John Fastabend
This adds a BPF SK_MSG program helper so that we can pop data from a msg. We use this to pop metadata from a previous push data call.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
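A minimal sketch of an SK_MSG program using the new helper; the 4-byte metadata header at offset 0 and the header include are illustrative assumptions, not part of the commit:

    #include <linux/bpf.h>
    #include "bpf_helpers.h"

    SEC("sk_msg")
    int pop_metadata(struct sk_msg_md *msg)
    {
            /* Remove the assumed 4-byte metadata header that an earlier
             * program pushed at offset 0 with bpf_msg_push_data().
             */
            if (bpf_msg_pop_data(msg, 0, 4, 0))
                    return SK_DROP;
            return SK_PASS;
    }

    char _license[] SEC("license") = "GPL";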
-
- 27 Nov 2018, 1 commit
-
-
Committed by David Miller
'offset' is constant and if it is zero, no need to subtract it from BPF_REG_TMP.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
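A hedged sketch of the pattern; the emitter context is not shown in the log, so the surrounding instruction stream is an assumption:

    /* Only emit the ALU op when the constant offset is non-zero. */
    if (offset)
            *insn++ = BPF_ALU64_IMM(BPF_SUB, BPF_REG_TMP, offset);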
-
- 26 Nov 2018, 2 commits
-
-
Committed by Eric Dumazet
I do not see how one can effectively use skb_insert() without holding some kind of lock. Otherwise other cpus could have changed the list right before we have a chance of acquiring list->lock.

The only existing user is in drivers/infiniband/hw/nes/nes_mgt.c, and that one probably meant to use __skb_insert(), since nesqp->pau_list appears to be protected by nesqp->pau_lock. It also looks like nesqp->pau_lock could be removed, since nesqp->pau_list.lock could be used instead.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Faisal Latif <faisal.latif@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma <linux-rdma@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
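A sketch of what the suggested nes_mgt.c cleanup could look like; the function name is hypothetical and the field names follow the commit text:

    static void nes_insert_pau_skb(struct nes_qp *nesqp, struct sk_buff *old,
                                   struct sk_buff *newskb)
    {
            unsigned long flags;

            /* Take the sk_buff_head's own lock in place of nesqp->pau_lock,
             * then insert before 'old' with the unlocked __skb_insert().
             */
            spin_lock_irqsave(&nesqp->pau_list.lock, flags);
            __skb_insert(newskb, old->prev, old, &nesqp->pau_list);
            spin_unlock_irqrestore(&nesqp->pau_list.lock, flags);
    }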
-
Committed by Colin Ian King
A recent change added a null check on p->dev after p->dev was already dereferenced by the ns_capable check on p->dev. It turns out that neither the p->dev nor the p->br null check is necessary, so both can be removed, which cleans up a static analysis warning.

As Nikolay Aleksandrov noted, these checks can be removed because:

"My reasoning of why it shouldn't be possible:
- On port add, new_nbp() sets both p->dev and p->br before creating kobj/sysfs
- On port del (trickier), del_nbp() calls kobject_del() before call_rcu() to destroy the port, which in turn calls sysfs_remove_dir(), which uses kernfs_remove(), which deactivates (shouldn't be able to open new files) and calls kernfs_drain() to drain current open/mmaped files in the respective dir before continuing, thus making it impossible to open a bridge port sysfs file with p->dev and p->br equal to NULL.

So I think it's safe to remove those checks altogether. It'd be nice to get a second look over my reasoning as I might be missing something in sysfs/kernfs call path."

Thanks to Nikolay Aleksandrov's suggestion to remove the check and to David Miller for sanity checking this.

Detected by CoverityScan, CID#751490 ("Dereference before null check")

Fixes: a5f3ea54 ("net: bridge: add support for raw sysfs port options")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 Nov 2018, 20 commits
-
-
Committed by Petr Machata
Due to explicit checks in rocker_world_port_obj_vlan_add(), dsa_slave_switchdev_event() and port_switchdev_event(), VLAN objects added to a device that is not a front-panel port device are ignored. Therefore this check is immaterial.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Petr Machata
Drop switchdev_ops.switchdev_port_obj_add and _del. Drop the uses of this field from all clients, which were migrated to use switchdev notification in the previous patches.

Add a new function switchdev_port_obj_notify() that sends the switchdev notifications SWITCHDEV_PORT_OBJ_ADD and _DEL. Update switchdev_port_obj_del_now() to dispatch to this new function. Drop __switchdev_port_obj_add() and update switchdev_port_obj_add() likewise.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Petr Machata
After the transition from switchdev operations to notifier chain (which will take place in following patches), the onus is on the driver to find its own devices below a possible layer of LAG or other uppers. The logic to do so is fairly repetitive: each driver is looking for its own devices among the lowers of the notified device. For those that it finds, it calls a handler. To indicate that the event was handled, struct switchdev_notifier_port_obj_info.handled is set. The differences lie only in what constitutes an "own" device and what handler to call.

Therefore abstract this logic into two helpers, switchdev_handle_port_obj_add() and switchdev_handle_port_obj_del(). If a driver only supports physical ports under a bridge device, it will simply avoid this layer of indirection.

One area where this helper diverges from the current switchdev behavior is the case of mixed lowers, some of which are switchdev ports and some of which are not. Previously, such a scenario would fail with -EOPNOTSUPP. The helper could do that for lowers for which the passed-in predicate doesn't hold. That would however break the case that switchdev ports from several different drivers are stashed under one master, a scenario that switchdev currently happily supports. Therefore tolerate any and all unknown netdevices, whether they are backed by a switchdev driver or not.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
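A sketch of how a driver might sit behind these helpers; the mydrv_* names are hypothetical, and the callback shapes follow the switchdev API of this series:

    static int mydrv_switchdev_blocking_event(struct notifier_block *nb,
                                              unsigned long event, void *ptr)
    {
            struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
            int err;

            switch (event) {
            case SWITCHDEV_PORT_OBJ_ADD:
                    /* Walks the lowers of 'dev', calls mydrv_port_obj_add()
                     * on ports for which mydrv_port_dev_check() holds, and
                     * sets ->handled for them.
                     */
                    err = switchdev_handle_port_obj_add(dev, ptr,
                                                        mydrv_port_dev_check,
                                                        mydrv_port_obj_add);
                    return notifier_from_errno(err);
            case SWITCHDEV_PORT_OBJ_DEL:
                    err = switchdev_handle_port_obj_del(dev, ptr,
                                                        mydrv_port_dev_check,
                                                        mydrv_port_obj_del);
                    return notifier_from_errno(err);
            }

            return NOTIFY_DONE;
    }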
-
Committed by Petr Machata
Following patches will change the way port object changes are distributed, from a switchdev operation to a switchdev notifier. The switchdev code currently recursively descends through layers of lower devices, eventually calling the op on a front-panel port device. The notifier will instead be sent referencing the bridge port device, which may be a stacking device that is one of a front-panel port's uppers, or a completely unrelated device.

DSA currently doesn't support any uppers other than bridge. SWITCHDEV_OBJ_ID_HOST_MDB and _PORT_MDB objects are always notified on the bridge port device. Thus the only case in which a stacked device could be validly referenced by port object notifications is bridge notifications for VLAN objects added to the bridge itself. But the driver explicitly rejects such notifications in dsa_port_vlan_add(). It is therefore safe to assume that the only interesting case is that the notification is on a front-panel port netdevice. Therefore keep the filtering by dsa_slave_dev_check() in place.

To handle SWITCHDEV_PORT_OBJ_ADD and _DEL, subscribe to the blocking notifier chain. Dispatch to rocker_port_obj_add() and _del(), respectively, to maintain the behavior that the switchdev-operation-based code currently has.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
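Subscribing follows the usual notifier pattern; a minimal sketch reusing the hypothetical handler from the previous entry (error handling trimmed):

    static struct notifier_block mydrv_switchdev_blocking_nb = {
            .notifier_call = mydrv_switchdev_blocking_event,
    };

    /* at driver init/probe time */
    err = register_switchdev_blocking_notifier(&mydrv_switchdev_blocking_nb);
    if (err)
            goto err_switchdev_blocking_register;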
-
Committed by Petr Machata
In general one can't assume that a switchdev notifier is called in a non-atomic context, and correspondingly, the switchdev notifier chain is an atomic one. However, port object addition and deletion messages are delivered from a process context. Even the MDB addition messages, whose delivery is scheduled from atomic context, are queued, and the delivery itself takes place in blocking context. For VLAN messages in particular, keeping the blocking nature is important for error reporting.

Therefore introduce a blocking notifier chain and related service functions to distribute the notifications for which a blocking context can be assumed.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
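The core of such a chain is small; a sketch of the service functions mirroring the existing atomic chain, with signatures assumed from this series:

    static BLOCKING_NOTIFIER_HEAD(switchdev_blocking_notif_chain);

    int register_switchdev_blocking_notifier(struct notifier_block *nb)
    {
            struct blocking_notifier_head *chain = &switchdev_blocking_notif_chain;

            return blocking_notifier_chain_register(chain, nb);
    }

    int call_switchdev_blocking_notifiers(unsigned long val,
                                          struct net_device *dev,
                                          struct switchdev_notifier_info *info)
    {
            /* May sleep; callers guarantee a blocking context. */
            info->dev = dev;
            return blocking_notifier_call_chain(&switchdev_blocking_notif_chain,
                                                val, info);
    }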
-
Committed by Karsten Graul
When an rmb is no longer in use by a connection, unregister its rkey at the remote peer with an LLC DELETE RKEY message. With this change, unused buffers held in the buffer pool are no longer registered at the remote peer. They are registered before the buffer is actually used and unregistered when they are no longer used by a connection.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Karsten Graul
Add the infrastructure to send LLC messages of type DELETE RKEY to unregister a shared memory region at the peer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Karsten Graul
When a send fails, don't start waiting for a response in smc_llc_do_confirm_rkey().

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ursula Braun
For easier reading, move the unlock of mutex smc_create_lgr_pending into smc_listen_work(), i.e. into the function in which the mutex was locked. No functional change.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ursula Braun
After sending one of the initial LLC messages CONFIRM LINK or ADD LINK, there is already a wait for the LLC response. It does not make sense to wait another long time for a CLC DECLINE. Thus this patch introduces a shorter wait time for these cases.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ursula Braun
If a link is terminated that has never reached the active state, there is no need to trigger an LLC DELETE LINK.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ursula Braun
If connection initialization fails for the LLC CONFIRM LINK or the LLC ADD LINK step, fallback to TCP should be enabled. Thus the negative return code -EAGAIN should switch to a positive timeout reason code in these cases, and the internal CLC socket should not have a set sk_err.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ursula Braun
There is no need to store the return value in sk_err, if it is afterwards cleared again with sock_error(). This patch sets the return value directly. Just cleanup, no functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ursula Braun
smc_lgr_free() is just called inside smc_core.c. Make it static. Just cleanup, no functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Ursula Braun
The tcp_listen_worker is already initialized when the socket is created (in smc_sock_alloc()). Get rid of the duplicate initialization in smc_listen(). No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
We very often have few flows/chains to look at, and we might increase GRO_HASH_BUCKETS to 32 or 64 in the future.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Davide Caratti
commit f2cbd485 ("net/sched: act_police: fix race condition on state variables") introduces a new spinlock, but forgets its initialization. Ensure that tcf_police_init() initializes 'tcfp_lock' every time a 'police' action is newly created, to avoid the following lockdep splat:

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
<...>
Call Trace:
 dump_stack+0x85/0xcb
 register_lock_class+0x581/0x590
 __lock_acquire+0xd4/0x1330
 ? tcf_police_init+0x2fa/0x650 [act_police]
 ? lock_acquire+0x9e/0x1a0
 lock_acquire+0x9e/0x1a0
 ? tcf_police_init+0x2fa/0x650 [act_police]
 ? tcf_police_init+0x55a/0x650 [act_police]
 _raw_spin_lock_bh+0x34/0x40
 ? tcf_police_init+0x2fa/0x650 [act_police]
 tcf_police_init+0x2fa/0x650 [act_police]
 tcf_action_init_1+0x384/0x4c0
 tcf_action_init+0xf6/0x160
 tcf_action_add+0x73/0x170
 tc_ctl_action+0x122/0x160
 rtnetlink_rcv_msg+0x2a4/0x490
 ? netlink_deliver_tap+0x99/0x400
 ? validate_linkmsg+0x370/0x370
 netlink_rcv_skb+0x4d/0x130
 netlink_unicast+0x196/0x230
 netlink_sendmsg+0x2e5/0x3e0
 sock_sendmsg+0x36/0x40
 ___sys_sendmsg+0x280/0x2f0
 ? _raw_spin_unlock+0x24/0x30
 ? handle_pte_fault+0xafe/0xf30
 ? find_held_lock+0x2d/0x90
 ? syscall_trace_enter+0x1df/0x360
 ? __sys_sendmsg+0x5e/0xa0
 __sys_sendmsg+0x5e/0xa0
 do_syscall_64+0x60/0x210
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1841c7cf10
Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
RSP: 002b:00007ffcf9df4d68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f1841c7cf10
RDX: 0000000000000000 RSI: 00007ffcf9df4dc0 RDI: 0000000000000003
RBP: 000000005bf56105 R08: 0000000000000002 R09: 00007ffcf9df8edc
R10: 00007ffcf9df47e0 R11: 0000000000000246 R12: 0000000000671be0
R13: 00007ffcf9df4e84 R14: 0000000000000008 R15: 0000000000000000

Fixes: f2cbd485 ("net/sched: act_police: fix race condition on state variables")
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
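The essence of the fix, sketched (surrounding init logic elided):

    police = to_police(*a);
    spin_lock_init(&police->tcfp_lock);  /* was missing, tripping lockdep */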
-
Committed by Paolo Abeni
Eric noted that with UDP GRO and NAPI timeout, we could keep a single UDP packet inside the GRO hash forever, if the related NAPI instance calls napi_gro_complete() at a higher frequency than the NAPI timeout. Willem noted that even TCP packets could be trapped there, till the next retransmission.

This patch tries to address the issue, flushing the old packets - those with a NAPI_GRO_CB age before the current jiffy - before scheduling the NAPI timeout. The rationale is that such a timeout should be well below a jiffy and we are not flushing packets eligible for sane GRO.

v1 -> v2:
- clarified the commit message and comment

RFC -> v1:
- added 'Fixes' tags, cleaned up the wording

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 3b47d303 ("net: gro: add a per device gro flush timer")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
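A sketch of the resulting napi_complete_done() logic; the shape follows the commit text, and details may differ from the final code:

    if (n->gro_bitmask) {
            unsigned long timeout = 0;

            if (work_done)
                    timeout = n->dev->gro_flush_timeout;

            /* Before (re)arming the timeout, flush packets whose
             * NAPI_GRO_CB age predates the current jiffy, so nothing
             * can sit in the GRO hash forever.
             */
            napi_gro_flush(n, !!timeout);
            if (timeout)
                    hrtimer_start(&n->timer, ns_to_ktime(timeout),
                                  HRTIMER_MODE_REL_PINNED);
    }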
-
Committed by Hangbin Liu
When we add a new IPv6 address, we should also join the corresponding solicited-node multicast address, unless the interface has the IFF_NOARP flag, as function addrconf_join_solict() does. But if we remove the IFF_NOARP flag later, we do not perform DAD and do not join the mcast address, so we will drop corresponding neighbour discovery messages that come from other nodes.

A typical example is creating an ipvlan with mode l3, setting up an IPv6 address and then changing the mode to l2. We will then not be able to ping this address, as the interface hasn't joined the related solicited-node mcast address.

Fix it by redoing DAD when the interface's IFF_NOARP flag changes, so that we join the corresponding mcast group and check whether there is a duplicate address on the network.

Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Willem de Bruijn
tpacket_snd sends packets with user pages linked into skb frags. It notifies that pages can be reused when the skb is released by setting skb->destructor to tpacket_destruct_skb.

This can cause data corruption if the skb is orphaned (e.g., on transmit through veth) or cloned (e.g., on mirror to another psock). Create a kernel-private copy of data in these cases, same as tun/tap zerocopy transmission. Reuse that infrastructure: mark the skb as SKBTX_ZEROCOPY_FRAG, which will trigger a copy in skb_orphan_frags(_rx).

Unlike other zerocopy packets, do not set shinfo destructor_arg to struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify when the original skb is released and a timestamp is recorded. Do not change this timestamp behavior. The ubuf_info->callback is not needed anyway, as no zerocopy notification is expected. Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The resulting value is not a valid ubuf_info pointer, nor a valid tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this.

The fix relies on features introduced in commit 52267790 ("sock: add MSG_ZEROCOPY"), so it can be backported as-is only to 4.14.

Tested with `./in_netns.sh ./txring_overwrite` from http://github.com/wdebruij/kerneltools/tests

Fixes: 69e3c75f ("net: TX_RING and packet mmap")
Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
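A sketch of the new helpers as described; the low-bit tagging detail follows the commit text:

    static inline void skb_zcopy_set_nouarg(struct sk_buff *skb, void *val)
    {
            /* Tag the pointer so it is never mistaken for a ubuf_info,
             * while SKBTX_ZEROCOPY_FRAG still forces a copy on orphan.
             */
            skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t)val | 0x1UL);
            skb_shinfo(skb)->tx_flags |= SKBTX_ZEROCOPY_FRAG;
    }

    static inline bool skb_zcopy_is_nouarg(struct sk_buff *skb)
    {
            return (uintptr_t)skb_shinfo(skb)->destructor_arg & 0x1UL;
    }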
-
- 23 Nov 2018, 1 commit
-
-
Committed by Vlad Dumitrescu
This could be used to rate limit egress traffic in concert with a qdisc which supports Earliest Departure Time, such as FQ. Write access from cg skb progs is allowed only with CAP_SYS_ADMIN, since the value will be used by downstream qdiscs. It might make sense to relax this.

Changes v1 -> v2:
- allow access from cg skb, write only with CAP_SYS_ADMIN

Signed-off-by: Vlad Dumitrescu <vladum@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
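A hypothetical cgroup skb egress program using the newly writable field to set an Earliest Departure Time honored by FQ; the 1 ms offset is an arbitrary example, and loading a writer requires CAP_SYS_ADMIN per the commit:

    #include <linux/bpf.h>
    #include "bpf_helpers.h"

    SEC("cgroup_skb/egress")
    int set_edt(struct __sk_buff *skb)
    {
            /* Pace this cgroup's traffic: depart no earlier than now + 1 ms. */
            skb->tstamp = bpf_ktime_get_ns() + 1000000ULL;
            return 1; /* allow */
    }

    char _license[] SEC("license") = "GPL";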
-
- 22 Nov 2018, 8 commits
-
-
Committed by Ido Schimmel
Allow querying bridge port flags so that drivers capable of performing VxLAN learning will update the bridge driver only if learning is enabled on its bridge port corresponding to the VxLAN device.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
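A hedged usage sketch from a driver's point of view; the helper name follows the commit's intent and the mydrv_* function is hypothetical:

    /* Only bother the bridge with learned entries if learning is
     * enabled on the bridge port of the VxLAN device.
     */
    if (br_port_flag_is_set(vxlan_dev, BR_LEARNING))
            mydrv_fdb_offload_notify(vxlan_dev, fdb_info);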
-
Committed by Ursula Braun
In smc_wr_tx_put_slot() the field pend->idx is used after being cleared, which means idx 0 is always cleared in wr_tx_mask. This results in a broken administration of available WR send payload buffers.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
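The fix pattern, sketched: capture the index before the slot is cleared (field names follow the commit text, surrounding code elided):

    u32 idx = pend->idx;    /* save before memset() wipes it */

    memset(pend, 0, sizeof(*pend));
    test_and_clear_bit(idx, link->wr_tx_mask);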
-
Committed by Ursula Braun
Running uperf tests with SMCD on LPARs results in corrupted cursors. SMCD cursors should be treated atomically to fix cursor corruption.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Hans Wippel
When an SMC-D link group is freed, a shutdown signal should be sent to the peer to indicate that the link group is invalid. This patch adds the shutdown signal to the SMC code.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Karsten Graul
When searching for an existing link group the queue pair number is also to be taken into consideration. When the SMC server sends a new number in a CLC packet (keeping all other values equal) then a new link group is to be created on the SMC client side.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Hans Wippel
In case of a non-blocking SMC socket, the initial CLC handshake is performed over a blocking TCP connection in a worker. If the SMC socket is released, smc_release has to wait for the blocking CLC socket operations (e.g., kernel_connect) inside the worker. This patch aborts a CLC connection when the respective non-blocking SMC socket is released, to avoid waiting on socket operations or timeouts.

Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
Jean-Louis reported a TCP regression and bisected to recent SACK compression. After a loss episode (receiver not able to keep up and dropping packets because its backlog is full), linux TCP stack is sending a single SACK (DUPACK). Sender waits a full RTO timer before recovering losses.

While RFC 6675 says in section 5, "Algorithm Details":

   (2) If DupAcks < DupThresh but IsLost (HighACK + 1) returns true --
       indicating at least three segments have arrived above the current
       cumulative acknowledgment point, which is taken to indicate loss
       -- go to step (4).
   ...
   (4) Invoke fast retransmit and enter loss recovery as follows:

there are old TCP stacks not implementing this strategy, and still counting the dupacks before starting fast retransmit. While these stacks probably perform poorly when receivers implement LRO/GRO, we should be a little more gentle to them.

This patch makes sure we do not enable SACK compression unless 3 dupacks have been sent since last rcv_nxt update. Ideally we should even rearm the timer to send one or two more DUPACK if no more packets are coming, but that will be work aiming for linux-4.21.

Many thanks to Jean-Louis for bisecting the issue, providing packet captures and testing this patch.

Fixes: 5d9f4262 ("tcp: add SACK compression")
Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Tested-by: Jean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Petr Machata
When a packet is trapped and the corresponding SKB marked as already-forwarded, it retains this marking even after it is forwarded across veth links into another bridge. There, since it ingresses the bridge over veth, which doesn't have offload_fwd_mark, it triggers a warning in nbp_switchdev_frame_mark(). Then nbp_switchdev_allowed_egress() decides not to allow egress from this bridge through another veth, because the SKB is already marked, and the mark (of 0) of course matches. Thus the packet is incorrectly blocked.

Solve this by resetting the offload_fwd_mark in skb_scrub_packet(). That function is called from tunnels and also from veth, and thus catches the cases where traffic is forwarded between bridges and transformed in a way that invalidates the marking.

Fixes: 6bc506b4 ("bridge: switchdev: Add forward mark support for stacked devices")
Fixes: abf4bb6b ("skbuff: Add the offload_mr_fwd_mark field")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Suggested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
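The corresponding hunk in skb_scrub_packet() would be small; a sketch:

    /* in skb_scrub_packet(): the skb was transformed enough (veth,
     * tunnel) that any prior forwarding mark is meaningless.
     */
    #ifdef CONFIG_NET_SWITCHDEV
            skb->offload_fwd_mark = 0;
            skb->offload_mr_fwd_mark = 0;
    #endif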
-
- 21 Nov 2018, 5 commits
-
-
Committed by Davide Caratti
After 'police' configuration parameters were converted to use RCU instead of a spinlock, the state variables used to compute the traffic rate (namely 'tcfp_toks', 'tcfp_ptoks' and 'tcfp_t_c') are erroneously read/updated in the traffic path without any protection. Use a dedicated spinlock to avoid race conditions on these variables, and ensure proper cache-line alignment. In this way, 'police' is still faster than what we observed when 'tcf_lock' was used in the traffic path, i.e. reverting commit 2d550dba ("net/sched: act_police: don't use spinlock in the data path"). Moreover, we preserve the throughput improvement that was obtained after 'police' started using per-cpu counters, when 'avrate' is used instead of 'rate'.

Changes since v1 (thanks to Eric Dumazet):
- call ktime_get_ns() before acquiring the lock in the traffic path
- use a dedicated spinlock instead of tcf_lock
- improve cache-line usage

Fixes: 2d550dba ("net/sched: act_police: don't use spinlock in the data path")
Reported-and-suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
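A sketch of the data-path pattern described above; token-bucket math is elided and field names follow the commit text:

    now = ktime_get_ns();               /* read the clock outside the lock */
    spin_lock_bh(&police->tcfp_lock);
    toks = min_t(s64, now - police->tcfp_t_c, p->tcfp_burst);
    /* ... consume tcfp_toks/tcfp_ptoks, decide conform/exceed ... */
    police->tcfp_t_c = now;
    spin_unlock_bh(&police->tcfp_lock);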
-
Committed by Stephen Mallon
During TCP coalescing, ensure that the skb hardware timestamp refers to the highest-sequence-number data. Previously only the software timestamp was updated during coalescing.

Signed-off-by: Stephen Mallon <stephen.mallon@sydney.edu.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
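A sketch of the coalescing fix: carry the hardware receive timestamp over alongside the software one (shape follows the commit text):

    /* in tcp_try_coalesce() */
    if (TCP_SKB_CB(from)->has_rxtstamp) {
            TCP_SKB_CB(to)->has_rxtstamp = true;
            to->tstamp = from->tstamp;
            skb_hwtstamps(to)->hwtstamp = skb_hwtstamps(from)->hwtstamp;
    }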
-
Committed by Eric Dumazet
Under stress, the softirq rx handler often hits a socket owned by the user and has to queue the packet into the socket backlog. When this happens, the skb dst refcount is taken before we escape the rcu protected region. This is done from __sk_add_backlog() calling skb_dst_force(). The consumer will then have to perform the opposite costly operation.

AFAIK nothing in the TCP stack requests the dst after the skb was stored in the backlog. If that were the case, we would have had failures already, since skb_dst_force() can end up clearing the skb dst anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
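The change itself would be a one-liner in the backlog path; a hedged sketch, with placement an assumption based on the commit text:

    /* in tcp_add_backlog(): TCP never looks at the dst once the skb
     * sits in the backlog, so don't let __sk_add_backlog() force it.
     */
    skb_dst_drop(skb);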
-
Committed by David S. Miller
This has no value whatsoever.

Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
There are two cases where we can avoid calling ktime_get_ns():

1) The queue is empty.
2) The internal queue is not empty.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 Nov 2018, 2 commits
-
-
Committed by Jakub Kicinski
In case of egress offloads the class/flowid assigned by the filter may be very important for offloaded Qdisc selection. Provide this info to drivers.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jakub Kicinski
Allow drivers which offload GRED to report back statistics. Since a lot of the GRED stats are fairly ad hoc in nature, pass drivers the standard struct gnet_stats_basic/gnet_stats_queue pairs and untangle the values in the core.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-