- 08 11月, 2013 3 次提交
-
-
由 John Fastabend 提交于
Add a operations structure that allows a network interface to export the fact that it supports package forwarding in hardware between physical interfaces and other mac layer devices assigned to it (such as macvlans). This operaions structure can be used by virtual mac devices to bypass software switching so that forwarding can be done in hardware more efficiently. Signed-off-by: NJohn Fastabend <john.r.fastabend@intel.com> Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dan Carpenter 提交于
There is a bug in cpsw_probe() where we do: ndev->irq = platform_get_irq(pdev, 0); if (ndev->irq < 0) { The problem is that "ndev->irq" is unsigned so the error handling doesn't work. I have changed it to a regular int. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eyal Perry 提交于
Provide a method for read-only access to the vlan device egress mapping. Do this by refactoring vlan_dev_get_egress_qos_mask() such that now it receives as an argument the skb priority instead of pointer to the skb. Such an access is needed for the IBoE stack where the control plane goes through the network stack. This is an add-on step on top of commit d4a96865 "net/route: export symbol ip_tos2prio" which allowed the RDMA-CM to use ip_tos2prio. Signed-off-by: NEyal Perry <eyalpe@mellanox.com> Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 11月, 2013 1 次提交
-
-
由 Joe Perches 提交于
Just an unnecessary semicolon that should be removed... Whitespace neatening of macro too. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 11月, 2013 2 次提交
-
-
由 Hannes Frederic Sowa 提交于
Sockets marked with IP_PMTUDISC_INTERFACE won't do path mtu discovery, their sockets won't accept and install new path mtu information and they will always use the interface mtu for outgoing packets. It is guaranteed that the packet is not fragmented locally. But we won't set the DF-Flag on the outgoing frames. Florian Weimer had the idea to use this flag to ensure DNS servers are never generating outgoing fragments. They may well be fragmented on the path, but the server never stores or usees path mtu values, which could well be forged in an attack. (The root of the problem with path MTU discovery is that there is no reliable way to authenticate ICMP Fragmentation Needed But DF Set messages because they are sent from intermediate routers with their source addresses, and the IMCP payload will not always contain sufficient information to identify a flow.) Recent research in the DNS community showed that it is possible to implement an attack where DNS cache poisoning is feasible by spoofing fragments. This work was done by Amir Herzberg and Haya Shulman: <https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf> This issue was previously discussed among the DNS community, e.g. <http://www.ietf.org/mail-archive/web/dnsext/current/msg01204.html>, without leading to fixes. This patch depends on the patch "ipv4: fix DO and PROBE pmtu mode regarding local fragmentation with UFO/CORK" for the enforcement of the non-fragmentable checks. If other users than ip_append_page/data should use this semantic too, we have to add a new flag to IPCB(skb)->flags to suppress local fragmentation and check for this in ip_finish_output. Many thanks to Florian Weimer for the idea and feedback while implementing this patch. Cc: David S. Miller <davem@davemloft.net> Suggested-by: NFlorian Weimer <fweimer@redhat.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Enrico Mioso 提交于
Some drivers implementing NCM-like protocols, may re-use those functions, as is the case in the huawei_cdc_ncm driver. Export them via EXPORT_SYMBOL_GPL, in accordance with how other functions have been exported. Signed-off-by: NEnrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 11月, 2013 7 次提交
-
-
由 Jason Wang 提交于
Sometimes we need to coalesce the rx frags to avoid frag list. One example is virtio-net driver which tries to use small frags for both MTU sized packet and GSO packet. So this patch introduce skb_coalesce_rx_frag() to do this. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Michael Dalton <mwdalton@google.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
As described in commit 5a581b36 (jiffies: Avoid undefined behavior from signed overflow), according to the C standard 3.4.3p3, overflow of a signed integer results in undefined behavior. To fix this, do as the above commit, and do an unsigned subtraction, and interpreting the result as a signed two's-complement number. This is based on the theory from RFC 1982 and is nicely described in wikipedia here: https://en.wikipedia.org/wiki/Serial_number_arithmetic#General_Solution A side-note, I have seen practical issues with the previous logic when dealing with 16-bit, on a 64-bit machine (gcc version 4.4.5). This were 32-bit, which I have not observed issues with. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NJesper Dangaard Brouer <netoptimizer@brouer.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
Slow start now increases cwnd by 1 if an ACK acknowledges some packets, regardless the number of packets. Consequently slow start performance is highly dependent on the degree of the stretch ACKs caused by receiver or network ACK compression mechanisms (e.g., delayed-ACK, GRO, etc). But slow start algorithm is to send twice the amount of packets of packets left so it should process a stretch ACK of degree N as if N ACKs of degree 1, then exits when cwnd exceeds ssthresh. A follow up patch will use the remainder of the N (if greater than 1) to adjust cwnd in the congestion avoidance phase. In addition this patch retires the experimental limited slow start (LSS) feature. LSS has multiple drawbacks but questionable benefit. The fractional cwnd increase in LSS requires a loop in slow start even though it's rarely used. Configuring such an increase step via a global sysctl on different BDPS seems hard. Finally and most importantly the slow start overshoot concern is now better covered by the Hybrid slow start (hystart) enabled by default. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jack Morgenstein 提交于
This is step #1 for implementing SRIOV resource quotas for VFs. Quotas are implemented per resource type for VFs and the PF, to prevent any entity from simply grabbing all the resources for itself and leaving the other entities unable to obtain such resources. Resources which are allocated using quotas: QPs, CQs, SRQs, MPTs, MTTs, MAC, VLAN, and Counters. The quota system works as follows: Each entity (VF or PF) is given a max number of a given resource (its quota), and a guaranteed minimum number for each resource (starvation prevention). For QPs, CQs, SRQs, MPTs and MTTs: 50% of the available quantity for the resource is divided equally among the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool. The other 50% is the "free pool", allocated on a first-come-first-serve basis. For each VF/PF, resources are first allocated from its "guaranteed-minimum" pool. When that pool is exhausted, the driver attempts to allocate from the resource "free-pool". The quota (i.e., max) for the VFs and the PF is: The free-pool amount (50% of the real max) + the guaranteed minimum For MACs: Guarantee 2 MACs per VF/PF per port. As a result, since we have only 128 MACs per port, reduce the allowable number of VFs from 64 to 63. Any remaining MACs are put into a free pool. For VLANs: For the PF, the per-port quota is 128 and guarantee is 64 (to allow the PF to register at least a VLAN per VF in VST mode). For the VFs, the per-port quota is 64 and the guarantee is 0. We assume that VGT VFs are trusted not to abuse the VLAN resource. For Counters: For all functions (PF and VFs), the quota is 128 and the guarantee is 0. In this patch, we define the needed structures, which are added to the resource-tracker struct. In addition, we do initialization for the resource quota, and adjust the query_device response to use quotas rather than resource maxima. As part of the implementation, we introduce a new field in mlx4_dev: quotas. This field holds the resource quotas used to report maxima to the upper layers (ib_core, via query_device). The HCA maxima of these values are passed to the VFs (via QUERY_HCA) so that they may continue to use these in handling QPs, CQs, SRQs and MPTs. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jack Morgenstein 提交于
Use of vlan_index created problems unregistering vlans on guests. In addition, tools delete vlan by tag, not by index, lets follow that. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jack Morgenstein 提交于
The functions mlx4_register_vlan, mlx4_unregister_vlan, mlx4_register_mac, mlx4_unregister_mac all made illegal use of the out_param in multifunc mode to pass the port number. The firmware spec specifies that the port number should be passed in bits 8..15 of the input-modifier field for ALLOC_RES and FREE_RES (sections 20.15.1 and 20.15.2). For MAC register/unregister, this patch contains workarounds so that guests running previous kernels continue to work on a new Hypervisor, and guests running the new kernel will continue to work on old hypervisors. Vlan registeration capability is still not operational in multifunction mode, since the vlan wrapper functions are not implemented in this patch. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This patch fixes a build warning in skb_checksum() by wrapping the csum_partial() usage in skb_checksum(). The problem is that on a few architectures, csum_partial is used with prefix asmlinkage whereas on most architectures it's not. So fix this up generically as we did with csum_block_add_ext() to match the signature. Introduced by 2817a336 ("net: skb_checksum: allow custom update/combine for walking skb"). Reported-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2013 7 次提交
-
-
由 Arvid Brodin 提交于
High-availability Seamless Redundancy ("HSR") provides instant failover redundancy for Ethernet networks. It requires a special network topology where all nodes are connected in a ring (each node having two physical network interfaces). It is suited for applications that demand high availability and very short reaction time. HSR acts on the Ethernet layer, using a registered Ethernet protocol type to send special HSR frames in both directions over the ring. The driver creates virtual network interfaces that can be used just like any ordinary Linux network interface, for IP/TCP/UDP traffic etc. All nodes in the network ring must be HSR capable. This code is a "best effort" to comply with the HSR standard as described in IEC 62439-3:2010 (HSRv0). Signed-off-by: NArvid Brodin <arvid.brodin@xdin.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Joby Poriyath provided a xen-netback patch to reduce the size of xenvif structure as some netdev allocation could fail under memory pressure/fragmentation. This patch is handling the problem at the core level, allowing any netdev structures to use vmalloc() if kmalloc() failed. As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT to kzalloc() flags to do this fallback only when really needed. Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NJoby Poriyath <joby.poriyath@citrix.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This fixes an outstanding bug found through IPVS, where SCTP packets with skb->data_len > 0 (non-linearized) and empty frag_list, but data accumulated in frags[] member, are forwarded with incorrect checksum letting SCTP initial handshake fail on some systems. Linearizing each SCTP skb in IPVS to prevent that would not be a good solution as this leads to an additional and unnecessary performance penalty on the load-balancer itself for no good reason (as we actually only want to update the checksum, and can do that in a different/better way presented here). The actual problem is elsewhere, namely, that SCTP's checksumming in sctp_compute_cksum() does not take frags[] into account like skb_checksum() does. So while we are fixing this up, we better reuse the existing code that we have anyway in __skb_checksum() and use it for walking through the data doing checksumming. This will not only fix this issue, but also consolidates some SCTP code with core sk_buff code, bringing it closer together and removing respectively avoiding reimplementation of skb_checksum() for no good reason. As crc32c() can use hardware implementation within the crypto layer, we leave that intact (it wraps around / falls back to e.g. slice-by-8 algorithm in __crc32c_le() otherwise); plus use the __crc32c_le_combine() combinator for crc32c blocks. Also, we remove all other SCTP checksumming code, so that we only have to use sctp_compute_cksum() from now on; for doing that, we need to transform SCTP checkumming in output path slightly, and can leave the rest intact. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Currently, skb_checksum walks over 1) linearized, 2) frags[], and 3) frag_list data and calculats the one's complement, a 32 bit result suitable for feeding into itself or csum_tcpudp_magic(), but unsuitable for SCTP as we're calculating CRC32c there. Hence, in order to not re-implement the very same function in SCTP (and maybe other protocols) over and over again, use an update() + combine() callback internally to allow for walking over the skb with different algorithms. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This patch adds a combinator to merge two or more crc32{,c}s into a new one. This is useful for checksum computations of fragmented skbs that use crc32/crc32c as checksums. The arithmetics for combining both in the GF(2) was taken and slightly modified from zlib. Only passing two crcs is insufficient as two crcs and the length of the second piece is needed for merging. The code is made generic, so that only polynomials need to be passed for crc32_le resp. crc32c_le. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Holger Eitzenberger 提交于
Encapsulate counters for both directions into nf_conn_acct. During that process also consistently name pointers to the extend 'acct', not 'counters'. This patch is a cleanup. Signed-off-by: NHolger Eitzenberger <holger@eitzenberger.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Mathias Krause 提交于
Negative message lengths make no sense -- so don't do negative queue lenghts or identifier counts. Prevent them from getting negative. Also change the underlying data types to be unsigned to avoid hairy surprises with sign extensions in cases where those variables get evaluated in unsigned expressions with bigger data types, e.g size_t. In case a user still wants to have "unlimited" sizes she could just use INT_MAX instead. Signed-off-by: NMathias Krause <minipli@googlemail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 11月, 2013 10 次提交
-
-
由 Bjørn Mork 提交于
Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
header_desc was completely unused and union_desc was never used outside cdc_ncm_bind_common. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
Moving the call to cdc_ncm_setup() after the endpoint setup removes the last remaining reference to ncm_parm outside cdc_ncm_setup. Collecting all the ncm_parm based calculations in cdc_ncm_setup improves readability. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
These fields are only used to prevent printing the same speeds multiple times if we receive multiple identical speed notifications. The value of these printk's is questionable, and even more so when we filter out some of the notifications sent us by the firmware. If we are going to print any of these, then we should print them all. Removing little used fields is a bonus. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
We already use the usbnet udev field everywhere this could have been used. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
Too many pointers back and forth are likely to confuse developers, creating subtle bugs whenever we forget to syncronize them all. As a usbnet driver, we should stick with the standard struct usbnet fields as much as possible. The netdevice is one such field. Cc: Greg Suarez <gsuarez@smithmicro.com> Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
No need to duplicate stuff already in the common usbnet struct. We still need to keep our special find_endpoints function because we need explicit control over the selected altsetting. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
This is always a duplicate of the "control" field. It causes confusion wrt intf_data updates and cleanups. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
This makes it a lot easier to test modified versions Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jarno Rajahalme 提交于
tcp_flags=flags/mask Bitwise match on TCP flags. The flags and mask are 16-bit num‐ bers written in decimal or in hexadecimal prefixed by 0x. Each 1-bit in mask requires that the corresponding bit in port must match. Each 0-bit in mask causes the corresponding bit to be ignored. TCP protocol currently defines 9 flag bits, and additional 3 bits are reserved (must be transmitted as zero), see RFCs 793, 3168, and 3540. The flag bits are, numbering from the least significant bit: 0: FIN No more data from sender. 1: SYN Synchronize sequence numbers. 2: RST Reset the connection. 3: PSH Push function. 4: ACK Acknowledgement field significant. 5: URG Urgent pointer field significant. 6: ECE ECN Echo. 7: CWR Congestion Windows Reduced. 8: NS Nonce Sum. 9-11: Reserved. 12-15: Not matchable, must be zero. Signed-off-by: NJarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: NJesse Gross <jesse@nicira.com>
-
- 31 10月, 2013 1 次提交
-
-
由 Greg Thelen 提交于
this_cpu_sub() is implemented as negation and addition. This patch casts the adjustment to the counter type before negation to sign extend the adjustment. This helps in cases where the counter type is wider than an unsigned adjustment. An alternative to this patch is to declare such operations unsupported, but it seemed useful to avoid surprises. This patch specifically helps the following example: unsigned int delta = 1 preempt_disable() this_cpu_write(long_counter, 0) this_cpu_sub(long_counter, delta) preempt_enable() Before this change long_counter on a 64 bit machine ends with value 0xffffffff, rather than 0xffffffffffffffff. This is because this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta), which is basically: long_counter = 0 + 0xffffffff Also apply the same cast to: __this_cpu_sub() __this_cpu_sub_return() this_cpu_sub_return() All percpu_test.ko passes, especially the following cases which previously failed: l -= ui_one; __this_cpu_sub(long_counter, ui_one); CHECK(l, long_counter, -1); l -= ui_one; this_cpu_sub(long_counter, ui_one); CHECK(l, long_counter, -1); CHECK(l, long_counter, 0xffffffffffffffff); ul -= ui_one; __this_cpu_sub(ulong_counter, ui_one); CHECK(ul, ulong_counter, -1); CHECK(ul, ulong_counter, 0xffffffffffffffff); ul = this_cpu_sub_return(ulong_counter, ui_one); CHECK(ul, ulong_counter, 2); ul = __this_cpu_sub_return(ulong_counter, ui_one); CHECK(ul, ulong_counter, 1); Signed-off-by: NGreg Thelen <gthelen@google.com> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 10月, 2013 1 次提交
-
-
由 Daniel Borkmann 提交于
This work contains a lightweight BPF-based traffic classifier that can serve as a flexible alternative to ematch-based tree classification, i.e. now that BPF filter engine can also be JITed in the kernel. Naturally, tc actions and policies are supported as well with cls_bpf. Multiple BPF programs/filter can be attached for a class, or they can just as well be written within a single BPF program, that's really up to the user how he wishes to run/optimize the code, e.g. also for inversion of verdicts etc. The notion of a BPF program's return/exit codes is being kept as follows: 0: No match -1: Select classid given in "tc filter ..." command else: flowid, overwrite the default one As a minimal usage example with iproute2, we use a 3 band prio root qdisc on a router with sfq each as leave, and assign ssh and icmp bpf-based filters to band 1, http traffic to band 2 and the rest to band 3. For the first two bands we load the bytecode from a file, in the 2nd we load it inline as an example: echo 1 > /proc/sys/net/core/bpf_jit_enable tc qdisc del dev em1 root tc qdisc add dev em1 root handle 1: prio bands 3 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 tc qdisc add dev em1 parent 1:1 sfq perturb 16 tc qdisc add dev em1 parent 1:2 sfq perturb 16 tc qdisc add dev em1 parent 1:3 sfq perturb 16 tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/ssh.bpf flowid 1:1 tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/icmp.bpf flowid 1:1 tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/http.bpf flowid 1:2 tc filter add dev em1 parent 1: bpf run bytecode "`bpfc -f tc -i misc.ops`" flowid 1:3 BPF programs can be easily created and passed to tc, either as inline 'bytecode' or 'bytecode-file'. There are a couple of front-ends that can compile opcodes, for example: 1) People familiar with tcpdump-like filters: tcpdump -iem1 -ddd port 22 | tr '\n' ',' > /etc/tc/ssh.bpf 2) People that want to low-level program their filters or use BPF extensions that lack support by libpcap's compiler: bpfc -f tc -i ssh.ops > /etc/tc/ssh.bpf ssh.ops example code: ldh [12] jne #0x800, drop ldb [23] jneq #6, drop ldh [20] jset #0x1fff, drop ldxb 4 * ([14] & 0xf) ldh [%x + 14] jeq #0x16, pass ldh [%x + 16] jne #0x16, drop pass: ret #-1 drop: ret #0 It was chosen to load bytecode into tc, since the reverse operation, tc filter list dev em1, is then able to show the exact commands again. Possible follow-up work could also include a small expression compiler for iproute2. Tested with the help of bmon. This idea came up during the Netfilter Workshop 2013 in Copenhagen. Also thanks to feedback from Eric Dumazet! Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 10月, 2013 6 次提交
-
-
由 Peter Zijlstra 提交于
The PPC64 people noticed a missing memory barrier and crufty old comments in the perf ring buffer code. So update all the comments and add the missing barrier. When the architecture implements local_t using atomic_long_t there will be double barriers issued; but short of introducing more conditional barrier primitives this is the best we can do. Reported-by: NVictor Kaplansky <victork@il.ibm.com> Tested-by: NVictor Kaplansky <victork@il.ibm.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: michael@ellerman.id.au Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: anton@samba.org Cc: benh@kernel.crashing.org Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lanSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jacob Keller 提交于
napi_disable uses an msleep() call to wait for outstanding napi work to be finished after setting the disable bit. It does not always sleep incase there was no outstanding work. This resulted in a rare bug in ixgbe_down operation where a napi_disable call took place inside of a local_bh_disable()d context. In order to enable easier detection of future sleep while atomic BUGs, this patch adds a might_sleep() call, so that every use of napi_disable during atomic context will be visible. Signed-off-by: NJacob Keller <jacob.e.keller@intel.com> Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: Alexander Duyck <alexander.duyck@intel.com> Cc: Hyong-Youb Kim <hykim@myri.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Dmitry Kravkov <dmitry@broadcom.com> Tested-by: NPhil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Joseph Gasparakis 提交于
This patch removes the burden from the NIC drivers to check if the vxlan driver is enabled in the kernel and also makes available the vxlan headrooms to them. Signed-off-by: NJoseph Gasparakis <joseph.gasparakis@intel.com> Tested-by: NKavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Mathias Krause 提交于
struct esp_data consists of a single pointer, vanishing the need for it to be a structure. Fold the pointer into 'data' direcly, removing one level of pointer indirection. Signed-off-by: NMathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 Mathias Krause 提交于
The padlen member of struct esp_data is always zero. Get rid of it. Signed-off-by: NMathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 David S. Miller 提交于
The code for privacy extentions is very mature, and making it configurable only gives marginal memory/code savings in exchange for obfuscation and hard to read code via CPP ifdef'ery. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 10月, 2013 2 次提交
-
-
由 Hannes Frederic Sowa 提交于
On receiving a packet too big icmp error we update the expire value by calling rt6_update_expires. This function uses dst_set_expires which is implemented that it can only reduce the expiration value of the dst entry. If we insert new routing non-expiry information into the ipv6 fib where we already have a matching rt6_info we only clear the RTF_EXPIRES flag in rt6i_flags and leave the dst.expires value as is. When new mtu information arrives for that cached dst_entry we again call dst_set_expires. This time it won't update the dst.expire value because we left the dst.expire value intact from the last update. So dst_set_expires won't touch dst.expires. Fix this by resetting dst.expires when clearing the RTF_EXPIRE flag. dst_set_expires checks for a zero expiration and updates the dst.expires. In the past this (not updating dst.expires) was necessary because dst.expire was placed in a union with the dst_entry *from reference and rt6_clean_expires did assign NULL to it. This split happend in ecd98837 ("ipv6: fix race condition regarding dst->expires and dst->from"). Reported-by: NSteinar H. Gunderson <sgunderson@bigfoot.com> Reported-by: NValentijn Sessink <valentyn@blub.net> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: NEric Dumazet <edumazet@google.com> Tested-by: NValentijn Sessink <valentyn@blub.net> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Antonio Quartulli 提交于
Right now skb->data is passed to rx_hook() even if the skb has not been linearised and without giving rx_hook() a way to linearise it. Change the rx_hook() interface and make it accept the skb and the offset to the UDP payload as arguments. rx_hook() is also renamed to rx_skb_hook() to ensure that out of the tree users notice the API change. In this way any rx_skb_hook() implementation can perform all the needed operations to properly (and safely) access the skb data. Signed-off-by: NAntonio Quartulli <antonio@meshcoding.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-