Commits · 8d829bdb97dc3a0c9c8090b9b168ca46ea99c8d8 · openanolis / cloud-kernel

10 Nov, 2016 1 commit

net/sched: cls_flower: Add UDP port to tunnel parameters · f4d997fd

Authored by Hadar Hen Zion on Nov 07, 2016

The current IP tunneling classification supports only IP addresses and key.
Enhance UDP based IP tunneling classification parameters by adding UDP
src and dst port.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Nov, 2016 1 commit

net/sched: cls_flower: Support matching on SCTP ports · 5976c5f4

Authored by Simon Horman on Nov 03, 2016

Support matching on SCTP ports in the same way that matching
on TCP and UDP ports is already supported.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol ip parent ffff: \
        flower indev eth0 ip_proto sctp dst_port 80 \
        action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Sep, 2016 1 commit

net: cls_bpf: limit hardware offload by software-only flag · 0d01d45f

Authored by Jakub Kicinski on Sep 21, 2016

Add cls_bpf support for the TCA_CLS_FLAGS_SKIP_HW flag.
Unlike U32 and flower, cls_bpf already has some netlink
flags defined. Create a new attribute so that cls_bpf can use
the same flag values as those classifiers.

Unlike U32 and flower, cls_bpf rejects unknown flags.
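
As a rough illustration of the "reject unknown flags" behaviour described above, the check can be sketched as a simple mask test. The `EX_` names and bit values are assumptions made for this sketch (they are not copied from the UAPI header), and the function is hypothetical rather than the actual cls_bpf code.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative bit values, assumed for this sketch only. */
#define EX_FLAG_SKIP_HW (1u << 0)
#define EX_FLAG_SKIP_SW (1u << 1)

/* Flags this hypothetical classifier understands. */
#define EX_SUPPORTED_FLAGS (EX_FLAG_SKIP_HW)

/* Return 0 if every set bit is understood, -EINVAL otherwise. */
static int ex_check_gen_flags(uint32_t flags)
{
	if (flags & ~EX_SUPPORTED_FLAGS)
		return -EINVAL;
	return 0;
}
```

Rejecting unknown bits up front leaves room to give those bits a meaning later without silently changing the behaviour of existing users.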

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Sep, 2016 2 commits

net/sched: cls_flower: Specify vlan attributes format in the UAPI header · 37a6c151

Authored by Or Gerlitz on Sep 15, 2016

Specify the format (size and endianness) for the vlan attributes.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: cls_flower: Support masking for matching on tcp/udp ports · aa72d708

Authored by Or Gerlitz on Sep 15, 2016

Add the definitions for src/dst udp/tcp port masks and use
them when setting and dumping the relevant keys.
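
To make the idea of a port mask concrete, here is a minimal sketch of a masked comparison. The function name is hypothetical, and the sketch assumes the packet port, key and mask are all held in the same byte order; it is not the flower implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* A packet port matches when it agrees with the key on every bit
 * that is set in the mask. */
static bool ex_port_matches(uint16_t pkt_port, uint16_t key, uint16_t mask)
{
	return (pkt_port & mask) == (key & mask);
}
```

With a full mask of 0xffff this degenerates to an exact match; a mask of 0xfff0 with key 8080 would accept ports 8080 through 8095.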

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Sep, 2016 1 commit

net/sched: cls_flower: Classify packet in ip tunnels · bc3103f1

Authored by Amir Vadai on Sep 08, 2016

Introduce classification by metadata extracted by the tunnel device.
The outer header fields (source/destination IP and tunnel ID) are extracted
from the metadata when classifying.

For example, the following adds a filter on the ingress qdisc of a shared
vxlan device named 'vxlan0' that forwards packets with outer src ip
11.11.0.2, dst ip 11.11.0.1 and tunnel id 11. The packets are
forwarded to the tap device 'vnet0' (after the metadata is released):

$ tc filter add dev vxlan0 protocol ip parent ffff: \
    flower \
      enc_src_ip 11.11.0.2 \
      enc_dst_ip 11.11.0.1 \
      enc_key_id 11 \
      dst_ip 11.11.11.1 \
    action tunnel_key release \
    action mirred egress redirect dev vnet0

The tunnel_key action will be introduced in the next patch in this
series.

Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Aug, 2016 1 commit

net_sched: flower: Add vlan support · 9399ae9a

Authored by Hadar Hen Zion on Aug 17, 2016

Enhance flower to support 802.1Q vlan protocol classification.
Currently, the supported fields are vlan_id and vlan_priority.

Example:

	# add a flower filter with vlan id and priority classification
	tc filter add dev ens4f0 protocol 802.1Q parent ffff: \
		flower \
		indev ens4f0 \
		vlan_ethtype ipv4 \
		vlan_id 100 \
		vlan_prio 3 \
	action vlan pop
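
For readers unfamiliar with how the two supported fields relate to the 802.1Q tag, the following sketch splits a 16-bit Tag Control Information (TCI) value into VLAN ID and priority. The helper names are hypothetical and the TCI is assumed to be in host byte order.

```c
#include <stdint.h>

/* 802.1Q TCI layout: priority (PCP) in bits 15..13, DEI in bit 12,
 * VLAN ID in bits 11..0. */
static uint16_t ex_vlan_id(uint16_t tci)
{
	return tci & 0x0fff;
}

static uint8_t ex_vlan_prio(uint16_t tci)
{
	return (uint8_t)(tci >> 13);
}

/* Build a TCI from an ID and a priority, leaving DEI clear. */
static uint16_t ex_vlan_tci(uint16_t id, uint8_t prio)
{
	return (uint16_t)(((prio & 0x7u) << 13) | (id & 0x0fffu));
}
```

The filter in the example above (vlan_id 100, vlan_prio 3) would correspond to a TCI of 0x6064 under this layout.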

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Jul, 2016 2 commits

net/sched: Add match-all classifier hw offloading. · b87f7936

Authored by Yotam Gigi on Jul 21, 2016

Following the work that has been done on offloading classifiers like u32
and flower, hw offloading of the match-all classifier is now possible if
the interface supports tc offloading.

To control the offloading, two tc flags have been introduced: skip_sw and
skip_hw. Typical usage:

tc filter add dev eth25 parent ffff: 	\
	matchall skip_sw		\
	action mirred egress mirror	\
	dev eth27

Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: introduce Match-all classifier · bf3994d2

Authored by Jiri Pirko on Jul 21, 2016

The matchall classifier matches every packet and allows the user to apply
actions on it. This filter is very useful in use cases where every packet
should be matched; for example, packet mirroring (SPAN) can be set up very
easily using that filter.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08 Jun, 2016 2 commits

net sched: indentation and other OCD stylistic fixes · 0b0f43fe

Authored by Jamal Hadi Salim on Jun 05, 2016

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>

net sched actions: introduce timestamp for firsttime use · 53eb440f

Authored by Jamal Hadi Salim on Jun 06, 2016

Useful to know when the action was first used for accounting
(and debugging)

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 May, 2016 1 commit

net sched actions: policer missing timestamp processing · 3d3ed181

Authored by Jamal Hadi Salim on May 23, 2016

Policer was not dumping or updating timestamps

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 May, 2016 2 commits

net: cls_u32: Add support for skip-sw flag to tc u32 classifier. · d34e3e18

Authored by Samudrala, Sridhar on May 12, 2016

On devices that support TC U32 offloads, this flag enables a filter to be
added only to HW. skip-sw and skip-hw are mutually exclusive flags. By
default without any flags, the filter is added to both HW and SW, but no
error checks are done in case of failure to add to HW. With skip-sw,
failure to add to HW is treated as an error.

Here is a sample script that adds 2 filters, one with skip-sw and the other
with skip-hw flag.

   # add ingress qdisc
   tc qdisc add dev p4p1 ingress

   # enable hw tc offload.
   ethtool -K p4p1 hw-tc-offload on

   # add u32 filter with skip-sw flag.
   tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
      handle 800:0:1 u32 ht 800: flowid 800:1 \
      skip-sw \
      match ip src 192.168.1.0/24 \
      action drop

   # add u32 filter with skip-hw flag.
   tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
      handle 800:0:2 u32 ht 800: flowid 800:2 \
      skip-hw \
      match ip src 192.168.2.0/24 \
      action drop
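
The flag semantics described in this commit message (mutual exclusion, silent hardware failure by default, hard failure with skip-sw) can be summarised in a small sketch. All names and bit values here are hypothetical, and the error code chosen for a hardware failure is arbitrary; this is not the cls_u32 code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values, assumed for this sketch only. */
#define EX_SKIP_HW (1u << 0)
#define EX_SKIP_SW (1u << 1)

/* skip-sw and skip-hw may each be set alone, but never together. */
static bool ex_flags_valid(uint32_t flags)
{
	uint32_t both = EX_SKIP_HW | EX_SKIP_SW;

	return (flags & both) != both;
}

/* Outcome of adding a filter, given whether the hardware accepted it.
 * Returns 0 on success or a negative errno on failure. */
static int ex_add_filter(uint32_t flags, bool hw_ok)
{
	if (!ex_flags_valid(flags))
		return -EINVAL;
	if (flags & EX_SKIP_HW)
		return 0;          /* software only, hardware is never tried */
	if (!hw_ok && (flags & EX_SKIP_SW))
		return -EINVAL;    /* hardware was the only target and it failed */
	return 0;                  /* default: a hardware failure is not reported */
}
```

In the sample script above, the first filter corresponds to `ex_add_filter(EX_SKIP_SW, ...)` and the second to `ex_add_filter(EX_SKIP_HW, ...)`.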

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: Move TCA_CLS_FLAGS_SKIP_HW to uapi header file. · 760edee8

Authored by Samudrala, Sridhar on May 12, 2016

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Apr, 2016 1 commit

sched: align nlattr properly when needed · 9854518e

Authored by Nicolas Dichtel on Apr 26, 2016

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Mar, 2016 1 commit

net/flower: Introduce hardware offload support · 5b33f488

Authored by Amir Vadai on Mar 08, 2016

This patch is based on a patch made by John Fastabend.
It adds support for offloading cls_flower.
When NETIF_F_HW_TC is on:
  flags = 0       => Rule will be processed twice - by hardware, and if
                     still relevant, by software.
  flags = SKIP_HW => Rule will be processed by software only

If the hardware fails or is not capable of applying the rule, the operation
will NOT fail. The filter will be processed by SW only.

Acked-by: Jiri Pirko <jiri@mellanox.com>
Suggested-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Mar, 2016 1 commit

net: sched: cls_u32 add bit to specify software only rules · 9e8ce79c

Authored by John Fastabend on Feb 26, 2016

In the initial implementation the only way to stop a rule from being
inserted into the hardware table was via the device feature flag.
However this doesn't work well when working on an end host system
where packets are expected to hit both the hardware and software
datapaths.

For example we can imagine a rule that will match an IP address and
increment a field. If we install this rule in both hardware and
software we may increment the field twice. To date we have only
added support for the drop action so we have been able to ignore
these cases. But as we extend the action support we will hit this
example plus more such cases. Arguably these are not even corner
cases; in many working systems these cases will be common.

To avoid forcing the driver to always abort (i.e. the above example)
this patch adds a flag to add a rule in software only. A careful
user can use this flag to build software and hardware datapaths
that work together. One example we have found particularly useful
is to use hardware resources to set the skb->mark on the skb when
the match may be expensive to run in software but a mark lookup
in a hash table is cheap. The idea here is hardware can do in one
lookup what the u32 classifier may need to traverse multiple lists
and hash tables to compute. The flag is only passed down on inserts.
On deletion to avoid stale references in hardware we always try
to remove a rule if it exists.

The flags field is part of the classifier specific options. Although
it is tempting to lift this into the generic structure, doing this
proves difficult due to how the tc netlink attributes are implemented
along with how the dump/change routines are called. There is also
precedent for putting seemingly generic pieces in the specific
classifier options such as TCA_U32_POLICE, TCA_U32_ACT, etc. So
although not ideal I've left FLAGS in the u32 options as well, as it
simplifies the code greatly and user space has already learned how
to manage these bits a la the 'tc' tool.

Another point: when updating a rule we require the flags to
be unchanged. This is to force user space, software u32 and
the hardware u32 to keep in sync. Thanks to Simon Horman for
catching this case.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Sep, 2015 2 commits

bpf: add bpf_redirect() helper · 27b29f63

Authored by Alexei Starovoitov on Sep 15, 2015

Existing bpf_clone_redirect() helper clones skb before redirecting
it to RX or TX of destination netdev.
Introduce bpf_redirect() helper that does that without cloning.

Benchmarked with two hosts using 10G ixgbe NICs.
One host is doing line rate pktgen.
Another host is configured as:
$ tc qdisc add dev $dev ingress
$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop
so it receives the packet on $dev and immediately xmits it on $dev + 1
The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program
that does bpf_clone_redirect() and performance is 2.0 Mpps

$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section redirect_xmit drop
which is using bpf_redirect() - 2.4 Mpps

and using cls_bpf with integrated actions as:
$ tc filter add dev $dev root pref 10 \
  bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1
performance is 2.5 Mpps

To summarize:
u32+act_bpf using clone_redirect - 2.0 Mpps
u32+act_bpf using redirect - 2.4 Mpps
cls_bpf using redirect - 2.5 Mpps

For comparison linux bridge in this setup is doing 2.1 Mpps
and ixgbe rx + drop in ip_rcv - 7.8 Mpps

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cls_bpf: introduce integrated actions · 045efa82

Authored by Daniel Borkmann on Sep 15, 2015

Often the cls_bpf classifier is used with a single drop action attached.
Optimize this use case and let cls_bpf return both classid and action.
For backwards compatibility reasons, enable this feature under the
TCA_BPF_FLAG_ACT_DIRECT flag.

Then more interesting programs like the following are easier to write:
int cls_bpf_prog(struct __sk_buff *skb)
{
  /* classify arp, ip, ipv6 into different traffic classes
   * and drop all other packets
   */
  switch (skb->protocol) {
  case htons(ETH_P_ARP):
    skb->tc_classid = 1;
    break;
  case htons(ETH_P_IP):
    skb->tc_classid = 2;
    break;
  case htons(ETH_P_IPV6):
    skb->tc_classid = 3;
    break;
  default:
    return TC_ACT_SHOT;
  }

  return TC_ACT_OK;
}

Joint work with Daniel Borkmann.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 May, 2015 1 commit

net: sched: pkt_cls: remove unused macros from uapi · bd5850d3

Authored by Florian Westphal on May 21, 2015

Jamal points out that this header also contains kernel internal magic that
cannot be used from userspace for anything meaningful.

Let's remove what the kernel doesn't use anymore and wrap the remainder with
__KERNEL__.

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 May, 2015 2 commits

tc: introduce Flower classifier · 77b9900e

Authored by Jiri Pirko on May 12, 2015

This patch introduces a flow-based filter. So far, the very essential
packet fields are supported.

This patch is only the first step. There are many potential performance
improvements still to be implemented. Also, a lot of features are missing
now. They will be addressed in follow-up patches.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: use counter to break reclassify loops · e578d9c0

Authored by Florian Westphal on May 11, 2015

Seems all we want here is to avoid endless 'goto reclassify' loop.
tc_classify_compat even resets this counter when something other
than TC_ACT_RECLASSIFY is returned, so this skb-counter doesn't
break hypothetical loops induced by something other than perpetual
TC_ACT_RECLASSIFY return values.

skb_act_clone is now identical to skb_clone, so just use that.

Tested with following (bogus) filter:
tc filter add dev eth0 parent ffff: \
 protocol ip u32 match u32 0 0 police rate 10Kbit burst \
 64000 mtu 1500 action reclassify
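
The idea of breaking an endless reclassify loop with a plain counter can be sketched outside the kernel as follows. The verdict names, the cap of 4 and the callback type are assumptions made for this sketch; the commit message does not state the kernel's actual limit.

```c
#include <stddef.h>

enum ex_verdict { EX_OK, EX_SHOT, EX_RECLASSIFY };

/* Hypothetical cap on consecutive reclassify requests. */
#define EX_MAX_RECLASSIFY 4

typedef enum ex_verdict (*ex_classify_fn)(void *ctx);

/* Repeat one classification pass while it asks to reclassify, and drop
 * the packet once the local counter reaches the cap. */
static enum ex_verdict ex_classify(ex_classify_fn classify, void *ctx)
{
	int limit = 0;

	for (;;) {
		enum ex_verdict v = classify(ctx);

		if (v != EX_RECLASSIFY)
			return v;
		if (limit++ >= EX_MAX_RECLASSIFY)
			return EX_SHOT;
	}
}

/* Example pass that never settles, similar in spirit to the bogus
 * filter above; ctx points to an int that counts invocations. */
static enum ex_verdict ex_always_reclassify(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return EX_RECLASSIFY;
}
```

Because the counter lives on the caller's stack, no per-packet state is needed to bound the loop.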

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 May, 2015 1 commit

tc: remove unused redirect ttl · c19ae86a

Authored by Jamal Hadi Salim on May 01, 2015

improves ingress+u32 performance from 22.4 Mpps to 22.9 Mpps

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 May, 2015 1 commit

net: sched: remove TC_MUNGED bits · 4749c3ef

Authored by Florian Westphal on Apr 30, 2015

Not used.

pedit sets TC_MUNGED when packet content was altered, but all the core
does is unset MUNGED again and then set OK2MUNGE.

And the latter isn't tested anywhere. So let's remove both
TC_MUNGED and TC_OK2MUNGE.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Mar, 2015 1 commit

cls_bpf: add initial eBPF support for programmable classifiers · e2e9b654

Authored by Daniel Borkmann on Mar 01, 2015

This work extends the "classic" BPF programmable tc classifier by
extending its scope also to native eBPF code!

This allows user space to implement its own custom, 'safe' C-like
classifiers (or whatever other frontend language LLVM et al may
provide in future), which can then be compiled with the LLVM eBPF
backend to an eBPF ELF file. The result can be loaded into
the kernel via iproute2's tc. In the kernel, they can be JITed on
major archs and thus run at native performance.

Simple, minimal toy example to demonstrate the workflow:

  #include <linux/ip.h>
  #include <linux/if_ether.h>
  #include <linux/bpf.h>

  #include "tc_bpf_api.h"

  __section("classify")
  int cls_main(struct sk_buff *skb)
  {
    return (0x800 << 16) | load_byte(skb, ETH_HLEN + __builtin_offsetof(struct iphdr, tos));
  }

  char __license[] __section("license") = "GPL";

The classifier can then be compiled into eBPF opcodes and loaded
via tc, for example:

  clang -O2 -emit-llvm -c cls.c -o - | llc -march=bpf -filetype=obj -o cls.o
  tc filter add dev em1 parent 1: bpf cls.o [...]
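
The return value of cls_main in the toy example above packs two numbers into one 32-bit value. Assuming the usual tc handle layout (major number in the upper 16 bits, minor number in the lower 16 bits), it reads as class major 0x800 with the IP TOS byte as the minor. The helper names below are hypothetical and only illustrate that packing.

```c
#include <stdint.h>

/* Compose a 32-bit classid from a 16-bit major and a 16-bit minor. */
static uint32_t ex_classid(uint16_t major, uint16_t minor)
{
	return ((uint32_t)major << 16) | minor;
}

/* Split a classid back into its two halves. */
static uint16_t ex_major(uint32_t classid)
{
	return (uint16_t)(classid >> 16);
}

static uint16_t ex_minor(uint32_t classid)
{
	return (uint16_t)(classid & 0xffffu);
}
```

For a packet with TOS 0x10, the toy classifier would therefore return the same value as `ex_classid(0x800, 0x10)`.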

As it has been demonstrated, the scope can even reach up to a fully
fledged flow dissector (similarly as in samples/bpf/sockex2_kern.c).

For tc, maps are allowed to be used, but from kernel context only,
in other words, eBPF code can keep state across filter invocations.
In future, we perhaps may reattach from a different application to
those maps e.g., to read out collected statistics/state.

Similarly as in socket filters, we may extend functionality for eBPF
classifiers over time depending on the use cases. For that purpose,
cls_bpf programs are using BPF_PROG_TYPE_SCHED_CLS program type, so
we can allow additional functions/accessors (e.g. an ABI compatible
offset translation to skb fields/metadata). For an initial cls_bpf
support, we allow the same set of helper functions as eBPF socket
filters, but we could diverge at some point in time w/o problem.

I was wondering whether cls_bpf and act_bpf could share C programs,
I can imagine that at some point, we introduce i) further common
handlers for both (or even beyond their scope), and/or if truly needed
ii) some restricted function space for each of them. Both can be
abstracted easily through struct bpf_verifier_ops in future.

The context of cls_bpf versus act_bpf is slightly different though:
a cls_bpf program returns a specific classid whereas act_bpf returns a
drop/non-drop return code; the latter may also mangle skbs in future.
That said, we can surely have a "classify" and an "action" section in
a single object file or, considering the mentioned constraint, add the
possibility of a shared section.

The workflow for getting native eBPF running from tc [1] is as
follows: for f_bpf, I've added a slightly modified ELF parser code
from Alexei's kernel sample, which reads out the LLVM compiled
object, sets up maps (and dynamically fixes up map fds) if any, and
loads the eBPF instructions all centrally through the bpf syscall.

The resulting fd from the loaded program itself is being passed down
to cls_bpf, which looks up struct bpf_prog from the fd store, and
holds reference, so that it stays available also after tc program
lifetime. On tc filter destruction, it will then drop its reference.

Moreover, I've also added the optional possibility to annotate an
eBPF filter with a name (e.g. path to object file, or something
else if preferred) so that when tc dumps currently installed filters,
some more context can be given to an admin for a given instance (as
opposed to just the file descriptor number).

Last but not least, bpf_prog_get() and bpf_prog_put() needed to be
exported, so that eBPF can be used from cls_bpf built as a module.
Thanks to 60a3b225 ("net: bpf: make eBPF interpreter images
read-only") I think this is of no concern since anything wanting to
alter eBPF opcode after verification stage would crash the kernel.

  [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Oct, 2013 1 commit

net: sched: cls_bpf: add BPF-based classifier · 7d1d65cb

Authored by Daniel Borkmann on Oct 28, 2013

This work contains a lightweight BPF-based traffic classifier that can
serve as a flexible alternative to ematch-based tree classification,
now that the BPF filter engine can also be JITed in the kernel. Naturally, tc
actions and policies are supported as well with cls_bpf. Multiple BPF
programs/filters can be attached to a class, or they can just as well be
written within a single BPF program; it is really up to the user how they
wish to run/optimize the code, e.g. also for inversion of verdicts etc.
The notion of a BPF program's return/exit codes is being kept as follows:

     0: No match
    -1: Select classid given in "tc filter ..." command
  else: flowid, overwrite the default one
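
The return-code convention listed above can be expressed as a small helper. This is a sketch with hypothetical names, assuming the BPF result is an unsigned 32-bit value in which -1 appears as 0xffffffff; it is not the cls_bpf source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Apply the return-code convention to one filter's BPF result.
 * Returns false on "no match"; otherwise stores the chosen classid
 * in *out and returns true. */
static bool ex_resolve_classid(uint32_t bpf_ret, uint32_t configured,
			       uint32_t *out)
{
	if (bpf_ret == 0)
		return false;             /* 0: no match */
	if (bpf_ret == UINT32_MAX)
		*out = configured;        /* -1: classid from "tc filter ..." */
	else
		*out = bpf_ret;           /* else: overwrite the default */
	return true;
}
```

Here `configured` stands for the flowid given on the tc command line, for example 1:1 encoded as 0x00010001.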

As a minimal usage example with iproute2, we use a 3 band prio root qdisc
on a router with sfq as the leaf of each band, and assign ssh and icmp bpf-based
filters to band 1, http traffic to band 2 and the rest to band 3. For the
first two bands we load the bytecode from a file; for the third we load it
inline as an example:

echo 1 > /proc/sys/net/core/bpf_jit_enable

tc qdisc del dev em1 root
tc qdisc add dev em1 root handle 1: prio bands 3 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

tc qdisc add dev em1 parent 1:1 sfq perturb 16
tc qdisc add dev em1 parent 1:2 sfq perturb 16
tc qdisc add dev em1 parent 1:3 sfq perturb 16

tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/ssh.bpf flowid 1:1
tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/icmp.bpf flowid 1:1
tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/http.bpf flowid 1:2
tc filter add dev em1 parent 1: bpf run bytecode "`bpfc -f tc -i misc.ops`" flowid 1:3

BPF programs can be easily created and passed to tc, either as inline
'bytecode' or 'bytecode-file'. There are a couple of front-ends that can
compile opcodes, for example:

1) People familiar with tcpdump-like filters:

   tcpdump -iem1 -ddd port 22 | tr '\n' ',' > /etc/tc/ssh.bpf

2) People that want to low-level program their filters or use BPF
   extensions that lack support by libpcap's compiler:

   bpfc -f tc -i ssh.ops > /etc/tc/ssh.bpf

   ssh.ops example code:
   ldh [12]
   jne #0x800, drop
   ldb [23]
   jneq #6, drop
   ldh [20]
   jset #0x1fff, drop
   ldxb 4 * ([14] & 0xf)
   ldh [%x + 14]
   jeq #0x16, pass
   ldh [%x + 16]
   jne #0x16, drop
   pass: ret #-1
   drop: ret #0
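
   For readers who do not read BPF assembly, the ssh.ops program above can be
   paraphrased in C. This is only an illustrative sketch: the function names
   are hypothetical, the packet is assumed to start at the Ethernet header,
   and explicit length checks stand in for BPF's behaviour of rejecting
   out-of-bounds loads.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 16-bit value at offset off; the caller checks bounds. */
static uint16_t ex_ldh(const uint8_t *p, size_t off)
{
	return (uint16_t)((p[off] << 8) | p[off + 1]);
}

/* Paraphrase of ssh.ops: true means "pass" (ret #-1), false "drop" (ret #0). */
static bool ex_is_ssh(const uint8_t *pkt, size_t len)
{
	size_t x;

	if (len < 24)
		return false;
	if (ex_ldh(pkt, 12) != 0x0800)      /* ldh [12]: EtherType is IPv4 */
		return false;
	if (pkt[23] != 6)                   /* ldb [23]: IP protocol is TCP */
		return false;
	if (ex_ldh(pkt, 20) & 0x1fff)       /* ldh [20]: not a later fragment */
		return false;
	x = 4u * (pkt[14] & 0xf);           /* ldxb: IP header length in bytes */
	if (len < x + 18)
		return false;
	if (ex_ldh(pkt, x + 14) == 0x16)    /* ldh [%x + 14]: source port 22 */
		return true;
	return ex_ldh(pkt, x + 16) == 0x16; /* ldh [%x + 16]: destination port 22 */
}
```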

It was chosen to load bytecode into tc, since the reverse operation,
tc filter list dev em1, is then able to show the exact commands again.
Possible follow-up work could also include a small expression compiler
for iproute2. Tested with the help of bmon. This idea came up during
the Netfilter Workshop 2013 in Copenhagen. Also thanks to feedback from
Eric Dumazet!

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Oct, 2012 1 commit

UAPI: (Scripted) Disintegrate include/linux · 607ca46e

Authored by David Howells on Oct 13, 2012

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>

12 Jul, 2012 1 commit

net: sched: add ipset ematch · 6d4fa852

Authored by Florian Westphal on Jul 11, 2012

Can be used to match packets against netfilter ip sets created via ipset(8).
skb->skb_iif is used as 'incoming interface', skb->dev is 'outgoing interface'.

Since ipset is usually called from netfilter, the ematch
initializes a fake xt_action_param, pulls the ip header into the
linear area and also sets skb->data to the IP header (otherwise
matching Layer 4 set types doesn't work).

Tested-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Jul, 2012 1 commit

net: em_canid: Ematch rule to match CAN frames according to their identifiers · f057bbb6

Authored by Rostislav Lisovy on Jul 04, 2012

This ematch makes it possible to classify CAN frames (AF_CAN) according
to their identifiers. This functionality cannot be easily achieved with
existing classifiers, such as u32, because the CAN identifier is always stored
in native endianness, whereas u32 expects network byte order.

Signed-off-by: Rostislav Lisovy <lisovy@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

22 Aug, 2010 1 commit

net_sched: cls_flow: add key rxhash · 739a91ef

Authored by Changli Gao on Aug 21, 2010

We can use rxhash to classify the traffic into flows. As rxhash may be
supplied by the NIC or RPS, it is cheaper.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Nov, 2009 1 commit

net: cleanup include/linux · d94d9fee

Authored by Eric Dumazet on Nov 04, 2009

This cleanup patch puts struct/union/enum opening braces
on the first line to ease grep games.

struct something
{

becomes:

struct something {

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31 Jan, 2009 1 commit

headers_check fix: linux/pkt_cls.h · ed307444

Authored by Jaswinder Singh Rajput on Jan 30, 2009

Fix the following 'make headers_check' warning:

linux/pkt_cls.h:122: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>

08 Nov, 2008 1 commit

pkt_sched: Control group classifier · f4009237

Authored by Thomas Graf on Nov 07, 2008

The classifier should cover the most common use case and will work
without any special configuration.

The principle of the classifier is to directly access the
task_struct via get_current(). In order for this to work,
classification requests from softirqs must be ignored. This is
not a problem because the vast majority of packets in softirq
context are not assigned to a task anyway. For this to work, a
mechanism is needed to trace softirq context. 

This repost goes back to the method of relying on the number of
nested bh disable calls for the sake of not adding too much
complexity and the option to come up with something more reliable
if actually needed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Jul, 2008 1 commit

net-sched: cls_flow: add perturbation support · 72d9794f

Authored by Patrick McHardy on Jul 14, 2008

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Mar, 2008 1 commit

[PKT_SCHED]: annotate cls_u32 · 0382b9c3

Authored by Al Viro on Mar 17, 2008

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Feb, 2008 1 commit

[NET_SCHED]: cls_flow: support classification based on VLAN tag · 9ec13810

Authored by Patrick McHardy on Feb 05, 2008

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Feb, 2008 1 commit

[PKT_SCHED]: vlan tag match · 3113e88c

Authored by Stephen Hemminger on Feb 05, 2008

Provide a way to use tc filters on vlan tag even if tag is buried in
skb due to hardware acceleration.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Feb, 2008 1 commit

[NET_SCHED]: Add flow classifier · e5dfb815

Authored by Patrick McHardy on Jan 31, 2008

Add new "flow" classifier, which is meant to extend the SFQ hashing
capabilities without hard-coding new hash functions and also allows
deterministic mappings of keys to classes, replacing some out of tree
iptables patches like IPCLASSIFY (maps IPs to classes), IPMARK (maps
IPs to marks, with fw filters to classes), ...

Some examples:

- Classic SFQ hash:

  tc filter add ... flow hash \
  	keys src,dst,proto,proto-src,proto-dst divisor 1024

- Classic SFQ hash, but using information from conntrack to work properly in
  combination with NAT:

  tc filter add ... flow hash \
  	keys nfct-src,nfct-dst,proto,nfct-proto-src,nfct-proto-dst divisor 1024

- Map destination IPs of 192.168.0.0/24 to classids 1-257:

  tc filter add ... flow map \
  	key dst addend -192.168.0.0 divisor 256

- alternatively:

  tc filter add ... flow map \
  	key dst and 0xff

- similar, but reverse ordered:

  tc filter add ... flow map \
  	key dst and 0xff xor 0xff
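
The "flow map" examples above boil down to a little integer arithmetic on the selected key. The sketch below assumes the operations are applied in the order and, xor, rshift, addend, divisor; that order, the struct and the function name are assumptions made for illustration, not taken from the commit.

```c
#include <stdint.h>

/* Hypothetical parameter bundle for one "flow map" filter. */
struct ex_flow_map {
	uint32_t mask;     /* "and" value; 0xffffffff keeps every bit */
	uint32_t xor_val;  /* "xor" value; 0 leaves the key unchanged */
	uint32_t rshift;   /* "rshift" value */
	uint32_t addend;   /* "addend" value, wrapping modulo 2^32 */
	uint32_t divisor;  /* "divisor" value; 0 means no modulo step */
};

/* Assumed evaluation order: and, xor, rshift, addend, then divisor. */
static uint32_t ex_flow_map_key(const struct ex_flow_map *m, uint32_t key)
{
	uint32_t v = (key & m->mask) ^ m->xor_val;

	v = (v >> m->rshift) + m->addend;
	if (m->divisor)
		v %= m->divisor;
	return v;
}
```

Under these assumptions, destination 192.168.0.5 (0xC0A80005) maps to 5 with "addend -192.168.0.0 divisor 256", to 5 with "and 0xff", and to 250 with "and 0xff xor 0xff".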

Perturbation is currently not supported because we can't reliably kill the
timer on destruction.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Jul, 2007 1 commit

[NET_SCHED]: ematch: module autoloading · db3d99c0

Authored by Patrick McHardy on Jul 11, 2007

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Sep, 2006 1 commit

[NET_SCHED]: Add mask support to fwmark classifier · b4e9b520

Authored by Patrick McHardy on Aug 25, 2006

Support masking the nfmark value before the search. The mask value is
global for all filters contained in one instance. It can only be set
when a new instance is created; all filters must specify the same mask.
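
The per-instance mask rule can be modelled in a few lines. This is a toy model with hypothetical names that only illustrates the constraint described above (the first filter fixes the mask, later filters must agree); it is not the cls_fw code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one fw classifier instance. */
struct ex_fw_instance {
	bool     created;  /* false until the first filter is added */
	uint32_t mask;     /* shared by every filter in the instance */
};

/* Add a filter that asks for the given mask. Returns 0 on success or
 * -EINVAL if the mask differs from the one the instance was created with. */
static int ex_fw_add_filter(struct ex_fw_instance *inst, uint32_t mask)
{
	if (!inst->created) {
		inst->created = true;
		inst->mask = mask;
		return 0;
	}
	return mask == inst->mask ? 0 : -EINVAL;
}

/* The mark is masked before it is compared against a filter handle. */
static bool ex_fw_matches(const struct ex_fw_instance *inst,
			  uint32_t mark, uint32_t handle)
{
	return (mark & inst->mask) == handle;
}
```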

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
