Commits · 7276d5d743d775388bf382cd7bdea1a14e486d32 · openeuler / raspberrypi-kernel

25 Apr, 2013 1 commit

packet: minor: convert status bits into shifting format · 7276d5d7

Committed by Daniel Borkmann on Apr 23, 2013

This makes it more readable and clearer what bits are still free to
use. The compiler reduces this to a constant for us anyway.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
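
As a rough illustration of what "shifting format" means, flags can be written as `1 << n` rather than as hex literals, which makes the next unused bit easy to spot. The `EX_` names and the helper below are hypothetical and cover only an illustrative subset; they are not the kernel's macros.

```c
#include <assert.h>

/* Illustrative flag names (assumed for this sketch, not the kernel's macros). */
#define EX_STATUS_KERNEL        0
#define EX_STATUS_USER          (1 << 0)
#define EX_STATUS_COPY          (1 << 1)
#define EX_STATUS_LOSING        (1 << 2)
#define EX_STATUS_CSUMNOTREADY  (1 << 3)

/* Shifts of constants are constant expressions, so they fold at compile time. */
_Static_assert(EX_STATUS_LOSING == 0x4, "shift form equals the hex literal");

/* Returns non-zero when `flag` is set in `status`. */
static int ex_status_has(unsigned long status, unsigned long flag)
{
    return (status & flag) != 0;
}
```

In this sketch the highest bit in use is 3, so bit 4 is visibly the next free one.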

20 Mar, 2013 1 commit

packet: packet fanout rollover during socket overload · 77f65ebd

Committed by Willem de Bruijn on Mar 19, 2013

Changes:
  v3->v2: rebase (no other changes)
          passes selftest
  v2->v1: read f->num_members only once
          fix bug: test rollover mode + flag

Minimize packet drop in a fanout group. If one socket is full,
roll over packets to another from the group. Maintain flow
affinity during normal load using an rxhash fanout policy, while
dispersing unexpected traffic storms that hit a single cpu, such
as spoofed-source DoS flows. Rollover breaks affinity for flows
arriving at saturated sockets during those conditions.

The patch adds a fanout policy ROLLOVER that rotates between sockets,
filling each socket before moving to the next. It also adds a fanout
flag ROLLOVER. If passed along with any other fanout policy, the
primary policy is applied until the chosen socket is full. Then,
rollover selects another socket, to delay packet drop until the
entire system is saturated.

Probing sockets is not free. Selecting the last used socket, as
rollover does, is a greedy approach that maximizes chance of
success, at the cost of extreme load imbalance. In practice, with
sufficiently long queues to absorb bursts, sockets are drained in
parallel and load balance looks uniform in `top`.

To avoid contention, the patch scales counters with the number of
sockets and accesses them lock-free. Values are bounds checked to
ensure correctness.

Tested using an application with 9 threads pinned to CPUs, one socket
per thread and sufficient busywork per packet operation to limit each
thread to handling 32 Kpps. When sent 500 Kpps single UDP stream
packets, a FANOUT_CPU setup processes 32 Kpps in total without this
patch, 270 Kpps with the patch. Tested with read() and with a packet
ring (V1).

Also, passes psock_fanout.c unit test added to selftests.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
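
The rollover flag described above can be modelled in userspace roughly as follows. This is a simplified sketch, not the kernel code: the `ex_` types and functions are hypothetical, and it probes sockets in order starting from the primary choice, whereas the message describes selecting the last used socket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of one socket's receive queue. */
struct ex_sock {
    unsigned int queued;
    unsigned int capacity;
};

static int ex_has_room(const struct ex_sock *s)
{
    return s->queued < s->capacity;
}

/*
 * Try the socket chosen by the primary policy first. If it is full,
 * probe the remaining sockets in order, wrapping around the group.
 * Returns the index to deliver to, or -1 if every socket is full.
 */
static int ex_pick_with_rollover(const struct ex_sock *socks, size_t n,
                                 size_t primary)
{
    for (size_t i = 0; i < n; i++) {
        size_t idx = (primary + i) % n;

        if (ex_has_room(&socks[idx]))
            return (int)idx;
    }
    return -1;
}
```

A packet is dropped in this model only when the function returns -1, that is, when the whole group is saturated.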

08 Nov, 2012 1 commit

packet: tx_ring: allow the user to choose tx data offset · 5920cd3a

Committed by Paul Chavent on Nov 06, 2012

The tx data offset of the packet mmap tx ring used to be:
(TPACKET2_HDRLEN - sizeof(struct sockaddr_ll))

The problem is that, with SOCK_RAW socket, the payload (14 bytes after
the beginning of the user data) is misaligned.

This patch lets the user specify an offset for their tx data if
desired.

Set sock option PACKET_TX_HAS_OFF to 1, then specify in each frame of
your tx ring tp_net for SOCK_DGRAM, or tp_mac for SOCK_RAW.
Signed-off-by: Paul Chavent <paul.chavent@onera.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
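
The alignment problem can be checked with a little arithmetic. The sketch below assumes, purely for illustration, a fixed data offset of 32 bytes and the 14-byte MAC header mentioned above; the `ex_` helpers are hypothetical and only show how a user might pick an offset that aligns the payload.

```c
#include <assert.h>

/* Distance of the payload start past the previous `align` boundary.
 * `align` must be non-zero. */
static unsigned int ex_payload_misalign(unsigned int data_off,
                                        unsigned int mac_len,
                                        unsigned int align)
{
    return (data_off + mac_len) % align;
}

/* Smallest offset >= min_off that puts the payload on an `align` boundary. */
static unsigned int ex_aligned_data_off(unsigned int min_off,
                                        unsigned int mac_len,
                                        unsigned int align)
{
    unsigned int off = min_off;

    while ((off + mac_len) % align != 0)
        off++;
    return off;
}
```

With the assumed values, a frame whose data starts at offset 32 has its payload at byte 46, two bytes past a 4-byte boundary; starting the data at 34 instead aligns the payload.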

13 Oct, 2012 1 commit

UAPI: (Scripted) Disintegrate include/linux · 607ca46e

Committed by David Howells on Oct 13, 2012

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>

04 Oct, 2011 1 commit

Repair wrong named definition aligned_u64 · 96c13184

Committed by Jiří Župka on Sep 30, 2011

This repairs a problem with compiling a userspace library (libnl).
Signed-off-by: Jiří Župka <jzupka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Aug, 2011 1 commit

af_packet: Prefixed tpacket_v3 structs to avoid name space collision · bc59ba39

Committed by chetan loke on Aug 25, 2011

structs introduced in tpacket_v3 implementation are prefixed with 'tpacket'
to avoid namespace collision.

Compile tested.
Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Aug, 2011 1 commit

af-packet: Added TPACKET_V3 headers. · 0d4691ce

Committed by chetan loke on Aug 19, 2011

Added TPACKET_V3 definitions.
Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Jul, 2011 3 commits

packet: Add 'cpu' fanout policy. · 95ec3eb4

Committed by David S. Miller on Jul 06, 2011

Unfortunately we have to use a real modulus here as
the multiply trick won't work as effectively with cpu
numbers as it does with rxhash values.
Signed-off-by: David S. Miller <davem@davemloft.net>
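
To see why a real modulus is needed, compare one common form of the multiply trick with a plain `%`. The sketch below is illustrative only: the `ex_` helpers are hypothetical, and the multiply form shown is assumed to be the trick the message refers to.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Multiply trick: scales a 32-bit value onto [0, n) without dividing.
 * It spreads inputs well only when they cover the full 32-bit range,
 * as hash values do.
 */
static uint32_t ex_scale(uint32_t value, uint32_t n)
{
    return (uint32_t)(((uint64_t)value * n) >> 32);
}

/* Plain modulus: also works for small inputs such as CPU numbers.
 * `n` must be non-zero. */
static uint32_t ex_mod(uint32_t value, uint32_t n)
{
    return value % n;
}
```

Small inputs such as CPU numbers 3 and 7 both scale to index 0 with four members, while the modulus still distributes them across the group.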

packet: Add pre-defragmentation support for ipv4 fanouts. · 7736d33f

Committed by David S. Miller on Jul 05, 2011

The skb->rxhash cannot be properly computed if the
packet is a fragment.  To alleviate this, allow the
AF_PACKET client to ask for defragmentation to be
done at demux time.
Signed-off-by: David S. Miller <davem@davemloft.net>

packet: Add fanout support. · dc99f600

Committed by David S. Miller on Jul 05, 2011

Fanouts allow packet capturing to be demuxed to a set of AF_PACKET
sockets.  Two fanout policies are implemented:

1) Hashing based upon skb->rxhash

2) Pure round-robin

An AF_PACKET socket must be fully bound before it tries to add itself
to a fanout.  All AF_PACKET sockets trying to join the same fanout
must all have the same bind settings.

Fanouts are identified (within a network namespace) by a 16-bit ID.
The first socket to try to add itself to a fanout with a particular
ID, creates that fanout.  When the last socket leaves the fanout
(which happens only when the socket is closed), that fanout is
destroyed.
Signed-off-by: David S. Miller <davem@davemloft.net>
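
A userspace program has to pass both the 16-bit fanout ID and the chosen policy when joining a group. The sketch below assumes the option value packs the ID into the low 16 bits and the policy into the high 16 bits; the `EX_` constants and `ex_` helpers are hypothetical stand-ins, and real code should take its constants from `<linux/if_packet.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative policy values (assumed for this sketch). */
enum {
    EX_FANOUT_HASH = 0,   /* hash on the flow */
    EX_FANOUT_LB   = 1    /* pure round-robin */
};

/* Pack a 16-bit group ID and a 16-bit policy into one option value. */
static uint32_t ex_fanout_arg(uint16_t group_id, uint16_t mode)
{
    return (uint32_t)group_id | ((uint32_t)mode << 16);
}

/* Recover the two halves again. */
static uint16_t ex_fanout_id(uint32_t arg)
{
    return (uint16_t)(arg & 0xffff);
}

static uint16_t ex_fanout_mode(uint32_t arg)
{
    return (uint16_t)(arg >> 16);
}
```

Every socket that should share a group would pass the same ID, after being bound with identical settings as the message requires.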

07 Jun, 2011 1 commit

af_packet: prevent information leak · 13fcb7bd

Committed by Eric Dumazet on Jun 06, 2011

In 2.6.27, commit 393e52e3 (packet: deliver VLAN TCI to userspace)
added a small information leak.

Add a padding field and make sure it's zeroed before copy to user.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Jun, 2011 1 commit

af-packet: Add flag to distinguish VID 0 from no-vlan. · a3bcc23e

Committed by Ben Greear on Jun 01, 2011

Currently, user-space cannot determine if a 0 tp_vlan_tci
means there is no VLAN tag or the VLAN ID was zero.

Add flag to make this explicit.  User-space can check for
TP_STATUS_VLAN_VALID || tp_vlan_tci > 0, which will be backwards
compatible. Older code would have just checked for tp_vlan_tci,
so it will work no worse than before.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
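
The backwards-compatible check suggested above might look like this in userspace. The `EX_STATUS_VLAN_VALID` bit position and the `ex_has_vlan_tag` helper are assumptions made for this sketch; real code would test the `TP_STATUS_VLAN_VALID` flag from the kernel header against the frame's status word.

```c
#include <assert.h>
#include <stdint.h>

#define EX_STATUS_VLAN_VALID (1u << 4)  /* illustrative bit position */

/*
 * A tag is present if the kernel says so explicitly, or (for kernels
 * without the flag) if the TCI is non-zero.
 */
static int ex_has_vlan_tag(uint32_t tp_status, uint16_t tp_vlan_tci)
{
    return (tp_status & EX_STATUS_VLAN_VALID) != 0 || tp_vlan_tci > 0;
}
```

On a kernel without the flag, a frame tagged with VLAN ID 0 and priority 0 is still indistinguishable from an untagged one, which matches the "no worse than before" note.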

02 Jun, 2010 1 commit

packet_mmap: expose hw packet timestamps to network packet capture utilities · 614f60fa

Committed by Scott McMillan on Jun 02, 2010

This patch adds a setting, PACKET_TIMESTAMP, to specify the packet
timestamp source that is exported to capture utilities like tcpdump by
packet_mmap.

PACKET_TIMESTAMP accepts the same integer bit field as
SO_TIMESTAMPING.  However, only the SOF_TIMESTAMPING_SYS_HARDWARE and
SOF_TIMESTAMPING_RAW_HARDWARE values are currently recognized by
PACKET_TIMESTAMP.  SOF_TIMESTAMPING_SYS_HARDWARE takes precedence over
SOF_TIMESTAMPING_RAW_HARDWARE if both bits are set.

If PACKET_TIMESTAMP is not set, a software timestamp generated inside
the networking stack is used (the behavior before this setting was
added).
Signed-off-by: Scott McMillan <scott.a.mcmillan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
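
The precedence rule described above can be sketched as a small selection function. The `EX_` names and bit positions are illustrative assumptions for this example, not the definitions from the kernel headers.

```c
#include <assert.h>

/* Illustrative flag bits (assumed positions for this sketch). */
#define EX_TSTAMP_SYS_HARDWARE (1u << 5)
#define EX_TSTAMP_RAW_HARDWARE (1u << 6)

enum ex_ts_source {
    EX_TS_SOFTWARE,      /* default: timestamp from the networking stack */
    EX_TS_SYS_HARDWARE,
    EX_TS_RAW_HARDWARE
};

/* SYS_HARDWARE wins if both bits are set; no bits means software. */
static enum ex_ts_source ex_pick_ts_source(unsigned int flags)
{
    if (flags & EX_TSTAMP_SYS_HARDWARE)
        return EX_TS_SYS_HARDWARE;
    if (flags & EX_TSTAMP_RAW_HARDWARE)
        return EX_TS_RAW_HARDWARE;
    return EX_TS_SOFTWARE;
}
```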

13 Apr, 2010 1 commit

packet: support for TX time stamps on RAW sockets · ed85b565

Committed by Richard Cochran on Apr 07, 2010

Enable the SO_TIMESTAMPING socket infrastructure for raw packet sockets.
We introduce PACKET_TX_TIMESTAMP for the control message cmsg_type.

Similar support for UDP and CAN sockets was added in commit
51f31cab
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Feb, 2010 1 commit

packet: Add GSO/csum offload support. · bfd5f4a3

Committed by Sridhar Samudrala on Feb 04, 2010

This patch adds GSO/checksum offload to af_packet sockets using
virtio_net_hdr. Based on Rusty's patch to add this support to tun.
It allows GSO/checksum offload to be enabled when using raw socket
backend with virtio_net.
Adds PACKET_VNET_HDR socket option to prepend virtio_net_hdr in the
receive path and process/skip virtio_net_hdr in the send path. This
option is only allowed with SOCK_RAW sockets attached to ethernet
type devices.

v2 updates
----------
Michael's Comments
- Perform length check in packet_snd() when GSO is off even when
  vnet_hdr is present.
- Check for SKB_GSO_FCOE type and return -EINVAL
- don't allow tx/rx ring when vnet_hdr is enabled.
Herbert's Comments
- Removed ethernet specific code.
- protocol value is assumed to be passed in by the caller.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Nov, 2009 1 commit

net: cleanup include/linux · d94d9fee

Committed by Eric Dumazet on Nov 04, 2009

This cleanup patch puts struct/union/enum opening braces,
in first line to ease grep games.

struct something
{

becomes:

struct something {
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Oct, 2009 1 commit

Revert "af_packet: add interframe drop cmsg (v6)" · d5e63bde

Committed by David S. Miller on Oct 12, 2009

This reverts commit 97775007.

Neil is reimplementing this generically, outside of AF_PACKET.
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Oct, 2009 1 commit

af_packet: add interframe drop cmsg (v6) · 97775007

Committed by Neil Horman on Oct 02, 2009

Add ancillary data to better represent loss information

I've had a few requests recently to provide more detail regarding frame loss
during an AF_PACKET packet capture session. Specifically the requestors want to
see where in a packet sequence frames were lost, i.e. they want to see that 40
frames were lost between frames 302 and 303 in a packet capture file. In order
to do this we need:

1) The kernel to export this data to user space
2) The applications to make use of it

This patch addresses item (1). It does this by doing the following:

A) Anytime we drop a frame for which we would increment po->stats.tp_drops, we
also now increment a stat called po->stats.tp_gap.

B) Every time we successfully enqueue a frame to sk_receive_queue, we record the
value of po->stats.tp_gap in skb->mark. skb->cb would nominally be the place to
record this, but since all the space there is used up, we're overloading
skb->mark. It's safe to do since any enqueued packet is guaranteed to be
unshared at this point, and skb->mark isn't used for anything else in the rx
path to the application. After we record tp_gap in the skb, we zero
po->stats.tp_gap. This allows us to keep a counter of the number of frames lost
between any two enqueued packets.

C) When the application goes to dequeue a frame from the packet socket, we look
at skb->mark for that frame. If it is non-zero, we add a cmsg chunk to the
msghdr of level SOL_PACKET and type PACKET_GAPDATA. It's a 32-bit integer that
represents the number of frames lost between this packet and the previous
frame received.

Note there is a chance that if there is frame loss after a receive, and then the
socket is closed, some gap data might be lost. This is covered by the use of
the PACKET_AUXDATA socket option, which gives total loss data. With a bit of
math, the final gap can be determined that way.

I've tested this patch myself, and it works well.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

include/linux/if_packet.h | 2 ++
net/packet/af_packet.c | 33 +++++++++++++++++++++++++++++++++
2 files changed, 35 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
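
Steps A to C above amount to a small counter protocol, which can be modelled in plain C as follows. This is a userspace sketch with hypothetical `ex_` types; it borrows the field names from the message but does not reproduce the kernel implementation.

```c
#include <assert.h>

/* Hypothetical per-socket statistics, named after the fields in the message. */
struct ex_stats {
    unsigned int tp_drops;  /* total frames dropped */
    unsigned int tp_gap;    /* frames dropped since the last enqueue */
};

/* Hypothetical queued frame carrying the gap value, as skb->mark does. */
struct ex_frame {
    unsigned int mark;
};

/* Step A: a drop bumps both counters. */
static void ex_on_drop(struct ex_stats *st)
{
    st->tp_drops++;
    st->tp_gap++;
}

/* Step B: an enqueue stamps the frame with the gap, then resets it. */
static void ex_on_enqueue(struct ex_stats *st, struct ex_frame *f)
{
    f->mark = st->tp_gap;
    st->tp_gap = 0;
}

/* Step C: on dequeue, a non-zero mark is the loss count to report. */
static unsigned int ex_gap_on_dequeue(const struct ex_frame *f)
{
    return f->mark;
}
```

If three frames are dropped between two enqueued frames, the second frame reports a gap of 3 while the total drop counter keeps counting.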

22 May, 2009 1 commit

af_packet: Teach to listen for multiple unicast addresses. · d95ed927

Committed by Eric W. Biederman on May 19, 2009

The PACKET_ADD_MEMBERSHIP and PACKET_DROP_MEMBERSHIP setsockopt
calls for af_packet already have all of the infrastructure needed to subscribe
to multiple mac addresses.  All that is missing is a flag to say that
the address we want to listen on is a unicast address.

So introduce PACKET_MR_UNICAST and wire it up to dev_unicast_add and
dev_unicast_delete.

Additionally I noticed that errors from dev_mc_add were not propagated
from packet_dev_mc so fix that.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
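
From userspace, subscribing to an extra unicast address would mean filling a `struct packet_mreq` with the new type. The helper below is a hypothetical sketch that only prepares the request; it assumes the Linux UAPI header `<linux/if_packet.h>` is available and does not open a socket.

```c
#include <assert.h>
#include <string.h>
#include <linux/if_packet.h>

/*
 * Prepare a membership request for one unicast MAC address.
 * Returns 0 on success, -1 if the address does not fit.
 */
static int ex_fill_unicast_mreq(struct packet_mreq *mreq, int ifindex,
                                const unsigned char *mac,
                                unsigned short mac_len)
{
    if (mac_len > sizeof(mreq->mr_address))
        return -1;

    memset(mreq, 0, sizeof(*mreq));
    mreq->mr_ifindex = ifindex;
    mreq->mr_type = PACKET_MR_UNICAST;
    mreq->mr_alen = mac_len;
    memcpy(mreq->mr_address, mac, mac_len);
    return 0;
}
```

The filled request would then be handed to `setsockopt()` with level `SOL_PACKET` and option `PACKET_ADD_MEMBERSHIP`, once per address.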

19 May, 2009 1 commit

net: TX_RING and packet mmap · 69e3c75f

Committed by Johann Baudy on May 18, 2009

New packet socket feature that makes packet socket more efficient for
transmission.

- It reduces the number of system calls through a PACKET_TX_RING mechanism,
  based on PACKET_RX_RING (Circular buffer allocated in kernel space
  which is mmapped from user space).

- It minimizes CPU copy using fragmented SKB (almost zero copy).
Signed-off-by: Johann Baudy <johann.baudy@gnu-log.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Jul, 2008 1 commit

packet: add PACKET_RESERVE sockopt · 8913336a

Committed by Patrick McHardy on Jul 18, 2008

Add a new sockopt to reserve some headroom in the mmaped ring frames in
front of the packet payload. This can be used, for instance, when the VLAN
header needs to be (re)constructed, to avoid moving the entire payload.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Jul, 2008 2 commits

packet: deliver VLAN TCI to userspace · 393e52e3

Committed by Patrick McHardy on Jul 14, 2008

Store the VLAN tag in the auxiliary data/tpacket2_hdr so userspace can
properly deal with hardware VLAN tagging/stripping.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

packet: support extensible, 64 bit clean mmaped ring structure · bbd6ef87

Committed by Patrick McHardy on Jul 14, 2008

The tpacket_hdr is not 64 bit clean due to use of an unsigned long
and can't be extended because the following struct sockaddr_ll needs
to be at a fixed offset.

Add support for a version 2 tpacket protocol that removes these
limitations.

Userspace can query the header size through a new getsockopt option
and change the protocol version through a setsockopt option. The
changes needed to switch to the new protocol version are:

1. replace struct tpacket_hdr by struct tpacket2_hdr
2. query header len and save
3. set protocol version to 2
 - set up ring as usual
4. for getting the sockaddr_ll, use (void *)hdr + TPACKET_ALIGN(hdrlen)
   instead of (void *)hdr + TPACKET_ALIGN(sizeof(struct tpacket_hdr))

Steps 2 and 4 can be omitted if the struct sockaddr_ll isn't needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
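
Step 4 above is plain pointer arithmetic on the queried header length. The sketch below assumes a 16-byte ring alignment and uses hypothetical `EX_`/`ex_` names in place of `TPACKET_ALIGN`; the header length would come from the getsockopt query in step 2.

```c
#include <assert.h>
#include <stddef.h>

#define EX_ALIGNMENT 16u  /* assumed ring alignment for this sketch */
#define EX_ALIGN(x)  (((x) + EX_ALIGNMENT - 1) & ~(EX_ALIGNMENT - 1))

/*
 * Locate the link-layer address that follows the frame header.
 * `hdrlen` is the header length reported by the kernel, not
 * sizeof(struct tpacket_hdr), so the code keeps working if the
 * header grows.
 */
static void *ex_sockaddr_ll_ptr(void *hdr, unsigned int hdrlen)
{
    return (char *)hdr + EX_ALIGN(hdrlen);
}
```

Using the queried length rather than a compile-time `sizeof` is what makes the version 2 layout extensible.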

26 Apr, 2007 1 commit

[AF_PACKET]: Add option to return orig_dev to userspace. · 80feaacb

Committed by Peter P. Waskiewicz Jr on Apr 20, 2007

Add a packet socket option to allow the orig_dev index to be returned
to userspace when passing traffic through a decapsulated device, such
as the bonding driver.

This is very useful for layer 2 traffic being able to report which
physical device actually received the traffic, instead of having the
encapsulating device hide that information.

The new option is called PACKET_ORIGDEV.
Signed-off-by: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Feb, 2007 1 commit

[PACKET]: Add optional checksum computation for recvmsg · 8dc41944

Committed by Herbert Xu on Feb 04, 2007

This patch is needed to make ISC's DHCP server (and probably other
DHCP servers/clients using AF_PACKET) able to serve another
client on the same Xen host.

The problem is that packets between different domains on the same
Xen host only have partial checksums.  Unfortunately this piece of
information is not passed along in AF_PACKET unless you're using
the mmap interface.  Since dhcpd doesn't support packet-mmap, UDP
packets from the same host come out with apparently bogus checksums.

This patch adds a mechanism for AF_PACKET recvmsg(2) to return the
status along with the packet.  It does so by adding a new cmsg that
contains this information along with some other relevant data such
as the original packet length.

I didn't include the time stamp information since there is already
a cmsg for that.

This patch also changes the mmap code to set the CSUMNOTREADY flag
on all packets instead of just outgoing packets on cooked sockets.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
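
A receiver could pick the new control message out of a `recvmsg()` result roughly as shown below. This is a sketch under assumptions: the `ex_` helpers are hypothetical, the Linux headers `<sys/socket.h>` and `<linux/if_packet.h>` are assumed to be present, and the socket is assumed to have had `PACKET_AUXDATA` enabled beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Walk the control messages and return the auxdata payload, or NULL. */
static const struct tpacket_auxdata *ex_find_auxdata(struct msghdr *msg)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == SOL_PACKET &&
            c->cmsg_type == PACKET_AUXDATA &&
            c->cmsg_len >= CMSG_LEN(sizeof(struct tpacket_auxdata)))
            return (const struct tpacket_auxdata *)CMSG_DATA(c);
    }
    return NULL;
}

/* Non-zero when the packet carries only a partial checksum. */
static int ex_csum_not_ready(const struct tpacket_auxdata *aux)
{
    return (aux->tp_status & TP_STATUS_CSUMNOTREADY) != 0;
}
```

A DHCP daemon in the situation described above could skip UDP checksum validation whenever `ex_csum_not_ready()` reports a partial checksum.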

03 Dec, 2006 1 commit

[AF_PACKET]: annotate · 0e11c91e

Committed by Al Viro on Nov 08, 2006

Weirdness: the third argument of socket() is net-endian
here.  Oh, well - it's documented in packet(7).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
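
In practice, the "net-endian" third argument means callers convert the protocol with `htons()` before calling `socket()`. The wrapper and the `EX_ETH_P_ALL` constant below are hypothetical names for this sketch; the value 0x0003 is assumed to match `ETH_P_ALL`.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

#define EX_ETH_P_ALL 0x0003  /* assumed to match ETH_P_ALL */

/* Convert an EtherType to the network-byte-order value socket() expects. */
static int ex_packet_protocol(uint16_t ethertype)
{
    return htons(ethertype);
}
```

A caller would then write something like `socket(AF_PACKET, SOCK_RAW, ex_packet_protocol(EX_ETH_P_ALL))`, which normally requires elevated privileges.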

17 Apr, 2005 1 commit

Linux-2.6.12-rc2 · 1da177e4

Committed by Linus Torvalds on Apr 16, 2005

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
