1. 24 Aug, 2012 (1 commit)
  2. 20 Aug, 2012 (1 commit)
    • E
      af_packet: don't emit packet on orig fanout group · c0de08d0
      Authored by Eric Leblond
      If a packet is emitted on one socket in a group of fanout sockets, it is
      transmitted again and is thus read again on one of the sockets of the
      fanout group. This results in a loop for software that generates packets
      when receiving one.
      This retransmission is not the intended behavior: a fanout group
      must behave like a single socket. The packet should not be
      transmitted on a socket if it originates from a socket belonging
      to the same fanout group.
      
      This patch fixes the issue by changing the transmission check to take the
      fanout group into account.
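      For context, a minimal userspace sketch (socket creation and error handling
      omitted; names illustrative, not taken from the patch) of how two packet
      sockets are joined into one fanout group. With this fix, a packet sent from
      either member is no longer looped back into the group and re-read by the
      other member:

      	#include <sys/socket.h>
      	#include <linux/if_packet.h>

      	/* Join an already-created AF_PACKET socket into fanout group group_id. */
      	static int join_fanout(int fd, int group_id)
      	{
      		/* low 16 bits: group id, high 16 bits: distribution mode */
      		int arg = group_id | (PACKET_FANOUT_HASH << 16);

      		return setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &arg, sizeof(arg));
      	}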
      Reported-by: Aleksandr Kotov <a1k@mail.ru>
      Signed-off-by: Eric Leblond <eric@regit.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c0de08d0
  3. 13 Aug, 2012 (1 commit)
    • D
      af_packet: remove BUG statement in tpacket_destruct_skb · 7f5c3e3a
      Authored by danborkmann@iogearbox.net
      Here's a quote of the comment about the BUG macro from asm-generic/bug.h:
      
       Don't use BUG() or BUG_ON() unless there's really no way out; one
       example might be detecting data structure corruption in the middle
       of an operation that can't be backed out of.  If the (sub)system
       can somehow continue operating, perhaps with reduced functionality,
       it's probably not BUG-worthy.
      
       If you're tempted to BUG(), think again:  is completely giving up
       really the *only* solution?  There are usually better options, where
       users don't need to reboot ASAP and can mostly shut down cleanly.
      
      In our case, the status flag of a ring buffer slot is managed from both sides,
      the kernel space and the user space. This means that even though the kernel
      side might work as expected, user space can screw up and change this flag in
      the window between send(2) being triggered (when the flag is set to
      TP_STATUS_SENDING) and the skb being destructed some time later, and this
      then hits the BUG macro. As David suggested, the best solution is to simply
      remove this statement, since it cannot be used for kernel-internal
      consistency checks. I've tested it and the system still behaves stably in
      this case, so in accordance with the above comment we should rather remove it.
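      A sketch of the resulting destructor logic (simplified, not the literal
      patch): because the slot status is shared with user space, an unexpected
      value is not a kernel invariant violation, so the slot is simply handed back.

      	static void tpacket_destruct_skb_sketch(struct sk_buff *skb)
      	{
      		struct packet_sock *po = pkt_sk(skb->sk);
      		void *ph = skb_shinfo(skb)->destructor_arg;

      		/* previously: BUG_ON(__packet_get_status(po, ph) != TP_STATUS_SENDING); */
      		__packet_set_status(po, ph, TP_STATUS_AVAILABLE);
      	}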
      Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f5c3e3a
  4. 09 Aug, 2012 (1 commit)
  5. 28 Jun, 2012 (1 commit)
  6. 12 Jun, 2012 (1 commit)
  7. 04 Jun, 2012 (1 commit)
    • J
      net: Remove casts to same type · e3192690
      Authored by Joe Perches
      Adding casts of objects to the same type is unnecessary
      and confusing for a human reader.
      
      For example, this cast:
      
      	int y;
      	int *p = (int *)&y;
      
      I used the coccinelle script below to find and remove these
      unnecessary casts. I manually dropped the conversions this
      script produced for casts carrying __force and __user.
      
      @@
      type T;
      T *p;
      @@
      
      -	(T *)p
      +	p
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e3192690
  8. 22 Apr, 2012 (1 commit)
  9. 20 Apr, 2012 (1 commit)
  10. 16 Apr, 2012 (1 commit)
  11. 29 Mar, 2012 (1 commit)
  12. 24 Feb, 2012 (1 commit)
  13. 28 Dec, 2011 (1 commit)
  14. 23 Dec, 2011 (1 commit)
  15. 19 Nov, 2011 (2 commits)
  16. 15 Nov, 2011 (1 commit)
  17. 14 Nov, 2011 (1 commit)
  18. 04 Nov, 2011 (1 commit)
    • O
      af_packet: de-inline some helper functions · eea49cc9
      Authored by Olof Johansson
      This popped some compiler warnings due to mismatched prototypes (functions
      declared without inline but later defined inline). Just remove most manual
      inlines; the compiler should be able to figure out what makes sense to
      inline and what does not. A simplified illustration of the warning pattern
      follows the list below.
      
      net/packet/af_packet.c:252: warning: 'prb_curr_blk_in_use' declared inline after being called
      net/packet/af_packet.c:252: warning: previous declaration of 'prb_curr_blk_in_use' was here
      net/packet/af_packet.c:258: warning: 'prb_queue_frozen' declared inline after being called
      net/packet/af_packet.c:258: warning: previous declaration of 'prb_queue_frozen' was here
      net/packet/af_packet.c:248: warning: 'packet_previous_frame' declared inline after being called
      net/packet/af_packet.c:248: warning: previous declaration of 'packet_previous_frame' was here
      net/packet/af_packet.c:251: warning: 'packet_increment_head' declared inline after being called
      net/packet/af_packet.c:251: warning: previous declaration of 'packet_increment_head' was here
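      A simplified, self-contained illustration of the pattern gcc complains
      about (hypothetical names, not the af_packet.c code):

      	struct ring_state { int frozen; };

      	static int queue_frozen(struct ring_state *rs);	/* forward declaration, no inline */

      	static void caller(struct ring_state *rs)
      	{
      		if (queue_frozen(rs))		/* first call site */
      			return;
      	}

      	/* gcc: 'queue_frozen' declared inline after being called */
      	static inline int queue_frozen(struct ring_state *rs)
      	{
      		return rs->frozen;		/* fix: drop 'inline' and let the compiler decide */
      	}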
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Cc: Chetan Loke <loke.chetan@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eea49cc9
  19. 19 Oct, 2011 (1 commit)
  20. 11 Oct, 2011 (1 commit)
  21. 04 Oct, 2011 (1 commit)
    • W
      make PACKET_STATISTICS getsockopt report consistently between ring and non-ring · 7091fbd8
      Authored by Willem de Bruijn
      This is a minor change.
      
      Up until kernel 2.6.32, getsockopt(fd, SOL_PACKET, PACKET_STATISTICS,
      ...) would return total and dropped packets since its last invocation. The
      introduction of socket queue overflow reporting [1] changed drop
      rate calculation in the normal packet socket path, but not when using a
      packet ring. As a result, the getsockopt now returns different statistics
      depending on the reception method used. With a ring, it still returns the
      count since the last call, as counts are incremented in tpacket_rcv and
      reset in getsockopt. Without a ring, it returns 0 if no drops occurred
      since the last getsockopt and the total drops over the lifespan of
      the socket otherwise. The culprit is this line in packet_rcv, executed
      on a drop:
      
      drop_n_acct:
              po->stats.tp_drops = atomic_inc_return(&sk->sk_drops);
      
      As this shows, the new drop count is taken from the socket drop counter,
      which is not reset at getsockopt. I put together a small example
      that demonstrates the issue [2]. It runs for 10 seconds and overflows
      the queue/ring on every odd second. The reported drop rates are:
      ring: 16, 0, 16, 0, 16, ...
      non-ring: 0, 15, 0, 30, 0, 46, 0, 60, 0 , 74.
      
      Note how the non-ring counts at the even positions increase monotonically.
      Because the getsockopt adds tp_drops to tp_packets, total counts are
      similarly reported cumulatively. Long story short, reinstating the original
      code, as the patch below does, fixes the issue at the cost of additional
      per-packet cycles. Another solution that does not introduce per-packet
      overhead is to keep the current data path, record the value of sk_drops at
      getsockopt() call N in a new field in struct packetsock, and subtract that
      when reporting at call N+1. I'd be happy to code that instead; it's just
      messier.
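      For reference, a minimal userspace sketch of the call whose semantics this
      patch makes consistent again (illustrative, error handling omitted):

      	#include <stdio.h>
      	#include <sys/socket.h>
      	#include <linux/if_packet.h>

      	static void print_stats(int fd)
      	{
      		struct tpacket_stats st;
      		socklen_t len = sizeof(st);

      		if (getsockopt(fd, SOL_PACKET, PACKET_STATISTICS, &st, &len) == 0)
      			printf("packets %u, drops %u (since previous call)\n",
      			       st.tp_packets, st.tp_drops);
      	}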
      
      [1] http://patchwork.ozlabs.org/patch/35665/
      [2] http://kernel.googlecode.com/files/test-packetsock-getstatistics.c
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7091fbd8
  22. 16 Sep, 2011 (1 commit)
    • J
      net: consolidate and fix ethtool_ops->get_settings calling · 4bc71cb9
      Authored by Jiri Pirko
      This patch does several things:
      - introduces __ethtool_get_settings which is called from ethtool code and
        from drivers as well. Put ASSERT_RTNL there.
      - dev_ethtool_get_settings() is replaced by __ethtool_get_settings()
      - changes the calls in drivers so rtnl locking is respected. Previously,
        ->get_settings() was called unlocked in iboe_get_rate(); this fixes it.
        prb_calc_retire_blk_tmo() in af_packet.c had the same problem; it is
        fixed by calling __dev_get_by_index() instead of dev_get_by_index() and
        holding rtnl_lock around both calls (see the sketch after this list).
      - introduces rtnl_lock in bnx2fc_vport_create() and fcoe_vport_create()
        so bnx2fc_if_create() and fcoe_if_create() are called locked as they
        are from other places.
      - use __ethtool_get_settings() in bonding code
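      A sketch of the locking pattern used for the af_packet.c fix mentioned
      above (simplified; variable names illustrative):

      	struct ethtool_cmd ecmd;
      	struct net_device *dev;
      	int err = -ENODEV;

      	rtnl_lock();
      	dev = __dev_get_by_index(net, ifindex);	/* no reference taken, so RTNL must stay held */
      	if (dev)
      		err = __ethtool_get_settings(dev, &ecmd);
      	rtnl_unlock();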
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      
      v2->v3:
      	-removed dev_ethtool_get_settings()
      	-added ASSERT_RTNL into __ethtool_get_settings()
      	-prb_calc_retire_blk_tmo - use __dev_get_by_index() and lock
      	 around it and __ethtool_get_settings() call
      v1->v2:
              add missing export_symbol
      Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> [except FCoE bits]
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4bc71cb9
  23. 27 Aug, 2011 (1 commit)
  24. 25 Aug, 2011 (1 commit)
    • C
      af-packet: TPACKET_V3 flexible buffer implementation. · f6fb8f10
      Authored by Chetan Loke
      1) Blocks can be configured with non-static frame-size.
      2) Read/poll is at a block level (as opposed to packet level).
      3) Added poll timeout to avoid indefinite user-space wait on idle links.
      4) Added user-configurable knobs (a setup sketch follows this list):
         4.1) block::timeout.
         4.2) tpkt_hdr::sk_rxhash.
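      A userspace sketch of configuring a V3 ring with these knobs (values are
      illustrative only):

      	#include <string.h>
      	#include <sys/socket.h>
      	#include <linux/if_packet.h>

      	static int setup_v3_ring(int fd)
      	{
      		int version = TPACKET_V3;
      		struct tpacket_req3 req;

      		memset(&req, 0, sizeof(req));
      		req.tp_block_size = 1 << 22;		/* 4 MiB per block */
      		req.tp_block_nr   = 64;
      		req.tp_frame_size = 2048;		/* only a hint, V3 frames are variable-sized */
      		req.tp_frame_nr   = (req.tp_block_size / req.tp_frame_size) * req.tp_block_nr;
      		req.tp_retire_blk_tov   = 60;		/* block timeout in ms (knob 4.1) */
      		req.tp_feature_req_word = TP_FT_REQ_FILL_RXHASH;	/* knob 4.2 */

      		if (setsockopt(fd, SOL_PACKET, PACKET_VERSION, &version, sizeof(version)) < 0)
      			return -1;
      		return setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req));
      	}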
      
      Changes:
      C1) tpacket_rcv()
          C1.1) packet_current_frame() is replaced by packet_current_rx_frame().
                The bulk of the processing is then moved into the following chain:
                packet_current_rx_frame()
                  __packet_lookup_frame_in_block
                    fill_curr_block()
                    or
                      retire_current_block
                      dispatch_next_block
                    or
                    return NULL (queue is plugged/paused)
      Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6fb8f10
  25. 14 Jul, 2011 (1 commit)
  26. 07 Jul, 2011 (2 commits)
  27. 06 Jul, 2011 (5 commits)
  28. 12 Jun, 2011 (1 commit)
    • J
      virtio_net: introduce VIRTIO_NET_HDR_F_DATA_VALID · 10a8d94a
      Authored by Jason Wang
      There's no need for the guest to validate the checksum if it has already
      been validated by the host NICs. So this patch introduces a new flag,
      VIRTIO_NET_HDR_F_DATA_VALID, which is used to bypass checksum examination
      in the guest. The backend (tap/macvtap) may set this flag when it meets
      skbs with CHECKSUM_UNNECESSARY, to save CPU cycles.

      No feature negotiation is needed, as old drivers simply ignore this flag.

      Iperf shows a 12%-30% performance improvement for UDP traffic. For TCP,
      there is no difference when GRO is on, as GRO produces skbs with partial
      checksums. But when GRO is disabled, netperf measures improvements of 20%
      or even higher.
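      The shape of the backend-side change might look like this sketch (the
      surrounding variable names are illustrative):

      	/* While filling in the virtio_net header for a received skb: */
      	if (skb->ip_summed == CHECKSUM_UNNECESSARY)
      		vnet_hdr->flags |= VIRTIO_NET_HDR_F_DATA_VALID;
      	/* A guest that understands the bit skips checksum verification;
      	 * older guest drivers simply ignore it. */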
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      10a8d94a
  29. 07 Jun, 2011 (1 commit)
  30. 06 Jun, 2011 (2 commits)
  31. 02 Jun, 2011 (1 commit)
  32. 24 May, 2011 (1 commit)
    • D
      net: convert %p usage to %pK · 71338aa7
      Authored by Dan Rosenberg
      The %pK format specifier is designed to hide exposed kernel pointers,
      specifically via /proc interfaces.  Exposing these pointers provides an
      easy target for kernel write vulnerabilities, since they reveal the
      locations of writable structures containing easily triggerable function
      pointers.  The behavior of %pK depends on the kptr_restrict sysctl.
      
      If kptr_restrict is set to 0, no deviation from the standard %p behavior
      occurs. If kptr_restrict is set to 1 (the default), kernel pointers printed
      with %pK are shown as 0's unless the current user (intended to be a reader
      via seq_printf(), etc.) has CAP_SYSLOG (currently in the LSM tree). If
      kptr_restrict is set to 2, kernel pointers using %pK are printed as 0's
      regardless of privileges. Replacing with 0's was chosen over the default
      "(null)", which cannot be parsed by userland %p, which expects "(nil)".
      
      The supporting code for kptr_restrict and %pK are currently in the -mm
      tree.  This patch converts users of %p in net/ to %pK.  Cases of printing
      pointers to the syslog are not covered, since this would eliminate useful
      information for postmortem debugging and the reading of the syslog is
      already optionally protected by the dmesg_restrict sysctl.
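      An illustrative example of the conversion (not a specific hunk from the
      patch): a /proc line that used to print a raw socket pointer,

      	seq_printf(seq, "%p %d\n", sk, sk->sk_protocol);

      becomes the following, so kptr_restrict can mask the address for
      unprivileged readers:

      	seq_printf(seq, "%pK %d\n", sk, sk->sk_protocol);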
      Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Thomas Graf <tgraf@infradead.org>
      Cc: Eugene Teo <eugeneteo@kernel.org>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Eric Paris <eparis@parisplace.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      71338aa7
  33. 28 Apr, 2011 (1 commit)
    • E
      net: filter: Just In Time compiler for x86-64 · 0a14842f
      Authored by Eric Dumazet
      In order to speed up packet filtering, here is an implementation of a
      JIT compiler for x86_64.
      
      It is disabled by default, and must be enabled by the admin.
      
      echo 1 >/proc/sys/net/core/bpf_jit_enable
      
      It uses module_alloc() and module_free() to get memory in the 2GB kernel
      text range, since we call helper functions from the generated code.
      
      EAX : BPF A accumulator
      EBX : BPF X accumulator
      RDI : pointer to skb   (first argument given to JIT function)
      RBP : frame pointer (even if CONFIG_FRAME_POINTER=n)
      r9d : skb->len - skb->data_len (headlen)
      r8  : skb->data
      
      To get a trace of generated code, use :
      
      echo 2 >/proc/sys/net/core/bpf_jit_enable
      
      Example of generated code :
      
      # tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24
      
      flen=18 proglen=147 pass=3 image=ffffffffa00b5000
      JIT code: ffffffffa00b5000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 60
      JIT code: ffffffffa00b5010: 44 2b 4f 64 4c 8b 87 b8 00 00 00 be 0c 00 00 00
      JIT code: ffffffffa00b5020: e8 24 7b f7 e0 3d 00 08 00 00 75 28 be 1a 00 00
      JIT code: ffffffffa00b5030: 00 e8 fe 7a f7 e0 24 00 3d 00 14 a8 c0 74 49 be
      JIT code: ffffffffa00b5040: 1e 00 00 00 e8 eb 7a f7 e0 24 00 3d 00 14 a8 c0
      JIT code: ffffffffa00b5050: 74 36 eb 3b 3d 06 08 00 00 74 07 3d 35 80 00 00
      JIT code: ffffffffa00b5060: 75 2d be 1c 00 00 00 e8 c8 7a f7 e0 24 00 3d 00
      JIT code: ffffffffa00b5070: 14 a8 c0 74 13 be 26 00 00 00 e8 b5 7a f7 e0 24
      JIT code: ffffffffa00b5080: 00 3d 00 14 a8 c0 75 07 b8 ff ff 00 00 eb 02 31
      JIT code: ffffffffa00b5090: c0 c9 c3
      
      The BPF program is 144 bytes long, so the native program is almost the same size ;)
      
      (000) ldh      [12]
      (001) jeq      #0x800           jt 2    jf 8
      (002) ld       [26]
      (003) and      #0xffffff00
      (004) jeq      #0xc0a81400      jt 16   jf 5
      (005) ld       [30]
      (006) and      #0xffffff00
      (007) jeq      #0xc0a81400      jt 16   jf 17
      (008) jeq      #0x806           jt 10   jf 9
      (009) jeq      #0x8035          jt 10   jf 17
      (010) ld       [28]
      (011) and      #0xffffff00
      (012) jeq      #0xc0a81400      jt 16   jf 13
      (013) ld       [38]
      (014) and      #0xffffff00
      (015) jeq      #0xc0a81400      jt 16   jf 17
      (016) ret      #65535
      (017) ret      #0
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a14842f