提交 · ce06b03e60fc19c680d1bf873e779bf11c2fc518 · openanolis / cloud-kernel

06 7月, 2011 1 次提交
- D
  packet: Add helpers to register/unregister ->prot_hook · ce06b03e
  由 David S. Miller 提交于 7月 04, 2011
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  ce06b03e
12 6月, 2011 1 次提交

virtio_net: introduce VIRTIO_NET_HDR_F_DATA_VALID · 10a8d94a

由 Jason Wang 提交于 6月 10, 2011

There's no need for the guest to validate the checksum if it have been
validated by host nics. So this patch introduces a new flag -
VIRTIO_NET_HDR_F_DATA_VALID which is used to bypass the checksum
examing in guest. The backend (tap/macvtap) may set this flag when
met skbs with CHECKSUM_UNNECESSARY to save cpu utilization.

No feature negotiation is needed as old driver just ignore this flag.

Iperf shows 12%-30% performance improvement for UDP traffic. For TCP,
when gro is on no difference as it produces skb with partial
checksum. But when gro is disabled, 20% or even higher improvement
could be measured by netperf.
Signed-off-by: NJason Wang <jasowang@redhat.com>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

10a8d94a

07 6月, 2011 1 次提交

af_packet: prevent information leak · 13fcb7bd

由 Eric Dumazet 提交于 6月 06, 2011

In 2.6.27, commit 393e52e3 (packet: deliver VLAN TCI to userspace)
added a small information leak.

Add padding field and make sure its zeroed before copy to user.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

13fcb7bd

06 6月, 2011 2 次提交

af-packet: Use existing netdev reference for bound sockets. · 827d9780

由 Ben Greear 提交于 6月 01, 2011

This saves a network device lookup on each packet transmitted,
for sockets that are bound to a network device.
Signed-off-by: NBen Greear <greearb@candelatech.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

827d9780

af-packet: Hold reference to bound network devices. · 160ff18a

由 Ben Greear 提交于 6月 01, 2011

Old code was probably safe, but with this change we
can actually use the netdev object, not just compare
the pointer values.
Signed-off-by: NBen Greear <greearb@candelatech.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

160ff18a

02 6月, 2011 1 次提交

af-packet: Add flag to distinguish VID 0 from no-vlan. · a3bcc23e

由 Ben Greear 提交于 6月 01, 2011

Currently, user-space cannot determine if a 0 tcp_vlan_tci
means there is no VLAN tag or the VLAN ID was zero.

Add flag to make this explicit.  User-space can check for
TP_STATUS_VLAN_VALID || tp_vlan_tci > 0, which will be backwards
compatible. Older could would have just checked for tp_vlan_tci,
so it will work no worse than before.
Signed-off-by: NBen Greear <greearb@candelatech.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a3bcc23e

24 5月, 2011 1 次提交

net: convert %p usage to %pK · 71338aa7

由 Dan Rosenberg 提交于 5月 23, 2011

The %pK format specifier is designed to hide exposed kernel pointers,
specifically via /proc interfaces.  Exposing these pointers provides an
easy target for kernel write vulnerabilities, since they reveal the
locations of writable structures containing easily triggerable function
pointers.  The behavior of %pK depends on the kptr_restrict sysctl.

If kptr_restrict is set to 0, no deviation from the standard %p behavior
occurs.  If kptr_restrict is set to 1, the default, if the current user
(intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
(currently in the LSM tree), kernel pointers using %pK are printed as 0's.
 If kptr_restrict is set to 2, kernel pointers using %pK are printed as
0's regardless of privileges.  Replacing with 0's was chosen over the
default "(null)", which cannot be parsed by userland %p, which expects
"(nil)".

The supporting code for kptr_restrict and %pK are currently in the -mm
tree.  This patch converts users of %p in net/ to %pK.  Cases of printing
pointers to the syslog are not covered, since this would eliminate useful
information for postmortem debugging and the reading of the syslog is
already optionally protected by the dmesg_restrict sysctl.
Signed-off-by: NDan Rosenberg <drosenberg@vsecurity.com>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Thomas Graf <tgraf@infradead.org>
Cc: Eugene Teo <eugeneteo@kernel.org>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71338aa7

28 4月, 2011 1 次提交

net: filter: Just In Time compiler for x86-64 · 0a14842f

由 Eric Dumazet 提交于 4月 20, 2011

In order to speedup packet filtering, here is an implementation of a
JIT compiler for x86_64

It is disabled by default, and must be enabled by the admin.

echo 1 >/proc/sys/net/core/bpf_jit_enable

It uses module_alloc() and module_free() to get memory in the 2GB text
kernel range since we call helpers functions from the generated code.

EAX : BPF A accumulator
EBX : BPF X accumulator
RDI : pointer to skb   (first argument given to JIT function)
RBP : frame pointer (even if CONFIG_FRAME_POINTER=n)
r9d : skb->len - skb->data_len (headlen)
r8  : skb->data

To get a trace of generated code, use :

echo 2 >/proc/sys/net/core/bpf_jit_enable

Example of generated code :

# tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24

flen=18 proglen=147 pass=3 image=ffffffffa00b5000
JIT code: ffffffffa00b5000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 60
JIT code: ffffffffa00b5010: 44 2b 4f 64 4c 8b 87 b8 00 00 00 be 0c 00 00 00
JIT code: ffffffffa00b5020: e8 24 7b f7 e0 3d 00 08 00 00 75 28 be 1a 00 00
JIT code: ffffffffa00b5030: 00 e8 fe 7a f7 e0 24 00 3d 00 14 a8 c0 74 49 be
JIT code: ffffffffa00b5040: 1e 00 00 00 e8 eb 7a f7 e0 24 00 3d 00 14 a8 c0
JIT code: ffffffffa00b5050: 74 36 eb 3b 3d 06 08 00 00 74 07 3d 35 80 00 00
JIT code: ffffffffa00b5060: 75 2d be 1c 00 00 00 e8 c8 7a f7 e0 24 00 3d 00
JIT code: ffffffffa00b5070: 14 a8 c0 74 13 be 26 00 00 00 e8 b5 7a f7 e0 24
JIT code: ffffffffa00b5080: 00 3d 00 14 a8 c0 75 07 b8 ff ff 00 00 eb 02 31
JIT code: ffffffffa00b5090: c0 c9 c3

BPF program is 144 bytes long, so native program is almost same size ;)

(000) ldh      [12]
(001) jeq      #0x800           jt 2    jf 8
(002) ld       [26]
(003) and      #0xffffff00
(004) jeq      #0xc0a81400      jt 16   jf 5
(005) ld       [30]
(006) and      #0xffffff00
(007) jeq      #0xc0a81400      jt 16   jf 17
(008) jeq      #0x806           jt 10   jf 9
(009) jeq      #0x8035          jt 10   jf 17
(010) ld       [28]
(011) and      #0xffffff00
(012) jeq      #0xc0a81400      jt 16   jf 13
(013) ld       [38]
(014) and      #0xffffff00
(015) jeq      #0xc0a81400      jt 16   jf 17
(016) ret      #65535
(017) ret      #0
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0a14842f

08 3月, 2011 1 次提交
- H
  af_packet: struct socket declared/assigned but unused · e143038f
  由 Hagen Paul Pfeifer 提交于 3月 04, 2011
```
Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  e143038f
12 2月, 2011 1 次提交

network: Allow af_packet to transmit +4 bytes for VLAN packets. · 57f89bfa

由 Ben Greear 提交于 2月 11, 2011

This allows user-space to send a '1500' MTU VLAN packet on a
1500 MTU ethernet frame.  The extra 4 bytes of a VLAN header is
not usually charged against the MTU when other parts of the
network stack is transmitting vlans...
Signed-off-by: NBen Greear <greearb@candelatech.com>
Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

57f89bfa

20 1月, 2011 1 次提交

net: cleanup unused macros in net directory · 441c793a

由 Shan Wei 提交于 1月 13, 2011

Clean up some unused macros in net/*.
1. be left for code change. e.g. PGV_FROM_VMALLOC, PGV_FROM_VMALLOC, KMEM_SAFETYZONE.
2. never be used since introduced to kernel.
   e.g. P9_RDMA_MAX_SGE, UTIL_CTRL_PKT_SIZE.
Signed-off-by: NShan Wei <shanwei@cn.fujitsu.com>
Acked-by: NSjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

441c793a

19 1月, 2011 1 次提交

net: filter: dont block softirqs in sk_run_filter() · 80f8f102

由 Eric Dumazet 提交于 1月 18, 2011

Packet filter (BPF) doesnt need to disable softirqs, being fully
re-entrant and lock-less.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

80f8f102

17 12月, 2010 1 次提交

net: Use skb_checksum_start_offset() · 55508d60

由 Michał Mirosław 提交于 12月 14, 2010

Replace skb->csum_start - skb_headroom(skb) with skb_checksum_start_offset().

Note for usb/smsc95xx: skb->data - skb->head == skb_headroom(skb).
Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

55508d60

11 12月, 2010 1 次提交
- C
  af_packet: use swap() instead of the open coded macro XC() · c053fd96
  由 Changli Gao 提交于 12月 10, 2010
```
Signed-off-by: NChangli Gao <xiaosuo@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  c053fd96
09 12月, 2010 3 次提交

af_packet: fix freeing pg_vec twice on error path · 920b8d91

由 Changli Gao 提交于 12月 07, 2010

It is introduced in:
        commit 0e3125c7
        Author: Neil Horman <nhorman@tuxdriver.com>
        Date:   Tue Nov 16 10:26:47 2010 -0800

        packet: Enhance AF_PACKET implementation to not require high order contiguous memory allocation (v4)
Signed-off-by: NChangli Gao <xiaosuo@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

920b8d91

af_packet: eliminate pgv_to_page on some arches · f6dafa95

由 Changli Gao 提交于 12月 07, 2010

Some arches don't need flush_dcache_page(), and don't implement it, so
we can eliminate pgv_to_page() calls on those arches.
Signed-off-by: NChangli Gao <xiaosuo@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f6dafa95

filter: constify sk_run_filter() · 62ab0812

由 Eric Dumazet 提交于 12月 06, 2010

sk_run_filter() doesnt write on skb, change its prototype to reflect
this.

Fix two af_packet comments.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62ab0812

07 12月, 2010 2 次提交

af_packet: remove pgv.flags · c56b4d90

由 Changli Gao 提交于 12月 01, 2010

As we can check if an address is vmalloc address with is_vmalloc_addr(),
we remove pgv.flags. Then we may get more pg_vecs.
Signed-off-by: NChangli Gao <xiaosuo@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c56b4d90

af_packet: use vmalloc_to_page() instead for the addresss returned by vmalloc() · 0af55bb5

由 Changli Gao 提交于 12月 01, 2010

The following commit causes the pgv->buffer may point to the memory
returned by vmalloc(). And we can't use virt_to_page() for the vmalloc
address.

This patch introduces a new inline function pgv_to_page(), which calls
vmalloc_to_page() for the vmalloc address, and virt_to_page() for the
__get_free_pages address.

We used to increase page pointer to get the next page at the next page
address, after Neil's patch, it is wrong, as the physical address may
be not continuous. This patch also fixes this issue.

    commit 0e3125c7
    Author: Neil Horman <nhorman@tuxdriver.com>
    Date:   Tue Nov 16 10:26:47 2010 -0800

    packet: Enhance AF_PACKET implementation to not require high order contiguous memory allocation (v4)
Signed-off-by: NChangli Gao <xiaosuo@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0af55bb5

22 11月, 2010 1 次提交

packet: use vzalloc() · bbce5a59

由 Eric Dumazet 提交于 11月 20, 2010

alloc_one_pg_vec_page() is supposed to return zeroed memory, so use
vzalloc() instead of vmalloc()
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bbce5a59

20 11月, 2010 1 次提交

filter: optimize sk_run_filter · 93aaae2e

由 Eric Dumazet 提交于 11月 19, 2010

Remove pc variable to avoid arithmetic to compute fentry at each filter
instruction. Jumps directly manipulate fentry pointer.

As the last instruction of filter[] is guaranteed to be a RETURN, and
all jumps are before the last instruction, we dont need to check filter
bounds (number of instructions in filter array) at each iteration, so we
remove it from sk_run_filter() params.

On x86_32 remove f_k var introduced in commit 57fe93b3
(filter: make sure filters dont read uninitialized memory)

Note : We could use a CONFIG_ARCH_HAS_{FEW|MANY}_REGISTERS in order to
avoid too many ifdefs in this code.

This helps compiler to use cpu registers to hold fentry and A
accumulator.

On x86_32, this saves 401 bytes, and more important, sk_run_filter()
runs much faster because less register pressure (One less conditional
branch per BPF instruction)

# size net/core/filter.o net/core/filter_pre.o
   text    data     bss     dec     hex filename
   2948       0       0    2948     b84 net/core/filter.o
   3349       0       0    3349     d15 net/core/filter_pre.o

on x86_64 :
# size net/core/filter.o net/core/filter_pre.o
   text    data     bss     dec     hex filename
   5173       0       0    5173    1435 net/core/filter.o
   5224       0       0    5224    1468 net/core/filter_pre.o
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NChangli Gao <xiaosuo@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

93aaae2e

17 11月, 2010 1 次提交

packet: Enhance AF_PACKET implementation to not require high order contiguous... · 0e3125c7

由 Neil Horman 提交于 11月 16, 2010

packet: Enhance AF_PACKET implementation to not require high order contiguous memory allocation (v4)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Version 4 of this patch.

Change notes:
1) Removed extra memset. Didn't think kcalloc added a GFP_ZERO the way kzalloc did :)

Summary:
It was shown to me recently that systems under high load were driven very deep
into swap when tcpdump was run. The reason this happened was because the
AF_PACKET protocol has a SET_RINGBUFFER socket option that allows the user space
application to specify how many entries an AF_PACKET socket will have and how
large each entry will be. It seems the default setting for tcpdump is to set
the ring buffer to 32 entries of 64 Kb each, which implies 32 order 5
allocation. Thats difficult under good circumstances, and horrid under memory
pressure.

I thought it would be good to make that a bit more usable. I was going to do a
simple conversion of the ring buffer from contigous pages to iovecs, but
unfortunately, the metadata which AF_PACKET places in these buffers can easily
span a page boundary, and given that these buffers get mapped into user space,
and the data layout doesn't easily allow for a change to padding between frames
to avoid that, a simple iovec change is just going to break user space ABI
consistency.

So I've done this, I've added a three tiered mechanism to the af_packet set_ring
socket option. It attempts to allocate memory in the following order:

1) Using __get_free_pages with GFP_NORETRY set, so as to fail quickly without
digging into swap

2) Using vmalloc

3) Using __get_free_pages with GFP_NORETRY clear, causing us to try as hard as
needed to get the memory

The effect is that we don't disturb the system as much when we're under load,
while still being able to conduct tcpdumps effectively.

Tested successfully by me.
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NMaciej Żenczykowski <zenczykowski@gmail.com>
Reported-by: NMaciej Żenczykowski <zenczykowski@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0e3125c7

13 11月, 2010 1 次提交

net: Fix header size check for GSO case in recvmsg (af_packet) · 1f18b717

由 Mariusz Kozlowski 提交于 11月 08, 2010

Parameter 'len' is size_t type so it will never get negative.
Signed-off-by: NMariusz Kozlowski <mk@lab.zgora.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1f18b717

11 11月, 2010 1 次提交

net: packet: fix information leak to userland · 67286640

由 Vasiliy Kulikov 提交于 11月 10, 2010

packet_getname_spkt() doesn't initialize all members of sa_data field of
sockaddr struct if strlen(dev->name) < 13.  This structure is then copied
to userland.  It leads to leaking of contents of kernel stack memory.
We have to fully fill sa_data with strncpy() instead of strlcpy().

The same with packet_getname(): it doesn't initialize sll_pkttype field of
sockaddr_ll.  Set it to zero.
Signed-off-by: NVasiliy Kulikov <segooon@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

67286640

19 8月, 2010 1 次提交

net: simplify flags for tx timestamping · 2244d07b

由 Oliver Hartkopp 提交于 8月 17, 2010

This patch removes the abstraction introduced by the union skb_shared_tx in
the shared skb data.

The access of the different union elements at several places led to some
confusion about accessing the shared tx_flags e.g. in skb_orphan_try().

http://marc.info/?l=linux-netdev&m=128084897415886&w=2Signed-off-by: NOliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2244d07b

02 6月, 2010 1 次提交

packet_mmap: expose hw packet timestamps to network packet capture utilities · 614f60fa

由 Scott McMillan 提交于 6月 02, 2010

This patch adds a setting, PACKET_TIMESTAMP, to specify the packet
timestamp source that is exported to capture utilities like tcpdump by
packet_mmap.

PACKET_TIMESTAMP accepts the same integer bit field as
SO_TIMESTAMPING.  However, only the SOF_TIMESTAMPING_SYS_HARDWARE and
SOF_TIMESTAMPING_RAW_HARDWARE values are currently recognized by
PACKET_TIMESTAMP.  SOF_TIMESTAMPING_SYS_HARDWARE takes precedence over
SOF_TIMESTAMPING_RAW_HARDWARE if both bits are set.

If PACKET_TIMESTAMP is not set, a software timestamp generated inside
the networking stack is used (the behavior before this setting was
added).
Signed-off-by: NScott McMillan <scott.a.mcmillan@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

614f60fa

17 4月, 2010 1 次提交

packet : remove init_net restriction · 1c4f0197

由 Daniel Lezcano 提交于 4月 14, 2010

The af_packet protocol is used by Perl to do ioctls as reported by
Stephane Riviere:

"Net::RawIP relies on SIOCGIFADDR et SIOCGIFHWADDR to get the IP and MAC
addresses of the network interface."

But in a new network namespace these ioctl fail because it is disabled for
a namespace different from the init_net_ns.

These two lines should not be there as af_inet and af_packet are
namespace aware since a long time now. I suppose we forget to remove these
lines because we sent the af_packet first, before af_inet was supported.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@free.fr>
Reported-by: NStephane Riviere <stephane.riviere@regis-dgac.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1c4f0197

13 4月, 2010 1 次提交

packet: support for TX time stamps on RAW sockets · ed85b565

由 Richard Cochran 提交于 4月 07, 2010

Enable the SO_TIMESTAMPING socket infrastructure for raw packet sockets.
We introduce PACKET_TX_TIMESTAMP for the control message cmsg_type.

Similar support for UDP and CAN sockets was added in commit
51f31cabSigned-off-by: NRichard Cochran <richard.cochran@omicron.at>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ed85b565

04 4月, 2010 2 次提交

net: convert multicast list to list_head · 22bedad3

由 Jiri Pirko 提交于 4月 01, 2010

Converts the list and the core manipulating with it to be the same as uc_list.

+uses two functions for adding/removing mc address (normal and "global"
 variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
 manipulation with lists on a sandbox (used in bonding and 80211 drivers)
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

22bedad3

net: move address list functions to a separate file · a748ee24

由 Jiri Pirko 提交于 4月 01, 2010

+little renaming of unicast functions to be smooth with multicast ones
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a748ee24

30 3月, 2010 1 次提交

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

由 Tejun Heo 提交于 3月 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

03 3月, 2010 1 次提交

af_packet: move strict addr_len check right before dev_[mc/unicast]_[add/del] · 1162563f

由 Jiri Pirko 提交于 3月 02, 2010

My previous patch 914c8ad2 incorrectly changed
the length check in packet_mc_add to be more strict. The problem is that
userspace is not filling this field (and it stays zeroed) in case of setting
PACKET_MR_PROMISC or PACKET_MR_ALLMULTI. So move the strict check to the point
in path where the addr_len must be set correctly.
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Reported-by: NPavel Roskin <proski@gnu.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1162563f

26 2月, 2010 1 次提交

af_packet: do not accept mc address smaller then dev->addr_len in packet_mc_add() · 914c8ad2

由 Jiri Pirko 提交于 2月 24, 2010

There is no point of accepting an address of smaller length than dev->addr_len
here. Therefore change this for stonger check.
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

914c8ad2

25 2月, 2010 1 次提交

net: Add checking to rcu_dereference() primitives · a898def2

由 Paul E. McKenney 提交于 2月 22, 2010

Update rcu_dereference() primitives to use new lockdep-based
checking. The rcu_dereference() in __in6_dev_get() may be
protected either by rcu_read_lock() or RTNL, per Eric Dumazet.
The rcu_dereference() in __sk_free() is protected by the fact
that it is never reached if an update could change it.  Check
for this by using rcu_dereference_check() to verify that the
struct sock's ->sk_wmem_alloc counter is zero.
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <1266887105-1528-5-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

a898def2

23 2月, 2010 1 次提交

packet: convert socket list to RCU (v3) · 808f5114

由 stephen hemminger 提交于 2月 22, 2010

Convert AF_PACKET to use RCU, eliminating one more reader/writer lock.

There is no need for a real sk_del_node_init_rcu(), because sk_del_node_init
is doing the equivalent thing to hlst_del_init_rcu already; but added
some comments to try and make that obvious.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

808f5114

11 2月, 2010 1 次提交

net: packet: use seq_hlist_foo() helpers · b7ceabd9

由 Li Zefan 提交于 2月 08, 2010

Simplify seq_file code.
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b7ceabd9

06 2月, 2010 1 次提交

packet: Kill CONFIG_PACKET_MMAP. · 889b8f96

由 David S. Miller 提交于 2月 05, 2010

Early on this was an experimental facility that few
people other than Alexey Kuznetsov played with.

Now it's a pretty fundamental thing and as people add
more features to AF_PACKET sockets this config options
creates ifdef spaghetti.

So kill it off.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

889b8f96

05 2月, 2010 1 次提交

packet: Add GSO/csum offload support. · bfd5f4a3

由 Sridhar Samudrala 提交于 2月 04, 2010

This patch adds GSO/checksum offload to af_packet sockets using
virtio_net_hdr. Based on Rusty's patch to add this support to tun.
It allows GSO/checksum offload to be enabled when using raw socket
backend with virtio_net.
Adds PACKET_VNET_HDR socket option to prepend virtio_net_hdr in the
receive path and process/skip virtio_net_hdr in the send path. This
option is only allowed with SOCK_RAW sockets attached to ethernet
type devices.

v2 updates
----------
Michael's Comments
- Perform length check in packet_snd() when GSO is off even when
  vnet_hdr is present.
- Check for SKB_GSO_FCOE type and return -EINVAL
- don't allow tx/rx ring when vnet_hdr is enabled.
Herbert's Comments
- Removed ethernet specific code.
- protocol value is assumed to be passed in by the caller.
Signed-off-by: NSridhar Samudrala <sri@us.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bfd5f4a3

18 1月, 2010 1 次提交

net: spread __net_init, __net_exit · 2c8c1e72

由 Alexey Dobriyan 提交于 1月 17, 2010

__net_init/__net_exit are apparently not going away, so use them
to full extent.

In some cases __net_init was removed, because it was called from
__net_exit code.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c8c1e72

12 1月, 2010 1 次提交

af_packet: Don't use skb after dev_queue_xmit() · eb70df13

由 Jarek Poplawski 提交于 1月 10, 2010

tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
which is illegal.

With debugging by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: NMichael Breuer <mbreuer@majjas.com>
With help from: David S. Miller <davem@davemloft.net>
Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
Tested-by: Michael Breuer<mbreuer@majjas.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eb70df13

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功