Commits · 0ba40af963f01b557a4d7a0a6c550a51b0fb8d34 · openanolis / cloud-kernel

09 Apr, 2016 3 commits

nfp: move link state interrupt request/free calls · 0ba40af9

Committed by Jakub Kicinski on Apr 07, 2016

We need to be able to disable the link state interrupt when
the device is brought down.  We used to just free the IRQ
at the beginning of .ndo_stop().  As we now move towards
more ordered .ndo_open()/.ndo_stop() paths LSC allocation
should be placed in the "allocate resource" section.

Since the IRQ can't be freed early in .ndo_stop(), it is
disabled instead.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ba40af9

nfp: correct RX buffer length calculation · ff1b68ab

Committed by Jakub Kicinski on Apr 07, 2016

When calculating the RX buffer length we need to account
for up to 2 VLAN tags.  Rounding up to 1k is a relic of
a distant past and can be removed.  While at it also remove
a trivial print statement.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff1b68ab
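
The length calculation described in this commit can be sketched in userspace C. This is a minimal illustration, not the driver's code: `rx_buf_len()` is a hypothetical helper, and any extra headroom the hardware may prepend is left out. Only the Ethernet header and VLAN tag sizes are standard values.

```c
#include <assert.h>
#include <limits.h>

/* Standard Ethernet framing sizes (same values as the kernel's
 * ETH_HLEN and VLAN_HLEN constants). */
#define ETH_HLEN   14u  /* destination MAC + source MAC + EtherType */
#define VLAN_HLEN   4u  /* one 802.1Q tag */

/* Hypothetical helper: bytes needed to receive one frame of the given MTU,
 * allowing for up to two stacked VLAN tags. No rounding up to 1k is done. */
static unsigned int rx_buf_len(unsigned int mtu)
{
    /* Guard against wrap-around for absurdly large MTU values. */
    assert(mtu <= UINT_MAX - ETH_HLEN - 2u * VLAN_HLEN);
    return ETH_HLEN + 2u * VLAN_HLEN + mtu;
}
```

For a 1500-byte MTU this gives 14 + 8 + 1500 = 1522 bytes, where rounding up to 1k would have reserved 2048.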

Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 70f767d3

Committed by David S. Miller on Apr 08, 2016

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2016-04-07

This series contains updates to ixgbe and ixgbevf.

This entire series (except for one patch from Alex) comes from Mark and
is mainly to add support for our new MAC (x550em_a).

So let's get Alex's patch out of the way first before we cover Mark's
many changes.  Alex enables bulk free in transmit cleanup for ixgbe
and ixgbevf, as he has done for all of our other drivers.

First, Mark does some house cleaning by removing register definitions
that were not being used.  Then to avoid casting lan_id and func fields, just
make them u8 since they only hold small values anyways.  Found and
fixed an issue where on read operations it could be possible to
modify locations beyond the length passed in, so change the check
to round up in the same way.  Cleaned up the interface for issuing
firmware commands to use a void * instead of a u32 * which eliminates
a number of casts.  Added support for the new MAC and provided method
pointers and use them to access IOSF-attached devices, since the
new MAC will also need a new access method.  Added support for SFPs
with an external retimer and for an SGMII backplane interface.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

70f767d3

08 Apr, 2016 37 commits

Merge branch 'bpf-tracepoints' · f8711655

Committed by David S. Miller on Apr 07, 2016

Alexei Starovoitov says:

====================
allow bpf attach to tracepoints

Hi Steven, Peter,

v1->v2: addressed Peter's comments:
- fixed wording in patch 1, added ack
- refactored 2nd patch into 3:
2/10 remove unused __perf_addr macro which frees up
an argument in perf_trace_buf_submit
3/10 split perf_trace_buf_prepare into alloc and update parts, so that bpf
programs don't have to pay performance penalty for update of struct trace_entry
which is not going to be accessed by bpf
4/10 actual addition of bpf filter to perf tracepoint handler is now trivial
and bpf prog can be used as proper filter of tracepoints

v1 cover:
last time we discussed bpf+tracepoints it was a year ago [1] and the reason
we didn't proceed with that approach was that bpf would make arguments
arg1, arg2 to trace_xx(arg1, arg2) call to be exposed to bpf program
and that was considered unnecessary extension of abi. Back then I wanted
to avoid the cost of buffer alloc and field assign part in all
of the tracepoints, but looks like when optimized the cost is acceptable.
So this new approach doesn't expose any new abi to bpf program.
The program is looking at tracepoint fields after they were copied
by perf_trace_xx() and described in /sys/kernel/debug/tracing/events/xxx/format
We made a tool [2] that takes arguments from /sys/.../format and works as:
$ tplist.py -v random:urandom_read
    int got_bits;
    int pool_left;
    int input_left;
Then these fields can be copy-pasted into bpf program like:
struct urandom_read {
    __u64 hidden_pad;
    int got_bits;
    int pool_left;
    int input_left;
};
and the program can use it:
SEC("tracepoint/random/urandom_read")
int bpf_prog(struct urandom_read *ctx)
{
    return ctx->pool_left > 0 ? 1 : 0;
}
This way the program can access tracepoint fields faster than
equivalent bpf+kprobe program, which is the main goal of these patches.

Patch 1-4 are simple changes in perf core side, please review.
I'd like to take the whole set via net-next tree, since the rest of
the patches might conflict with other bpf work going on in net-next
and we want to avoid cross-tree merge conflicts.
Alternatively we can put patches 1-4 into both tip and net-next.

Patch 9 is an example of access to tracepoint fields from bpf prog.
Patch 10 is a micro benchmark for bpf+kprobe vs bpf+tracepoint.

Note that for actual tracing tools the user doesn't need to
run tplist.py and copy-paste fields manually. The tools do it
automatically. Like argdist tool [3] can be used as:
$ argdist -H 't:block:block_rq_complete():u32:nr_sector'
where 'nr_sector' is name of tracepoint field taken from
/sys/kernel/debug/tracing/events/block/block_rq_complete/format
and appropriate bpf program is generated on the fly.

[1] http://thread.gmane.org/gmane.linux.kernel.api/8127/focus=8165
[2] https://github.com/iovisor/bcc/blob/master/tools/tplist.py
[3] https://github.com/iovisor/bcc/blob/master/tools/argdist.py
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

f8711655

samples/bpf: add tracepoint vs kprobe performance tests · e3edfdec

Committed by Alexei Starovoitov on Apr 06, 2016

the first microbenchmark does
fd=open("/proc/self/comm");
for() {
  write(fd, "test");
}
and on 4 cpus in parallel:
                                      writes per sec
base (no tracepoints, no kprobes)         930k
with kprobe at __set_task_comm()          420k
with tracepoint at task:task_rename       730k

For kprobe, the full bpf program manually fetches oldcomm and newcomm via bpf_probe_read.
For tracepoint, the bpf program does nothing, since the arguments are copied by the tracepoint.

2nd microbenchmark does:
fd=open("/dev/urandom");
for() {
  read(fd, buf);
}
and on 4 cpus in parallel:
                                       reads per sec
base (no tracepoints, no kprobes)         300k
with kprobe at urandom_read()             279k
with tracepoint at random:urandom_read    290k

bpf progs attached to kprobe and tracepoint are noop.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3edfdec
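
The second microbenchmark above can be approximated with a small userspace loop. This is a rough single-threaded sketch, not the test from samples/bpf: the function name `urandom_reads_per_sec()` and the 8-byte read size are assumptions, and the commit's figures come from four CPUs running in parallel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Read from /dev/urandom `iterations` times and return the measured
 * reads per second, or a negative value on failure. */
static double urandom_reads_per_sec(long iterations)
{
    char buf[8];                 /* read size is an assumption */
    struct timespec start, end;
    double secs;
    long i;
    int fd;

    assert(iterations > 0);
    fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++) {
        if (read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    secs = (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return secs > 0.0 ? (double)iterations / secs : -1.0;
}
```

Running the same loop with and without a kprobe or tracepoint program attached would give a comparison in the spirit of the table above, although absolute numbers depend heavily on the machine.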

samples/bpf: tracepoint example · 3c9b1644

Committed by Alexei Starovoitov on Apr 06, 2016

modify offwaketime to work with sched/sched_switch tracepoint
instead of kprobe into finish_task_switch
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c9b1644

samples/bpf: add tracepoint support to bpf loader · c0766040

Committed by Alexei Starovoitov on Apr 06, 2016

Recognize "tracepoint/" section name prefix and attach the program
to that tracepoint.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0766040

bpf: sanitize bpf tracepoint access · 32bbe007

Committed by Alexei Starovoitov on Apr 06, 2016

during bpf program loading remember the last byte of ctx access
and at the time of attaching the program to tracepoint check that
the program doesn't access bytes beyond those defined in tracepoint fields

This also disallows access to __dynamic_array fields, but can be
relaxed in the future.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

32bbe007
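
The two-phase check described in this commit (record the furthest context access at load time, compare it against the tracepoint's static fields at attach time) can be modelled in plain C. This is only a sketch of the idea: `struct prog_model`, `record_ctx_access()` and `may_attach()` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-program state kept while loading. */
struct prog_model {
    size_t max_ctx_offset;  /* one past the last ctx byte the program reads */
};

/* Load time: note an access of `size` bytes at offset `off` into ctx. */
static void record_ctx_access(struct prog_model *prog, size_t off, size_t size)
{
    assert(size > 0);
    if (off + size > prog->max_ctx_offset)
        prog->max_ctx_offset = off + size;
}

/* Attach time: allow the program only if every recorded access lies within
 * the static fields of the tracepoint. Dynamic array bytes are not counted,
 * so any access to them is rejected. */
static bool may_attach(const struct prog_model *prog, size_t static_fields_size)
{
    return prog->max_ctx_offset <= static_fields_size;
}
```

Deferring the comparison to attach time matters because the same loaded program could be attached to tracepoints whose records have different sizes.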

bpf: support bpf_get_stackid() and bpf_perf_event_output() in tracepoint programs · 9940d67c

Committed by Alexei Starovoitov on Apr 06, 2016

needs two wrapper functions to fetch 'struct pt_regs *' to convert
tracepoint bpf context into kprobe bpf context to reuse existing
helper functions
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9940d67c

bpf: register BPF_PROG_TYPE_TRACEPOINT program type · 9fd82b61

Committed by Alexei Starovoitov on Apr 06, 2016

register tracepoint bpf program type and let it call the same set
of helper functions as BPF_PROG_TYPE_KPROBE
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9fd82b61

perf, bpf: allow bpf programs attach to tracepoints · 98b5c2c6

Committed by Alexei Starovoitov on Apr 06, 2016

introduce BPF_PROG_TYPE_TRACEPOINT program type and allow it to be attached
to the perf tracepoint handler, which will copy the arguments into
the per-cpu buffer and pass it to the bpf program as its first argument.
The layout of the fields can be discovered by doing
'cat /sys/kernel/debug/tracing/events/sched/sched_switch/format'
prior to the compilation of the program with exception that first 8 bytes
are reserved and not accessible to the program. This area is used to store
the pointer to 'struct pt_regs' which some of the bpf helpers will use:
+---------+
| 8 bytes | hidden 'struct pt_regs *' (inaccessible to bpf program)
+---------+
| N bytes | static tracepoint fields defined in tracepoint/format (bpf readonly)
+---------+
| dynamic | __dynamic_array bytes of tracepoint (inaccessible to bpf yet)
+---------+

Note that all of the fields are already dumped to user space via perf ring buffer
and broken applications access them directly without consulting tracepoint/format.
Same rule applies here: static tracepoint fields should only be accessed
in a format defined in tracepoint/format. The order of fields and
field sizes are not an ABI.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

98b5c2c6
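
The context layout in the diagram above can be mirrored by an ordinary C struct to see where the static fields start. This is a userspace illustration only: the field names follow the random:urandom_read example from the series cover letter, `keep_event()` is a hypothetical function that repeats that example's filter logic, and the authoritative layout is always the tracepoint's format file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the context passed to a program attached to
 * random:urandom_read. */
struct urandom_read_ctx {
    uint64_t hidden_pad;  /* reserved first 8 bytes, not readable by the program */
    int got_bits;         /* static tracepoint fields start at offset 8 */
    int pool_left;
    int input_left;
};

/* Same decision as the cover letter's example program:
 * return 1 to keep the event while the pool still has entropy left. */
static int keep_event(const struct urandom_read_ctx *ctx)
{
    assert(ctx != NULL);
    return ctx->pool_left > 0 ? 1 : 0;
}
```

Because field order and sizes are not an ABI, a struct like this should be regenerated from the format file of the running kernel instead of being copied between kernel versions.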

perf: split perf_trace_buf_prepare into alloc and update parts · 1e1dcd93

Committed by Alexei Starovoitov on Apr 06, 2016

split allows to move expensive update of 'struct trace_entry' to later phase.
Repurpose unused 1st argument of perf_tp_event() to indicate event type.

While splitting use temp variable 'rctx' instead of '*rctx' to avoid
unnecessary loads done by the compiler due to -fno-strict-aliasing
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

1e1dcd93

perf: remove unused __addr variable · e93735be

Committed by Alexei Starovoitov on Apr 06, 2016

now all calls to perf_trace_buf_submit() pass 0 as 4th
argument which will be repurposed in the next patch which will
change the meaning of 1st arg of perf_tp_event() to event_type
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e93735be

perf: optimize perf_fetch_caller_regs · ec5e099d

Committed by Alexei Starovoitov on Apr 06, 2016

avoid memset in perf_fetch_caller_regs, since it's the critical path of all tracepoints.
It's called from perf_sw_event_sched, perf_event_task_sched_in and all of perf_trace_##call
with this_cpu_ptr(&__perf_regs[..]) which are zero initialized by percpu init logic and
subsequent call to perf_arch_fetch_caller_regs initializes the same fields on all archs,
so we can safely drop memset from all of the above cases and move it into
perf_ftrace_function_call that calls it with stack allocated pt_regs.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ec5e099d

net: Fix build failure due to lockdep_sock_is_held(). · b33b0a1b

Committed by David S. Miller on Apr 07, 2016

Needs to be protected with CONFIG_LOCKDEP.

Based upon a patch by Hannes Frederic Sowa.
Signed-off-by: David S. Miller <davem@davemloft.net>

b33b0a1b

ixgbe: Bump version number · 10ef00fe

Committed by Mark Rustad on Apr 01, 2016

Update ixgbe version number.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

10ef00fe

ixgbe: Add KR backplane support for x550em_a · f572b2c4

Committed by Mark Rustad on Apr 01, 2016

Add support for x550em_a-based KR backplane devices.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

f572b2c4

ixgbe: Add support for SGMII backplane interface · 200157c2

Committed by Mark Rustad on Apr 01, 2016

Add support for an SGMII backplane interface.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

200157c2

ixgbe: Add support for SFPs with retimer · 2d40cd17

Committed by Mark Rustad on Apr 01, 2016

Add support for SFPs with an external retimer.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

2d40cd17

ixgbe: Introduce function to control MDIO speed · e84db727

Committed by Mark Rustad on Apr 01, 2016

Move code that controls MDIO speed into a new function because
there will be more MACs that need the control.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e84db727

ixgbe: Read and parse NW_MNG_IF_SEL register · 537cc5df

Committed by Mark Rustad on Apr 01, 2016

Read the IXGBE_NW_MNG_IF_SEL register and use it to set interface
attributes.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

537cc5df

ixgbe: Read and set instance id · c898fe28

Committed by Mark Rustad on Apr 01, 2016

Read the instance number from EEPROM and save it for later use.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

c898fe28

ixgbe: Use new methods for PHY access · d31afc8f

Committed by Mark Rustad on Apr 01, 2016

Now x550em_a devices will use a new method for PHY access that will
get the firmware token for each access.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

d31afc8f

ixgbe: Add support for x550em_a 10G MAC type · 49425dfc

Committed by Mark Rustad on Apr 01, 2016

Add support for x550em_a 10G MAC type to the ixgbe driver. The new
MAC includes new firmware commands that need to be used to control
PHY and IOSF access, so that support is also added. The interface
supported is a native SFP+ interface.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

49425dfc

ixgbe: Use method pointer to access IOSF devices · 9a5c27e6

Committed by Mark Rustad on Apr 01, 2016

Provide method pointers and use them to access IOSF-attached
devices. A new MAC will introduce a new access method.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

9a5c27e6
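
The method-pointer approach named in this commit is the usual C ops-table pattern. The sketch below is a self-contained model under stated assumptions: `struct hw_model`, `struct iosf_ops`, the fake register file and every function name are hypothetical and do not come from the ixgbe sources.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NUM_REGS 16u

struct hw_model;

/* Hypothetical ops table: each MAC type supplies its own accessors. */
struct iosf_ops {
    int (*read)(struct hw_model *hw, uint32_t reg, uint32_t *val);
    int (*write)(struct hw_model *hw, uint32_t reg, uint32_t val);
};

struct hw_model {
    const struct iosf_ops *iosf;
    uint32_t regs[MODEL_NUM_REGS];  /* fake register file for the sketch */
};

/* One access method, standing in for an existing MAC. */
static int direct_read(struct hw_model *hw, uint32_t reg, uint32_t *val)
{
    if (reg >= MODEL_NUM_REGS)
        return -1;
    *val = hw->regs[reg];
    return 0;
}

static int direct_write(struct hw_model *hw, uint32_t reg, uint32_t val)
{
    if (reg >= MODEL_NUM_REGS)
        return -1;
    hw->regs[reg] = val;
    return 0;
}

static const struct iosf_ops direct_ops = {
    .read = direct_read,
    .write = direct_write,
};

/* Callers always go through the pointers, so a MAC with a different
 * access method only needs to install a different ops table. */
static int iosf_read(struct hw_model *hw, uint32_t reg, uint32_t *val)
{
    assert(hw->iosf && hw->iosf->read);
    return hw->iosf->read(hw, reg, val);
}

static int iosf_write(struct hw_model *hw, uint32_t reg, uint32_t val)
{
    assert(hw->iosf && hw->iosf->write);
    return hw->iosf->write(hw, reg, val);
}
```

A second table, for example one whose functions acquire a firmware token around each access, could be swapped in without touching any caller of `iosf_read()` or `iosf_write()`.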

ixgbe: Add definitions for x550em_a 10G MAC · 207969b9

Committed by Mark Rustad on Apr 01, 2016

Add definitions for an x550em_a 10G MAC device with a native SFP
interface.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

207969b9

ixgbe: Add support for single-port X550 device · a711ad89

Committed by Mark Rustad on Mar 21, 2016

Add support for a single-port X550 device.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a711ad89

ixgbe/ixgbevf: Add support for bulk free in Tx cleanup & cleanup boolean logic · 8220bbc1

Committed by Alexander Duyck on Mar 07, 2016

This patch enables bulk free in Tx cleanup for ixgbevf and cleans up the
boolean logic in the polling routines for ixgbe and ixgbevf in the hopes of
avoiding any mix-ups similar to what occurred with i40e and i40evf.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

8220bbc1

ixgbe: Take manageability semaphore for firmware commands · af741901

Committed by Mark Rustad on Mar 14, 2016

We need to take the manageability semaphore when issuing firmware
commands to avoid problems. With this in place, the semaphore is
no longer taken in the ixgbe_set_fw_drv_ver_generic function, since
it will now always be taken by the ixgbe_host_interface_command
function.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

af741901

ixgbe: Clean up interface for firmware commands · 5cffde30

Committed by Mark Rustad on Mar 14, 2016

Clean up the interface for issuing firmware commands to use a
void * instead of a u32 *. This eliminates a number of casts.
Also clean up ixgbe_host_interface_command in a few other ways,
eliminating comparisons with 0, redundant parens and minor
formatting issues.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

5cffde30

ixgbe: Correct length check for round up · 73457165

Committed by Mark Rustad on Mar 14, 2016

The function ixgbe_host_interface_command actually uses a buffer whose
size is a multiple of the word size to do its business, but only checks
against the actual length passed in. This means that on read operations it
could be possible to modify locations beyond the length passed in.
Change the check to round up in the same way, just to avoid any
possible hazard.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

73457165
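
The hazard described in this commit comes from copying data in whole 32-bit words while validating against an unrounded byte count. The model below shows the corrected comparison under stated assumptions: `round_up_dword()` and `reply_fits()` are hypothetical helpers, and the split into a header size and a reply length is an illustrative guess at how such a check could be structured, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round a byte length up to the next multiple of 4 (one 32-bit word). */
static uint32_t round_up_dword(uint32_t len)
{
    assert(len <= UINT32_MAX - 3u);  /* avoid wrap-around */
    return (len + 3u) & ~3u;
}

/* Return true when a caller buffer of `caller_len` bytes can hold a reply
 * consisting of `hdr_size` header bytes plus `reply_len` payload bytes,
 * given that the payload is copied one whole word at a time. */
static bool reply_fits(uint32_t caller_len, uint32_t hdr_size, uint32_t reply_len)
{
    return caller_len >= hdr_size + round_up_dword(reply_len);
}
```

With a 4-byte header and a 6-byte payload, an unrounded check would accept a 10-byte buffer, yet the word-wise copy writes 12 bytes. The rounded check rejects the 10-byte buffer and accepts a 12-byte one.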

ixgbe: Change the lan_id and func fields to a u8 to avoid casts · 3775b814

Committed by Mark Rustad on Mar 14, 2016

Since the lan_id and func fields only ever hold small values, make
them u8 to avoid casts used to silence warnings.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

3775b814

ixgbe: Delete some unused register definitions · 832ac592

Committed by Mark Rustad on Mar 14, 2016

I noticed the SRAMREL registers are not referenced for any device,
so delete the definitions.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

832ac592

sock: make lockdep_sock_is_held static inline · 03be9822

Committed by Hannes Frederic Sowa on Apr 07, 2016

I forgot to add inline to lockdep_sock_is_held, so it generated all
kinds of build warnings if not built with lockdep support.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

03be9822

Merge branch 'tipc-next' · 889750bd

Committed by David S. Miller on Apr 07, 2016

Jon Maloy says:

====================
tipc: some small fixes

We fix a minor buffer leak, and ensure that bearers filter packets
correctly while they are being shut down.

v2: Corrected typos in commit #3, as per feedback from S. Shtylyov
v3: Removed commit #3 from the series. Improved version will be
    re-submitted later.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

889750bd

tipc: stricter filtering of packets in bearer layer · 5b7066c3

Committed by Jon Paul Maloy on Apr 07, 2016

Resetting a bearer/interface, with the consequence of resetting all its
pertaining links, is not an atomic action. This becomes particularly
evident in very large clusters, where a lot of traffic may happen on the
remaining links while we are busy shutting them down. In extreme cases,
we may even see links being re-created and re-established before we are
finished with the job.

To solve this, we now introduce a solution where we temporarily detach
the bearer from the interface when the bearer is reset. This inhibits
all packet reception, while sending still is possible. For the latter,
we use the fact that the device's user pointer now is zero to filter out
which packets can be sent during this situation; i.e., outgoing RESET
messages only. This filtering serves to speed up the neighbors'
detection of the loss event, and saves us from unnecessary probing.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b7066c3

tipc: eliminate buffer leak in bearer layer · 4e801fa1

Committed by Jon Paul Maloy on Apr 07, 2016

When enabling a bearer we create a 'neighbor discoverer' instance by
calling the function tipc_disc_create() before the bearer is actually
registered in the list of enabled bearers. Because of this, the very
first discovery broadcast message, created by the mentioned function,
is lost, since it cannot find any valid bearer to use. Furthermore,
the used send function, tipc_bearer_xmit_skb() does not free the given
buffer when it cannot find a bearer, resulting in the leak of exactly
one send buffer each time a bearer is enabled.

This commit fixes this problem by introducing two changes:

1) Instead of attempting to send the discovery message directly, we let
tipc_disc_create() return the discovery buffer to the calling
function, tipc_enable_bearer(), so that the latter can send it
when the enabling sequence is finished.

2) In tipc_bearer_xmit_skb(), as well as in the two other transmit
functions at the bearer layer, we now free the indicated buffer or
buffer chain when a valid bearer cannot be found.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4e801fa1
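
The ownership rule behind the second change (a transmit function consumes the buffer whether or not a bearer is found) can be shown with a tiny userspace model. Everything here is hypothetical: `struct bearer_model`, `buf_alloc()`, `buf_free()` and `bearer_xmit()` stand in for the TIPC structures and functions, and a counter replaces real leak detection.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an enabled bearer. */
struct bearer_model {
    int sent;  /* number of buffers handed to the "device" */
};

static int live_buffers;  /* allocated but not yet freed */

static void *buf_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_buffers++;
    return p;
}

static void buf_free(void *p)
{
    if (p) {
        free(p);
        live_buffers--;
    }
}

/* The transmit function takes ownership of `buf` in every case:
 * it is either sent or freed, never left for the caller. */
static void bearer_xmit(struct bearer_model *b, void *buf)
{
    assert(buf != NULL);
    if (!b) {
        buf_free(buf);  /* no valid bearer: free instead of leaking */
        return;
    }
    b->sent++;
    buf_free(buf);      /* the model device consumes the buffer at once */
}
```

Under this convention the caller never has to track whether the send succeeded, which is what removes the one-buffer leak per bearer enable described above.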

Merge branch 'gro-in-udp' · ba35855e

Committed by David S. Miller on Apr 07, 2016

Tom Herbert says:

====================
udp: GRO in UDP sockets

This patch set adds GRO functions (gro_receive and gro_complete) to UDP
sockets and removes udp_offload infrastructure.

Add GRO functions (gro_receive and gro_complete) to UDP sockets. In
udp_gro_receive and udp_gro_complete a socket lookup is done instead of
looking up the port number in udp_offloads.  If a socket is found and
there are GRO functions for it then those are called. This feature
allows binding GRO functions to more than just a port number.
Eventually, we will be able to use this technique to allow application
defined GRO for an application protocol by attaching BPF programs to UDP
sockets for doing GRO.

In order to implement these functions, we added exported
udp6_lib_lookup_skb and udp4_lib_lookup_skb functions in ipv4/udp.c and
ipv6/udp.c. Also, inet_iif and references to skb_dst() were changed to
check that dst is set in skbuf before dereferencing. In the GRO path there
is now a UDP socket lookup performed before dst is set; to get the
device in that case we simply use skb->dev.

Tested:

Ran various combinations of VXLAN and GUE TCP_STREAM and TCP_RR tests.
Did not see any material regression.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ba35855e

udp: Remove udp_offloads · 46aa2f30

Committed by Tom Herbert on Apr 05, 2016

Now that the UDP encapsulation GRO functions have been moved to the UDP
socket we no longer need the udp_offload infrastructure, so remove it.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

46aa2f30

geneve: change to use UDP socket GRO · 4a0090a9

Committed by Tom Herbert on Apr 05, 2016

Adapt geneve_gro_receive, geneve_gro_complete to take a socket argument.
Set these functions in tunnel_config. Don't set udp_offloads any more.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a0090a9

openanolis / cloud-kernel · synced successfully more than 1 year ago