Commits · c647cc3fd5ee3c3aba34a00326e684684d491de0 · openeuler / raspberrypi-kernel

13 Nov, 2014 1 commit

sunvnet: fix NULL pointer dereference · c647cc3f

Authored by David L Stevens on Nov 12, 2014

This patch fixes a NULL pointer dereference when __tx_port_find() doesn't
find a matching port.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Nov, 2014 23 commits

Merge branch 'skb_alloc_pages' · ee47ad42

Authored by David S. Miller on Nov 12, 2014

Alexander Duyck says:

====================
Replace __skb_alloc_pages with simpler function

This patch series replaces __skb_alloc_pages with a much simpler function,
__dev_alloc_pages.  The main difference between the two is that
__skb_alloc_pages had an sk_buff pointer that was being passed as NULL in
call places where it was called.  In a couple of cases the NULL was passed
by variable and this led to unnecessary code being run.

As such in order to simplify things the __dev_alloc_pages call only takes a
mask and the page order being requested.  In addition it takes advantage of
several behaviors already built into the page allocator so that it can just
set GFP flags unconditionally.

v2: Renamed functions to dev_alloc_page(s) instead of netdev_alloc_page(s)
    Removed __GFP_COLD flag from usb code as it was redundant
v3: Update patch descriptions and subjects to match changes in v2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Remove __skb_alloc_page and __skb_alloc_pages · 160d2aba

Authored by Alexander Duyck on Nov 11, 2014

Remove the two functions which are now dead code.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fm10k/igb/ixgbe: Replace __skb_alloc_page with dev_alloc_page · 42b17f09

Authored by Alexander Duyck on Nov 11, 2014

The Intel drivers were pretty much just using the plain vanilla GFP flags
in their calls to __skb_alloc_page so this change makes it so that they use
dev_alloc_page which just uses GFP_ATOMIC for the gfp_flags value.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

phonet: Replace calls to __skb_alloc_page with __dev_alloc_page · 5693d284

Authored by Alexander Duyck on Nov 11, 2014

Replace the calls to __skb_alloc_page that are passed NULL with calls to
__dev_alloc_page.

In addition remove __GFP_COLD flag from allocations as we only want it for
the Rx buffer which is taken care of by __dev_alloc_skb, not for any
secondary allocations such as the queue element transmit descriptors.

Cc: Oliver Neukum <oliver@neukum.org>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4/cxgb4vf: Replace __skb_alloc_page with __dev_alloc_page · aa9cd31c

Authored by Alexander Duyck on Nov 11, 2014

Drop the bloated use of __skb_alloc_page and replace it with
__dev_alloc_page.  In addition update the one other spot that is
allocating a page so that it allocates with the correct flags.

Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add device Rx page allocation function · 71dfda58

Authored by Alexander Duyck on Nov 11, 2014

This patch implements __dev_alloc_pages and __dev_alloc_page. These are
meant to replace the __skb_alloc_pages and __skb_alloc_page functions. The
reason for doing this is that it occurred to me that __skb_alloc_page is
supposed to be passed an sk_buff pointer, but it is NULL in all cases where
it is used. Worse is that in the case of ixgbe it is passed NULL via the
sk_buff pointer in the rx_buffer info structure which means the compiler is
not correctly stripping it out.

The naming for these functions is based on dev_alloc_skb and __dev_alloc_skb.
There was originally a netdev_alloc_page, however that was passed a
net_device pointer and this function is not so I thought it best to follow
that naming scheme since that is the same difference between dev_alloc_skb
and netdev_alloc_skb.

In the case of anything greater than order 0 it is assumed that we want a
compound page so __GFP_COMP is set for all allocations as we expect a
compound page when assigning a page frag.

The other change in this patch is to exploit the behaviors of the page
allocator in how it handles flags. So for example we can always set
__GFP_COMP and __GFP_MEMALLOC since they are ignored if they are not
applicable or are overridden by another flag.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
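
The flag handling described in 71dfda58 can be pictured with a small userspace model. This is only a sketch under stated assumptions: the MODEL_* constants and model_dev_alloc_flags() are hypothetical stand-ins with illustrative bit values, not the kernel's real gfp_t definitions, and no page is actually allocated.

```c
#include <stdint.h>

/* Hypothetical stand-ins for kernel GFP bits; values are illustrative only. */
typedef uint32_t model_gfp_t;
#define MODEL_GFP_ATOMIC   0x01u
#define MODEL_GFP_COMP     0x10u
#define MODEL_GFP_MEMALLOC 0x20u

/*
 * Model of the simplified helper: no sk_buff argument and no branching on
 * the page order. The extra flags are OR-ed in unconditionally because,
 * per the commit message, the page allocator ignores them when they do
 * not apply.
 */
static model_gfp_t model_dev_alloc_flags(model_gfp_t gfp_mask, unsigned int order)
{
	(void)order; /* the order would go to the allocator; it is not tested here */
	return gfp_mask | MODEL_GFP_COMP | MODEL_GFP_MEMALLOC;
}
```

The point of the model is that the result does not depend on the order or on any buffer pointer, so the caller's mask is the only input that varies.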

irda: Remove IRDA_<TYPE> logging macros · 6c91023d

Authored by Joe Perches on Nov 11, 2014

And use the more common mechanisms directly.

Other miscellanea:

o Coalesce formats
o Add missing newlines
o Realign arguments
o Remove unnecessary OOM message logging as
  there's a generic stack dump already on OOM.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: kill netif_copy_real_num_queues() · 09626e9d

Authored by WANG Cong on Nov 11, 2014

vlan was the only user of netif_copy_real_num_queues(),
but it no longer calls it after
commit 4af429d2 ("vlan: lockless transmit path").
So we can just remove it.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · 2387e3b5

Authored by David S. Miller on Nov 11, 2014

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-11-11

This series contains updates to i40e, i40evf and ixgbe.

Kamil updated the i40e and i40evf driver to poll the firmware slower
since we were polling faster than the firmware could respond.

Shannon updates i40e to add a check to keep the service_task from
running the periodic tasks more than once per second, while still
allowing quick action to service the events.

Jesse cleans up the throttle rate code by fixing the minimum interrupt
throttle rate and removing some unused defines.

Mitch makes the early init admin queue message receive code more robust
by handling messages in a loop and ignoring those that we are not
interested in.  This also gets rid of some scary log messages that
really do not indicate a problem.

Don provides several ixgbe patches.  The first fixes an issue with X540
completion timeout, where topologies including a few levels of PCIe
switching can run into an unexpected completion error.  Another cleans
up the functionality in ixgbe_ndo_set_vf_vlan() in preparation for
future work.  Another adds support for X550 MACs to the driver.

v2:
 - Remove code comment in patch 01 of the series, based on feedback from
   David Laight
 - Updated the "goto" to "break" statements in patch 06 of the series,
   based on feedback from Sergei Shtylyov
 - Initialized the variable err due to the possibility of use before
   being assigned a value in patch 07 of the series
 - Added patch "ixgbe: add helper function for setting RSS key in
   preparation of X550" since it is needed for the addition of X550 MAC
   support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

usbnet: smsc95xx: dereferencing NULL pointer · 8bca81d9

Authored by Sudip Mukherjee on Nov 11, 2014

We were dereferencing dev to initialize pdata, but just after that we
have a BUG_ON(!dev), so we were basically dereferencing the pointer
first and then testing it for NULL.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
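
The ordering problem described in 8bca81d9 can be shown in a few lines of userspace C. This is a minimal sketch, not the driver code: struct model_dev, struct model_priv and model_get_priv() are hypothetical names, and assert() stands in for the kernel's BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's device and private data. */
struct model_priv { int mac_cr; };
struct model_dev { struct model_priv *data; };

/*
 * The problematic order would be:
 *     struct model_priv *pdata = dev->data;   dereference happens first
 *     assert(dev != NULL);                    check arrives too late
 * The version below checks the pointer before touching it.
 */
static struct model_priv *model_get_priv(struct model_dev *dev)
{
	struct model_priv *pdata;

	assert(dev != NULL);  /* stand-in for BUG_ON(!dev) */
	pdata = dev->data;    /* dereference only after the check */
	return pdata;
}
```

Declaring the variable first and assigning it after the check is one simple way to keep the initialization from running ahead of the test.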

irda: Simplify IRDA logging macros · d65c4e4e

Authored by Joe Perches on Nov 11, 2014

These are the same as net_<level>_ratelimited, so
use the more common style in the macro definition.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

neigh: remove dynamic neigh table registration support · d7480fd3

Authored by WANG Cong on Nov 10, 2014

Currently there are only three neigh tables in the whole kernel:
arp table, ndisc table and decnet neigh table. What's more,
we don't support registering multiple tables per family.
Therefore we can just make these tables statically built-in.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: split to core library and probe drivers · b2e2f0c7

Authored by Andy Shevchenko on Nov 10, 2014

Instead of registering the platform and PCI drivers in one module, let's move
the necessary bits to where they belong. During this procedure we convert the module
registration part to use module_*_driver() macros, which makes the code simpler.

From now on the driver consists of three parts: core library, PCI, and platform
drivers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited · ba7a46f1

Authored by Joe Perches on Nov 11, 2014

Use the more common dynamic_debug capable net_dbg_ratelimited
and remove the LIMIT_NETDEBUG macro.

All messages are still ratelimited.

Some KERN_<LEVEL> uses are changed to KERN_DEBUG.

This may have some negative impact on messages that were
emitted at KERN_INFO and are now not enabled at all unless
DEBUG is defined or dynamic_debug is enabled.  Even so,
these messages are now _not_ emitted by default.

This also eliminates the use of the net_msg_warn sysctl
"/proc/sys/net/core/warnings".  For backward compatibility,
the sysctl is not removed, but it has no function.  The extern
declaration of net_msg_warn is removed from sock.h and made
static in net/core/sysctl_net_core.c

Miscellanea:

o Update the sysctl documentation
o Remove the embedded uses of pr_fmt
o Coalesce format fragments
o Realign arguments
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

PPC: bpf_jit_comp: add SKF_AD_HATYPE instruction · 5b61c4db

Authored by Denis Kirjanov on Nov 10, 2014

Add BPF extension SKF_AD_HATYPE to ppc JIT to check
the hw type of the interface

Before:
[   57.723666] test_bpf: #20 LD_HATYPE
[   57.723675] BPF filter opcode 0020 (@0) unsupported
[   57.724168] 48 48 PASS

After:
[  103.053184] test_bpf: #20 LD_HATYPE 7 6 PASS

CC: Alexei Starovoitov <alexei.starovoitov@gmail.com>
CC: Daniel Borkmann <dborkman@redhat.com>
CC: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>

v2: address Alexei's comments
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net_next_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch · 4083c805

Authored by David S. Miller on Nov 11, 2014

Pravin B Shelar says:

====================
Open vSwitch

Following batch of patches brings feature parity between upstream
ovs and out of tree ovs module.

Two features are added, first adds support to export egress
tunnel information for a packet. This is used to improve
visibility in network traffic. Second feature allows userspace
vswitchd process to probe ovs module features. Other patches
are optimization and code cleanup.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: Use netdev_<level> instead of printk · a2ae6007

Authored by Joe Perches on Nov 09, 2014

Neaten and standardize the logging output.

Other miscellanea:

o Use pr_notice_once instead of a guard flag.
o Convert existing pr_<level> uses too.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx4-next' · 008e8165

Authored by David S. Miller on Nov 11, 2014

Or Gerlitz says:

====================
mlx4: Add CHECKSUM_COMPLETE support

These patches from Shani, Matan and myself add support for
CHECKSUM_COMPLETE reporting on non TCP/UDP packets such as
GRE and ICMP. I'd like to deeply thank Jerry Chu for his
innovation and support in that effort.

Based on the feedback from Eric and Ido Shamay, in V2 we dropped
the patch which removed the calls to napi_gro_frags() and added
a patch which makes the RX code to go through that path
regardless of the checksum status.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE · f8c6455b

Authored by Shani Michaeli on Nov 09, 2014

When processing received traffic, pass CHECKSUM_COMPLETE status to the
stack, with calculated checksum for non TCP/UDP packets (such
as GRE or ICMP).

Although the stack expects checksum which doesn't include the pseudo
header, the HW adds it. To address that, we are subtracting the pseudo
header checksum from the checksum value provided by the HW.

In the IPv6 case, we also compute/add the IP header checksum which
is not added by the HW for such packets.

Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
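
Subtracting a pseudo header checksum from a hardware-provided sum, as f8c6455b describes, relies on one's-complement arithmetic. The sketch below shows that arithmetic in plain userspace C. The model_csum_* helpers are hypothetical names written for this illustration; they are not the mlx4 code, and model_csum_fold() does not invert the result the way a final Internet checksum would.

```c
#include <stdint.h>

/* 32-bit one's-complement add: wrap the carry back into the sum. */
static uint32_t model_csum_add(uint32_t csum, uint32_t addend)
{
	uint32_t res = csum + addend;

	return res + (res < addend);
}

/* One's-complement subtract: add the bitwise complement of the addend. */
static uint32_t model_csum_sub(uint32_t csum, uint32_t addend)
{
	return model_csum_add(csum, ~addend);
}

/* Fold a 32-bit accumulator down to 16 bits (no final inversion). */
static uint16_t model_csum_fold(uint32_t csum)
{
	csum = (csum & 0xffffu) + (csum >> 16);
	csum = (csum & 0xffffu) + (csum >> 16);
	return (uint16_t)csum;
}
```

If the hardware value is modeled as model_csum_add(payload, pseudo), then model_csum_sub() with the same pseudo value folds back to the folded payload sum, which is the quantity the stack expects.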

net/mlx4_en: Extend usage of napi_gro_frags · dd65beac

Authored by Shani Michaeli on Nov 09, 2014

We can call napi_gro_frags for all the received traffic regardless
of the checksum status. Specifically, received packets whose status
is CHECKSUM_NONE (and soon to be added CHECKSUM_COMPLETE)
are eligible for napi_gro_frags as well.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'so_incoming_cpu' · b00394c0

Authored by David S. Miller on Nov 11, 2014

Eric Dumazet says:

====================
net: SO_INCOMING_CPU support

SO_INCOMING_CPU socket option (read by getsockopt()) provides
an alternative to RPS/RFS for high performance servers using
multi queues NIC.

TCP should use sk_mark_napi_id() for established sockets only.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: introduce SO_INCOMING_CPU · 2c8c56e1

Authored by Eric Dumazet on Nov 11, 2014

Alternative to RPS/RFS is to use hardware support for multiple
queues.

Then split a set of millions of sockets into worker threads, each
one using epoll() to manage events on its own socket pool.

Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
know after accept() or connect() on which queue/cpu a socket is managed.

We normally use one cpu per RX queue (IRQ smp_affinity being properly
set), so remembering on socket structure which cpu delivered last packet
is enough to solve the problem.

After accept(), connect(), or even file descriptor passing around
processes, applications can use :

 int cpu;
 socklen_t len = sizeof(cpu);

 getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);

And use this information to put the socket into the right silo
for optimal performance, as all networking stack should run
on the appropriate cpu, without need to send IPI (RPS/RFS).
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
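
The getsockopt() call from 2c8c56e1 can be wrapped into a small helper that an application might use when assigning sockets to worker threads. This is a sketch under assumptions: query_incoming_cpu() and pick_worker() are hypothetical helpers, the modulo mapping is just one possible silo policy, and the fallback value 49 for SO_INCOMING_CPU is the asm-generic number, which may differ on some architectures.

```c
#include <sys/types.h>
#include <sys/socket.h>

#ifndef SO_INCOMING_CPU
#define SO_INCOMING_CPU 49 /* assumption: asm-generic value, used only if the headers lack it */
#endif

/* Returns 0 and stores the CPU in *cpu on success, -1 (errno set) on failure. */
static int query_incoming_cpu(int fd, int *cpu)
{
	socklen_t len = sizeof(*cpu);

	return getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, cpu, &len) == 0 ? 0 : -1;
}

/* Hypothetical silo policy: map the reported CPU onto one of nworkers threads. */
static int pick_worker(int cpu, int nworkers)
{
	if (cpu < 0 || nworkers <= 0)
		return 0; /* fall back to worker 0 when the CPU is unknown */
	return cpu % nworkers;
}
```

A server would typically call query_incoming_cpu() right after accept() and hand the descriptor to the worker returned by pick_worker(), ideally with that worker pinned to the matching CPU.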

tcp: move sk_mark_napi_id() at the right place · 3d97379a

Authored by Eric Dumazet on Nov 11, 2014

sk_mark_napi_id() is used to record for a flow napi id of incoming
packets for busypoll sake.
We should do this only on established flows, not on listeners.

This was 'working' by virtue of the socket cloning, but doing
this on SYN packets is unnecessary cache line dirtying.

Even if we move sk_napi_id in the same cache line as sk_lock,
we are working to make SYN processing lockless, so it is desirable
to set sk_napi_id only for established flows.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Nov, 2014 16 commits

ixgbe: add helper function for setting RSS key in preparation of X550 · d1b849b9

Authored by Don Skidmore on Nov 09, 2014

Split off the setting of the RSS key into its own function.  This
will help when we add support for X550 which can have different
RSS keys per pool.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: Add new support for X550 MAC's · 9a75a1ac

Authored by Don Skidmore on Nov 07, 2014

This patch will add in the new MAC defines and fit it into the switch
cases throughout the driver.  New functionality and enablement support will
be added in following patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: cleanup move setting PFQDE.HIDE_VLAN to support function. · 8d697e7e

Authored by Don Skidmore on Nov 05, 2014

Move setting of drop enable to support function.  This not only makes the
code more readable but is also prep for following patches that add
additional MAC support.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: cleanup ixgbe_ndo_set_vf_vlan · 2b509c0c

Authored by Don Skidmore on Nov 01, 2014

Clean up functionality in ixgbe_ndo_set_vf_vlan that will simplify later
patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: fix X540 Completion timeout · 71bde601

Authored by Don Skidmore on Oct 29, 2014

On topologies including a few levels of PCIe switching, X540 can run into an
unexpected completion error. We get around this by waiting after enabling
loopback a sufficient amount of time until Tx Data Fetch is sent. We then
poll the pending transaction bit to ensure we received the completion. Only
then do we go on to clear the buffers.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40evf: don't use more queues than CPUs · cc052927

Authored by Mitch Williams on Oct 25, 2014

It's kind of silly to configure and attempt to use a bunch of queue
pairs when you're running on a single (virtual) CPU. Instead of
unconditionally configuring all of the queues that the PF gives us,
clamp the number of queue pairs to the number of CPUs.

Change-ID: I321714c9e15072ee76de8f95ab9a81f86ed347d1
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
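
The clamping rule from cc052927 amounts to taking a minimum. A minimal sketch, assuming a hypothetical helper name and plain unsigned counts rather than the driver's own structures:

```c
/*
 * Hypothetical helper: never configure more queue pairs than there are
 * CPUs to service them. A CPU count of 0 is treated as 1 so the result
 * is never zero when the PF granted at least one pair.
 */
static unsigned int model_clamp_queue_pairs(unsigned int pf_granted,
					    unsigned int online_cpus)
{
	if (online_cpus == 0)
		online_cpus = 1;
	return pf_granted < online_cpus ? pf_granted : online_cpus;
}
```

With 16 queue pairs granted and a single virtual CPU, the helper yields 1, which matches the single-CPU case the message calls out.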

i40evf: make early init processing more robust · f8d4db35

Authored by Mitch Williams on Oct 25, 2014

In early init, if we get an unexpected message from the PF (such as link
status), we just kick an error back to the init task, causing it to
restart its state machine and delaying initialization.

Make the early init AQ message receive code more robust by handling
messages in a loop, and ignoring those that we aren't interested in.
This also gets rid of some scary log messages that really didn't
indicate a problem.

Change-ID: I620e8c72e49c49c665ef33eeab2425dd10e721cf
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: clean up throttle rate code · 79442d38

Authored by Jesse Brandeburg on Oct 25, 2014

The interrupt throttle rate minimum is actually 2us, so
fix that define and while we are there, remove some unused defines.

Change some strings in the function to be a bit less wrappy, and
express the correct limits.

Change-ID: I96829bbc77935e0b57c6f0fc1439fb4152b2960a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

i40e: don't do link_status or stats collection on every ARQ · 21536717

Authored by Shannon Nelson on Oct 25, 2014

The ARQ events cause a service_task execution, and we do a link_status
check and full stats gathering for each service_task. However, when
there are a lot of ARQ events, such as when doing an NVM update, we end up
doing 10's if not 100's of these per second, thereby heavily abusing the
PCI bus and especially the Firmware. This patch adds a check to keep the
service_task from running these periodic tasks more than once per second,
while still allowing quick action to service the events.

Change-ID: Iec7670c37bfae9791c43fec26df48aea7f70b33e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
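
The once-per-second gate described in 21536717 can be sketched with a timestamp check. This is a userspace illustration only: struct model_service_state and model_periodic_due() are hypothetical, and millisecond timestamps stand in for whatever time source the driver actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device bookkeeping for the periodic part of the service task. */
struct model_service_state {
	uint64_t last_periodic_ms; /* when the periodic work last ran */
	bool ran_once;             /* false until the first run */
};

/*
 * Returns true when the periodic work (link status, stats) should run now.
 * Event handling is not gated by this check and can still run on every call.
 */
static bool model_periodic_due(struct model_service_state *st, uint64_t now_ms)
{
	if (st->ran_once && now_ms - st->last_periodic_ms < 1000)
		return false; /* less than one second since the last run */
	st->last_periodic_ms = now_ms;
	st->ran_once = true;
	return true;
}
```

Only the expensive periodic work is skipped, so a burst of events still gets serviced promptly while the link check and stats run at most once per second.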

i40e: poll firmware slower · 0db4e162

Authored by Kamil Krawczyk on Oct 25, 2014

The code was polling the firmware tail register for completion every
10 microseconds, which is way faster than the firmware can respond.
This changes the poll interval to 1ms, which reduces polling CPU
utilization, and the number of times we loop.

The maximum delay is still 100ms.

Change-ID: I4bbfa6b66d802890baf8b4154061e55942b90958
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
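
The effect of the interval change in 0db4e162 on the loop count is simple arithmetic, sketched below. model_poll_iterations() is a hypothetical helper; the figures plugged in (10 microseconds, 1 ms, 100 ms) are the ones quoted in the commit message.

```c
#include <stdint.h>

/*
 * Worst-case number of polls needed to cover max_wait_us when polling
 * every interval_us microseconds (rounded up).
 */
static uint32_t model_poll_iterations(uint32_t max_wait_us, uint32_t interval_us)
{
	if (interval_us == 0)
		return 0; /* guard against division by zero */
	return (max_wait_us + interval_us - 1) / interval_us;
}
```

For the same 100 ms ceiling, a 10 microsecond interval needs up to 10000 polls while a 1 ms interval needs at most 100.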

mlx4: restore conditional call to napi_complete_done() · 2e1af7d7

Authored by Eric Dumazet on Nov 10, 2014

After commit 1a288172 ("mlx4: use napi_complete_done()") we ended up
calling napi_complete_done() in the case NAPI poll consumed all its
budget.

This added extra interrupt pressure; this patch restores proper
behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 1a288172 ("mlx4: use napi_complete_done()")
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sunvnet-next' · d21385fa

Authored by David S. Miller on Nov 10, 2014

Sowmini Varadhan says:

====================
sunvnet: edge-case/race-conditions bug fixes

This patch series contains fixes for race conditions in sunvnet
that can be encountered when there is a difference in latency between
producer and consumer.

Patch 1 addresses a case when the STOPPED LDC ack from a peer is
processed before vnet_start_xmit can finish updating the dr->prod
state.

Patch 2 fixes the edge-case when outgoing data and incoming
stopped-ack cross each other in flight.

Patch 3 adds a missing rcu_read_unlock(), found by code-inspection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

sunvnet: Add missing rcu_read_unlock() in vnet_start_xmit · df20286a

Authored by Sowmini Varadhan on Nov 08, 2014

The out_dropped label will only do rcu_read_unlock for non-null port.
So add the missing rcu_read_unlock() when bailing due to a null port.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sunvnet: vnet_ack() should check if !start_cons to send a missed trigger · 777362d7

Authored by Sowmini Varadhan on Nov 08, 2014

As per comments in vnet_start_xmit, for the edge case
when outgoing vnet_start_xmit() data and an incoming STOPPED
ACK cross each other in flight, we may need to send the missed
START trigger from maybe_tx_wakeup() after checking for a
false value of start_cons.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sunvnet: Fix race between vnet_start_xmit() and vnet_ack() · b0cffed5

Authored by Sowmini Varadhan on Nov 08, 2014

When vnet_start_xmit() is concurrent with vnet_ack(), we may
have a race that looks like:

    thread 1                              thread 2
    vnet_start_xmit                       vnet_event_napi -> vnet_rx

    __vnet_tx_trigger for some desc X
    at this point dr->prod == X
                                          peer sends back a stopped ack for X
                                          we process X, but X == dr->prod
                                          so we bail out in vnet_ack with
                                          !idx_is_pending
    update dr->prod

As a result of the fact that we never processed the stopped ack for X,
the Tx path is led to incorrectly believe that the peer is still
"started" and reading, but the peer has stopped reading, which will
ultimately end in flow-control assertions.

The fix is to synchronize the above 2 paths on the netif_tx_lock.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8139too: Allow using the largest possible MTU · 6f6e741f

Authored by Alban Bedel on Nov 08, 2014

This driver allows MTU up to 1518 bytes which is not enough to run
batman-adv. Simply raise the maximum packet size up to the maximum
allowed by the transmit descriptor, 1792 bytes, giving a maximum MTU
of 1774 bytes.
Signed-off-by: Alban Bedel <albeu@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
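
The figures in 6f6e741f differ by 18 bytes (1792 versus 1774). One decomposition consistent with those numbers is a 14-byte Ethernet header plus a 4-byte frame check sequence. That split is an assumption made for this sketch, not something the commit message states, and the MODEL_* names and model_max_mtu() are hypothetical.

```c
/* Assumed per-frame overhead: Ethernet header plus frame check sequence. */
enum {
	MODEL_ETH_HLEN = 14,
	MODEL_ETH_FCS_LEN = 4
};

/* Largest payload that fits in a frame of max_frame_size bytes. */
static unsigned int model_max_mtu(unsigned int max_frame_size)
{
	const unsigned int overhead = MODEL_ETH_HLEN + MODEL_ETH_FCS_LEN;

	return max_frame_size > overhead ? max_frame_size - overhead : 0;
}
```

Under that assumption a 1792-byte descriptor limit gives 1774 bytes of payload, and the conventional 1518-byte frame gives the familiar 1500.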