Commits · e24f8191cc35ae3780b4656a6befae8b8657edc2 · openanolis / cloud-kernel

26 Aug, 2014 1 commit

mvneta: Fix TSO and checksum for non-acceleration vlan traffic · 817dbfa5

Committed by Vlad Yasevich on Aug 25, 2014

This driver doesn't appear to support vlan acceleration at
all.  However, it does claim to support TSO and IP checksums
for vlan devices.  Thus any configured vlan device would
end up passing down partial checksums or TSO frames.

The driver also uses the value from skb->protocol to
determine TSO and checksum offload information, but assumes
that skb->protocol holds the l3 protocol information.
As a result, vlan traffic with partial checksums or TSO
will fail those checks and TSO will not happen.

Fix this by using vlan_get_protocol() helper.
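
As a rough userspace illustration of why the L3 protocol has to be looked up past the VLAN tag, the sketch below walks over 802.1Q tags in a raw Ethernet frame. The function l3_protocol() and its helpers are hypothetical stand-ins written for this example; they are not the kernel's vlan_get_protocol() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP    0x0800  /* IPv4 EtherType */
#define ETH_P_8021Q 0x8100  /* 802.1Q VLAN tag EtherType */
#define ETH_HLEN    14      /* dst MAC + src MAC + EtherType */
#define VLAN_HLEN   4       /* TCI + encapsulated EtherType */

/* Read a 16-bit big-endian (network order) value from a byte buffer. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: skip any 802.1Q tags and return the L3 EtherType
 * in host order. Returns 0 if the frame is too short.
 */
static uint16_t l3_protocol(const uint8_t *frame, size_t len)
{
    size_t off = ETH_HLEN - 2;  /* offset of the outer EtherType */
    uint16_t proto;

    assert(frame != NULL);
    if (len < ETH_HLEN)
        return 0;

    proto = read_be16(frame + off);
    while (proto == ETH_P_8021Q) {
        off += VLAN_HLEN;       /* step over the tag to the inner EtherType */
        if (len < off + 2)
            return 0;
        proto = read_be16(frame + off);
    }
    return proto;
}
```

A check that only reads the outer EtherType would see 0x8100 on tagged traffic and never match IPv4, which is the failure mode the commit message describes.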

CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

817dbfa5

08 Aug, 2014 1 commit

net: mvneta: Fix reference counting for phy_node · c891c24c

Committed by Uwe Kleine-König on Aug 07, 2014

If there is a "phy" handle, the probe function returns holding a
reference to that node. Make sure that a reference is also held in the
fixed phy case, to yield a consistent state.

Also add the corresponding of_node_put in the error path and the remove
function.

Fixes: 83895bed ("net: mvneta: add support for fixed links")
Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

c891c24c

09 Jul, 2014 2 commits

net: mvneta: Fix big endian issue in mvneta_txq_desc_csum() · 0a198587

Committed by Thomas Fitzsimmons on Jul 08, 2014

This commit fixes the command value generated for CSUM calculation
when running in big endian mode.  The Ethernet protocol ID for IP was
being unconditionally byte-swapped in the layer 3 protocol check (with
swab16), which caused the mvneta driver to not function correctly in
big endian mode.  This patch byte-swaps the ID conditionally with
htons.
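
The difference between an unconditional swap and htons() can be sketched in userspace as follows. The helper names (swap16, wire_value, is_ip_unconditional_swap, is_ip_htons) are made up for this illustration; only htons() is a real library call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>  /* htons() */

#define ETH_P_IP 0x0800

/* Unconditional 16-bit byte swap, a userspace analogue of swab16(). */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* In-memory value of a network-order 16-bit field, as a protocol field holds it. */
static uint16_t wire_value(uint8_t hi, uint8_t lo)
{
    uint8_t bytes[2] = { hi, lo };
    uint16_t v;

    memcpy(&v, bytes, sizeof v);
    return v;
}

/* Buggy comparison: only correct on little-endian hosts. */
static int is_ip_unconditional_swap(uint16_t wire_proto)
{
    return wire_proto == swap16(ETH_P_IP);
}

/* Fixed comparison: htons() swaps on little-endian and is a no-op on big-endian. */
static int is_ip_htons(uint16_t wire_proto)
{
    return wire_proto == htons(ETH_P_IP);
}
```

On a little-endian host both checks agree; on a big-endian host only the htons() variant matches an IPv4 frame.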

Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Thomas Fitzsimmons <fitzsim@fitzsim.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0a198587

net: mvneta: fix operation in 10 Mbit/s mode · 4d12bc63

Committed by Thomas Petazzoni on Jul 08, 2014

As reported by Maggie Mae Roxas, the mvneta driver doesn't behave
properly in 10 Mbit/s mode. This is due to a misconfiguration of the
MVNETA_GMAC_AUTONEG_CONFIG register: bit MVNETA_GMAC_CONFIG_MII_SPEED
must be set for a 100 Mbit/s speed, but cleared for a 10 Mbit/s speed,
which the driver was not properly doing. This commit adjusts that by
setting the MVNETA_GMAC_CONFIG_MII_SPEED bit only in 100 Mbit/s mode,
and relying on the fact that all the speed related bits of this
register are cleared at the beginning of the mvneta_adjust_link()
function.
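
A minimal sketch of the corrected speed selection, assuming the register value is first stripped of its speed-related bits and then rebuilt. The bit positions and the GMII/duplex bits are assumptions made for this example, not values taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions below are illustrative assumptions. */
#define GMAC_CONFIG_MII_SPEED   (1u << 5)   /* set for 100 Mbit/s, clear for 10 Mbit/s */
#define GMAC_CONFIG_GMII_SPEED  (1u << 6)   /* set for 1000 Mbit/s */
#define GMAC_CONFIG_FULL_DUPLEX (1u << 12)

/* Compute the new autoneg config value for a given link speed and duplex. */
static uint32_t autoneg_config_for_link(uint32_t val, int speed, int full_duplex)
{
    assert(speed == 10 || speed == 100 || speed == 1000);

    /* Clear all speed/duplex bits first, as the adjust-link path does. */
    val &= ~(GMAC_CONFIG_MII_SPEED | GMAC_CONFIG_GMII_SPEED |
             GMAC_CONFIG_FULL_DUPLEX);

    if (full_duplex)
        val |= GMAC_CONFIG_FULL_DUPLEX;

    if (speed == 1000)
        val |= GMAC_CONFIG_GMII_SPEED;
    else if (speed == 100)
        val |= GMAC_CONFIG_MII_SPEED;
    /* 10 Mbit/s: both speed bits stay cleared. */

    return val;
}
```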

This problem exists since c5aff182 ("net: mvneta: driver for
Marvell Armada 370/XP network unit") which is the commit that
introduced the mvneta driver in the kernel.

Cc: <stable@vger.kernel.org> # v3.8+
Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Reported-by: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Cc: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4d12bc63

03 Jun, 2014 4 commits

net: mvneta: Avoid unmapping the TSO header buffers · 2e3173a3

Committed by Ezequiel Garcia on May 30, 2014

The buffers for the TSO headers belong to a DMA coherent region which is
allocated at ndo_open() time, and released at ndo_stop() time.

Therefore, and contrary to the TSO payload descriptor buffers, the TSO header
buffers don't need to be unmapped. This commit adds a check to detect a
TSO header buffer and explicitly prevent the unmap.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e3173a3

net: mvneta: Fix missing DMA region unmap · ba7e46ef

Committed by Ezequiel Garcia on May 30, 2014

The Tx descriptor release code currently calls dma_unmap_single() and
dev_kfree_skb_any() if the descriptor is associated with a non-NULL skb.
This is true only for the last fragment of the packet.

This is wrong, however, since every descriptor buffer is DMA mapped and needs
to be unmapped.
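
The intended release logic can be modelled in plain C as below. This is a simulation under assumed names (tx_slot, release_descs): the dma_mapped flag stands in for a real DMA mapping, and clearing it stands in for dma_unmap_single().

```c
#include <assert.h>
#include <stddef.h>

#define TXQ_SIZE 8  /* assumed ring size for the example */

/* One Tx ring slot: every fragment is mapped, only the last one carries the skb. */
struct tx_slot {
    int dma_mapped;  /* 1 while the buffer is DMA mapped */
    int has_skb;     /* 1 only for the last fragment of a packet */
};

/* Release 'num' descriptors starting at 'first', counting unmaps and skb frees. */
static void release_descs(struct tx_slot *ring, int first, int num,
                          int *unmapped, int *freed)
{
    int i;

    assert(ring != NULL && unmapped != NULL && freed != NULL);
    for (i = 0; i < num; i++) {
        struct tx_slot *slot = &ring[(first + i) % TXQ_SIZE];

        /* Unmap every descriptor buffer, not just the one with an skb. */
        if (slot->dma_mapped) {
            slot->dma_mapped = 0;
            (*unmapped)++;
        }

        /* Only the last fragment owns the skb to free. */
        if (!slot->has_skb)
            continue;
        slot->has_skb = 0;
        (*freed)++;
    }
}
```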
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba7e46ef

net: mvneta: Limit the TSO segments and adjust stop/wake thresholds · 8eef5f97

Committed by Ezequiel Garcia on May 30, 2014

Currently small MSS values may require too many TSO descriptors for
the default queue size. This commit prevents this situation by fixing
the maximum supported TSO number of segments to 100 and by setting a
minimum Tx queue size. The minimum Tx queue size is set so that at
least 2 worst-case skb can be accommodated.

In addition, the queue stop and wake thresholds values are adjusted
accordingly. The queue is stopped when there's room for only 1 worst-case
skb and waked when the number of descriptors is half that value.
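
The arithmetic can be sketched as follows. The limit of 100 segments comes from the message above; the value of MAX_SKB_FRAGS, the assumption of two descriptors per TSO segment (header plus payload), and the reading of "half that value" as half the stop threshold are all assumptions of this example.

```c
#include <assert.h>

#define MAX_TSO_SEGS  100  /* from the commit message */
#define MAX_SKB_FRAGS 17   /* assumption: a common kernel default */

/* Assumption: each TSO segment needs a header and a payload descriptor. */
#define MAX_SKB_DESCS (MAX_TSO_SEGS * 2 + MAX_SKB_FRAGS)

/* Minimum queue size: room for two worst-case skbs. */
#define MIN_TXQ_SIZE  (MAX_SKB_DESCS * 2)

struct txq_limits {
    int size;            /* effective number of Tx descriptors */
    int stop_threshold;  /* stop the queue at this many in-flight descriptors */
    int wake_threshold;  /* wake the queue once in-flight drops to this */
};

/* Derive queue size and thresholds from a requested ring size. */
static struct txq_limits txq_limits_for(int requested_size)
{
    struct txq_limits l;

    assert(requested_size > 0);
    l.size = requested_size < MIN_TXQ_SIZE ? MIN_TXQ_SIZE : requested_size;
    l.stop_threshold = l.size - MAX_SKB_DESCS;  /* room left for one worst-case skb */
    l.wake_threshold = l.stop_threshold / 2;
    return l;
}
```

With these assumed constants, a worst-case skb needs 217 descriptors, so a requested ring of 532 descriptors stops at 315 in flight and wakes at 157.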
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8eef5f97

net: mvneta: Use default NAPI weight instead of a custom one · 9fa9379d

Committed by Ezequiel Garcia on May 30, 2014

This driver has no need for a custom NAPI weight. Use the default
one, which has the same value.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9fa9379d

24 May, 2014 6 commits

net: mvneta: Remove unneeded 'weigth' field · dc03e21a

Committed by Ezequiel Garcia on May 22, 2014

The 'weight' field is only used to pass the weight to the napi initialization
function. This commit removes the field, and instead uses a fixed value to
initialize the napi context.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc03e21a

net: mvneta: Change the number of default rx queues to one · edadb7fa

Committed by Ezequiel Garcia on May 22, 2014

The driver does not support multiple rx queues, and so it's a waste
of resources to have a default number larger than one (1).
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

edadb7fa

net: mvneta: Use prepare/commit API to simplify MAC address setting · e68de360

Committed by Ezequiel Garcia on May 22, 2014

Use eth_prepare_mac_addr_change and eth_commit_mac_addr_change, instead
of manually checking and storing the MAC address, which makes the
code slightly more robust. This fixes the lack of valid MAC address check
in the driver's .ndo_set_mac_address hook.
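
The two-step pattern can be illustrated in userspace as below. The functions are simplified, hypothetical stand-ins for the kernel helpers named above: they model only the address validity check and the store, and use -1 rather than a kernel error code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* A usable unicast MAC: multicast bit clear and not all zeroes. */
static int mac_is_valid(const uint8_t addr[ETH_ALEN])
{
    static const uint8_t zero[ETH_ALEN];

    if (addr[0] & 0x01)
        return 0;
    return memcmp(addr, zero, ETH_ALEN) != 0;
}

/* Prepare step: validate only, change nothing. Returns 0 or -1. */
static int prepare_mac_change(const uint8_t new_addr[ETH_ALEN])
{
    assert(new_addr != NULL);
    return mac_is_valid(new_addr) ? 0 : -1;
}

/* Commit step: store the already validated address. */
static void commit_mac_change(uint8_t dev_addr[ETH_ALEN],
                              const uint8_t new_addr[ETH_ALEN])
{
    memcpy(dev_addr, new_addr, ETH_ALEN);
}

/* Shape of a set-MAC hook built on the two steps. */
static int set_mac_address(uint8_t dev_addr[ETH_ALEN],
                           const uint8_t new_addr[ETH_ALEN])
{
    int ret = prepare_mac_change(new_addr);

    if (ret)
        return ret;
    /* A driver would program its hardware address filter here. */
    commit_mac_change(dev_addr, new_addr);
    return 0;
}
```

Splitting validation from the store means an invalid address is rejected before either the hardware or the stored address is touched.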
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e68de360

net: mvneta: Clean-up mvneta_init() · 9672850b

Committed by Ezequiel Garcia on May 22, 2014

This commit cleans up mvneta_init(), which initializes the hardware
and allocates the rx/tx queues. The queue allocation is simplified
by using devm_kcalloc instead of kzalloc. The unused phy_addr parameter
is removed. While here, the 'hal' references in the comments are removed.
This commit makes no functionality change.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9672850b

net: mvneta: Check tx queue setup error in mvneta_change_mtu() · a92dbd96

Committed by Ezequiel Garcia on May 22, 2014

This commit checks the return code of mvneta_setup_txq() call
in mvneta_change_mtu(). Also, use the netdevice pointer directly
instead of dereferencing the port structure. While here, let's
fix a tiny comment typo.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a92dbd96

net: mvneta: Clean-up mvneta_tx_frag_process() · 3d4ea02f

Committed by Ezequiel Garcia on May 22, 2014

A tiny clean-up to improve readability. This commit makes no functionality
change.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d4ea02f

23 May, 2014 3 commits

net: mvneta: Implement software TSO · 2adb719d

Committed by Ezequiel Garcia on May 19, 2014

Now that the TSO helper API has been introduced, this commit makes use
of it to implement the TSO in this driver.

Using iperf to test and vmstat to check the CPU usage, shows a substantial
CPU usage drop when TSO is on (~15% vs. ~25%). HTTP-based tests performed
by Willy Tarreau have shown performance improvements.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2adb719d

net: mvneta: Clean mvneta_tx() sk_buff handling · e19d2dda

Committed by Ezequiel Garcia on May 19, 2014

Rework mvneta_tx() so that the code that performs the final handling
before a sk_buff is transmitted runs only if the number of fragments
processed is positive.

This is preparation work to add the support for software TSO.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e19d2dda

net: mvneta: Factorize feature setting · 01ef26ca

Committed by Ezequiel Garcia on May 19, 2014

In order to ease the addition of new features, let's factorize the
feature list.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01ef26ca

17 May, 2014 1 commit

net: mvneta: add support for fixed links · 83895bed

Committed by Thomas Petazzoni on May 16, 2014

Following the introduction of of_phy_register_fixed_link(), this patch
introduces fixed link support in the mvneta driver, for Marvell Armada
370/XP SOCs.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

83895bed

14 May, 2014 1 commit

net: get rid of SET_ETHTOOL_OPS · 7ad24ea4

Committed by Wilfried Klaebe on May 11, 2014

net: get rid of SET_ETHTOOL_OPS

Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone.
This does that.

Mostly done via coccinelle script:
@@
struct ethtool_ops *ops;
struct net_device *dev;
@@
-       SET_ETHTOOL_OPS(dev, ops);
+       dev->ethtool_ops = ops;

Compile tested only, but I'd seriously wonder if this broke anything.
Suggested-by: Dave Miller <davem@davemloft.net>
Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7ad24ea4

17 Apr, 2014 1 commit

net: mvneta: properly configure the MAC <-> PHY connection in all situations · 3f1dd4bc

Committed by Thomas Petazzoni on Apr 15, 2014

Commit 5445eaf3 ('mvneta: Try to fix mvneta when compiled as
module') fixed the mvneta driver to make it work properly when loaded
as a module in SGMII configuration, which was successfully tested by
the author on the Armada XP OpenBlocks AX3, which uses SGMII.

However, some other platforms, namely the Armada XP GP don't use
SGMII, but a QSGMII connection between the MAC and the PHY, and this
case was not supported by the mvneta driver, which was relying on
configuration put in place by the bootloader. While this works when
the mvneta driver is built-in (because clocks are not gated), it
breaks when mvneta is built as a module, because the clock is gated
(all configuration is lost) and then re-enabled when the mvneta driver
is loaded.

In order to support all of RGMII, SGMII and QSGMII, this commit
reworks how the PHY interface configuration is done, and simplifies
it: it removes the mvneta_port_sgmii_config() and
mvneta_gmac_rgmii_set() functions, which were strange because
mvneta_gmac_rgmii_set() was called in all cases, even for SGMII
configurations. Also, the mvneta_gmac_rgmii_set() function was taking
a boolean as argument, which was always true.

Instead, all the PHY interface configuration logic is moved into the
mvneta_port_power_up() function, in a much simpler 'switch' construct,
with four cases:

 - QSGMII: the RGMIIEn bit, the PCSEn bit in GMAC_CTRL_2 are set, and
   the SERDES is configured in QSGMII. Technically speaking,
   configuring the SERDES of the first port would be sufficient, but
   it is simpler to do it on all ports.

 - SGMII: the RGMIIEn bit, the PCSEn bit in GMAC_CTRL_2 are set, and
   the SERDES is configured as SGMII.

 - RGMII: the RGMIIEn bit in GMAC_CTRL_2 is set. The PCSEn bit is kept
   cleared, and no SERDES configuration is done, because RGMII is not
   using SERDES lanes.

 - other: an error is returned. For this reason, the
   mvneta_port_power_up() now returns an int instead of nothing, and
   the return value is checked by mvneta_probe().
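
The four cases listed above map naturally onto a switch. The sketch below models them with a plain struct instead of hardware registers; the type names, the SERDES enum and the position of the RGMII enable bit are assumptions for illustration (bit 3 as the PCS enable bit is stated elsewhere in this log).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum phy_mode { PHY_MODE_RGMII, PHY_MODE_SGMII, PHY_MODE_QSGMII, PHY_MODE_MII };
enum serdes_proto { SERDES_UNTOUCHED, SERDES_SGMII, SERDES_QSGMII };

#define GMAC2_PCS_ENABLE (1u << 3)  /* bit 3 of GMAC_CTRL_2 */
#define GMAC2_PORT_RGMII (1u << 4)  /* assumed position of the RGMIIEn bit */

/* Software model of the two pieces of state the switch touches. */
struct port_regs {
    uint32_t gmac_ctrl_2;
    enum serdes_proto serdes;
};

/* Configure the MAC <-> PHY interface; returns 0 or -EINVAL. */
static int port_power_up(struct port_regs *regs, enum phy_mode mode)
{
    uint32_t ctrl;

    assert(regs != NULL);
    ctrl = regs->gmac_ctrl_2;

    switch (mode) {
    case PHY_MODE_QSGMII:
        regs->serdes = SERDES_QSGMII;
        ctrl |= GMAC2_PCS_ENABLE | GMAC2_PORT_RGMII;
        break;
    case PHY_MODE_SGMII:
        regs->serdes = SERDES_SGMII;
        ctrl |= GMAC2_PCS_ENABLE | GMAC2_PORT_RGMII;
        break;
    case PHY_MODE_RGMII:
        /* No SERDES lanes in use; the PCS bit is left cleared. */
        ctrl |= GMAC2_PORT_RGMII;
        break;
    default:
        /* Unsupported interface: report it so probe can fail. */
        return -EINVAL;
    }

    regs->gmac_ctrl_2 = ctrl;
    return 0;
}
```

Returning an error from the default case is what lets a caller such as the probe function reject an unsupported interface instead of silently relying on bootloader state.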

This has been successfully tested on:

 * Armada XP DB, which has two RGMII and two SGMII connections
 * Armada XP GP, which uses QSGMII for its four interfaces
 * Armada 370 Mirabox, which has two RGMII connections
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3f1dd4bc

14 Apr, 2014 1 commit

Revert "net: mvneta: fix usage as a module on RGMII configurations" · cc6ca302

Committed by Thomas Petazzoni on Apr 13, 2014

This reverts commit e3a8786c. While
this commit allows to use the mvneta driver as a module on some
configurations, it breaks other configurations even if mvneta is used
built-in.

This breakage is due to the fact that on some RGMII platforms, the PCS
bit has to be set, and on some other platforms, it has to be
cleared. At the moment, we lack the information to know exactly the
significance of this bit (the datasheet only says "enables PCS"), and
so we can't produce a patch that will work on all platforms at this
point. And since this change is breaking the network completely for
many users, it's much better to revert it for now. We'll come back
later with a proper fix that takes into account all platforms.

Basically:

 * Armada XP GP is configured as RGMII-ID, and needs the PCS bit to be
   set.
 * Armada 370 Mirabox is configured as RGMII-ID, and needs the PCS bit
   to be cleared.

And at the moment, we don't know how to make the distinction between
those two cases. One hint is that the Armada XP GP appears in fact to
be using a QSGMII connection with the PHY (Quad-SGMII), but
configuring it as SGMII doesn't work, while RGMII-ID works. This needs
more investigation, but in the mean time, let's unbreak the network
for all those users.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reported-by: Arnaud Ebalard <arno@natisbad.org>
Reported-by: Alexander Reuter <Alexander.Reuter@gmx.net>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=73401
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

cc6ca302

30 Mar, 2014 1 commit

net: mvneta: use devm_ioremap_resource() instead of of_iomap() · c3f0dd38

Committed by Thomas Petazzoni on Mar 27, 2014

The mvneta driver currently uses of_iomap(), which has two drawbacks:
it doesn't request the resource, and it isn't devm-style so some error
handling is needed.

This commit switches to use devm_ioremap_resource() instead, which
automatically requests the resource (so the I/O registers region shows
up properly in /proc/iomem), and also is devm-style, which allows to
get rid of some error handling to unmap the I/O registers region.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3f0dd38

27 Mar, 2014 3 commits

net: mvneta: use devm_ioremap_resource() instead of of_iomap() · b5f3b75d

Committed by Thomas Petazzoni on Mar 26, 2014

The mvneta driver currently uses of_iomap(), which has two drawbacks:
it doesn't request the resource, and it isn't devm-style so some error
handling is needed.

b5f3b75d

net: mvneta: fix usage as a module on RGMII configurations · e3a8786c

Committed by Thomas Petazzoni on Mar 26, 2014

Commit 5445eaf3 ('mvneta: Try to fix mvneta when compiled as
module') fixed the mvneta driver to make it work properly when loaded
as a module in SGMII configuration, which was successfully tested by
the author on the Armada XP OpenBlocks AX3, which uses SGMII.

However, it turns out that the Armada XP GP, which uses RGMII, is
affected by a similar problem: its SERDES configuration is lost when
mvneta is loaded as a module, because this configuration is set by the
bootloader, and then lost because the clock is gated by the clock
framework until the mvneta driver is loaded again and the clock is
re-enabled.

However, it turns out that for the RGMII case, setting the SERDES
configuration is not sufficient: the PCS enable bit in the
MVNETA_GMAC_CTRL_2 register must also be set, like in the SGMII
configuration.

Therefore, this commit reworks the SGMII/RGMII initialization: the
only difference between the two now is a different SERDES
configuration, all the rest is identical.

In detail, to achieve this, the commit:

 * Renames MVNETA_SGMII_SERDES_CFG to MVNETA_SERDES_CFG because it is
   not specific to SGMII, but also used on RGMII configurations.

 * Adds a MVNETA_RGMII_SERDES_PROTO definition, that must be used as
   the MVNETA_SERDES_CFG value in RGMII configurations.

 * Removes the mvneta_gmac_rgmii_set() and mvneta_port_sgmii_config()
   functions, and instead directly does the SGMII/RGMII configuration in
   mvneta_port_up(), from where those functions were called. It is
   worth mentioning that mvneta_gmac_rgmii_set() had an 'enable'
   parameter that was always passed as '1', so it was pretty useless.

 * Reworks the mvneta_port_up() function to set the MVNETA_SERDES_CFG
   register to the appropriate value depending on the RGMII vs. SGMII
   configuration. It also unconditionally sets the PCS_ENABLE bit (was
   already done for SGMII, but is now also needed for RGMII), and sets
   the PORT_RGMII bit (which was already done for both SGMII and
   RGMII).

This commit was successfully tested with mvneta compiled as a module,
on both the OpenBlocks AX3 (SGMII configuration) and the Armada XP GP
(RGMII configuration).
Reported-by: Steve McIntyre <steve@einval.com>
Cc: stable@vger.kernel.org # 3.11.x: 5445eaf3 mvneta: Try to fix mvneta when compiled as module
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3a8786c

net: mvneta: rename MVNETA_GMAC2_PSC_ENABLE to MVNETA_GMAC2_PCS_ENABLE · a79121d3

Committed by Thomas Petazzoni on Mar 26, 2014

Bit 3 of the MVNETA_GMAC_CTRL_2 is actually used to enable the PCS,
not the PSC: there was a typo in the name of the define, which this
commit fixes.

Cc: stable@vger.kernel.org
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a79121d3

15 Mar, 2014 1 commit

net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq · 57a7744e

Committed by Eric W. Biederman on Mar 13, 2014

Replace the bh safe variant with the hard irq safe variant.

We need a hard irq safe variant to deal with netpoll transmitting
packets from hard irq context, and we need it in most if not all of
the places using the bh safe variant.

Except on 32bit uni-processor the code is exactly the same so don't
bother with a bh variant, just have a hard irq safe variant that
everyone can use.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

57a7744e

15 Feb, 2014 1 commit

net: introduce netdev_alloc_pcpu_stats() for drivers · 1c213bd2

Committed by WANG Cong on Feb 13, 2014

There are many drivers calling alloc_percpu() to allocate pcpu stats
and then initializing ->syncp. So just introduce a helper function for them.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c213bd2

17 Jan, 2014 13 commits

net: mvneta: make mvneta_txq_done() return void · cd713199

Committed by Arnaud Ebalard on Jan 16, 2014

The function return parameter is not used in mvneta_tx_done_gbe(),
where the function is called. This patch makes the function return
void.
Reviewed-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

cd713199

net: mvneta: mvneta_tx_done_gbe() cleanups · 0713a86a

Committed by Arnaud Ebalard on Jan 16, 2014

The return value and third parameter of mvneta_tx_done_gbe() are no
longer used. This patch changes the function prototype and removes a useless
variable where the function is called.
Reviewed-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0713a86a

net: mvneta: implement rx_copybreak · f19fadfc

Committed by willy tarreau on Jan 16, 2014

calling dma_map_single()/dma_unmap_single() is quite expensive compared
to copying a small packet. So let's copy short frames and keep the buffers
mapped. We set the limit to 256 bytes which seems to give good results both
on the XP-GP board and on the AX3/4.

The Rx small packet rate increased by 16.4% doing this, from 486kpps to
573kpps. It is worth noting that even the call to the function
dma_sync_single_range_for_cpu() is expensive (300 ns) although less
than dma_unmap_single(). Without it, the packet rate rises to 711kpps
(+24%). Thus on systems where coherency from device to CPU is
guaranteed by a snoop control unit, this patch should provide even more
gains, and probably rx_copybreak could be increased.
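
The copybreak decision can be sketched in userspace as below. The 256-byte limit comes from the message above; everything else (rx_result, rx_frame, using malloc() in place of an skb allocation, and dropping the frame on allocation failure) is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RX_COPYBREAK 256  /* threshold from the commit message */

struct rx_result {
    unsigned char *data;  /* buffer handed up the stack, NULL if dropped */
    int buffer_reused;    /* 1 if the receive buffer stays mapped in the ring */
};

/* Decide whether to copy a received frame or hand over its buffer. */
static struct rx_result rx_frame(unsigned char *rx_buf, size_t len)
{
    struct rx_result r = { NULL, 0 };

    assert(rx_buf != NULL);
    if (len <= RX_COPYBREAK) {
        /* Small frame: copy it out and keep the original buffer mapped. */
        r.buffer_reused = 1;
        r.data = malloc(len ? len : 1);
        if (r.data != NULL)
            memcpy(r.data, rx_buf, len);
        return r;  /* data == NULL means the frame was dropped */
    }

    /* Large frame: hand the buffer itself over; the ring needs a refill. */
    r.data = rx_buf;
    return r;
}
```

The trade-off is a short memcpy() against an unmap, a fresh allocation and a new mapping, which is why the approach only pays off below some frame size.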

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

f19fadfc

net: mvneta: convert to build_skb() · 8ec2cd48

Committed by willy tarreau on Jan 16, 2014

Make use of build_skb() to allocate frags on the RX path. When frag size
is lower than a page size, we can use netdev_alloc_frag(), and we fall back
to kmalloc() for larger sizes. The frag size is stored into the mvneta_port
struct. The alloc/free functions check the frag size to decide what alloc/
free method to use. MTU changes are safe because the MTU change function
stops the device and clears the queues before applying the change.

With this patch, I observed a reproducible 2% performance improvement on
HTTP-based benchmarks, and 5% on small packet RX rate.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ec2cd48

net: mvneta: prefetch next rx descriptor instead of current one · 34e4179d

Committed by willy tarreau on Jan 16, 2014

Currently, the mvneta driver tries to prefetch the current Rx
descriptor during read. Tests have shown that prefetching the
next one instead increases general performance by about 1% on
HTTP traffic.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

34e4179d

net: mvneta: simplify access to the rx descriptor status · 5428213c

Committed by willy tarreau on Jan 16, 2014

At several places, we already know the value of the rx status but
we call functions which dereference the pointer again to get it
and don't need the descriptor for anything else. Simplify this
task by replacing the rx desc pointer by the status word itself.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

5428213c

net: mvneta: factor rx refilling code · a1a65ab1

Committed by willy tarreau on Jan 16, 2014

Make mvneta_rxq_fill() use mvneta_rx_refill() instead of using
duplicate code.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

a1a65ab1

net: mvneta: remove tests for impossible cases in the tx_done path · 6c498974

Committed by willy tarreau on Jan 16, 2014

Currently, mvneta_txq_bufs_free() calls mvneta_tx_done_policy() with
a non-null cause to retrieve the pointer to the next queue to process.
There are useless tests on the return queue number and on the pointer,
all of which are well defined within a known limited set. This code
path is fast, although not critical. Removing 3 tests here that the
compiler could not optimize (verified) is always desirable.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c498974

net: mvneta: replace Tx timer with a real interrupt · 71f6d1b3

Committed by willy tarreau on Jan 16, 2014

Right now the mvneta driver doesn't handle Tx IRQ, and relies on two
mechanisms to flush Tx descriptors : a flush at the end of mvneta_tx()
and a timer. If a burst of packets is emitted faster than the device
can send them, then the queue is stopped until next wake-up of the
timer 10ms later. This causes jerky output traffic with bursts and
pauses, making it difficult to reach line rate with very few streams.

A test on UDP traffic shows that it's not possible to go beyond 134
Mbps / 12 kpps of outgoing traffic with 1500-bytes IP packets. Routed
traffic tends to observe pauses as well if the traffic is bursty,
making it even burstier after the wake-up.

It seems that this feature was inherited from the original driver but
nothing there mentions any reason for not using the interrupt instead,
which the chip supports.

Thus, this patch enables Tx interrupts and removes the timer. It does
the two at once because it's not really possible to make the two
mechanisms coexist, so a split patch doesn't make sense.

First tests performed on a Mirabox (Armada 370) show that less CPU
seems to be used when sending traffic. One reason might be that we now
call the mvneta_tx_done_gbe() with a mask indicating which queues have
been done instead of looping over all of them.
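
Processing only the queues flagged in a cause mask can be sketched as follows. The queue count, the function name and the handled[] bookkeeping are assumptions for illustration; the real per-queue cleanup is replaced by a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TXQ_NUMBER 8  /* assumed number of Tx queues */

/*
 * Visit only the queues whose bit is set in cause_tx.
 * handled[q] is incremented for each queue visited; returns how many were.
 */
static int tx_done_by_cause(uint32_t cause_tx, int handled[TXQ_NUMBER])
{
    int count = 0;

    assert(handled != NULL);
    assert((cause_tx >> TXQ_NUMBER) == 0);  /* no bits beyond the last queue */

    while (cause_tx) {
        int q = 0;

        /* Find the index of the lowest set bit. */
        while (!(cause_tx & (1u << q)))
            q++;

        handled[q]++;             /* stands in for the per-queue Tx cleanup */
        cause_tx &= cause_tx - 1; /* clear that bit and move on */
        count++;
    }
    return count;
}
```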

The same UDP test above now happily reaches 987 Mbps / 87.7 kpps.
Single-stream TCP traffic can now more easily reach line rate. HTTP
transfers of 1 MB objects over a single connection went from 730 to
840 Mbps. It is even possible to go significantly higher (>900 Mbps)
by tweaking tcp_tso_win_divisor.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Arnaud Ebalard <arno@natisbad.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

71f6d1b3

net: mvneta: add missing bit descriptions for interrupt masks and causes · 40ba35e7

Committed by willy tarreau on Jan 16, 2014

Marvell has not published the chip's datasheet yet, so it's very hard
to find the relevant bits to manipulate to change the IRQ behaviour.
Fortunately, these bits are described in the proprietary LSP patch set
which is publicly available here :

    http://www.plugcomputer.org/downloads/mirabox/

So let's put them back in the driver in order to reduce the burden of
current and future maintenance.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

40ba35e7

net: mvneta: do not schedule in mvneta_tx_timeout · 29021366

Committed by willy tarreau on Jan 16, 2014

If a queue timeout is reported, we can oops because of some
schedules while the caller is atomic, as shown below :

  mvneta d0070000.ethernet eth0: tx timeout
  BUG: scheduling while atomic: bash/1528/0x00000100
  Modules linked in: slhttp_ethdiv(C) [last unloaded: slhttp_ethdiv]
  CPU: 2 PID: 1528 Comm: bash Tainted: G        WC   3.13.0-rc4-mvebu-nf #180
  [<c0011bd9>] (unwind_backtrace+0x1/0x98) from [<c000f1ab>] (show_stack+0xb/0xc)
  [<c000f1ab>] (show_stack+0xb/0xc) from [<c02ad323>] (dump_stack+0x4f/0x64)
  [<c02ad323>] (dump_stack+0x4f/0x64) from [<c02abe67>] (__schedule_bug+0x37/0x4c)
  [<c02abe67>] (__schedule_bug+0x37/0x4c) from [<c02ae261>] (__schedule+0x325/0x3ec)
  [<c02ae261>] (__schedule+0x325/0x3ec) from [<c02adb97>] (schedule_timeout+0xb7/0x118)
  [<c02adb97>] (schedule_timeout+0xb7/0x118) from [<c0020a67>] (msleep+0xf/0x14)
  [<c0020a67>] (msleep+0xf/0x14) from [<c01dcbe5>] (mvneta_stop_dev+0x21/0x194)
  [<c01dcbe5>] (mvneta_stop_dev+0x21/0x194) from [<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24)
  [<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24) from [<c024afc7>] (dev_watchdog+0x18b/0x1c4)
  [<c024afc7>] (dev_watchdog+0x18b/0x1c4) from [<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c)
  [<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c) from [<c0020cad>] (run_timer_softirq+0x115/0x170)
  [<c0020cad>] (run_timer_softirq+0x115/0x170) from [<c001ccb9>] (__do_softirq+0xbd/0x1a8)
  [<c001ccb9>] (__do_softirq+0xbd/0x1a8) from [<c001cfad>] (irq_exit+0x61/0x98)
  [<c001cfad>] (irq_exit+0x61/0x98) from [<c000d4bf>] (handle_IRQ+0x27/0x60)
  [<c000d4bf>] (handle_IRQ+0x27/0x60) from [<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8)
  [<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8) from [<c000fba9>] (__irq_usr+0x49/0x60)

Ben Hutchings attempted to propose a better fix consisting in using a
scheduled work for this, but while it fixed this panic, it caused other
random freezes and panics proving that the reset sequence in the driver
is unreliable and that additional fixes should be investigated.

When sending multiple streams over a link limited to 100 Mbps, Tx timeouts
happen from time to time, and the driver correctly recovers only when the
function is disabled.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

29021366

net: mvneta: use per_cpu stats to fix an SMP lock up · 74c41b04

Committed by willy tarreau on Jan 16, 2014

Stats writers are mvneta_rx() and mvneta_tx(). They don't lock anything
when they update the stats, and as a result, it randomly happens that
the stats freeze on SMP if two updates happen during stats retrieval.
This is very easily reproducible by starting two HTTP servers and binding
each of them to a different CPU, then consulting /proc/net/dev in loops
during transfers, the interface should immediately lock up. This issue
also randomly happens upon link state changes during transfers, because
the stats are collected in this situation, but it takes more attempts to
reproduce it.

The comments in netdevice.h suggest using per_cpu stats instead to get
rid of this issue.

This patch implements this. It merges both rx_stats and tx_stats into
a single "stats" member with a single syncp. Both mvneta_rx() and
mvneta_tx() now only update a single CPU's counters.

In turn, mvneta_get_stats64() does the summing by iterating over all CPUs
to get their respective stats.
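
The per-CPU layout and the summing reader can be modelled in userspace as below. This is a simplified sketch: the CPU count and all names are assumptions, and it deliberately omits the sequence counter ("syncp") that the kernel uses so that readers see consistent 64-bit values.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4  /* assumed CPU count for the example */

/* One merged rx/tx stats block per CPU. */
struct pcpu_stats {
    uint64_t rx_packets;
    uint64_t rx_bytes;
    uint64_t tx_packets;
    uint64_t tx_bytes;
};

/* Writers touch only the block of the CPU they run on. */
static void account_rx(struct pcpu_stats *percpu, int cpu,
                       uint32_t pkts, uint32_t bytes)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    percpu[cpu].rx_packets += pkts;
    percpu[cpu].rx_bytes += bytes;
}

static void account_tx(struct pcpu_stats *percpu, int cpu,
                       uint32_t pkts, uint32_t bytes)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    percpu[cpu].tx_packets += pkts;
    percpu[cpu].tx_bytes += bytes;
}

/* The reader sums every CPU's block into one total. */
static struct pcpu_stats get_stats64(const struct pcpu_stats *percpu)
{
    struct pcpu_stats total = { 0, 0, 0, 0 };
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        total.rx_packets += percpu[cpu].rx_packets;
        total.rx_bytes += percpu[cpu].rx_bytes;
        total.tx_packets += percpu[cpu].tx_packets;
        total.tx_bytes += percpu[cpu].tx_bytes;
    }
    return total;
}
```

Because each writer owns its own block, two CPUs never update the same counters concurrently, which is the property the fix relies on.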

With this change, stats are still correct and no more lockup is encountered.

Note that this bug was present since the first import of the mvneta
driver.  It might make sense to backport it to some stable trees. If
so, it depends on "d33dc73 net: mvneta: increase the 64-bit rx/tx stats
out of the hot path".

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

74c41b04

net: mvneta: increase the 64-bit rx/tx stats out of the hot path · dc4277dd

Committed by willy tarreau on Jan 16, 2014

It is better to count packets and bytes in 32-bit variables on the
stack, then accumulate them once at the end. This saves two memory writes
and two memory barriers per packet. The incoming packet rate was
increased by 4.7% on the Openblocks AX3 thanks to this.
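
A minimal sketch of the idea, assuming a poll loop that receives an array of frame lengths: per-packet counting happens in 32-bit locals, and the shared 64-bit counters are written once per batch. The names rx_stats and rx_poll are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shared 64-bit counters, expensive to update per packet on 32-bit CPUs. */
struct rx_stats {
    uint64_t rx_packets;
    uint64_t rx_bytes;
};

/* Process a batch of frames; returns the number of frames handled. */
static int rx_poll(struct rx_stats *stats, const uint32_t *frame_lens, int nframes)
{
    uint32_t rcvd_pkts = 0;   /* cheap 32-bit locals for the hot loop */
    uint32_t rcvd_bytes = 0;
    int i;

    assert(stats != NULL);
    for (i = 0; i < nframes; i++) {
        rcvd_pkts++;
        rcvd_bytes += frame_lens[i];
    }

    /* Single update of the 64-bit counters, outside the per-packet loop. */
    if (rcvd_pkts) {
        stats->rx_packets += rcvd_pkts;
        stats->rx_bytes += rcvd_bytes;
    }
    return (int)rcvd_pkts;
}
```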

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc4277dd

openanolis / cloud-kernel synced successfully more than 1 year ago