Commits · 2214c260c72b0bd94e6c1c19bf451686212025d3 · openanolis / cloud-kernel

09 May, 2017 1 commit

md: don't return -EAGAIN in md_allow_write for external metadata arrays · 2214c260

Authored by Artur Paszkiewicz on May 08, 2017

This essentially reverts commit b5470dc5 ("md: resolve external
metadata handling deadlock in md_allow_write") with some adjustments.

Since commit 6791875e ("md: make reconfig_mutex optional for writes
to md sysfs files.") changing array_state to 'active' does not use
mddev_lock() and will not cause a deadlock with md_allow_write(). This
revert simplifies userspace tools that write to sysfs attributes like
"stripe_cache_size" or "consistency_policy" because it removes the need
for special handling for external metadata arrays, checking for EAGAIN
and retrying the write.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
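
As a rough illustration of the userspace workaround that this change makes unnecessary, the sketch below writes a value to a sysfs attribute and retries while the write fails with EAGAIN. The helper name `write_attr_with_retry`, the retry limit, and the 10 ms delay are assumptions made for this example; it is not taken from mdadm or any other existing tool.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a string to a sysfs attribute, retrying while
 * the kernel answers EAGAIN. Returns 0 on success, -1 on failure (errno set).
 */
static int write_attr_with_retry(const char *path, const char *value, int max_tries)
{
    size_t len = strlen(value);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        int fd = open(path, O_WRONLY);
        if (fd < 0)
            return -1;

        ssize_t n = write(fd, value, len);
        int saved = errno;          /* keep errno across close() */
        close(fd);

        if (n == (ssize_t)len)
            return 0;               /* whole value accepted */
        if (n >= 0) {
            errno = EIO;            /* short write: treat as an error */
            return -1;
        }
        if (saved != EAGAIN) {
            errno = saved;          /* any other error is final */
            return -1;
        }
        usleep(10000);              /* back off briefly, then retry */
    }
    errno = EAGAIN;
    return -1;
}
```

With md_allow_write() no longer returning -EAGAIN for external metadata arrays, a tool writing "stripe_cache_size" or "consistency_policy" should be able to issue a single plain write instead of a loop like this.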

05 May, 2017 4 commits

md/raid5: make use of spin_lock_irq over local_irq_disable + spin_lock · 3d05f3ae

Authored by Julia Cartwright on Apr 28, 2017

On mainline, there is no functional difference, just less code, and
symmetric lock/unlock paths.

On PREEMPT_RT builds, this fixes the following warning, seen by
Alexander GQ Gerasiov, due to the sleeping nature of spinlocks.

   BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:993
   in_atomic(): 0, irqs_disabled(): 1, pid: 58, name: kworker/u12:1
   CPU: 5 PID: 58 Comm: kworker/u12:1 Tainted: G        W       4.9.20-rt16-stand6-686 #1
   Hardware name: Supermicro SYS-5027R-WRF/X9SRW-F, BIOS 3.2a 10/28/2015
   Workqueue: writeback wb_workfn (flush-253:0)
   Call Trace:
    dump_stack+0x47/0x68
    ? migrate_enable+0x4a/0xf0
    ___might_sleep+0x101/0x180
    rt_spin_lock+0x17/0x40
    add_stripe_bio+0x4e3/0x6c0 [raid456]
    ? preempt_count_add+0x42/0xb0
    raid5_make_request+0x737/0xdd0 [raid456]

Reported-by: Alexander GQ Gerasiov <gq@redlab-i.ru>
Tested-by: Alexander GQ Gerasiov <gq@redlab-i.ru>
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Shaohua Li <shli@fb.com>

qede: Fix possible misconfiguration of advertised autoneg value. · 161adb04

Authored by sudarsana.kalluru@cavium.com on May 04, 2017

Fail the configuration of advertised speed-autoneg value if the config
update is not supported.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Fix overriding of supported autoneg value. · 34f9199c

Authored by sudarsana.kalluru@cavium.com on May 04, 2017

The driver currently uses the advertised-autoneg value to populate the
supported-autoneg field. When the advertised field is updated, the user gets
the same value for the supported field. The supported-autoneg value needs to
be populated from the link capabilities value returned by the MFW.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed*: Fix possible overflow for status block id field. · f870a3c6

Authored by sudarsana.kalluru@cavium.com on May 04, 2017

Value for status block id could be more than 256 in 100G mode, need to
update its data type from u8 to u16.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
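
To make the overflow concrete, here is a minimal userspace sketch of what happens when an id above 255 is stored in an 8-bit field versus a 16-bit one. The helper names are hypothetical and the snippet is not taken from the qed sources; it only demonstrates the integer narrowing involved.

```c
#include <stdint.h>

/* Hypothetical helper: store a status block id in an 8-bit field.
 * Values above 255 wrap modulo 256. */
static uint8_t sb_id_as_u8(unsigned int sb_id)
{
    return (uint8_t)sb_id;
}

/* Hypothetical helper: store the same id in a 16-bit field.
 * Values up to 65535 are preserved. */
static uint16_t sb_id_as_u16(unsigned int sb_id)
{
    return (uint16_t)sb_id;
}
```

For example, an id of 300 becomes 44 in the 8-bit field but stays 300 in the 16-bit field, so two different status blocks could otherwise end up sharing one stored id.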

04 May, 2017 11 commits

netvsc: make sure napi enabled before vmbus_open · 2be0f264

Authored by stephen hemminger on May 03, 2017

This fixes a race where the vmbus callback for a newly arriving packet
could occur before NAPI is initialized.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aquantia: Fix driver name reported by ethtool · 5900eca1

Authored by Pavel Belous on May 03, 2017

V2: using "aquantia" subsystem tag.

The command "ethtool -i ethX" should display the driver name (driver: atlantic)
instead of the vendor name (driver: aquantia).

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

forcedeth: remove unnecessary carrier status check · 5d826b7b

Authored by Zhu Yanjun on May 03, 2017

Since netif_carrier_on() does nothing if the device's
carrier is already on, it is unnecessary to check the
carrier status first.

The same applies to netif_carrier_off().

Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
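
The idempotence argument can be modelled in a few lines of userspace C. This is a simplified stand-in, not kernel code: `struct netdev_model` and the `model_carrier_*` functions are hypothetical names that mimic the test-then-change behaviour the commit message relies on.

```c
#include <stdbool.h>

/* Simplified stand-in for struct net_device: only a carrier flag and a
 * counter of link state change notifications. */
struct netdev_model {
    bool no_carrier;   /* plays the role of the "no carrier" state bit */
    int link_events;   /* how many state changes were actually signalled */
};

/* Models netif_carrier_on(): acts only when the carrier was off. */
static void model_carrier_on(struct netdev_model *dev)
{
    if (dev->no_carrier) {
        dev->no_carrier = false;
        dev->link_events++;
    }
}

/* Models netif_carrier_off(): acts only when the carrier was on. */
static void model_carrier_off(struct netdev_model *dev)
{
    if (!dev->no_carrier) {
        dev->no_carrier = true;
        dev->link_events++;
    }
}
```

Because the helpers already test the current state, calling `model_carrier_on()` twice in a row signals only one event, which is why an extra check in the caller adds nothing.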

zram: reduce load operation in page_same_filled · f0fe9984

Authored by Sangwoo Park on May 03, 2017

In the page_same_filled function, every element in the page is compared
with the value at the next index.  The current comparison routine compares
the (i)th and (i+1)th values of the page.

In this case, two load operations occur for each comparison.  But if we
store the first value of the page in a 'val' variable and use it to
compare with the others, the number of load operations is reduced.  It
reduces load operations per page by up to 64 times.

Link: http://lkml.kernel.org/r/1488428104-7257-1-git-send-email-sangwoo2.park@lge.com
Signed-off-by: Sangwoo Park <sangwoo2.park@lge.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
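
The two comparison strategies can be sketched in userspace C as follows. This is an illustrative model rather than the zram source: the function names, the fixed 4096-byte page size, and the `PAGE_WORDS` macro are assumptions for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size of 4096 bytes, counted in unsigned long words. */
#define PAGE_WORDS (4096 / sizeof(unsigned long))

/* Neighbour comparison: reads page[pos] and page[pos + 1] on every step. */
static bool same_filled_pairwise(const unsigned long *page, unsigned long *element)
{
    for (size_t pos = 0; pos < PAGE_WORDS - 1; pos++) {
        if (page[pos] != page[pos + 1])
            return false;
    }
    *element = page[0];
    return true;
}

/* Cached comparison: keeps the first word in 'val', so each step needs
 * only one load from the page. */
static bool same_filled_cached(const unsigned long *page, unsigned long *element)
{
    unsigned long val = page[0];

    for (size_t pos = 1; pos < PAGE_WORDS; pos++) {
        if (val != page[pos])
            return false;
    }
    *element = val;
    return true;
}
```

Both functions give the same answer for any page; the cached form simply gives the compiler one fewer memory operand per iteration.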

zram: use zram_free_page instead of open-coded · 302128dc

Authored by Minchan Kim on May 03, 2017

zram_free_page already handles the NULL handle case and the same-page case,
so use it to reduce the chance of errors.  (Actually, I made a mistake when
I handled the same-page feature.)

Link: http://lkml.kernel.org/r/1492052365-16169-7-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: introduce zram data accessor · 643ae61d

Authored by Minchan Kim on May 03, 2017

With element, I sometimes got confused between handle and element access.
It might be my fault, but I think it's time to introduce accessors to
protect future idiots like me. This patch is just a clean-up patch, so it
shouldn't change any behavior.

Link: http://lkml.kernel.org/r/1492052365-16169-6-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: remove zram_meta structure · beb6602c

Authored by Minchan Kim on May 03, 2017

It's redundant now. Remove it and use the zram structure directly.

Link: http://lkml.kernel.org/r/1492052365-16169-5-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: use zram_slot_lock instead of raw bit_spin_lock op · 86c49814

Authored by Minchan Kim on May 03, 2017

In this clean-up phase, I want to use zram's wrapper function to lock
table access, which is more consistent with zram's other functions.

Link: http://lkml.kernel.org/r/1492052365-16169-4-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: partial IO refactoring · 1f7319c7

Authored by Minchan Kim on May 03, 2017

For architectures with PAGE_SIZE > 4K, zram has supported partial IO.
However, the mixed code for handling normal and partial IO is too messy,
which makes it error-prone to modify the IO handler functions for upcoming
features, so this patch aims to clean up zram's IO handling functions.

Link: http://lkml.kernel.org/r/1492052365-16169-3-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: handle multiple pages attached bio's bvec · e86942c7

Authored by Minchan Kim on May 03, 2017

Patch series "zram clean up", v2.

This patchset aims to clean up zram.

[1] clean up multiple pages' bvec handling.
[2] clean up partial IO handling.
[3-6] clean up zram by using accessors and removing a pointless structure.

With [2-6] applied, we save a few hundred bytes and greatly improve
readability.

x86: 708 byte save

    add/remove: 1/1 grow/shrink: 0/11 up/down: 478/-1186 (-708)
    function                                     old     new   delta
    zram_special_page_read                         -     478    +478
    zram_reset_device                            317     314      -3
    mem_used_max_store                           131     128      -3
    compact_store                                 96      93      -3
    mm_stat_show                                 203     197      -6
    zram_add                                     719     712      -7
    zram_slot_free_notify                        229     214     -15
    zram_make_request                            819     803     -16
    zram_meta_free                               128     111     -17
    zram_free_page                               180     151     -29
    disksize_store                               432     361     -71
    zram_decompress_page.isra                    504       -    -504
    zram_bvec_rw                                2592    2080    -512
    Total: Before=25350773, After=25350065, chg -0.00%

ppc64: 231 byte save

    add/remove: 2/0 grow/shrink: 1/9 up/down: 681/-912 (-231)
    function                                     old     new   delta
    zram_special_page_read                         -     480    +480
    zram_slot_lock                                 -     200    +200
    vermagic                                      39      40      +1
    mm_stat_show                                 256     248      -8
    zram_meta_free                               200     184     -16
    zram_add                                     944     912     -32
    zram_free_page                               348     308     -40
    disksize_store                               572     492     -80
    zram_decompress_page                         664     564    -100
    zram_slot_free_notify                        292     160    -132
    zram_make_request                           1132    1000    -132
    zram_bvec_rw                                2768    2396    -372
    Total: Before=17565825, After=17565594, chg -0.00%

This patch (of 6):

Johannes Thumshirn reported that the system panics when using the NVMe over
Fabrics loopback target with zram.

The reason is that zram expects each bvec in a bio to contain a single page,
but nvme can attach a large run of pages to the bio's bvec, so zram's index
arithmetic can go wrong and the resulting out-of-bounds access panics the
system.

[1] in mainline solved the problem by limiting max_sectors to
SECTORS_PER_PAGE, but that makes zram slow because the bio has to be split
page by page. This patch instead makes zram aware of multiple pages in a
bvec, so the problem is solved without any regression (i.e., bio split).

[1] 0bc31538, zram: set physical queue limits to avoid array out of
    bounds accesses

Link: http://lkml.kernel.org/r/20170413134057.GA27499@bbox
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

oom: improve oom disable handling · d75da004

Authored by Michal Hocko on May 03, 2017

Tetsuo has reported that the sysrq-triggered OOM killer prints
misleading information when no tasks are selected:

sysrq: SysRq : Manual OOM execution
Out of memory: Kill process 4468 ((agetty)) score 0 or sacrifice child
Killed process 4468 ((agetty)) total-vm:43704kB, anon-rss:1760kB, file-rss:0kB, shmem-rss:0kB
sysrq: SysRq : Manual OOM execution
Out of memory: Kill process 4469 (systemd-cgroups) score 0 or sacrifice child
Killed process 4469 (systemd-cgroups) total-vm:10704kB, anon-rss:120kB, file-rss:0kB, shmem-rss:0kB
sysrq: SysRq : Manual OOM execution
sysrq: OOM request ignored because killer is disabled
sysrq: SysRq : Manual OOM execution
sysrq: OOM request ignored because killer is disabled
sysrq: SysRq : Manual OOM execution
sysrq: OOM request ignored because killer is disabled

The real reason is that there are no eligible tasks for the OOM killer
to select but since commit 7c5f64f8 ("mm: oom: deduplicate victim
selection code for memcg and global oom") the semantic of out_of_memory
has changed without updating moom_callback.

This patch updates moom_callback to report that no task was eligible, which
is the case both when the oom killer is disabled and when there are no
eligible tasks. To help distinguish the first case from the second, add a
printk to both oom_killer_{enable,disable}. This information is useful on its
own because it might help in debugging potential memory allocation failures.

Fixes: 7c5f64f8 ("mm: oom: deduplicate victim selection code for memcg and global oom")
Link: http://lkml.kernel.org/r/20170404134705.6361-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

03 May, 2017 24 commits

ibmvnic: Move queue restarting in ibmvnic_tx_complete · 7c3e7de3

Authored by Nathan Fontenot on May 03, 2017

Restart of the subqueue should occur outside of the loop processing
any tx buffers instead of doing this in the middle of the loop.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Record SKB RX queue during poll · 94ca305f

Authored by Thomas Falcon on May 03, 2017

Map each RX SKB to the RX queue associated with the driver's RX SCRQ.
This should improve the RX CPU load balancing issues seen by the
performance team.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Continue skb processing after skb completion error · ca05e316

Authored by Nathan Fontenot on May 03, 2017

There is no need to stop processing skbs if we encounter an
skb that has a receive completion error.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Check for driver reset first in ibmvnic_xmit · 161b8a81

Authored by Nathan Fontenot on May 03, 2017

Move the check for the driver resetting to the first thing
in ibmvnic_xmit().

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Wait for any pending scrqs entries at driver close · 46293b94

Authored by Nathan Fontenot on May 03, 2017

When closing the ibmvnic driver we need to wait for any pending
sub crq entries to ensure they are handled.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Clean up tx pools when closing · b41b83e9

Authored by Nathan Fontenot on May 03, 2017

When closing the ibmvnic driver, most notably during the reset
path, the tx pools need to be cleaned to ensure there are no
hanging skbs that need to be freed.

The need for this was found while debugging a loss of network
traffic after handling a driver reset. The underlying cause was
some skbs in the tx pool that were never freed. As a
result, the upper network layers never tried a re-send since they
believed the driver still had the skb.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Whitespace correction in release_rx_pools · e0ebe942

Authored by Nathan Fontenot on May 03, 2017

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Delete napi's when releasing driver resources · c7bac00b

Authored by Nathan Fontenot on May 03, 2017

The napi structs allocated at driver initialization need to be
freed when releasing the driver's resources.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Updated reset handling · ed651a10

Authored by Nathan Fontenot on May 03, 2017

The ibmvnic driver has multiple handlers for resetting the driver
depending on the reason the reset is needed (failover, lpm,
fatal errors, ...). All of the reset handlers do essentially the same
thing, so this patch moves this work to a common reset handler.

By doing this we also allow the driver to better handle situations
where we can get a reset while handling a reset.

The updated reset handling works by adding a reset work item to the
list of resets and then scheduling work to perform the reset. This
step is necessary because we can receive a reset in interrupt context
and we want to handle the reset out of interrupt context.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Replace is_closed with state field · 90c8014c

Authored by Nathan Fontenot on May 03, 2017

Replace the is_closed flag in the ibmvnic adapter struct with a
more comprehensive state field that tracks the current state of
the driver.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Move resource initialization to its own routine · bfc32f29

Authored by Nathan Fontenot on May 03, 2017

Move all of the calls to initialize resources for the driver to
a separate routine.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: usb: qmi_wwan: add Telit ME910 support · 4c54dc02

Authored by Daniele Palmas on May 03, 2017

This patch adds support for Telit ME910 PID 0x1100.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

tg3: don't clear stats while tg3_close · 37a7fdf2

Authored by YueHaibing on May 03, 2017

Currently a tg3 NIC's stats are cleared after ifdown/ifup. bond_get_stats
traverses its slaves to get statistics and accumulates the increments. If a
tg3 NIC is added to a bond as a slave, ifdown/ifup causes the bond's stats to
become a tremendous value (e.g. 1638.3 PiB) because of the negative increment.

Fixes: 92feeabf ("tg3: Save stats across chip resets")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
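
The "negative increment" is easiest to see with plain unsigned arithmetic. The snippet below is a generic illustration, not the bonding driver's code; `counter_delta` is a hypothetical helper that computes the increment between two samples of a 64-bit counter.

```c
#include <stdint.h>

/* Hypothetical helper: increment of an unsigned 64-bit counter between
 * two samples. The subtraction wraps modulo 2^64 when cur < prev. */
static uint64_t counter_delta(uint64_t prev, uint64_t cur)
{
    return cur - prev;
}
```

If a counter was sampled at 1000 and then reads 0 after being cleared, the computed increment is 2^64 - 1000 rather than -1000, and adding such a delta into an accumulated total can yield an enormous value. Keeping the NIC's stats across close avoids the counter ever going backwards.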

xdp: use common helper for netlink extended ack reporting · 4d463c4d

Authored by Daniel Borkmann on May 03, 2017

Small follow-up to d74a32ac ("xdp: use netlink extended ACK reporting")
in order to let drivers all use the same NL_SET_ERR_MSG_MOD() helper macro
for reporting. This also ensures that we consistently add the driver's
prefix for dumping the report in user space to indicate that the error
message is driver specific and not coming from core code. Furthermore,
NL_SET_ERR_MSG_MOD() now reuses NL_SET_ERR_MSG() and thus makes all macros
check the pointer as suggested.

References: https://www.spinics.net/lists/netdev/msg433267.html
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

smsc911x: Adding support for Microchip LAN9250 Ethernet controller · f6fec61e

Authored by David Cai on May 02, 2017

Adding support for Microchip LAN9250 Ethernet controller.

Signed-off-by: David Cai <david.cai@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: Optimize page recycling for XDP · 77322538

Authored by Sunil Goutham on May 02, 2017

The driver takes one extra reference on the page for recycling,
which is fine in the usual packet path where
each 64KB page is segmented into multiple receive buffers.

But in XDP mode, since there is just one receive buffer per
page, taking the extra page reference itself becomes a big bottleneck,
consuming ~50% of CPU cycles due to atomic operations.

This patch adds an internal ref count in pgcache for each
page, and additional page references are taken in a batch
instead of just one at a time. The internal counter, i.e. 'pgcache->ref_count',
and the page's counter, i.e. 'page->_refcount', are compared to check
the page's recyclability.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
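
The batching idea can be modelled in userspace with C11 atomics. What follows is a deliberately simplified sketch and not the thunderx driver's logic: the struct names, the batch size of 256, the `atomic_ops` counter, and the exact recyclability condition are all assumptions chosen to keep the model small.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define REF_BATCH 256   /* hypothetical batch size */

struct page_model {
    atomic_int refcount;        /* stands in for page->_refcount */
};

struct pgcache_model {
    struct page_model *page;
    int ref_count;              /* pre-taken references not yet handed out;
                                   plain int, touched by one CPU only */
    int atomic_ops;             /* atomic additions performed so far */
};

static void pgcache_init(struct pgcache_model *pc, struct page_model *page)
{
    atomic_init(&page->refcount, 1);    /* the allocator's own reference */
    pc->page = page;
    pc->ref_count = 0;
    pc->atomic_ops = 0;
}

/* Hand out one reference; touch the atomic counter only once per batch. */
static void pgcache_get_ref(struct pgcache_model *pc)
{
    if (pc->ref_count == 0) {
        atomic_fetch_add(&pc->page->refcount, REF_BATCH);
        pc->ref_count = REF_BATCH;
        pc->atomic_ops++;
    }
    pc->ref_count--;
}

/* A consumer (for example the network stack) drops its reference. */
static void page_put_ref(struct page_model *page)
{
    atomic_fetch_sub(&page->refcount, 1);
}

/* Recyclable when every handed-out reference has been returned, i.e. only
 * the base reference and the unused part of the batch remain. */
static bool pgcache_can_recycle(struct pgcache_model *pc)
{
    return atomic_load(&pc->page->refcount) == 1 + pc->ref_count;
}
```

In this model, 256 consecutive `pgcache_get_ref()` calls cost a single atomic addition, and comparing the two counters tells the owner whether anyone else still holds the page.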

net: thunderx: Support for XDP header adjustment · e3d06ff9

Authored by Sunil Goutham on May 02, 2017

When in XDP mode, reserve XDP_PACKET_HEADROOM bytes at the start
of the receive buffer so that the XDP program can modify headers and adjust
the packet start. Additional code changes were made to handle such packets.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: Add support for XDP_TX · 16f2bccd

Authored by Sunil Goutham on May 02, 2017

Adds support for XDP_TX, i.e. transmits the packet out of
the XDP TX queue mapped to the corresponding Rx queue
on which the packet is received.

Since the SQ for XDP TX will be used only on a single cpu, i.e.
SQ description creation and freeing, using an atomic free count
is not necessary and will become a bottleneck. Hence added
a separate 'xdp_free_cnt' used for SQs designated for XDP
to track the descriptor free count.

Changes also include
- A new entry 'xdp_page' is added to save the transmitted packet's
  page pointer for later cleanup.
- The XDP Tx SQ's doorbell is rung once per NAPI instance.
- Retrieving the designated SQ for packets being sent out by the stack
  via 'nicvf_xmit'.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: Add support for XDP_DROP · c56d91ce

Authored by Sunil Goutham on May 02, 2017

Adds support for XDP_DROP.
Also since in XDP mode there is just a single buffer per page,
made changes to recycle DMA mapping info as well along with pages.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: Add basic XDP support · 05c773f5

Authored by Sunil Goutham on May 02, 2017

Adds basic XDP support, i.e. attaching a BPF program to an
interface. Also takes care of allocating separate Tx queues
for the XDP path and for network stack packet transmission.

This patch doesn't support handling of any of the XDP actions;
all are treated as XDP_PASS, i.e. packets will be handed over to
the network stack.

Changes also involve allocating one receive buffer per page in XDP
mode and multiple in normal mode, i.e. when no BPF program is attached.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: Cleanup receive buffer allocation · 927987f3

Authored by Sunil Goutham on May 02, 2017

Get rid of unnecessary double pointer references and type casting
in receive buffer allocation code.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: Optimize CQE_TX handling · 0dada88b

Authored by Sunil Goutham on May 02, 2017

Optimized CQE handling with the changes below
- Freeing descriptors back to the SQ in bulk, i.e. once per NAPI
  instance instead of for every CQE_TX; this will reduce the number
  of atomic updates to 'sq->free_cnt'.
- Checking errors in CQE_TX and CQE_RX before calling the appropriate
  fn()s to update error stats, i.e. reduce branching.

Also removed debug messages in the packet handling path, which otherwise
cause issues if DEBUG is enabled.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: Optimize RBDR descriptor handling · 5e848e4c

Authored by Sunil Goutham on May 02, 2017

The receive buffer's physical address or iova will never
go beyond 49 bits, since that is the max supported HW address.
As per perf, updating bitfields, i.e. buf_addr:42 in the RBDR
descriptor entry, consumes lots of cpu cycles, hence changed
it to a 64-bit field with alignment requirements taken care of.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: thunderx: Support for page recycling · 5836b442

Authored by Sunil Goutham on May 02, 2017

Adds support for page recycling when allocating receive buffers
to reduce the cost of refilling the RBDR ring. Also got rid of using
compound pages when the page size is 4K; only order-0 pages are used now.

Only the page is recycled; DMA mappings still need to be done for
every receive buffer allocated due to the following constraints
- Cannot have just one receive buffer per 64KB page.
- There is just one buffer ring shared across 8 Rx queues, so
  buffers of the same page can go to any Rx queue.
- HW gives the buffer address where the packet has been DMA'ed and not
  the index into the buffer ring.
This makes it impossible to reuse DMA mapping info. So unfortunately
we have to go through the costly mapping route for every buffer.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openanolis / cloud-kernel synced successfully almost 2 years ago
