Commits · d2582a03939ed0a80ffcd3ea5345505bc8067c54 · openeuler / Kernel

30 Oct, 2016 36 commits

net/mlx4_en: Fix potential deadlock in port statistics flow · d2582a03

Committed by Jack Morgenstein on Oct 27, 2016

mlx4_en_DUMP_ETH_STATS took the *counter mutex* and then
called the FW command, with WRAPPED attribute. As a result, the fw command
is wrapped on the Hypervisor when it calls mlx4_en_DUMP_ETH_STATS.
The FW command wrapper flow on the hypervisor takes the *slave_cmd_mutex*
during processing.

At the same time, a VF could be in the process of coming up, and could
call mlx4_QUERY_FUNC_CAP.  On the hypervisor, the command flow takes the
*slave_cmd_mutex*, then executes mlx4_QUERY_FUNC_CAP_wrapper.
mlx4_QUERY_FUNC_CAP wrapper calls mlx4_get_default_counter_index(),
which takes the *counter mutex*. DEADLOCK.

The fix is that the DUMP_ETH_STATS fw command should be called with
the NATIVE attribute, so that on the hypervisor, this command does not
enter the wrapper flow.
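
The lock-order inversion described above can be modelled in a few lines of user-space C. This is a minimal sketch, not driver code: the lock_id enum and both helper functions are hypothetical names introduced only for illustration. Each code path is written down as the sequence of locks it acquires, and an inversion is reported when two paths take the same pair of locks in opposite orders.

```c
#include <stddef.h>

/* Lock identifiers; the names follow the commit message, the enum is illustrative. */
enum lock_id { COUNTER_MUTEX, SLAVE_CMD_MUTEX };

/* Returns 1 if `path` acquires `first` at some point before `second`. */
static int takes_before(const enum lock_id *path, size_t n,
                        enum lock_id first, enum lock_id second)
{
    size_t i, j;

    for (i = 0; i < n; i++) {
        if (path[i] != first)
            continue;
        for (j = i + 1; j < n; j++)
            if (path[j] == second)
                return 1;
    }
    return 0;
}

/* Returns 1 if the two paths take locks a and b in opposite orders (AB-BA). */
static int has_order_inversion(const enum lock_id *p1, size_t n1,
                               const enum lock_id *p2, size_t n2,
                               enum lock_id a, enum lock_id b)
{
    return (takes_before(p1, n1, a, b) && takes_before(p2, n2, b, a)) ||
           (takes_before(p1, n1, b, a) && takes_before(p2, n2, a, b));
}
```

With the WRAPPED attribute the statistics path is { COUNTER_MUTEX, SLAVE_CMD_MUTEX } and the VF path is { SLAVE_CMD_MUTEX, COUNTER_MUTEX }, so the check reports an inversion. With the NATIVE attribute the statistics path only takes COUNTER_MUTEX and the inversion disappears.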

Since the Hypervisor no longer goes through the wrapper code, we also
simply return 0 in mlx4_DUMP_ETH_STATS_wrapper (i.e., the function succeeds,
but the returned data will be all zeroes).
No need to test if it is the Hypervisor going through the wrapper.

Fixes: f9baff50 ("mlx4_core: Add "native" argument to mlx4_cmd ...")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4: Fix firmware command timeout during interrupt test · 6f2e0d2c

Committed by Eugenia Emantayev on Oct 27, 2016

Currently, the interrupt test that is part of the ethtool selftest runs the
check over all interrupt vectors of the device.
In the mlx4_en package, some of the interrupt vectors are uninitialized since
mlx4_ib doesn't exist. This causes the NOP FW command to time out.
Change the logic to test only the interrupt vectors of the current port.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Do not access comm channel if it has not yet been initialized · 81d18419

Committed by Jack Morgenstein on Oct 27, 2016

In the Hypervisor, there are several FW commands which are invoked
before the comm channel is initialized (in mlx4_multi_func_init).
These include MOD_STAT_CONFIG, QUERY_DEV_CAP, INIT_HCA, and others.

If any of these commands fails, say with a timeout, the Hypervisor
driver enters the internal error reset flow. In this flow, the driver
attempts to notify all slaves via the comm channel that an internal error
has occurred.

Since the comm channel has not yet been initialized (i.e., mapped via
ioremap), this will cause dereferencing a NULL pointer.

To fix this, do not access the comm channel in the internal error flow
if it has not yet been initialized.

Fixes: 55ad3592 ("net/mlx4_core: Enable device recovery flow with SRIOV")
Fixes: ab9c17a0 ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Fix panic during reboot · 9d2afba0

Committed by Eugenia Emantayev on Oct 27, 2016

Fix a kernel panic that occurs as a result of an asynchronous event
handled in roce_gid_mgmt:
mlx4_en_get_drvinfo is called and accesses freed resources.

This happens in a shutdown flow only, since the PCI device is destroyed
while the netdevice is still alive.

Fixes: c27a02cd ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Process all completions in RX rings after port goes up · 8d59de8f

Committed by Erez Shitrit on Oct 27, 2016

Currently there is a race between incoming traffic and
initialization flow. HW is able to receive the packets
after INIT_PORT is done and unicast steering is configured.
Before we set priv->port_up, NAPI is not scheduled and
receive queues become full. Therefore we never get
new interrupts about the completions.
This issue can occur when heavy traffic is running while
the port is being brought up.
The resolution is to schedule NAPI once port_up is set.
If receive queues were full, this will process all CQEs
and release them.

Fixes: c27a02cd ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_en: Resolve dividing by zero in 32-bit system · 4850cf45

Committed by Eugenia Emantayev on Oct 27, 2016

When doing roundup_pow_of_two for a large enough number with
bit 31 set, an overflow will occur and a value equal to 1 will
be returned. In this case, 1 will be subtracted from the return
value and a division by zero will be reached.
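
The overflow can be reproduced with a small user-space model. This is a sketch under stated assumptions, not the driver's code: fls32, roundup_pow2_wrapping32 and roundup_pow2_64 are hypothetical helpers, and the 32-bit variant assumes that a shift count of 32 wraps to 0, which is how the commit's "value equal to 1" can arise on some 32-bit CPUs (in C itself such a shift is undefined, so the wrap is modelled explicitly with a mask).

```c
#include <stdint.h>

/* 1-based position of the highest set bit; 0 when x == 0. */
static unsigned int fls32(uint32_t x)
{
    unsigned int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/*
 * Models a 32-bit round-up to a power of two where the shift count
 * wraps modulo 32 (assumption). For n > 2^31 the result is 1.
 */
static uint32_t roundup_pow2_wrapping32(uint32_t n)
{
    unsigned int shift = fls32(n - 1);

    return (uint32_t)1 << (shift & 31);
}

/* Same computation in 64 bits; valid for 1 <= n <= 2^63. */
static uint64_t roundup_pow2_64(uint64_t n)
{
    unsigned int shift = 0;
    uint64_t v = n - 1;

    while (v) {
        shift++;
        v >>= 1;
    }
    return (uint64_t)1 << shift;
}
```

For an input such as 0x80000001 the 32-bit model returns 1, so subtracting 1 yields a zero divisor, while the 64-bit variant returns 0x100000000.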

Fixes: 31c128b6 ("net/mlx4_en: Choose time-stamping shift value according to HW frequency")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Change the default value of enable_qos · 72da2e91

Committed by Moshe Lazer on Oct 27, 2016

Change the default status of quality of service back to disabled,
as it hurts performance in some cases.

Fixes: 38438f7c ("net/mlx4: Set enhanced QoS support by default when ...")
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Avoid setting ports to auto when only one port type is supported · 33a1f8b1

Committed by Maor Gottlieb on Oct 27, 2016

When only one port type is supported, it should be read-only.
We reject change requests, even to the auto-sense mode.

Fixes: 27bf91d6 ("mlx4_core: Add link type autosensing")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec · aa0c08fe

Committed by Jack Morgenstein on Oct 27, 2016

The resource type enum in the resource tracker was incorrect.
RES_EQ was put in the position of RES_NPORT_ID (a FC resource).

Since the remaining resources maintain their current values,
and RES_EQ is not passed from slaves to the hypervisor in any
FW command, this change affects only the hypervisor.
Therefore, there is no backwards-compatibility issue.

Fixes: 623ed84b ("mlx4_core: initial header-file changes for SRIOV support")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rds: debug messages are enabled by default · ff57087f

Committed by shamir rabinovitch on Oct 27, 2016

rds uses a Kconfig option called "RDS_DEBUG" to enable rds debug messages.
This option causes the rds Makefile to add -DDEBUG to the rds gcc command
line.

When CONFIG_DYNAMIC_DEBUG is enabled, the "DEBUG" macro is used by
include/linux/dynamic_debug.h to decide if dynamic debug prints should
be sent by default to the kernel log.

rds should not enable this macro for production builds. rds dynamic
debug works as expected following this fix.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-for-davem-2016-10-27' of... · 880b583c

Committed by David S. Miller on Oct 29, 2016

Merge tag 'mac80211-for-davem-2016-10-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Just two fixes:
 * a fix to process all events while suspending, so any
   potential calls into the driver are done before it is
   suspended
 * small markup fixes for the sphinx documentation conversion
   that's coming into the tree via the doc tree
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context · 8d7533e5

Committed by Thomas Falcon on Oct 26, 2016

Schedule these XPORT event tasks in the shared workqueue
so that IRQs are not freed in an interrupt context when
sub-CRQs are released.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mv643xx_eth: Fetch the phy connection type from DT · fd33b244

Committed by Jason Gunthorpe on Oct 26, 2016

The MAC is capable of RGMII mode and that is probably a more typical
connection type than GMII today (e.g., it is used by Marvell Reference
designs for several SOCs). Let DT users specify the standard

   phy-connection-type = "rgmii-id";

on a phy node.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'batadv-net-for-davem-20161026' of git://git.open-mesh.org/linux-merge · ad601339

Committed by David S. Miller on Oct 29, 2016

Simon Wunderlich says:

====================
Here are three batman-adv bugfix patches:

 - Fix RCU usage for neighbor list, by Sven Eckelmann

 - Fix BATADV_DBG_ALL loglevel to include TP Meter messages, by Sven Eckelmann

 - Fix possible splat when disabling an interface, by Linus Luessing
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "hv_netvsc: report vmbus name in ethtool" · e934f684

Committed by Stephen Hemminger on Oct 26, 2016

This reverts commit e3f74b84
("hv_netvsc: report vmbus name in ethtool")
because of a problem introduced by commit f9a56e5d6a0ba
("Drivers: hv: make VMBus bus ids persistent").
That commit changed the format of the vmbus name, and the new format is too
long to fit in the bus_info field of ethtool.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

packet: on direct_xmit, limit tso and csum to supported devices · 104ba78c

Committed by Willem de Bruijn on Oct 26, 2016

When transmitting on a packet socket with PACKET_VNET_HDR and
PACKET_QDISC_BYPASS, validate device support for features requested
in vnet_hdr.

Drop TSO packets sent to devices that do not support TSO or have the
feature disabled. Note that the latter currently do process those
packets correctly, regardless of not advertising the feature.

Because of SKB_GSO_DODGY, it is not sufficient to test device features
with netif_needs_gso. Full validate_xmit_skb is needed.

Switch to software checksum for non-TSO packets that request checksum
offload if that device feature is unsupported or disabled. Note that
similar to the TSO case, device drivers may perform checksum offload
correctly even when not advertising it.

When switching to software checksum, packets hit skb_checksum_help,
which has two BUG_ON checks that fire if the checksum is not in the linear
segment. Packet sockets always allocate at least up to
csum_start + csum_off + 2 as linear.

Tested by running github.com/wdebruij/kerneltools/psock_txring_vnet.c

  ethtool -K eth0 tso off tx on
  psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v
  psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v -N

  ethtool -K eth0 tx off
  psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G
  psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G -N

v2:
  - add EXPORT_SYMBOL_GPL(validate_xmit_skb_list)

Fixes: d346a3fa ("packet: introduce PACKET_QDISC_BYPASS socket option")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched actions: use nla_parse_nested() · 4700e9ce

Committed by Johannes Berg on Oct 26, 2016

Use nla_parse_nested instead of open-coding the call to
nla_parse() with the attribute data/len.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Fix error handling in alloc_uld_rxqs(). · 166e6045

Committed by Ganesh Goudar on Oct 26, 2016

Fix to release resources properly in the error handling path of
alloc_uld_rxqs(). This patch also removes unwanted arguments
and avoids calling the same function twice.

Fixes: 94cdb8bb (cxgb4: Add support for dynamic allocation of resources for ULD)
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IB/mlx4: avoid a -Wmaybe-uninitialize warning · a4256bc9

Committed by Arnd Bergmann on Oct 25, 2016

There is an old warning about mlx4_SW2HW_EQ_wrapper on x86:

ethernet/mellanox/mlx4/resource_tracker.c: In function ‘mlx4_SW2HW_EQ_wrapper’:
ethernet/mellanox/mlx4/resource_tracker.c:3071:10: error: ‘eq’ may be used uninitialized in this function [-Werror=maybe-uninitialized]

The problem here is that gcc won't track the state of the variable
across a spin_unlock. Moving the assignment out of the lock is
safe here and avoids the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit() · ae148b08

Committed by Eli Cooper on Oct 26, 2016

This patch updates skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit() when an
IPv6 header is installed to a socket buffer.

This is not a cosmetic change.  Without updating this value, GSO packets
transmitted through an ipip6 tunnel have the protocol of ETH_P_IP and
skb_mac_gso_segment() will attempt to call gso_segment() for IPv4,
which results in the packets being dropped.

Fixes: b8921ca8 ("ip4ip6: Support for GSO/GRO")
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: fix samples to add fake KBUILD_MODNAME · 96a8eb1e

Committed by Daniel Borkmann on Oct 26, 2016

Some of the sample files are causing issues when they are loaded with tc
and cls_bpf, meaning tc bails out while trying to parse the resulting ELF
file as program/map/etc sections are not present, which can be easily
spotted with readelf(1).

Currently, BPF samples are including some of the kernel headers and mid
term we should change them to refrain from this, really. When dynamic
debugging is enabled, we bail out due to undeclared KBUILD_MODNAME, which
is easily overlooked in the build as clang spills this along with other
noisy warnings from various header includes, and llc still generates an
ELF file with mentioned characteristics. For just playing around with BPF
examples, this can be a bit of a hurdle to take.

Just add a fake KBUILD_MODNAME as a band-aid to fix the issue, same is
done in xdp*_kern samples already.

Fixes: 65d472fb ("samples/bpf: add 'pointer to packet' tests")
Fixes: 6afb1e28 ("samples/bpf: Add tunnel set/get tests.")
Fixes: a3f74617 ("cgroup: bpf: Add an example to do cgroup checking in BPF")
Reported-by: Chandrasekar Kannan <ckannan@console.to>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: Fix missing return value in inet6_hash · e4cabca5

Committed by Craig Gallek on Oct 25, 2016

As part of a series to implement faster SO_REUSEPORT lookups,
commit 086c653f ("sock: struct proto hash function may error")
added return values to protocol hash functions and
commit 496611d7 ("inet: create IPv6-equivalent inet_hash function")
implemented a new hash function for IPv6.  However, the latter does
not respect the former's convention.

This properly propagates the hash errors in the IPv6 case.

Fixes: 496611d7 ("inet: create IPv6-equivalent inet_hash function")
Reported-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlx5-fixes' · 58a86c45

Committed by David S. Miller on Oct 29, 2016

Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes 2016-10-25

This series contains some bug fixes for the mlx5 core and mlx5e driver.

From Daniel:
    - Cache line size determination at runtime, instead of using
      L1_CACHE_BYTES hard coded value, use cache_line_size()
    - Always Query HCA caps after setting them even on reset flow

From Mohamad:
    - Reorder netdev cleanup to unregister netdev before detaching it
      for the kernel to not complain about open resources such as vlans
    - Change the acl enable prototype to return status, for better error
      resiliency
    - Clear health sick bit when starting health poll after reset flow
    - Fix race between PCI error handlers and health work
    - PCI error recovery health care simulation, in case when the kernel
      PCI error handlers are not triggered for some internal firmware errors

From Noa:
    - Avoid passing dma address 0 to firmware when mapping system pages
      to the firmware

From Paul: Some straightforward flow steering fixes
    - Keep autogroups list ordered
    - Fix autogroups groups num not decreasing
    - Correctly initialize last use of flow counters

From Saeed:
    - Choose the nearest LRO timeout to the wanted one
      instead of blindly choosing "dev_cap.lro_timeout[2]"

This series has no conflict with the for-next pull request posted
earlier today ("Mellanox mlx5 core driver updates 2016-10-25").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Avoid passing dma address 0 to firmware · 6b276190

Committed by Noa Osherovich on Oct 25, 2016

Currently the firmware can't work with a page with dma address 0.
Passing such an address to the firmware will cause the give_pages
command to fail.

To avoid this, in case we get a 0 dma address of a page from the
dma engine, we avoid passing it to FW by remapping to get an address
other than 0.

Fixes: bf0bf77f ('mlx5: Support communicating arbitrary host...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: PCI error recovery health care simulation · 04c0c1ab

Committed by Mohamad Haj Yahia on Oct 25, 2016

If the kernel PCI error handlers are not called, we will
trigger our own recovery flow.

The health work gives priority to the kernel PCI error handlers to
recover the PCI device by waiting for a short period. If the PCI error
handlers are not triggered, the manual recovery flow is executed.

We don't save pci state in case of manual recovery because it will ruin the
pci configuration space and we will lose dma sync.

Fixes: 89d44f0a ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Fix race between PCI error handlers and health work · 05ac2c0b

Committed by Mohamad Haj Yahia on Oct 25, 2016

Currently there is a race between the health care work and the kernel
PCI error handlers because both of them detect the error; the first one
to be called will do the error handling.
There is a chance that health care will disable the PCI device after the
PCI slot has been resumed.
Also create a separate WQ because now we will have two types of health
works, one for the error detection and one for the recovery.

Fixes: 89d44f0a ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Clear health sick bit when starting health poll · 2241007b

Committed by Mohamad Haj Yahia on Oct 25, 2016

The health sick status should be cleared when we start the health poll.
This is crucial for driver reload (unload + load) in order to behave
correctly in case of a health issue.

Fixes: fd76ee4d ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Change the acl enable prototype to return status · 247f139c

Committed by Mohamad Haj Yahia on Oct 25, 2016

The Ingress/Egress ACL enable function may fail and it should return
status to its caller to avoid NULL pointer dereference.

Fixes: f942380c ('net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Unregister netdev before detaching it · 5e1e93c7

Committed by Mohamad Haj Yahia on Oct 25, 2016

Detaching the netdev before unregistering it causes some netdev cleanup
ndos to fail because they check for the presence of the netdev, so we need to
unregister the netdev first.

Fixes: 26e59d80 ('net/mlx5e: Implement mlx5e interface attach/detach callbacks')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5e: Choose best nearest LRO timeout · 2b029556

Committed by Saeed Mahameed on Oct 25, 2016

Instead of predicting the index of the wanted LRO timeout value from
hardware capabilities, look for the nearest LRO timeout value.
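
A nearest-value lookup of this kind could look roughly as follows. This is a minimal sketch, assuming the supported timeouts are available as a plain array. The function name choose_nearest_timeout and its signature are hypothetical and do not come from the driver, whose actual selection rule may differ (for instance in how ties are resolved).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the entry of `supported` closest to `wanted`.
 * Requires n >= 1. On a tie, the earlier entry wins.
 */
static uint32_t choose_nearest_timeout(const uint32_t *supported, size_t n,
                                       uint32_t wanted)
{
    uint32_t best = supported[0];
    uint32_t best_diff = best > wanted ? best - wanted : wanted - best;
    size_t i;

    for (i = 1; i < n; i++) {
        uint32_t cur = supported[i];
        uint32_t diff = cur > wanted ? cur - wanted : wanted - cur;

        if (diff < best_diff) {
            best = cur;
            best_diff = diff;
        }
    }
    return best;
}
```

Compared with always reading a fixed index, this keeps working when a device reports a different set of supported timeouts.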

Fixes: 5c50368f ('net/mlx5e: Light-weight netdev open/stop')
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Correctly initialize last use of flow counters · e83d6955

Committed by Paul Blakey on Oct 25, 2016

Currently, the last use timestamp is initialized to zero.
This is not the value expected by higher layers, such as
when we do TC action offloading. To fix that, set it to
the current time, e.g., when the counter/rule is offloaded.
This matches the behaviour of non-offloaded TC actions.

Fixes: 43a335e0 ('mlx5_core: Flow counters infrastructure')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Fix autogroups groups num not decreasing · 32dba76a

Committed by Paul Blakey on Oct 25, 2016

Autogroups groups num is increased when creating a new flow group,
but is never decreased.

Decrease it when deleting a flow group.

Fixes: f0d22d18 ('net/mlx5_core: Introduce flow steering autogrouped flow table')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Keep autogroups list ordered · eccec8da

Committed by Paul Blakey on Oct 25, 2016

Finding a new autogroup range is done by going over a group list
sorted by each group's start index. The search is stopped after finding
the first free range. The newly created group is wrongly added to the
end of the list regardless of its start index, because the parameter
saying where to insert it is ignored.

This commit makes sure to use that unused parameter to insert
the group where requested.
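
Keeping such a list ordered on insert can be sketched as below. This is an illustrative user-space model only: struct flow_group and insert_group_sorted are hypothetical names, and an array stands in for the kernel's linked list. The caller is assumed to provide room for one more element.

```c
#include <stddef.h>

/* Illustrative stand-in for a flow group; only the fields needed here. */
struct flow_group {
    unsigned int start_index;
    unsigned int size;
};

/*
 * Inserts `g` into `groups` (sorted by start_index, `count` valid entries)
 * at the position that keeps the array sorted. Returns the new count.
 * The array must have capacity for at least count + 1 entries.
 */
static size_t insert_group_sorted(struct flow_group *groups, size_t count,
                                  struct flow_group g)
{
    size_t pos = 0;
    size_t i;

    /* Find the first entry that starts after the new group. */
    while (pos < count && groups[pos].start_index <= g.start_index)
        pos++;

    /* Shift the tail up by one slot to make room. */
    for (i = count; i > pos; i--)
        groups[i] = groups[i - 1];

    groups[pos] = g;
    return count + 1;
}
```

Appending to the end instead would break the ordering that the first-free-range search relies on.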

Fixes: f0d22d18 ('net/mlx5_core: Introduce flow steering autogrouped flow table')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Always Query HCA caps after setting them · bba1574c

Committed by Daniel Jurgens on Oct 25, 2016

Always query the HCA caps after setting them to update the capabilities
data structures. Not doing so results in incorrect capabilities being
reported including max_dc, max_qp and several others.

Fixes: 59211bd3 ("net/mlx5: Split the load/unload flow into hardware and software flows")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

{net, ib}/mlx5: Make cache line size determination at runtime. · b47bd6ea

Committed by Daniel Jurgens on Oct 25, 2016

ARM 64B cache line systems have L1_CACHE_BYTES set to 128.
cache_line_size() will return the correct size.

Fixes: cf50b5efa2fe ('net/mlx5_core/ib: New device capabilities handling.')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: validate chunk len before actually using it · bf911e98

Committed by Marcelo Ricardo Leitner on Oct 25, 2016

Andrey Konovalov reported that KASAN detected that SCTP was using a slab
beyond the boundaries. It was caused because when handling out of the
blue packets in function sctp_sf_ootb() it was checking the chunk len
only after already processing the first chunk, validating only for the
2nd and subsequent ones.

The fix is to just move the check upwards so it's also validated for the
1st chunk.
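
The validate-before-use pattern can be sketched in user-space C. This is a simplified model, not the kernel's sctp_sf_ootb(): count_valid_chunks is a hypothetical function, and it relies only on the standard SCTP chunk layout (1-byte type, 1-byte flags, 2-byte big-endian length that includes the 4-byte header, chunks padded to a multiple of 4 bytes). The length check runs at the top of the loop, so it covers the first chunk as well as every later one.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_HDR_LEN 4u

/*
 * Walks the chunks in `buf` and returns how many were seen, or -1 if any
 * chunk (including the first) has a length that is too small or runs past
 * the end of the buffer.
 */
static int count_valid_chunks(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        size_t remaining = len - off;
        size_t chunk_len, padded;

        /* Validate before touching any chunk contents. */
        if (remaining < CHUNK_HDR_LEN)
            return -1;
        chunk_len = ((size_t)buf[off + 2] << 8) | buf[off + 3];
        if (chunk_len < CHUNK_HDR_LEN || chunk_len > remaining)
            return -1;

        /* A real handler would process the chunk here. */
        count++;

        /* Advance to the next chunk, honouring 4-byte padding. */
        padded = (chunk_len + 3u) & ~(size_t)3u;
        off += padded < remaining ? padded : remaining;
    }
    return count;
}
```
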
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Oct, 2016 3 commits

Merge branch 'mlxsw-fixes' · c2e169be

Committed by David S. Miller on Oct 28, 2016

Jiri Pirko says:

====================
mlxsw: Couple of fixes

Couple of LPM tree management fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Compare only trees which are in use during tree get · 8b99becd

Committed by Jiri Pirko on Oct 25, 2016

Only trees which are in use should be compared to requested prefix usage.

Fixes: 53342023 ("mlxsw: spectrum_router: Implement LPM trees management")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Save requested prefix bitlist when creating tree · 2083d367

Committed by Jiri Pirko on Oct 25, 2016

Currently, the prefix bitlist is not saved for LPM trees, causing the
compare to always fail which causes the tree to be destroyed and created
for every inserted and removed FIB entry. So fix this by saving
the bitlist as it should have been done from the very beginning.

Fixes: 53342023 ("mlxsw: spectrum_router: Implement LPM trees management")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Oct, 2016 1 commit

net sched filters: fix notification of filter delete with proper handle · 9ee78374

Committed by Jamal Hadi Salim on Oct 24, 2016

Daniel says:

While trying out [1][2], I noticed that tc monitor doesn't show the
correct handle on delete:

$ tc monitor
qdisc clsact ffff: dev eno1 parent ffff:fff1
filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...]
deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80

Some context to explain the above:
The user identity of any tc filter is represented by a 32-bit
identifier encoded in tcm->tcm_handle, for example 0x2a in the bpf filter
above. A user wishing to delete, get or even modify a specific filter
uses this handle to reference it.
Every classifier is free to provide its own semantics for the 32-bit handle.
For example, classifiers like u32 use schemes like 800:1:801 to describe
the semantics of their filters represented as hash table, bucket and
node ids etc.
Classifiers also have internal per-filter representation which is different
from this externally visible identity. Most classifiers set this
internal representation to be a pointer address (which allows fast retrieval
of said filters in their implementations). This internal representation
is referenced with the "fh" variable in the kernel control code.

When a user successfully deletes a specific filter, by specifying the correct
tcm->tcm_handle, an event is generated to user space which indicates
which specific filter was deleted.

Before this patch, the "fh" value was sent to user space as the identity.
As an example, what is shown in the sample bpf filter delete event above
is 0xf3be0c80. This is in fact a 32-bit truncation of 0xffff8807f3be0c80,
which happens to be a 64-bit memory address of the internal filter
representation (address of the corresponding filter's struct cls_bpf_prog).
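
The truncation is easy to check with the two values quoted above. The helper below is a hypothetical one-liner written for illustration. It only shows what happens when a 64-bit value is stored in a 32-bit field and is not taken from the kernel sources.

```c
#include <stdint.h>

/* Keeps the low 32 bits of a 64-bit value, as a 32-bit handle field would. */
static uint32_t truncate_to_u32(uint64_t value)
{
    return (uint32_t)value;
}
```

Passing 0xffff8807f3be0c80 yields 0xf3be0c80, the handle shown in the delete event, which bears no relation to the user-visible handle 0x2a.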

After this patch the appropriate user identifiable handle as encoded
in the originating request tcm->tcm_handle is generated in the event.
One of the cardinal rules of netlink is to be able to take an
event (such as a delete in this case) and reflect it back to the
kernel and successfully delete the filter. This patch achieves that.

Note, this issue has existed since the original TC action
infrastructure code patch back in 2004 as found in:
https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/

[1] http://patchwork.ozlabs.org/patch/682828/
[2] http://patchwork.ozlabs.org/patch/682829/

Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openeuler / Kernel · synced successfully more than a year ago
