提交 · 8e8dddba72d81b78742b002bebc3cce1b23d3f84 · openeuler / raspberrypi-kernel

22 6月, 2017 2 次提交

qed: Cleanup qed_roce before duplicating it · 8e8dddba

由 Kalderon, Michal 提交于 6月 21, 2017

The next patch in the series will duplicate qed_roce as part
of code preprations for iWARP support. Do some cleanup before
duplicating
Signed-off-by: NMichal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: NYuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8e8dddba

bpf: expose prog id for cls_bpf and act_bpf · e8628307

由 Daniel Borkmann 提交于 6月 21, 2017

In order to be able to retrieve the attached programs from cls_bpf
and act_bpf, we need to expose the prog ids via netlink so that
an application can later on get an fd based on the id through the
BPF_PROG_GET_FD_BY_ID command, and dump related prog info via
BPF_OBJ_GET_INFO_BY_FD command for bpf(2).
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e8628307

21 6月, 2017 38 次提交

sock: avoid dirtying incoming_cpu if not needed · 34cfb542

由 Paolo Abeni 提交于 6月 21, 2017

for connected socket, the incoming_cpu field in the sock struct
is not going to change frequently, but we are setting it
unconditionally for each packet.

Since sk_incoming_cpu and sk_flags share the same cacheline,
and the latter is access by udp_recvmsg(), this cause a cache
miss for each packet for UDP connected socket.

With this patch, we set the incoming cpu field only when the
ingress cpu really changes.

This gives a small but measurable performance improvement for
connected UDP socket.
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

34cfb542

net: introduce SO_PEERGROUPS getsockopt · 28b5ba2a

由 David Herrmann 提交于 6月 21, 2017

This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to
retrieve the auxiliary groups of the remote peer. It is designed to
naturally extend SO_PEERCRED. That is, the underlying data is from the
same credentials. Regarding its syntax, it is based on SO_PEERSEC. That
is, if the provided buffer is too small, ERANGE is returned and @optlen
is updated. Otherwise, the information is copied, @optlen is set to the
actual size, and 0 is returned.

While SO_PEERCRED (and thus `struct ucred') already returns the primary
group, it lacks the auxiliary group vector. However, nearly all access
controls (including kernel side VFS and SYSVIPC, but also user-space
polkit, DBus, ...) consider the entire set of groups, rather than just
the primary group. But this is currently not possible with pure
SO_PEERCRED. Instead, user-space has to work around this and query the
system database for the auxiliary groups of a UID retrieved via
SO_PEERCRED.

Unfortunately, there is no race-free way to query the auxiliary groups
of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space
solution is to use getgrouplist(3p), which itself falls back to NSS and
whatever is configured in nsswitch.conf(3). This effectively checks
which groups we *would* assign to the user if it logged in *now*. On
normal systems it is as easy as reading /etc/group, but with NSS it can
resort to quering network databases (eg., LDAP), using IPC or network
communication.

Long story short: Whenever we want to use auxiliary groups for access
checks on IPC, we need further IPC to talk to the user/group databases,
rather than just relying on SO_PEERCRED and the incoming socket. This
is unfortunate, and might even result in dead-locks if the database
query uses the same IPC as the original request.

So far, those recursions / dead-locks have been avoided by using
primitive IPC for all crucial NSS modules. However, we want to avoid
re-inventing the wheel for each NSS module that might be involved in
user/group queries. Hence, we would preferably make DBus (and other IPC
that supports access-management based on groups) work without resorting
to the user/group database. This new SO_PEERGROUPS ioctl would allow us
to make dbus-daemon work without ever calling into NSS.

Cc: Michal Sekletar <msekleta@redhat.com>
Cc: Simon McVittie <simon.mcvittie@collabora.co.uk>
Reviewed-by: NTom Gundersen <teg@jklm.no>
Signed-off-by: NDavid Herrmann <dh.herrmann@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

28b5ba2a

udp: prefetch rmem_alloc in udp_queue_rcv_skb() · dd99e425

由 Paolo Abeni 提交于 6月 21, 2017

On UDP packets processing, if the BH is the bottle-neck, it
always sees a cache miss while updating rmem_alloc; try to
avoid it prefetching the value as soon as we have the socket
available.

Performances under flood with multiple NIC rx queues used are
unaffected, but when a single NIC rx queue is in use, this
gives ~10% performance improvement.
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dd99e425

qede: Fix compilation without QED_RDMA · da2e9cf0

由 Chad Dupuis 提交于 6月 21, 2017

When CONFIG_QED_RDMA isn't defined, we'd hit the following:

 /include/linux/qed/qede_rdma.h:84:19:
 warning: ‘qede_rdma_dev_add’ used but never defined [enabled by default]
 static inline int qede_rdma_dev_add(struct qede_dev *dev);

Fixes: bbfcd1e8 ("qed*: Set rdma generic functions prefix")
Signed-off-by: NChad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: NYuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da2e9cf0

r8152: correct the definition · b65c0c9b

由 hayeswang 提交于 6月 21, 2017

Replace VLAN_HLEN and CRC_SIZE with ETH_FCS_LEN.
Signed-off-by: NHayes Wang <hayeswang@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b65c0c9b

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · adff41f9

由 David S. Miller 提交于 6月 21, 2017

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-06-20

This series contains updates to i40e and i40evf only.

Björn adds additional XDP support for i40e, by adding pass and drop actions
and XDP_TX action support.

Jake fixes a possible NULL pointer dereference in
i40evf_get_ethtool_stats() which could occur if the VF fails to recover
from a reset, and then a user requests statistics.  Changed the use of
dev_info() to dev_dbg() for vf_capability client routine so that the
standard log is not spammed with this information which "might" cause
administrators to worry.  Also added more code comments to help explain
why udp_port has be in host byte order and to avoid future changes which
may cause this to break.  Fixed the holding of the RTNL lock for the
entire reset routine, reduced the scope so that the reset function will
handle its own lock, so that we do not have to wrap every reference
to i40e_do_reset() with RTNL lock/unlock.

Alice updates flags related to firmware interactions for WoL and admin
queue command address with the correct value.

Sudheer makes a fix to ensure that the array is not accessed past the
size of the array.

Greg fixes the parsing of firmware 4.33 admin queue commmand "Get CEE
DCBX PER CFG" because the firmware now creates the oper_prio_tc nibbles
reversed from those in the CDD Priority Group sub-TLV.

Carolyn adds a check and message to let users know that when in MFP mode,
changing RSS hash input set is not supported.

Shannon makes the partition bandwidth control more generic since it is not
in just one form of multi-function partitioning (MFP).  Also fixes a bug
which was causing the firmware confusion in some reset sequences, when
we were disabling interrupts and we were clearing the whole register.
Instead we should only be clearing the CAUSE_ENA bit when disabling
interrupts.

Filip adds support for OEM firmware version, so that if a OEM specific
adapter is detected, ethtool reports the OEM product version in the
firmware version string instead of etrack id.

Alan fixes a bug where the driver was not correctly exiting overflow
promiscuous mode, which can happen if "too many" MAC filters are added,
putting the driver into overflow promiscuous mode, and the filters are
then removed.  The bug occurs because the conditional for toggling
promiscuous mode was only be executed when enabled and not when it was
disabled.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

adff41f9

Merge branch 'ipmr-ip6mr-add-Netlink-notifications-on-cache-reports' · 2f89a795

由 David S. Miller 提交于 6月 21, 2017

Julien Gomes says:

====================
ipmr/ip6mr: add Netlink notifications on cache reports

Currently, all ipmr/ip6mr cache reports are sent through the
mroute/mroute6 socket only.
This forces the use of a single socket for mroute programming, cache
reports and, regarding ipmr, IGMP messages without Router Alert option
reception.

The present patches are aiming to send Netlink notifications in addition
to the existing igmpmsg/mrt6msg to give user programs a way to handle
cache reports in parallel with multiple sockets other than the
mroute/mroute6 socket.

Changes in v2:
- Changed attributes naming from {IPMRA,IP6MRA}_CACHEREPORTA_* to
  {IPMRA,IP6MRA}_CREPORT_*
- Improved packet data copy to handle non-linear packets in
  ipmr/ip6mr cache report Netlink notification creation
- Added two rtnetlink groups with restricted-binding
- Changed cache report notified groups from RTNL_{IPV4,IPV6}_MROUTE to
  the new restricted groups in ipmr/ip6mr

Changes in v3:
- Put message size calculation for {igmp,mrt6}msg_netlink_event in separate
  functions
- Increased vif id attributes size from u8 to u32
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2f89a795

ip6mr: add netlink notifications on mrt6msg cache reports · dd12d15c

由 Julien Gomes 提交于 6月 20, 2017

Add Netlink notifications on cache reports in ip6mr, in addition to the
existing mrt6msg sent to mroute6_sk.
Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV6_MROUTE_R.

MSGTYPE, MIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the
same data as their equivalent fields in the mrt6msg header.
PKT attribute is the packet sent to mroute6_sk, without the added
mrt6msg header.
Suggested-by: NRyan Halbrook <halbrook@arista.com>
Signed-off-by: NJulien Gomes <julien@arista.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dd12d15c

ipmr: add netlink notifications on igmpmsg cache reports · 5a645dd8

由 Julien Gomes 提交于 6月 20, 2017

Add Netlink notifications on cache reports in ipmr, in addition to the
existing igmpmsg sent to mroute_sk.
Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV4_MROUTE_R.

MSGTYPE, VIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the
same data as their equivalent fields in the igmpmsg header.
PKT attribute is the packet sent to mroute_sk, without the added igmpmsg
header.
Suggested-by: NRyan Halbrook <halbrook@arista.com>
Signed-off-by: NJulien Gomes <julien@arista.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5a645dd8

rtnetlink: add restricted rtnl groups for ipv4 and ipv6 mroute · 5f729eaa

由 Julien Gomes 提交于 6月 20, 2017

Add RTNLGRP_{IPV4,IPV6}_MROUTE_R as two new restricted groups for the
NETLINK_ROUTE family.
Binding to these groups specifically requires CAP_NET_ADMIN to allow
multicast of sensitive messages (e.g. mroute cache reports).
Suggested-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NJulien Gomes <julien@arista.com>
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f729eaa

rtnetlink: add NEWCACHEREPORT message type · 94df30a6

由 Julien Gomes 提交于 6月 20, 2017

New NEWCACHEREPORT message type to be used for cache reports sent
via Netlink, effectively allowing splitting cache report reception from
mroute programming.
Suggested-by: NRyan Halbrook <halbrook@arista.com>
Signed-off-by: NJulien Gomes <julien@arista.com>
Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

94df30a6

tcp: md5: hide unused variable · 083a0326

由 Arnd Bergmann 提交于 6月 20, 2017

Changing from a memcpy to per-member comparison left the
size variable unused:

net/ipv4/tcp_ipv4.c: In function 'tcp_md5_do_lookup':
net/ipv4/tcp_ipv4.c:910:15: error: unused variable 'size' [-Werror=unused-variable]

This does not show up when CONFIG_IPV6 is enabled, but the
variable can be removed either way, along with the now unused
assignment.

Fixes: 6797318e ("tcp: md5: add an address prefix for key lookup")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

083a0326

idsn: fix wrong skb_put() used · e1d20e22

由 yuan linyu 提交于 6月 21, 2017

in my commit b952f4df,
-	*(u8 *)skb_put(skb_out, 1) = (u8)(accm >> 24);	\
+	skb_put(skb_out, (u8)(accm >> 24));	\
it should skb_put_u8()

Fixes: b952f4df ("net: manual clean code which call skb_put_[data:zero])")
Signed-off-by: Nyuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e1d20e22

i40e: don't hold RTNL lock for the entire reset · dfc4ff64

由 Jacob Keller 提交于 6月 07, 2017

We recently refactored i40e_do_reset() and its friends to be able to
hold the RTNL lock only for the portions that actually need to be
protected. However, a separate refactoring added several new callers of
these functions during the PCIe error recovery and suspend/resume
cycles.

When merging the changes together, it was not noticed that we could
reduce the RTNL scope by letting the reset function handle the lock
itself, as previously it was not possible.

Fix this by replacing these call sites to indicate that the reset
function should handle its own lock. This enables multiple PFs to reset
or resume simultaneously without serializing the resets via the RTNL
lock. The end result is that on systems with lots of PFs and VFs the
resets don't stall waiting for each other to finish.

It is probable that we can also do the same for i40e_do_reset_safe, but
this author did not research that change carefully enough to be
confident.
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

dfc4ff64

i40e: Handle PE_CRITERR properly with IWARP enabled · 7642984b

由 Catherine Sullivan 提交于 6月 07, 2017

When IWARP is enabled, we weren't clearing the PE_CRITERR, just logging
it and removing it from the mask. We need to do a corer to reset the
PE_CRITERR register, so set the bit for that as we handle the
interrupt.

We should also be checking for the error against the PFINT_ICR0 register,
and only need to clear it in the value getting written to
PFINT_ICR0_ENA.
Signed-off-by: NCatherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

7642984b

i40e: clear only cause_ena bit · 2e5c26ea

由 Shannon Nelson 提交于 6月 07, 2017

When disabling interrupts, we should only be clearing the CAUSE_ENA bit,
not clearing the whole register.  Clearing the whole register sets the
NEXTQ_IDX field to 0 instead of 0x7ff which can confuse the Firmware in
some reset sequences.
Signed-off-by: NShannon Nelson <shannon.nelson@intel.com>
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

2e5c26ea

i40e: fix disabling overflow promiscuous mode · e5887239

由 Alan Brady 提交于 6月 07, 2017

There exists a bug in which the driver does not correctly exit overflow
promiscuous mode. This can occur if "too many" mac filters are added,
putting the driver into overflow promiscuous mode, and the filters are
then removed. When the failed filters are removed, the driver reports
exiting overflow promiscuous mode which is correct, however traffic
continues to be received as if in promiscuous mode still.

The bug occurs because the conditional for toggling promiscuous mode was
set to only execute when promiscuous mode was enabled and not when it
was disabled as well. This patch fixes the conditional to correctly
execute when promiscuous mode is toggled and not just enabled. Without
this patch, the driver is unable to correctly exit overflow promiscuous
mode.
Signed-off-by: NAlan Brady <alan.brady@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

e5887239

i40e: Add support for OEM firmware version · 5bbb2e20

由 Filip Sadowski 提交于 6月 07, 2017

This patch adds support for OEM firmware version. If OEM specific
adapter is detected ethtool reports OEM product version in firmware
version string instead of etrack id.
Signed-off-by: NFilip Sadowski <filip.sadowski@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

5bbb2e20

i40e: genericize the partition bandwidth control · 4fc8c676

由 Shannon Nelson 提交于 6月 07, 2017

Partition bandwidth control is not in just one form of MFP (multi-function
partitioning), so make the code more generic and be sure to nudge the Tx
scheduler for all MFP.

Copyright updated to 2017.
Signed-off-by: NShannon Nelson <shannon.nelson@intel.com>
Signed-off-by: NMitch Williams <mitch.a.williams@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

4fc8c676

i40e: Add message for unsupported MFP mode · 83d14c59

由 Carolyn Wyborny 提交于 6月 07, 2017

This patch adds a check and message if the device is in
MFP mode as changing RSS input set is not supported in
MFP mode.
Signed-off-by: NCarolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

83d14c59

i40e: Support firmware CEE DCB UP to TC map re-definition · 68fb13a7

由 Greg Bowers 提交于 6月 07, 2017

Changes parsing of FW 4.33 AQ command Get CEE DCBX OPER CFG (0x0A07).
Change is required because FW now creates the oper_prio_tc
nibbles reversed from those in the CEE Priority Group sub-TLV.
This change will only apply to FW 4.33 as future FW versions will use a
different function to parse the CEE data.
Signed-off-by: NGreg Bowers <gregory.j.bowers@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

68fb13a7

i40e: Fix potential out of bound array access · 1e998547

由 Sudheer Mogilappagari 提交于 6月 07, 2017

This is a fix for the static code analysis issue where dcbcfg->numapps
could be greater than size of array (i.e dcbcfg->app[I40E_DCBX_MAX_APPS]).
The fix makes sure that the array is not accessed past the size of
of the array (i.e. I40E_DCBX_MAX_APPS).

Copyright updated to 2017.
Signed-off-by: NSudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

1e998547

i40e: comment that udp_port must be in host byte order · 15d23b4c

由 Jacob Keller 提交于 6月 07, 2017

The firmware expects the port number passed when setting up
the UDP tunnel configuration to be in Little Endian format.
The i40e_aq_add_udp_tunnel command byte swaps the value from
host order to Little Endian.

Since commit fe0b0cd9 ("i40e: send correct port number to
AdminQ when enabling UDP tunnels") we've correctly
sent the value in host order.

Let's also add a comment to the function explaining that it must
be in host order, as the port numbers are commonly stored as Big
Endian values.
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

15d23b4c

i40e: use dev_dbg instead of dev_info when warning about missing routine · 59e331e3

由 Jacob Keller 提交于 6月 07, 2017

When searching for the vf_capability client routine, dev_info() was
used, instead of the normal dev_dbg(). This causes the message to be
displayed at standard log levels which can cause administrators to
worry. Avoid this by using dev_dbg instead.

Copyright updated to 2017.
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

59e331e3

i40e/i40evf: update WOL and I40E_AQC_ADDR_VALID_MASK flags · 7c32b1e6

由 Alice Michael 提交于 6月 07, 2017

Update a few flags related to FW interactions.

Copyright updated to 2017.
Signed-off-by: NAlice Michael <alice.michael@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

7c32b1e6

i40evf: assign num_active_queues inside i40evf_alloc_queues · 65c7006f

由 Jacob Keller 提交于 6月 07, 2017

The variable num_active_queues represents the number of active queues we
have for the device. We assign this pretty early in i40evf_init_subtask.

Several code locations are written with loops over the tx_rings and
rx_rings structures, which don't get allocated until
i40evf_alloc_queues, and which get freed by i40evf_free_queues.

These call sites were written under the assumption that tx_rings and
rx_rings would always be allocated at least when num_active_queues is
non-zero.

Lets fix this by moving the assignment into the function where we
allocate queues. We'll use a temporary variable for storage so that we
don't assign the value in the adapter structure until after the rings
have been set up.

Finally, when we free the queues, we'll clear the value to ensure that
we do not loop over the rings memory that no longer exists.

This resolves a possible NULL pointer dereference in
i40evf_get_ethtool_stats which could occur if the VF fails to recover
from a reset, and then a user requests statistics.
Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

65c7006f

i40e: add support for XDP_TX action · 74608d17

由 Björn Töpel 提交于 5月 24, 2017

This patch adds proper XDP_TX action support. For each Tx ring, an
additional XDP Tx ring is allocated and setup. This version does the
DMA mapping in the fast-path, which will penalize performance for
IOMMU enabled systems. Further, debugfs support is not wired up for
the XDP Tx rings.
Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

74608d17

i40e: add XDP support for pass and drop actions · 0c8493d9

由 Björn Töpel 提交于 5月 24, 2017

This commit adds basic XDP support for i40e derived NICs. All XDP
actions will end up in XDP_DROP.
Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com>
Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

0c8493d9

Merge tag 'mlx5-updates-2017-06-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · f5c30647

由 David S. Miller 提交于 6月 20, 2017

Saeed Mahameed says:

====================
mlx5-updates-2017-06-20 (mlx5 IPoIB updates)

This series includes updates to mlx5 IPoIB netdevice driver (mlx5i),

1. We move ipoib files into separate directory, to allow it to grow
   separately in its own space
2. Remove HW update carrier logic from IPoIB and VF representors profiles.
3. Add basic ethtool support. (Rings options/statistics and driver info).
4. Change MTU support.
5. Xmit path statistics reporting.
6. add PTP support.

For the new ethtool ops, PTP (ioctl) and change_mtu ndos in IPoIB, we didn't add new
implementation or new logic, we only reused those callbacks from the already existing
mlx5e (ethernet netdevice profile) and exposed them in IPoIB netdevice/ethtool ops.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f5c30647

Merge branch 's390-net-updates-part-2' · 8abd5599

由 David S. Miller 提交于 6月 20, 2017

Julian Wiedmann says:

====================
s390/net updates, part 2 (v2)

thanks for the feedback. Here's an updated patchset that honours
the reverse christmas tree and drops the __packed attribute. Please apply.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8abd5599

s390/qeth: use diag26c to get MAC address on L2 · ec61bd2f

由 Julian Wiedmann 提交于 6月 20, 2017

When a s390 guest runs on a z/VM host that's part of a SSI cluster,
it can be migrated to a different host. In this case, the MAC address
it originally obtained on the old host may be re-assigned to a new
guest. This would result in address conflicts between the two guests.

When running as z/VM guest, use the diag26c MAC Service to obtain
a hypervisor-managed MAC address. The MAC Service is SSI-aware, and
won't re-assign the address after the guest is migrated to a new host.

This patch adds support for the z/VM MAC Service on L2 devices.
Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
Acked-by: NUrsula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ec61bd2f

s390/diag: add diag26c support · 1b030478

由 Julian Wiedmann 提交于 6月 20, 2017

Implement support for the hypervisor diagnose 0x26c
('Access Certain System Information').
It passes a request buffer and a subfunction code, and receives
a response buffer and a return code.

Also add the scaffolding for the 'MAC Services' subfunction.
It may be used by network devices to obtain a hypervisor-managed
MAC address.
Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1b030478

s390/qeth: fix packing buffer statistics · 3cdc8a25

由 Julian Wiedmann 提交于 6月 20, 2017

There's two spots in qeth_send_packet() where we don't accurately
account for transmitted packing buffers in qeth's performance
statistics:

1) when flushing the current buffer due to insufficient size,
   and the next buffer is not EMPTY, we need to account for that
   flushed buffer.
2) when synchronizing with the TX completion code, we reset
   flush_count and thus forget to account for any previously
   flushed buffers.
Reported-by: NNils Hoppmann <niho@de.ibm.com>
Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3cdc8a25

s390/qeth: add ipa return codes for bridgeport · 2063a5f5

由 Kittipon Meesompop 提交于 6月 20, 2017

add ipa return codes for Bridgeport (HiperSockets and OSA) according to
system level design.
Signed-off-by: NKittipon Meesompop <kmeesomp@linux.vnet.ibm.com>
Reviewed-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: NUrsula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2063a5f5

sctp: handle errors when updating asoc · 5ee8aa68

由 Xin Long 提交于 6月 20, 2017

It's a bad thing not to handle errors when updating asoc. The memory
allocation failure in any of the functions called in sctp_assoc_update()
would cause sctp to work unexpectedly.

This patch is to fix it by aborting the asoc and reporting the error when
any of these functions fails.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5ee8aa68

sctp: uncork the old asoc before changing to the new one · 8cd5c25f

由 Xin Long 提交于 6月 20, 2017

local_cork is used to decide if it should uncork asoc outq after processing
some cmds, and it is set when replying or sending msgs. local_cork should
always have the same value with current asoc q->cork in some way.

The thing is when changing to a new asoc by cmd SET_ASOC, local_cork may
not be consistent with the current asoc any more. The cmd seqs can be:

  SCTP_CMD_UPDATE_ASSOC (asoc)
  SCTP_CMD_REPLY (asoc)
  SCTP_CMD_SET_ASOC (new_asoc)
  SCTP_CMD_DELETE_TCB (new_asoc)
  SCTP_CMD_SET_ASOC (asoc)
  SCTP_CMD_REPLY (asoc)

The 1st REPLY makes OLD asoc q->cork and local_cork both are 1, and the cmd
DELETE_TCB clears NEW asoc q->cork and local_cork. After asoc goes back to
OLD asoc, q->cork is still 1 while local_cork is 0. The 2nd REPLY will not
set local_cork because q->cork is already set and it can't be uncorked and
sent out because of this.

To keep local_cork consistent with the current asoc q->cork, this patch is
to uncork the old asoc if local_cork is set before changing to the new one.

Note that the above cmd seqs will be used in the next patch when updating
asoc and handling errors in it.
Suggested-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8cd5c25f

dccp: call inet_add_protocol after register_pernet_subsys in dccp_v6_init · a0f9a4c2

由 Xin Long 提交于 6月 20, 2017

Patch "call inet_add_protocol after register_pernet_subsys in dccp_v4_init"
fixed a null pointer dereference issue for dccp_ipv4 module.

The same fix is needed for dccp_ipv6 module.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a0f9a4c2

dccp: call inet_add_protocol after register_pernet_subsys in dccp_v4_init · d5494acb

由 Xin Long 提交于 6月 20, 2017

Now dccp_ipv4 works as a kernel module. During loading this module, if
one dccp packet is being recieved after inet_add_protocol but before
register_pernet_subsys in which v4_ctl_sk is initialized, a null pointer
dereference may be triggered because of init_net.dccp.v4_ctl_sk is 0x0.

Jianlin found this issue when the following call trace occurred:

[  171.950177] BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
[  171.951007] IP: [<ffffffffc0558364>] dccp_v4_ctl_send_reset+0xc4/0x220 [dccp_ipv4]
[...]
[  171.984629] Call Trace:
[  171.984859]  <IRQ>
[  171.985061]
[  171.985213]  [<ffffffffc0559a53>] dccp_v4_rcv+0x383/0x3f9 [dccp_ipv4]
[  171.985711]  [<ffffffff815ca054>] ip_local_deliver_finish+0xb4/0x1f0
[  171.986309]  [<ffffffff815ca339>] ip_local_deliver+0x59/0xd0
[  171.986852]  [<ffffffff810cd7a4>] ? update_curr+0x104/0x190
[  171.986956]  [<ffffffff815c9cda>] ip_rcv_finish+0x8a/0x350
[  171.986956]  [<ffffffff815ca666>] ip_rcv+0x2b6/0x410
[  171.986956]  [<ffffffff810c83b4>] ? task_cputime+0x44/0x80
[  171.986956]  [<ffffffff81586f22>] __netif_receive_skb_core+0x572/0x7c0
[  171.986956]  [<ffffffff810d2c51>] ? trigger_load_balance+0x61/0x1e0
[  171.986956]  [<ffffffff81587188>] __netif_receive_skb+0x18/0x60
[  171.986956]  [<ffffffff8158841e>] process_backlog+0xae/0x180
[  171.986956]  [<ffffffff8158799d>] net_rx_action+0x16d/0x380
[  171.986956]  [<ffffffff81090b7f>] __do_softirq+0xef/0x280
[  171.986956]  [<ffffffff816b6a1c>] call_softirq+0x1c/0x30

This patch is to move inet_add_protocol after register_pernet_subsys in
dccp_v4_init, so that v4_ctl_sk is initialized before any incoming dccp
packets are processed.
Reported-by: NJianlin Shi <jishi@redhat.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d5494acb