提交 · d692f1138a4bac2efd2c8656ca15556b63479e82 · openanolis / cloud-kernel

31 7月, 2018 2 次提交

bpf: Support bpf_get_socket_cookie in more prog types · d692f113

由 Andrey Ignatov 提交于 7月 30, 2018

bpf_get_socket_cookie() helper can be used to identify skb that
correspond to the same socket.

Though socket cookie can be useful in many other use-cases where socket is
available in program context. Specifically BPF_PROG_TYPE_CGROUP_SOCK_ADDR
and BPF_PROG_TYPE_SOCK_OPS programs can benefit from it so that one of
them can augment a value in a map prepared earlier by other program for
the same socket.

The patch adds support to call bpf_get_socket_cookie() from
BPF_PROG_TYPE_CGROUP_SOCK_ADDR and BPF_PROG_TYPE_SOCK_OPS.

It doesn't introduce new helpers. Instead it reuses same helper name
bpf_get_socket_cookie() but adds support to this helper to accept
`struct bpf_sock_addr` and `struct bpf_sock_ops`.

Documentation in bpf.h is changed in a way that should not break
automatic generation of markdown.
Signed-off-by: NAndrey Ignatov <rdna@fb.com>
Acked-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

d692f113

bpf: add End.DT6 action to bpf_lwt_seg6_action helper · 486cdf21

由 Mathieu Xhonneux 提交于 7月 26, 2018

The seg6local LWT provides the End.DT6 action, which allows to
decapsulate an outer IPv6 header containing a Segment Routing Header
(SRH), full specification is available here:

https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-05

This patch adds this action now to the seg6local BPF
interface. Since it is not mandatory that the inner IPv6 header also
contains a SRH, seg6_bpf_srh_state has been extended with a pointer to
a possible SRH of the outermost IPv6 header. This helps assessing if the
validation must be triggered or not, and avoids some calls to
ipv6_find_hdr.

v3: s/1/true, s/0/false for boolean values
v2: - changed true/false -> 1/0
    - preempt_enable no longer called in first conditional block
Signed-off-by: NMathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

486cdf21

25 7月, 2018 3 次提交

net/sched: add skbprio scheduler · aea5f654

由 Nishanth Devarajan 提交于 7月 23, 2018

Skbprio (SKB Priority Queue) is a queueing discipline that prioritizes packets
according to their skb->priority field. Under congestion, already-enqueued lower
priority packets will be dropped to make space available for higher priority
packets. Skbprio was conceived as a solution for denial-of-service defenses that
need to route packets with different priorities as a means to overcome DoS
attacks.

v5
*Do not reference qdisc_dev(sch)->tx_queue_len for setting limit. Instead set
default sch->limit to 64.

v4
*Drop Documentation/networking/sch_skbprio.txt doc file to move it to tc man
page for Skbprio, in iproute2.

v3
*Drop max_limit parameter in struct skbprio_sched_data and instead use
sch->limit.

*Reference qdisc_dev(sch)->tx_queue_len only once, during initialisation for
qdisc (previously being referenced every time qdisc changes).

*Move qdisc's detailed description from in-code to Documentation/networking.

*When qdisc is saturated, enqueue incoming packet first before dequeueing
lowest priority packet in queue - improves usage of call stack registers.

*Introduce and use overlimit stat to keep track of number of dropped packets.

v2
*Use skb->priority field rather than DS field. Rename queueing discipline as
SKB Priority Queue (previously Gatekeeper Priority Queue).

*Queueing discipline is made classful to expose Skbprio's internal priority
queues.
Signed-off-by: NNishanth Devarajan <ndev2021@gmail.com>
Reviewed-by: NSachin Paryani <sachin.paryani@gmail.com>
Reviewed-by: NCody Doucette <doucette@bu.edu>
Reviewed-by: NMichel Machado <michel@digirati.com.br>
Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aea5f654

net: phy: add GBit master / slave error detection · b8f8c8eb

由 Heiner Kallweit 提交于 7月 21, 2018

Certain PHY's have issues when operating in GBit slave mode and can
be forced to master mode. Examples are RTL8211C, also the Micrel PHY
driver has a DT setting to force master mode.
If two such chips are link partners the autonegotiation will fail.
Standard defines a self-clearing on read, latched-high bit to
indicate this error. Check this bit to inform the user.
Signed-off-by: NHeiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b8f8c8eb

netlink: do not store start function in netlink_cb · 3730cf4d

由 Florian Westphal 提交于 7月 24, 2018

->start() is called once when dump is being initialized, there is no
need to store it in netlink_cb.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3730cf4d

24 7月, 2018 9 次提交

rds: Extend RDS API for IPv6 support · b7ff8b10

由 Ka-Cheong Poon 提交于 7月 23, 2018

There are many data structures (RDS socket options) used by RDS apps
which use a 32 bit integer to store IP address. To support IPv6,
struct in6_addr needs to be used. To ensure backward compatibility, a
new data structure is introduced for each of those data structures
which use a 32 bit integer to represent an IP address. And new socket
options are introduced to use those new structures. This means that
existing apps should work without a problem with the new RDS module.
For apps which want to use IPv6, those new data structures and socket
options can be used. IPv4 mapped address is used to represent IPv4
address in the new data structures.

v4: Revert changes to SO_RDS_TRANSPORT
Signed-off-by: NKa-Cheong Poon <ka-cheong.poon@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b7ff8b10

net: sched: cls_flower: propagate chain teplate creation and destruction to drivers · 34738452

由 Jiri Pirko 提交于 7月 23, 2018

Introduce a couple of flower offload commands in order to propagate
template creation/destruction events down to device drivers.
Drivers may use this information to prepare HW in an optimal way
for future filter insertions.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

34738452

net: sched: introduce chain templates · 9f407f17

由 Jiri Pirko 提交于 7月 23, 2018

Allow user to set a template for newly created chains. Template lock
down the chain for particular classifier type/options combinations.
The classifier needs to support templates, otherwise kernel would
reply with error.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9f407f17

net: sched: introduce chain object to uapi · 32a4f5ec

由 Jiri Pirko 提交于 7月 23, 2018

Allow user to create, destroy, get and dump chain objects. Do that by
extending rtnl commands by the chain-specific ones. User will now be
able to explicitly create or destroy chains (so far this was done only
automatically according the filter/act needs and refcounting). Also, the
user will receive notification about any chain creation or destuction.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

32a4f5ec

net: sched: Avoid implicit chain 0 creation · f71e0ca4

由 Jiri Pirko 提交于 7月 23, 2018

Currently, chain 0 is implicitly created during block creation. However
that does not align with chain object exposure, creation and destruction
api introduced later on. So make the chain 0 behave the same way as any
other chain and only create it when it is needed. Since chain 0 is
somehow special as the qdiscs need to hold pointer to the first chain
tp, this requires to move the chain head change callback infra to the
block structure.
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f71e0ca4

net/mlx5: FW tracer, events handling · c71ad41c

由 Feras Daoud 提交于 2月 07, 2018

The tracer has one event, event 0x26, with two subtypes:
- Subtype 0: Ownership change
- Subtype 1: Traces available

An ownership change occurs in the following cases:
1- Owner releases his ownership, in this case, an event will be
sent to inform others to reattempt acquire ownership.
2- Ownership was taken by a higher priority tool, in this case
the owner should understand that it lost ownership, and go through
tear down flow.

The second subtype indicates that there are traces in the trace buffer,
in this case, the driver polls the tracer buffer for new traces, parse
them and prepares the messages for printing.

The HW starts tracing from the first address in the tracer buffer.
Driver receives an event notifying that new trace block exists.
HW posts a timestamp event at the last 8B of every 256B block.
Comparing the timestamp to the last handled timestamp would indicate
that this is a new trace block. Once the new timestamp is detected,
the entire block is considered valid.

Block validation and parsing, should be done after copying the current
block to a different location, in order to avoid block overwritten
during processing.
Signed-off-by: NFeras Daoud <ferasda@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

c71ad41c

net/mlx5: FW tracer, implement tracer logic · f53aaa31

由 Feras Daoud 提交于 7月 16, 2018

Implement FW tracer logic and registers access, initialization and
cleanup flows.

Initializing the tracer will be part of load one flow, as multiple
PFs will try to acquire ownership but only one will succeed and will
be the tracer owner.
Signed-off-by: NFeras Daoud <ferasda@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

f53aaa31

net/smc: provide smc mode in smc_diag.c · c601171d

由 Karsten Graul 提交于 7月 23, 2018

Rename field diag_fallback into diag_mode and set the smc mode of a
connection explicitly.
Signed-off-by: NKarsten Graul <kgraul@linux.ibm.com>
Signed-off-by: NUrsula Braun <ubraun@linux.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c601171d

net: bridge: add support for backup port · 2756f68c

由 Nikolay Aleksandrov 提交于 7月 23, 2018

This patch adds a new port attribute - IFLA_BRPORT_BACKUP_PORT, which
allows to set a backup port to be used for known unicast traffic if the
port has gone carrier down. The backup pointer is rcu protected and set
only under RTNL, a counter is maintained so when deleting a port we know
how many other ports reference it as a backup and we remove it from all.
Also the pointer is in the first cache line which is hot at the time of
the check and thus in the common case we only add one more test.
The backup port will be used only for the non-flooding case since
it's a part of the bridge and the flooded packets will be forwarded to it
anyway. To remove the forwarding just send a 0/non-existing backup port.
This is used to avoid numerous scalability problems when using MLAG most
notably if we have thousands of fdbs one would need to change all of them
on port carrier going down which takes too long and causes a storm of fdb
notifications (and again when the port comes back up). In a Multi-chassis
Link Aggregation setup usually hosts are connected to two different
switches which act as a single logical switch. Those switches usually have
a control and backup link between them called peerlink which might be used
for communication in case a host loses connectivity to one of them.
We need a fast way to failover in case a host port goes down and currently
none of the solutions (like bond) cannot fulfill the requirements because
the participating ports are actually the "master" devices and must have the
same peerlink as their backup interface and at the same time all of them
must participate in the bridge device. As Roopa noted it's normal practice
in routing called fast re-route where a precalculated backup path is used
when the main one is down.
Another use case of this is with EVPN, having a single vxlan device which
is backup of every port. Due to the nature of master devices it's not
currently possible to use one device as a backup for many and still have
all of them participate in the bridge (which is master itself).
More detailed information about MLAG is available at the link below.
https://docs.cumulusnetworks.com/display/DOCS/Multi-Chassis+Link+Aggregation+-+MLAG

Further explanation and a diagram by Roopa:
Two switches acting in a MLAG pair are connected by the peerlink
interface which is a bridge port.

the config on one of the switches looks like the below. The other
switch also has a similar config.
eth0 is connected to one port on the server. And the server is
connected to both switches.

br0 -- team0---eth0
|
-- switch-peerlink
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2756f68c

23 7月, 2018 1 次提交

nfp: bring back support for offloading shared blocks · 042f8825

由 Jakub Kicinski 提交于 7月 20, 2018

Now that we have offload replay infrastructure added by
commit 32636742 ("net: sched: call reoffload op on block callback reg")
and flows are guaranteed to be removed correctly, we can revert
commit 951a8ee6 ("nfp: reject binding to shared blocks").
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NJohn Hurley <john.hurley@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

042f8825

21 7月, 2018 4 次提交

net: create reusable function for getting ownership info of sysfs inodes · fbdeaed4

由 Tyler Hicks 提交于 7月 20, 2018

Make net_ns_get_ownership() reusable by networking code outside of core.
This is useful, for example, to allow bridge related sysfs files to be
owned by container root.

Add a function comment since this is a potentially dangerous function to
use given the way that kobject_get_ownership() works by initializing uid
and gid before calling .get_ownership().
Signed-off-by: NTyler Hicks <tyhicks@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fbdeaed4

driver core: set up ownership of class devices in sysfs · 9944e894

由 Dmitry Torokhov 提交于 7月 20, 2018

Plumb in get_ownership() callback for devices belonging to a class so that
they can be created with uid/gid different from global root. This will
allow network devices in a container to belong to container's root and not
global root.
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: NTyler Hicks <tyhicks@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9944e894

sysfs, kobject: allow creating kobject belonging to arbitrary users · 5f81880d

由 Dmitry Torokhov 提交于 7月 20, 2018

Normally kobjects and their sysfs representation belong to global root,
however it is not necessarily the case for objects in separate namespaces.
For example, objects in separate network namespace logically belong to the
container's root and not global root.

This change lays groundwork for allowing network namespace objects
ownership to be transferred to container's root user by defining
get_ownership() callback in ktype structure and using it in sysfs code to
retrieve desired uid/gid when creating sysfs objects for given kobject.
Co-Developed-by: NTyler Hicks <tyhicks@canonical.com>
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: NTyler Hicks <tyhicks@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f81880d

kernfs: allow creating kernfs objects with arbitrary uid/gid · 488dee96

由 Dmitry Torokhov 提交于 7月 20, 2018

This change allows creating kernfs files and directories with arbitrary
uid/gid instead of always using GLOBAL_ROOT_UID/GID by extending
kernfs_create_dir_ns() and kernfs_create_file_ns() with uid/gid arguments.
The "simple" kernfs_create_file() and kernfs_create_dir() are left alone
and always create objects belonging to the global root.

When creating symlinks ownership (uid/gid) is taken from the target kernfs
object.
Co-Developed-by: NTyler Hicks <tyhicks@canonical.com>
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: NTyler Hicks <tyhicks@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

488dee96

20 7月, 2018 5 次提交

Revert "iommu/vt-d: Clean up pasid quirk for pre-production devices" · 2db1581e

由 Lu Baolu 提交于 7月 08, 2018

This reverts commit ab96746a.

The commit ab96746a ("iommu/vt-d: Clean up pasid quirk for
pre-production devices") triggers ECS mode on some platforms
which have broken ECS support. As the result, graphic device
will be inoperable on boot.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107017

Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>

2db1581e

qed: Add qed APIs for PHY module query. · b51dab46

由 Sudarsana Reddy Kalluru 提交于 7月 18, 2018

This patch adds qed APIs for reading the PHY module.
Signed-off-by: NSudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: NAriel Elior <ariel.elior@cavium.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b51dab46

net/sched: cls_flower: Support matching on ip tos and ttl for tunnels · 0e2c17b6

由 Or Gerlitz 提交于 7月 17, 2018

Allow users to set rules matching on ipv4 tos and ttl or
ipv6 traffic-class and hoplimit of tunnel headers.
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: NRoi Dayan <roid@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0e2c17b6

flow_dissector: Dissect tos and ttl from the tunnel info · 5544adb9

由 Or Gerlitz 提交于 7月 17, 2018

Add dissection of the tos and ttl from the ip tunnel headers
fields in case a match is needed on them.
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: NRoi Dayan <roid@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5544adb9

net/sched: tunnel_key: Allow to set tos and ttl for tc based ip tunnels · 07a557f4

由 Or Gerlitz 提交于 7月 17, 2018

Allow user-space to provide tos and ttl to be set for the tunnel headers.
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: NRoi Dayan <roid@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07a557f4

19 7月, 2018 8 次提交

ipv6: fix useless rol32 call on hash · 169dc027

由 Colin Ian King 提交于 7月 17, 2018

The rol32 call is currently rotating hash but the rol'd value is
being discarded. I believe the current code is incorrect and hash
should be assigned the rotated value returned from rol32.

Thanks to David Lebrun for spotting this.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

169dc027

net: dsa: Remove VLA usage · 0015b80a

由 Salvatore Mesoraca 提交于 7月 16, 2018

We avoid 2 VLAs by using a pre-allocated field in dsa_switch. We also
try to avoid dynamic allocation whenever possible (when using fewer than
bits-per-long ports, which is the common case).

Link: http://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Link: http://lkml.kernel.org/r/20180505185145.GB32630@lunn.chSigned-off-by: NSalvatore Mesoraca <s.mesoraca16@gmail.com>
[kees: tweak commit subject and message slightly]
Signed-off-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0015b80a

net/mlx5: Better return types for CQE API · e2abdcf1

由 Tariq Toukan 提交于 7月 16, 2018

Reduce sizes of return types.
Use bool for binary indication.
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

e2abdcf1

net/mlx5: Add core support for double vlan push/pop steering action · 8da6fe2a

由 Jianbo Liu 提交于 7月 16, 2018

As newer firmware supports double push/pop in a single FTE, we add
core bits and extend vlan action logic for it.
Signed-off-by: NJianbo Liu <jianbol@mellanox.com>
Reviewed-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

8da6fe2a

net/mlx5: Expose MPEGC (Management PCIe General Configuration) structures · 5e022dd3

由 Eran Ben Elisha 提交于 7月 16, 2018

This patch exposes PRM layout for handling MPEGC (Management PCIe
General Configuration).

This will be used in the downstream patch for configuring MPEGC via the
driver.
Signed-off-by: NEran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: NMoshe Shemesh <moshe@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

5e022dd3

net/mlx5: FW tracer, add hardware structures · eff8ea8f

由 Feras Daoud 提交于 7月 16, 2018

This change adds the infrastructure to mlx5 core fw tracer.
It introduces the following 4 new registers:
MLX5_REG_MTRC_CAP  - Used to read tracer capabilities
MLX5_REG_MTRC_CONF - Used to set tracer configurations
MLX5_REG_MTRC_STDB - Used to query tracer strings database
MLX5_REG_MTRC_CTRL - Used to control the tracer

The capability of the tracing can be checked using mcam access
register, therefore, the mcam access register interface will expose
the tracer register.
Signed-off-by: NFeras Daoud <ferasda@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>

eff8ea8f

net: Move skb decrypted field, avoid explicity copy · a48d189e

由 Stefano Brivio 提交于 7月 17, 2018

Commit 784abe24 ("net: Add decrypted field to skb")
introduced a 'decrypted' field that is explicitly copied on skb
copy and clone.

Move it between headers_start[0] and headers_end[0], so that we
don't need to copy it explicitly as it's copied by the memcpy()
in __copy_skb_header().

While at it, drop the assignment in __skb_clone(), it was
already redundant.

This doesn't change the size of sk_buff or cacheline boundaries.

The 15-bits hole before tc_index becomes a 14-bits hole, and
will be again a 15-bits hole when this change is merged with
commit 8b700862 ("net: Don't copy pfmemalloc flag in
__copy_skb_header()").

v2: as reported by kbuild test robot (oops, I forgot to build
    with CONFIG_TLS_DEVICE it seems), we can't use
    CHECK_SKB_FIELD() on a bit-field member. Just drop the
    check for the moment being, perhaps we could think of some
    magic to also check bit-field members one day.

Fixes: 784abe24 ("net: Add decrypted field to skb")
Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a48d189e

PCI: OF: Fix I/O space page leak · a5fb9fb0

由 Sergei Shtylyov 提交于 7月 18, 2018

When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
driver was left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried, finally causing the BUG due to trying to remap
already remapped pages.

Introduce the devm_pci_remap_iospace() managed API and replace the
pci_remap_iospace() call with it to fix the bug.

Fixes: dbf9826d ("PCI: generic: Convert to DT resource parsing API")
Signed-off-by: NSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: split commit/updated the commit log]
Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NLinus Walleij <linus.walleij@linaro.org>

a5fb9fb0

18 7月, 2018 8 次提交

bpf: offload: allow program and map sharing per-ASIC · fd4f227d

由 Jakub Kicinski 提交于 7月 17, 2018

Allow programs and maps to be re-used across different netdevs,
as long as they belong to the same struct bpf_offload_dev.
Update the bpf_offload_prog_map_match() helper for the verifier
and export a new helper for the drivers to use when checking
programs at attachment time.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

fd4f227d

bpf: offload: keep the offload state per-ASIC · 602144c2

由 Jakub Kicinski 提交于 7月 17, 2018

Create a higher-level entity to represent a device/ASIC to allow
programs and maps to be shared between device ports.  The extra
work is required to make sure we don't destroy BPF objects as
soon as the netdev for which they were loaded gets destroyed,
as other ports may still be using them.  When netdev goes away
all of its BPF objects will be moved to other netdevs of the
device, and only destroyed when last netdev is unregistered.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

602144c2

bpf: offload: aggregate offloads per-device · 9fd7c555

由 Jakub Kicinski 提交于 7月 17, 2018

Currently we have two lists of offloaded objects - programs and maps.
Netdevice unregister notifier scans those lists to orphan objects
associated with device being unregistered.  This puts unnecessary
(even if negligible) burden on all netdev unregister calls in BPF-
-enabled kernel.  The lists of objects may potentially get long
making the linear scan even more problematic.  There haven't been
complaints about this mechanisms so far, but it is suboptimal.

Instead of relying on notifiers, make the few BPF-capable drivers
register explicitly for BPF offloads.  The programs and maps will
now be collected per-device not on a global list, and only scanned
for removal when driver unregisters from BPF offloads.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

9fd7c555

bpf: offload: rename bpf_offload_dev_match() to bpf_offload_prog_map_match() · 09728266

由 Jakub Kicinski 提交于 7月 17, 2018

A set of new API functions exported for the drivers will soon use
'bpf_offload_dev_' as a prefix.  Rename the bpf_offload_dev_match()
which is internal to the core (used by the verifier) to avoid any
confusion.
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

09728266

bpf: bpf_prog_array_alloc() should return a generic non-rcu pointer · d29ab6e1

由 Roman Gushchin 提交于 7月 13, 2018

Currently the return type of the bpf_prog_array_alloc() is
struct bpf_prog_array __rcu *, which is not quite correct.
Obviously, the returned pointer is a generic pointer, which
is valid for an indefinite amount of time and it's not shared
with anyone else, so there is no sense in marking it as __rcu.

This change eliminate the following sparse warnings:
kernel/bpf/core.c:1544:31: warning: incorrect type in return expression (different address spaces)
kernel/bpf/core.c:1544:31: expected struct bpf_prog_array [noderef] <asn:4>*
kernel/bpf/core.c:1544:31: got void *
kernel/bpf/core.c:1548:17: warning: incorrect type in return expression (different address spaces)
kernel/bpf/core.c:1548:17: expected struct bpf_prog_array [noderef] <asn:4>*
kernel/bpf/core.c:1548:17: got struct bpf_prog_array *<noident>
kernel/bpf/core.c:1681:15: warning: incorrect type in assignment (different address spaces)
kernel/bpf/core.c:1681:15: expected struct bpf_prog_array *array
kernel/bpf/core.c:1681:15: got struct bpf_prog_array [noderef] <asn:4>*

Fixes: 324bda9e ("bpf: multi program support for cgroup+bpf")
Signed-off-by: NRoman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

d29ab6e1

netfilter: nf_osf: add missing definitions to header file · 24c458c4

由 Fernando Fernandez Mancera 提交于 7月 14, 2018

Add missing definitions from nf_osf.h in order to extract Passive OS
fingerprint infrastructure from xt_osf.
Signed-off-by: NFernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

24c458c4

ipv6: remove dependency of nf_defrag_ipv6 on ipv6 module · 70b095c8

由 Florian Westphal 提交于 7月 14, 2018

IPV6=m
DEFRAG_IPV6=m
CONNTRACK=y yields:

net/netfilter/nf_conntrack_proto.o: In function `nf_ct_netns_do_get':
net/netfilter/nf_conntrack_proto.c:802: undefined reference to `nf_defrag_ipv6_enable'
net/netfilter/nf_conntrack_proto.o:(.rodata+0x640): undefined reference to `nf_conntrack_l4proto_icmpv6'

Setting DEFRAG_IPV6=y causes undefined references to ip6_rhash_params
ip6_frag_init and ip6_expire_frag_queue so it would be needed to force
IPV6=y too.

This patch gets rid of the 'followup linker error' by removing
the dependency of ipv6.ko symbols from netfilter ipv6 defrag.

Shared code is placed into a header, then used from both.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

70b095c8

netfilter: nft_socket: Expose socket mark · 7d25f885

由 Máté Eckl 提交于 7月 12, 2018

Signed-off-by: NMáté Eckl <ecklm94@gmail.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

7d25f885

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功