提交 · d8ba8ba4c801b794f47852a6f1821ea48f83b5d1 · openeuler / Kernel

30 11月, 2022 1 次提交

KVM: x86/xen: Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured · d8ba8ba4

由 David Woodhouse 提交于 11月 27, 2022

Closer inspection of the Xen code shows that we aren't supposed to be
using the XEN_RUNSTATE_UPDATE flag unconditionally. It should be
explicitly enabled by guests through the HYPERVISOR_vm_assist hypercall.
If we randomly set the top bit of ->state_entry_time for a guest that
hasn't asked for it and doesn't expect it, that could make the runtimes
fail to add up and confuse the guest. Without the flag it's perfectly
safe for a vCPU to read its own vcpu_runstate_info; just not for one
vCPU to read *another's*.

I briefly pondered adding a word for the whole set of VMASST_TYPE_*
flags but the only one we care about for HVM guests is this, so it
seemed a bit pointless.
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20221127122210.248427-3-dwmw2@infradead.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d8ba8ba4

23 11月, 2022 2 次提交

KVM: s390: pv: add KVM_CAP_S390_PROTECTED_ASYNC_DISABLE · 8c516b25

由 Claudio Imbrenda 提交于 11月 11, 2022

Add KVM_CAP_S390_PROTECTED_ASYNC_DISABLE to signal that the
KVM_PV_ASYNC_DISABLE and KVM_PV_ASYNC_DISABLE_PREPARE commands for the
KVM_S390_PV_COMMAND ioctl are available.
Signed-off-by: NClaudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: NNico Boehr <nrb@linux.ibm.com>
Reviewed-by: NSteffen Eiden <seiden@linux.ibm.com>
Reviewed-by: NJanosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111170632.77622-4-imbrenda@linux.ibm.com
Message-Id: <20221111170632.77622-4-imbrenda@linux.ibm.com>
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>

8c516b25

KVM: s390: pv: asynchronous destroy for reboot · fb491d55

由 Claudio Imbrenda 提交于 11月 11, 2022

Until now, destroying a protected guest was an entirely synchronous
operation that could potentially take a very long time, depending on
the size of the guest, due to the time needed to clean up the address
space from protected pages.

This patch implements an asynchronous destroy mechanism, that allows a
protected guest to reboot significantly faster than previously.

This is achieved by clearing the pages of the old guest in background.
In case of reboot, the new guest will be able to run in the same
address space almost immediately.

The old protected guest is then only destroyed when all of its memory
has been destroyed or otherwise made non protected.

Two new PV commands are added for the KVM_S390_PV_COMMAND ioctl:

KVM_PV_ASYNC_CLEANUP_PREPARE: set aside the current protected VM for
later asynchronous teardown. The current KVM VM will then continue
immediately as non-protected. If a protected VM had already been
set aside for asynchronous teardown, but without starting the teardown
process, this call will fail. There can be at most one VM set aside at
any time. Once it is set aside, the protected VM only exists in the
context of the Ultravisor, it is not associated with the KVM VM
anymore. Its protected CPUs have already been destroyed, but not its
memory. This command can be issued again immediately after starting
KVM_PV_ASYNC_CLEANUP_PERFORM, without having to wait for completion.

KVM_PV_ASYNC_CLEANUP_PERFORM: tears down the protected VM previously
set aside using KVM_PV_ASYNC_CLEANUP_PREPARE. Ideally the
KVM_PV_ASYNC_CLEANUP_PERFORM PV command should be issued by userspace
from a separate thread. If a fatal signal is received (or if the
process terminates naturally), the command will terminate immediately
without completing. All protected VMs whose teardown was interrupted
will be put in the need_cleanup list. The rest of the normal KVM
teardown process will take care of properly cleaning up all remaining
protected VMs, including the ones on the need_cleanup list.
Signed-off-by: NClaudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: NNico Boehr <nrb@linux.ibm.com>
Reviewed-by: NJanosch Frank <frankja@linux.ibm.com>
Reviewed-by: NSteffen Eiden <seiden@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111170632.77622-2-imbrenda@linux.ibm.com
Message-Id: <20221111170632.77622-2-imbrenda@linux.ibm.com>
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>

fb491d55

10 11月, 2022 1 次提交

KVM: x86: Add a VALID_MASK for the MSR exit reason flags · db205f7e

由 Aaron Lewis 提交于 9月 21, 2022

Add the mask KVM_MSR_EXIT_REASON_VALID_MASK for the MSR exit reason
flags.  This simplifies checks that validate these flags, and makes it
easier to introduce new flags in the future.

No functional change intended.
Signed-off-by: NAaron Lewis <aaronlewis@google.com>
Message-Id: <20220921151525.904162-3-aaronlewis@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

db205f7e

27 10月, 2022 1 次提交

perf/mem: Rename PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL · cb6c18b5

由 Ravi Bangoria 提交于 10月 01, 2022

PERF_MEM_LVLNUM_EXTN_MEM was introduced to cover CXL devices but it's
bit ambiguous name and also not generic enough to cover cxl.cache and
cxl.io devices. Rename it to PERF_MEM_LVLNUM_CXL to be more specific.
Signed-off-by: NRavi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/f6268268-b4e9-9ed6-0453-65792644d953@amd.com

cb6c18b5

25 10月, 2022 1 次提交

media: videodev2.h: V4L2_DV_BT_BLANKING_HEIGHT should check 'interlaced' · 8da7f097

由 Hans Verkuil 提交于 10月 12, 2022

If it is a progressive (non-interlaced) format, then ignore the
interlaced timing values.
Signed-off-by: NHans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 7f68127f ([media] videodev2.h: defines to calculate blanking and frame sizes)
Signed-off-by: NMauro Carvalho Chehab <mchehab@kernel.org>

8da7f097

08 10月, 2022 1 次提交

vDPA: allow userspace to query features of a vDPA device · 22856510

由 Zhu Lingshan 提交于 9月 29, 2022

This commit adds a new vDPA netlink attribution
VDPA_ATTR_VDPA_DEV_SUPPORTED_FEATURES. Userspace can query
features of vDPA devices through this new attr.

This commit invokes vdpa_config_ops.get_config()
rather than vdpa_get_config_unlocked() to read
the device config spcae, so no races in
vdpa_set_features_unlocked()

Userspace tool iproute2 example:
$ vdpa dev config show vdpa0
vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 4 mtu 1500
  negotiated_features MRG_RXBUF CTRL_VQ MQ VERSION_1 ACCESS_PLATFORM
  dev_features MTU MAC MRG_RXBUF CTRL_VQ MQ ANY_LAYOUT VERSION_1 ACCESS_PLATFORM
Signed-off-by: NZhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20220929014555.112323-2-lingshan.zhu@intel.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

22856510

07 10月, 2022 2 次提交

virtio_blk: add SECURE ERASE command support · e60d6407

由 Alvaro Karsz 提交于 9月 21, 2022

Support for the VIRTIO_BLK_F_SECURE_ERASE VirtIO feature.

A device that offers this feature can receive VIRTIO_BLK_T_SECURE_ERASE
commands.

A device which supports this feature has the following fields in the
virtio config:

- max_secure_erase_sectors
- max_secure_erase_seg
- secure_erase_sector_alignment

max_secure_erase_sectors and secure_erase_sector_alignment are expressed
in 512-byte units.

Every secure erase command has the following fields:

- sectors: The starting offset in 512-byte units.
- num_sectors: The number of sectors.
Signed-off-by: NAlvaro Karsz <alvaro.karsz@solid-run.com>
Message-Id: <20220921082729.2516779-1-alvaro.karsz@solid-run.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>

e60d6407

vdpa: device feature provisioning · 90fea5a8

由 Jason Wang 提交于 9月 27, 2022

This patch allows the device features to be provisioned through
netlink. A new attribute is introduced to allow the userspace to pass
a 64bit device features during device adding.

This provides several advantages:

- Allow to provision a subset of the features to ease the cross vendor
  live migration.
- Better debug-ability for vDPA framework and parent.
Reviewed-by: NEli Cohen <elic@nvidia.com>
Signed-off-by: NJason Wang <jasowang@redhat.com>
Message-Id: <20220927074810.28627-2-jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

90fea5a8

04 10月, 2022 1 次提交

ethtool: add interface to interact with Ethernet Power Equipment · 18ff0bcd

由 Oleksij Rempel 提交于 10月 03, 2022

Add interface to support Power Sourcing Equipment. At current step it
provides generic way to address all variants of PSE devices as defined
in IEEE 802.3-2018 but support only objects specified for IEEE 802.3-2018 104.4
PoDL Power Sourcing Equipment (PSE).

Currently supported and mandatory objects are:
IEEE 802.3-2018 30.15.1.1.3 aPoDLPSEPowerDetectionStatus
IEEE 802.3-2018 30.15.1.1.2 aPoDLPSEAdminState
IEEE 802.3-2018 30.15.1.2.1 acPoDLPSEAdminControl

This is minimal interface needed to control PSE on each separate
ethernet port but it provides not all mandatory objects specified in
IEEE 802.3-2018.

Since "PoDL PSE" and "PSE" have similar names, but some different values
I decide to not merge them and keep separate naming schema. This should
allow as to be as close to IEEE 802.3 spec as possible and avoid name
conflicts in the future.

This implementation is connected to PHYs instead of MACs because PSE
auto classification can potentially interfere with PHY auto negotiation.
So, may be some extra PHY related initialization will be needed.

With WIP version of ethtools interaction with PSE capable link looks
as following:

$ ip l
...
5: t1l1@eth0: <BROADCAST,MULTICAST> ..
...

$ ethtool --show-pse t1l1
PSE attributs for t1l1:
PoDL PSE Admin State: disabled
PoDL PSE Power Detection Status: disabled

$ ethtool --set-pse t1l1 podl-pse-admin-control enable
$ ethtool --show-pse t1l1
PSE attributs for t1l1:
PoDL PSE Admin State: enabled
PoDL PSE Power Detection Status: delivering power
Signed-off-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NOleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: NBagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

18ff0bcd

30 9月, 2022 6 次提交

io_uring: introduce fixed buffer support for io_uring_cmd · 9cda70f6

由 Anuj Gupta 提交于 9月 30, 2022

Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring
command with previously registered buffers. User-space passes the buffer
index in sqe->buf_index, same as done in read/write variants that uses
fixed buffers.
Signed-off-by: NAnuj Gupta <anuj20.g@samsung.com>
Signed-off-by: NKanchan Joshi <joshi.k@samsung.com>
Link: https://lore.kernel.org/r/20220930062749.152261-3-anuj20.g@samsung.com
[axboe: shuffle valid flags check before acting on it]
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9cda70f6

counter: Introduce the Count capture component · 45d29185

由 William Breathitt Gray 提交于 9月 27, 2022

Some devices provide a latch function to save historic Count values.
This patch standardizes exposure of such functionality as Count capture
components. A COUNTER_COMP_CAPTURE macro is provided for driver authors
to define a capture component. A new event COUNTER_EVENT_CAPTURE is
introduced to represent Count value capture events.

Cc: Julien Panis <jpanis@baylibre.com>
Link: https://lore.kernel.org/r/c239572ab4208d0d6728136e82a88ad464369a7a.1664204990.git.william.gray@linaro.org/Signed-off-by: NWilliam Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/3cebaa0b807a225eb277d771504fe6dba7269ffd.1664318353.git.william.gray@linaro.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

45d29185

counter: Introduce the Signal polarity component · 650ae67b

由 William Breathitt Gray 提交于 9月 27, 2022

The Signal polarity component represents the active level of a
respective Signal. There are two possible states: positive (rising edge)
and negative (falling edge); enum counter_signal_polarity represents
these states. A convenience macro COUNTER_COMP_POLARITY() is provided
for driver authors to declare a Signal polarity component.

Cc: Julien Panis <jpanis@baylibre.com>
Link: https://lore.kernel.org/r/8f47d6e1db71a11bb1e2666f8e2a6e9d256d4131.1664204990.git.william.gray@linaro.org/Signed-off-by: NWilliam Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/b6e53438badcb6318997d13dd2fc052f97d808ac.1664318353.git.william.gray@linaro.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

650ae67b

net/sched: taprio: allow user input of per-tc max SDU · a54fc09e

由 Vladimir Oltean 提交于 9月 28, 2022

IEEE 802.1Q clause 12.29.1.1 "The queueMaxSDUTable structure and data
types" and 8.6.8.4 "Enhancements for scheduled traffic" talk about the
existence of a per traffic class limitation of maximum frame sizes, with
a fallback on the port-based MTU.

As far as I am able to understand, the 802.1Q Service Data Unit (SDU)
represents the MAC Service Data Unit (MSDU, i.e. L2 payload), excluding
any number of prepended VLAN headers which may be otherwise present in
the MSDU. Therefore, the queueMaxSDU is directly comparable to the
device MTU (1500 means L2 payload sizes are accepted, or frame sizes of
1518 octets, or 1522 plus one VLAN header). Drivers which offload this
are directly responsible of translating into other units of measurement.

To keep the fast path checks optimized, we keep 2 arrays in the qdisc,
one for max_sdu translated into frame length (so that it's comparable to
skb->len), and another for offloading and for dumping back to the user.
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

a54fc09e

docs: netlink: clarify the historical baggage of Netlink flags · 5493a2ad

由 Jakub Kicinski 提交于 9月 27, 2022

nlmsg_flags are full of historical baggage, inconsistencies and
strangeness. Try to document it more thoroughly. Explain the meaning
of the ECHO flag (and while at it clarify the comment in the uAPI).
Handwave a little about the NEW request flags and how they make
sense on the surface but cater to really old paradigm before commands
were a thing.

I will add more notes on how to make use of ECHO and discouragement
for reuse of flags to the kernel-side documentation.

Link: https://lore.kernel.org/r/20220927212306.823862-1-kuba@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

5493a2ad

landlock: Fix documentation style · 2fff00c8

由 Mickaël Salaün 提交于 9月 23, 2022

It seems that all code should use double backquotes, which is also used
to convert "%" defines.  Let's use an homogeneous style and remove all
use of simple backquotes (which should only be used for emphasis).

Cc: Günther Noack <gnoack3000@gmail.com>
Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: NMickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220923154207.3311629-4-mic@digikod.net

2fff00c8

29 9月, 2022 7 次提交

perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file · cfef80ba

由 Ravi Bangoria 提交于 9月 28, 2022

PERF_MEM_SNOOPX_PEER is defined only in tools uapi header. Although
it's used only by perf tool, not defining it in kernel header can
create problems in future.
Signed-off-by: NRavi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220928095805.596-8-ravi.bangoria@amd.com

cfef80ba

perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO} · ee3e88df

由 Ravi Bangoria 提交于 9月 28, 2022

PERF_MEM_LVLNUM_EXTN_MEM which can be used to indicate accesses to
extension memory like CXL etc. PERF_MEM_LVL_IO can be used for IO
accesses but it can not distinguish between local and remote IO.
Introduce new field PERF_MEM_LVLNUM_IO which can be clubbed with
PERF_MEM_REMOTE_REMOTE to indicate Remote IO accesses.
Signed-off-by: NRavi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220928095805.596-2-ravi.bangoria@amd.com

ee3e88df

KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option · 17601bfe

由 Marc Zyngier 提交于 9月 26, 2022

In order to differenciate between architectures that require no extra
synchronisation when accessing the dirty ring and those who do,
add a new capability (KVM_CAP_DIRTY_LOG_RING_ACQ_REL) that identify
the latter sort. TSO architectures can obviously advertise both, while
relaxed architectures must only advertise the ACQ_REL version.

This requires some configuration symbol rejigging, with HAVE_KVM_DIRTY_RING
being only indirectly selected by two top-level config symbols:
- HAVE_KVM_DIRTY_RING_TSO for strongly ordered architectures (x86)
- HAVE_KVM_DIRTY_RING_ACQ_REL for weakly ordered architectures (arm64)
Suggested-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NMarc Zyngier <maz@kernel.org>
Reviewed-by: NGavin Shan <gshan@redhat.com>
Reviewed-by: NPeter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20220926145120.27974-3-maz@kernel.org

17601bfe

Input: add ABS_PROFILE to uapi and documentation · 1260cd04

由 Nate Yocom 提交于 9月 28, 2022

Define new ABS_PROFILE axis for input devices which need it, e.g. X-Box
Adaptive Controller and X-Box Elite 2.
Signed-off-by: NNate Yocom <nate@yocom.org>
Link: https://lore.kernel.org/r/20220908173930.28940-4-nate@yocom.orgSigned-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>

1260cd04

bpf: Handle bpf_link_info for the parameterized task BPF iterators. · 21fb6f2a

由 Kui-Feng Lee 提交于 9月 26, 2022

Add new fields to bpf_link_info that users can query it through
bpf_obj_get_info_by_fd().
Signed-off-by: NKui-Feng Lee <kuifeng@fb.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NYonghong Song <yhs@fb.com>
Acked-by: NMartin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20220926184957.208194-3-kuifeng@fb.com

21fb6f2a

bpf: Parameterize task iterators. · f0d74c4d

由 Kui-Feng Lee 提交于 9月 26, 2022

Allow creating an iterator that loops through resources of one
thread/process.

People could only create iterators to loop through all resources of
files, vma, and tasks in the system, even though they were interested
in only the resources of a specific task or process.  Passing the
additional parameters, people can now create an iterator to go
through all resources or only the resources of a task.
Signed-off-by: NKui-Feng Lee <kuifeng@fb.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NYonghong Song <yhs@fb.com>
Acked-by: NMartin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20220926184957.208194-2-kuifeng@fb.com

f0d74c4d

firmware/psci: Add debugfs support to ease debugging · 3137f2e6

由 Dmitry Baryshkov 提交于 9月 26, 2022

To ease debugging of PSCI supported features, add debugfs file called
'psci' describing PSCI and SMC CC versions, enabled features and
options.
Signed-off-by: NDmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: NMark Brown <broonie@kernel.org>
Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20220926110758.666922-1-dmitry.baryshkov@linaro.org'
Signed-off-by: NArnd Bergmann <arnd@arndb.de>

3137f2e6

28 9月, 2022 1 次提交

net: tls: Add ARIA-GCM algorithm · 62e56ef5

由 Taehee Yoo 提交于 9月 25, 2022

RFC 6209 describes ARIA for TLS 1.2.
ARIA-128-GCM and ARIA-256-GCM are defined in RFC 6209.

This patch would offer performance increment and an opportunity for
hardware offload.

Benchmark results:
iperf-ssl are used.
CPU: intel i3-12100.

  TLS(openssl-3.0-dev)
[  3]  0.0- 1.0 sec   185 MBytes  1.55 Gbits/sec
[  3]  1.0- 2.0 sec   186 MBytes  1.56 Gbits/sec
[  3]  2.0- 3.0 sec   186 MBytes  1.56 Gbits/sec
[  3]  3.0- 4.0 sec   186 MBytes  1.56 Gbits/sec
[  3]  4.0- 5.0 sec   186 MBytes  1.56 Gbits/sec
[  3]  0.0- 5.0 sec   927 MBytes  1.56 Gbits/sec
  kTLS(aria-generic)
[  3]  0.0- 1.0 sec   198 MBytes  1.66 Gbits/sec
[  3]  1.0- 2.0 sec   194 MBytes  1.62 Gbits/sec
[  3]  2.0- 3.0 sec   194 MBytes  1.63 Gbits/sec
[  3]  3.0- 4.0 sec   194 MBytes  1.63 Gbits/sec
[  3]  4.0- 5.0 sec   194 MBytes  1.62 Gbits/sec
[  3]  0.0- 5.0 sec   974 MBytes  1.63 Gbits/sec
  kTLS(aria-avx wirh GFNI)
[  3]  0.0- 1.0 sec   632 MBytes  5.30 Gbits/sec
[  3]  1.0- 2.0 sec   657 MBytes  5.51 Gbits/sec
[  3]  2.0- 3.0 sec   657 MBytes  5.51 Gbits/sec
[  3]  3.0- 4.0 sec   656 MBytes  5.50 Gbits/sec
[  3]  4.0- 5.0 sec   656 MBytes  5.50 Gbits/sec
[  3]  0.0- 5.0 sec  3.18 GBytes  5.47 Gbits/sec
Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
Reviewed-by: NVadim Fedorenko <vfedorenko@novek.ru>
Link: https://lore.kernel.org/r/20220925150033.24615-1-ap420073@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

62e56ef5

27 9月, 2022 3 次提交

headers: Remove some left-over license text · 73dfe93e

由 Christophe JAILLET 提交于 9月 22, 2022

Remove some left-over from commit e2be04c7 ("License cleanup: add SPDX
license identifier to uapi header files with a license")

When the SPDX-License-Identifier tag has been added, the corresponding
license text has not been removed.
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: NAlexander Duyck <alexanderduyck@fb.com>
Reviewed-by: NJiri Pirko <jiri@nvidia.com>
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/88410cddd31197ea26840d7dd71612bece8c6acf.1663871981.git.christophe.jaillet@wanadoo.frSigned-off-by: NJakub Kicinski <kuba@kernel.org>

73dfe93e

bpf: Return value in kprobe get_func_ip only for entry address · 0e253f7e

由 Jiri Olsa 提交于 9月 26, 2022

Changing return value of kprobe's version of bpf_get_func_ip
to return zero if the attach address is not on the function's
entry point.

For kprobes attached in the middle of the function we can't easily
get to the function address especially now with the CONFIG_X86_KERNEL_IBT
support.

If user cares about current IP for kprobes attached within the
function body, they can get it with PT_REGS_IP(ctx).
Suggested-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NMartynas Pumputis <m@lambda.lt>
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220926153340.1621984-6-jolsa@kernel.orgSigned-off-by: NAlexei Starovoitov <ast@kernel.org>

0e253f7e

a.out: restore CMAGIC · cdc7daa9

由 наб 提交于 9月 26, 2022

Part of UAPI and the on-disk format:
this means that it's not a magic number per magic-number.rst,
and it's best to leave it untouched to avoid breaking userspace
and suffer the same fate as a.out in general

Fixes: 53c2bd67 ("a.out: remove define-only CMAGIC, previously magic number")
Signed-off-by: NAhelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Link: https://lore.kernel.org/r/20220926151554.7gxd6unp5727vw3c@tarta.nabijaczleweli.xyzSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

cdc7daa9

26 9月, 2022 2 次提交

btrfs: introduce BTRFS_QGROUP_STATUS_FLAGS_MASK for later expansion · e71564c0

由 Qu Wenruo 提交于 8月 24, 2022

Currently we only have 3 qgroup flags:

- BTRFS_QGROUP_STATUS_FLAG_ON
- BTRFS_QGROUP_STATUS_FLAG_RESCAN
- BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT

These flags match the on-disk flags used in btrfs_qgroup_status.

But we're going to introduce extra runtime flags which will not reach
disks.

So here we introduce a new mask, BTRFS_QGROUP_STATUS_FLAGS_MASK, to
make sure only those flags can reach disks.
Signed-off-by: NQu Wenruo <wqu@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

e71564c0

btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2 · 1c56ab99

由 Qu Wenruo 提交于 8月 09, 2022

The problem of long mount time caused by block group item search is
already known for some time, and the solution of block group tree has
been proposed.

There is really no need to bound this feature into extent tree v2, just
introduce compat RO flag, BLOCK_GROUP_TREE, to correctly solve the
problem.

All the code handling block group root is already in the upstream
kernel, thus this patch really only needs to introduce the new compat RO
flag.

This patch introduces one extra artificial limitation on block group
tree feature, that free space cache v2 and no-holes feature must be
enabled to use this new compat RO feature.

This artificial requirement is mostly to reduce the test combinations,
and can be a guideline for future features, to mostly rely on the latest
default features.
Signed-off-by: NQu Wenruo <wqu@suse.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

1c56ab99

24 9月, 2022 7 次提交

a.out: remove define-only CMAGIC, previously magic number · 53c2bd67

由 наб 提交于 9月 16, 2022

The last user was removed in 5.1 in
commit 08300f44 ("a.out: remove core dumping support")
but this is part of the UAPI headers, so this may want to either wait
until a.out is removed entirely, or be removed from the magic number doc
and silently remain in the header

A cursory glance on DCS didn't show any user code actually using this
value

Found with
grep MAGIC Documentation/process/magic-number.rst | while read -r mag _;
do git grep -wF "$mag"  | grep -ve '^Documentation.*magic-number.rst:' \
-qe ':#define '"$mag" || git grep -wF "$mag" | while IFS=: read -r f _;
do sed -i '/\b'"$mag"'\b/d' "$f"; done ; done
Signed-off-by: NAhelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Link: https://lore.kernel.org/r/9cbea062df7125ef43e2e0b2a67ede6ad1c5f27e.1663280877.git.nabijaczleweli@nabijaczleweli.xyzSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

53c2bd67

media: cec: add support for Absolute Volume Control · 479747ca

由 Hans Verkuil 提交于 8月 30, 2022

Add support for this new CEC message. This was added in HDMI 2.1a.
Signed-off-by: NHans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: NMauro Carvalho Chehab <mchehab@kernel.org>

479747ca

media: rockchip: rkisp1: Define macros for DPCC configurations in UAPI · 8e2b7442

由 Laurent Pinchart 提交于 6月 09, 2022

Extend the UAPI rkisp1-config.h header with macros for all DPCC
configuration fields. While at it, clarify of fix issues in the DPCC
documentation.
Signed-off-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: NPaul Elder <paul.elder@ideasonboard.com>
Reviewed-by: NDafna Hirschfeld <dafna@fastmail.com>
Signed-off-by: NMauro Carvalho Chehab <mchehab@kernel.org>

8e2b7442

fuse: implement ->tmpfile() · 7d375390

由 Miklos Szeredi 提交于 9月 24, 2022

This is basically equivalent to the FUSE_CREATE operation which creates and
opens a regular file.

Add a new FUSE_TMPFILE operation, otherwise just reuse the protocol and the
code for FUSE_CREATE.
Acked-by: NChristian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

7d375390

ublk_drv: add START_USER_RECOVERY and END_USER_RECOVERY support · c732a852

由 ZiyangZhang 提交于 9月 23, 2022

START_USER_RECOVERY and END_USER_RECOVERY are two new control commands
to support user recovery feature.

After a crash, user should send START_USER_RECOVERY, it will:
(1) check if (a)current ublk_device is UBLK_S_DEV_QUIESCED which was
    set by quiesce_work and (b)chardev is released
(2) reinit all ubqs, including:
    (a) put the task_struct and reset ->ubq_daemon to NULL.
    (b) reset all ublk_io.
(3) reset ub->mm to NULL.

Then, user should start a new process and send FETCH_REQ on each
ubq_daemon.

Finally, user should send END_USER_RECOVERY, it will:
(1) wait for all new ubq_daemons getting ready.
(2) update ublksrv_pid
(3) unquiesce the request queue and expect incoming ublk_queue_rq()
(4) convert ub's state to UBLK_S_DEV_LIVE

Note: we can handle STOP_DEV between START_USER_RECOVERY and
END_USER_RECOVERY. This is helpful to users who cannot start new process
after sending START_USER_RECOVERY ctrl-cmd.
Signed-off-by: NZiyangZhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220923153919.44078-7-ZiyangZhang@linux.alibaba.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

c732a852

ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE · a0d41dc1

由 ZiyangZhang 提交于 9月 23, 2022

UBLK_F_USER_RECOVERY_REISSUE implies that:
With a dying ubq_daemon, ublk_drv let monitor_work requeues rq issued to
userspace(ublksrv) before the ubq_daemon is dying.

UBLK_F_USER_RECOVERY_REISSUE is designed for backends which:
(1) tolerate double-write since ublk_drv may issue the same rq
    twice.
(2) does not let frontend users get I/O error, such as read-only FS
    and VM backend.
Signed-off-by: NZiyangZhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220923153919.44078-6-ZiyangZhang@linux.alibaba.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

a0d41dc1

ublk_drv: define macros for recovery feature and check them · 77a440e2

由 ZiyangZhang 提交于 9月 23, 2022

Define some macros for recovery feature.

UBLK_S_DEV_QUIESCED implies that ublk_device is quiesced
and is ready for recovery. This state can be observed by userspace.

UBLK_F_USER_RECOVERY implies that:
(1) ublk_drv enables recovery feature. It won't let monitor_work to
    automatically abort rqs and release the device.
(2) With a dying ubq_daemon, ublk_drv ends(aborts) rqs issued to
    userspace(ublksrv) before crash.
(3) With a dying ubq_daemon, in task work and ublk_queue_rq(),
    ublk_drv requeues rqs.
Signed-off-by: NZiyangZhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220923153919.44078-3-ZiyangZhang@linux.alibaba.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

77a440e2

23 9月, 2022 2 次提交

tun: support not enabling carrier in TUNSETIFF · 195624d9

由 Patrick Rohr 提交于 9月 20, 2022

This change adds support for not enabling carrier during TUNSETIFF
interface creation by specifying the IFF_NO_CARRIER flag.

Our tests make heavy use of tun interfaces. In some scenarios, the test
process creates the interface but another process brings it up after the
interface is discovered via netlink notification. In that case, it is
not possible to create a tun/tap interface with carrier off without it
racing against the bring up. Immediately setting carrier off via
TUNSETCARRIER is still too late.
Signed-off-by: NPatrick Rohr <prohr@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: NMaciej Żenczykowski <maze@google.com>
Acked-by: NJason Wang <jasowang@redhat.com>
Reviewed-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

195624d9

net: phy: Add support for rate matching · 0c3e10cb

由 Sean Anderson 提交于 9月 20, 2022

This adds support for rate matching (also known as rate adaptation) to
the phy subsystem. The general idea is that the phy interface runs at
one speed, and the MAC throttles the rate at which it sends packets to
the link speed. There's a good overview of several techniques for
achieving this at [1]. This patch adds support for three: pause-frame
based (such as in Aquantia phys), CRS-based (such as in 10PASS-TS and
2BASE-TL), and open-loop-based (such as in 10GBASE-W).

This patch makes a few assumptions and a few non assumptions about the
types of rate matching available. First, it assumes that different phys
may use different forms of rate matching. Second, it assumes that phys
can use rate matching for any of their supported link speeds (e.g. if a
phy supports 10BASE-T and XGMII, then it can adapt XGMII to 10BASE-T).
Third, it does not assume that all interface modes will use the same
form of rate matching. Fourth, it does not assume that all phy devices
will support rate matching (even if some do). Relaxing or strengthening
these (non-)assumptions could result in a different API. For example, if
all interface modes were assumed to use the same form of rate matching,
then a bitmask of interface modes supportting rate matching would
suffice.

For some better visibility into the process, the current rate matching
mode is exposed as part of the ethtool ksettings. For the moment, only
read access is supported. I'm not sure what userspace might want to
configure yet (disable it altogether, disable just one mode, specify the
mode to use, etc.). For the moment, since only pause-based rate
adaptation support is added in the next few commits, rate matching can
be disabled altogether by adjusting the advertisement.

802.3 calls this feature "rate adaptation" in clause 49 (10GBASE-R) and
"rate matching" in clause 61 (10PASS-TL and 2BASE-TS). Aquantia also calls
this feature "rate adaptation". I chose "rate matching" because it is
shorter, and because Russell doesn't think "adaptation" is correct in this
context.
Signed-off-by: NSean Anderson <sean.anderson@seco.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0c3e10cb

22 9月, 2022 2 次提交

bpf: Add bpf_user_ringbuf_drain() helper · 20571567

由 David Vernet 提交于 9月 19, 2022

In a prior change, we added a new BPF_MAP_TYPE_USER_RINGBUF map type which
will allow user-space applications to publish messages to a ring buffer
that is consumed by a BPF program in kernel-space. In order for this
map-type to be useful, it will require a BPF helper function that BPF
programs can invoke to drain samples from the ring buffer, and invoke
callbacks on those samples. This change adds that capability via a new BPF
helper function:

bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void *ctx,
                       u64 flags)

BPF programs may invoke this function to run callback_fn() on a series of
samples in the ring buffer. callback_fn() has the following signature:

long callback_fn(struct bpf_dynptr *dynptr, void *context);

Samples are provided to the callback in the form of struct bpf_dynptr *'s,
which the program can read using BPF helper functions for querying
struct bpf_dynptr's.

In order to support bpf_ringbuf_drain(), a new PTR_TO_DYNPTR register
type is added to the verifier to reflect a dynptr that was allocated by
a helper function and passed to a BPF program. Unlike PTR_TO_STACK
dynptrs which are allocated on the stack by a BPF program, PTR_TO_DYNPTR
dynptrs need not use reference tracking, as the BPF helper is trusted to
properly free the dynptr before returning. The verifier currently only
supports PTR_TO_DYNPTR registers that are also DYNPTR_TYPE_LOCAL.

Note that while the corresponding user-space libbpf logic will be added
in a subsequent patch, this patch does contain an implementation of the
.map_poll() callback for BPF_MAP_TYPE_USER_RINGBUF maps. This
.map_poll() callback guarantees that an epoll-waiting user-space
producer will receive at least one event notification whenever at least
one sample is drained in an invocation of bpf_user_ringbuf_drain(),
provided that the function is not invoked with the BPF_RB_NO_WAKEUP
flag. If the BPF_RB_FORCE_WAKEUP flag is provided, a wakeup
notification is sent even if no sample was drained.
Signed-off-by: NDavid Vernet <void@manifault.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-3-void@manifault.com

20571567

bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type · 583c1f42

由 David Vernet 提交于 9月 19, 2022

We want to support a ringbuf map type where samples are published from
user-space, to be consumed by BPF programs. BPF currently supports a
kernel -> user-space circular ring buffer via the BPF_MAP_TYPE_RINGBUF
map type.  We'll need to define a new map type for user-space -> kernel,
as none of the helpers exported for BPF_MAP_TYPE_RINGBUF will apply
to a user-space producer ring buffer, and we'll want to add one or
more helper functions that would not apply for a kernel-producer
ring buffer.

This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type
definition. The map type is useless in its current form, as there is no
way to access or use it for anything until we one or more BPF helpers. A
follow-on patch will therefore add a new helper function that allows BPF
programs to run callbacks on samples that are published to the ring
buffer.
Signed-off-by: NDavid Vernet <void@manifault.com>
Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
Acked-by: NAndrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-2-void@manifault.com

583c1f42

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功