Commits · dc48bae01e5a23ae67758e8fe31cdc439202b190 · openeuler / Kernel

05 Dec, 2017 7 commits

KVM: Define SEV key management command id · dc48bae0

Committed by Brijesh Singh on Dec 04, 2017

Define Secure Encrypted Virtualization (SEV) key management command id
and structure. The command definition is available in SEV KM spec
0.14 (http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf)
and Documentation/virtual/kvm/amd-memory-encryption.txt.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>

crypto: ccp: Implement SEV_PEK_CERT_IMPORT ioctl command · 7360e4b1

Committed by Brijesh Singh on Dec 04, 2017

The SEV_PEK_CERT_IMPORT command can be used to import the signed PEK
certificate. The command is defined in SEV spec section 5.8.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Gary Hook <gary.hook@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-crypto@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>

crypto: ccp: Add Secure Encrypted Virtualization (SEV) command support · 200664d5

Committed by Brijesh Singh on Dec 04, 2017

AMD's new Secure Encrypted Virtualization (SEV) feature allows the
memory contents of virtual machines to be transparently encrypted with a
key unique to the VM. The programming and management of the encryption
keys are handled by the AMD Secure Processor (AMD-SP) which exposes the
commands for these tasks. The complete spec is available at:

http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf

Extend the AMD-SP driver to provide the following support:

 - an in-kernel API to communicate with the SEV firmware. The API can be
   used by the hypervisor to create encryption context for a SEV guest.

 - a userspace IOCTL to manage the platform certificates.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Gary Hook <gary.hook@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-crypto@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

crypto: ccp: Define SEV key management command id · 592d5e74

Committed by Brijesh Singh on Dec 04, 2017

Define Secure Encrypted Virtualization (SEV) key management command id
and structure. The command definition is available in SEV KM spec
0.14 (http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf).

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Gary Hook <gary.hook@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-crypto@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Gary R Hook <gary.hook@amd.com>

crypto: ccp: Define SEV userspace ioctl and command id · 1d57b17c

Committed by Brijesh Singh on Dec 04, 2017

Add an include file that defines the ioctl and command ids used for
issuing SEV platform-management commands.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Gary Hook <gary.hook@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-crypto@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Gary R Hook <gary.hook@amd.com>

KVM: Introduce KVM_MEMORY_ENCRYPT_{UN,}REG_REGION ioctl · 69eaedee

Committed by Brijesh Singh on Dec 04, 2017

If the hardware supports memory encryption, userspace can use the
KVM_MEMORY_ENCRYPT_REG_REGION and KVM_MEMORY_ENCRYPT_UNREG_REGION ioctls to
register and unregister the guest memory regions that may contain encrypted
data (e.g. guest RAM, PCI BAR, SMRAM).

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>

KVM: Introduce KVM_MEMORY_ENCRYPT_OP ioctl · 5acc5c06

Committed by Brijesh Singh on Dec 04, 2017

If the hardware supports memory encryption, the
KVM_MEMORY_ENCRYPT_OP ioctl can be used by QEMU to issue
platform-specific memory encryption commands.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>

24 Nov, 2017 3 commits

sched/debug: Fix task state recording/printout · 3f5fe9fe

Committed by Thomas Gleixner on Nov 22, 2017

The recent conversion of the task state recording to use task_state_index()
broke the sched_switch tracepoint task state output.

task_state_index() returns, surprisingly, an index (0-7), which is then
printed with __print_flags(), which applies bitmasks. This does not work and
results in weird states like 'prev_state=t' instead of 'prev_state=I'.

Use TASK_REPORT_MAX instead of TASK_STATE_MAX to report preemption. Build a
bitmask from the return value of task_state_index() and store it in
entry->prev_state, which makes __print_flags() work as expected.
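The index-to-bitmask step described above can be sketched in user space as follows. This is only an illustration of the arithmetic, not the kernel code: the helper name `state_index_to_mask()` is hypothetical, and it assumes that index 0 means "running" and index n > 0 corresponds to flag bit n-1.

```c
#include <assert.h>

/* Hypothetical helper: convert a task state index (0 = running,
 * n > 0 = n-th state flag) into the one-hot bitmask that a
 * flag-based printer expects. */
static unsigned int state_index_to_mask(unsigned int index)
{
    assert(index <= 32);  /* keep the shift below well defined */
    return index ? (1u << (index - 1)) : 0u;
}
```

With this mapping, index 3 becomes mask 0x4, so a printer that tests individual bits sees exactly one flag set instead of misreading the raw index as a combination of flags.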

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: stable@vger.kernel.org
Fixes: efb40f58 ("sched/tracing: Fix trace_sched_switch task-state printing")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1711221304180.1751@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/PCI: Remove unused HyperTransport interrupt support · fd2fa6c1

Committed by Bjorn Helgaas on Nov 22, 2017

There are no in-tree callers of ht_create_irq(), the driver interface for
HyperTransport interrupts, left.  Remove the unused entry point and all the
supporting code.

See 8b955b0d ("[PATCH] Initial generic hypertransport interrupt
support").

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-pci@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: https://lkml.kernel.org/r/20171122221337.3877.23362.stgit@bhelgaas-glaptop.roam.corp.google.com

net: accept UFO datagrams from tuntap and packet · 0c19f846

Committed by Willem de Bruijn on Nov 21, 2017

Tuntap and similar devices can inject GSO packets. Accept type
VIRTIO_NET_HDR_GSO_UDP, even though not generating UFO natively.

Processes are expected to use feature negotiation such as TUNSETOFFLOAD
to detect supported offload types and refrain from injecting other
packets. This process breaks down with live migration: guest kernels
do not renegotiate flags, so destination hosts need to expose all
features that the source host does.

Partially revert the UFO removal from 182e0b6b~1..d9d30adf.
This patch introduces nearly(*) no new code to simplify verification.
It brings back verbatim tuntap UFO negotiation, VIRTIO_NET_HDR_GSO_UDP
insertion and software UFO segmentation.

It does not reinstate protocol stack support, hardware offload
(NETIF_F_UFO), SKB_GSO_UDP tunneling in SKB_GSO_SOFTWARE or reception
of VIRTIO_NET_HDR_GSO_UDP packets in tuntap.

To support SKB_GSO_UDP reappearing in the stack, also reinstate
logic in act_csum and openvswitch. Achieve equivalence with v4.13 HEAD
by squashing in commit 93991221 ("net: skb_needs_check() removes
CHECKSUM_UNNECESSARY check for tx.") and reverting commit 8d63bee6
("net: avoid skb_warn_bad_offload false positives on UFO").

(*) To avoid having to bring back skb_shinfo(skb)->ip6_frag_id,
ipv6_proxy_select_ident is changed to return a __be32 and this is
assigned directly to the frag_hdr. Also, SKB_GSO_UDP is inserted
at the end of the enum to minimize code churn.

Tested
  Booted a v4.13 guest kernel with QEMU. On a host kernel before this
  patch `ethtool -k eth0` shows UFO disabled. After the patch, it is
  enabled, same as on a v4.13 host kernel.

  A UFO packet sent from the guest appears on the tap device:
    host:
      nc -l -u -p 8000 &
      tcpdump -n -i tap0

    guest:
      dd if=/dev/zero of=payload.txt bs=1 count=2000
      nc -u 192.16.1.1 8000 < payload.txt

  Direct tap to tap transmission of VIRTIO_NET_HDR_GSO_UDP succeeds,
  packets arriving fragmented:

    ./with_tap_pair.sh ./tap_send_ufo tap0 tap1
    (from https://github.com/wdebruij/kerneltools/tree/master/tests)

Changes
  v1 -> v2
    - simplified set_offload change (review comment)
    - documented test procedure

Link: http://lkml.kernel.org/r/<CAF=yD-LuUeDuL9YWPJD9ykOZ0QCjNeznPDr6whqZ9NGMNF12Mw@mail.gmail.com>
Fixes: fb652fdf ("macvlan/macvtap: Remove NETIF_F_UFO advertisement.")
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Nov, 2017 4 commits

bpf: fix branch pruning logic · c131187d

Committed by Alexei Starovoitov on Nov 22, 2017

When the verifier detects that a register contains a runtime constant
and it's compared with another constant, it will prune exploration
of the branch that is guaranteed not to be taken at runtime.
This is all correct, but a malicious program may be constructed
in such a way that it always has a constant comparison and
the other branch is never taken under any conditions.
In this case, such a path through the program will not be explored
by the verifier. It won't be taken at run-time either, but since
all instructions are JITed, the malicious program may cause JITs
to complain about using reserved fields, etc.
To fix the issue we have to track the instructions explored by
the verifier and sanitize instructions that are dead at run time
with NOPs. We cannot reject such dead code, since llvm generates
it for valid C code, since it doesn't do as much data flow
analysis as the verifier does.
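The sanitizing step described above can be sketched as follows. This is a toy model under stated assumptions, not the verifier's code: `struct toy_insn`, `TOY_NOP` and `sanitize_dead_code()` are hypothetical names, and the `seen` array stands in for whatever per-instruction bookkeeping the verifier keeps while exploring the program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction holding only an opcode; not the real struct bpf_insn. */
struct toy_insn {
    unsigned char opcode;
};

#define TOY_NOP 0x00  /* hypothetical encoding of a harmless no-op */

/* Overwrite every instruction that was never explored with a NOP.
 * Returns the number of instructions that were rewritten. */
static size_t sanitize_dead_code(struct toy_insn *insns, const bool *seen,
                                 size_t len)
{
    size_t patched = 0;

    assert(insns != NULL && seen != NULL);
    for (size_t i = 0; i < len; i++) {
        if (!seen[i]) {
            insns[i].opcode = TOY_NOP;
            patched++;
        }
    }
    return patched;
}
```

Rewriting rather than rejecting keeps programs with compiler-generated dead code loadable, while ensuring a JIT never sees instructions the verifier did not check.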

Fixes: 17a52670 ("bpf: verifier (add verifier core)")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

dt-bindings: remove file that was added accidentally · 98ecf1a3

Committed by Rob Clark on Nov 16, 2017

I think this snuck in when I applied the patch for f97decac (didn't
apply cleanly, required some manual applying + git-add).  It is unused
and shouldn't be here.  My bad.

Fixes: f97decac "drm/msm: Support multiple ringbuffers"
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm: add connector info/property for non-desktop displays [v2] · 66660d4c

Committed by Dave Airlie on Oct 16, 2017

This adds the infrastructure needed to quirk displays
using EDID and to mark them as non-desktop.

A non-desktop display is one which shouldn't normally be included
as a part of a desktop environment.

This is meant to cover head mounted devices like HTC Vive.

v2: Change description from non-standard to non-desktop, add docs

Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

fixup docs

bpf: introduce ARG_PTR_TO_MEM_OR_NULL · db1ac496

Committed by Gianluca Borello on Nov 22, 2017

With the current ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM semantics, a helper
argument can be NULL when the next argument type is ARG_CONST_SIZE_OR_ZERO
and the verifier can prove the value of this next argument is 0. However,
most helpers are just interested in handling <!NULL, 0>, so forcing them to
deal with <NULL, 0> makes the implementation of those helpers more
complicated for no apparent benefit, requiring them to explicitly handle
those corner cases with checks that bpf programs could start relying upon,
preventing the possibility of removing them later.

Solve this by making ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM never accept NULL
even when ARG_CONST_SIZE_OR_ZERO is set, and introduce a new argument type
ARG_PTR_TO_MEM_OR_NULL to explicitly deal with the NULL case.
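The resulting acceptance rule can be modelled roughly as below. This is a simplified sketch, not the verifier's logic: the `toy_` names are hypothetical, and the two booleans stand in for facts the verifier would derive from register state.

```c
#include <stdbool.h>

/* Toy argument types mirroring the names used in the commit message. */
enum toy_arg_type {
    TOY_ARG_PTR_TO_MEM,
    TOY_ARG_PTR_TO_MEM_OR_NULL,
};

/* Decide whether a (pointer, size) helper argument pair is acceptable.
 * size_known_zero models "the verifier can prove the size is 0". */
static bool toy_check_mem_arg(enum toy_arg_type type, bool ptr_is_null,
                              bool size_known_zero)
{
    if (!ptr_is_null)
        return true;          /* a non-NULL pointer suits both types */
    if (type != TOY_ARG_PTR_TO_MEM_OR_NULL)
        return false;         /* plain PTR_TO_MEM never accepts NULL */
    return size_known_zero;   /* <NULL, 0> only with the _OR_NULL type */
}
```

Under this model, a helper that declares the plain type never has to handle a NULL pointer, and only helpers that opt in to the `_OR_NULL` type see the <NULL, 0> pair.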

Currently, the only helper that needs this is bpf_csum_diff_proto(), so
change arg1 and arg3 to this new type as well.

Also add a new battery of tests that explicitly test the
!ARG_PTR_TO_MEM_OR_NULL combination: all the current ones testing the
various <NULL, 0> variations are focused on bpf_csum_diff, so cover also
other helpers.

Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

22 Nov, 2017 10 commits

ALSA: hda - Fix yet remaining issue with vmaster 0dB initialization · d6c0615f

Committed by Takashi Iwai on Nov 22, 2017

The previous fix for addressing the breakage in vmaster slave
initialization, commit a91d6612 ("ALSA: hda - Fix incorrect TLV
callback check introduced during set_fs() removal"), introduced a new
helper to process over each slave kctl. However, this helper passes
only the original kctl, not the virtual slave kctl. As a result,
HD-audio driver (which is the only user so far) couldn't initialize
the slave correctly because it's trying to update the value directly
with the original kctl, not with the mapped kctl.

This patch fixes the situation again by passing both the mapped slave
and original slave kctls to the function. Luckily there is a single
caller as of now, so changing the call signature is no big matter.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=197959
Fixes: a91d6612 ("ALSA: hda - Fix incorrect TLV callback check introduced during set_fs() removal")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts · 841b86f3

Committed by Kees Cook on Oct 23, 2017

With all callbacks converted, and the timer callback prototype
switched over, the TIMER_FUNC_TYPE cast is no longer needed,
so remove it. Conversion was done with the following scripts:

    perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
        $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)

    perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
        $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)

The now unused macros are also dropped from include/linux/timer.h.

Signed-off-by: Kees Cook <keescook@chromium.org>

timer: Remove redundant __setup_timer*() macros · 919b250f

Committed by Kees Cook on Oct 22, 2017

With __init_timer*() now matching __setup_timer*(), remove the redundant
internal interface, clean up the resulting definitions and add more
documentation.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>

timer: Pass function down to initialization routines · 188665b2

Committed by Kees Cook on Oct 22, 2017

In preparation for removing more macros, pass the function down to the
initialization routines instead of doing it in macros.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Kees Cook <keescook@chromium.org>

timer: Remove unused data arguments from macros · 1fe66ba5

Committed by Kees Cook on Oct 22, 2017

With the .data field removed, the ignored data arguments in timer macros
can be removed.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: Kees Cook <keescook@chromium.org>

timer: Switch callback prototype to take struct timer_list * argument · 354b46b1

Committed by Kees Cook on Oct 22, 2017

Since all callbacks have been converted, we can switch the core
prototype to "struct timer_list *" now too.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Kees Cook <keescook@chromium.org>

timer: Pass timer_list pointer to callbacks unconditionally · c1eba5bc

Committed by Kees Cook on Oct 22, 2017

Now that all timer callbacks are already taking their struct timer_list
pointer as the callback argument, just do this unconditionally and remove
the .data field.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Kees Cook <keescook@chromium.org>

timer: Remove setup_*timer() interface · 513ae785

Committed by Kees Cook on Oct 22, 2017

With all callers converted to timer_setup(), the old setup_*timer()
interface can be removed.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>

timer: Remove init_timer() interface · 7eeb6b89

Committed by Kees Cook on Oct 11, 2017

All users of init_timer() have been updated. Remove the ancient interface.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Kees Cook <keescook@chromium.org>

block/laptop_mode: Convert timers to use timer_setup() · bca237a5

Committed by Kees Cook on Aug 28, 2017

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
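The pattern this conversion relies on can be sketched in user space as follows. It is a stand-in for illustration only: `struct toy_timer`, `toy_timer_setup()` and `toy_from_timer()` are hypothetical analogues of `struct timer_list`, `timer_setup()` and `from_timer()`, and their signatures are not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a timer object that stores only its callback. */
struct toy_timer {
    void (*function)(struct toy_timer *);
};

/* Recover the enclosing object from a pointer to its embedded timer,
 * in the spirit of container_of(). */
#define toy_from_timer(type, timer_ptr, field) \
    ((type *)((char *)(timer_ptr) - offsetof(type, field)))

struct toy_device {
    int wakeups;
    struct toy_timer timer;
};

/* The callback receives the timer itself and derives its context from
 * it, so no separate data argument is needed. */
static void toy_device_timeout(struct toy_timer *t)
{
    struct toy_device *dev = toy_from_timer(struct toy_device, t, timer);

    dev->wakeups++;
}

/* Analogue of a setup routine: it only records the callback. */
static void toy_timer_setup(struct toy_timer *t,
                            void (*fn)(struct toy_timer *))
{
    assert(fn != NULL);
    t->function = fn;
}
```

Because the callback gets a typed pointer instead of an opaque integer, the compiler can check the callback prototype, which is what allows the data field and the casts to be dropped later in the series.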

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: linux-block@vger.kernel.org
Cc: linux-mm@kvack.org
Signed-off-by: Kees Cook <keescook@chromium.org>

21 Nov, 2017 6 commits

sched/deadline: Don't use dubious signed bitfields · aa5222e9

Committed by Dan Carpenter on Oct 13, 2017

It doesn't cause a run-time bug, but these bitfields should be unsigned.
When it's signed ->dl_throttled is set to either 0 or -1, instead of
0 and 1 as expected.
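The effect can be reproduced with a small stand-alone sketch. The struct and function names are hypothetical, and the result of storing 1 into a one-bit signed field is implementation-defined in C; the value -1 noted in the comments is what GCC and Clang produce on common targets.

```c
#include <assert.h>

struct toy_flags {
    signed int   throttled_signed   : 1;  /* can represent only 0 and -1 */
    unsigned int throttled_unsigned : 1;  /* can represent 0 and 1 */
};

/* Store value (0 or 1) in the signed one-bit field and read it back. */
static int read_signed_flag(int value)
{
    struct toy_flags f = { 0, 0 };

    assert(value == 0 || value == 1);
    f.throttled_signed = value;   /* 1 typically reads back as -1 */
    return f.throttled_signed;
}

/* Same round trip through the unsigned one-bit field. */
static int read_unsigned_flag(int value)
{
    struct toy_flags f = { 0, 0 };

    assert(value == 0 || value == 1);
    f.throttled_unsigned = value;
    return f.throttled_unsigned;
}
```

Code that only tests the field for truth keeps working either way, which is why there is no run-time bug, but a comparison such as `flag == 1` would never be true with the signed variant.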

The sched.h file is included into tons of places so Sparse generates
a flood of warnings like this:

  ./include/linux/sched.h:477:54: error: dubious one-bit signed bitfield

Reported-by: Matthew Wilcox <willy@infradead.org>
Reported-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Cc: luca abeni <luca.abeni@santannapisa.it>
Link: http://lkml.kernel.org/r/20171013070121.dzcncojuj2f4utij@mwanda
Signed-off-by: Ingo Molnar <mingo@kernel.org>

bpf: make bpf_prog_offload_verifier_prep() static inline · 14380194

Committed by Jakub Kicinski on Nov 20, 2017

Header implementation of bpf_prog_offload_verifier_prep() which
is used if CONFIG_NET=n should be a static inline.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

bpf: revert report offload info to user space · 1ee64009

Committed by Jakub Kicinski on Nov 20, 2017

This reverts commit bd601b6a ("bpf: report offload info to user
space").  The ifindex by itself is not sufficient, we should provide
information on which network namespace this ifindex belongs to.
After considering some options we concluded that it's best to just
remove this API for now, and rework it in -next.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

bpf: turn bpf_prog_get_type() into a wrapper · 479321e9

Committed by Jakub Kicinski on Nov 20, 2017

bpf_prog_get_type() is identical to bpf_prog_get_type_dev(),
with false passed as attach_drv.  Instead of keeping it as
an exported symbol turn it into static inline wrapper.
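The shape of such a wrapper can be sketched as below. Everything here is a toy: the `toy_` names and `struct toy_prog` are hypothetical, and the rule that an offloaded program is only handed out when `attach_drv` is true is an assumption made for illustration, not something stated in this commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_prog {
    int type;
    bool offloaded;
};

/* Full lookup: checks the program type and, as an assumed rule, refuses
 * offloaded programs unless the caller attaches them in a driver. */
static struct toy_prog *toy_prog_get_type_dev(struct toy_prog *prog,
                                              int type, bool attach_drv)
{
    assert(prog != NULL);
    if (prog->type != type)
        return NULL;
    if (prog->offloaded && !attach_drv)
        return NULL;
    return prog;
}

/* The former stand-alone function reduced to a thin inline wrapper. */
static inline struct toy_prog *toy_prog_get_type(struct toy_prog *prog,
                                                 int type)
{
    return toy_prog_get_type_dev(prog, type, false);
}
```

Keeping a single implementation with an inline wrapper avoids two exported symbols that could drift apart.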

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

bpf: offload: move offload device validation out to the drivers · 288b3de5

Committed by Jakub Kicinski on Nov 20, 2017

With TC shared block changes we can't depend on correct netdev
pointer being available in cls_bpf.  Move the device validation
to the driver.  Core will only make sure that offloaded programs
are always attached in the driver (or in HW by the driver).  We
trust that drivers which implement offload callbacks will perform
necessary checks.

Moving the checks to the driver is generally a useful thing;
in practice the check should be against a switchdev instance,
not a netdev, given that most ASICs will probably allow using
the same program on many ports.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

bpf: offload: rename the ifindex field · 1f6f4cb7

Committed by Jakub Kicinski on Nov 20, 2017

bpf_target_prog seems long and clunky, rename it to prog_ifindex.
We don't want to call this field just ifindex, because maps
may need a similar field in the future and bpf_attr members for
programs and maps are unnamed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

19 Nov, 2017 8 commits

tcp: when scheduling TLP, time of RTO should account for current ACK · ed66dfaf

Committed by Neal Cardwell on Nov 17, 2017

Fix the TLP scheduling logic so that when scheduling a TLP probe, we
ensure that the estimated time at which an RTO would fire accounts for
the fact that ACKs indicating forward progress should push back RTO
times.

After the following fix:

df92c839 ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")

we had an unintentional behavior change in the following kind of
scenario: suppose the RTT variance has been very low recently. Then
suppose we send out a flight of N packets and our RTT is 100ms:

t=0: send a flight of N packets
t=100ms: receive an ACK for N-1 packets

The response before df92c839 was:
  -> schedule a TLP for now + RTO_interval

The response after df92c839 is:
  -> schedule a TLP for t=0 + RTO_interval

Since RTO_interval = srtt + RTT_variance, this means that we have
scheduled a TLP timer at a point in the future that only accounts for
RTT_variance. If the RTT_variance term is small, this means that the
timer fires soon.

Before df92c839 this would not happen, because in that code, when
we receive an ACK for a prefix of flight, we did:

    1) Near the top of tcp_ack(), switch from TLP timer to RTO
       at write_queue_head->packet_tx_time + RTO_interval:
            if (icsk->icsk_pending == ICSK_TIME_LOSS_PROBE)
                   tcp_rearm_rto(sk);

    2) In tcp_clean_rtx_queue(), update the RTO to now + RTO_interval:
            if (flag & FLAG_ACKED) {
                   tcp_rearm_rto(sk);

    3) In tcp_ack() after tcp_fastretrans_alert() switch from RTO
       to TLP at now + RTO_interval:
            if (icsk->icsk_pending == ICSK_TIME_RETRANS)
                   tcp_schedule_loss_probe(sk);

In df92c839 we removed that 3-phase dance, and instead directly
set the TLP timer once: we set the TLP timer in cases like this to
write_queue_head->packet_tx_time + RTO_interval. So if the RTT
variance is small, then this means that this is setting the TLP timer
to fire quite soon. This means if the ACK for the tail of the flight
takes longer than an RTT to arrive (often due to delayed ACKs), then
the TLP timer fires too quickly.
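The timing difference discussed above can be made concrete with a small model. This is a sketch under stated assumptions, not the TCP stack's code: `rto_deadline_ms()` and its parameters are hypothetical names, times are in milliseconds, and the numbers in the usage note (srtt of 100 ms, RTT variance of 10 ms, hence an RTO interval of 110 ms) are chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the RTO deadline considered when a TLP is
 * scheduled. If the current ACK made forward progress, the RTO clock
 * restarts at "now"; otherwise it still runs from the transmit time of
 * the oldest unacknowledged packet. */
static long rto_deadline_ms(long now_ms, long head_tx_time_ms,
                            long rto_interval_ms, bool ack_advanced)
{
    long base_ms;

    assert(rto_interval_ms >= 0);
    base_ms = ack_advanced ? now_ms : head_tx_time_ms;
    return base_ms + rto_interval_ms;
}
```

For the scenario in the text, with the flight sent at t=0 and an ACK for N-1 packets arriving at t=100, basing the deadline on the head packet gives t=110, only 10 ms after the ACK. Basing it on the ACK arrival gives t=210, which leaves a full RTO interval for the remaining ACK.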

Fixes: df92c839 ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

NTB: switchtec_ntb: Add skeleton NTB driver · e099b45b

Committed by Logan Gunthorpe on Aug 03, 2017

Add a skeleton NTB driver which will be filled out in subsequent patches.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

NTB: switchtec_ntb: Introduce initial NTB driver · 33dea5aa

Committed by Logan Gunthorpe on Aug 03, 2017

Seeing the Switchtec NTB hardware shares the same endpoint as the
management endpoint we utilize the class_interface API to register
an NTB driver for every Switchtec device in the system that has the
NTB class code.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

NTB: Add check and comment for link up to mw_count() and mw_get_align() · fa5ab66e

Committed by Logan Gunthorpe on Aug 03, 2017

Adds a comment and a check to ntb_mw_get_align() so that it always fails
if the function is called before the link is up.

Also adds a comment to ntb_mw_count() to note that it may return 0 if
it is called before the link is up.

This is to prevent accidental mis-use in clients that are testing
on hardware that this doesn't matter for.
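The guard described above might look roughly like this in a stand-alone model. The `toy_` names are hypothetical, and returning `-EINVAL` when the link is down is an assumption made for the sketch; the commit message only says that the call fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device state: link status and a memory-window alignment value. */
struct toy_ntb {
    bool link_up;
    unsigned long mw_align;
};

/* Report the alignment only once the link is up. Returns 0 on success
 * and a negative errno (assumed: -EINVAL) while the link is down. */
static int toy_mw_get_align(const struct toy_ntb *ntb, unsigned long *align)
{
    assert(ntb != NULL && align != NULL);
    if (!ntb->link_up)
        return -EINVAL;
    *align = ntb->mw_align;
    return 0;
}
```

Failing early on every device means a client tested only on hardware where the ordering happens not to matter still gets an error instead of silently depending on that behavior.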

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

NTB: switchtec: Add link event notifier callback · 48c302dc

Committed by Logan Gunthorpe on Aug 03, 2017

In order for the Switchtec NTB code to handle link change events we
create a notifier callback in the switchtec code which gets called
whenever an appropriate event interrupt occurs.

In order to preserve userspace's ability to follow these events,
we compare the event count with a stored copy from last time we
checked.
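The comparison against a stored count can be sketched as follows. This is an illustrative model with hypothetical names (`struct toy_link_watch`, `toy_link_event_pending()`); how the real driver reads the hardware counter is not described here and is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Remembers the event count observed at the previous check. */
struct toy_link_watch {
    unsigned int last_count;
};

/* Returns true if the counter moved since the last check and records
 * the new value. The counter itself is only read, never cleared, so
 * other consumers of the same count are not disturbed. */
static bool toy_link_event_pending(struct toy_link_watch *w,
                                   unsigned int hw_count)
{
    assert(w != NULL);
    if (hw_count == w->last_count)
        return false;
    w->last_count = hw_count;
    return true;
}
```

Detecting changes by comparison rather than by resetting the counter is what lets userspace keep following the same events.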

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

NTB: switchtec: Add NTB hardware register definitions · c082b04c

Committed by Logan Gunthorpe on Aug 03, 2017

There are two additional regions: ctrl and dbmsg. The first is
for generic NTB control and memory windows. The second is for doorbells
and message registers. This patch also adds a number of related
constants for using these registers.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

NTB: switchtec: Export class symbol for use in upper layer driver · 302e994d

Committed by Logan Gunthorpe on Aug 03, 2017

We export the class pointer symbol and add an extern define in the
Switchtec header file.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

NTB: switchtec: Move structure definitions into a common header · 5a1c269f

Committed by Logan Gunthorpe on Aug 03, 2017

Create the switchtec.h header in include/linux with hardware defines
and the switchtec_dev structure. Both moved directly from switchtec.c.
This is a prep patch for creating an NTB driver for Switchtec.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

18 Nov, 2017 2 commits

sctp: set frag_point in sctp_setsockopt_maxseg correctly · ecca8f88

Committed by Xin Long on Nov 17, 2017

Now in sctp_setsockopt_maxseg user_frag or frag_point can be set with
val >= 8 and val <= SCTP_MAX_CHUNK_LEN. But both checks are incorrect.

val >= 8 means frag_point can even be less than SCTP_DEFAULT_MINSEGMENT.
Then in sctp_datamsg_from_user(), when its value is greater than the cookie
echo length and it tries to bundle with the cookie echo chunk, first_len will
overflow.

Worse still, when its value equals the cookie echo length, first_len
becomes 0 and the fragmentation code later goes into a dead loop. In Hangbin's
syzkaller testing environment, OOM was even triggered by the consecutive memory
allocations in that loop.

Besides, SCTP_MAX_CHUNK_LEN is the max size of the whole chunk, it should
deduct the data header for frag_point or user_frag check.

This patch does a proper check with SCTP_DEFAULT_MINSEGMENT subtracting
the sctphdr and datahdr, SCTP_MAX_CHUNK_LEN subtracting datahdr when
setting frag_point via sockopt. It also improves the sctp_setsockopt_maxseg
code.
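The corrected range check can be sketched as below, following only what the message states. The numeric values for the minimum segment size and maximum chunk length are assumptions made for illustration (the header sizes are the standard 12-byte SCTP common header and 16-byte DATA chunk header), the `TOY_` and `toy_` names are hypothetical, and any additional per-address-family adjustment in the real code is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    TOY_MINSEGMENT    = 512,    /* assumed value of SCTP_DEFAULT_MINSEGMENT */
    TOY_MAX_CHUNK_LEN = 65532,  /* assumed value of SCTP_MAX_CHUNK_LEN */
    TOY_SCTPHDR_LEN   = 12,     /* SCTP common header */
    TOY_DATAHDR_LEN   = 16,     /* DATA chunk header */
};

/* Accept a fragmentation point only if it lies between the minimum
 * segment payload and the maximum chunk payload. */
static bool toy_maxseg_valid(int val)
{
    int min_len = TOY_MINSEGMENT - TOY_SCTPHDR_LEN - TOY_DATAHDR_LEN;
    int max_len = TOY_MAX_CHUNK_LEN - TOY_DATAHDR_LEN;

    assert(min_len <= max_len);
    return val >= min_len && val <= max_len;
}
```

Under these assumed values the accepted range is 484 to 65516, so a tiny value such as 8 is rejected instead of being allowed to underflow first_len later.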

Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

include/asm-generic/topology.h: remove unused parent_node() macro · 7016383b

Committed by Dou Liyang on Nov 17, 2017

Commit a7be6e5a ("mm: drop useless local parameters of
__register_one_node()") removed the last user of parent_node().

The generic parent_node() macro is therefore unnecessary.

Remove it for cleanup.

Link: http://lkml.kernel.org/r/1504234599-29533-8-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

openeuler / Kernel synced successfully more than 1 year ago