1. 12 Mar, 2021 3 commits
  2. 11 Mar, 2021 3 commits
  3. 10 Mar, 2021 3 commits
    • net: check if protocol extracted by virtio_net_hdr_set_proto is correct · 924a9bc3
      Committed by Balazs Nemeth
      For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't
      set) based on the type in the virtio net hdr, but the skb could contain
      anything since it could come from packet_snd through a raw socket. If
      there is a mismatch between what virtio_net_hdr_set_proto sets and
      the actual protocol, then the skb could be handled incorrectly later
      on.
      
      An example where this poses an issue is with the subsequent call to
      skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set
      correctly. A specially crafted packet could fool
      skb_flow_dissect_flow_keys_basic, preventing EINVAL from being returned.
      
      Avoid blindly trusting the information provided by the virtio net header
      by checking that the protocol in the packet actually matches the
      protocol set by virtio_net_hdr_set_proto. Note that since the protocol
      is only checked if skb->dev implements header_ops->parse_protocol,
      packets from devices without the implementation are not checked at this
      stage.
      
      Fixes: 9274124f ("net: stricter validation of untrusted gso packets")
      Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
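The check described in this commit can be illustrated with a small userspace sketch. It is not the kernel code: every `sketch_` and `SKETCH_` name is hypothetical, the frame is assumed to be plain Ethernet, and only two GSO types are modelled. The idea shown is simply to compare the protocol implied by the untrusted header with the protocol actually found in the packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants; the values mirror the usual Ethernet ones. */
#define SKETCH_ETH_HLEN   14
#define SKETCH_ETH_P_IP   0x0800
#define SKETCH_ETH_P_IPV6 0x86DD

/* Hypothetical stand-in for the GSO type carried in the virtio net header. */
enum sketch_gso_type { SKETCH_GSO_TCPV4, SKETCH_GSO_TCPV6 };

/* Protocol implied by the (untrusted) header. */
static uint16_t sketch_proto_from_hdr(enum sketch_gso_type type)
{
    return type == SKETCH_GSO_TCPV6 ? SKETCH_ETH_P_IPV6 : SKETCH_ETH_P_IP;
}

/* Protocol actually present in the frame: the EtherType at bytes 12..13.
 * Returns 0 when the frame is too short to contain an Ethernet header. */
static uint16_t sketch_parse_protocol(const uint8_t *frame, size_t len)
{
    if (len < SKETCH_ETH_HLEN)
        return 0;
    return (uint16_t)((frame[12] << 8) | frame[13]);
}

/* Accept the packet only when both sources agree on the protocol. */
static bool sketch_proto_matches(enum sketch_gso_type type,
                                 const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    return sketch_parse_protocol(frame, len) == sketch_proto_from_hdr(type);
}
```

A caller in this sketch would reject the packet (the commit message mentions EINVAL) whenever `sketch_proto_matches()` returns false. As the message notes, the real check only runs when the device provides a way to parse the protocol.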
    • bpf, xdp: Restructure redirect actions · ee75aef2
      Committed by Björn Töpel
      The XDP_REDIRECT implementations for maps and non-maps are fairly
      similar, but obviously need to take different code paths depending on
      if the target is using a map or not. Today, the redirect targets for
      XDP either use a map or are based on ifindex.
      
      Here, the map type and id are added to bpf_redirect_info, instead of
      the actual map. Map type, map item/ifindex, and the map_id (if any) are
      passed to xdp_do_redirect().
      
      For ifindex-based redirect, used by the bpf_redirect() XDP BPF helper,
      a special map type/id is used. Map type of UNSPEC together with map id
      equal to INT_MAX has the special meaning of an ifindex-based
      redirect. Note that valid map ids are 1 inclusive, INT_MAX exclusive
      ([1,INT_MAX[).
      
      In addition to making the code easier to follow, using explicit type
      and id in bpf_redirect_info has a slight positive performance impact
      by avoiding a pointer indirection for the map type lookup and instead
      using the cacheline for bpf_redirect_info.
      
      Since the actual map is not passed via bpf_redirect_info anymore, the
      map lookup is only done in the BPF helper. This means that the
      bpf_clear_redirect_map() function can be removed. The actual map item
      is RCU protected.
      
      The bpf_redirect_info flags member is not used by XDP, and not
      read/written any more. The map member is only written to when
      required/used, and not unconditionally.
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/bpf/20210308112907.559576-3-bjorn.topel@gmail.com
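The type/id encoding described in this commit can be sketched as follows. This is a simplified userspace model, not the kernel's bpf_redirect_info: the struct layout, the field names and all `sketch_` identifiers are assumptions made for illustration. Only the rule itself comes from the message: type UNSPEC together with id INT_MAX marks an ifindex-based redirect, and valid map ids lie in [1, INT_MAX).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical map types; only UNSPEC matters for the special case. */
enum sketch_map_type { SKETCH_MAP_UNSPEC = 0, SKETCH_MAP_DEVMAP, SKETCH_MAP_CPUMAP };

/* Hypothetical stand-in for the per-CPU redirect state. */
struct sketch_redirect_info {
    enum sketch_map_type map_type;
    uint32_t map_id;    /* valid map ids are in [1, INT_MAX) */
    uint32_t tgt_index; /* map item index, or ifindex for the special case */
};

/* ifindex-based redirect: type UNSPEC plus the reserved id INT_MAX. */
static void sketch_set_ifindex_redirect(struct sketch_redirect_info *ri,
                                        uint32_t ifindex)
{
    ri->map_type = SKETCH_MAP_UNSPEC;
    ri->map_id = (uint32_t)INT_MAX;
    ri->tgt_index = ifindex;
}

/* Map-based redirect: record the type and id, not a pointer to the map. */
static void sketch_set_map_redirect(struct sketch_redirect_info *ri,
                                    enum sketch_map_type type,
                                    uint32_t map_id, uint32_t index)
{
    assert(map_id >= 1 && map_id < (uint32_t)INT_MAX);
    ri->map_type = type;
    ri->map_id = map_id;
    ri->tgt_index = index;
}

/* The consumer tells the two cases apart without touching any map. */
static bool sketch_is_ifindex_redirect(const struct sketch_redirect_info *ri)
{
    return ri->map_type == SKETCH_MAP_UNSPEC &&
           ri->map_id == (uint32_t)INT_MAX;
}
```

Because INT_MAX is excluded from the valid id range, the reserved value cannot collide with a real map.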
    • bpf, xdp: Make bpf_redirect_map() a map operation · e6a4750f
      Committed by Björn Töpel
      Currently the bpf_redirect_map() implementation dispatches to the
      correct map-lookup function via a switch-statement. To avoid the
      dispatching, this change adds bpf_redirect_map() as a map
      operation. Each map provides its bpf_redirect_map() version, and
      the correct function is automatically selected by the BPF verifier.
      
      A nice side-effect of the code movement is that the map lookup
      functions are now local to the map implementation files, which removes
      one additional function call.
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/bpf/20210308112907.559576-2-bjorn.topel@gmail.com
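The move from a switch statement to a per-map operation can be pictured with an ops table. The sketch below is a userspace approximation under stated assumptions: all `sketch_` names and the return codes are hypothetical, and it dispatches through a function pointer at run time, whereas the commit message says the BPF verifier selects the correct function, so no such run-time dispatch is implied for the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes for the sketch. */
enum { SKETCH_ABORTED = 0, SKETCH_REDIRECT = 4 };

struct sketch_map;

/* Each map type supplies its own redirect implementation. */
struct sketch_map_ops {
    int (*redirect)(struct sketch_map *map, uint32_t key);
};

struct sketch_map {
    const struct sketch_map_ops *ops;
    uint32_t max_entries;
};

/* Redirect for one hypothetical map type: the lookup stays local to the
 * map implementation instead of living in a shared switch statement. */
static int sketch_devmap_redirect(struct sketch_map *map, uint32_t key)
{
    return key < map->max_entries ? SKETCH_REDIRECT : SKETCH_ABORTED;
}

static const struct sketch_map_ops sketch_devmap_ops = {
    .redirect = sketch_devmap_redirect,
};

/* Generic entry point: no switch on the map type is needed. */
static int sketch_redirect_map(struct sketch_map *map, uint32_t key)
{
    assert(map != NULL && map->ops != NULL && map->ops->redirect != NULL);
    return map->ops->redirect(map, key);
}
```

Adding a new map type in this model means adding one ops table, without editing the generic entry point.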
  4. 09 Mar, 2021 1 commit
  5. 08 Mar, 2021 2 commits
  6. 05 Mar, 2021 13 commits
  7. 04 Mar, 2021 6 commits
  8. 03 Mar, 2021 2 commits
    • swap: fix swapfile read/write offset · caf6912f
      Committed by Jens Axboe
      We're not factoring in the start of the file for where to write and
      read the swapfile, which leads to very unfortunate side effects of
      writing where we should not be...
      
      Fixes: 48d15436 ("mm: remove get_swap_bio")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • KVM: x86/xen: Add support for vCPU runstate information · 30b5c851
      Committed by David Woodhouse
      This is how Xen guests do steal time accounting. The hypervisor records
      the amount of time spent in each of running/runnable/blocked/offline
      states.
      
      In the Xen accounting, a vCPU is still in state RUNSTATE_running while
      in Xen for a hypercall or I/O trap, etc. Only if Xen explicitly schedules
      does the state become RUNSTATE_blocked. In KVM this means that even when
      the vCPU exits the kvm_run loop, the state remains RUNSTATE_running.
      
      The VMM can explicitly set the vCPU to RUNSTATE_blocked by using the
      KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT attribute, and can also use
      KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST to retrospectively add a given
      amount of time to the blocked state and subtract it from the running
      state.
      
      The state_entry_time corresponds to get_kvmclock_ns() at the time the
      vCPU entered the current state, and the total times of all four states
      should always add up to state_entry_time.
      Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20210301125309.874953-2-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
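The accounting rules in this commit message can be modelled in a few lines. The sketch below is illustrative only: the struct, the `sketch_` names and the use of plain integer timestamps are assumptions, not the KVM or Xen data layout. It shows a state transition charging elapsed time to the state being left, the retrospective adjustment moving time from running to blocked, and the invariant that the four per-state totals add up to state_entry_time (assuming the clock starts at 0).

```c
#include <assert.h>
#include <stdint.h>

/* The four states named in the commit message (hypothetical enum). */
enum sketch_runstate {
    SKETCH_RUNNING,
    SKETCH_RUNNABLE,
    SKETCH_BLOCKED,
    SKETCH_OFFLINE,
    SKETCH_NR_STATES
};

struct sketch_runstate_info {
    enum sketch_runstate state;
    uint64_t state_entry_time;       /* time the current state was entered */
    uint64_t time[SKETCH_NR_STATES]; /* accumulated time per state */
};

/* Switch state: charge the elapsed time to the state being left. */
static void sketch_runstate_set(struct sketch_runstate_info *rs,
                                enum sketch_runstate next, uint64_t now)
{
    assert(now >= rs->state_entry_time);
    rs->time[rs->state] += now - rs->state_entry_time;
    rs->state = next;
    rs->state_entry_time = now;
}

/* Retrospectively move delta from the running to the blocked total. */
static void sketch_runstate_adjust_blocked(struct sketch_runstate_info *rs,
                                           uint64_t delta)
{
    assert(delta <= rs->time[SKETCH_RUNNING]);
    rs->time[SKETCH_RUNNING] -= delta;
    rs->time[SKETCH_BLOCKED] += delta;
}

/* Sum over all states; expected to equal state_entry_time. */
static uint64_t sketch_runstate_total(const struct sketch_runstate_info *rs)
{
    uint64_t sum = 0;
    for (int i = 0; i < SKETCH_NR_STATES; i++)
        sum += rs->time[i];
    return sum;
}
```

The adjustment leaves the sum unchanged, which is why it can be applied after the fact without breaking the invariant.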
  9. 02 Mar, 2021 5 commits
  10. 01 Mar, 2021 1 commit
    • can: skb: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership · e940e089
      Committed by Oleksij Rempel
      There are two ref count variables controlling the free()ing of a socket:
      - struct sock::sk_refcnt - which is changed by sock_hold()/sock_put()
      - struct sock::sk_wmem_alloc - which accounts the memory allocated by
        the skbs in the send path.
      
      If TX skbs are still in flight when the socket is closed,
      struct sock::sk_refcnt reaches 0. In the TX path the CAN stack
      clones an "echo" skb, calls sock_hold() on the original socket and
      references it. This produces the following back trace:
      
      | WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134
      | refcount_t: addition on 0; use-after-free.
      | Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E)
      | CPU: 0 PID: 280 Comm: test_can.sh Tainted: G            E     5.11.0-04577-gf8ff6603c617 #203
      | Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
      | Backtrace:
      | [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24)
      |  r7:00000000 r6:600f0113 r5:00000000 r4:81441220
      | [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8)
      | [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114)
      |  r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90
      | [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8)
      |  r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2
      | [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134)
      |  r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540
      | [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50)
      | [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c)
      | [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230)
      |  r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600
      | [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70)
      |  r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00
      | [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318)
      |  r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600
      | [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264)
      |  r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400
      | [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534)
      
      To fix this problem, only set skb ownership to sockets that still have
      a ref count > 0.
      
      Fixes: 0ae89beb ("can: add destructor for self generated skbs")
      Cc: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: Andre Naujoks <nautsch2@gmail.com>
      Link: https://lore.kernel.org/r/20210226092456.27126-1-o.rempel@pengutronix.de
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
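The fix described in this commit, taking a reference only when the count is still above zero, can be sketched with C11 atomics. This is a userspace illustration, not the kernel patch: the structs and every `sketch_` name are hypothetical, and the compare-and-swap loop is one common way to implement an "increment unless zero" operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct sk_buff. */
struct sketch_sock { atomic_uint refcnt; };
struct sketch_skb  { struct sketch_sock *sk; };

/* Increment the counter only if it is currently non-zero.
 * Returns true when a reference was taken. */
static bool sketch_ref_inc_not_zero(atomic_uint *ref)
{
    unsigned int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}

/* Set ownership only for sockets that are still alive; an skb whose
 * socket already dropped to zero is left without an owner. */
static void sketch_set_owner(struct sketch_skb *skb, struct sketch_sock *sk)
{
    if (sk != NULL && sketch_ref_inc_not_zero(&sk->refcnt))
        skb->sk = sk;
}
```

An unconditional increment on a counter that already reached zero is the "addition on 0; use-after-free" case reported in the back trace; the conditional form avoids reviving a socket that is being freed.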
  11. 28 Feb, 2021 1 commit