1. 29 January 2020, 3 commits
    • io_uring: allow registering credentials · 071698e1
      Jens Axboe committed
      If an application wants to use a ring with different kinds of
      credentials, it can register them upfront. We don't look up credentials;
      the credentials of the task calling IORING_REGISTER_PERSONALITY are used.
      
      An 'id' is returned for the application to use with subsequent
      personality-based requests.
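      
      As a rough userspace sketch (raw syscalls via <sys/syscall.h> with the
      uapi definitions from <linux/io_uring.h>; ring_fd is a placeholder for
      an existing ring, and the sqe personality field that consumes the id
      came with the companion patches):
      
        int id = syscall(__NR_io_uring_register, ring_fd,
                         IORING_REGISTER_PERSONALITY, NULL, 0);
        if (id >= 0)
                sqe->personality = id;   /* issue with the registered creds */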
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add io-wq workqueue sharing · 24369c2e
      Pavel Begunkov committed
      If IORING_SETUP_ATTACH_WQ is set, wq_fd in io_uring_params is expected
      to be a valid io_uring fd, whose io-wq will be shared with the newly
      created io_uring instance. If the flag is set but the io-wq can't be
      shared, setup fails.
      
      This allows creation of "sibling" io_urings, where we prefer to keep the
      SQ/CQ private, but want to share the async backend to minimize the amount
      of overhead associated with having multiple rings that belong to the same
      backend.
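      
      A minimal setup sketch (existing_ring_fd is a placeholder for the ring
      whose backend is shared):
      
        struct io_uring_params p = { 0 };
        
        p.flags = IORING_SETUP_ATTACH_WQ;
        p.wq_fd = existing_ring_fd;     /* share this ring's io-wq */
        int sibling_fd = syscall(__NR_io_uring_setup, 64, &p);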
      Reported-by: Jens Axboe <axboe@kernel.dk>
      Reported-by: Daurnimator <quae@daurnimator.com>
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring/io-wq: don't use static creds/mm assignments · cccf0ee8
      Jens Axboe committed
      We currently set up the io_wq with a static set of mm and creds. Even
      for a single-use io-wq per io_uring, this is suboptimal, as we may have
      multiple enters of the ring. For sharing the io-wq backend, it doesn't
      work at all.
      
      Switch to passing in the creds and mm when the work item is setup. This
      means that async work is no longer deferred to the io_uring mm and creds,
      it is done with the current mm and creds.
      
      Flag this behavior with IORING_FEAT_CUR_PERSONALITY, so applications know
      they can rely on the current personality (mm and creds) being the same
      for direct issue and async issue.
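      
      Applications can test for this at setup time; a sketch, with params
      being the io_uring_params filled in by io_uring_setup(2):
      
        if (!(params.features & IORING_FEAT_CUR_PERSONALITY))
                /* older kernel: async work may not run with the
                 * submitter's mm/creds, adjust expectations */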
      Reviewed-by: Stefan Metzmacher <metze@samba.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 21 January 2020, 20 commits
    • io_uring: optimise sqe-to-req flags translation · 6b47ee6e
      Pavel Begunkov committed
      For each IOSQE_* flag there is a corresponding REQ_F_* flag, and there
      is a repetitive pattern to their translation:
      e.g. if (sqe->flags & SQE_FLAG*) req->flags |= REQ_F_FLAG*
      
      Use the same numeric values/bits for both and copy the flags wholesale
      instead of handling each one manually.
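      
      An illustrative before/after (the mask name is schematic):
      
        /* before: translate each flag by hand */
        if (sqe->flags & IOSQE_IO_DRAIN)
                req->flags |= REQ_F_IO_DRAIN;
        if (sqe->flags & IOSQE_IO_LINK)
                req->flags |= REQ_F_LINK;
        
        /* after: IOSQE_* and REQ_F_* share bit positions */
        req->flags |= sqe->flags & SQE_VALID_FLAGS;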
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for probing opcodes · 66f4af93
      Jens Axboe committed
      The application currently has no way of knowing whether a given opcode
      is supported, short of trying to issue one and seeing if it gets
      -EINVAL. Even that approach is fraught with peril, as maybe we're
      getting -EINVAL due to some fields being missing, or maybe it's just
      not that easy to issue that particular command without doing some
      other legwork in terms of setup first.
      
      This adds IORING_REGISTER_PROBE, which fills in a structure with info
      on what is supported or not. This will work even with sparse opcode
      fields, which may happen in the future or even today if someone
      backports specific features to older kernels.
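      
      A probing sketch (note the bound check against what the kernel filled
      in):
      
        struct io_uring_probe *probe;
        
        probe = calloc(1, sizeof(*probe) +
                          256 * sizeof(struct io_uring_probe_op));
        syscall(__NR_io_uring_register, ring_fd, IORING_REGISTER_PROBE,
                probe, 256);
        
        if (IORING_OP_OPENAT2 <= probe->last_op &&
            (probe->ops[IORING_OP_OPENAT2].flags & IO_URING_OP_SUPPORTED))
                /* openat2 is available through io_uring here */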
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add opcode to issue trace event · 354420f7
      Jens Axboe committed
      For some test apps at least, user_data is just zeroes. So it's not a
      good way to tell what the command actually is. Add the opcode to the
      issue trace point.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for IORING_OP_OPENAT2 · cebdb986
      Jens Axboe committed
      Add support for the new openat2(2) system call. It's trivial to do, as
      we can have openat(2) just be wrapped around it.
      Suggested-by: Stefan Metzmacher <metze@samba.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: enable option to only trigger eventfd for async completions · f2842ab5
      Jens Axboe committed
      If an application is using eventfd notifications with poll to know when
      new SQEs can be issued, it expects the following reads/writes to
      complete inline. And with that, it knows there are events available and
      doesn't want spurious wakeups on the eventfd for those requests.
      
      This adds IORING_REGISTER_EVENTFD_ASYNC, which works just like
      IORING_REGISTER_EVENTFD, except it only triggers notifications for events
      that happen from async completions (IRQ, or io-wq worker completions).
      Any completions inline from the submission itself will not trigger
      notifications.
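      
      Registration sketch (efd is the eventfd to be signalled):
      
        int efd = eventfd(0, EFD_CLOEXEC);
        
        /* like IORING_REGISTER_EVENTFD, but only async completions notify */
        syscall(__NR_io_uring_register, ring_fd,
                IORING_REGISTER_EVENTFD_ASYNC, &efd, 1);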
      Suggested-by: Mark Papadakis <markuspapadakis@icloud.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for send(2) and recv(2) · fddaface
      Jens Axboe committed
      This adds IORING_OP_SEND for send(2) support, and IORING_OP_RECV for
      recv(2) support.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for IORING_SETUP_CLAMP · 8110c1a6
      Jens Axboe committed
      Some applications like to start small in terms of ring size, and then
      ramp up as needed. This is a bit tricky to do currently, since we don't
      advertise the max ring size.
      
      This adds IORING_SETUP_CLAMP. If set, and the values for SQ or CQ ring
      size exceed what we support, then clamp them at the max values instead
      of returning -EINVAL. Since we return the chosen ring sizes after setup,
      no further changes are needed on the application side. io_uring already
      changes the ring sizes if the application doesn't ask for power-of-two
      sizes, for example.
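      
      Usage sketch: ask for an oversized ring and read back the clamped
      values:
      
        struct io_uring_params p = { 0 };
        
        p.flags = IORING_SETUP_CLAMP;
        int fd = syscall(__NR_io_uring_setup, 1 << 20, &p);
        
        /* p.sq_entries / p.cq_entries now hold the actual, clamped sizes */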
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • pcpu_ref: add percpu_ref_tryget_many() · 4e5ef023
      Pavel Begunkov committed
      Add percpu_ref_tryget_many(), which works the same way as
      percpu_ref_tryget(), but grabs a specified number of refs.
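      
      A usage sketch (ctx->refs and nr_requests are placeholders):
      
        /* grab one ref per request in a single operation */
        if (!percpu_ref_tryget_many(&ctx->refs, nr_requests))
                return -EAGAIN;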
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Dennis Zhou <dennis@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add IORING_OP_MADVISE · c1ca757b
      Jens Axboe committed
      This adds support for doing madvise(2) through io_uring. We assume that
      any operation can block, and hence punt everything async. This could be
      improved, but it's hard to make bulletproof. The async punt keeps it
      safe.
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • mm: make do_madvise() available internally · db08ca25
      Jens Axboe committed
      This is in preparation for enabling this functionality through io_uring.
      Add a helper that is just exporting what sys_madvise() does, and have the
      system call use it.
      
      No functional changes in this patch.
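      
      The shape of the change, sketched (the exact signature in the tree may
      differ):
      
        int do_madvise(unsigned long start, size_t len_in, int behavior)
        {
                /* body of the old sys_madvise() implementation moves here */
        }
        
        SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in,
                        int, behavior)
        {
                return do_madvise(start, len_in, behavior);
        }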
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add IORING_OP_FADVISE · 4840e418
      Jens Axboe committed
      This adds support for doing fadvise through io_uring. We assume that
      WILLNEED doesn't block, but that DONTNEED may block.
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: allow use of offset == -1 to mean file position · ba04291e
      Jens Axboe committed
      This behaves like preadv2/pwritev2 with offset == -1, it'll use (and
      update) the current file position. This obviously comes with the caveat
      that if the application has multiple read/writes in flight, then the
      end result will not be as expected. This is similar to threads sharing
      a file descriptor and doing IO using the current file position.
      
      Since this feature isn't easily detectable by doing a read or write,
      add a feature flag, IORING_FEAT_RW_CUR_POS, to allow applications to
      detect its presence.
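      
      Usage sketch:
      
        sqe->off = (__u64) -1;   /* use and update the current file position */
        
        if (!(params.features & IORING_FEAT_RW_CUR_POS))
                /* older kernel: track offsets explicitly instead */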
      Reported-by: 李通洲 <carter.li@eoitek.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add non-vectored read/write commands · 3a6820f2
      Jens Axboe committed
      For use cases that don't already naturally have an iovec, it's easier
      (or more convenient) to just use a buffer address + length. This is
      particularly true if the use case is from languages that want to create
      a memory-safe abstraction on top of io_uring, and where introducing
      the need for an iovec may impose an ownership issue. For those cases,
      they currently need an indirection buffer, which means allocating data
      just for this purpose.
      
      Add basic read/write commands that don't require an iovec.
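      
      Raw SQE setup sketch for the non-vectored read:
      
        sqe->opcode = IORING_OP_READ;      /* or IORING_OP_WRITE */
        sqe->fd     = fd;
        sqe->addr   = (unsigned long) buf;
        sqe->len    = buf_len;
        sqe->off    = offset;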
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add IOSQE_ASYNC · ce35a47a
      Jens Axboe committed
      io_uring defaults to always doing inline submissions, if at all
      possible. But for larger copies, even if the data is fully cached, that
      can take a long time. Add an IOSQE_ASYNC flag that the application can
      set on the SQE - if set, it'll ensure that we always go async for those
      kinds of requests. Use the io-wq IO_WQ_WORK_CONCURRENT flag to ensure we
      get the concurrency we desire for this case.
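      
      Per-request opt-in, sketched:
      
        /* skip the inline-issue attempt, always punt this one async */
        sqe->flags |= IOSQE_ASYNC;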
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for IORING_OP_STATX · eddc7ef5
      Jens Axboe committed
      This provides support for async statx(2) through io_uring.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: avoid ring quiesce for fixed file set unregister and update · 05f3fb3c
      Jens Axboe committed
      We currently fully quiesce the ring before an unregister or update of
      the fixed fileset. This is very expensive, and we can be a bit smarter
      about this.
      
      Add a percpu refcount for the file tables as a whole. Grab a percpu ref
      when we use a registered file, and put it on completion. This is cheap
      to do. Upon removal of a file from a set, switch the ref count to atomic
      mode. When we hit zero ref on the completion side, then we know we can
      drop the previously registered files. When the old files have been
      dropped, switch the ref back to percpu mode for normal operation.
      
      Since there's a period between doing the update and the kernel being
      done with it, add an IORING_OP_FILES_UPDATE opcode that can perform the
      same action. The application knows the update has completed when it gets
      the CQE for it. Between doing the update and receiving this completion,
      the application must continue to use the unregistered fd if submitting
      IO on this particular file.
      
      This takes the runtime of test/file-register from liburing from 14s to
      about 0.7s.
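      
      A sketch of how the new opcode consumes the SQE fields (slot index 3
      chosen arbitrarily):
      
        int fds[1] = { new_fd };
        
        sqe->opcode = IORING_OP_FILES_UPDATE;
        sqe->addr   = (unsigned long) fds;
        sqe->len    = 1;        /* number of fds in the array */
        sqe->off    = 3;        /* starting index into the registered set */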
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for IORING_OP_CLOSE · b5dba59e
      Jens Axboe committed
      This works just like close(2), unsurprisingly. We remove the file
      descriptor and post the completion inline, then offload the actual
      (potential) last file put to async context.
      
      Mark the async part of this work as uncancellable, as we really must
      guarantee that the latter part of the close is run.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for IORING_OP_OPENAT · 15b71abe
      Jens Axboe committed
      This works just like openat(2), except it can be performed async. For
      the normal case of a non-blocking path lookup this will complete
      inline. If we have to do IO to perform the open, it'll be done from
      async context.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for fallocate() · d63d1b5e
      Jens Axboe committed
      This exposes fallocate(2) through io_uring.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix compat for IORING_REGISTER_FILES_UPDATE · 1292e972
      Eugene Syromiatnikov committed
      The fds field of struct io_uring_files_update is problematic with
      regard to compat user space, as pointer size differs between 32-bit,
      32-on-64-bit, and 64-bit user space.  In order to avoid custom compat
      handling in the syscall implementation, make fds a __u64 and use
      u64_to_user_ptr to retrieve it.  Also, align the field naturally and
      check that no garbage is passed there.
      
      Fixes: c3a31e60 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE")
      Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 18 January 2020, 1 commit
    • open: introduce openat2(2) syscall · fddb5d43
      Aleksa Sarai committed
      /* Background. */
      For a very long time, extending openat(2) with new features has been
      incredibly frustrating. This stems from the fact that openat(2) is
      possibly the most famous counter-example to the mantra "don't silently
      accept garbage from userspace" -- it doesn't check whether unknown flags
      are present[1].
      
      This means that (generally) the addition of new flags to openat(2) has
      been fraught with backwards-compatibility issues (O_TMPFILE has to be
      defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
      kernels gave errors, since it's insecure to silently ignore the
      flag[2]). All new security-related flags therefore have a tough road to
      being added to openat(2).
      
      Userspace also has a hard time figuring out whether a particular flag is
      supported on a particular kernel. While it is now possible with
      contemporary kernels (thanks to [3]), older kernels will expose unknown
      flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during
      openat(2) time matches modern syscall designs and is far more
      fool-proof.
      
      In addition, the newly-added path resolution restriction LOOKUP flags
      (which we would like to expose to user-space) don't feel related to the
      pre-existing O_* flag set -- they affect all components of path lookup.
      We'd therefore like to add a new flag argument.
      
      Adding a new syscall allows us to finally fix the flag-ignoring problem,
      and we can make it extensible enough so that we will hopefully never
      need an openat3(2).
      
      /* Syscall Prototype. */
        /*
         * open_how is an extensible structure (similar in interface to
         * clone3(2) or sched_setattr(2)). The size parameter must be set to
         * sizeof(struct open_how), to allow for future extensions. All future
         * extensions will be appended to open_how, with their zero value
         * acting as a no-op default.
         */
        struct open_how { /* ... */ };
      
        int openat2(int dfd, const char *pathname,
                    struct open_how *how, size_t size);
      
      /* Description. */
      The initial version of 'struct open_how' contains the following fields:
      
        flags
          Used to specify openat(2)-style flags. However, any unknown flag
          bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR)
          will result in -EINVAL. In addition, this field is 64-bits wide to
          allow for more O_ flags than currently permitted with openat(2).
      
        mode
          The file mode for O_CREAT or O_TMPFILE.
      
          Must be set to zero if flags does not contain O_CREAT or O_TMPFILE.
      
        resolve
          Restrict path resolution (in contrast to O_* flags, these affect
          all path components). The current set of flags is as follows (at
          the moment, all of the RESOLVE_ flags are implemented as just
          passing the corresponding LOOKUP_ flag).
      
          RESOLVE_NO_XDEV       => LOOKUP_NO_XDEV
          RESOLVE_NO_SYMLINKS   => LOOKUP_NO_SYMLINKS
          RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS
          RESOLVE_BENEATH       => LOOKUP_BENEATH
          RESOLVE_IN_ROOT       => LOOKUP_IN_ROOT
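      
      A minimal usage sketch against the prototype above (dirfd and the
      relative path are placeholders; definitions come from the new uapi
      header plus <fcntl.h>):
      
        struct open_how how = { 0 };
        
        how.flags   = O_RDONLY;
        how.resolve = RESOLVE_IN_ROOT;    /* resolve as if dirfd were "/" */
        
        int fd = syscall(__NR_openat2, dirfd, "etc/passwd",
                         &how, sizeof(how));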
      
      open_how does not contain an embedded size field, because it is of
      little benefit (userspace can figure out the kernel open_how size at
      runtime fairly easily without it). It also only contains u64s (even
      though ->mode arguably should be a u16) to avoid having padding fields
      which are never used in the future.
      
      Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE
      is no longer permitted for openat(2). As far as I can tell, this has
      always been a bug and appears to not be used by userspace (and I've not
      seen any problems on my machines by disallowing it). If it turns out
      this breaks something, we can special-case it and only permit it for
      openat(2) but not openat2(2).
      
      After input from Florian Weimer, the new open_how and flag definitions
      are inside a separate header from uapi/linux/fcntl.h, to avoid problems
      that glibc has with importing that header.
      
      /* Testing. */
      In a follow-up patch there are over 200 selftests which ensure that this
      syscall has the correct semantics and will correctly handle several
      attack scenarios.
      
      In addition, I've written a userspace library[4] which provides
      convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary
      because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care
      must be taken when using RESOLVE_IN_ROOT'd file descriptors with other
      syscalls). During the development of this patch, I've run numerous
      verification tests using libpathrs (showing that the API is reasonably
      usable by userspace).
      
      /* Future Work. */
      Additional RESOLVE_ flags have been suggested during the review period.
      These can be easily implemented separately (such as blocking auto-mount
      during resolution).
      
      Furthermore, there are some other proposed changes to the openat(2)
      interface (the most obvious example is magic-link hardening[5]) which
      would be a good opportunity to add a way for userspace to restrict how
      O_PATH file descriptors can be re-opened.
      
      Another possible avenue of future work would be some kind of
      CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace
      which openat2(2) flags and fields are supported by the current kernel
      (to avoid userspace having to go through several guesses to figure it
      out).
      
      [1]: https://lwn.net/Articles/588444/
      [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com
      [3]: commit 629e014b ("fs: completely ignore unknown open flags")
      [4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523
      [5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/
      [6]: https://youtu.be/ggD-eb3yPVs
      Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  4. 16 January 2020, 5 commits
    • block: fix an integer overflow in logical block size · ad6bf88a
      Mikulas Patocka committed
      Logical block size has type unsigned short. That means that it can be at
      most 32768. However, there are architectures that can run with 64k pages
      (for example arm64) and on these architectures, it may be possible to
      create block devices with 64k block size.
      
      For example (run this on an architecture with 64k pages):
      
      Mount will fail with this error because it tries to read the superblock using 2-sector
      access:
        device-mapper: writecache: I/O is not aligned, sector 2, size 1024, block size 65536
        EXT4-fs (dm-0): unable to read superblock
      
      This patch changes the logical block size from unsigned short to unsigned
      int to avoid the overflow.
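      
      The root cause in miniature (values wrap modulo 2^16 in a 16-bit type):
      
        unsigned short lbs16 = 65536;   /* wraps to 0 */
        unsigned int   lbs32 = 65536;   /* stores 65536 as intended */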
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bpf: Sockmap/tls, push write_space updates through ulp updates · 33bfe20d
      John Fastabend committed
      When a sockmap sock with TLS enabled is removed, we clean up bpf/psock
      state and call tcp_update_ulp() to push updates to the TLS ULP on top.
      However, we don't push the write_space callback up; instead we simply
      overwrite the op with the previous op stored by the psock. This may or
      may not be correct, so to ensure we don't overwrite the TLS write_space
      hook, pass this field to the ULP and have it fix up the ctx.
      
      This completes a previous fix that pushed the ops through to the ULP
      but at the time missed doing this for write_space, presumably because
      write_space TLS hook was added around the same time.
      
      Fixes: 95fa1454 ("bpf: sockmap/tls, close can race with map free")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/bpf/20200111061206.8028-4-john.fastabend@gmail.com
    • bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loop · 4da6a196
      John Fastabend committed
      When a sockmap is free'd and a socket in the map is enabled with tls
      we tear down the bpf context on the socket, the psock struct and state,
      and then call tcp_update_ulp(). The tcp_update_ulp() call is to inform
      the tls stack it needs to update its saved sock ops so that when the tls
      socket is later destroyed it doesn't try to call the now destroyed psock
      hooks.
      
      This is about keeping stacked ULPs in good shape so they always have
      the right set of stacked ops.
      
      However, the unhash() hook was recently removed from the TLS side. But
      the sockmap/bpf side is not doing any extra work to update the unhash
      op when it is torn down, instead expecting the TLS side to manage it.
      So both TLS and sockmap believe the other side is managing the op, and
      no one actually updates the hook, so it continues to point at
      tcp_bpf_unhash(). When the unhash hook is called, we call
      tcp_bpf_unhash(), which detects that the psock has already been
      destroyed and calls sk->sk_prot->unhash(), which calls tcp_bpf_unhash()
      yet again, and so on, looping and hanging the core.
      
      To fix have sockmap tear down logic fixup the stale pointer.
      
      Fixes: 5d92e631 ("net/tls: partially revert fix transition through disconnect with close")
      Reported-by: syzbot+83979935eb6304f8cd46@syzkaller.appspotmail.com
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/bpf/20200111061206.8028-2-john.fastabend@gmail.com
    • drm/dp_mst: Have DP_Tx send one msg at a time · 5a64967a
      Wayne Lin committed
      [Why]
      Noticed this while testing MST with a 4-port MST hub from StarTech.com.
      Sometimes monitors can't be lit up normally and we get the error
      message 'sideband msg build failed'.
      
      Looking into the aux transactions, it turns out the source will
      sometimes send out another down request before receiving the down reply
      to the previous one. On the other hand, the current code in
      drm_dp_get_one_sb_msg() doesn't handle the interleaved-replies case.
      Hence, the source can't build up the message completely and can't light
      up the monitors.
      
      [How]
      For good compatibility, force the source to send out one down request
      at a time. Add a flag, is_waiting_for_dwn_reply, to determine whether
      the source can send out a down request immediately or not.
      
      - Check the flag before calling process_single_down_tx_qlock to send
      out a msg
      - Set the flag when a down request is successfully sent out
      - Clear the flag when a down reply is successfully built up
      - Clear the flag when errors are found while sending out a down request
      - Clear the flag when errors are found while building up a down reply
      - Clear the flag when a timeout occurs while waiting for a down reply
      - Use drm_dp_mst_kick_tx() to try to send another queued down request
      at the end of drm_dp_mst_wait_tx_reply() (attempt to send out queued
      messages when errors occur)
      
      Cc: Lyude Paul <lyude@redhat.com>
      Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200113093649.11755-1-Wayne.Lin@amd.com
    • bpf: Fix incorrect verifier simulation of ARSH under ALU32 · 0af2ffc9
      Daniel Borkmann committed
      Anatoly has been fuzzing with the kBdysch harness and reported a hang
      in one of the outcomes:
      
        0: R1=ctx(id=0,off=0,imm=0) R10=fp0
        0: (85) call bpf_get_socket_cookie#46
        1: R0_w=invP(id=0) R10=fp0
        1: (57) r0 &= 808464432
        2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
        2: (14) w0 -= 810299440
        3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
        3: (c4) w0 s>>= 1
        4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
        4: (76) if w0 s>= 0x30303030 goto pc+216
        221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
        221: (95) exit
        processed 6 insns (limit 1000000) [...]
      
      Taking a closer look, the program was xlated as follows:
      
        # ./bpftool p d x i 12
        0: (85) call bpf_get_socket_cookie#7800896
        1: (bf) r6 = r0
        2: (57) r6 &= 808464432
        3: (14) w6 -= 810299440
        4: (c4) w6 s>>= 1
        5: (76) if w6 s>= 0x30303030 goto pc+216
        6: (05) goto pc-1
        7: (05) goto pc-1
        8: (05) goto pc-1
        [...]
        220: (05) goto pc-1
        221: (05) goto pc-1
        222: (95) exit
      
      Meaning, the visible effect is very similar to f54c7898 ("bpf: Fix
      precision tracking for unbounded scalars"): the fall-through branch of
      instruction 5 is considered never taken, given the conclusion from the
      min/max bounds tracking on w6, and therefore the dead-code sanitation
      rewrites it as goto pc-1. However, real-life input disagrees with the
      verification analysis, since a soft-lockup was observed.
      
      The bug sits in the analysis of ARSH. The definition is that we shift
      the target register value right by K bits by shifting in copies of its
      sign bit. In adjust_scalar_min_max_vals(), we first coerce the register
      into 32-bit mode, and the same happens after simulating the operation.
      However, when simulating the actual ARSH, we don't take the mode into
      account and act as if it's always 64-bit, but the location of the sign
      bit differs:
      
        dst_reg->smin_value >>= umin_val;
        dst_reg->smax_value >>= umin_val;
        dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val);
      
      Consider an unknown R0 where bpf_get_socket_cookie() (or others) would
      for example return 0xffff. With the above ARSH simulation, we'd see the
      following results:
      
        [...]
        1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0
        1: (85) call bpf_get_socket_cookie#46
        2: R0_w=invP(id=0) R10=fp0
        2: (57) r0 &= 808464432
          -> R0_runtime = 0x3030
        3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
        3: (14) w0 -= 810299440
          -> R0_runtime = 0xcfb40000
        4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
                                    (0xffffffff)
        4: (c4) w0 s>>= 1
          -> R0_runtime = 0xe7da0000
        5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0
                                    (0x67c00000)           (0x7ffbfff8)
        [...]
      
      In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111
      1011 0100 0000 0000 0000 0000'; the result after the shift is
      0xe7da0000, which is '1110 0111 1101 1010 0000 0000 0000 0000', where
      the sign bit is correctly retained in 32-bit mode. In insn 4, the umax
      was 0xffffffff and changed into 0x7ffbfff8 after the shift, that is,
      '0111 1111 1111 1011 1111 1111 1111 1000', which means the simulation
      didn't retain the sign bit. With the above logic, the updates happen on
      the 64-bit min/max bounds, and since we coerced the register, the sign
      bits of the bounds are cleared as well; meaning, we need to force the
      simulation into s32 space for 32-bit ALU mode.
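      
      The distinction, sketched in C (kernel-style u32/s32/u64 typedefs
      assumed; signed right shift is arithmetic here):
      
        u32 v = 0xcfb40000;
        
        u64 as64 = (u64) v >> 1;           /* bit 63 clear: zeros shifted in */
        u32 as32 = (u32)((s32) v >> 1);    /* bit 31 set: 0xe7da0000 */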
      
      Verification after the fix is below. We first analyze the fall-through
      branch of the 32-bit signed >= test, eventually leading to rejection of
      the program in this specific case:
      
        0: R1=ctx(id=0,off=0,imm=0) R10=fp0
        0: (b7) r2 = 808464432
        1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0
        1: (85) call bpf_get_socket_cookie#46
        2: R0_w=invP(id=0) R10=fp0
        2: (bf) r6 = r0
        3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0
        3: (57) r6 &= 808464432
        4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0
        4: (14) w6 -= 810299440
        5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0
        5: (c4) w6 s>>= 1
        6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0
                                                    (0x67c00000)          (0xfffbfff8)
        6: (76) if w6 s>= 0x30303030 goto pc+216
        7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0
        7: (30) r0 = *(u8 *)skb[808464432]
        BPF_LD_[ABS|IND] uses reserved fields
        processed 8 insns (limit 1000000) [...]
      
      Fixes: 9cbe1f5a ("bpf/verifier: improve register value range tracking with ARSH")
      Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200115204733.16648-1-daniel@iogearbox.net
  5. 15 January 2020, 3 commits
    • cfg80211: Fix radar event during another phy CAC · 26ec17a1
      Orr Mazor committed
      If a CAC_FINISHED or RADAR_DETECTED radar event happens while another
      phy is performing CAC, we might need to cancel that CAC.
      
      If we got a radar on a channel that another phy is currently doing CAC
      on, then that CAC should be canceled.
      
      If, for example, two phys are doing CAC on the same channels, or on
      compatible channels, then once one of them finishes its CAC the other
      might need to cancel its CAC, since it is no longer relevant.
      
      To fix that, this commit adds a callback to end CAC and implements it
      in mac80211. It also adds a call to said callback if, after a radar
      event, we see that the CAC is no longer relevant.
      Signed-off-by: Orr Mazor <Orr.Mazor@tandemg.com>
      Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
      Link: https://lore.kernel.org/r/20191222145449.15792-1-Orr.Mazor@tandemg.com
      [slightly reformat/reword commit message]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • reimplement path_mountpoint() with less magic · c64cd6e3
      Al Viro committed
      ... and get rid of a bunch of bugs in it.  Background:
      the reason for path_mountpoint() is that umount() really doesn't
      want attempts to revalidate the root of what it's trying to umount.
      The thing we want to avoid actually happens from complete_walk();
      the solution was to do something parallel to the normal path_lookupat()
      and it both went overboard and got the boilerplate subtly
      (and not so subtly) wrong.
      
      A better solution is to do pretty much what the normal path_lookupat()
      does, but instead of complete_walk() do unlazy_walk().  All it takes
      to avoid that ->d_weak_revalidate() call...  mountpoint_last() goes
      away, along with everything it got wrong, and so does the magic around
      LOOKUP_NO_REVAL.
      
      Another source of bugs is that when we traverse mounts at the final
      location (and we need to do that - umount . expects to get whatever's
      overmounting ., if any, out of the lookup) we really ought to take
      care of ->d_manage() - as it is, manual umount of autofs automount
      in progress can lead to unpleasant surprises for the daemon.  Easily
      solved by using handle_lookup_down() instead of follow_mount().
      Tested-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • afs: Fix use-after-loss-of-ref · 40a708bd
      David Howells committed
      afs_lookup() has a tracepoint to indicate the outcome of
      d_splice_alias(), passing it the inode to retrieve the fid from.
      However, the function gave up its ref on that inode when it called
      d_splice_alias(), which may have failed and dropped the inode.
      
      Fix this by caching the fid.
      
      Fixes: 80548b03 ("afs: Add more tracepoints")
      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 14 January 2020, 3 commits
    • mm: khugepaged: add trace status description for SCAN_PAGE_HAS_PRIVATE · 554913f6
      Yang Shi committed
      Commit 99cb0dbd ("mm,thp: add read-only THP support for (non-shmem)
      FS") introduced a new khugepaged scan result, SCAN_PAGE_HAS_PRIVATE,
      but the corresponding description for trace events was not added.
      
      Link: http://lkml.kernel.org/r/1574793844-2914-1-git-send-email-yang.shi@linux.alibaba.com
      Fixes: 99cb0dbd ("mm,thp: add read-only THP support for (non-shmem) FS")
      Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, debug_pagealloc: don't rely on static keys too early · 8e57f8ac
      Vlastimil Babka committed
      Commit 96a2b03f ("mm, debug_pagelloc: use static keys to enable
      debugging") has introduced a static key to reduce overhead when
      debug_pagealloc is compiled in but not enabled.  It relied on the
      assumption that jump_label_init() is called before parse_early_param()
      as in start_kernel(), so when the "debug_pagealloc=on" option is parsed,
      it is safe to enable the static key.
      
      However, it turns out multiple architectures call parse_early_param()
      earlier from their setup_arch().  x86 also calls jump_label_init() even
      earlier, so no issue was found while testing the commit, but same is not
      true for e.g.  ppc64 and s390 where the kernel would not boot with
      debug_pagealloc=on as found by our QA.
      
      To fix this without tricky changes to init code of multiple
      architectures, this patch partially reverts the static key conversion
      from 96a2b03f.  Init-time and non-fastpath calls (such as in arch
      code) of debug_pagealloc_enabled() will again test a simple bool
      variable.  Fastpath mm code is converted to a new
      debug_pagealloc_enabled_static() variant that relies on the static key,
      which is enabled in a well-defined point in mm_init() where it's
      guaranteed that jump_label_init() has been called, regardless of
      architecture.
      
      [sfr@canb.auug.org.au: export _debug_pagealloc_enabled_early]
        Link: http://lkml.kernel.org/r/20200106164944.063ac07b@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20191219130612.23171-1-vbabka@suse.cz
      Fixes: 96a2b03f ("mm, debug_pagelloc: use static keys to enable debugging")
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Qian Cai <cai@lca.pw>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcg/slab: fix percpu slab vmstats flushing · 4a87e2a2
      Roman Gushchin committed
      Currently slab percpu vmstats are flushed twice: during the memcg
      offlining and just before freeing the memcg structure.  Each time percpu
      counters are summed, added to the atomic counterparts and propagated up
      by the cgroup tree.
      
      The second flushing is required due to how recursive vmstats are
      implemented: counters are batched in percpu variables on a local level,
      and once a percpu value crosses some predefined threshold, it spills
      over to atomic values on the local and each ascendant level.  It means
      that without flushing, some numbers cached in percpu variables will be
      dropped on the floor each time a cgroup is destroyed.  And with uptime,
      the error on upper levels might become noticeable.
      
      The first flushing aims to make counters on ancestor levels more
      precise.  Dying cgroups may remain in the dying state for a long time.
      After the kmem_cache reparenting performed during offlining, slab
      counters of the dying cgroup don't have any chance of being updated,
      because any slab operations will be performed on the parent level.  It
      means that the inaccuracy caused by percpu batching will not decrease
      up to the final destruction of the cgroup.  By the original idea,
      flushing slab counters during the offlining should minimize the visible
      inaccuracy of slab counters on the parent level.
      
      The problem is that percpu counters are not zeroed after the first
      flushing.  So every cached percpu value is summed twice.  It creates a
      small error (up to 32 pages per cpu, but usually less) which
      accumulates on the parent cgroup level.  After creating and destroying
      thousands of child cgroups, the slab counter on the parent level can be
      way off the real value.
      
      For now, let's just stop flushing slab counters on memcg offlining.  It
      can't be done correctly without scheduling a work on each cpu: reading
      and zeroing it during css offlining can race with an asynchronous
      update, which doesn't expect values to be changed underneath.
      
      With this change, slab counters on parent level will become eventually
      consistent.  Once all dying children are gone, values are correct.  And
      if not, the error is capped by 32 * NR_CPUS pages per dying cgroup.
      
      It's not perfect, as slabs are reparented, so any updates after the
      reparenting will happen on the parent level.  It means that if a slab
      page was allocated, a counter on the child level was bumped, and then
      the page was reparented and freed, the annihilation of positive and
      negative counter values will not happen until the child cgroup is
      released.  It makes slab counters different from others, and it might
      force us to implement flushing in a correct form again.  But it's also
      a question of performance: scheduling a work on each cpu isn't free,
      and it's an open question whether the benefit of having more accurate
      counters is worth it.
      
      We might also consider flushing all counters on offlining, not only slab
      counters.
      
      So let's fix the main problem now: make the slab counters eventually
      consistent, so at least the error won't grow with uptime (or more
      precisely the number of created and destroyed cgroups).  And think about
      the accuracy of counters separately.
      
      Link: http://lkml.kernel.org/r/20191220042728.1045881-1-guro@fb.com
      Fixes: bee07b33 ("mm: memcontrol: flush percpu slab vmstats on kmem offlining")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 13 January 2020, 1 commit
  8. 12 January 2020, 1 commit
  9. 10 January 2020, 1 commit
  10. 09 January 2020, 1 commit
    • macvlan: do not assume mac_header is set in macvlan_broadcast() · 96cc4b69
      Eric Dumazet committed
      Use of eth_hdr() in the tx path is error prone.
      
      Many drivers call skb_reset_mac_header() before using it,
      but others do not.
      
      Commit 6d1ccff6 ("net: reset mac header in dev_start_xmit()")
      attempted to fix this generically, but commit d346a3fa
      ("packet: introduce PACKET_QDISC_BYPASS socket option") brought
      back the macvlan bug.
      
      Let's add a new helper, so that tx paths no longer have
      to call skb_reset_mac_header() only to get a pointer
      to skb->data.
      
      Hopefully we will be able to revert 6d1ccff6
      ("net: reset mac header in dev_start_xmit()") and save a few cycles
      in the transmit fast path.
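      
      A sketch of such a helper (reading the Ethernet header straight from
      skb->data, which the tx path owns, instead of relying on mac_header):
      
        static inline struct ethhdr *skb_eth_hdr(const struct sk_buff *skb)
        {
                return (struct ethhdr *)skb->data;
        }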
      
      BUG: KASAN: use-after-free in __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
      BUG: KASAN: use-after-free in mc_hash drivers/net/macvlan.c:251 [inline]
      BUG: KASAN: use-after-free in macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
      Read of size 4 at addr ffff8880a4932401 by task syz-executor947/9579
      
      CPU: 0 PID: 9579 Comm: syz-executor947 Not tainted 5.5.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
       __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
       kasan_report+0x12/0x20 mm/kasan/common.c:639
       __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:145
       __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
       mc_hash drivers/net/macvlan.c:251 [inline]
       macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
       macvlan_queue_xmit drivers/net/macvlan.c:520 [inline]
       macvlan_start_xmit+0x402/0x77f drivers/net/macvlan.c:559
       __netdev_start_xmit include/linux/netdevice.h:4447 [inline]
       netdev_start_xmit include/linux/netdevice.h:4461 [inline]
       dev_direct_xmit+0x419/0x630 net/core/dev.c:4079
       packet_direct_xmit+0x1a9/0x250 net/packet/af_packet.c:240
       packet_snd net/packet/af_packet.c:2966 [inline]
       packet_sendmsg+0x260d/0x6220 net/packet/af_packet.c:2991
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:659
       __sys_sendto+0x262/0x380 net/socket.c:1985
       __do_sys_sendto net/socket.c:1997 [inline]
       __se_sys_sendto net/socket.c:1993 [inline]
       __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1993
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x442639
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffc13549e08 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000442639
      RDX: 000000000000000e RSI: 0000000020000080 RDI: 0000000000000003
      RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000403bb0 R14: 0000000000000000 R15: 0000000000000000
      
      Allocated by task 9389:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       __kasan_kmalloc mm/kasan/common.c:513 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
       kasan_kmalloc+0x9/0x10 mm/kasan/common.c:527
       __do_kmalloc mm/slab.c:3656 [inline]
       __kmalloc+0x163/0x770 mm/slab.c:3665
       kmalloc include/linux/slab.h:561 [inline]
       tomoyo_realpath_from_path+0xc5/0x660 security/tomoyo/realpath.c:252
       tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
       tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
       tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
       security_inode_getattr+0xf2/0x150 security/security.c:1222
       vfs_getattr+0x25/0x70 fs/stat.c:115
       vfs_statx_fd+0x71/0xc0 fs/stat.c:145
       vfs_fstat include/linux/fs.h:3265 [inline]
       __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
       __se_sys_newfstat fs/stat.c:375 [inline]
       __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 9389:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       kasan_set_free_info mm/kasan/common.c:335 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
       __cache_free mm/slab.c:3426 [inline]
       kfree+0x10a/0x2c0 mm/slab.c:3757
       tomoyo_realpath_from_path+0x1a7/0x660 security/tomoyo/realpath.c:289
       tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
       tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
       tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
       security_inode_getattr+0xf2/0x150 security/security.c:1222
       vfs_getattr+0x25/0x70 fs/stat.c:115
       vfs_statx_fd+0x71/0xc0 fs/stat.c:145
       vfs_fstat include/linux/fs.h:3265 [inline]
       __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
       __se_sys_newfstat fs/stat.c:375 [inline]
       __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff8880a4932000
       which belongs to the cache kmalloc-4k of size 4096
      The buggy address is located 1025 bytes inside of
       4096-byte region [ffff8880a4932000, ffff8880a4933000)
      The buggy address belongs to the page:
      page:ffffea0002924c80 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0
      raw: 00fffe0000010200 ffffea0002846208 ffffea00028f3888 ffff8880aa402000
      raw: 0000000000000000 ffff8880a4932000 0000000100000001 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880a4932300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880a4932380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8880a4932400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                         ^
       ffff8880a4932480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880a4932500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: b863ceb7 ("[NET]: Add macvlan driver")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 07 January 2020, 1 commit