1. 26 Jul, 2020 20 commits
    • Y
      bpf: Implement bpf iterator for array maps · d3cc2ab5
      Yonghong Song committed
      The bpf iterators for array and percpu array
      are implemented. Similar to hash maps, for percpu
      array map, bpf program will receive values
      from all cpus.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200723184115.590532-1-yhs@fb.com
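For per-CPU maps, "values from all cpus" means the program sees one value slot per CPU in a single buffer. The following is a minimal user-space sketch of how such a buffer could be indexed. It assumes each slot is padded to a multiple of 8 bytes; that padding and the helper names `round_up8`, `percpu_buf_size` and `percpu_slot` are illustrative assumptions, not taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of 8 (assumed per-CPU slot padding). */
static size_t round_up8(size_t n)
{
    return (n + 7) & ~(size_t)7;
}

/* Total size of a buffer holding one padded value slot per CPU. */
static size_t percpu_buf_size(size_t value_size, unsigned int ncpus)
{
    return round_up8(value_size) * ncpus;
}

/* Address of the slot that belongs to `cpu` inside the combined buffer. */
static const void *percpu_slot(const void *buf, size_t value_size,
                               unsigned int cpu, unsigned int ncpus)
{
    assert(cpu < ncpus);
    return (const char *)buf + round_up8(value_size) * cpu;
}
```

With a 12-byte value and 4 CPUs, each slot would occupy 16 bytes and the slot for CPU 2 would start at byte 32 of the buffer.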
    • Y
      bpf: Implement bpf iterator for hash maps · d6c4503c
      Yonghong Song committed
      The bpf iterators for hash, percpu hash, lru hash
      and lru percpu hash are implemented. During link time,
      bpf_iter_reg->check_target() will check map type
      and ensure the program access key/value region is
      within the map defined key/value size limit.
      
      For percpu hash and lru percpu hash maps, the bpf program
      will receive values for all cpus. The map element
      bpf iterator infrastructure will prepare value
      properly before passing the value pointer to the
      bpf program.
      
      This patch set supports readonly map keys and
      read/write map values. It does not support deleting
      map elements, e.g., from hash tables. If there is
      a use case for this, the following mechanism can
      be used to support map deletion for hashtab, etc.
        - permit a new bpf program return value, e.g., 2,
          to let bpf iterator know the map element should
          be removed.
        - since bucket lock is taken, the map element will be
          queued.
        - once bucket lock is released after all elements under
          this bucket are traversed, all to-be-deleted map
          elements can be deleted.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200723184114.590470-1-yhs@fb.com
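The deferred-deletion idea outlined in d6c4503c (queue elements while the bucket lock is held, delete them once the bucket has been fully traversed) can be sketched in user-space C. This is only an illustration under stated assumptions: the list type, the `ITER_DELETE` return value of 2 (the commit message only says "e.g., 2"), and all function names are hypothetical, and no real locking is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ITER_KEEP   0
#define ITER_DELETE 2   /* hypothetical "remove this element" return value */

struct elem {
    int key;
    struct elem *next;      /* bucket chain */
    struct elem *del_next;  /* pending-deletion queue */
};

typedef int (*visit_fn)(const struct elem *e);

/* Prepend a new element to the bucket chain. */
static void push_elem(struct elem **head, int key)
{
    struct elem *e = malloc(sizeof(*e));
    assert(e);
    e->key = key;
    e->del_next = NULL;
    e->next = *head;
    *head = e;
}

/* Unlink one element from the bucket chain. */
static void unlink_elem(struct elem **head, struct elem *victim)
{
    for (struct elem **pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            return;
        }
    }
}

/* Visit every element, then delete the queued ones; returns the delete count. */
static size_t traverse_bucket(struct elem **head, visit_fn visit)
{
    struct elem *pending = NULL;
    size_t deleted = 0;

    assert(head && visit);

    /* Phase 1: the bucket lock would be held here, so elements are only queued. */
    for (struct elem *e = *head; e; e = e->next) {
        if (visit(e) == ITER_DELETE) {
            e->del_next = pending;
            pending = e;
        }
    }

    /* Phase 2: the lock would have been released; now delete for real. */
    while (pending) {
        struct elem *victim = pending;
        pending = victim->del_next;
        unlink_elem(head, victim);
        free(victim);
        deleted++;
    }
    return deleted;
}

/* Example visitor: request deletion of elements with even keys. */
static int delete_even_keys(const struct elem *e)
{
    return (e->key % 2 == 0) ? ITER_DELETE : ITER_KEEP;
}
```

Splitting the work into two phases keeps the chain stable while the visitor runs, which is the property the commit message relies on.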
    • Y
      bpf: Implement bpf iterator for map elements · a5cbe05a
      Yonghong Song committed
      The bpf iterator for map elements is implemented.
      The bpf program will receive four parameters:
        bpf_iter_meta *meta: the meta data
        bpf_map *map:        the bpf_map whose elements are traversed
        void *key:           the key of one element
        void *value:         the value of the same element
      
      Here, meta and map pointers are always valid, and
      key has register type PTR_TO_RDONLY_BUF_OR_NULL and
      value has register type PTR_TO_RDWR_BUF_OR_NULL.
      The kernel will track the access range of key and value
      during verification time. Later, these values will be compared
      against the values in the actual map to ensure all accesses
      are within range.
      
      A new field iter_seq_info is added to bpf_map_ops which
      is used to add map type specific information, i.e., seq_ops,
      init/fini seq_file func and seq_file private data size.
      Subsequent patches will have actual implementation
      for bpf_map_ops->iter_seq_info.
      
      In user space, BPF_ITER_LINK_MAP_FD needs to be
      specified in prog attr->link_create.flags, which indicates
      that attr->link_create.target_fd is a map_fd.
      The reason for such an explicit flag is for possible
      future cases where one bpf iterator may allow more than
      one possible customization, e.g., pid and cgroup id for
      task_file.
      
      Current kernel internal implementation only allows
      the target to register at most one required bpf_iter_link_info.
      To support the above case, optional bpf_iter_link_info's
      are needed, the target can be extended to register such link
      infos, and user provided link_info needs to match one of
      target supported ones.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200723184112.590360-1-yhs@fb.com
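The two-step range check described in a5cbe05a (track the accessed range of key and value at verification time, compare it with the actual map later) can be illustrated with a small user-space model. The struct and function names below are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of the furthest byte a program touches in a buffer. */
struct access_range {
    uint32_t max_end;   /* largest (offset + size) seen so far */
};

/* "Verification time": note one access of `size` bytes at offset `off`. */
static void record_access(struct access_range *r, uint32_t off, uint32_t size)
{
    uint32_t end = off + size;

    assert(end >= off);             /* reject arithmetic wrap-around */
    if (end > r->max_end)
        r->max_end = end;
}

/* "Link time": accept only if every recorded access fits the real size. */
static bool range_fits(const struct access_range *r, uint32_t actual_size)
{
    return r->max_end <= actual_size;
}
```

A program that reads 8 bytes at offset 8 of the value records a range end of 16, so in this model it could be linked to a map with a 16-byte value but not to one with an 8-byte value.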
    • Y
      bpf: Support readonly/readwrite buffers in verifier · afbf21dc
      Yonghong Song committed
      Readonly and readwrite buffer register states
      are introduced. In total, four states,
      PTR_TO_RDONLY_BUF[_OR_NULL] and PTR_TO_RDWR_BUF[_OR_NULL]
      are supported. As suggested by their respective
      names, PTR_TO_RDONLY_BUF[_OR_NULL] are for
      readonly buffers and PTR_TO_RDWR_BUF[_OR_NULL]
      for read/write buffers.
      
      These new register states will be used
      by later bpf map element iterator.
      
      The new register states resemble PTR_TO_TP_BUFFER
      in that the accessed buffer size is calculated
      at verification time. That size is compared
      against other metrics later, at
      attach/link_create time.
      
      Similar to reg_state PTR_TO_BTF_ID_OR_NULL in bpf
      iterator programs, PTR_TO_RDONLY_BUF_OR_NULL or
      PTR_TO_RDWR_BUF_OR_NULL reg_types can be set at
      prog->aux->bpf_ctx_arg_aux, and bpf verifier will
      retrieve the values during btf_ctx_access().
      Later bpf map element iterator implementation
      will show how such information will be assigned
      during target registration time.
      
      The verifier is also enhanced such that PTR_TO_RDONLY_BUF
      can be passed to ARG_PTR_TO_MEM[_OR_NULL] helper argument, and
      PTR_TO_RDWR_BUF can be passed to ARG_PTR_TO_MEM[_OR_NULL] or
      ARG_PTR_TO_UNINIT_MEM.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200723184111.590274-1-yhs@fb.com
    • Y
      bpf: Refactor to provide aux info to bpf_iter_init_seq_priv_t · f9c79272
      Yonghong Song committed
      This patch refactored target bpf_iter_init_seq_priv_t callback
      function to accept additional information. This will be needed
      in later patches for map element targets since a particular
      map should be passed to traverse elements for that particular
      map. In the future, other information may be passed to target
      as well, e.g., pid, cgroup id, etc. to customize the iterator.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200723184110.590156-1-yhs@fb.com
    • Y
      bpf: Refactor bpf_iter_reg to have separate seq_info member · 14fc6bd6
      Yonghong Song committed
      There is no functionality change for this patch.
      Struct bpf_iter_reg is used to register a bpf_iter target,
      which includes information for prog_load, link_create
      and seq_file creation.
      
      This patch puts fields related to seq_file creation into
      a different structure. This will be useful for map
      elements iterator where one iterator covers different
      map types and different map types may have different
      seq_ops, init/fini private_data function and
      private_data size.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200723184109.590030-1-yhs@fb.com
    • A
      bpf: Add bpf_prog iterator · a228a64f
      Alexei Starovoitov committed
      It's mostly a copy paste of commit 6086d29d ("bpf: Add bpf_map iterator")
      that is used to implement bpf_seq_file operations to traverse all bpf programs.
      
      v1->v2: Tweak to use build time btf_id
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    • Y
      bpf: Fix pos computation for bpf_iter seq_ops->start() · 3f9969f2
      Yonghong Song committed
      Currently, the pos pointer in bpf iterator map/task/task_file
      seq_ops->start() is always incremented.
      This is incorrect. It should be incremented only if
      *pos is 0 (for SEQ_START_TOKEN), since these start()
      functions actually return the first real object.
      If *pos is not 0, start() merely finds the object
      based on the state in seq->private and does not really
      advance *pos. This patch fixes the issue
      by incrementing *pos only if it is 0.
      
      Note that the old *pos calculation, although not
      correct, does not affect correctness of bpf_iter
      as bpf_iter seq_file->read() does not support llseek.
      
      This patch also renamed "mid" in bpf_map iterator
      seq_file private data to "map_id" for better clarity.
      
      Fixes: 6086d29d ("bpf: Add bpf_map iterator")
      Fixes: eaaacd23 ("bpf: Add task and task/file iterator targets")
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200722195156.4029817-1-yhs@fb.com
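The position rule from 3f9969f2 (advance *pos in start() only when it is 0) can be modelled in user space. The sketch below is not kernel code: `pos_t` stands in for `loff_t`, and the object table, `struct iter_priv`, `iter_start` and `iter_next` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long long pos_t;            /* stands in for loff_t */

/* Hypothetical private iterator state: index of the current object. */
struct iter_priv {
    size_t idx;
};

static const int objects[] = { 10, 20, 30 };
#define NOBJ (sizeof(objects) / sizeof(objects[0]))

/* Model of seq_ops->start(): locate the current object from private state. */
static const int *iter_start(struct iter_priv *priv, pos_t *pos)
{
    assert(priv && pos);
    if (priv->idx >= NOBJ)
        return NULL;
    /* Position 0 is the start token; returning the first real object
     * consumes it. On any later call start() only re-finds the current
     * object, so *pos must stay unchanged. */
    if (*pos == 0)
        ++*pos;
    return &objects[priv->idx];
}

/* Model of seq_ops->next(): always advances both position and state. */
static const int *iter_next(struct iter_priv *priv, pos_t *pos)
{
    assert(priv && pos);
    ++*pos;
    ++priv->idx;
    return priv->idx < NOBJ ? &objects[priv->idx] : NULL;
}
```

Calling `iter_start()` a second time after one `iter_next()` returns the same object again and leaves the position at 2, which is the behaviour the fix aims for.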
    • J
      selftests/bpf: Test BPF socket lookup and reuseport with connections · 86176a18
      Jakub Sitnicki committed
      Cover the case when BPF socket lookup returns a socket that belongs to a
      reuseport group, and the reuseport group contains connected UDP sockets.
      
      Ensure that the presence of connected UDP sockets in reuseport group does
      not affect the socket lookup result. Socket selected by reuseport should
      always be used as result in such case.
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Link: https://lore.kernel.org/bpf/20200722161720.940831-3-jakub@cloudflare.com
    • J
      udp: Don't discard reuseport selection when group has connections · c8a2983c
      Jakub Sitnicki committed
      When BPF socket lookup prog selects a socket that belongs to a reuseport
      group, and the reuseport group has connected sockets in it, the socket
      selected by reuseport will be discarded, and socket returned by BPF socket
      lookup will be used instead.
      
      Modify this behavior so that the socket selected by reuseport running after
      BPF socket lookup always gets used. Ignore the fact that the reuseport
      group might have connections because it is only relevant when scoring
      sockets during regular hashtable-based lookup.
      
      Fixes: 72f7e944 ("udp: Run SK_LOOKUP BPF program on socket lookup")
      Fixes: 6d4201b1 ("udp6: Run SK_LOOKUP BPF program on socket lookup")
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Link: https://lore.kernel.org/bpf/20200722161720.940831-2-jakub@cloudflare.com
    • A
      tools/bpftool: Strip BPF .o files before skeleton generation · f3c93a93
      Andrii Nakryiko committed
      Strip away DWARF info from .bpf.o files, before generating BPF skeletons.
      This reduces bpftool binary size from 3.43MB to 2.58MB.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20200722043804.2373298-1-andriin@fb.com
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · a57066b1
      David S. Miller committed
      The UDP reuseport conflict was a little bit tricky.
      
      The net-next code, via bpf-next, extracted the reuseport handling
      into a helper so that the BPF sk lookup code could invoke it.
      
      At the same time, the logic for reuseport handling of unconnected
      sockets changed via commit efc6b6f6
      which changed the logic to carry on the reuseport result into the
      rest of the lookup loop if we do not return immediately.
      
      This requires moving the reuseport_has_conns() logic into the callers.
      
      While we are here, get rid of inline directives as they do not belong
      in foo.c files.
      
      The other changes were cases of more straightforward overlapping
      modifications.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • L
      Merge tag 'riscv-for-linus-5.8-rc7' of... · 04300d66
      Linus Torvalds committed
      Merge tag 'riscv-for-linus-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into master
      
      Pull RISC-V fixes from Palmer Dabbelt:
       "A few more fixes this week:
      
         - A fix to avoid using SBI calls during kasan initialization, as the
           SBI calls themselves have not been probed yet.
      
         - Three fixes related to systems with multiple memory regions"
      
      * tag 'riscv-for-linus-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Parse all memory blocks to remove unusable memory
        RISC-V: Do not rely on initrd_start/end computed during early dt parsing
        RISC-V: Set maximum number of mapped pages correctly
        riscv: kasan: use local_tlb_flush_all() to avoid uninitialized __sbi_rfence
    • L
      Merge tag 'x86-urgent-2020-07-25' of... · fbe0d451
      Linus Torvalds committed
      Merge tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
      
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes:
      
         - Fix a section end page alignment assumption that was causing
           crashes
      
         - Fix ORC unwinding on freshly forked tasks which haven't executed
           yet and which have empty user task stacks
      
         - Fix the debug.exception-trace=1 sysctl dumping of user stacks,
           which was broken by recent maccess changes"
      
      * tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/dumpstack: Dump user space code correctly again
        x86/stacktrace: Fix reliable check for empty user task stacks
        x86/unwind/orc: Fix ORC for newly forked tasks
        x86, vmlinux.lds: Page-align end of ..page_aligned sections
    • L
      Merge tag 'perf-urgent-2020-07-25' of... · 78b1afe2
      Linus Torvalds committed
      Merge tag 'perf-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
      
      Pull uprobe fix from Ingo Molnar:
       "Fix an interaction/regression between uprobes based shared library
        tracing & GDB"
      
      * tag 'perf-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression
    • L
      Merge tag 'timers-urgent-2020-07-25' of... · a7b36c2b
      Linus Torvalds committed
      Merge tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
      
      Pull timer fix from Ingo Molnar:
       "Fix a suspend/resume regression (crash) on TI AM3/AM4 SoC's"
      
      * tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4
    • L
      Merge tag 'sched-urgent-2020-07-25' of... · 3077805e
      Linus Torvalds committed
      Merge tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
      
      Pull scheduler fixes from Ingo Molnar:
       "Fix a race introduced by the recent loadavg race fix, plus add a debug
        check for a hard to debug case of bogus wakeup function flags"
      
      * tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Warn if garbage is passed to default_wake_function()
        sched: Fix race against ptrace_freeze_trace()
    • L
      Merge tag 'efi-urgent-2020-07-25' of... · 17baa442
      Linus Torvalds committed
      Merge tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
      
      Pull EFI fixes from Ingo Molnar:
       "Various EFI fixes:
      
         - Fix the layering violation in the use of the EFI runtime services
           availability mask in users of the 'efivars' abstraction
      
         - Revert build fix for GCC v4.8 which is no longer supported
      
         - Clean up some x86 EFI stub details, some of which are borderline
           bugs that copy around garbage into padding fields - let's fix these
           out of caution.
      
         - Fix build issues while working on RISC-V support
      
         - Avoid --whole-archive when linking the stub on arm64"
      
      * tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Revert "efi/x86: Fix build with gcc 4"
        efi/efivars: Expose RT service availability via efivars abstraction
        efi/libstub: Move the function prototypes to header file
        efi/libstub: Fix gcc error around __umoddi3 for 32 bit builds
        efi/libstub/arm64: link stub lib.a conditionally
        efi/x86: Only copy upto the end of setup_header
        efi/x86: Remove unused variables
    • L
      Merge tag '5.8-rc6-cifs-fix' of git://git.samba.org/sfrench/cifs-2.6 into master · 7cb3a5c5
      Linus Torvalds committed
      Pull cifs fix from Steve French:
       "A fix for a recently discovered regression in rename to older servers
        caused by a recent patch"
      
      * tag '5.8-rc6-cifs-fix' of git://git.samba.org/sfrench/cifs-2.6:
        Revert "cifs: Fix the target file was deleted when rename failed."
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net into master · 1b64b2e2
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix RCU locking in iwlwifi, from Johannes Berg.
      
       2) mt76 can access uninitialized NAPI struct, from Felix Fietkau.
      
       3) Fix race in updating pause settings in bnxt_en, from Vasundhara
          Volam.
      
       4) Propagate error return properly during unbind failures in ax88172a,
          from George Kennedy.
      
       5) Fix memleak in adf7242_probe, from Liu Jian.
      
       6) smc_drv_probe() can leak, from Wang Hai.
      
       7) Don't muck with the carrier state if register_netdevice() fails in
          the bonding driver, from Taehee Yoo.
      
       8) Fix memleak in dpaa_eth_probe, from Liu Jian.
      
       9) Need to check skb_put_padto() return value in hsr_fill_tag(), from
          Murali Karicheri.
      
      10) Don't lose ionic RSS hash settings across FW update, from Shannon
          Nelson.
      
      11) Fix clobbered SKB control block in act_ct, from Wen Xu.
      
      12) Missing newline in "tx_timeout" sysfs output, from Xiongfeng Wang.
      
      13) IS_UDPLITE cleanup a long time ago, incorrectly handled
          transformations involving UDPLITE_RECV_CC. From Miaohe Lin.
      
      14) Unbalanced locking in netdevsim, from Taehee Yoo.
      
      15) Suppress false-positive error messages in qed driver, from Alexander
          Lobakin.
      
      16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye.
      
      17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost.
      
      18) Uninitialized value in geneve_changelink(), from Cong Wang.
      
      19) Fix deadlock in xen-netfront, from Andrea Righi.
      
      20) flush_backlog() frees skbs with IRQs disabled, so should use
          dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov
          Kasiviswanathan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
        drivers/net/wan: lapb: Corrected the usage of skb_cow
        dev: Defer free of skbs in flush_backlog
        qrtr: orphan socket in qrtr_release()
        xen-netfront: fix potential deadlock in xennet_remove()
        flow_offload: Move rhashtable inclusion to the source file
        geneve: fix an uninitialized value in geneve_changelink()
        bonding: check return value of register_netdevice() in bond_newlink()
        tcp: allow at most one TLP probe per flight
        AX.25: Prevent integer overflows in connect and sendmsg
        cxgb4: add missing release on skb in uld_send()
        net: atlantic: fix PTP on AQC10X
        AX.25: Prevent out-of-bounds read in ax25_sendmsg()
        sctp: shrink stream outq when fails to do addstream reconf
        sctp: shrink stream outq only when new outcnt < old outcnt
        AX.25: Fix out-of-bounds read in ax25_connect()
        enetc: Remove the mdio bus on PF probe bailout
        net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
        net: ethernet: ave: Fix error returns in ave_init
        drivers/net/wan/x25_asy: Fix to make it work
        ipvs: fix the connection sync failed in some cases
        ...
  2. 25 Jul, 2020 20 commits
    • A
      riscv: Parse all memory blocks to remove unusable memory · fa5a1983
      Atish Patra committed
      Currently, maximum physical memory allowed is equal to -PAGE_OFFSET.
      That's why we remove any memory blocks spanning beyond that size. However,
      it is done only for memblock containing linux kernel which will not work
      if there are multiple memblocks.
      
      Process all memory blocks to figure out how much memory needs to be removed
      and remove at the end instead of updating the memblock list in place.
      Signed-off-by: Atish Patra <atish.patra@wdc.com>
      Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
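The approach in fa5a1983 (walk all memory blocks first, then remove the excess once at the end) can be sketched as a pure computation. This is a simplified user-space model, not the kernel implementation: it assumes the blocks are sorted by base address, treats the limit as a span measured from the first block's base, and uses hypothetical names (`struct mem_block`, `struct trim`, `compute_trim`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mem_block {
    uint64_t base;
    uint64_t size;
};

/* Range to remove once all blocks are scanned; len == 0 means nothing. */
struct trim {
    uint64_t start;
    uint64_t len;
};

/* Scan every block (assumed sorted by base), then decide what to cut. */
static struct trim compute_trim(const struct mem_block *blk, size_t n,
                                uint64_t max_size)
{
    struct trim t = { 0, 0 };
    uint64_t end = 0, limit;

    assert(blk && n > 0);
    assert(blk[0].base <= UINT64_MAX - max_size);   /* no wrap-around */

    /* Pass over all blocks without modifying the list in place. */
    for (size_t i = 0; i < n; i++)
        end = blk[i].base + blk[i].size;

    /* Everything above base + max_size is considered unusable. */
    limit = blk[0].base + max_size;
    if (end > limit) {
        t.start = limit;
        t.len = end - limit;
    }
    return t;
}
```

The caller would apply the returned range in a single removal step, which avoids editing the block list while iterating over it.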
    • A
      RISC-V: Do not rely on initrd_start/end computed during early dt parsing · 4400231c
      Atish Patra committed
      Currently, initrd_start/end are computed during early_init_dt_scan
      but used during arch_setup. We will get the following panic if initrd is used
      and CONFIG_DEBUG_VIRTUAL is turned on.
      
      [    0.000000] ------------[ cut here ]------------
      [    0.000000] kernel BUG at arch/riscv/mm/physaddr.c:33!
      [    0.000000] Kernel BUG [#1]
      [    0.000000] Modules linked in:
      [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-rc4-00015-ged0b226fed02 #886
      [    0.000000] epc: ffffffe0002058d2 ra : ffffffe0000053f0 sp : ffffffe001001f40
      [    0.000000]  gp : ffffffe00106e250 tp : ffffffe001009d40 t0 : ffffffe00107ee28
      [    0.000000]  t1 : 0000000000000000 t2 : ffffffe000a2e880 s0 : ffffffe001001f50
      [    0.000000]  s1 : ffffffe0001383e8 a0 : ffffffe00c087e00 a1 : 0000000080200000
      [    0.000000]  a2 : 00000000010bf000 a3 : ffffffe00106f3c8 a4 : ffffffe0010bf000
      [    0.000000]  a5 : ffffffe000000000 a6 : 0000000000000006 a7 : 0000000000000001
      [    0.000000]  s2 : ffffffe00106f068 s3 : ffffffe00106f070 s4 : 0000000080200000
      [    0.000000]  s5 : 0000000082200000 s6 : 0000000000000000 s7 : 0000000000000000
      [    0.000000]  s8 : 0000000080011010 s9 : 0000000080012700 s10: 0000000000000000
      [    0.000000]  s11: 0000000000000000 t3 : 000000000001fe30 t4 : 000000000001fe30
      [    0.000000]  t5 : 0000000000000000 t6 : ffffffe00107c471
      [    0.000000] status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003
      [    0.000000] random: get_random_bytes called from print_oops_end_marker+0x22/0x46 with crng_init=0
      
      To avoid the error, initrd_start/end can be computed from phys_initrd_start/size
      in setup itself. It also improves the initrd placement by aligning the start
      and size with the page size.
      
      Fixes: 76d2a049 ("RISC-V: Init and Halt Code")
      Signed-off-by: Atish Patra <atish.patra@wdc.com>
      Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
    • X
      drivers/net/wan: lapb: Corrected the usage of skb_cow · 8754e137
      Xie He committed
      This patch fixed 2 issues with the usage of skb_cow in LAPB drivers
      "lapbether" and "hdlc_x25":
      
      1) After skb_cow fails, kfree_skb should be called to drop a reference
      to the skb. But in both drivers, kfree_skb is not called.
      
      2) skb_cow should be called before skb_push so that it can ensure the
      safety of skb_push. But in "lapbether", it is incorrectly called after
      skb_push.
      
      More details about these 2 issues:
      
      1) The behavior of calling kfree_skb on failure is also the behavior of
      netif_rx, which is called by this function with "return netif_rx(skb);".
      So this function should follow this behavior, too.
      
      2) In "lapbether", skb_cow is called after skb_push. This results in 2
      logical issues:
         a) skb_push is not protected by skb_cow;
         b) An extra headroom of 1 byte is ensured after skb_push. This extra
            headroom has no use in this function. It also has no use in the
            upper-layer function that this function passes the skb to
            (x25_lapb_receive_frame in net/x25/x25_dev.c).
      So logically skb_cow should instead be called before skb_push.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Martin Schiller <ms@dev.tdt.de>
      Signed-off-by: Xie He <xie.he.0141@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
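The ordering rule from 8754e137 (make sure headroom exists before pushing a header, and free the buffer if that step fails) can be shown with a toy packet type. Nothing here is the kernel sk_buff API: `struct pkt` and the `pkt_*` helpers are hypothetical stand-ins, with `pkt_ensure_headroom` playing the role of skb_cow and `pkt_push` the role of skb_push.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct pkt {
    unsigned char *buf;     /* start of the allocation */
    size_t headroom;        /* unused bytes in front of the data */
    size_t len;             /* bytes of data */
};

static struct pkt *pkt_alloc(size_t headroom, const void *data, size_t len)
{
    struct pkt *p;

    assert(data && len > 0);
    p = malloc(sizeof(*p));
    if (!p)
        return NULL;
    p->buf = malloc(headroom + len);
    if (!p->buf) {
        free(p);
        return NULL;
    }
    memcpy(p->buf + headroom, data, len);
    p->headroom = headroom;
    p->len = len;
    return p;
}

static void pkt_free(struct pkt *p)
{
    free(p->buf);
    free(p);
}

/* Stand-in for skb_cow: guarantee at least `need` bytes of headroom. */
static int pkt_ensure_headroom(struct pkt *p, size_t need)
{
    unsigned char *nbuf;

    if (p->headroom >= need)
        return 0;
    nbuf = malloc(need + p->len);
    if (!nbuf)
        return -1;
    memcpy(nbuf + need, p->buf + p->headroom, p->len);
    free(p->buf);
    p->buf = nbuf;
    p->headroom = need;
    return 0;
}

/* Stand-in for skb_push: claim `n` bytes of headroom for a new header. */
static unsigned char *pkt_push(struct pkt *p, size_t n)
{
    assert(p->headroom >= n);   /* only safe after pkt_ensure_headroom() */
    p->headroom -= n;
    p->len += n;
    return p->buf + p->headroom;
}

/* Correct order: ensure headroom first, free on failure, then push. */
static int pkt_prepend_byte(struct pkt *p, unsigned char b)
{
    if (pkt_ensure_headroom(p, 1) < 0) {
        pkt_free(p);            /* drop the packet on failure */
        return -1;
    }
    *pkt_push(p, 1) = b;
    return 0;
}
```

Running the headroom check after the push would leave the push itself unprotected, which is the first logical issue the commit message describes for "lapbether".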
    • D
      Merge branch 'net-dsa-mv88e6xxx-port-mtu-support' · dfecd3e0
      David S. Miller committed
      Chris Packham says:
      
      ====================
      net: dsa: mv88e6xxx: port mtu support
      
      This series connects up the mv88e6xxx switches to the dsa infrastructure for
      configuring the port MTU. The first patch is also a bug fix which might be a
      candidate for stable.
      
      I've rebased this series on top of net-next/master to pick up Andrew's change
      for the gigabit switches. Patch 1 and 2 are unchanged (aside from adding
      Andrew's Reviewed-by). Patch 3 is reworked to make use of the existing mtu
      support.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • C
      net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU · 1baf0fac
      Chris Packham committed
      Some of the chips in the mv88e6xxx family don't support jumbo
      configuration per port. But they do have a chip-wide max frame size that
      can be used. Use this to approximate the behaviour of configuring a port
      based MTU.
      Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • C
      net: dsa: mv88e6xxx: Support jumbo configuration on 6190/6190X · e8b34c67
      Chris Packham committed
      The MV88E6190 and MV88E6190X both support per port jumbo configuration
      just like the other GE switches. Install the appropriate ops.
      Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • C
      net: dsa: mv88e6xxx: MV88E6097 does not support jumbo configuration · 0f3c66a3
      Chris Packham committed
      The MV88E6097 chip does not support configuring jumbo frames. Prior to
      commit 5f436666 only the 6352, 6351, 6165 and 6320 chips configured
      jumbo mode. The refactor accidentally added the function for the 6097.
      Remove the erroneous function pointer assignment.
      
      Fixes: 5f436666 ("net: dsa: mv88e6xxx: Refactor setting of jumbo frames")
      Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      dev: Defer free of skbs in flush_backlog · 7df5cb75
      Subash Abhinov Kasiviswanathan committed
      IRQs are disabled when freeing skbs in input queue.
      Use the IRQ safe variant to free skbs here.
      
      Fixes: 145dd5f9 ("net: flush the softnet backlog in process context")
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      RISC-V: Set maximum number of mapped pages correctly · d0d8aae6
      Atish Patra committed
      Currently, the maximum number of mapped pages is set to the pfn calculated
      from the memblock size of the memblock containing kernel. This will work
      until that memblock spans the entire memory. However, it will be set to
      a wrong value if there are multiple memblocks defined in kernel
      (e.g. with efi runtime services).
      
      Set the maximum value to the pfn calculated from dram size.
      Signed-off-by: Atish Patra <atish.patra@wdc.com>
      Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
    • L
      Merge tag 'pci-v5.8-fixes-2' of... · 23ee3e4e
      Linus Torvalds committed
      Merge tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into master
      
      Pull PCI fixes from Bjorn Helgaas:
      
       - Reject invalid IRQ 0 command line argument for virtio_mmio because
         IRQ 0 now generates warnings (Bjorn Helgaas)
      
       - Revert "PCI/PM: Assume ports without DLL Link Active train links in
         100 ms", which broke nouveau (Bjorn Helgaas)
      
      * tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
        virtio-mmio: Reject invalid IRQ 0 command line argument
    • C
      qrtr: orphan socket in qrtr_release() · af9f691f
      Cong Wang committed
      We have to detach sock from socket in qrtr_release(),
      otherwise skb->sk may still reference this socket
      when the skb is released in tun->queue, particularly
      sk->sk_wq still points to &sock->wq, which leads to
      a UAF.
      
      Reported-and-tested-by: syzbot+6720d64f31c081c2f708@syzkaller.appspotmail.com
      Fixes: 28fb4e59 ("net: qrtr: Expose tunneling endpoint to user space")
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • W
      net: hix5hd2_gmac: Remove unneeded cast from memory allocation · 9b964f16
      Wang Hai committed
      Remove casting the values returned by memory allocation function.
      
      Coccinelle emits WARNING:
      
      ./drivers/net/ethernet/hisilicon/hix5hd2_gmac.c:1027:9-23: WARNING:
       casting value returned by memory allocation function to (struct sg_desc *) is useless.
      
      This issue was detected by using the Coccinelle software.
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge tag 'wireless-drivers-2020-07-24' of... · 657237f5
      David S. Miller committed
      Merge tag 'wireless-drivers-2020-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for v5.8
      
      Second set of fixes for v5.8, and hopefully also the last. Three
      important regressions fixed.
      
      ath9k
      
      * fix a regression which broke support for all ath9k usb devices
      
      ath10k
      
      * fix a regression which broke support for all QCA4019 AHB devices
      
      iwlwifi
      
      * fix a regression which broke support for some Killer Wireless-AC 1550 cards
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'l2tp-avoid-multiple-assignment-remove-BUG_ON' · a8cf7d03
      David S. Miller committed
      Tom Parkin says:
      
      ====================
      l2tp: avoid multiple assignment, remove BUG_ON
      
      l2tp hasn't been kept up to date with the static analysis checks offered
      by checkpatch.pl.
      
      This patchset builds on the series: "l2tp: cleanup checkpatch.pl
      warnings" and "l2tp: further checkpatch.pl cleanups" to resolve some of
      the remaining checkpatch warnings in l2tp.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a8cf7d03
    • T
      l2tp: WARN_ON rather than BUG_ON in l2tp_session_free · ab6934e0
      Tom Parkin committed
      l2tp_session_free called BUG_ON if the tunnel magic feather value wasn't
      correct.  The intent of this was to catch lifetime bugs; for example
      early tunnel free due to incorrect use of reference counts.
      
      Since the tunnel magic feather being wrong indicates either early free
      or structure corruption, we can avoid doing more damage by simply
      leaving the tunnel structure alone.  If the tunnel refcount isn't
      dropped when it should be, the tunnel instance will remain in the
      kernel, resulting in the tunnel structure and socket leaking.
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab6934e0
    • T
      l2tp: remove BUG_ON refcount value in l2tp_session_free · 0dd62f69
      Tom Parkin committed
      l2tp_session_free is only called by l2tp_session_dec_refcount when the
      reference count reaches zero, so it's of limited value to validate the
      reference count value in l2tp_session_free itself.
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0dd62f69
    • T
      l2tp: WARN_ON rather than BUG_ON in l2tp_session_queue_purge · 493048f5
      Tom Parkin committed
      l2tp_session_queue_purge is used during session shutdown to drop any
      skbs queued for reordering purposes according to L2TP dataplane rules.
      
      The BUG_ON in this function checks the session magic feather in an
      attempt to catch lifetime bugs.
      
      Rather than crashing the kernel with a BUG_ON, we can simply WARN_ON and
      refuse to do anything more -- in the worst case this could result in a
      leak.  However this is highly unlikely given that the session purge only
      occurs from codepaths which have obtained the session by means of a lookup
      via the parent tunnel and which check the session "dead" flag to
      protect against shutdown races.
      
      While we're here, have l2tp_session_queue_purge return void rather than
      an integer, since neither of the callsites checked the return value.
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      493048f5
    • T
      l2tp: don't BUG_ON seqfile checks in l2tp_ppp · ebb4f5e6
      Tom Parkin committed
      checkpatch advises that WARN_ON and recovery code are preferred over
      BUG_ON which crashes the kernel.
      
      l2tp_ppp has a BUG_ON check of struct seq_file's private pointer in
      pppol2tp_seq_start prior to accessing data through that pointer.
      
      Rather than crashing, we can simply bail out early and return NULL in
      order to terminate the seq file processing in much the same way as we do
      when reaching the end of tunnel/session instances to render.
      
      Retain a WARN_ON to help trace possible bugs in this area.
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ebb4f5e6
    • T
      l2tp: don't BUG_ON session magic checks in l2tp_ppp · 1aa646ac
      Tom Parkin committed
      checkpatch advises that WARN_ON and recovery code are preferred over
      BUG_ON which crashes the kernel.
      
      l2tp_ppp.c's BUG_ON checks of the l2tp session structure's "magic" field
      occur in code paths where it's reasonably easy to recover:
      
       * In the case of pppol2tp_sock_to_session, we can return NULL and the
         caller will bail out appropriately.  There is no change required to
         any of the callsites of this function since they already handle
         pppol2tp_sock_to_session returning NULL.
      
       * In the case of pppol2tp_session_destruct we can just avoid
         decrementing the reference count on the suspect session structure.
         In the worst case scenario this results in a memory leak, which is
         preferable to a crash.
      
      Convert these uses of BUG_ON to WARN_ON accordingly.
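
      The two recovery paths described above can be sketched in userspace
      C. This is an illustration under stated assumptions, not the l2tp_ppp
      code: example_warn_on() is a simplified stand-in for the kernel's
      WARN_ON(), and the struct names, function names and magic value are
      all hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical magic value; not the constant used by l2tp. */
#define EXAMPLE_SESSION_MAGIC 0x5E55104EU

/* Simplified stand-in for the kernel's WARN_ON(): log a message when the
 * condition holds and return its truth value so callers can branch on it. */
static int example_warn_on(int cond, const char *what)
{
    if (cond)
        fprintf(stderr, "WARNING: %s\n", what);
    return cond;
}

struct example_session {
    unsigned int magic;
    int refcount;
};

struct example_sock {
    struct example_session *session;
};

/* Path 1: on a bad magic value, warn and return NULL. Callers are assumed
 * to handle NULL already, so no call site needs to change. */
struct example_session *example_sock_to_session(struct example_sock *sk)
{
    struct example_session *s = sk->session;

    if (!s)
        return NULL;
    if (example_warn_on(s->magic != EXAMPLE_SESSION_MAGIC, "bad session magic"))
        return NULL;
    return s;
}

/* Path 2: on a bad magic value, warn and skip the reference drop. The
 * suspect structure is left untouched; the worst case is a leak. */
void example_session_destruct(struct example_sock *sk)
{
    struct example_session *s = sk->session;

    if (!s)
        return;
    if (example_warn_on(s->magic != EXAMPLE_SESSION_MAGIC, "bad session magic"))
        return;
    s->refcount--;
}
```

      In both paths the check still leaves a trace in the log, but a
      corrupted structure no longer takes the whole system down.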
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1aa646ac
    • T
      l2tp: remove BUG_ON in l2tp_tunnel_closeall · cd3e29b3
      Tom Parkin committed
      l2tp_tunnel_closeall is only called from l2tp_core.c, and it's easy
      to statically analyse the code path calling it to validate that it
      should never be passed a NULL tunnel pointer.
      
      Having a BUG_ON checking the tunnel pointer triggers a checkpatch
      warning.  Since the BUG_ON is of no value, remove it to avoid the
      warning.
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cd3e29b3