1. 11 6月, 2022 3 次提交
    • L
      Merge tag 'net-5.19-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 68171bbd
      Linus Torvalds 提交于
      Pull networking fixes from Jakub Kicinski:
       "Quick follow up, to cleanly fast-forward net again.
      
        Current release - new code bugs:
      
         - Revert "net/mlx5e: Allow relaxed ordering over VFs"
      
        Previous releases - regressions:
      
         - seg6: fix seg6_lookup_any_nexthop() to handle VRFs using
           flowi_l3mdev
      
        Misc:
      
         - rename TLS_INFO_ZC_SENDFILE to better express the meaning"
      
      * tag 'net-5.19-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
        net: seg6: fix seg6_lookup_any_nexthop() to handle VRFs using flowi_l3mdev
        nfp: flower: restructure flow-key for gre+vlan combination
        nfp: avoid unnecessary check warnings in nfp_app_get_vf_config
        tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX
        net/mlx5: fs, fail conflicting actions
        net/mlx5: Rearm the FW tracer after each tracer event
        net/mlx5: E-Switch, pair only capable devices
        net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules
        Revert "net/mlx5e: Allow relaxed ordering over VFs"
        MAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal
      68171bbd
    • L
      Merge tag 'for-linus-5.19a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · f2ecc964
      Linus Torvalds 提交于
      Pull xen updates from Juergen Gross:
      
       - a small cleanup removing "export" of an __init function
      
       - a small series adding a new infrastructure for platform flags
      
       - a series adding generic virtio support for Xen guests (frontend side)
      
      * tag 'for-linus-5.19a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: unexport __init-annotated xen_xlate_map_ballooned_pages()
        arm/xen: Assign xen-grant DMA ops for xen-grant DMA devices
        xen/grant-dma-ops: Retrieve the ID of backend's domain for DT devices
        xen/grant-dma-iommu: Introduce stub IOMMU driver
        dt-bindings: Add xen,grant-dma IOMMU description for xen-grant DMA ops
        xen/virtio: Enable restricted memory access using Xen grant mappings
        xen/grant-dma-ops: Add option to restrict memory access under Xen
        xen/grants: support allocating consecutive grants
        arm/xen: Introduce xen_setup_dma_ops()
        virtio: replace arch_has_restricted_virtio_memory_access()
        kernel: add platform_has() infrastructure
      f2ecc964
    • L
      Merge tag 'mips-fixes_5.19_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 7d78b7eb
      Linus Torvalds 提交于
      Pull MIPS fix from Thomas Bogendoerfer:
       "Build fix for Loongson-3"
      
      * tag 'mips-fixes_5.19_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: Loongson-3: fix compile mips cpu_hwmon as module build error.
      7d78b7eb
  2. 10 6月, 2022 16 次提交
    • J
      Merge tag 'mlx5-fixes-2022-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · bf56a091
      Jakub Kicinski 提交于
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2022-06-08
      
      This series provides bug fixes to mlx5 driver.
      
      * tag 'mlx5-fixes-2022-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
        net/mlx5: fs, fail conflicting actions
        net/mlx5: Rearm the FW tracer after each tracer event
        net/mlx5: E-Switch, pair only capable devices
        net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules
        Revert "net/mlx5e: Allow relaxed ordering over VFs"
        MAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal
      ====================
      
      Link: https://lore.kernel.org/r/20220608185855.19818-1-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      bf56a091
    • A
      net: seg6: fix seg6_lookup_any_nexthop() to handle VRFs using flowi_l3mdev · a3bd2102
      Andrea Mayer 提交于
      Commit 40867d74 ("net: Add l3mdev index to flow struct and avoid oif
      reset for port devices") adds a new entry (flowi_l3mdev) in the common
      flow struct used for indicating the l3mdev index for later rule and
      table matching.
      The l3mdev_update_flow() has been adapted to properly set the
      flowi_l3mdev based on the flowi_oif/flowi_iif. In fact, when a valid
      flowi_iif is supplied to the l3mdev_update_flow(), this function can
      update the flowi_l3mdev entry only if it has not yet been set (i.e., the
      flowi_l3mdev entry is equal to 0).
      
      The SRv6 End.DT6 behavior in VRF mode leverages a VRF device in order to
      force the routing lookup into the associated routing table. This routing
      operation is performed by seg6_lookup_any_nextop() preparing a flowi6
      data structure used by ip6_route_input_lookup() which, in turn,
      (indirectly) invokes l3mdev_update_flow().
      
      However, seg6_lookup_any_nexthop() does not initialize the new
      flowi_l3mdev entry which is filled with random garbage data. This
      prevents l3mdev_update_flow() from properly updating the flowi_l3mdev
      with the VRF index, and thus SRv6 End.DT6 (VRF mode)/DT46 behaviors are
      broken.
      
      This patch correctly initializes the flowi6 instance allocated and used
      by seg6_lookup_any_nexhtop(). Specifically, the entire flowi6 instance
      is wiped out: in case new entries are added to flowi/flowi6 (as happened
      with the flowi_l3mdev entry), we should no longer have incorrectly
      initialized values. As a result of this operation, the value of
      flowi_l3mdev is also set to 0.
      
      The proposed fix can be tested easily. Starting from the commit
      referenced in the Fixes, selftests [1],[2] indicate that the SRv6
      End.DT6 (VRF mode)/DT46 behaviors no longer work correctly. By applying
      this patch, those behaviors are back to work properly again.
      
      [1] - tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh
      [2] - tools/testing/selftests/net/srv6_end_dt6_l3vpn_test.sh
      
      Fixes: 40867d74 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
      Reported-by: NAnton Makarov <am@3a-alliance.com>
      Signed-off-by: NAndrea Mayer <andrea.mayer@uniroma2.it>
      Reviewed-by: NDavid Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220608091917.20345-1-andrea.mayer@uniroma2.itSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      a3bd2102
    • J
      Merge branch 'nfp-fixes-for-v5-19' · cd3ff99b
      Jakub Kicinski 提交于
      Simon Horman says:
      
      ====================
      nfp: fixes for v5.19
      
      this short series includes two fixes for the NFP driver.
      
      1. Restructure GRE+VLAN flower offload to address a miss match
         between the NIC firmware and driver implementation which
         prevented these features from working in combination.
      
      2. Prevent unnecessary warnings regarding rate limiting support.-
         It is expected that this feature to not _always_ be present
         but this was not taken into account when the code to check
         for this feature was added.
      ====================
      
      Link: https://lore.kernel.org/r/20220608092901.124780-1-simon.horman@corigine.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      cd3ff99b
    • E
      nfp: flower: restructure flow-key for gre+vlan combination · a0b84334
      Etienne van der Linde 提交于
      Swap around the GRE and VLAN parts in the flow-key offloaded by
      the driver to fit in with other tunnel types and the firmware.
      Without this change used cases with GRE+VLAN on the outer header
      does not get offloaded as the flow-key mismatches what the
      firmware expect.
      
      Fixes: 0d630f58 ("nfp: flower: add support to offload QinQ match")
      Fixes: 5a2b9304 ("nfp: flower-ct: compile match sections of flow_payload")
      Signed-off-by: NEtienne van der Linde <etienne.vanderlinde@corigine.com>
      Signed-off-by: NLouis Peens <louis.peens@corigine.com>
      Signed-off-by: NYinjun Zhang <yinjun.zhang@corigine.com>
      Signed-off-by: NSimon Horman <simon.horman@corigine.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      a0b84334
    • F
      nfp: avoid unnecessary check warnings in nfp_app_get_vf_config · 03d5005f
      Fei Qin 提交于
      nfp_net_sriov_check is added in nfp_app_get_vf_config which intends
      to ensure ivi->vlan_proto and ivi->max_tx_rate/min_tx_rate can be
      read from VF config table only when firmware supports corresponding
      capability.
      
      However, "nfp_app_get_vf_config" can be called by commands like
      "ip a", "ip link set $DEV up" and "ip link set $DEV vf $NUM vlan
      $param" (with VF). When using commands above, many warnings
      "ndo_set_vf_<cap_x> not supported" would appear if firmware doesn't
      support VF rate limit and 802.1ad VLAN assingment. If more VFs are
      created, things could get worse.
      
      Thus, this patch add an extra bool parameter for nfp_net_sriov_check
      to enable/disable the cap check warning report. Unnecessary warnings
      in nfp_app_get_vf_config can be avoided. Valid warnings in kinds of
      vf setting function can be reserved.
      
      Fixes: e0d0e1fd ("nfp: VF rate limit support")
      Fixes: 59359597 ("nfp: support 802.1ad VLAN assingment to VF")
      Signed-off-by: NFei Qin <fei.qin@corigine.com>
      Signed-off-by: NYinjun Zhang <yinjun.zhang@corigine.com>
      Signed-off-by: NLouis Peens <louis.peens@corigine.com>
      Signed-off-by: NSimon Horman <simon.horman@corigine.com>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      03d5005f
    • M
      tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX · b489a6e5
      Maxim Mikityanskiy 提交于
      To embrace possible future optimizations of TLS, rename zerocopy
      sendfile definitions to more generic ones:
      
      * setsockopt: TLS_TX_ZEROCOPY_SENDFILE- > TLS_TX_ZEROCOPY_RO
      * sock_diag: TLS_INFO_ZC_SENDFILE -> TLS_INFO_ZC_RO_TX
      
      RO stands for readonly and emphasizes that the application shouldn't
      modify the data being transmitted with zerocopy to avoid potential
      disconnection.
      
      Fixes: c1318b39 ("tls: Add opt-in zerocopy mode of sendfile()")
      Signed-off-by: NMaxim Mikityanskiy <maximmi@nvidia.com>
      Link: https://lore.kernel.org/r/20220608153425.3151146-1-maximmi@nvidia.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      b489a6e5
    • D
      netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context · 874c8ca1
      David Howells 提交于
      While randstruct was satisfied with using an open-coded "void *" offset
      cast for the netfs_i_context <-> inode casting, __builtin_object_size() as
      used by FORTIFY_SOURCE was not as easily fooled.  This was causing the
      following complaint[1] from gcc v12:
      
        In file included from include/linux/string.h:253,
                         from include/linux/ceph/ceph_debug.h:7,
                         from fs/ceph/inode.c:2:
        In function 'fortify_memset_chk',
            inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2,
            inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2:
        include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
          242 |                         __write_overflow_field(p_size_field, size);
              |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Fix this by embedding a struct inode into struct netfs_i_context (which
      should perhaps be renamed to struct netfs_inode).  The struct inode
      vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode
      structs and vfs_inode is then simply changed to "netfs.inode" in those
      filesystems.
      
      Further, rename netfs_i_context to netfs_inode, get rid of the
      netfs_inode() function that converted a netfs_i_context pointer to an
      inode pointer (that can now be done with &ctx->inode) and rename the
      netfs_i_context() function to netfs_inode() (which is now a wrapper
      around container_of()).
      
      Most of the changes were done with:
      
        perl -p -i -e 's/vfs_inode/netfs.inode/'g \
              `git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]`
      
      Kees suggested doing it with a pair structure[2] and a special
      declarator to insert that into the network filesystem's inode
      wrapper[3], but I think it's cleaner to embed it - and then it doesn't
      matter if struct randomisation reorders things.
      
      Dave Chinner suggested using a filesystem-specific VFS_I() function in
      each filesystem to convert that filesystem's own inode wrapper struct
      into the VFS inode struct[4].
      
      Version #2:
       - Fix a couple of missed name changes due to a disabled cifs option.
       - Rename nfs_i_context to nfs_inode
       - Use "netfs" instead of "nic" as the member name in per-fs inode wrapper
         structs.
      
      [ This also undoes commit 507160f4 ("netfs: gcc-12: temporarily
        disable '-Wattribute-warning' for now") that is no longer needed ]
      
      Fixes: bc899ee1 ("netfs: Add a netfs inode context")
      Reported-by: NJeff Layton <jlayton@kernel.org>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NJeff Layton <jlayton@kernel.org>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NXiubo Li <xiubli@redhat.com>
      cc: Jonathan Corbet <corbet@lwn.net>
      cc: Eric Van Hensbergen <ericvh@gmail.com>
      cc: Latchesar Ionkov <lucho@ionkov.net>
      cc: Dominique Martinet <asmadeus@codewreck.org>
      cc: Christian Schoenebeck <linux_oss@crudebyte.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: Ilya Dryomov <idryomov@gmail.com>
      cc: Steve French <smfrench@gmail.com>
      cc: William Kucharski <william.kucharski@oracle.com>
      cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      cc: Dave Chinner <david@fromorbit.com>
      cc: linux-doc@vger.kernel.org
      cc: v9fs-developer@lists.sourceforge.net
      cc: linux-afs@lists.infradead.org
      cc: ceph-devel@vger.kernel.org
      cc: linux-cifs@vger.kernel.org
      cc: samba-technical@lists.samba.org
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-hardening@vger.kernel.org
      Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1]
      Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2]
      Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3]
      Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4]
      Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1
      Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      874c8ca1
    • Y
      MIPS: Loongson-3: fix compile mips cpu_hwmon as module build error. · 41e45640
      Yupeng Li 提交于
        set cpu_hwmon as a module build with loongson_sysconf, loongson_chiptemp
        undefined error,fix cpu_hwmon compile options to be bool.Some kernel
        compilation error information is as follows:
      
        Checking missing-syscalls for N32
        CALL    scripts/checksyscalls.sh
        Checking missing-syscalls for O32
        CALL    scripts/checksyscalls.sh
        CALL    scripts/checksyscalls.sh
        CHK     include/generated/compile.h
        CC [M]  drivers/platform/mips/cpu_hwmon.o
        Building modules, stage 2.
        MODPOST 200 modules
      ERROR: "loongson_sysconf" [drivers/platform/mips/cpu_hwmon.ko] undefined!
      ERROR: "loongson_chiptemp" [drivers/platform/mips/cpu_hwmon.ko] undefined!
      make[1]: *** [scripts/Makefile.modpost:92:__modpost] 错误 1
      make: *** [Makefile:1261:modules] 错误 2
      Signed-off-by: NYupeng Li <liyupeng@zbhlos.com>
      Reviewed-by: NGuenter Roeck <linux@roeck-us.net>
      Reviewed-by: NHuacai Chen <chenhuacai@kernel.org>
      Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>
      41e45640
    • L
      Merge tag 'fs_for_v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 3d9f55c5
      Linus Torvalds 提交于
      Pull ext2, writeback, and quota fixes and cleanups from Jan Kara:
       "A fix for race in writeback code and two cleanups in quota and ext2"
      
      * tag 'fs_for_v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        quota: Prevent memory allocation recursion while holding dq_lock
        writeback: Fix inode->i_io_list not be protected by inode->i_lock error
        fs: Fix syntax errors in comments
      3d9f55c5
    • L
      Merge tag 'powerpc-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 95fc76c8
      Linus Torvalds 提交于
      Pull powerpc fixes from Michael Ellerman:
      
       - On 32-bit fix overread/overwrite of thread_struct via ptrace
         PEEK/POKE.
      
       - Fix softirqs not switching to the softirq stack since we moved
         irq_exit().
      
       - Force thread size increase when KASAN is enabled to avoid stack
         overflows.
      
       - On Book3s 64 mark more code as not to be instrumented by KASAN to
         avoid crashes.
      
       - Exempt __get_wchan() from KASAN checking, as it's inherently racy.
      
       - Fix a recently introduced crash in the papr_scm driver in some
         configurations.
      
       - Remove include of <generated/compile.h> which is forbidden.
      
      Thanks to Ariel Miculas, Chen Jingwen, Christophe Leroy, Erhard Furtner,
      He Ying, Kees Cook, Masahiro Yamada, Nageswara R Sastry, Paul Mackerras,
      Sachin Sant, Vaibhav Jain, and Wanming Hu.
      
      * tag 'powerpc-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/32: Fix overread/overwrite of thread_struct via ptrace
        powerpc/book3e: get rid of #include <generated/compile.h>
        powerpc/kasan: Force thread size increase with KASAN
        powerpc/papr_scm: don't requests stats with '0' sized stats buffer
        powerpc: Don't select HAVE_IRQ_EXIT_ON_IRQ_STACK
        powerpc/kasan: Silence KASAN warnings in __get_wchan()
        powerpc/kasan: Mark more real-mode code as not to be instrumented
      95fc76c8
    • L
      Merge tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 825464e7
      Linus Torvalds 提交于
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf and netfilter.
      
        Current release - regressions:
      
         - eth: amt: fix possible null-ptr-deref in amt_rcv()
      
        Previous releases - regressions:
      
         - tcp: use alloc_large_system_hash() to allocate table_perturb
      
         - af_unix: fix a data-race in unix_dgram_peer_wake_me()
      
         - nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
      
         - eth: ixgbe: fix unexpected VLAN rx in promisc mode on VF
      
        Previous releases - always broken:
      
         - ipv6: fix signed integer overflow in __ip6_append_data
      
         - netfilter:
             - nat: really support inet nat without l3 address
             - nf_tables: memleak flow rule from commit path
      
         - bpf: fix calling global functions from BPF_PROG_TYPE_EXT programs
      
         - openvswitch: fix misuse of the cached connection on tuple changes
      
         - nfc: nfcmrvl: fix memory leak in nfcmrvl_play_deferred
      
         - eth: altera: fix refcount leak in altera_tse_mdio_create
      
        Misc:
      
         - add Quentin Monnet to bpftool maintainers"
      
      * tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
        net: amd-xgbe: fix clang -Wformat warning
        tcp: use alloc_large_system_hash() to allocate table_perturb
        net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY
        net: dsa: mv88e6xxx: correctly report serdes link failure
        net: dsa: mv88e6xxx: fix BMSR error to be consistent with others
        net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete
        net: altera: Fix refcount leak in altera_tse_mdio_create
        net: openvswitch: fix misuse of the cached connection on tuple changes
        net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag
        ip_gre: test csum_start instead of transport header
        au1000_eth: stop using virt_to_bus()
        ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg
        ipv6: Fix signed integer overflow in __ip6_append_data
        nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred
        nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION
        nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
        nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION
        net: ipv6: unexport __init-annotated seg6_hmac_init()
        net: xfrm: unexport __init-annotated xfrm4_protocol_init()
        net: mdio: unexport __init-annotated mdio_bus_init()
        ...
      825464e7
    • L
      netfs: gcc-12: temporarily disable '-Wattribute-warning' for now · 507160f4
      Linus Torvalds 提交于
      This is a pure band-aid so that I can continue merging stuff from people
      while some of the gcc-12 fallout gets sorted out.
      
      In particular, gcc-12 is very unhappy about the kinds of pointer
      arithmetic tricks that netfs does, and that makes the fortify checks
      trigger in afs and ceph:
      
        In function ‘fortify_memset_chk’,
            inlined from ‘netfs_i_context_init’ at include/linux/netfs.h:327:2,
            inlined from ‘afs_set_netfs_context’ at fs/afs/inode.c:61:2,
            inlined from ‘afs_root_iget’ at fs/afs/inode.c:543:2:
        include/linux/fortify-string.h:258:25: warning: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
          258 |                         __write_overflow_field(p_size_field, size);
              |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      and the reason is that netfs_i_context_init() is passed a 'struct inode'
      pointer, and then it does
      
              struct netfs_i_context *ctx = netfs_i_context(inode);
      
              memset(ctx, 0, sizeof(*ctx));
      
      where that netfs_i_context() function just does pointer arithmetic on
      the inode pointer, knowing that the netfs_i_context is laid out
      immediately after it in memory.
      
      This is all truly disgusting, since the whole "netfs_i_context is laid
      out immediately after it in memory" is not actually remotely true in
      general, but is just made to be that way for afs and ceph.
      
      See for example fs/cifs/cifsglob.h:
      
        struct cifsInodeInfo {
              struct {
                      /* These must be contiguous */
                      struct inode    vfs_inode;      /* the VFS's inode record */
                      struct netfs_i_context netfs_ctx; /* Netfslib context */
              };
      	[...]
      
      and realize that this is all entirely wrong, and the pointer arithmetic
      that netfs_i_context() is doing is also very very wrong and wouldn't
      give the right answer if netfs_ctx had different alignment rules from a
      'struct inode', for example).
      
      Anyway, that's just a long-winded way to say "the gcc-12 warning is
      actually quite reasonable, and our code happens to work but is pretty
      disgusting".
      
      This is getting fixed properly, but for now I made the mistake of
      thinking "the week right after the merge window tends to be calm for me
      as people take a breather" and I did a sustem upgrade.  And I got gcc-12
      as a result, so to continue merging fixes from people and not have the
      end result drown in warnings, I am fixing all these gcc-12 issues I hit.
      
      Including with these kinds of temporary fixes.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Howells <dhowells@redhat.com>
      Link: https://lore.kernel.org/all/AEEBCF5D-8402-441D-940B-105AA718C71F@chromium.org/Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      507160f4
    • L
      gcc-12: disable '-Warray-bounds' universally for now · f0be87c4
      Linus Torvalds 提交于
      In commit 8b202ee2 ("s390: disable -Warray-bounds") the s390 people
      disabled the '-Warray-bounds' warning for gcc-12, because the new logic
      in gcc would cause warnings for their use of the S390_lowcore macro,
      which accesses absolute pointers.
      
      It turns out gcc-12 has many other issues in this area, so this takes
      that s390 warning disable logic, and turns it into a kernel build config
      entry instead.
      
      Part of the intent is that we can make this all much more targeted, and
      use this conflig flag to disable it in only particular configurations
      that cause problems, with the s390 case as an example:
      
              select GCC12_NO_ARRAY_BOUNDS
      
      and we could do that for other configuration cases that cause issues.
      
      Or we could possibly use the CONFIG_CC_NO_ARRAY_BOUNDS thing in a more
      targeted way, and disable the warning only for particular uses: again
      the s390 case as an example:
      
        KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds)
      
      but this ends up just doing it globally in the top-level Makefile, since
      the current issues are spread fairly widely all over:
      
        KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds
      
      We'll try to limit this later, since the gcc-12 problems are rare enough
      that *much* of the kernel can be built with it without disabling this
      warning.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f0be87c4
    • L
      mellanox: mlx5: avoid uninitialized variable warning with gcc-12 · 842c3b3d
      Linus Torvalds 提交于
      gcc-12 started warning about 'tracker' being used uninitialized:
      
        drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c: In function ‘mlx5_do_bond’:
        drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c:786:28: warning: ‘tracker’ is used uninitialized [-Wuninitialized]
          786 |         struct lag_tracker tracker;
              |                            ^~~~~~~
      
      which seems to be because it doesn't track how the use (and
      initialization) is bound by the 'do_bond' flag.
      
      But admittedly that 'do_bond' usage is fairly complicated, and involves
      passing it around as an argument to helper functions, so it's somewhat
      understandable that gcc doesn't see how that all works.
      
      This function could be rewritten to make the use of that tracker
      variable more obviously safe, but for now I'm just adding the forced
      initialization of it.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      842c3b3d
    • L
      gcc-12: disable '-Wdangling-pointer' warning for now · 49beadbd
      Linus Torvalds 提交于
      While the concept of checking for dangling pointers to local variables
      at function exit is really interesting, the gcc-12 implementation is not
      compatible with reality, and results in false positives.
      
      For example, gcc sees us putting things on a local list head allocated
      on the stack, which involves exactly those kinds of pointers to the
      local stack entry:
      
        In function ‘__list_add’,
            inlined from ‘list_add_tail’ at include/linux/list.h:102:2,
            inlined from ‘rebuild_snap_realms’ at fs/ceph/snap.c:434:2:
        include/linux/list.h:74:19: warning: storing the address of local variable ‘realm_queue’ in ‘*&realm_27(D)->rebuild_item.prev’ [-Wdangling-pointer=]
           74 |         new->prev = prev;
              |         ~~~~~~~~~~^~~~~~
      
      But then gcc - understandably - doesn't really understand the big
      picture how the doubly linked list works, so doesn't see how we then end
      up emptying said list head in a loop and the pointer we added has been
      removed.
      
      Gcc also complains about us (intentionally) using this as a way to store
      a kind of fake stack trace, eg
      
        drivers/acpi/acpica/utdebug.c:40:38: warning: storing the address of local variable ‘current_sp’ in ‘acpi_gbl_entry_stack_pointer’ [-Wdangling-pointer=]
           40 |         acpi_gbl_entry_stack_pointer = &current_sp;
              |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
      
      which is entirely reasonable from a compiler standpoint, and we may want
      to change those kinds of patterns, but not not.
      
      So this is one of those "it would be lovely if the compiler were to
      complain about us leaving dangling pointers to the stack", but not this
      way.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      49beadbd
    • L
      drm: imx: fix compiler warning with gcc-12 · 7aefd8b5
      Linus Torvalds 提交于
      Gcc-12 correctly warned about this code using a non-NULL pointer as a
      truth value:
      
        drivers/gpu/drm/imx/ipuv3-crtc.c: In function ‘ipu_crtc_disable_planes’:
        drivers/gpu/drm/imx/ipuv3-crtc.c:72:21: error: the comparison will always evaluate as ‘true’ for the address of ‘plane’ will never be NULL [-Werror=address]
           72 |                 if (&ipu_crtc->plane[1] && plane == &ipu_crtc->plane[1]->base)
              |                     ^
      
      due to the extraneous '&' address-of operator.
      
      Philipp Zabel points out that The mistake had no adverse effect since
      the following condition doesn't actually dereference the NULL pointer,
      but the intent of the code was obviously to check for it, not to take
      the address of the member.
      
      Fixes: eb8c8880 ("drm/imx: add deferred plane disabling")
      Acked-by: NPhilipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7aefd8b5
  3. 09 6月, 2022 21 次提交
    • M
      powerpc/32: Fix overread/overwrite of thread_struct via ptrace · 8e127844
      Michael Ellerman 提交于
      The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
      to read/write registers of another process.
      
      To get/set a register, the API takes an index into an imaginary address
      space called the "USER area", where the registers of the process are
      laid out in some fashion.
      
      The kernel then maps that index to a particular register in its own data
      structures and gets/sets the value.
      
      The API only allows a single machine-word to be read/written at a time.
      So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
      
      The way floating point registers (FPRs) are addressed is somewhat
      complicated, because double precision float values are 64-bit even on
      32-bit CPUs. That means on 32-bit kernels each FPR occupies two
      word-sized locations in the USER area. On 64-bit kernels each FPR
      occupies one word-sized location in the USER area.
      
      Internally the kernel stores the FPRs in an array of u64s, or if VSX is
      enabled, an array of pairs of u64s where one half of each pair stores
      the FPR. Which half of the pair stores the FPR depends on the kernel's
      endianness.
      
      To handle the different layouts of the FPRs depending on VSX/no-VSX and
      big/little endian, the TS_FPR() macro was introduced.
      
      Unfortunately the TS_FPR() macro does not take into account the fact
      that the addressing of each FPR differs between 32-bit and 64-bit
      kernels. It just takes the index into the "USER area" passed from
      userspace and indexes into the fp_state.fpr array.
      
      On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
      the fp_state.fpr array, meaning the user can read/write 256 bytes past
      the end of the array. Because the fp_state sits in the middle of the
      thread_struct there are various fields than can be overwritten,
      including some pointers. As such it may be exploitable.
      
      It has also been observed to cause systems to hang or otherwise
      misbehave when using gdbserver, and is probably the root cause of this
      report which could not be easily reproduced:
        https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@keymile.com/
      
      Rather than trying to make the TS_FPR() macro even more complicated to
      fix the bug, or add more macros, instead add a special-case for 32-bit
      kernels. This is more obvious and hopefully avoids a similar bug
      happening again in future.
      
      Note that because 32-bit kernels never have VSX enabled the code doesn't
      need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
      ensure that 32-bit && VSX is never enabled.
      
      Fixes: 87fec051 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
      Cc: stable@vger.kernel.org # v3.13+
      Reported-by: NAriel Miculas <ariel.miculas@belden.com>
      Tested-by: NChristophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
      8e127844
    • J
      net: amd-xgbe: fix clang -Wformat warning · 647df0d4
      Justin Stitt 提交于
      see warning:
      | drivers/net/ethernet/amd/xgbe/xgbe-drv.c:2787:43: warning: format specifies
      | type 'unsigned short' but the argument has type 'int' [-Wformat]
      |        netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
      |                                      ~~~~~~     ^~~~~~~~~~~~~~~~~~~
      
      Variadic functions (printf-like) undergo default argument promotion.
      Documentation/core-api/printk-formats.rst specifically recommends
      using the promoted-to-type's format flag.
      
      Also, as per C11 6.3.1.1:
      (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)
      `If an int can represent all values of the original type ..., the
      value is converted to an int; otherwise, it is converted to an
      unsigned int. These are called the integer promotions.`
      
      Since the argument is a u16 it will get promoted to an int and thus it is
      most accurate to use the %x format specifier here. It should be noted that the
      `#06` formatting sugar does not alter the promotion rules.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/378Signed-off-by: NJustin Stitt <jstitt007@gmail.com>
      Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
      Link: https://lore.kernel.org/r/20220607191119.20686-1-jstitt007@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      647df0d4
    • M
      tcp: use alloc_large_system_hash() to allocate table_perturb · e67b72b9
      Muchun Song 提交于
      In our server, there may be no high order (>= 6) memory since we reserve
      lots of HugeTLB pages when booting.  Then the system panic.  So use
      alloc_large_system_hash() to allocate table_perturb.
      
      Fixes: e9261476 ("tcp: dynamically allocate the perturb table used by source ports")
      Signed-off-by: NMuchun Song <songmuchun@bytedance.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220607070214.94443-1-songmuchun@bytedance.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      e67b72b9
    • A
      net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY · 487994ff
      Alvin Šipraga 提交于
      Since commit a18e6521 ("net: phylink: handle NA interface mode in
      phylink_fwnode_phy_connect()"), phylib defaults to GMII when no phy-mode
      or phy-connection-type property is specified in a DSA port node of the
      device tree. The same commit caused a regression in rtl8365mb whereby
      phylink would fail to connect, because the driver did not advertise
      support for GMII for ports with internal PHY.
      
      It should be noted that the aforementioned regression is not because the
      blamed commit was incorrect: on the contrary, the blamed commit is
      correcting the previous behaviour whereby unspecified phy-mode would
      cause the internal interface mode to be PHY_INTERFACE_MODE_NA. The
      rtl8365mb driver only worked by accident before because it _did_
      advertise support for PHY_INTERFACE_MODE_NA, despite NA being reserved
      for internal use by phylink. With one mistake fixed, the other was
      exposed.
      
      Commit a5dba0f2 ("net: dsa: rtl8365mb: add GMII as user port mode")
      then introduced implicit support for GMII mode on ports with internal
      PHY to allow a PHY connection for device trees where the phy-mode is not
      explicitly set to "internal". At this point everything was working OK
      again.
      
      Subsequently, commit 6ff60646 ("net: dsa: realtek: convert to
      phylink_generic_validate()") broke this behaviour again by discarding
      the usage of rtl8365mb_phy_mode_supported() - where this GMII support
      was indicated - while switching to the new .phylink_get_caps API.
      
      With the new API, rtl8365mb_phy_mode_supported() is no longer needed.
      Remove it altogether and add back the GMII capability - this time to
      rtl8365mb_phylink_get_caps() - so that the above default behaviour works
      for ports with internal PHY again.
      
      Fixes: 6ff60646 ("net: dsa: realtek: convert to phylink_generic_validate()")
      Signed-off-by: NAlvin Šipraga <alsi@bang-olufsen.dk>
      Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Link: https://lore.kernel.org/r/20220607184624.417641-1-alvin@pqrs.dkSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      487994ff
    • J
      Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 568a32f5
      Jakub Kicinski 提交于
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-06-07
      
      This series contains updates to ixgbe driver only.
      
      Olivier Matz resolves an issue so that broadcast packets can still be
      received when VF removes promiscuous settings and removes setting of
      VLAN promiscuous, in promiscuous mode, to prevent a loop when VFs are
      bridged.
      
      * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ixgbe: fix unexpected VLAN Rx in promisc mode on VF
        ixgbe: fix bcast packets Rx on VF after promisc removal
      ====================
      
      Link: https://lore.kernel.org/r/20220607181538.748786-1-anthony.l.nguyen@intel.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      568a32f5
    • J
      Merge branch 'mv88e6xxx-fixes-for-reading-serdes-state' · 5d4af9c1
      Jakub Kicinski 提交于
      Russell King says:
      
      ====================
      mv88e6xxx: fixes for reading serdes state
      
      These are some low-priority fixes to the mv88e6xxx serdes code.
      Patch 1 fixes the reporting of an_complete, which is used in the
      emulation of a conventional C22 PHY. Patch from Marek.
      
      Patch 2 makes one of the error messages in patch 2 to be consistent
      with the other error messages in this function.
      
      Patch 3 ensures that we do not miss a link-failure event.
      ====================
      
      Link: https://lore.kernel.org/r/Yp82TyoLon9jz6k3@shell.armlinux.org.ukSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      5d4af9c1
    • R
      net: dsa: mv88e6xxx: correctly report serdes link failure · b4d78731
      Russell King (Oracle) 提交于
      Phylink wants to know if the link has dropped since the last time state
      was retrieved, and the BMSR gives us that. Read the BMSR and use it when
      deciding the link state. Fill in the an_complete member as well for the
      emulated PHY state.
      Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      b4d78731
    • R
      net: dsa: mv88e6xxx: fix BMSR error to be consistent with others · 2b4bb9cd
      Russell King (Oracle) 提交于
      Other errors accessing the registers in mv88e6352_serdes_pcs_get_state()
      print "PHY " before the register name, except for the BMSR. Make this
      consistent with the other error messages.
      Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      2b4bb9cd
    • M
      net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete · 47e96930
      Marek Behún 提交于
      Commit ede359d8 ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN
      is bypassed") added the ability to link if AN was bypassed, and added
      filling of state->an_complete field, but set it to true if AN was
      enabled in BMCR, not when AN was reported complete in BMSR.
      
      This was done because for some reason, when I wanted to use BMSR value
      to infer an_complete, I was looking at BMSR_ANEGCAPABLE bit (which was
      always 1), instead of BMSR_ANEGCOMPLETE bit.
      
      Use BMSR_ANEGCOMPLETE for filling state->an_complete.
      
      Fixes: ede359d8 ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed")
      Signed-off-by: NMarek Behún <kabel@kernel.org>
      Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: NJakub Kicinski <kuba@kernel.org>
      47e96930
    • M
      net: altera: Fix refcount leak in altera_tse_mdio_create · 11ec18b1
      Miaoqian Lin 提交于
      Every iteration of for_each_child_of_node() decrements
      the reference count of the previous node.
      When break from a for_each_child_of_node() loop,
      we need to explicitly call of_node_put() on the child node when
      not need anymore.
      Add missing of_node_put() to avoid refcount leak.
      
      Fixes: bbd2190c ("Altera TSE: Add main and header file for Altera Ethernet Driver")
      Signed-off-by: NMiaoqian Lin <linmq006@gmail.com>
      Link: https://lore.kernel.org/r/20220607041144.7553-1-linmq006@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      11ec18b1
    • I
      net: openvswitch: fix misuse of the cached connection on tuple changes · 2061ecfd
      Ilya Maximets 提交于
      If packet headers changed, the cached nfct is no longer relevant
      for the packet and attempt to re-use it leads to the incorrect packet
      classification.
      
      This issue is causing broken connectivity in OpenStack deployments
      with OVS/OVN due to hairpin traffic being unexpectedly dropped.
      
      The setup has datapath flows with several conntrack actions and tuple
      changes between them:
      
        actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
                set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
                set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
                ct(zone=8),recirc(0x4)
      
      After the first ct() action the packet headers are almost fully
      re-written.  The next ct() tries to re-use the existing nfct entry
      and marks the packet as invalid, so it gets dropped later in the
      pipeline.
      
      Clearing the cached conntrack entry whenever packet tuple is changed
      to avoid the issue.
      
      The flow key should not be cleared though, because we should still
      be able to match on the ct_state if the recirculation happens after
      the tuple change but before the next ct() action.
      
      Cc: stable@vger.kernel.org
      Fixes: 7f8a436e ("openvswitch: Add conntrack action")
      Reported-by: NFrode Nordahl <frode.nordahl@canonical.com>
      Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
      Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856Signed-off-by: NIlya Maximets <i.maximets@ovn.org>
      Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      2061ecfd
    • C
      net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag · 2f2c0d29
      Chen Lin 提交于
      When rx_flag == MTK_RX_FLAGS_HWLRO,
      rx_data_len = MTK_MAX_LRO_RX_LENGTH(4096 * 3) > PAGE_SIZE.
      netdev_alloc_frag is for alloction of page fragment only.
      Reference to other drivers and Documentation/vm/page_frags.rst
      
      Branch to use __get_free_pages when ring->frag_size > PAGE_SIZE.
      Signed-off-by: NChen Lin <chen45464546@163.com>
      Link: https://lore.kernel.org/r/1654692413-2598-1-git-send-email-chen45464546@163.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      2f2c0d29
    • W
      ip_gre: test csum_start instead of transport header · 8d21e996
      Willem de Bruijn 提交于
      GRE with TUNNEL_CSUM will apply local checksum offload on
      CHECKSUM_PARTIAL packets.
      
      ipgre_xmit must validate csum_start after an optional skb_pull,
      else lco_csum may trigger an overflow. The original check was
      
      	if (csum && skb_checksum_start(skb) < skb->data)
      		return -EINVAL;
      
      This had false positives when skb_checksum_start is undefined:
      when ip_summed is not CHECKSUM_PARTIAL. A discussed refinement
      was straightforward
      
      	if (csum && skb->ip_summed == CHECKSUM_PARTIAL &&
      	    skb_checksum_start(skb) < skb->data)
      		return -EINVAL;
      
      But was eventually revised more thoroughly:
      - restrict the check to the only branch where needed, in an
        uncommon GRE path that uses header_ops and calls skb_pull.
      - test skb_transport_header, which is set along with csum_start
        in skb_partial_csum_set in the normal header_ops datapath.
      
      Turns out skbs can arrive in this branch without the transport
      header set, e.g., through BPF redirection.
      
      Revise the check back to check csum_start directly, and only if
      CHECKSUM_PARTIAL. Do leave the check in the updated location.
      Check field regardless of whether TUNNEL_CSUM is configured.
      
      Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/
      Link: https://lore.kernel.org/all/20210902193447.94039-2-willemdebruijn.kernel@gmail.com/T/#u
      Fixes: 8a0ed250 ("ip_gre: validate csum_start only on pull")
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Reviewed-by: NAlexander Duyck <alexanderduyck@fb.com>
      Link: https://lore.kernel.org/r/20220606132107.3582565-1-willemdebruijn.kernel@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      8d21e996
    • J
      Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · d5d4c363
      Jakub Kicinski 提交于
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2022-06-09
      
      We've added 6 non-merge commits during the last 2 day(s) which contain
      a total of 8 files changed, 49 insertions(+), 15 deletions(-).
      
      The main changes are:
      
      1) Fix an illegal copy_to_user() attempt seen by syzkaller through arm64
         BPF JIT compiler, from Eric Dumazet.
      
      2) Fix calling global functions from BPF_PROG_TYPE_EXT programs by using
         the correct program context type, from Toke Høiland-Jørgensen.
      
      3) Fix XSK TX batching invalid descriptor handling, from Maciej Fijalkowski.
      
      4) Fix potential integer overflows in multi-kprobe link code by using safer
         kvmalloc_array() allocation helpers, from Dan Carpenter.
      
      5) Add Quentin as bpftool maintainer, from Quentin Monnet.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        MAINTAINERS: Add a maintainer for bpftool
        xsk: Fix handling of invalid descriptors in XSK TX batching API
        selftests/bpf: Add selftest for calling global functions from freplace
        bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs
        bpf: Use safer kvmalloc_array() where possible
        bpf, arm64: Clear prog->jited_len along prog->jited
      ====================
      
      Link: https://lore.kernel.org/r/20220608234133.32265-1-daniel@iogearbox.netSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      d5d4c363
    • L
      cert host tools: Stop complaining about deprecated OpenSSL functions · 6bfb56e9
      Linus Torvalds 提交于
      OpenSSL 3.0 deprecated the OpenSSL's ENGINE API.  That is as may be, but
      the kernel build host tools still use it.  Disable the warning about
      deprecated declarations until somebody who cares fixes it.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6bfb56e9
    • M
      net/mlx5: fs, fail conflicting actions · 8fa5e7b2
      Mark Bloch 提交于
      When combining two steering rules into one check
      not only do they share the same actions but those
      actions are also the same. This resolves an issue where
      when creating two different rules with the same match
      the actions are overwritten and one of the rules is deleted
      a FW syndrome can be seen in dmesg.
      
      mlx5_core 0000:03:00.0: mlx5_cmd_check:819:(pid 2105): DEALLOC_MODIFY_HEADER_CONTEXT(0x941) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1ab444)
      
      Fixes: 0d235c3f ("net/mlx5: Add hash table to search FTEs in a flow-group")
      Signed-off-by: NMark Bloch <mbloch@nvidia.com>
      Reviewed-by: NMaor Gottlieb <maorg@nvidia.com>
      Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
      8fa5e7b2
    • F
      net/mlx5: Rearm the FW tracer after each tracer event · 8bf94e64
      Feras Daoud 提交于
      The current design does not arm the tracer if traces are available before
      the tracer string database is fully loaded, leading to an unfunctional tracer.
      This fix will rearm the tracer every time the FW triggers tracer event
      regardless of the tracer strings database status.
      
      Fixes: c71ad41c ("net/mlx5: FW tracer, events handling")
      Signed-off-by: NFeras Daoud <ferasda@nvidia.com>
      Signed-off-by: NRoy Novich <royno@nvidia.com>
      Reviewed-by: NMoshe Shemesh <moshe@nvidia.com>
      Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
      8bf94e64
    • M
      net/mlx5: E-Switch, pair only capable devices · 3008e6a0
      Mark Bloch 提交于
      OFFLOADS paring using devcom is possible only on devices
      that support LAG. Filter based on lag capabilities.
      
      This fixes an issue where mlx5_get_next_phys_dev() was
      called without holding the interface lock.
      
      This issue was found when commit
      bc4c2f2e ("net/mlx5: Lag, filter non compatible devices")
      added an assert that verifies the interface lock is held.
      
      WARNING: CPU: 9 PID: 1706 at drivers/net/ethernet/mellanox/mlx5/core/dev.c:642 mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core]
      Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm ib_uverbs ib_core overlay fuse [last unloaded: mlx5_core]
      CPU: 9 PID: 1706 Comm: devlink Not tainted 5.18.0-rc7+ #11
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      RIP: 0010:mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core]
      Code: 02 00 75 48 48 8b 85 80 04 00 00 5d c3 31 c0 5d c3 be ff ff ff ff 48 c7 c7 08 41 5b a0 e8 36 87 28 e3 85 c0 0f 85 6f ff ff ff <0f> 0b e9 68 ff ff ff 48 c7 c7 0c 91 cc 84 e8 cb 36 6f e1 e9 4d ff
      RSP: 0018:ffff88811bf47458 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff88811b398000 RCX: 0000000000000001
      RDX: 0000000080000000 RSI: ffffffffa05b4108 RDI: ffff88812daaaa78
      RBP: ffff88812d050380 R08: 0000000000000001 R09: ffff88811d6b3437
      R10: 0000000000000001 R11: 00000000fddd3581 R12: ffff88815238c000
      R13: ffff88812d050380 R14: ffff8881018aa7e0 R15: ffff88811d6b3428
      FS:  00007fc82e18ae80(0000) GS:ffff88842e080000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f9630d1b421 CR3: 0000000149802004 CR4: 0000000000370ea0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       mlx5_esw_offloads_devcom_event+0x99/0x3b0 [mlx5_core]
       mlx5_devcom_send_event+0x167/0x1d0 [mlx5_core]
       esw_offloads_enable+0x1153/0x1500 [mlx5_core]
       ? mlx5_esw_offloads_controller_valid+0x170/0x170 [mlx5_core]
       ? wait_for_completion_io_timeout+0x20/0x20
       ? mlx5_rescan_drivers_locked+0x318/0x810 [mlx5_core]
       mlx5_eswitch_enable_locked+0x586/0xc50 [mlx5_core]
       ? mlx5_eswitch_disable_pf_vf_vports+0x1d0/0x1d0 [mlx5_core]
       ? mlx5_esw_try_lock+0x1b/0xb0 [mlx5_core]
       ? mlx5_eswitch_enable+0x270/0x270 [mlx5_core]
       ? __debugfs_create_file+0x260/0x3e0
       mlx5_devlink_eswitch_mode_set+0x27e/0x870 [mlx5_core]
       ? mutex_lock_io_nested+0x12c0/0x12c0
       ? esw_offloads_disable+0x250/0x250 [mlx5_core]
       ? devlink_nl_cmd_trap_get_dumpit+0x470/0x470
       ? rcu_read_lock_sched_held+0x3f/0x70
       devlink_nl_cmd_eswitch_set_doit+0x217/0x620
      
      Fixes: dd3fddb8 ("net/mlx5: E-Switch, handle devcom events only for ports on the same device")
      Signed-off-by: NMark Bloch <mbloch@nvidia.com>
      Reviewed-by: NRoi Dayan <roid@nvidia.com>
      Reviewed-by: NMoshe Shemesh <moshe@nvidia.com>
      Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
      3008e6a0
    • P
      net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules · 15ef9efa
      Paul Blakey 提交于
      CT cleanup assumes that all tc rules were deleted first, and so
      is free to delete the CT shared resources (e.g the dr_action
      fwd_action which is shared for all tuples). But currently for
      uplink, this is happens in reverse, causing the below trace.
      
      CT cleanup is called from:
      mlx5e_cleanup_rep_tx()->mlx5e_cleanup_uplink_rep_tx()->
      mlx5e_rep_tc_cleanup()->mlx5e_tc_esw_cleanup()->
      mlx5_tc_ct_clean()
      
      Only afterwards, tc cleanup is called from:
      mlx5e_cleanup_rep_tx()->mlx5e_tc_ht_cleanup()
      which would have deleted all the tc ct rules, and so delete
      all the offloaded tuples.
      
      Fix this reversing the order of init and on cleanup, which
      will result in tc cleanup then ct cleanup.
      
      [ 9443.593347] WARNING: CPU: 2 PID: 206774 at drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1882 mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
      [ 9443.593349] Modules linked in: act_ct nf_flow_table rdma_ucm(O) rdma_cm(O) iw_cm(O) ib_ipoib(O) ib_cm(O) ib_umad(O) mlx5_core(O-) mlxfw(O) mlxdevm(O) auxiliary(O) ib_uverbs(O) psample ib_core(O) mlx_compat(O) ip_gre gre ip_tunnel act_vlan bonding geneve esp6_offload esp6 esp4_offload esp4 act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel act_mirred act_skbedit act_gact cls_flower sch_ingress nfnetlink_cttimeout nfnetlink xfrm_user xfrm_algo 8021q garp stp ipmi_devintf mrp ipmi_msghandler llc openvswitch nsh nf_conncount nf_nat mst_pciconf(O) dm_multipath sbsa_gwdt uio_pdrv_genirq uio mlxbf_pmc mlxbf_pka mlx_trio mlx_bootctl(O) bluefield_edac sch_fq_codel ip_tables ipv6 crc_ccitt btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq raid1 raid0 crct10dif_ce i2c_mlxbf gpio_mlxbf2 mlxbf_gige aes_neon_bs aes_neon_blk [last unloaded: mlx5_ib]
      [ 9443.593419] CPU: 2 PID: 206774 Comm: modprobe Tainted: G           O      5.4.0-1023.24.gc14613d-bluefield #1
      [ 9443.593422] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:143ebaf Jan 11 2022
      [ 9443.593424] pstate: 20000005 (nzCv daif -PAN -UAO)
      [ 9443.593489] pc : mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
      [ 9443.593545] lr : mlx5_ct_fs_smfs_destroy+0x24/0x30 [mlx5_core]
      [ 9443.593546] sp : ffff8000135dbab0
      [ 9443.593548] x29: ffff8000135dbab0 x28: ffff0003a6ab8e80
      [ 9443.593550] x27: 0000000000000000 x26: ffff0003e07d7000
      [ 9443.593552] x25: ffff800009609de0 x24: ffff000397fb2120
      [ 9443.593554] x23: ffff0003975c0000 x22: 0000000000000000
      [ 9443.593556] x21: ffff0003975f08c0 x20: ffff800009609de0
      [ 9443.593558] x19: ffff0003c8a13380 x18: 0000000000000014
      [ 9443.593560] x17: 0000000067f5f125 x16: 000000006529c620
      [ 9443.593561] x15: 000000000000000b x14: 0000000000000000
      [ 9443.593563] x13: 0000000000000002 x12: 0000000000000001
      [ 9443.593565] x11: ffff800011108868 x10: 0000000000000000
      [ 9443.593567] x9 : 0000000000000000 x8 : ffff8000117fb270
      [ 9443.593569] x7 : ffff0003ebc01288 x6 : 0000000000000000
      [ 9443.593571] x5 : ffff800009591ab8 x4 : fffffe000f6d9a20
      [ 9443.593572] x3 : 0000000080040001 x2 : fffffe000f6d9a20
      [ 9443.593574] x1 : ffff8000095901d8 x0 : 0000000000000025
      [ 9443.593577] Call trace:
      [ 9443.593634]  mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
      [ 9443.593688]  mlx5_ct_fs_smfs_destroy+0x24/0x30 [mlx5_core]
      [ 9443.593743]  mlx5_tc_ct_clean+0x34/0xa8 [mlx5_core]
      [ 9443.593797]  mlx5e_tc_esw_cleanup+0x58/0x88 [mlx5_core]
      [ 9443.593851]  mlx5e_rep_tc_cleanup+0x24/0x30 [mlx5_core]
      [ 9443.593905]  mlx5e_cleanup_rep_tx+0x6c/0x78 [mlx5_core]
      [ 9443.593959]  mlx5e_detach_netdev+0x74/0x98 [mlx5_core]
      [ 9443.594013]  mlx5e_netdev_change_profile+0x70/0x180 [mlx5_core]
      [ 9443.594067]  mlx5e_netdev_attach_nic_profile+0x34/0x40 [mlx5_core]
      [ 9443.594122]  mlx5e_vport_rep_unload+0x15c/0x1a8 [mlx5_core]
      [ 9443.594177]  mlx5_eswitch_unregister_vport_reps+0x228/0x298 [mlx5_core]
      [ 9443.594231]  mlx5e_rep_remove+0x2c/0x38 [mlx5_core]
      [ 9443.594236]  auxiliary_bus_remove+0x30/0x50 [auxiliary]
      [ 9443.594246]  device_release_driver_internal+0x108/0x1d0
      [ 9443.594248]  driver_detach+0x5c/0xe8
      [ 9443.594250]  bus_remove_driver+0x64/0xd8
      [ 9443.594253]  driver_unregister+0x38/0x60
      [ 9443.594255]  auxiliary_driver_unregister+0x24/0x38 [auxiliary]
      [ 9443.594311]  mlx5e_rep_cleanup+0x20/0x38 [mlx5_core]
      [ 9443.594365]  mlx5e_cleanup+0x18/0x30 [mlx5_core]
      [ 9443.594419]  cleanup+0xc/0x20cc [mlx5_core]
      [ 9443.594424]  __arm64_sys_delete_module+0x154/0x2b0
      [ 9443.594429]  el0_svc_common.constprop.0+0xf4/0x200
      [ 9443.594432]  el0_svc_handler+0x38/0xa8
      [ 9443.594435]  el0_svc+0x10/0x26c
      
      Fixes: d1a3138f ("net/mlx5e: TC, Move flow hashtable to be per rep")
      Signed-off-by: NPaul Blakey <paulb@nvidia.com>
      Reviewed-by: NOz Shlomo <ozsh@nvidia.com>
      Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
      15ef9efa
    • S
      Revert "net/mlx5e: Allow relaxed ordering over VFs" · 4d995c1b
      Saeed Mahameed 提交于
      FW is not ready, fix was sent too soon.
      This reverts commit f05ec8d9.
      
      Fixes: f05ec8d9 ("net/mlx5e: Allow relaxed ordering over VFs")
      Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
      4d995c1b
    • L
      MAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal · ed872f92
      Lukas Bulwahn 提交于
      Commit 40379a00 ("net/mlx5_fpga: Drop INNOVA TLS support") removes all
      files in the directory drivers/net/ethernet/mellanox/mlx5/core/accel/, but
      misses to adjust its reference in MAINTAINERS.
      
      Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
      broken reference.
      
      Remove the file entry to the removed directory in MELLANOX ETHERNET INNOVA
      DRIVERS.
      Signed-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
      ed872f92