1. 24 Aug, 2019 5 commits
    • net/mlx5: Fix return code in case of hyperv wrong size read · 87cade29
      Eran Ben Elisha committed
      The return code could be nondeterministic when a read returns the wrong
      size. With this patch, rc is set to -EIO if such an error occurs.

      In addition, mlx5_hv_config_common() supports reading
      HV_CONFIG_BLOCK_SIZE_MAX bytes only, so return an error early on
      bad input.
      
      Fixes: 913d14e8 ("net/mlx5: Add wrappers for HyperV PCIe operations")
      Reported-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
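The two checks described in this commit can be sketched in userspace C. This is a minimal illustration, not the driver code: `config_read_checked`, `block_read_fn`, `full_read` and `short_read` are hypothetical names, the value 128 for `HV_CONFIG_BLOCK_SIZE_MAX` is assumed, and returning `-EINVAL` for bad input is an assumption (the message only says an error is returned early).

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumed value; the real definition lives in the driver sources. */
#define HV_CONFIG_BLOCK_SIZE_MAX 128

/* Hypothetical backend: reads one block and reports how many bytes it got. */
typedef int (*block_read_fn)(void *buf, size_t len, int block_id,
                             size_t *bytes_returned);

static int config_read_checked(block_read_fn read_block, void *buf,
                               size_t len, int block_id)
{
    size_t bytes_returned = 0;
    int rc;

    /* Only full-size reads are supported: reject bad input up front. */
    if (len != HV_CONFIG_BLOCK_SIZE_MAX)
        return -EINVAL; /* assumed error code */

    rc = read_block(buf, len, block_id, &bytes_returned);

    /* A "successful" read of the wrong size becomes a well-defined error. */
    if (!rc && bytes_returned != len)
        rc = -EIO;

    return rc;
}

/* Demo backend that returns the full block. */
static int full_read(void *buf, size_t len, int block_id, size_t *bytes_returned)
{
    (void)block_id;
    memset(buf, 0xab, len);
    *bytes_returned = len;
    return 0;
}

/* Demo backend that reports success but returns only half the data. */
static int short_read(void *buf, size_t len, int block_id, size_t *bytes_returned)
{
    (void)buf;
    (void)block_id;
    *bytes_returned = len / 2;
    return 0;
}
```

With these demo backends, a full read succeeds, a short read yields `-EIO`, and a request of any other length is rejected before the backend is called.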
    • net: ipv6: fix listify ip6_rcv_finish in case of forwarding · c7a42eb4
      Xin Long committed
      We need a fix for ipv6 similar to the one that commit 0761680d ("net: ipv4:
      fix listify ip_rcv_finish in case of forwarding") made for ipv4.

      syzbot has been able to reproduce this issue since commit 323ebb61 ("net:
      use listified RX for handling GRO_NORMAL skbs") on net-next. The call
      trace was:
      
        kernel BUG at include/linux/skbuff.h:2225!
        RIP: 0010:__skb_pull include/linux/skbuff.h:2225 [inline]
        RIP: 0010:skb_pull+0xea/0x110 net/core/skbuff.c:1902
        Call Trace:
          sctp_inq_pop+0x2f1/0xd80 net/sctp/inqueue.c:202
          sctp_endpoint_bh_rcv+0x184/0x8d0 net/sctp/endpointola.c:385
          sctp_inq_push+0x1e4/0x280 net/sctp/inqueue.c:80
          sctp_rcv+0x2807/0x3590 net/sctp/input.c:256
          sctp6_rcv+0x17/0x30 net/sctp/ipv6.c:1049
          ip6_protocol_deliver_rcu+0x2fe/0x1660 net/ipv6/ip6_input.c:397
          ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:438
          NF_HOOK include/linux/netfilter.h:305 [inline]
          NF_HOOK include/linux/netfilter.h:299 [inline]
          ip6_input+0xe4/0x3f0 net/ipv6/ip6_input.c:447
          dst_input include/net/dst.h:442 [inline]
          ip6_sublist_rcv_finish+0x98/0x1e0 net/ipv6/ip6_input.c:84
          ip6_list_rcv_finish net/ipv6/ip6_input.c:118 [inline]
          ip6_sublist_rcv+0x80c/0xcf0 net/ipv6/ip6_input.c:282
          ipv6_list_rcv+0x373/0x4b0 net/ipv6/ip6_input.c:316
          __netif_receive_skb_list_ptype net/core/dev.c:5049 [inline]
          __netif_receive_skb_list_core+0x5fc/0x9d0 net/core/dev.c:5097
          __netif_receive_skb_list net/core/dev.c:5149 [inline]
          netif_receive_skb_list_internal+0x7eb/0xe60 net/core/dev.c:5244
          gro_normal_list.part.0+0x1e/0xb0 net/core/dev.c:5757
          gro_normal_list net/core/dev.c:5755 [inline]
          gro_normal_one net/core/dev.c:5769 [inline]
          napi_frags_finish net/core/dev.c:5782 [inline]
          napi_gro_frags+0xa6a/0xea0 net/core/dev.c:5855
          tun_get_user+0x2e98/0x3fa0 drivers/net/tun.c:1974
          tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2020
      
      Fixes: d8269e2c ("net: ipv6: listify ipv6_rcv() and ip6_rcv_finish()")
      Fixes: 323ebb61 ("net: use listified RX for handling GRO_NORMAL skbs")
      Reported-by: syzbot+eb349eeee854e389c36d@syzkaller.appspotmail.com
      Reported-by: syzbot+4a0643a653ac375612d1@syzkaller.appspotmail.com
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'r8152-save-EEE' · aa19d1f1
      David S. Miller committed
      Hayes Wang says:
      
      ====================
      r8152: save EEE
      
      v4:
      For patch #2, remove the redundant call to "ocp_reg_write(tp, OCP_EEE_ADV, 0)".
      
      v3:
      For patch #2, fix the mistake caused by copying and pasting.
      
      v2:
      Adjust patch #1. EEE is already disabled at the beginning of
      r8153_hw_phy_cfg() and r8153b_hw_phy_cfg(), so only check whether
      it is necessary to enable EEE.

      Add patch #2 for the helper function.
      
      v1:
      Save the EEE settings so that they do not revert to the defaults
      after reset_resume().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: add a helper function about setting EEE · e7bde56b
      Hayes Wang committed
      Add a helper function, rtl_eee_enable(), for setting EEE. Also move
      r8153_eee_en() and r8153b_eee_en(), and remove r8152b_enable_eee(),
      r8153_set_eee(), and r8153b_set_eee().
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: saving the settings of EEE · f4a93be6
      Hayes Wang committed
      Save the EEE settings so that they do not revert to the defaults
      after reset_resume().
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
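The idea behind the r8152 EEE series (remember the user's settings, then reapply them after the PHY configuration path has disabled EEE) can be sketched as follows. This is an illustration under stated assumptions, not the driver code: `struct nic`, `nic_set_eee` and `nic_hw_phy_cfg` are hypothetical names, and the hardware registers are simulated by plain struct fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device state: saved user settings plus simulated registers. */
struct nic {
    bool     saved_eee_en;   /* what the user asked for */
    uint16_t saved_eee_adv;  /* advertised EEE modes the user asked for */
    bool     hw_eee_en;      /* simulated hardware enable bit */
    uint16_t hw_eee_adv;     /* simulated hardware advertisement register */
};

/* User-facing setter: record the request, then apply it to the hardware. */
static void nic_set_eee(struct nic *nic, bool enable, uint16_t adv)
{
    nic->saved_eee_en = enable;
    nic->saved_eee_adv = adv;
    nic->hw_eee_en = enable;
    nic->hw_eee_adv = enable ? adv : 0;
}

/* PHY (re)configuration, as run after a reset: EEE is disabled first,
 * so it only needs to be re-enabled when the saved settings ask for it. */
static void nic_hw_phy_cfg(struct nic *nic)
{
    nic->hw_eee_en = false;
    nic->hw_eee_adv = 0;

    if (nic->saved_eee_en) {
        nic->hw_eee_en = true;
        nic->hw_eee_adv = nic->saved_eee_adv;
    }
}
```

Because the reconfiguration path starts from a disabled state, a single "enable if saved" check is enough, which matches the v2 note in the cover letter above.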
  2. 23 Aug, 2019 25 commits
  3. 22 Aug, 2019 10 commits
    • Merge branch 'mlx5-hyperv' · 8da3803d
      David S. Miller committed
      Haiyang Zhang says:
      
      ====================
      Add software backchannel and mlx5e HV VHCA stats
      
      This patch set adds a paravirtual backchannel in software in pci_hyperv,
      which is required by the mlx5e driver HV VHCA stats agent.

      The stats agent is responsible for running a periodic rx/tx packets/bytes
      stats update.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Add mlx5e HV VHCA stats agent · cef35af3
      Eran Ben Elisha committed
      The HV VHCA stats agent is responsible for running a periodic rx/tx
      packets/bytes stats update. Currently the supported format is version
      MLX5_HV_VHCA_STATS_VERSION. Block ID 1 is dedicated to statistics data
      transfer from the VF to the PF.

      The reporter fetches the statistics data from all opened channels, fills
      a buffer with it, and sends it to mlx5_hv_vhca_write_agent.

      As the stats layer should include some metadata per block (sequence and
      offset), the HV VHCA layer modifies the buffer before actually sending it
      over block 1.
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
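The reporter and metadata steps described in this commit can be sketched in userspace C. Everything here is illustrative: the layouts of `struct chan_stats` and `struct block_hdr`, the 128-byte `BLOCK_SIZE`, and the helpers `fill_stats` and `pack_blocks` are assumptions, not the driver's actual format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 128 /* assumed block size in bytes */

/* Hypothetical per-channel counters; not the driver's real layout. */
struct chan_stats {
    uint64_t rx_packets, rx_bytes, tx_packets, tx_bytes;
};

/* Hypothetical per-block metadata added by the transport layer. */
struct block_hdr {
    uint16_t sequence; /* identical for every block of one update */
    uint16_t offset;   /* index of this block within the update */
};

#define BLOCK_PAYLOAD (BLOCK_SIZE - sizeof(struct block_hdr))

/* Reporter side: copy every channel's counters into one flat buffer. */
static size_t fill_stats(const struct chan_stats *chans, size_t n, uint8_t *buf)
{
    memcpy(buf, chans, n * sizeof(*chans));
    return n * sizeof(*chans);
}

/* Transport side: split the buffer into blocks and tag each with metadata.
 * Returns the number of blocks written, or 0 if `out` is too small. */
static size_t pack_blocks(const uint8_t *buf, size_t len, uint16_t sequence,
                          uint8_t *out, size_t out_blocks)
{
    size_t nblocks = (len + BLOCK_PAYLOAD - 1) / BLOCK_PAYLOAD;

    if (nblocks > out_blocks)
        return 0;

    for (size_t i = 0; i < nblocks; i++) {
        struct block_hdr hdr = { .sequence = sequence, .offset = (uint16_t)i };
        size_t chunk = len - i * BLOCK_PAYLOAD;
        uint8_t *blk = out + i * BLOCK_SIZE;

        if (chunk > BLOCK_PAYLOAD)
            chunk = BLOCK_PAYLOAD;

        memset(blk, 0, BLOCK_SIZE);              /* zero-pad the last block */
        memcpy(blk, &hdr, sizeof(hdr));          /* metadata first */
        memcpy(blk + sizeof(hdr), buf + i * BLOCK_PAYLOAD, chunk);
    }
    return nblocks;
}
```

Keeping the metadata in the transport layer means the reporter only deals with raw counters, which is the split of responsibilities the commit message describes.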
    • net/mlx5: Add HV VHCA control agent · 29ddad43
      Eran Ben Elisha committed
      The control agent is responsible for the control block (ID 0). It should
      update the PF via this block about every capability change. In addition,
      upon block 0 invalidate, it should activate all other supported agents
      with data requests from the PF.

      Upon agent create/destroy, the invalidate callback of the control agent
      is called in order to update the PF driver about this change.

      The control agent is an integral part of HV VHCA and is created
      and destroyed as part of the HV VHCA init/cleanup flow.
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Add HV VHCA infrastructure · 87175120
      Eran Ben Elisha committed
      HV VHCA is a layer that provides a PF-to-VF communication channel based on
      the HyperV PCI config channel. It implements Mellanox's Inter VHCA control
      communication protocol. The protocol contains a control block for passing
      messages between the PF and VF drivers, and data blocks for passing
      actual data.

      The infrastructure is agent based. Each agent is responsible for
      contiguous buffer blocks in the VHCA config space. The infrastructure
      binds agents to their blocks, and an agent can only read and write
      the buffer blocks assigned to it. Each agent provides three
      callbacks (control, invalidate, cleanup). Control is invoked when
      block-0 is invalidated with a command that concerns this agent. Invalidate
      is invoked if one of the blocks assigned to this agent was
      invalidated. Cleanup is invoked before the agent is freed in
      order to clean all of its open resources or deferred works.

      Block-0 serves as the control block. All execution commands from the PF
      are written by the PF over this block. The VF acks them by
      writing on block-0 as well. Its format is described by the layout of
      struct mlx5_hv_vhca_control_block.
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
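The agent model described in this commit (agents bound to blocks, with control, invalidate and cleanup callbacks) can be sketched as a small dispatch table. This is a simplified illustration: `struct agent`, `dispatch_invalidate`, `destroy_agents` and the demo callbacks are hypothetical, block ownership is modelled as a 64-bit mask, and dispatch of the control callback is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical agent record: owns a set of blocks and three callbacks. */
struct agent {
    uint64_t block_mask;                               /* bit i set: owns block i */
    void (*control)(struct agent *a, uint32_t command);
    void (*invalidate)(struct agent *a, uint64_t blocks);
    void (*cleanup)(struct agent *a);
    uint64_t seen;                                     /* demo state: blocks reported */
    int cleaned;                                       /* demo state: cleanup ran */
};

/* Route an invalidation to every agent that owns an affected block. */
static void dispatch_invalidate(struct agent *agents, size_t n,
                                uint64_t invalidated)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t hit = agents[i].block_mask & invalidated;

        if (hit && agents[i].invalidate)
            agents[i].invalidate(&agents[i], hit);
    }
}

/* Teardown: let each agent release its resources before it goes away. */
static void destroy_agents(struct agent *agents, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (agents[i].cleanup)
            agents[i].cleanup(&agents[i]);
}

/* Demo callbacks that only record what happened. */
static void record_invalidate(struct agent *a, uint64_t blocks)
{
    a->seen |= blocks;
}

static void record_cleanup(struct agent *a)
{
    a->cleaned = 1;
}
```

Passing only the intersecting bits to each agent keeps an agent from seeing invalidations for blocks it does not own, mirroring the access restriction in the description.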
    • net/mlx5: Add wrappers for HyperV PCIe operations · 913d14e8
      Eran Ben Elisha committed
      Add wrapper functions for HyperV PCIe read / write /
      block_invalidate_register operations. These will be used as
      infrastructure for software communication by a later patch.

      This will be enabled by default if CONFIG_PCI_HYPERV_INTERFACE is set.
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • PCI: hv: Add a Hyper-V PCI interface driver for software backchannel interface · 348dd93e
      Haiyang Zhang committed
      This interface driver is a helper driver that allows other drivers to
      have a common interface with the Hyper-V PCI frontend driver.
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • PCI: hv: Add a paravirtual backchannel in software · e5d2f910
      Dexuan Cui committed
      Windows SR-IOV provides a backchannel mechanism in software for communication
      between a VF driver and a PF driver.  These "configuration blocks" are
      similar in concept to PCI configuration space, but instead of doing reads and
      writes in 32-bit chunks through a very slow path, packets of up to 128 bytes
      can be sent or received asynchronously.
      
      Nearly every SR-IOV device contains just such a communications channel in
      hardware, so using this one in software is usually optional.  Using the
      software channel, however, allows driver implementers to leverage software
      tools that fuzz the communications channel looking for vulnerabilities.
      
      The usage model for these packets puts the responsibility for reading or
      writing on the VF driver.  The VF driver sends a read or a write packet,
      indicating which "block" is being referred to by number.
      
      If the PF driver wishes to initiate communication, it can "invalidate" one or
      more of the first 64 blocks.  This driver delivers the invalidation via a
      callback supplied by the VF driver.
      
      No protocol is implied, except that supplied by the PF and VF drivers.
      Signed-off-by: Jake Oshins <jakeo@microsoft.com>
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
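The usage model in this commit (the VF reads or writes numbered blocks of at most 128 bytes, and the PF signals by invalidating some of the first 64 blocks) can be sketched with a simulated channel. This is a toy model under stated assumptions: `struct backchannel`, `vf_write_block`, `vf_read_block`, `pf_invalidate` and `record_mask` are hypothetical names, the PF side is an in-memory array, and storage is limited to 64 blocks purely to keep the example small.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_BLOCK_SIZE_MAX 128 /* packets of up to 128 bytes */
#define CFG_BLOCK_COUNT    64  /* simplification: only model the first 64 blocks */

/* Callback the VF driver registers to learn which blocks were invalidated. */
typedef void (*invalidate_cb)(void *ctx, uint64_t block_mask);

struct backchannel {
    uint8_t blocks[CFG_BLOCK_COUNT][CFG_BLOCK_SIZE_MAX]; /* simulated PF storage */
    invalidate_cb on_invalidate;
    void *ctx;
};

/* VF-initiated write of one block, identified by number. */
static int vf_write_block(struct backchannel *bc, unsigned int id,
                          const void *buf, size_t len)
{
    if (id >= CFG_BLOCK_COUNT || len > CFG_BLOCK_SIZE_MAX)
        return -EINVAL;
    memcpy(bc->blocks[id], buf, len);
    return 0;
}

/* VF-initiated read of one block, identified by number. */
static int vf_read_block(struct backchannel *bc, unsigned int id,
                         void *buf, size_t len)
{
    if (id >= CFG_BLOCK_COUNT || len > CFG_BLOCK_SIZE_MAX)
        return -EINVAL;
    memcpy(buf, bc->blocks[id], len);
    return 0;
}

/* PF-initiated signal: bit i of the mask marks block i as invalidated. */
static void pf_invalidate(struct backchannel *bc, uint64_t block_mask)
{
    if (bc->on_invalidate)
        bc->on_invalidate(bc->ctx, block_mask);
}

/* Demo callback: accumulate the invalidated blocks in a caller-owned mask. */
static void record_mask(void *ctx, uint64_t block_mask)
{
    *(uint64_t *)ctx |= block_mask;
}
```

A 64-bit mask is a natural fit for "one or more of the first 64 blocks", since a single callback invocation can report any combination of them.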
    • Merge tag 'mlx5-updates-2019-08-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · fed07ef3
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      mlx5 tc flow handling for concurrent execution (Part 3)
      
      This series includes updates to mlx5 ethernet and core driver:
      
      Vlad submits part 3 of a 3-part series to allow TC flow handling
      for concurrent execution.
      
      Vlad says:
      ==========
      
      Structure mlx5e_neigh_hash_entry and the code that uses it are refactored
      in the following ways:
      
      - Extend neigh_hash_entry with rcu and modify its users to always take a
        reference to the structure when using it (neigh_hash_entry already
        had an atomic reference counter, which was only used when scheduling
        neigh update on workqueue from atomic context of neigh update netevent).

      - Always use mlx5e_neigh_update_table->encap_lock when modifying neigh
        update hash table and list. Originally, this lock was only used to
        synchronize with netevent handler function, which is called from bh
        context and cannot use rtnl lock for synchronization. Use rcu read lock
        instead of encap_lock to lookup nhe in atomic context of netevent event
        handler function. Convert encap_lock to mutex to allow creating new
        neigh hash entries while holding it, which is safe to do because the
        lock is no longer used in atomic context.
      
      - Rcu-ify mlx5e_neigh_hash_entry->encap_list by changing operations on
        encap list to their rcu counterparts and extending encap structure
        with rcu_head to free the encap instances after rcu grace period. This
        allows fast traversal of list of encaps attached to nhe under rcu read
        lock protection.
      
      - Take encap_table_lock when accessing encap entries in neigh update and
        neigh stats update code to protect from concurrent encap entry
        insertion or removal.
      
      This approach leads to a potential race condition: neigh update and
      neigh stats update code can access encap and flow entries that are not
      fully initialized or are being destroyed, or neigh can change state
      without updating encaps that are created concurrently. Prevent these
      issues with the following changes in flow and encap initialization:
      
      - Extend mlx5e_tc_flow with 'init_done' completion. Modify neigh update
        to wait for both encap and flow completions to prevent concurrent
        access to a structure that is being initialized by tc.
      
      - Skip structures that failed during initialization: encaps with
        encap_id<0 and flows that don't have OFFLOADED flag set.
      
      - To ensure that no new flows are added to encap when it is being
        accessed by neigh update or neigh stats update, take encap_table_lock
        mutex.
      
      - To prevent concurrent deletion by tc, ensure that neigh update and
        neigh stats update hold references to encap and flow instances while
        using them.
      
      With the changes presented in this patch set it is now safe to execute tc
      concurrently with neigh update and neigh stats update. However, these
      two workqueue tasks modify the same flow "tmp_list" field, which stores
      flows with a reference taken in a temporary list so that the references
      can be released after the update operation finishes, so they should not
      be executed concurrently with each other.

      The last 3 patches of this series provide 3 new mlx5 trace points to track
      mlx5 tc requests and mlx5 neigh updates.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
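The "always take a reference before using the structure" rule from the cover letter above can be illustrated with a take-unless-zero reference count. This is a userspace sketch using C11 atomics, not the driver code (the kernel has its own reference counting and RCU primitives); `struct entry`, `entry_get` and `entry_put` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object shared between tc and workqueue tasks. */
struct entry {
    atomic_int refcnt; /* 0 means the object is already being destroyed */
};

/* Take a reference only if the object is still alive. A lookup that finds
 * the object concurrently with its removal must not resurrect it. */
static bool entry_get(struct entry *e)
{
    int old = atomic_load(&e->refcnt);

    while (old != 0) {
        /* On failure, `old` is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&e->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference. Returns true when this was the last one, in which
 * case the caller is responsible for freeing the object. */
static bool entry_put(struct entry *e)
{
    return atomic_fetch_sub(&e->refcnt, 1) == 1;
}
```

A reader that fails `entry_get()` simply skips the object, which is the same outcome the cover letter asks for when an entry is being destroyed.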
    • net/mlx5e: Add trace point for neigh update · 5970882a
      Vlad Buslov committed
      Allow tracing neigh state during the neigh update task, which is executed
      on a workqueue and scheduled by a neigh state change event.
      
      Usage example:
       ># cd /sys/kernel/debug/tracing
       ># echo mlx5:mlx5e_rep_neigh_update >> set_event
       ># cat trace
          ...
          kworker/u48:7-2221  [009] ...1  1475.387435: mlx5e_rep_neigh_update:
      netdev: ens1f0 MAC: 24:8a:07:9a:17:9a IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_connected=1
      
      Added corresponding documentation in
          Documentation/networking/device-driver/mellanox/mlx5.rst
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Dmytro Linkin <dmitrolin@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Add trace point for neigh used value update · c786fe59
      Vlad Buslov committed
      Allow tracing the result of the neigh used value update task, which is
      executed periodically on a workqueue.
      
      Usage example:
       ># cd /sys/kernel/debug/tracing
       ># echo mlx5:mlx5e_tc_update_neigh_used_value >> set_event
       ># cat trace
          ...
          kworker/u48:4-8806  [009] ...1 55117.882428: mlx5e_tc_update_neigh_used_value:
      netdev: ens1f0 IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_used=1
      
      Added corresponding documentation in
          Documentation/networking/device-driver/mellanox/mlx5.rst
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Dmytro Linkin <dmitrolin@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>