1. 27 Jul, 2016 1 commit
    • H
      net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update() · d1c2b501
      He Chunhui committed
      NUD_STALE is used when the caller (e.g. arp_process()) can't guarantee
      neighbour reachability. If the entry was NUD_VALID and lladdr is unchanged,
      the entry state should not be changed.
      
      Currently the code puts an extra "NUD_CONNECTED" condition. So if old state
      was NUD_DELAY or NUD_PROBE (they are NUD_VALID but not NUD_CONNECTED), the
      state can be changed to NUD_STALE.
      
      This may cause a problem. Because a NUD_STALE lladdr doesn't guarantee
      reachability, when we send traffic, the state will be changed to
      NUD_DELAY. In the normal case, if we get no confirmation (by dst_confirm()),
      we will change the state to NUD_PROBE and send probe traffic. But now the
      state may be reset to NUD_STALE again (e.g. by broadcast ARP packets),
      so the probe traffic will not be sent. This situation may happen again and
      again, and packets will be sent to a non-reachable lladdr forever.
      
      The fix is to remove the "NUD_CONNECTED" condition. After that the
      "NEIGH_UPDATE_F_WEAK_OVERRIDE" condition (used by IPv6) in that branch will
      be redundant, so remove it.
      
      This change may increase probe traffic, but it's essential since NUD_STALE
      lladdr is unreliable. To ensure correctness, we prefer to resolve lladdr,
      when we can't get confirmation, even while remote packets try to set
      NUD_STALE state.
      Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d1c2b501
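The state decision described in the commit above can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel's neigh_update(): the helper name `resolve_state` and its `require_connected` switch are hypothetical, the NEIGH_UPDATE_F_WEAK_OVERRIDE flag is left out, and the bit values are assumed to mirror the kernel's NUD_* flags.

```c
#include <stdbool.h>

/* Neighbour state bits, assumed to mirror the kernel's NUD_* flags. */
#define NUD_REACHABLE 0x02
#define NUD_STALE     0x04
#define NUD_DELAY     0x08
#define NUD_PROBE     0x10
#define NUD_NOARP     0x40
#define NUD_PERMANENT 0x80

#define NUD_CONNECTED (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE)
#define NUD_VALID     (NUD_CONNECTED | NUD_PROBE | NUD_STALE | NUD_DELAY)

/*
 * Hypothetical helper: pick the state an update should leave the entry in.
 * require_connected = true models the old check, false models the fix.
 */
static int resolve_state(int old_state, int new_state,
                         bool lladdr_unchanged, bool require_connected)
{
    /* Only a valid entry with an unchanged lladdr may refuse NUD_STALE. */
    if (!(old_state & NUD_VALID) || !lladdr_unchanged ||
        new_state != NUD_STALE)
        return new_state;
    if (require_connected && !(old_state & NUD_CONNECTED))
        return new_state;   /* old behaviour: DELAY/PROBE drop to STALE */
    return old_state;       /* keep the current state */
}
```

With the old check, an entry in NUD_PROBE is pushed back to NUD_STALE by an unconfirmed update; with the check removed it stays in NUD_PROBE, so the probe can still go out.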
  2. 26 Jul, 2016 39 commits
    • D
      Merge branch 'xgene-fix-mod-crash-and-1g-hotplug' · ee591f46
      David S. Miller committed
      Iyappan Subramanian says:
      
      ====================
      drivers: net: xgene: Fix module crash and 1G hot-plug
      
      This patchset addresses the following issues,
      
      1. Fixes the kernel crash when the driver is loaded as a kernel module
      	- by fixing hardware cleanups and rearranging kernel API calls
      
      2. Hot-plug issue on the SGMII 1G interface
      	- by adding a driver for MDIO management
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      ---
      v7: Address review comments from v6
      	- fixed kbuild warnings
      	- unmapped DMA memory on xgene_enet_delete_bufpool()
      	- delete descriptor rings and buffer pools on cle_init() failure
      	- fixed error deconstruction path on probe
      
      v6: Address review comments from v5
      	- changed to use devm_ioremap_resource
      	- changed to return PTR_ERR(clk) on failure
      	- cleaned up and removed indirections
      	- exported mdio read/write and phy_register functions
      	- changed mii_bus is to indicate interface instance
      	- changed to call the exported mdio read/write and phy_register functions
      
      v5: Address review comments from v4
      	- Fixed clock reset sequence by adding delay
      	- Fixed clock count by adding clk_unprepare_disable() in port shutdown
      
      v4: Address review comments from v3
      	- Reorganized into smaller patches
      	- Added wrapper functions for sgmii_control_reset and sgmii_tbi_control_reset
      	- Removed clk_get warning info
      	- mdio: Changed the order of 'if' statements and removed the 'else' statement
      	- mdio: Removed the mdio_read(write) indirection wrapper functions
      	- ethtool: Fixed SGMII 1G get_settings and set_settings
      	- Documentation: dtb: Added MDIO node information
      	- MAINTAINERS: Added MDIO driver and documentation path
      
      v3: Address review comments from v2
      	- Add comment about hardware clock reset sequence on xgene_mdio_reset
      
      v2: Address review comments from v1
      	- Fixed patch 1 compilation error
      	- Fixed mdio@1f610000 xge0clk reference
      	- Squashed dtb patches
      	- Added PORT_OFFSET macro
      
      v1:
      	- Initial version
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ee591f46
    • I
      MAINTAINERS: xgene: Add driver and documentation path · 2efccc60
      Iyappan Subramanian committed
      Added path to the MDIO driver and Documentation file.
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2efccc60
    • I
      dtb: xgene: Add MDIO node · 8e694cd2
      Iyappan Subramanian committed
      Added mdio node for mdio driver.  Also added phy-handle
      reference to the ethernet nodes.
      
      Removed unused clock node from storm sgenet1.
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8e694cd2
    • I
      drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset · 52d1fd99
      Iyappan Subramanian committed
      Changed SGMII 1G get_settings to use phy_ethtool_gset.
      Changed SGMII 1G set_settings to use phy_ethtool_sset.
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52d1fd99
    • I
      drivers: net: xgene: Use exported functions · 8c151963
      Iyappan Subramanian committed
      This patch reuses the exported mdio read/write and phy_register functions
      and removes the local definitions.
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8c151963
    • I
      drivers: net: xgene: Enable MDIO driver · 47c62b6d
      Iyappan Subramanian committed
      This patch enables the MDIO driver by,
      
      - Selecting MDIO_XGENE
      - Changing open and close to use phy_start and phy_stop
      - Changing to use mac_ops->tx(rx)_enable and tx(rx)_disable
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      47c62b6d
    • I
      drivers: net: xgene: Add backward compatibility · 8089a96f
      Iyappan Subramanian committed
      This patch adds the xgene_enet_check_phy_hanlde() function, which checks whether
      the MDIO driver is probed successfully and sets pdata->mdio_driver to true.
      If the MDIO driver is not probed, the ethernet driver falls back to backward
      compatibility mode.
      
      Since enum xgene_enet_cmd is used by the MDIO driver, remove it from the
      ethernet driver.
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8089a96f
    • I
      drivers: net: phy: xgene: Add MDIO driver · 43b3cf66
      Iyappan Subramanian committed
      Currently, SGMII-based 1G interfaces rely on the hardware registers for link
      state, which is sometimes unreliable.  To get the most accurate link state, this
      interface has to use the MDIO bus to poll the PHY.
      
      In the X-Gene SoC, the MDIO bus is shared across RGMII- and SGMII-based 1G
      interfaces, so this driver is added to manage the MDIO bus.  This driver
      registers the mdio bus and registers the PHYs connected to it.
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      43b3cf66
    • I
      drivers: net: xgene: Fix module unload crash - clkrst sequence · bc61167a
      Iyappan Subramanian committed
      This patch fixes clock reset sequence.
      
      - Added clock reset sequence for ACPI
      - Added delay in clock reset sequence to make sure pulse is generated
      - Added clk_unprepare_disable() in port shutdown to make sure
        clock increment/decrement counts are matching
      - Removed MII_MGMT_CONFIG programming, since it is not required
      - Fixed programming XGENET_CONFIG_REG to enable SGMII mode
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc61167a
    • I
      drivers: net: xgene: Fix module unload crash - change sw sequence · cb0366b7
      Iyappan Subramanian committed
      When the driver is configured as kernel module and when it gets
      unloaded and reloaded, kernel crash was observed.  This patch
      addresses the software cleanup by doing the following,
      
      - Moved register_netdev call after hardware is ready
      - Since ndev is not ready, added set_irq_name to set irq name
      - Since ndev is not ready, changed mdio_bus->parent to pdev->dev
      - Replaced netif_start(stop)_queue by netif_tx_start(stop)_queues
      - Removed napi_del call since it's called by free_netdev
      - Added dev_close call, within remove
      - Added shutdown callback
      - Changed to use dmam_ APIs
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb0366b7
    • I
      drivers: net: xgene: Fix module unload crash - hw resource cleanup · cb11c062
      Iyappan Subramanian committed
      When the driver is configured as kernel module and when it gets
      unloaded and reloaded, kernel crash was observed.  This patch
      addresses the hardware resource cleanups by doing the following,
      
      - Added mac_ops->clear() to do prefetch buffer clean up
      - Fixed delete freepool buffers logic
      - Reordered mac_enable and mac_disable
      - Added Tx completion ring free
      - Moved down delete_desc_rings after ring cleanup
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb11c062
    • I
      drivers: net: xgene: Separate set_speed from mac_init · 9a8c5dde
      Iyappan Subramanian committed
      Since mac_init is too heavy to be called when the link changes,
      moved the speed_set configuration to a new function and added
      mac_ops->set_speed function pointer.  This function will be
      called from adjust_link callback.
      
      Added cases for 10/100 support for SGMII based 1G interface.
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Tested-by: Fushen Chen <fchen@apm.com>
      Tested-by: Toan Le <toanle@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a8c5dde
    • D
      Merge branch 'refactor-tc_action-structs' · c43212bb
      David S. Miller committed
      Cong Wang says:
      
      ====================
      net_sched: refactor tc action structures
      
      These two patches factor out the struct tcf_common.
      
      v2: fix a compile warning
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c43212bb
    • W
      net_sched: get rid of struct tcf_common · ec0595cc
      WANG Cong committed
      After the previous patch, struct tc_action should be enough
      to represent the generic tc action, tcf_common is not necessary
      any more. This patch gets rid of it to make tc action code
      more readable.
      
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec0595cc
    • W
      net_sched: move tc_action into tcf_common · a85a970a
      WANG Cong committed
      struct tc_action is confusing, currently we use it for two purposes:
      1) Pass in arguments and carry out results from helper functions
      2) A generic representation for tc actions
      
      The first one is error-prone, since we need to make sure we don't
      miss anything. This patch aims to get rid of this use, by moving
      tc_action into tcf_common, so that they are allocated together
      in hashtable and can be cast'ed easily.
      
      And together with the following patch, we could really make
      tc_action a generic representation for all tc actions and each
      type of action can inherit from it.
      
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a85a970a
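The "allocated together and cast easily" idea from the commit above relies on a standard C property: a pointer to a struct can be converted to a pointer to its first member and back. The sketch below illustrates that pattern only; `generic_action`, `redirect_action`, and `to_redirect` are hypothetical names, not the kernel's structures.

```c
/* Hypothetical stand-in for the generic action representation. */
struct generic_action {
    int index;
    int refcnt;
};

/* Hypothetical specific action; the generic part must be the first member. */
struct redirect_action {
    struct generic_action common;
    int ifindex;
};

/* Valid because a pointer to a struct also points to its first member. */
static struct redirect_action *to_redirect(struct generic_action *a)
{
    return (struct redirect_action *)a;
}
```

Generic code can pass `struct generic_action *` around, and type-specific code recovers its own fields with a cast, so there is no separate argument-carrying copy that could go out of sync.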
    • M
      ipvlan: Scrub skb before crossing the namespace boundry · b93dd49c
      Mahesh Bandewar committed
      The earlier patch c3aaa06d (ipvlan: scrub skb before routing
      in L3 mode.) did this, but only for the TX path in L3 mode. This
      patch extends it to both modes and to the TX/RX paths.
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b93dd49c
    • D
      Merge branch 'bnxt_en-improve-ntuple-and-new-IDs' · 16eab559
      David S. Miller committed
      Michael Chan says:
      
      ====================
      bnxt_en: Improve ntuple filters and add new IDs.
      
      Improve ntuple filters and add some new PCI device IDs.  Please review
      for net-next.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      16eab559
    • M
      bnxt_en: Add new NPAR and dual media device IDs. · 1f681688
      Michael Chan committed
      Add 5741X/5731X NPAR device IDs and dual media SFP/10GBase-T device IDs.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1f681688
    • V
      bnxt_en: Log a message, if enabling NTUPLE filtering fails. · a2304909
      Vasundhara Volam committed
      If there are not enough resources to enable ntuple filtering,
      log a warning message.
      
      v2: Use single message and add missing newline.
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2304909
    • M
      bnxt_en: Improve ntuple filters by checking destination MAC address. · a54c4d74
      Michael Chan committed
      Include the destination MAC address in the ntuple filter structure.  The
      current code assumes that the destination MAC address is always the MAC
      address of the NIC.  This may not be true if there are macvlans, for
      example.  Add destination MAC address checking and configure the filter
      correctly using the correct index for the destination MAC address.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a54c4d74
    • M
      qed: Fix setting/clearing bit in completion bitmap · 59d3f1ce
      Manish Chopra committed
      Slowpath completion handling is incorrectly changing
      SPQ_RING_SIZE bits instead of a single one.
      
      Fixes: 76a9a364 ("qed: fix handling of concurrent ramrods")
      Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      59d3f1ce
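The difference between changing a range of bits and changing a single bit, as described in the commit above, can be shown with a small user-space bitmap. This is an illustrative sketch under assumptions: `RING_SIZE`, `set_bit_range`, `set_one_bit`, and `count_bits` are hypothetical names and do not reproduce the driver's code or the kernel bitmap API.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical; stands in for SPQ_RING_SIZE */

/* Set nbits consecutive bits starting at start (range semantics). */
static void set_bit_range(uint32_t *map, unsigned start, unsigned nbits)
{
    for (unsigned i = 0; i < nbits; i++) {
        unsigned bit = start + i;
        map[bit / 32] |= UINT32_C(1) << (bit % 32);
    }
}

/* Set exactly one bit. */
static void set_one_bit(uint32_t *map, unsigned bit)
{
    map[bit / 32] |= UINT32_C(1) << (bit % 32);
}

/* Count the set bits across all words of the map. */
static unsigned count_bits(const uint32_t *map, size_t words)
{
    unsigned n = 0;

    for (size_t w = 0; w < words; w++)
        for (uint32_t v = map[w]; v; v &= v - 1)
            n++;
    return n;
}
```

Marking one completion with the range helper flags `RING_SIZE` entries as done, while the single-bit helper flags only the entry that actually completed.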
    • D
      udp: use sk_filter_trim_cap for udp{,6}_queue_rcv_skb · ba66bbe5
      Daniel Borkmann committed
      After a6127697 ("udp: prevent bugcheck if filter truncates packet
      too much"), there followed various other fixes for similar cases such
      as f4979fce ("rose: limit sk_filter trim to payload").
      
      The latter introduced a new helper sk_filter_trim_cap(), where we can pass
      the trim limit directly to the socket filter handling. Make use of it
      here as well with sizeof(struct udphdr) as lower cap limit and drop the
      extra skb->len test in UDP's input path.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Willem de Bruijn <willemb@google.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ba66bbe5
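The idea of passing a lower cap to the filter's trim step can be sketched in plain C. This is a hypothetical model of the behaviour the commit message describes, not the kernel helper itself: `trim_with_cap` and `UDP_HDR_LEN` are illustrative names, and treating a filter result of 0 as "drop" is an assumption of this sketch.

```c
/* UDP header length in bytes, used here as the lower cap. */
#define UDP_HDR_LEN 8u

/*
 * Hypothetical model of a capped trim: filter_len is how many bytes the
 * socket filter wants to keep (0 means drop the packet).
 * Returns 0 on accept and -1 on drop; *pkt_len is updated on accept.
 */
static int trim_with_cap(unsigned *pkt_len, unsigned filter_len, unsigned cap)
{
    unsigned keep;

    if (filter_len == 0)
        return -1;
    /* Never trim below the cap, so the header always survives. */
    keep = filter_len > cap ? filter_len : cap;
    if (keep < *pkt_len)
        *pkt_len = keep;
    return 0;
}
```

Because the cap is applied inside the trim step, the caller no longer needs a separate length check afterwards to make sure the header is still there.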
    • B
      caif-hsi: Remove deprecated create_singlethread_workqueue · deb1f45a
      Bhaktipriya Shridhar committed
      alloc_workqueue replaces deprecated create_singlethread_workqueue().
      
      A dedicated workqueue has been used since the workitems are being used
      on a packet tx/rx path. Hence, WQ_MEM_RECLAIM has been set to guarantee
      forward progress under memory pressure.
      
      An ordered workqueue has been used since workitems &cfhsi->wake_up_work
      and &cfhsi->wake_down_work cannot be run concurrently.
      
      Calls to flush_workqueue() before destroy_workqueue() have been dropped
      since destroy_workqueue() itself calls drain_workqueue() which flushes
      repeatedly till the workqueue becomes empty.
      Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      deb1f45a
    • D
      Merge branch 'bpf-probe-write-user' · eefc06bd
      David S. Miller committed
      Sargun Dhillon says:
      
      ====================
      bpf: add bpf_probe_write_user helper & example
      
      This patch series contains two patches that add support for a probe_write
      helper to BPF programs. This allows them to manipulate user memory during
      the course of tracing. The second patch in the series has an example that
      uses it, in one of the intended ways, to divert execution.
      
      Thanks to Alexei Starovoitov and Daniel Borkmann for being patient, reviewing, and
      helping me get familiar with the code base. I've made changes based on their
      recommendations.
      
      This helper should be considered for experimental usage and debugging, so we
      print a warning to dmesg, along with the command and pid, when someone
      tries to install a proglet that uses it. A follow-up patchset will contain a
      mechanism to verify the safety of the probe beyond what was done by hand.
      ----
      v1->v2: restrict writing to user space, as opposed to globally
      v2->v3: Fixed formatting issues
      v3->v4: Rename copy_to_user -> bpf_probe_write
              Simplify checking of whether or not it's safe to write
              Add warnings to dmesg
      v4->v5: Raise warning level
              Cleanup location of warning code
              Make test fail when helper is broken
      v5->v6: General formatting cleanup
              Rename bpf_probe_write -> bpf_probe_write_user
      v6->v7: More formatting cleanup.
              Clarifying a few comments
              Clarified log message
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eefc06bd
    • S
      samples/bpf: Add test/example of using bpf_probe_write_user bpf helper · cf9b1199
      Sargun Dhillon committed
      This example shows using a kprobe to act as a dnat mechanism to divert
      traffic for arbitrary endpoints. It rewrites the arguments to a syscall
      while they're still in userspace, and before the syscall has a chance
      to copy the argument into kernel space.
      
      Although this is an example, it also acts as a test because the mapped
      address is 255.255.255.255:555 -> real address, and that's not a legal
      address to connect to. If the helper is broken, the example will fail
      on the intermediate steps, as well as the final step to verify the
      rewrite of userspace memory succeeded.
      Signed-off-by: Sargun Dhillon <sargun@sargun.me>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cf9b1199
    • S
      bpf: Add bpf_probe_write_user BPF helper to be called in tracers · 96ae5227
      Sargun Dhillon committed
      This allows user memory to be written to during the course of a kprobe.
      It shouldn't be used to implement any kind of security mechanism
      because of TOC-TOU attacks, but rather to debug, divert, and
      manipulate execution of semi-cooperative processes.
      
      Although it uses probe_kernel_write, we limit the address space
      the probe can write into by checking the space with access_ok.
      We do this as opposed to calling copy_to_user directly, in order
      to avoid sleeping. In addition we ensure the thread's current fs
      / segment is USER_DS and the thread isn't exiting nor a kernel thread.
      
      Given this feature is meant for experiments, and it has a risk of
      crashing the system and running programs, we print a warning
      when a proglet that attempts to use this helper is installed,
      along with the pid and process name.
      Signed-off-by: Sargun Dhillon <sargun@sargun.me>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      96ae5227
    • A
      net/mlx4_core: Check device state before unregistering it · 9b022a6e
      Alex Vesker committed
      Verify that the device state is registered before un-registering it.
      This check is required to prevent an OOPS in flows that re-register
      the device when its previous state was unregistered.
      
      Fixes: 225c7b1f ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
      Signed-off-by: Alex Vesker <valex@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9b022a6e
    • I
      mlxsw: spectrum: Fix compilation error when CLS_ACT isn't set · 86cb13e4
      Ido Schimmel committed
      When CONFIG_NET_CLS_ACT isn't set 'struct tcf_exts' has no member named
      'actions' and we therefore must not access it. Otherwise compilation
      fails.
      
      Fix this by introducing a new macro similar to tc_no_actions(), which
      always returns 'false' if CONFIG_NET_CLS_ACT isn't set.
      
      Fixes: 763b4b70 ("mlxsw: spectrum: Add support in matchall mirror TC offloading")
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      86cb13e4
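The fix pattern in the commit above, a macro that evaluates to 'false' when the struct member it would read is compiled out, can be sketched as follows. All names here are hypothetical: `HYPOTHETICAL_CLS_ACT` stands in for the config option, `struct exts` for the real structure, and `exts_single_action` for the new macro.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the structure with an optional member. */
struct exts {
    int type;
#ifdef HYPOTHETICAL_CLS_ACT
    int nr_actions;     /* only present when the option is enabled */
#endif
};

#ifdef HYPOTHETICAL_CLS_ACT
#define exts_single_action(e) ((e)->nr_actions == 1)
#else
/* The member does not exist here, so never touch it. */
#define exts_single_action(e) ((void)(e), false)
#endif
```

Callers use the macro unconditionally, and the build succeeds in both configurations because the member access only exists where the member does.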
    • U
      net: davinci_cpdma: remove excessive dump of register values to kernel log · 3568bdf0
      Uwe Kleine-König committed
      Such a big dump of register values is hardly useful on a production
      system.
      
      Another downside of the now removed functions is that calling
      emac_dump_regs resulted in at least 87 calls to dev_info while holding a
      spinlock and having irqs off, which is a big source of latency.
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3568bdf0
    • C
      gtp: #define #define _GTP_H_ and not #define _GTP_H · 9b8ac4f9
      Colin Ian King committed
      Fix clang build warning:
      
      ./include/net/gtp.h:1:9: warning: '_GTP_H_' is used as a header
      guard here, followed by #define of a different macro [-Wheader-guard]
      
      Fix by defining _GTP_H_ and not _GTP_H
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9b8ac4f9
    • D
      Merge branch 'mlx5-minimum-inline-header-mode' · 779d1436
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      Mellanox 100G mlx5 minimum inline header mode
      
      This small series from Hadar adds the support for minimum inline header mode query
      in mlx5e NIC driver.
      
      Today on TX the driver copies to the HW descriptor only up to L2 header which is the default
      required mode and sufficient for today's needs.
      
      The header in the HW descriptor is used for HW loopback steering decision, without it packets
      will go directly to the wire with no questions asked.
      
      For TX loopback steering according to L2/L3/L4 headers, ConnectX-4 requires to copy the
      corresponding headers into the send queue(SQ) WQE HW descriptor so it can decide whether to loop it back
      or to forward to wire.
      
      For legacy E-Switch mode only L2 headers copy is required.
      For advanced steering (E-Switch offloads) more header layers may be required to be copied,
      the required mode will be advertised by FW to each VF and PF according to the corresponding
      E-Switch configuration.
      
      Changes V2:
       - Allocate query_nic_vport_context_out on the stack
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      779d1436
    • H
      net/mlx5e: Query minimum required header copy during xmit · cff92d7c
      Hadar Hen Zion committed
      Add support for querying the minimum inline mode from the Firmware.
      It is required for correct TX steering according to L3/L4 packet
      headers.
      
      Each send queue (SQ) has an inline mode that defines the minimal required
      headers that need to be copied into the SQ WQE.
      The driver asks the Firmware for the wqe_inline_mode device capability
      value.  In case the device capability is defined as "vport context" the
      driver must check the reported min inline mode from the vport context
      before creating its SQs.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cff92d7c
    • H
      net/mlx5e: Check the minimum inline header mode before xmit · ae76715d
      Hadar Hen Zion committed
      Each send queue (SQ) has an inline mode that defines the minimal required
      inline headers in the SQ WQE.
      Before sending each packet, check that the minimum required headers
      are copied into the WQE.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ae76715d
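The per-packet check described in the two mlx5e commits above amounts to "copy at least as many header bytes as the queue's inline mode demands". The sketch below is a hypothetical model of that rule: the enum values, `struct hdr_lens`, `min_inline_bytes`, and `bytes_to_inline` are illustrative names and do not come from the driver.

```c
/* Hypothetical inline modes, ordered from least to most header data. */
enum inline_mode { INLINE_L2, INLINE_IP, INLINE_TCP_UDP };

/* Hypothetical per-packet header lengths in bytes. */
struct hdr_lens {
    unsigned l2;
    unsigned l3;
    unsigned l4;
};

/* Minimum number of leading bytes that must be copied for a given mode. */
static unsigned min_inline_bytes(enum inline_mode mode,
                                 const struct hdr_lens *h)
{
    switch (mode) {
    case INLINE_TCP_UDP:
        return h->l2 + h->l3 + h->l4;
    case INLINE_IP:
        return h->l2 + h->l3;
    case INLINE_L2:
    default:
        return h->l2;
    }
}

/* Raise the requested inline size to the minimum the mode requires. */
static unsigned bytes_to_inline(unsigned requested, enum inline_mode mode,
                                const struct hdr_lens *h)
{
    unsigned need = min_inline_bytes(mode, h);

    return requested < need ? need : requested;
}
```

For a packet with a 14-byte L2 header and a 20-byte L3 header, a queue in the IP mode would need at least 34 bytes inlined even if the default request covers only the L2 header.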
    • V
      net/sctp: terminate rhashtable walk correctly · 5fc382d8
      Vegard Nossum committed
      I was seeing a lot of these:
      
          BUG: sleeping function called from invalid context at mm/slab.h:388
          in_atomic(): 0, irqs_disabled(): 0, pid: 14971, name: trinity-c2
          Preemption disabled at:[<ffffffff819bcd46>] rhashtable_walk_start+0x46/0x150
      
           [<ffffffff81149abb>] preempt_count_add+0x1fb/0x280
           [<ffffffff83295722>] _raw_spin_lock+0x12/0x40
           [<ffffffff811aac87>] console_unlock+0x2f7/0x930
           [<ffffffff811ab5bb>] vprintk_emit+0x2fb/0x520
           [<ffffffff811aba6a>] vprintk_default+0x1a/0x20
           [<ffffffff812c171a>] printk+0x94/0xb0
           [<ffffffff811d6ed0>] print_stack_trace+0xe0/0x170
           [<ffffffff8115835e>] ___might_sleep+0x3be/0x460
           [<ffffffff81158490>] __might_sleep+0x90/0x1a0
           [<ffffffff8139b823>] kmem_cache_alloc+0x153/0x1e0
           [<ffffffff819bca1e>] rhashtable_walk_init+0xfe/0x2d0
           [<ffffffff82ec64de>] sctp_transport_walk_start+0x1e/0x60
           [<ffffffff82edd8ad>] sctp_transport_seq_start+0x4d/0x150
           [<ffffffff8143a82b>] seq_read+0x27b/0x1180
           [<ffffffff814f97fc>] proc_reg_read+0xbc/0x180
           [<ffffffff813d471b>] __vfs_read+0xdb/0x610
           [<ffffffff813d4d3a>] vfs_read+0xea/0x2d0
           [<ffffffff813d615b>] SyS_pread64+0x11b/0x150
           [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
           [<ffffffff832960a5>] return_from_SYSCALL_64+0x0/0x6a
           [<ffffffffffffffff>] 0xffffffffffffffff
      
      Apparently we always need to call rhashtable_walk_stop(), even when
      rhashtable_walk_start() fails:
      
       * rhashtable_walk_start - Start a hash table walk
       * @iter:       Hash table iterator
       *
       * Start a hash table walk.  Note that we take the RCU lock in all
       * cases including when we return an error.  So you must always call
       * rhashtable_walk_stop to clean up.
      
      otherwise we never call rcu_read_unlock() and we get the splat above.
      
      Fixes: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag")
      See-also: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag")
      See-also: f2dba9c6 ("rhashtable: Introduce rhashtable_walk_*")
      Cc: Xin Long <lucien.xin@gmail.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: stable@vger.kernel.org
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5fc382d8
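The rule quoted in the commit above, that the stop call is required even when the start call returns an error, can be demonstrated with a mock iterator. This is a user-space sketch under assumptions: `mock_iter`, `mock_walk_start`, `mock_walk_stop`, `walk_buggy`, and `walk_fixed` are hypothetical names, and `lock_depth` merely stands in for the RCU read lock.

```c
#include <errno.h>

/* Hypothetical iterator; lock_depth stands in for the RCU read lock. */
struct mock_iter {
    int lock_depth;
    int fail_start;
};

/* Takes the lock in all cases, including when it returns an error. */
static int mock_walk_start(struct mock_iter *it)
{
    it->lock_depth++;
    return it->fail_start ? -EAGAIN : 0;
}

/* Releases the lock taken by mock_walk_start(). */
static void mock_walk_stop(struct mock_iter *it)
{
    it->lock_depth--;
}

/* Buggy pattern: returns early on error and leaves the lock held. */
static int walk_buggy(struct mock_iter *it)
{
    int err = mock_walk_start(it);

    if (err)
        return err;
    mock_walk_stop(it);
    return 0;
}

/* Fixed pattern: stop is called whether or not start succeeded. */
static int walk_fixed(struct mock_iter *it)
{
    int err = mock_walk_start(it);

    mock_walk_stop(it);
    return err;
}
```

After a failed start, the buggy variant leaves `lock_depth` at 1, which corresponds to the missing rcu_read_unlock() behind the splat; the fixed variant returns the same error with the lock released.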
    • D
      Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 9bc4a1cc
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      10GbE Intel Wired LAN Driver Updates 2016-07-22
      
      This series contains updates to ixgbe and ixgbevf only.
      
      Emil fixes the NACK check in ixgbevf_set_uc_addr_vf() for instances where
      the index is not equal to zero.  Fixes an issue where mac->ops.setup_fc
      can be NULL for backplanes which can cause the driver to crash on load.
      
      Don fixes the second parameter of the LED functions, which is the index to
      the LED we are interested in affecting.  Fixed variable to store register
      reads to unsigned integer.  Adds support for the new x553 hardware into
      ixgbevf.  Fixed a missing rtnl lock around ixgbevf_reinit_locked().
      Fixed an issue where in ixgbevf_reset_subtask() was not verifying that
      the port has been removed.  Cleans up the initial crosstalk fix, since
      the SFP that indicates the presence of a SFP+ module changes between
      hardware types.
      
      Babu Moger fixes typo in freeing IRQ, since the array subscript increments
      after the execution of the statement.
      
      Wei Yongjun adds the missing destroy_workqueue() before returning from
      ixgbe_init_module() in the error handling case.
      
      Tony adds range checking for setting the MTU from the VF, where the PF can
      return a NACK but this was not passed on to the VF, so propagate the
      results from the PF to the VF so errors can be reported.  Consolidates
      mailbox read and write functions, since the recent changes to
      ixgbevf_write_msg_read_ack(), other functions are performing the same
      operations done here.
      
      Colin Ian King removes a redundant check on ret_val, since ret_val has
      not changed since the previous check.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9bc4a1cc
    • V
      net/irda: fix NULL pointer dereference on memory allocation failure · d3e6952c
      Vegard Nossum committed
      I ran into this:
      
          kasan: CONFIG_KASAN_INLINE enabled
          kasan: GPF could be caused by NULL-ptr deref or user memory access
          general protection fault: 0000 [#1] PREEMPT SMP KASAN
          CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ #19
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
          task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000
          RIP: 0010:[<ffffffff82bbf066>]  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
          RSP: 0018:ffff880111747bb8  EFLAGS: 00010286
          RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358
          RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048
          RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000
          R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000
          R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000
          FS:  00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0
          Stack:
           0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220
           ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232
           ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000
          Call Trace:
           [<ffffffff82bca542>] irda_connect+0x562/0x1190
           [<ffffffff825ae582>] SYSC_connect+0x202/0x2a0
           [<ffffffff825b4489>] SyS_connect+0x9/0x10
           [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
           [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
          Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74
          RIP  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
           RSP <ffff880111747bb8>
          ---[ end trace 4cda2588bc055b30 ]---
      
      The problem is that irda_open_tsap() can fail and leave self->tsap = NULL,
      and then irttp_connect_request() almost immediately dereferences it.
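
The failure mode can be sketched in user-space C. This is a minimal illustration, not the kernel patch: `struct conn`, `open_tsap()`, `connect_request()`, and `do_connect()` are hypothetical stand-ins for the IrDA socket, irda_open_tsap(), irttp_connect_request(), and the connect path. The only point is that the caller checks the open step's result before passing the pointer on.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects named above. */
struct tsap { int connected; };
struct conn { struct tsap *tsap; };

/* Models an open step that can fail and leave c->tsap == NULL. */
static int open_tsap(struct conn *c, int simulate_oom)
{
    c->tsap = simulate_oom ? NULL : calloc(1, sizeof(*c->tsap));
    return c->tsap ? 0 : -ENOMEM;
}

/* Dereferences its argument immediately, so it must never receive NULL. */
static int connect_request(struct tsap *t)
{
    assert(t != NULL);
    t->connected = 1;
    return 0;
}

/* Checks the open result and propagates the error instead of continuing. */
static int do_connect(struct conn *c, int simulate_oom)
{
    int err = open_tsap(c, simulate_oom);

    if (err)
        return err;
    return connect_request(c->tsap);
}
```

When the allocation fails, `do_connect()` returns the error to its caller and `connect_request()` is never reached.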
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d3e6952c
    • M
      sctp: also point GSO head_skb to the sk when it's available · 52253db9
      Marcelo Ricardo Leitner committed
      The head skb for GSO packets won't travel through the inner depths of
      SCTP stack as it doesn't contain any chunks on it. That means skb->sk
      doesn't get set and then when sctp_recvmsg() calls
      sctp_inet6_skb_msgname() on the head_skb it panics, as the latter needs
      to check flags on the socket (sp->v4mapped).
      
      The fix is to initialize skb->sk for the head skb once we are able to do
      it. That is, when the first chunk is processed.
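
A minimal user-space sketch of the idea, under stated assumptions: `struct sock` and `struct skb` here are toy types, and `process_chunk()` and `msgname_is_v4mapped()` are hypothetical stand-ins for chunk processing and for a reader such as sctp_inet6_skb_msgname(). The toy reader returns -1 where the kernel code would dereference a NULL socket pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy types; only the fields needed for the illustration. */
struct sock { int v4mapped; };
struct skb  { struct sock *sk; };

/*
 * Called for each chunk once the owning socket is known. The head skb
 * carries no chunks itself, so it gets its owner the first time a chunk
 * is processed.
 */
static void process_chunk(struct skb *head, struct skb *chunk, struct sock *sk)
{
    assert(sk != NULL);
    chunk->sk = sk;
    if (head->sk == NULL)
        head->sk = sk;
}

/* Reader that needs a socket flag; -1 marks the "no owner yet" case. */
static int msgname_is_v4mapped(const struct skb *head)
{
    return head->sk ? head->sk->v4mapped : -1;
}
```

Before any chunk is processed the head skb has no owner; after the first call to `process_chunk()` the reader can reach the socket flag through the head skb.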
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52253db9
    • M
      sctp: fix BH handling on socket backlog · eefc1b1d
      Marcelo Ricardo Leitner committed
      Now that the backlog processing is called with BH enabled, we have to
      disable BH before taking the socket lock via bh_lock_sock(), otherwise
      it may deadlock:
      
      sctp_backlog_rcv()
                      bh_lock_sock(sk);
      
                      if (sock_owned_by_user(sk)) {
                              if (sk_add_backlog(sk, skb, sk->sk_rcvbuf))
                                      sctp_chunk_free(chunk);
                              else
                                      backloged = 1;
                      } else
                              sctp_inq_push(inqueue, chunk);
      
                      bh_unlock_sock(sk);
      
      sctp_inq_push() was disabling and re-enabling BH, but enabling BH
      triggers pending softirqs, which may then try to re-lock the socket in
      sctp_rcv().
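
The ordering problem can be modelled with a toy single-CPU simulation in user-space C. This is a sketch under stated assumptions, not kernel code: the counters and flags below are hypothetical stand-ins for the BH-disable depth, a pending receive softirq, and the socket spinlock, and `deadlocked` is set at the point where a real spinlock would spin forever.

```c
#include <stdbool.h>

/* Toy single-CPU model; every name here is a hypothetical stand-in. */
static int  bh_disable_depth;  /* nesting depth of "BH disabled" */
static bool softirq_pending;   /* a packet is waiting for the receive softirq */
static bool sock_locked;       /* models the socket spinlock */
static bool deadlocked;        /* set where a real spinlock would spin forever */

/* Models the softirq receive path, which takes the socket lock. */
static void rx_softirq(void)
{
    if (sock_locked) {
        deadlocked = true;
        return;
    }
    sock_locked = true;
    sock_locked = false;
}

static void bh_disable(void)
{
    bh_disable_depth++;
}

/* Pending softirqs run only when the outermost disable is dropped. */
static void bh_enable(void)
{
    if (--bh_disable_depth == 0 && softirq_pending) {
        softirq_pending = false;
        rx_softirq();
    }
}

/* Models the inner helper that disables and re-enables BH itself. */
static void inq_push(void)
{
    bh_disable();
    bh_enable();
}

/* Returns true if the modelled run would have deadlocked. */
static bool backlog_rcv(bool disable_bh_first)
{
    bh_disable_depth = 0;
    softirq_pending = true;
    sock_locked = false;
    deadlocked = false;

    if (disable_bh_first)
        bh_disable();
    sock_locked = true;   /* take the socket lock */
    inq_push();
    sock_locked = false;  /* release the socket lock */
    if (disable_bh_first)
        bh_enable();
    return deadlocked;
}
```

In this model `backlog_rcv(false)` reports a deadlock, because the inner enable drops the depth to zero while the lock is held. `backlog_rcv(true)` does not, because the pending softirq only runs after the lock has been released.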
      
      [  219.187215]  <IRQ>
      [  219.187217]  [<ffffffff817ca3e0>] _raw_spin_lock+0x20/0x30
      [  219.187223]  [<ffffffffa041888c>] sctp_rcv+0x48c/0xba0 [sctp]
      [  219.187225]  [<ffffffff816e7db2>] ? nf_iterate+0x62/0x80
      [  219.187226]  [<ffffffff816f1b14>] ip_local_deliver_finish+0x94/0x1e0
      [  219.187228]  [<ffffffff816f1e1f>] ip_local_deliver+0x6f/0xf0
      [  219.187229]  [<ffffffff816f1a80>] ? ip_rcv_finish+0x3b0/0x3b0
      [  219.187230]  [<ffffffff816f17a8>] ip_rcv_finish+0xd8/0x3b0
      [  219.187232]  [<ffffffff816f2122>] ip_rcv+0x282/0x3a0
      [  219.187233]  [<ffffffff810d8bb6>] ? update_curr+0x66/0x180
      [  219.187235]  [<ffffffff816abac4>] __netif_receive_skb_core+0x524/0xa90
      [  219.187236]  [<ffffffff810d8e00>] ? update_cfs_shares+0x30/0xf0
      [  219.187237]  [<ffffffff810d557c>] ? __enqueue_entity+0x6c/0x70
      [  219.187239]  [<ffffffff810dc454>] ? enqueue_entity+0x204/0xdf0
      [  219.187240]  [<ffffffff816ac048>] __netif_receive_skb+0x18/0x60
      [  219.187242]  [<ffffffff816ad1ce>] process_backlog+0x9e/0x140
      [  219.187243]  [<ffffffff816ac8ec>] net_rx_action+0x22c/0x370
      [  219.187245]  [<ffffffff817cd352>] __do_softirq+0x112/0x2e7
      [  219.187247]  [<ffffffff817cc3bc>] do_softirq_own_stack+0x1c/0x30
      [  219.187247]  <EOI>
      [  219.187248]  [<ffffffff810aa1c8>] do_softirq.part.14+0x38/0x40
      [  219.187249]  [<ffffffff810aa24d>] __local_bh_enable_ip+0x7d/0x80
      [  219.187254]  [<ffffffffa0408428>] sctp_inq_push+0x68/0x80 [sctp]
      [  219.187258]  [<ffffffffa04190f1>] sctp_backlog_rcv+0x151/0x1c0 [sctp]
      [  219.187260]  [<ffffffff81692b07>] __release_sock+0x87/0xf0
      [  219.187261]  [<ffffffff81692ba0>] release_sock+0x30/0xa0
      [  219.187265]  [<ffffffffa040e46d>] sctp_accept+0x17d/0x210 [sctp]
      [  219.187266]  [<ffffffff810e7510>] ? prepare_to_wait_event+0xf0/0xf0
      [  219.187268]  [<ffffffff8172d52c>] inet_accept+0x3c/0x130
      [  219.187269]  [<ffffffff8168d7a3>] SYSC_accept4+0x103/0x210
      [  219.187271]  [<ffffffff817ca2ba>] ? _raw_spin_unlock_bh+0x1a/0x20
      [  219.187272]  [<ffffffff81692bfc>] ? release_sock+0x8c/0xa0
      [  219.187276]  [<ffffffffa0413e22>] ? sctp_inet_listen+0x62/0x1b0 [sctp]
      [  219.187277]  [<ffffffff8168f2d0>] SyS_accept+0x10/0x20
      
      Fixes: 860fbbc3 ("sctp: prepare for socket backlog behavior change")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eefc1b1d