1. 12 Aug, 2019 26 commits
    • D
      Merge branch 'net-dsa-mv88e6xxx-prepare-Wait-Bit-operation' · a8583901
      David S. Miller committed
      Vivien Didelot says:
      
      ====================
      net: dsa: mv88e6xxx: prepare Wait Bit operation
      
      The Remote Management Interface has its own implementation of a Wait
      Bit operation, which requires a bit number and a value to wait for.
      
      To prepare for the introduction of this implementation, rework the
      code waiting for bits and masks in mv88e6xxx to match this signature.
      
      This has the benefit of unifying the implementation of wait routines
      while removing obsolete wait and update functions and reducing the code.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a8583901
    • V
      net: dsa: mv88e6xxx: add delay in direct SMI wait · eede2361
      Vivien Didelot committed
      The mv88e6xxx_smi_direct_wait routine is used to wait on indirect
      register accesses. It is no exception and must delay between read
      attempts, like the other wait routines.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eede2361
    • V
      net: dsa: mv88e6xxx: fix SMI bit checking · 1c6463b6
      Vivien Didelot committed
      The current mv88e6xxx_smi_direct_wait function is only used to check
      the 16th bit of the (16-bit) SMI Command register. But a bit shift
      alone is not enough if we eventually use this function to check
      other bits, because any higher bits would still affect the result.
      Thus replace it with a mask.
      
      Fixes: e7ba0fad ("net: dsa: mv88e6xxx: refine SMI support")
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1c6463b6
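To illustrate why a shift is not enough for bits other than the topmost one, here is a minimal user-space sketch. The helper names `bit_matches_shift` and `bit_matches_mask` are hypothetical, and the expressions only approximate the kind of check the commit message describes; the driver's actual code is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shift-only check: every bit above 'bit' survives the shift, so the
 * result is only reliable when 'bit' is the topmost bit of the register. */
static bool bit_matches_shift(uint16_t data, unsigned int bit, bool val)
{
	assert(bit < 16);
	return !!(data >> bit) == val;
}

/* Mask-based check: isolates exactly one bit, so it works for any bit. */
static bool bit_matches_mask(uint16_t data, unsigned int bit, bool val)
{
	assert(bit < 16);
	return !!(data & (1u << bit)) == val;
}
```

With `data = 0x8000`, both helpers agree for bit 15, but the shift-only helper also reports bit 3 as set, which is the failure mode the mask avoids.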
    • V
      net: dsa: mv88e6xxx: remove wait and update routines · 2ad4da77
      Vivien Didelot committed
      Now that we have proper Wait Bit and Wait Mask routines, remove the
      unused mv88e6xxx_wait routine and its Global 1 and Global 2 variants.
      
      The indirect tables such as the Device Mapping Table or Priority
      Override Table make use of an Update bit to distinguish reading (0)
      from writing (1) operations. After a write operation occurs, the bit
      self-clears right away, so there's no need to wait on it. Thus keep
      things simple and remove the mv88e6xxx_update helper as well.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2ad4da77
    • V
      net: dsa: mv88e6xxx: wait for AVB Busy bit · 28ae1e96
      Vivien Didelot committed
      The AVB is not an indirect table using an Update bit, but a unit using
      a Busy bit. This means that we must ensure that this bit is cleared
      before setting it and wait until it gets cleared again after writing
      an operation. Reflect that.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      28ae1e96
    • V
      net: dsa: mv88e6xxx: introduce wait bit routine · 19fb7f69
      Vivien Didelot committed
      Many portions of the driver need to wait until a given bit is set
      or cleared. Some buses even have a specific implementation for this
      operation. In preparation for such a variant, implement a generic Wait
      Bit routine that can be used by the driver core functions.
      
      This allows us to get rid of the custom implementations we may find
      in the driver. Note that for the EEPROM bits, BUSY and RUNNING bits
      are independent, thus it is more efficient to wait independently for
      each bit instead of waiting for their mask.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      19fb7f69
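The relationship between a Wait Mask and a Wait Bit routine can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `wait_mask`, `wait_bit`, `reg_read_fn`, `fake_dev` and `fake_read` are hypothetical names, the attempt count is arbitrary, and the sleep between reads that a real driver needs is reduced to a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical register accessor: returns 0 on success and fills *val. */
typedef int (*reg_read_fn)(void *ctx, uint16_t *val);

/* Poll until (register & mask) == val, giving up after 'attempts' reads. */
static int wait_mask(reg_read_fn read, void *ctx, uint16_t mask,
		     uint16_t val, int attempts)
{
	uint16_t data;
	int err, i;

	for (i = 0; i < attempts; i++) {
		err = read(ctx, &data);
		if (err)
			return err;
		if ((data & mask) == val)
			return 0;
		/* A real driver would delay here before the next read. */
	}
	return -ETIMEDOUT;
}

/* Wait Bit expressed in terms of Wait Mask: a bit number plus a value. */
static int wait_bit(reg_read_fn read, void *ctx, unsigned int bit,
		    int val, int attempts)
{
	uint16_t mask;

	assert(bit < 16);
	mask = (uint16_t)(1u << bit);
	return wait_mask(read, ctx, mask, val ? mask : 0, attempts);
}

/* Simulated device whose busy bit (bit 15) clears after a few reads. */
struct fake_dev {
	uint16_t reg;
	int reads_until_idle;
};

static int fake_read(void *ctx, uint16_t *val)
{
	struct fake_dev *dev = ctx;

	if (dev->reads_until_idle > 0 && --dev->reads_until_idle == 0)
		dev->reg &= (uint16_t)~(1u << 15);
	*val = dev->reg;
	return 0;
}
```

Waiting for a non-zero mask value, as the Wait Mask commit in this log describes for the PPU state, uses the same `wait_mask` helper with `val` set to the desired field value.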
    • V
      net: dsa: mv88e6xxx: introduce wait mask routine · 683f2244
      Vivien Didelot committed
      The current mv88e6xxx_wait routine is used to wait for a given mask
      to be cleared to zero. However, in some cases, the driver may have
      to wait for a given mask to be of a certain non-zero value.
      
      Thus provide a generic wait mask routine that will be used to implement
      the current mv88e6xxx_wait function, and use it to wait for 88E6185
      PPU states.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      683f2244
    • V
      net: dsa: mv88e6xxx: wait for 88E6185 PPU disabled · 92993853
      Vivien Didelot committed
      The PPU state of 88E6185 can be either "Disabled at Reset" or
      "Disabled after Initialization". Because we intentionally clear the
      PPU Enabled bit before checking its state, it is safe to wait for the
      MV88E6185_G1_STS_PPU_STATE_DISABLED state explicitly instead of waiting
      for any state different from MV88E6185_G1_STS_PPU_STATE_POLLING.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      92993853
    • H
      r8169: inline rtl8169_free_rx_databuff · eb2e7f09
      Heiner Kallweit committed
      rtl8169_free_rx_databuff is used in only one place, so let's inline it.
      We can improve the loop because rtl8169_init_ring zeroes Rx_databuff
      before calling rtl8169_rx_fill, and rtl8169_rx_fill fills
      Rx_databuff starting from index 0.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb2e7f09
    • D
      Merge branch 'realtek-phy-next' · d35bbe84
      David S. Miller committed
      Heiner Kallweit says:
      
      ====================
      net: phy: realtek: add support for integrated 2.5Gbps PHY in RTL8125
      
      This series adds support for the integrated 2.5Gbps PHY in RTL8125.
      The first three patches add the necessary functionality to phylib.
      
      Changes in v2:
      - added patch 1
      - changed patch 4 to use a fake PHY ID that is injected by the
        network driver. This allows the use of a dedicated PHY driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d35bbe84
    • H
      net: phy: realtek: add support for the 2.5Gbps PHY in RTL8125 · 087f5b87
      Heiner Kallweit committed
      This adds support for the integrated 2.5Gbps PHY in Realtek RTL8125.
      Advertisement of 2.5Gbps mode is done via a vendor-specific register.
      The same applies to reading the NBase-T link partner advertisement.
      Unfortunately this 2.5Gbps PHY shares the PHY ID with the integrated
      1Gbps PHYs in other Realtek network chips and so far no method is
      known to differentiate them. As a workaround use a dedicated fake PHY ID
      that is set by the network driver by intercepting the MDIO PHY ID read.
      
      v2:
      - Create dedicated PHY driver and use a fake PHY ID that is injected by
        the network driver. Suggested by Andrew Lunn.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      087f5b87
    • H
      net: phy: add phy_modify_paged_changed · bf22b343
      Heiner Kallweit committed
      Add helper function phy_modify_paged_changed; its behavior is the same
      as for phy_modify_changed.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bf22b343
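The "changed" convention can be sketched on an in-memory register file. This is not phylib code: `fake_phy` and `modify_paged_changed` are hypothetical stand-ins, and the sketch assumes the convention of returning 1 when the register value changed and 0 when it was already as requested (error paths, which would return a negative value, are not modeled).

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PAGES 2   /* hypothetical sizes for the simulated PHY */
#define NUM_REGS  32

struct fake_phy {
	uint16_t regs[NUM_PAGES][NUM_REGS];
	unsigned int writes; /* counts actual register writes */
};

/* Clear the bits in 'mask', set the bits in 'set', and report whether
 * the register content actually changed. The write is skipped when the
 * register already holds the requested value. */
static int modify_paged_changed(struct fake_phy *phy, unsigned int page,
				unsigned int regnum, uint16_t mask,
				uint16_t set)
{
	uint16_t old, updated;

	assert(page < NUM_PAGES && regnum < NUM_REGS);
	old = phy->regs[page][regnum];
	updated = (uint16_t)((old & ~mask) | set);
	if (updated == old)
		return 0;

	phy->regs[page][regnum] = updated;
	phy->writes++;
	return 1;
}
```

A caller can use the return value to decide whether a follow-up action, such as restarting auto-negotiation, is needed at all.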
    • H
      net: phy: prepare phylib to deal with PHY's extending Clause 22 · f4069cd7
      Heiner Kallweit committed
      The integrated PHY in 2.5Gbps chip RTL8125 is the first (known to me)
      PHY that uses standard Clause 22 for all modes up to 1Gbps and adds
      2.5Gbps control using vendor-specific registers. To use phylib for
      the standard part, a few small extensions are needed:
      - Move most of genphy_config_aneg to a new function
        __genphy_config_aneg that takes a parameter indicating whether
        restarting auto-negotiation is needed (depending on whether the
        content of the vendor-specific advertisement register changed).
      - Don't clear phydev->lp_advertising in genphy_read_status so that
        we can set non-C22 mode flags before.
      
      Basically both changes mimic the behavior of the equivalent Clause 45
      functions.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f4069cd7
    • H
      net: phy: simplify genphy_config_advert by using the linkmode_adv_to_xxx_t functions · 3eef8689
      Heiner Kallweit committed
      Using linkmode_adv_to_mii_adv_t and linkmode_adv_to_mii_ctrl1000_t
      simplifies the code. In addition, avoiding the conversion to
      the legacy u32 advertisement format makes it possible to remove the
      warning.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3eef8689
    • J
      netdevsim: register couple of devlink params · 150e8f8a
      Jiri Pirko committed
      Register a couple of devlink params, one generic, one driver-specific.
      Make the values available over debugfs.
      
      Example:
      $ echo "111" > /sys/bus/netdevsim/new_device
      $ devlink dev param
      netdevsim/netdevsim111:
        name max_macs type generic
          values:
            cmode driverinit value 32
        name test1 type driver-specific
          values:
            cmode driverinit value true
      $ cat /sys/kernel/debug/netdevsim/netdevsim111/max_macs
      32
      $ cat /sys/kernel/debug/netdevsim/netdevsim111/test1
      Y
      $ devlink dev param set netdevsim/netdevsim111 name max_macs cmode driverinit value 16
      $ devlink dev param set netdevsim/netdevsim111 name test1 cmode driverinit value false
      $ devlink dev reload netdevsim/netdevsim111
      $ cat /sys/kernel/debug/netdevsim/netdevsim111/max_macs
      16
      $ cat /sys/kernel/debug/netdevsim/netdevsim111/test1
      N
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      150e8f8a
    • D
      Merge branch 'drop_monitor-Capture-dropped-packets-and-metadata' · 6e5ee483
      David S. Miller committed
      Ido Schimmel says:
      
      ====================
      drop_monitor: Capture dropped packets and metadata
      
      So far drop monitor supported only one mode of operation in which a
      summary of recent packet drops is periodically sent to user space as a
      netlink event. The event only includes the drop location (program
      counter) and number of drops in the last interval.
      
      While this mode of operation allows one to understand if the system is
      dropping packets, it is not sufficient if a more detailed analysis is
      required. Both the packet itself and related metadata are missing.
      
      This patchset extends drop monitor with another mode of operation where
      the packet - potentially truncated - and metadata (e.g., drop location,
      timestamp, netdev) are sent to user space as a netlink event. Thanks to
      the extensible nature of netlink, more metadata can be added in the
      future.
      
      To avoid performing expensive operations in the context in which
      kfree_skb() is called, the dropped skbs are cloned and queued on a
      per-CPU skb drop list. The list is then processed in process context
      (using a workqueue), where the netlink messages are allocated, prepared
      and finally sent to user space.
      
      A follow-up patchset will integrate drop monitor with devlink and allow
      the latter to call into drop monitor to report hardware drops. In the
      future, XDP drops can be added as well, thereby making drop monitor the
      go-to netlink channel for diagnosing all packet drops.
      
      Example usage with patched dropwatch [1] can be found here [2]. Example
      dissection of drop monitor netlink events with patched wireshark [3] can
      be found here [4]. I will submit both changes upstream after the kernel
      changes are accepted. Another change worth making is adding a dropmon
      pseudo interface to libpcap, similar to the nflog interface [5]. This
      will allow users to specifically listen on dropmon traffic instead of
      capturing all netlink packets via the nlmon netdev.
      
      Patches #1-#5 prepare the code for the actual changes in later
      patches.
      
      Patch #6 adds another mode of operation to drop monitor in which the
      dropped packet itself is notified to user space along with metadata.
      
      Patch #7 allows users to truncate reported packets to a specific length,
      in case only the headers are of interest. The original length of the
      packet is added as metadata to the netlink notification.
      
      Patch #8 allows users to query the current configuration of drop monitor
      (e.g., alert mode, truncation length).
      
      Patches #9-#10 allow users to tune the length of the per-CPU skb drop
      list according to their needs.
      
      Changes since v1 [6]:
      * Add skb protocol as metadata. This allows user space to correctly
        dissect the packet instead of blindly assuming it is an Ethernet
        packet
      
      Changes since RFC [7]:
      * Limit the length of the per-CPU skb drop list and make it configurable
      * Do not use the hysteresis timer in packet alert mode
      * Introduce alert mode operations in a separate patch and only then
        introduce the new alert mode
      * Use 'skb->skb_iif' instead of 'skb->dev' because the latter is inside
        a union with 'dev_scratch' and therefore not guaranteed to point to a
        valid netdev
      * Return '-EBUSY' instead of '-EOPNOTSUPP' when trying to configure drop
        monitor while it is monitoring
      * Did not change schedule_work() in favor of schedule_work_on() as I did
        not observe a change in number of tail drops
      
      [1] https://github.com/idosch/dropwatch/tree/packet-mode
      [2] https://gist.github.com/idosch/3d524b887e16bc11b4b19e25c23dcc23#file-gistfile1-txt
      [3] https://github.com/idosch/wireshark/tree/drop-monitor-v2
      [4] https://gist.github.com/idosch/3d524b887e16bc11b4b19e25c23dcc23#file-gistfile2-txt
      [5] https://github.com/the-tcpdump-group/libpcap/blob/master/pcap-netfilter-linux.c
      [6] https://patchwork.ozlabs.org/cover/1143443/
      [7] https://patchwork.ozlabs.org/cover/1135226/
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6e5ee483
    • I
      drop_monitor: Expose tail drop counter · e9feb580
      Ido Schimmel committed
      Previous patch made the length of the per-CPU skb drop list
      configurable. Expose a counter that shows how many packets could not be
      enqueued to this list.
      
      This allows users to determine the desired queue length.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9feb580
    • I
      drop_monitor: Make drop queue length configurable · 30328d46
      Ido Schimmel committed
      In packet alert mode, each CPU holds a list of dropped skbs that need to
      be processed in process context and sent to user space. To avoid
      exhausting the system's memory, the maximum length of this queue is
      currently set to 1000.
      
      Allow users to tune the length of this queue according to their needs.
      The configured length is reported to user space when drop monitor
      configuration is queried.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      30328d46
    • I
      drop_monitor: Add a command to query current configuration · 444be061
      Ido Schimmel committed
      Users should be able to query the current configuration of drop monitor
      before they start using it. Add a command to query the existing
      configuration, which currently consists of alert mode and packet
      truncation length.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      444be061
    • I
      drop_monitor: Allow truncation of dropped packets · 57986617
      Ido Schimmel committed
      When sending dropped packets to user space it is not always necessary to
      copy the entire packet, as usually only the headers are of interest.
      
      Allow users to specify the truncation length and add the original length
      of the packet as additional metadata to the netlink message.
      
      By default no truncation is performed.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      57986617
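The truncation behavior described above can be sketched in plain C. The names `drop_report`, `build_report` and `MAX_PAYLOAD` are hypothetical, and treating a truncation length of 0 as "no truncation" is an assumption made for this illustration; the actual netlink message layout is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAYLOAD 256 /* hypothetical cap on the copied payload */

struct drop_report {
	size_t orig_len;    /* packet length before truncation (metadata) */
	size_t payload_len; /* number of bytes actually copied */
	unsigned char payload[MAX_PAYLOAD];
};

/* Copy at most 'trunc_len' bytes of the packet into the report and record
 * the original length. Assumption: trunc_len == 0 disables truncation. */
static void build_report(struct drop_report *r, const unsigned char *pkt,
			 size_t len, size_t trunc_len)
{
	size_t n = len < MAX_PAYLOAD ? len : MAX_PAYLOAD;

	assert(r && pkt);
	if (trunc_len && trunc_len < n)
		n = trunc_len;

	r->orig_len = len;
	r->payload_len = n;
	memcpy(r->payload, pkt, n);
}
```

Keeping the original length next to the truncated payload lets a consumer tell a short packet apart from one that was cut, for example when only the first 14 bytes (an Ethernet header) were requested.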
    • I
      drop_monitor: Add packet alert mode · ca30707d
      Ido Schimmel committed
      So far drop monitor supported only one alert mode in which a summary of
      locations in which packets were recently dropped was sent to user space.
      
      This alert mode is sufficient in order to understand that packets were
      dropped, but lacks information to perform a more detailed analysis.
      
      Add a new alert mode in which the dropped packet itself is passed to
      user space along with metadata: The drop location (as program counter
      and resolved symbol), ingress netdevice and drop timestamp. More
      metadata can be added in the future.
      
      To avoid performing expensive operations in the context in which
      kfree_skb() is invoked (can be hard IRQ), the dropped skb is cloned and
      queued on a per-CPU skb drop list. Then, in process context, the netlink
      message is allocated, prepared and finally sent to user space.
      
      The per-CPU skb drop list is limited to 1000 skbs to prevent exhausting
      the system's memory. Subsequent patches will make this limit
      configurable and also add a counter that indicates how many skbs were
      tail dropped.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca30707d
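The bounded drop list with a tail-drop counter can be sketched as a simple ring buffer. This is a single-threaded user-space illustration, not the kernel implementation: `drop_queue` and its helpers are hypothetical names, plain integers stand in for cloned skbs, and per-CPU placement and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_STORAGE 1000 /* backing storage; matches the default limit */

struct drop_queue {
	int items[QUEUE_STORAGE]; /* stand-ins for cloned skbs */
	size_t head;              /* index of the oldest queued item */
	size_t count;             /* number of queued items */
	size_t limit;             /* configured maximum queue length */
	unsigned long tail_drops; /* items rejected because the queue was full */
};

static void drop_queue_init(struct drop_queue *q, size_t limit)
{
	assert(limit > 0 && limit <= QUEUE_STORAGE);
	q->head = 0;
	q->count = 0;
	q->limit = limit;
	q->tail_drops = 0;
}

/* Producer side (the drop path): never blocks; counts what it rejects. */
static bool drop_queue_enqueue(struct drop_queue *q, int item)
{
	if (q->count >= q->limit) {
		q->tail_drops++;
		return false;
	}
	q->items[(q->head + q->count) % QUEUE_STORAGE] = item;
	q->count++;
	return true;
}

/* Consumer side (process context): takes the oldest item, if any. */
static bool drop_queue_dequeue(struct drop_queue *q, int *item)
{
	if (q->count == 0)
		return false;
	*item = q->items[q->head];
	q->head = (q->head + 1) % QUEUE_STORAGE;
	q->count--;
	return true;
}
```

A steadily growing `tail_drops` value suggests the consumer cannot keep up and the limit may be too small, which is how the counter helps in choosing a queue length.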
    • I
      drop_monitor: Add alert mode operations · 28315f79
      Ido Schimmel committed
      The next patch is going to add another alert mode in which the dropped
      packet is notified to user space, instead of only a summary of recent
      drops.
      
      Abstract the differences between the modes by adding alert mode
      operations. The operations are selected based on the currently
      configured mode and associated with the probes and the work item just
      before tracing starts.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      28315f79
    • I
      drop_monitor: Require CAP_NET_ADMIN for drop monitor configuration · c5ab9b1c
      Ido Schimmel committed
      Currently, the configure command does not do anything but return an
      error. Subsequent patches will enable the command to change various
      configuration options such as alert mode and packet truncation.
      
      Similar to other netlink-based configuration channels, make sure only
      users with the CAP_NET_ADMIN capability set can execute this command.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c5ab9b1c
    • I
      drop_monitor: Reset per-CPU data before starting to trace · 44075f56
      Ido Schimmel committed
      The function reset_per_cpu_data() allocates and prepares a new skb for
      the summary netlink alert message ('NET_DM_CMD_ALERT'). The new skb is
      stored in the per-CPU 'data' variable and the old is returned.
      
      The function is invoked during module initialization and from the
      workqueue, before an alert is sent. This means that it is possible to
      receive an alert with stale data, if we stopped tracing when the
      hysteresis timer ('data->send_timer') was pending.
      
      Instead of invoking the function during module initialization, invoke it
      just before we start tracing and ensure we get a fresh skb.
      
      This also allows us to remove the calls to initialize the timer and the
      work item from the module initialization path, since both could have
      been triggered by the error paths of reset_per_cpu_data().
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      44075f56
    • I
      drop_monitor: Initialize timer and work item upon tracing enable · 70c69274
      Ido Schimmel committed
      The timer and work item are currently initialized once during module
      init, but subsequent patches will need to associate different functions
      with the work item, based on the configured alert mode.
      
      Allow subsequent patches to make that change by initializing and
      de-initializing these objects during tracing enable and disable.
      
      This also guarantees that once the request to disable tracing returns,
      no more netlink notifications will be generated.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      70c69274
    • I
      drop_monitor: Split tracing enable / disable to different functions · 7c747838
      Ido Schimmel committed
      Subsequent patches will need to enable / disable tracing based on the
      configured alerting mode.
      
      Reduce the nesting level and prepare for the introduction of this
      functionality by splitting the tracing enable / disable operations into
      two different functions.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7c747838
  2. 11 Aug, 2019 14 commits