1. 07 Nov, 2020 1 commit
    • rtnetlink: Add RTNH_F_TRAP flag · 968a83f8
      Committed by Ido Schimmel
      The flag indicates to user space that the nexthop is not programmed to
      forward packets in hardware, but rather to trap them to the CPU. This is
      needed, for example, when the MAC of the nexthop neighbour is not
      resolved and packets should reach the CPU to trigger neighbour
      resolution.
      
      The flag will be used in subsequent patches by netdevsim to test nexthop
      objects programming to device drivers and in the future by mlxsw as
      well.
      
      Changes since RFC:
      * Reword commit message
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      968a83f8
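A minimal sketch of how user space might interpret the new flag once it has read `rtnh_flags` for a nexthop. The flag values are copied from the uapi header as I understand them, and the `nh_hw_state` enum and `nexthop_hw_state()` helper are hypothetical names used only for illustration.

```c
/* Flag values as I understand them from include/uapi/linux/rtnetlink.h;
 * redefined here only so the sketch builds without kernel headers. */
#ifndef RTNH_F_OFFLOAD
#define RTNH_F_OFFLOAD 8   /* nexthop is offloaded to hardware */
#endif
#ifndef RTNH_F_TRAP
#define RTNH_F_TRAP 64     /* nexthop traps packets to the CPU */
#endif

/* Hypothetical classification used only by this example. */
enum nh_hw_state { NH_SOFTWARE, NH_OFFLOAD, NH_TRAP };

/* Classify a nexthop from its rtnh_flags value. Trap is checked first so a
 * trapping nexthop is never reported as forwarding in hardware. */
enum nh_hw_state nexthop_hw_state(unsigned int rtnh_flags)
{
	if (rtnh_flags & RTNH_F_TRAP)
		return NH_TRAP;
	if (rtnh_flags & RTNH_F_OFFLOAD)
		return NH_OFFLOAD;
	return NH_SOFTWARE;
}
```

A caller would pass the `rtnh_flags` field of a parsed nexthop and could, for example, print "trap" next to nexthops whose neighbour is still unresolved.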
  2. 06 Nov, 2020 1 commit
  3. 30 Oct, 2020 2 commits
    • bridge: cfm: Netlink GET status Interface. · e77824d8
      Committed by Henrik Bjoernlund
      This is the implementation of CFM netlink status
      get information interface.
      
      Add new nested netlink attributes. These attributes are used by the
      user space to get status information.
      
      GETLINK:
          Request filter RTEXT_FILTER_CFM_STATUS:
          Indicating that CFM status information must be delivered.
      
          IFLA_BRIDGE_CFM:
              Points to the CFM information.
      
          IFLA_BRIDGE_CFM_MEP_STATUS_INFO:
              This indicates that the MEP instance status follows.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO:
              This indicates that the peer MEP status follows.
      
      The CFM nested attribute has the following attributes at the next level.
      
      GETLINK RTEXT_FILTER_CFM_STATUS:
          IFLA_BRIDGE_CFM_MEP_STATUS_INSTANCE:
              The MEP instance number of the delivered status.
              The type is u32.
          IFLA_BRIDGE_CFM_MEP_STATUS_OPCODE_UNEXP_SEEN:
              The MEP instance received CFM PDU with unexpected Opcode.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_MEP_STATUS_VERSION_UNEXP_SEEN:
              The MEP instance received CFM PDU with unexpected version.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_MEP_STATUS_RX_LEVEL_LOW_SEEN:
              The MEP instance received CCM PDU with MD level lower than
              configured level. This frame is discarded.
              The type is u32 (bool).
      
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_INSTANCE:
              The MEP instance number of the delivered status.
              The type is u32.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_PEER_MEPID:
              The added Peer MEP ID of the delivered status.
              The type is u32.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_CCM_DEFECT:
              The CCM defect status.
              The type is u32 (bool).
              True means no CCM frame has been received for 3.25 intervals
              of IFLA_BRIDGE_CFM_CC_CONFIG_EXP_INTERVAL.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_RDI:
              The last received CCM PDU RDI.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_PORT_TLV_VALUE:
              The last received CCM PDU Port Status TLV value field.
              The type is u8.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_IF_TLV_VALUE:
              The last received CCM PDU Interface Status TLV value field.
              The type is u8.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_SEEN:
              A CCM frame has been received from Peer MEP.
              The type is u32 (bool).
              This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_TLV_SEEN:
              A CCM frame with TLV has been received from Peer MEP.
              The type is u32 (bool).
              This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
          IFLA_BRIDGE_CFM_CC_PEER_STATUS_SEQ_UNEXP_SEEN:
              A CCM frame with unexpected sequence number has been received
              from Peer MEP.
              The type is u32 (bool).
              When a sequence number is not one higher than previously received
              then it is unexpected.
              This is cleared after GETLINK IFLA_BRIDGE_CFM_CC_PEER_STATUS_INFO.
      Signed-off-by: Henrik Bjoernlund <henrik.bjoernlund@microchip.com>
      Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e77824d8
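The two status rules described above (CCM defect after 3.25 expected intervals, and an unexpected sequence number being anything other than the previous one plus one) can be sketched as follows. The function names and the microsecond units are hypothetical; whether the 3.25-interval threshold is inclusive, and how sequence wraparound is treated, are assumptions of this sketch rather than statements about the bridge code.

```c
#include <stdbool.h>
#include <stdint.h>

/* CCM defect: true once no CCM frame has arrived for 3.25 expected
 * intervals. 3.25 == 13/4, so the comparison uses integers only.
 * Treating exactly 3.25 intervals as a defect is an assumption. */
bool ccm_defect(uint64_t since_last_ccm_us, uint64_t exp_interval_us)
{
	return since_last_ccm_us * 4 >= exp_interval_us * 13;
}

/* A sequence number is unexpected when it is not one higher than the
 * previously received one. Wraparound from 0xFFFFFFFF to 0 is treated as
 * expected here, which is an assumption of this sketch. */
bool ccm_seq_unexpected(uint32_t prev_seq, uint32_t seq)
{
	return seq != (uint32_t)(prev_seq + 1u);
}
```

With a one-second expected interval, `ccm_defect()` first reports a defect 3.25 seconds after the last received CCM frame.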
    • bridge: cfm: Netlink SET configuration Interface. · 2be665c3
      Committed by Henrik Bjoernlund
      This is the implementation of CFM netlink configuration
      set information interface.
      
      Add new nested netlink attributes. These attributes are used by the
      user space to create/delete/configure CFM instances.
      
      SETLINK:
          IFLA_BRIDGE_CFM:
              Indicates that the following attributes are CFM.
      
          IFLA_BRIDGE_CFM_MEP_CREATE:
              This indicates that a MEP instance must be created.
          IFLA_BRIDGE_CFM_MEP_DELETE:
              This indicates that a MEP instance must be deleted.
          IFLA_BRIDGE_CFM_MEP_CONFIG:
              This indicates that a MEP instance must be configured.
          IFLA_BRIDGE_CFM_CC_CONFIG:
              This indicates that a MEP instance Continuity Check (CC)
              functionality must be configured.
          IFLA_BRIDGE_CFM_CC_PEER_MEP_ADD:
              This indicates that a CC Peer MEP must be added.
          IFLA_BRIDGE_CFM_CC_PEER_MEP_REMOVE:
              This indicates that a CC Peer MEP must be removed.
          IFLA_BRIDGE_CFM_CC_CCM_TX:
              This indicates that the CC transmitted CCM PDU must be configured.
          IFLA_BRIDGE_CFM_CC_RDI:
              This indicates that the CC transmitted CCM PDU RDI must be
              configured.
      
      The CFM nested attribute has the following attributes at the next level.
      
      SETLINK RTEXT_FILTER_CFM_CONFIG:
          IFLA_BRIDGE_CFM_MEP_CREATE_INSTANCE:
              The created MEP instance number.
              The type is u32.
          IFLA_BRIDGE_CFM_MEP_CREATE_DOMAIN:
              The created MEP domain.
              The type is u32 (br_cfm_domain).
              It must be BR_CFM_PORT.
              This means that CFM frames are transmitted and received
              directly on the port - untagged. Not in a VLAN.
          IFLA_BRIDGE_CFM_MEP_CREATE_DIRECTION:
              The created MEP direction.
              The type is u32 (br_cfm_mep_direction).
              It must be BR_CFM_MEP_DIRECTION_DOWN.
              This means that CFM frames are transmitted and received on
              the port. Not in the bridge.
          IFLA_BRIDGE_CFM_MEP_CREATE_IFINDEX:
              The created MEP residence port ifindex.
              The type is u32 (ifindex).
      
          IFLA_BRIDGE_CFM_MEP_DELETE_INSTANCE:
              The deleted MEP instance number.
              The type is u32.
      
          IFLA_BRIDGE_CFM_MEP_CONFIG_INSTANCE:
              The configured MEP instance number.
              The type is u32.
          IFLA_BRIDGE_CFM_MEP_CONFIG_UNICAST_MAC:
              The configured MEP unicast MAC address.
              The type is 6*u8 (array).
              This is used as SMAC in all transmitted CFM frames.
          IFLA_BRIDGE_CFM_MEP_CONFIG_MDLEVEL:
              The configured MEP unicast MD level.
              The type is u32.
              It must be in the range 1-7.
              No CFM frames are passing through this MEP on lower levels.
          IFLA_BRIDGE_CFM_MEP_CONFIG_MEPID:
              The configured MEP ID.
              The type is u32.
              It must be in the range 0-0x1FFF.
              This MEP ID is inserted in any transmitted CCM frame.
      
          IFLA_BRIDGE_CFM_CC_CONFIG_INSTANCE:
              The configured MEP instance number.
              The type is u32.
          IFLA_BRIDGE_CFM_CC_CONFIG_ENABLE:
              The Continuity Check (CC) functionality is enabled or disabled.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_CC_CONFIG_EXP_INTERVAL:
              The CC expected receive interval of CCM frames.
              The type is u32 (br_cfm_ccm_interval).
              This is also the transmission interval of CCM frames when enabled.
          IFLA_BRIDGE_CFM_CC_CONFIG_EXP_MAID:
              The CC expected receive MAID in CCM frames.
              The type is CFM_MAID_LENGTH*u8.
              This MAID is also inserted in transmitted CCM frames.
      
          IFLA_BRIDGE_CFM_CC_PEER_MEP_INSTANCE:
              The configured MEP instance number.
              The type is u32.
          IFLA_BRIDGE_CFM_CC_PEER_MEPID:
              The CC Peer MEP ID added.
              The type is u32.
              When a Peer MEP ID is added and CC is enabled it is expected to
              receive CCM frames from that Peer MEP.
      
          IFLA_BRIDGE_CFM_CC_RDI_INSTANCE:
              The configured MEP instance number.
              The type is u32.
          IFLA_BRIDGE_CFM_CC_RDI_RDI:
              The RDI that is inserted in transmitted CCM PDU.
              The type is u32 (bool).
      
          IFLA_BRIDGE_CFM_CC_CCM_TX_INSTANCE:
              The configured MEP instance number.
              The type is u32.
          IFLA_BRIDGE_CFM_CC_CCM_TX_DMAC:
              The transmitted CCM frame destination MAC address.
              The type is 6*u8 (array).
              This is used as DMAC in all transmitted CFM frames.
          IFLA_BRIDGE_CFM_CC_CCM_TX_SEQ_NO_UPDATE:
              The transmitted CCM frame update (increment) of sequence
              number is enabled or disabled.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_CC_CCM_TX_PERIOD:
              The period of time during which CCM frames are transmitted.
              The type is u32.
              The time is given in seconds. SETLINK IFLA_BRIDGE_CFM_CC_CCM_TX
              must be done before timeout to keep transmission alive.
              When period is zero any ongoing CCM frame transmission
              will be stopped.
          IFLA_BRIDGE_CFM_CC_CCM_TX_IF_TLV:
              The transmitted CCM frame update with Interface Status TLV
              is enabled or disabled.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_CC_CCM_TX_IF_TLV_VALUE:
              The transmitted Interface Status TLV value field.
              The type is u8.
          IFLA_BRIDGE_CFM_CC_CCM_TX_PORT_TLV:
              The transmitted CCM frame update with Port Status TLV is enabled
              or disabled.
              The type is u32 (bool).
          IFLA_BRIDGE_CFM_CC_CCM_TX_PORT_TLV_VALUE:
              The transmitted Port Status TLV value field.
              The type is u8.
      Signed-off-by: Henrik Bjoernlund <henrik.bjoernlund@microchip.com>
      Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2be665c3
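The range constraints listed for IFLA_BRIDGE_CFM_MEP_CONFIG_MDLEVEL (1-7) and IFLA_BRIDGE_CFM_MEP_CONFIG_MEPID (0-0x1FFF) could be checked in user space before sending SETLINK. This is a sketch only: the macro and function names are hypothetical, and the bounds are taken from the commit message rather than from the kernel's validation code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds as stated in the commit message; names are hypothetical. */
#define CFM_MDLEVEL_MIN 1u
#define CFM_MDLEVEL_MAX 7u
#define CFM_MEPID_MAX   0x1FFFu

/* Return true when both values fall inside the documented ranges. */
bool cfm_mep_config_valid(uint32_t mdlevel, uint32_t mepid)
{
	if (mdlevel < CFM_MDLEVEL_MIN || mdlevel > CFM_MDLEVEL_MAX)
		return false;
	return mepid <= CFM_MEPID_MAX;
}
```

Checking locally gives a clearer error message than a rejected netlink request, but the kernel remains the authority on what it accepts.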
  4. 03 Jul, 2020 1 commit
  5. 24 Jun, 2020 1 commit
  6. 16 May, 2020 1 commit
    • net: sched: introduce terse dump flag · f8ab1807
      Committed by Vlad Buslov
      Add new TCA_DUMP_FLAGS attribute and use it in cls API to request terse
      filter output from classifiers with TCA_DUMP_FLAGS_TERSE flag. This option
      is intended to be used to improve performance of TC filter dump when
      userland only needs to obtain stats and not the whole classifier/action
      data. Extend struct tcf_proto_ops with new terse_dump() callback that must
      be defined by supporting classifier implementations.
      
      Support of the options in specific classifiers and actions is
      implemented in following patches in the series.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f8ab1807
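A rough sketch of the payload a dump request might carry to ask for terse output. The commit message does not state the attribute's encoding; as far as I know TCA_DUMP_FLAGS is a 32-bit bitfield attribute (a value plus a selector of the bits being set), so treat that as an assumption. `struct bitfield32_sketch` and `terse_dump_flags()` are hypothetical names that mirror the kernel's bitfield layout for illustration only.

```c
#include <stdint.h>

/* Value as I understand it from the uapi header. */
#ifndef TCA_DUMP_FLAGS_TERSE
#define TCA_DUMP_FLAGS_TERSE (1u << 0)
#endif

/* Local stand-in for a 32-bit bitfield attribute payload:
 * 'selector' says which bits are meaningful, 'value' gives their state. */
struct bitfield32_sketch {
	uint32_t value;
	uint32_t selector;
};

/* Build the payload that selects and sets only the terse bit. */
struct bitfield32_sketch terse_dump_flags(void)
{
	struct bitfield32_sketch flags = {
		.value = TCA_DUMP_FLAGS_TERSE,
		.selector = TCA_DUMP_FLAGS_TERSE,
	};
	return flags;
}
```

The resulting eight bytes would be placed in a TCA_DUMP_FLAGS attribute of the filter dump request.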
  7. 15 Jan, 2020 3 commits
    • net: bridge: vlan: add rtnetlink group and notify support · cf5bddb9
      Committed by Nikolay Aleksandrov
      Add a new rtnetlink group for bridge vlan notifications - RTNLGRP_BRVLAN
      and add support for sending vlan notifications (both single and ranges).
      No functional changes intended, the notification support will be used by
      later patches.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cf5bddb9
    • net: bridge: vlan: add rtm definitions and dump support · 8dcea187
      Committed by Nikolay Aleksandrov
      This patch adds vlan rtm definitions:
       - NEWVLAN: to be used for creating vlans, setting options and
         notifications
       - DELVLAN: to be used for deleting vlans
       - GETVLAN: used for dumping vlan information
      
      Dumping vlans which can span multiple messages is added now with basic
      information (vid and flags). We use nlmsg_parse() to validate the header
      length in order to be able to extend the message with filtering
      attributes later.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8dcea187
    • ipv4: Add "offload" and "trap" indications to routes · 90b93f1b
      Committed by Ido Schimmel
      When performing L3 offload, routes and nexthops are usually programmed
      into two different tables in the underlying device. Therefore, the fact
      that a nexthop resides in hardware does not necessarily mean that all
      the associated routes also reside in hardware and vice-versa.
      
      While the kernel can signal to user space the presence of a nexthop in
      hardware (via 'RTNH_F_OFFLOAD'), it does not have a corresponding flag
      for routes. In addition, the fact that a route resides in hardware does
      not necessarily mean that the traffic is offloaded. For example,
      unreachable routes (i.e., 'RTN_UNREACHABLE') are programmed to trap
      packets to the CPU so that the kernel will be able to generate the
      appropriate ICMP error packet.
      
      This patch adds "offload" and "trap" indications to IPv4 routes, so
      that users will have better visibility into the offload process.
      
      'struct fib_alias' is extended with two new fields that indicate if the
      route resides in hardware or not and if it is offloading traffic from
      the kernel or trapping packets to it. Note that the new fields are added
      in the 6 bytes hole and therefore the struct still fits in a single
      cache line [1].
      
      Capable drivers are expected to invoke fib_alias_hw_flags_set() with the
      route's key in order to set the flags.
      
      The indications are dumped to user space via new flags (i.e.,
      'RTM_F_OFFLOAD' and 'RTM_F_TRAP') in the 'rtm_flags' field in the
      ancillary header.
      
      v2:
      * Make use of 'struct fib_rt_info' in fib_alias_hw_flags_set()
      
      [1]
      struct fib_alias {
              struct hlist_node  fa_list;                      /*     0    16 */
              struct fib_info *          fa_info;              /*    16     8 */
              u8                         fa_tos;               /*    24     1 */
              u8                         fa_type;              /*    25     1 */
              u8                         fa_state;             /*    26     1 */
              u8                         fa_slen;              /*    27     1 */
              u32                        tb_id;                /*    28     4 */
              s16                        fa_default;           /*    32     2 */
              u8                         offload:1;            /*    34: 0  1 */
              u8                         trap:1;               /*    34: 1  1 */
              u8                         unused:6;             /*    34: 2  1 */
      
              /* XXX 5 bytes hole, try to pack */
      
              struct callback_head rcu __attribute__((__aligned__(8))); /*    40    16 */
      
              /* size: 56, cachelines: 1, members: 12 */
              /* sum members: 50, holes: 1, sum holes: 5 */
              /* sum bitfield members: 8 bits (1 bytes) */
              /* forced alignments: 1, forced holes: 1, sum forced holes: 5 */
              /* last cacheline: 56 bytes */
      } __attribute__((__aligned__(8)));
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      90b93f1b
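On the receiving side, the two route indications can be read out of `rtm_flags` independently, since the message describes them as separate bits. The sketch below assumes the flag values from the uapi header as I understand them; `struct route_hw_state` and `route_hw_state_from_flags()` are hypothetical names.

```c
#include <stdbool.h>

/* Flag values as I understand them from include/uapi/linux/rtnetlink.h;
 * redefined here only so the sketch builds without kernel headers. */
#ifndef RTM_F_OFFLOAD
#define RTM_F_OFFLOAD 0x4000  /* route is offloaded */
#endif
#ifndef RTM_F_TRAP
#define RTM_F_TRAP 0x8000     /* route is trapping packets */
#endif

/* Hypothetical decoded view of the two indications. */
struct route_hw_state {
	bool offload;
	bool trap;
};

/* Decode the "offload" and "trap" indications from rtm_flags. */
struct route_hw_state route_hw_state_from_flags(unsigned int rtm_flags)
{
	struct route_hw_state st = {
		.offload = (rtm_flags & RTM_F_OFFLOAD) != 0,
		.trap = (rtm_flags & RTM_F_TRAP) != 0,
	};
	return st;
}
```

A route dump tool could use the result to append "offload" or "trap" to each printed route.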
  8. 02 Oct, 2019 1 commit
  9. 29 May, 2019 1 commit
    • net: nexthop uapi · 65ee00a9
      Committed by David Ahern
      New UAPI for nexthops as standalone objects:
      - defines netlink ancillary header, struct nhmsg
      - RTM commands for nexthop objects, RTM_*NEXTHOP,
      - RTNLGRP for nexthop notifications, RTNLGRP_NEXTHOP,
      - Attributes for creating nexthops, NHA_*
      - Attribute for route specs to specify a nexthop by id, RTA_NH_ID.
      
      The nexthop attributes and semantics follow the route and RTA ones for
      device, gateway and lwt encap. Unique to nexthop objects are a blackhole
      and a group which contains references to other nexthop objects. With the
      exception of blackhole and group, nexthop objects MUST contain a device.
      Gateway and encap are optional. Nexthop groups can only reference other
      pre-existing nexthops by id. If the NHA_ID attribute is present that id
      is used for the nexthop. If not specified, one is auto assigned.
      
      Dump requests can include attributes:
      - NHA_GROUPS to return only nexthop groups,
      - NHA_MASTER to limit dumps to nexthops with devices enslaved to the
        given master (e.g., VRF)
      - NHA_OIF to limit dumps to nexthops using given device
      
      nlmsg_route_perms in selinux code is updated for the new RTM commands.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65ee00a9
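The device rule stated above (every nexthop except a blackhole or a group MUST contain a device, while gateway and encap are optional) can be expressed as a small pre-flight check. `struct nh_request` and `nh_request_valid()` are hypothetical and model only that one rule, not the kernel's full validation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space description of a nexthop to be created. */
struct nh_request {
	bool blackhole;    /* blackhole nexthop */
	bool is_group;     /* group referencing other nexthops by id */
	uint32_t oif;      /* device ifindex, 0 when absent */
	bool has_gateway;  /* optional */
	bool has_encap;    /* optional */
};

/* Check only the device rule from the commit message. */
bool nh_request_valid(const struct nh_request *req)
{
	if (req->blackhole || req->is_group)
		return true;      /* the two exceptions need no device */
	return req->oif != 0; /* everything else must name a device */
}
```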
  10. 24 Jul, 2018 1 commit
  11. 01 Jun, 2018 1 commit
  12. 24 May, 2018 1 commit
  13. 18 Jan, 2018 2 commits
  14. 16 Dec, 2017 1 commit
  15. 02 Nov, 2017 1 commit
    • License cleanup: add SPDX license identifier to uapi header files with no license · 6f52b16c
      Committed by Greg Kroah-Hartman
      Many user space API headers are missing licensing information, which
      makes it hard for compliance tools to determine the correct license.
      
      By default, files without license information are under the default
      license of the kernel, which is GPLV2.  Marking them GPLV2 would exclude
      them from being included in non GPLV2 code, which is obviously not
      intended. The user space API headers fall under the syscall exception
      which is in the kernel's COPYING file:
      
         NOTE! This copyright does *not* cover user programs that use kernel
         services by normal system calls - this is merely considered normal use
         of the kernel, and does *not* fall under the heading of "derived work".
      
      otherwise syscall usage would not be possible.
      
      Update the files which contain no license information with an SPDX
      license identifier.  The chosen identifier is 'GPL-2.0 WITH
      Linux-syscall-note' which is the officially assigned identifier for the
      Linux syscall exception.  SPDX license identifiers are a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.  See the previous patch in this series for the
      methodology of how this patch was researched.
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f52b16c
  16. 24 Oct, 2017 1 commit
    • tcp: Configure TFO without cookie per socket and/or per route · 71c02379
      Committed by Christoph Paasch
      We already allow enabling TFO without a cookie by using the
      fastopen-sysctl and setting it to TFO_SERVER_COOKIE_NOT_REQD (or
      TFO_CLIENT_NO_COOKIE).
      This is safe to do in certain environments where we know that there
      isn't a malicious host (e.g., data centers) or when the
      application-protocol already provides an authentication mechanism in the
      first flight of data.
      
      A server however might be providing multiple services or talking to both
      sides (public Internet and data-center). So, this server would want to
      enable cookie-less TFO for certain services and/or for connections that
      go to the data-center.
      
      This patch exposes a socket-option and a per-route attribute to enable such
      fine-grained configurations.
      Signed-off-by: Christoph Paasch <cpaasch@apple.com>
      Reviewed-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      71c02379
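The "per socket and/or per route" wording suggests that any one of the three knobs (the existing sysctl, the new socket option, the new route attribute) is enough to skip the cookie. The sketch below encodes that reading; it is an assumption drawn from the commit message, and `tfo_cookie_not_required()` with its boolean parameters is a hypothetical helper, not the kernel function.

```c
#include <stdbool.h>

/* Cookie-less TFO is allowed when it is enabled globally via the sysctl,
 * on this socket, or on the route the connection uses. */
bool tfo_cookie_not_required(bool sysctl_no_cookie,
			     bool sock_no_cookie,
			     bool route_no_cookie)
{
	return sysctl_no_cookie || sock_no_cookie || route_no_cookie;
}
```

Under this reading, a server facing both the public Internet and a data center could leave the sysctl off and set the route attribute only on data-center routes.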
  17. 31 Jul, 2017 2 commits
    • net sched actions: add time filter for action dumping · e62e484d
      Committed by Jamal Hadi Salim
      This patch adds support for filtering based on time since last used.
      When we are dumping a large number of actions it is useful to
      have the option of filtering based on when the action was last
      used to reduce the amount of data crossing to user space.
      
      With this patch the user space app sets the TCA_ROOT_TIME_DELTA
      attribute with the value in milliseconds with "time of interest
      since now".  The kernel converts this to jiffies and does the
      filtering comparison matching entries that have seen activity
      since then and returns them to user space.
      Old kernels and old tc continue to work in legacy mode since
      they don't specify this attribute.
      
      An example (we have 400 actions bound to 400 filters) at
      installation time, using an updated tc and setting the time of
      interest to 120 seconds earlier (we see 400 actions):
      prompt$ hackedtc actions ls action gact since 120000| grep index | wc -l
      400
      
      go get some coffee and wait for > 120 seconds and try again:
      
      prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l
      0
      
      Let's see a filter bound to one of these actions:
      ....
      filter pref 10 u32
      filter pref 10 u32 fh 800: ht divisor 1
      filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule hit 2 success 1)
        match 7f000002/ffffffff at 12 (success 1 )
          action order 1: gact action pass
           random type none pass val 0
           index 23 ref 2 bind 1 installed 1145 sec used 802 sec
          Action statistics:
          Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
          backlog 0b 0p requeues 0
      ....
      
      that coffee took long, no? It was good.
      
      Now let's ping -c 1 127.0.0.2, then run the actions again:
      prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l
      1
      
      More details please:
      prompt$ hackedtc -s actions ls action gact since 120000
      
          action order 0: gact action pass
           random type none pass val 0
           index 23 ref 2 bind 1 installed 1270 sec used 30 sec
          Action statistics:
          Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
          backlog 0b 0p requeues 0
      
      And the filter?
      
      filter pref 10 u32
      filter pref 10 u32 fh 800: ht divisor 1
      filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule hit 4 success 2)
        match 7f000002/ffffffff at 12 (success 2 )
          action order 1: gact action pass
           random type none pass val 0
           index 23 ref 2 bind 1 installed 1324 sec used 84 sec
          Action statistics:
          Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
          backlog 0b 0p requeues 0
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e62e484d
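The filtering described above (convert the millisecond delta to jiffies, then keep only actions used since then) can be sketched in plain C. The helper names and the explicit `hz` parameter are hypothetical, rounding up during conversion is an assumption, and 64-bit tick counters are used so wraparound can be ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert milliseconds to timer ticks, rounding up.
 * hz is the number of ticks per second. */
uint64_t ms_to_ticks(uint64_t ms, uint64_t hz)
{
	return (ms * hz + 999) / 1000;
}

/* True when the action was last used within the past delta_ticks ticks.
 * Assumes last_used <= now on a monotonic tick counter. */
bool used_within(uint64_t now, uint64_t last_used, uint64_t delta_ticks)
{
	return now - last_used <= delta_ticks;
}
```

With `hz` set to 250, a 120000 ms delta becomes 30000 ticks; an action last used 802 seconds ago falls outside that window, while one used 30 seconds ago falls inside it, matching the two dumps in the example.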
    • net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch · 90825b23
      Committed by Jamal Hadi Salim
      When you dump hundreds of thousands of actions, getting only 32 per
      dump batch even when the socket buffer and memory allocations allow
      is inefficient.
      
      With this change, the user will get as many actions as fit
      within the given constraints available to the kernel.
      
      The top level action TLV space is extended. An attribute
      TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
      is set by the user indicating the user is capable of processing
      these large dumps. Older user space which doesn't set this flag
      doesn't get the larger (than 32) batches.
      The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
      actions are put in a single batch. As such user space app knows how long
      to iterate (independent of the type of action being dumped)
      instead of hardcoded maximum of 32 thus maintaining backward compat.
      
      Some results dumping 1.5M actions below:
      first an unpatched tc which doesn't understand these features...
      
      prompt$ time -p tc actions ls action gact | grep index | wc -l
      1500000
      real 1388.43
      user 2.07
      sys 1386.79
      
      Now let's see a patched tc which sets the correct flags when requesting
      a dump:
      
      prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
      1500000
      real 178.13
      user 2.02
      sys 176.96
      
      That is about an 8x performance improvement for a tc app which sets its
      receive buffer to about 32K.
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      90825b23
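For reference, the quoted wall-clock timings work out as follows. The batch count is only an estimate that assumes the legacy path returns exactly 32 actions in every batch.

```latex
\[
  \text{speedup} = \frac{1388.43\ \text{s}}{178.13\ \text{s}} \approx 7.79,
  \qquad
  \text{legacy batches} = \frac{1\,500\,000}{32} = 46\,875
\]
```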
  18. 21 Jun, 2017 2 commits
  19. 27 May, 2017 1 commit
  20. 18 May, 2017 1 commit
  21. 29 Mar, 2017 1 commit
  22. 14 Mar, 2017 1 commit
    • mpls: allow TTL propagation to IP packets to be configured · 5b441ac8
      Committed by Robert Shearman
      Provide the ability to control on a per-route basis whether the TTL
      value from an MPLS packet is propagated to an IPv4/IPv6 packet when
      the last label is popped as per the theoretical model in RFC 3443
      through a new route attribute, RTA_TTL_PROPAGATE which can be 0 to
      mean disable propagation and 1 to mean enable propagation.
      
      In order to provide the ability to change the behaviour for packets
      arriving with IPv4/IPv6 Explicit Null labels and to provide an easy
      way for a user to change the behaviour for all existing routes without
      having to reprogram them, a global knob is provided. This is done
      through the addition of a new per-namespace sysctl,
      "net.mpls.ip_ttl_propagate", which defaults to enabled. If the
      per-route attribute is set (either enabled or disabled) then it
      overrides the global configuration.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Acked-by: David Ahern <dsa@cumulusnetworks.com>
      Tested-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5b441ac8
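The precedence between the per-route attribute and the global sysctl can be sketched as a tri-state lookup: an explicit per-route setting wins, otherwise the global knob applies. The enum and function names are hypothetical and chosen for this example.

```c
#include <stdbool.h>

/* Hypothetical per-route setting: unset, or explicitly 0 / 1 via
 * RTA_TTL_PROPAGATE. */
enum ttl_prop {
	TTL_PROP_DEFAULT,   /* attribute not given on the route */
	TTL_PROP_DISABLED,  /* RTA_TTL_PROPAGATE = 0 */
	TTL_PROP_ENABLED,   /* RTA_TTL_PROPAGATE = 1 */
};

/* Resolve the effective behaviour for one route. global_sysctl stands for
 * the value of net.mpls.ip_ttl_propagate in the route's namespace. */
bool ttl_propagate_effective(enum ttl_prop route_setting, bool global_sysctl)
{
	switch (route_setting) {
	case TTL_PROP_ENABLED:
		return true;
	case TTL_PROP_DISABLED:
		return false;
	default:
		return global_sysctl;
	}
}
```

Changing the sysctl therefore affects every route that left the attribute unset, without reprogramming those routes.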
  23. 13 Mar, 2017 1 commit
  24. 21 Feb, 2017 1 commit
  25. 03 Jan, 2017 1 commit
  26. 05 Nov, 2016 1 commit
  27. 19 Oct, 2016 1 commit
  28. 27 Apr, 2016 1 commit
  29. 22 Apr, 2016 1 commit
  30. 21 Apr, 2016 1 commit
    • rtnetlink: add new RTM_GETSTATS message to dump link stats · 10c9ead9
      Committed by Roopa Prabhu
      This patch adds a new RTM_GETSTATS message to query link stats via netlink
      from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK
      returns a lot more than just stats and is expensive in some cases when
      frequent polling for stats from userspace is a common operation.
      
      RTM_GETSTATS is an attempt to provide a light weight netlink message
      to explicitly query only link stats from the kernel on an interface.
      The idea is to also keep it extensible so that new kinds of stats can be
      added to it in the future.
      
      This patch adds the following attribute for NETDEV stats:
      struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
              [IFLA_STATS_LINK_64]  = { .len = sizeof(struct rtnl_link_stats64) },
      };
      
      Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
      a single interface or all interfaces with NLM_F_DUMP.
      
      Future possible new types of stat attributes:
      link af stats:
          - IFLA_STATS_LINK_IPV6  (nested. for ipv6 stats)
          - IFLA_STATS_LINK_MPLS  (nested. for mpls/mdev stats)
      extended stats:
          - IFLA_STATS_LINK_EXTENDED (nested. extended software netdev stats like bridge,
            vlan, vxlan etc)
          - IFLA_STATS_LINK_HW_EXTENDED (nested. extended hardware stats which are
            available via ethtool today)
      
      This patch also declares a filter mask for all stat attributes.
      User has to provide a mask of stats attributes to query. filter mask
      can be specified in the new hdr 'struct if_stats_msg' for stats messages.
      The other important field in the header is the ifindex.
      
      This api can also include attributes for global stats (eg tcp) in the future.
      When global stats are included in a stats msg, the ifindex in the header
      must be zero. A single stats message cannot contain both global and
      netdev specific stats. To easily distinguish them, netdev specific stat
      attribute names are prefixed with IFLA_STATS_LINK_.
      
      Without any attributes in the filter_mask, no stats will be returned.
      
      This patch has been tested with modified iproute2 ifstat.
      Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      10c9ead9
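A sketch of the header a user-space tool might fill in for an RTM_GETSTATS request on one interface. The struct layout and the filter-bit macro mirror `struct if_stats_msg` and the uapi definitions as I understand them; they are redefined under hypothetical local names so the example builds without kernel headers.

```c
#include <stdint.h>

/* Local mirrors of the uapi definitions (names are hypothetical). */
#define STATS_LINK_64 1                              /* IFLA_STATS_LINK_64 */
#define STATS_FILTER_BIT(attr) (1u << ((attr) - 1))  /* one bit per attribute */

/* Mirrors the layout of struct if_stats_msg as I understand it. */
struct stats_hdr_sketch {
	uint8_t family;
	uint8_t pad1;
	uint16_t pad2;
	uint32_t ifindex;
	uint32_t filter_mask;
};

/* Build a header asking for the 64-bit link stats of one interface.
 * An empty filter_mask would return no stats at all. */
struct stats_hdr_sketch make_stats_request(uint32_t ifindex)
{
	struct stats_hdr_sketch hdr = {
		.family = 0, /* AF_UNSPEC */
		.ifindex = ifindex,
		.filter_mask = STATS_FILTER_BIT(STATS_LINK_64),
	};
	return hdr;
}
```

For a dump of all interfaces the same header would be sent with NLM_F_DUMP set on the netlink message.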
  31. 18 Dec, 2015 1 commit
  32. 13 Oct, 2015 1 commit
  33. 16 Sep, 2015 2 commits