1. 05 Oct, 2018 11 commits
    • C
      net_sched: convert idrinfo->lock from spinlock to a mutex · 95278dda
      Committed by Cong Wang
      In commit ec3ed293 ("net_sched: change tcf_del_walker() to take idrinfo->lock")
      we moved fl_hw_destroy_tmplt() to a workqueue to avoid blocking
      with the spinlock held. Unfortunately, this causes a lot of
      trouble here:
      
      1. tcf_chain_destroy() could be called right after we queue the work
         but before the work runs. This is a use-after-free.
      
      2. The chain refcnt is already 0, we can't even just hold it again.
         We can check refcnt==1 but it is ugly.
      
      3. The chain with refcnt 0 is still visible in its block, which means
         it could be still found and used!
      
      4. The block has a refcnt too, we can't hold it without introducing a
         proper API either.
      
      We can make it work, but the end result is ugly. Instead of wasting
      time on reviewing it, let's just convert the troublesome spinlock to
      a mutex, which allows us to use non-atomic allocations too.
      
      Fixes: ec3ed293 ("net_sched: change tcf_del_walker() to take idrinfo->lock")
      Reported-by: Ido Schimmel <idosch@idosch.org>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Vlad Buslov <vladbu@mellanox.com>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Tested-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      net: Move free of dst_metrics to helper · 1620a336
      Committed by David Ahern
      Move the refcounting and potential free of dst metrics for ipv4
      and ipv6 to a common helper.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      net: common metrics init helper for dst_entry · e1255ed4
      Committed by David Ahern
      ipv4 and ipv6 both use refcounted metrics if FIB entries have metrics set.
      Move the common initialization code to a helper and use for both protocols.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      net: Move free of fib_metrics to helper · cc5f0eb2
      Committed by David Ahern
      Move the refcounting and potential free of dst metrics associated
      with a fib entry to a helper and use it in both ipv4 and ipv6.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      net: common metrics init helper for FIB entries · 767a2217
      Committed by David Ahern
      Consolidate initialization of ipv4 and ipv6 metrics when fib entries
      are created into a single helper, ip_fib_metrics_init, that handles
      the call to ip_metrics_convert.
      
      If no metrics are defined for the fib entry, then the metrics are set
      to dst_default_metrics.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      net: sched: remove unused helpers · d26d4b19
      Committed by Jakub Kicinski
      tcf_block_dev() doesn't seem to be used anywhere in the tree.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • V
      tc: Add support for configuring the taprio scheduler · 5a781ccb
      Committed by Vinicius Costa Gomes
      This traffic scheduler allows the states of traffic classes
      (transmission allowed/not allowed, in the simplest case) to be
      scheduled according to a pre-generated time sequence. This is the
      basis of the IEEE 802.1Qbv specification.
      
      Example configuration:
      
      tc qdisc replace dev enp3s0 parent root handle 100 taprio \
                num_tc 3 \
                map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
                queues 1@0 1@1 2@2 \
                base-time 1528743495910289987 \
                sched-entry S 01 300000 \
                sched-entry S 02 300000 \
                sched-entry S 04 300000 \
                clockid CLOCK_TAI
      
      The configuration format is similar to mqprio. The main difference is
      the presence of a schedule, built from multiple "sched-entry"
      definitions; each entry has the following format:
      
           sched-entry <CMD> <GATE MASK> <INTERVAL>
      
      The only supported <CMD> is "S", which means "SetGateStates",
      following the IEEE 802.1Qbv-2015 definition (Table 8-6). <GATE MASK>
      is a bitmask where each bit is associated with a traffic class, so
      bit 0 (the least significant bit) being "on" means that traffic class
      0 is "active" for that schedule entry. <INTERVAL> is a time duration
      in nanoseconds that specifies how long the state defined by <CMD>
      and <GATE MASK> should be held before moving to the next entry.
      
      This schedule is circular, that is, after the last entry is executed
      it starts from the first one, indefinitely.
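The schedule described above can be illustrated with a minimal Python sketch. It is not the kernel implementation: the names `SchedEntry`, `parse_sched_entry`, `open_classes` and `entry_at` are hypothetical, and the gate mask is read as hexadecimal, which is an assumption the text above does not state.

```python
from typing import List, NamedTuple


class SchedEntry(NamedTuple):
    cmd: str          # only "S" (SetGateStates) is described above
    gate_mask: int    # bit N set means traffic class N is active
    interval_ns: int  # how long this state is held, in nanoseconds


def parse_sched_entry(text: str) -> SchedEntry:
    """Parse one 'sched-entry <CMD> <GATE MASK> <INTERVAL>' definition."""
    keyword, cmd, mask, interval = text.split()
    if keyword != "sched-entry":
        raise ValueError(f"expected 'sched-entry', got {keyword!r}")
    if cmd != "S":
        raise ValueError(f"unsupported command {cmd!r}")
    # Assumption: the mask is written in hexadecimal.
    return SchedEntry(cmd, int(mask, 16), int(interval))


def open_classes(entry: SchedEntry, num_tc: int) -> List[int]:
    """Return the traffic classes whose gate bit is set in this entry."""
    return [tc for tc in range(num_tc) if (entry.gate_mask >> tc) & 1]


def entry_at(schedule: List[SchedEntry], offset_ns: int) -> SchedEntry:
    """Return the entry in force offset_ns after the schedule started.

    The schedule is circular, so the offset is first reduced modulo the
    cycle time (the sum of all intervals).
    """
    cycle_ns = sum(e.interval_ns for e in schedule)
    offset = offset_ns % cycle_ns
    for entry in schedule:
        if offset < entry.interval_ns:
            return entry
        offset -= entry.interval_ns
    raise AssertionError("unreachable: offset is always inside the cycle")


# The three entries from the example configuration.
schedule = [
    parse_sched_entry("sched-entry S 01 300000"),
    parse_sched_entry("sched-entry S 02 300000"),
    parse_sched_entry("sched-entry S 04 300000"),
]
```

With the example configuration, each of the three traffic classes gets its own 300000 ns window, and the pattern repeats every 900000 ns.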
      
      The other parameters can be defined as follows:
      
       - base-time: specifies the instant when the schedule starts. If
         'base-time' is a time in the past, the schedule will start at

             base-time + (N * cycle-time)

         where N is the smallest integer such that the resulting time is
         greater than "now", and "cycle-time" is the sum of all the
         intervals of the entries in the schedule;
      
       - clockid: specifies the reference clock to be used;
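The base-time rule can be worked through in a few lines of Python. This is only a sketch of the arithmetic stated above, not the scheduler's code: `first_cycle_start` is a hypothetical name, all times are integers in nanoseconds on the same clock, and returning `base_time_ns` unchanged when it lies in the future is an assumption implied by, but not spelled out in, the description.

```python
def first_cycle_start(base_time_ns: int, cycle_time_ns: int, now_ns: int) -> int:
    """Return base-time + N * cycle-time for the smallest N that gives a
    result strictly greater than now."""
    if cycle_time_ns <= 0:
        raise ValueError("cycle time must be positive")
    if base_time_ns > now_ns:
        # Assumption: a base-time in the future is used as-is (N = 0).
        return base_time_ns
    # Number of whole cycles already elapsed, plus one to land after "now".
    n = (now_ns - base_time_ns) // cycle_time_ns + 1
    return base_time_ns + n * cycle_time_ns


# Example: the configuration above has a 900000 ns cycle (3 x 300000 ns).
base = 1528743495910289987
start = first_cycle_start(base, 900_000, now_ns=base + 2_000_000)
```

Here two full cycles have already elapsed at "now", so N is 3 and the schedule starts at `base + 2_700_000`.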
      
      The parameters should be similar to what the IEEE 802.1Q family of
      specifications defines.
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • V
      devlink: Add generic parameter msix_vec_per_pf_min · 16511789
      Committed by Vasundhara Volam
      msix_vec_per_pf_min - This param sets the minimum number of MSIX
      vectors required for device initialization. This value is set
      in the device, which limits MSIX vectors per PF.
      
      Cc: Jiri Pirko <jiri@mellanox.com>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • V
      devlink: Add generic parameter msix_vec_per_pf_max · f61cba42
      Committed by Vasundhara Volam
      msix_vec_per_pf_max - This param sets the number of MSIX vectors
      that the device requests from the host on driver initialization.
      This value is set in the device which is applicable per PF.
      
      Cc: Jiri Pirko <jiri@mellanox.com>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • V
      devlink: Add generic parameter ignore_ari · e3b51061
      Committed by Vasundhara Volam
      ignore_ari - Device ignores the ARI (Alternate Routing ID) capability,
      even when the platform supports it, and creates the same number of
      partitions as when the platform does not support the ARI capability.
      
      Cc: Jiri Pirko <jiri@mellanox.com>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      dns: Allow the dns resolver to retrieve a server set · bbb4c432
      Committed by David Howells
      Allow the DNS resolver to retrieve a set of servers and their associated
      addresses, ports, preference and weight ratings.
      
      In terms of communication with userspace, "srv=1" is added to the callout
      string (the '1' indicating the maximum data version supported by the
      kernel) to ask the userspace side for this.
      
      If the userspace side doesn't recognise it, it will ignore the option and
      return the usual text address list.
      
      If the userspace side does recognise it, it will return some binary data
      that begins with a zero byte that would cause the string parsers to give an
      error.  The second byte contains the version of the data in the blob (this
      may be between 1 and the version specified in the callout data).  The
      remainder of the payload is version-specific.
      
      In version 1, the payload looks like (note that this is packed):
      
      	u8	Non-string marker (ie. 0)
      	u8	Content (0 => Server list)
      	u8	Version (ie. 1)
      	u8	Source (eg. DNS_RECORD_FROM_DNS_SRV)
      	u8	Status (eg. DNS_LOOKUP_GOOD)
      	u8	Number of servers
      	foreach-server {
      		u16	Name length (LE)
      		u16	Priority (as per SRV record) (LE)
      		u16	Weight (as per SRV record) (LE)
      		u16	Port (LE)
      		u8	Source (eg. DNS_RECORD_FROM_NSS)
      		u8	Status (eg. DNS_LOOKUP_GOT_NOT_FOUND)
      		u8	Protocol (eg. DNS_SERVER_PROTOCOL_UDP)
      		u8	Number of addresses
      		char[]	Name (not NUL-terminated)
      		foreach-address {
      			u8		Family (AF_INET{,6})
      			union {
      				u8[4]	ipv4_addr
      				u8[16]	ipv6_addr
      			}
      		}
      	}
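The version 1 layout listed above can be decoded with a short Python sketch using the standard `struct` module. This is an illustration under stated assumptions, not the kernel or userspace helper code: `Server` and `parse_server_list_v1` are hypothetical names, the field order follows the listing (marker, content, version), and address families are taken to be the platform's `AF_INET`/`AF_INET6` values as the listing suggests.

```python
import socket
import struct
from typing import List, NamedTuple


class Server(NamedTuple):
    name: str
    priority: int
    weight: int
    port: int
    source: int
    status: int
    protocol: int
    addresses: List[str]


# Packed, little-endian layouts taken from the listing above.
HEADER = struct.Struct("<6B")    # marker, content, version, source, status, nr_servers
SERVER = struct.Struct("<4H4B")  # name_len, priority, weight, port,
                                 # source, status, protocol, nr_addrs
# Assumption: the family byte holds AF_INET or AF_INET6.
ADDR_LEN = {int(socket.AF_INET): 4, int(socket.AF_INET6): 16}


def parse_server_list_v1(blob: bytes) -> List[Server]:
    """Decode a version 1 server-list payload into Server tuples."""
    marker, content, version, _source, _status, nr_servers = HEADER.unpack_from(blob, 0)
    if marker != 0 or content != 0 or version != 1:
        raise ValueError("not a version 1 server list")
    pos = HEADER.size
    servers = []
    for _ in range(nr_servers):
        (name_len, priority, weight, port,
         source, status, protocol, nr_addrs) = SERVER.unpack_from(blob, pos)
        pos += SERVER.size
        # The name is not NUL-terminated; its length comes from the header.
        name = blob[pos:pos + name_len].decode("ascii")
        pos += name_len
        addresses = []
        for _ in range(nr_addrs):
            family = blob[pos]
            pos += 1
            if family not in ADDR_LEN:
                raise ValueError(f"unknown address family {family}")
            size = ADDR_LEN[family]
            addresses.append(socket.inet_ntop(family, blob[pos:pos + size]))
            pos += size
        servers.append(Server(name, priority, weight, port,
                              source, status, protocol, addresses))
    return servers
```

Because the leading byte is zero, a consumer can tell this binary form apart from the usual text address list before choosing a parser.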
      
      This can then be used to fetch a whole cell's VL-server configuration for
      AFS, for example.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 04 Oct, 2018 4 commits
  3. 03 Oct, 2018 9 commits
  4. 02 Oct, 2018 14 commits
  5. 30 Sep, 2018 2 commits