1. 05 Mar, 2020 6 commits
    • PCI: Add constant PCI_STATUS_ERROR_BITS · d6e055e8
      Heiner Kallweit committed
      This collection of PCI error bits is used in more than one driver,
      so move it to the PCI core.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
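The commit message does not list which bits make up the shared constant. A minimal userspace sketch follows, assuming the mask covers the six error bits of the PCI Status register. The individual bit values follow the PCI specification, but the membership of `PCI_STATUS_ERROR_BITS` here is an assumption and `pci_status_errors()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* PCI Status register error bits (values per the PCI specification). */
#define PCI_STATUS_PARITY           0x0100 /* master data parity error */
#define PCI_STATUS_SIG_TARGET_ABORT 0x0800 /* signalled target abort */
#define PCI_STATUS_REC_TARGET_ABORT 0x1000 /* received target abort */
#define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* received master abort */
#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* signalled system error */
#define PCI_STATUS_DETECTED_PARITY  0x8000 /* detected parity error */

/* Assumed membership of the shared mask; the commit message does not spell it out. */
#define PCI_STATUS_ERROR_BITS (PCI_STATUS_DETECTED_PARITY  | \
                               PCI_STATUS_SIG_SYSTEM_ERROR | \
                               PCI_STATUS_REC_MASTER_ABORT | \
                               PCI_STATUS_REC_TARGET_ABORT | \
                               PCI_STATUS_SIG_TARGET_ABORT | \
                               PCI_STATUS_PARITY)

static_assert(PCI_STATUS_ERROR_BITS == 0xF900, "six error bits expected");

/* Hypothetical helper: return only the error bits of a status word. */
static inline uint16_t pci_status_errors(uint16_t status)
{
    return (uint16_t)(status & PCI_STATUS_ERROR_BITS);
}
```

A driver that previously carried its own copy of such a mask would test `pci_status_errors(status) != 0` against the shared definition instead.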
    • net: dsa: felix: Allow unknown unicast traffic towards the CPU port module · 1cf3299b
      Vladimir Oltean committed
      Compared to other DSA switches, in the Ocelot cores, the RX filtering is
      a much more important concern.
      
      Firstly, the primary use case for Ocelot is non-DSA, so there isn't any
      secondary Ethernet MAC [the DSA master's one] to implicitly drop frames
      having a DMAC we are not interested in.  So the switch driver itself
      needs to install FDB entries towards the CPU port module (PGID_CPU) for
      the MAC address of each switch port, in each VLAN installed on the port.
      Every address that is not whitelisted is implicitly dropped. This is in
      order to achieve a behavior similar to N standalone net devices.
      
      Secondly, even in the secondary use case of DSA, such as illustrated by
      Felix with the NPI port mode, that secondary Ethernet MAC is present,
      but its RX filter is bypassed. This is because the DSA tags themselves
      are placed before Ethernet, so the DMAC that the switch ports see is
      not seen by the DSA master too (since it's shifted to the right).
      
      So RX filtering is pretty important. A good RX filter won't bother the
      CPU in case the switch port receives a frame that it's not interested
      in, and there exists no other line of defense.
      
      Ocelot is pretty strict when it comes to RX filtering: non-IP multicast
      and broadcast traffic is allowed to go to the CPU port module, but
      unknown unicast isn't. This means that traffic reception for any other
      MAC addresses than the ones configured on each switch port net device
      won't work. This includes use cases such as macvlan or bridging with a
      non-Ocelot (so-called "foreign") interface. But this seems to be fine
      for the scenarios that the Linux system embedded inside an Ocelot switch
      is intended for - it is simply not interested in unknown unicast
      traffic, as explained in Allan Nielsen's presentation [0].
      
      On the other hand, the Felix DSA switch is integrated in more
      general-purpose Linux systems, so it can't afford to drop that sort of
      traffic in hardware, even if it will end up doing so later, in software.
      
      Actually, unknown unicast means more for Felix than it does for Ocelot.
      Felix doesn't attempt to perform the whitelisting of switch port MAC
      addresses towards PGID_CPU at all, mainly because it is too complicated
      to be feasible: while the MAC addresses are unique in Ocelot, by default
      in DSA the MAC addresses of all ports are equal and inherited from the
      DSA master. This raises the question of reference counting MAC addresses (delayed
      ocelot_mact_forget), not to mention reference counting for the VLAN IDs
      that those MAC addresses are installed in. This reference counting
      should be done in the DSA core, and the fact that it wasn't needed so
      far is due to the fact that the other DSA switches don't have the DSA
      tag placed before Ethernet, so the DSA master is able to whitelist the
      MAC addresses in hardware.
      
      So this means that even regular traffic termination on a Felix switch
      port happens through flooding (because neither Felix nor Ocelot learn
      source MAC addresses from CPU-injected frames).
      
      So far we've explained that whitelisting towards PGID_CPU:
      - helps to reduce the likelihood of spamming the CPU with frames it
        won't process very far anyway
      - is implemented in the ocelot driver
      - is sufficient for the ocelot use cases
      - is not feasible in DSA
      - breaks use cases in DSA, in the current state (whitelisting enabled
        but no MAC address whitelisted)
      
      So the proposed patch allows unknown unicast frames to be sent to the
      CPU port module. This is done for the Felix DSA driver only, as Ocelot
      seems to be happy without it.
      
      [0]: https://www.youtube.com/watch?v=B1HhxEcU7Jg
      Suggested-by: Allan W. Nielsen <allan.nielsen@microchip.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Allan W. Nielsen <allan.nielsen@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
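The filtering behavior described in this commit can be modeled roughly as follows. This is an illustrative software sketch, not the switch's hardware logic: `struct fdb_entry`, `struct rx_filter`, and `rx_to_cpu()` are hypothetical names, and all multicast and broadcast frames are accepted for simplicity (the message itself distinguishes non-IP multicast).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one whitelisted (MAC, VLAN) pair towards the CPU. */
struct fdb_entry {
    uint8_t  mac[6];
    uint16_t vid;
};

/* Hypothetical model of the RX filter in front of the CPU port module. */
struct rx_filter {
    const struct fdb_entry *whitelist;
    size_t n;                    /* number of whitelist entries */
    bool flood_unknown_unicast;  /* false: Ocelot-like, true: Felix after this patch */
};

static bool is_multicast(const uint8_t mac[6])
{
    return (mac[0] & 0x01) != 0; /* group bit; also set for broadcast */
}

/* Decide whether a frame with this destination MAC and VLAN reaches the CPU. */
static bool rx_to_cpu(const struct rx_filter *f, const uint8_t dmac[6], uint16_t vid)
{
    assert(f != NULL);

    if (is_multicast(dmac))
        return true; /* simplification: treat all multicast/broadcast as allowed */

    for (size_t i = 0; i < f->n; i++) {
        if (f->whitelist[i].vid == vid &&
            memcmp(f->whitelist[i].mac, dmac, 6) == 0)
            return true; /* known unicast: explicitly whitelisted */
    }

    /* Unknown unicast: dropped unless flooding towards the CPU is enabled. */
    return f->flood_unknown_unicast;
}
```

With an empty whitelist and `flood_unknown_unicast` set to `false`, no unicast frame reaches the CPU, which corresponds to the broken DSA case listed in the message. Setting the flag to `true` models the change this patch makes for Felix.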
    • net: mscc: ocelot: eliminate confusion between CPU and NPI port · 69df578c
      Vladimir Oltean committed
      Ocelot has the concept of a CPU port. The CPU port is represented in the
      forwarding and the queueing system, but it is not a physical device. The
      CPU port can either be accessed via register-based injection/extraction
      (which is the case of Ocelot), via Frame-DMA (similar to the first one),
      or "connected" to a physical Ethernet port (called NPI in the datasheet)
      which is the case of the Felix DSA switch.
      
      In Ocelot the CPU port is at index 11.
      In Felix the CPU port is at index 6.
      
      The CPU bit is treated specially in the forwarding, as it is never cleared
      from the forwarding port mask (once added to it). Other than that, it is
      treated the same as a normal front port.
      
      Both Felix and Ocelot should use the CPU port in the same way. This
      means that Felix should not use the NPI port directly when forwarding to
      the CPU, but instead use the CPU port.
      
      This patch fixes this so that Felix will use port 6 as its CPU
      port, and just use the NPI port to carry the traffic.
      
      Therefore, eliminate the "ocelot->cpu" variable which was holding the
      index of the NPI port for Felix, and the index of the CPU port module
      for Ocelot, so the variable was actually configuring different things
      for different drivers and causing at least part of the confusion.
      
      Also remove the "ocelot->num_cpu_ports" variable, which is the result of
      another confusion. The 2 CPU ports mentioned in the datasheet are
      because there are two frame extraction channels (register based or DMA
      based). This is of no relevance to the driver at the moment, and
      invisible to the analyzer module.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Suggested-by: Allan W. Nielsen <allan.nielsen@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
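The statement that the CPU bit is never cleared from the forwarding port mask once added can be pictured with a small model. This is only a sketch of that one sentence, not the hardware's actual forwarding pipeline; `fwd_mask_restrict()` and its parameters are hypothetical, while the port indexes 11 (Ocelot) and 6 (Felix) come from the message.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

#define OCELOT_CPU_PORT 11 /* CPU port index in Ocelot, per the commit message */
#define FELIX_CPU_PORT   6 /* CPU port index in Felix, per the commit message */

/*
 * Hypothetical model: narrow a forwarding port mask to the ports in
 * 'allowed', but keep the CPU bit if it was already present in 'mask'.
 */
static uint32_t fwd_mask_restrict(uint32_t mask, uint32_t allowed,
                                  unsigned int cpu_port)
{
    uint32_t cpu_bit;

    assert(cpu_port < 32);
    cpu_bit = mask & BIT(cpu_port);    /* set only if the CPU was already added */
    return (mask & allowed) | cpu_bit; /* front ports are filtered, CPU bit survives */
}
```

In this model, front ports behave identically on both chips and only the index of the sticky bit differs, which matches the patch's goal of having Felix and Ocelot use the CPU port in the same way.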
    • pie: realign comment · 5c5840e4
      Leslie Monis committed
      Realign a comment after the change introduced by the
      previous patch.
      Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pie: remove pie_vars->accu_prob_overflows · 105e808c
      Leslie Monis committed
      The variable pie_vars->accu_prob is used as an accumulator for
      probability values. Since probability values are scaled using the
      MAX_PROB macro denoting (2^64 - 1), pie_vars->accu_prob is
      likely to overflow as it is of type u64.
      
      The variable pie_vars->accu_prob_overflows counts the number of
      times the variable pie_vars->accu_prob overflows.
      
      The MAX_PROB macro needs to be equal to at least (2^39 - 1) in
      order to do precise calculations without any underflow. Thus
      MAX_PROB can be reduced to (2^56 - 1) without affecting the
      precision in calculations drastically. Doing so will eliminate
      the need for the variable pie_vars->accu_prob_overflows as the
      variable pie_vars->accu_prob will never overflow.
      
      Removing the variable pie_vars->accu_prob_overflows also reduces
      the size of the structure pie_vars to exactly 64 bytes.
      Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
      Signed-off-by: Gautam Ramakrishnan <gautamramk@gmail.com>
      Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
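The headroom argument can be checked with plain 64-bit arithmetic. The sketch below is not the PIE code: `accumulate_wraps()` is a hypothetical helper that adds a probability value repeatedly and reports whether a `uint64_t` accumulator wraps. It shows only that a 2^56 - 1 scale leaves room for 256 maximal additions, so the "never overflow" claim holds provided the accumulator is reset before that point, which this sketch assumes rather than verifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PROB_OLD UINT64_MAX                /* 2^64 - 1, the previous scale */
#define MAX_PROB_NEW ((UINT64_C(1) << 56) - 1) /* 2^56 - 1, the reduced scale */

/* Add 'prob' to a u64 accumulator 'n' times; return true if it ever wraps. */
static bool accumulate_wraps(uint64_t prob, unsigned int n)
{
    uint64_t accu = 0;

    assert(n > 0);
    for (unsigned int i = 0; i < n; i++) {
        uint64_t next = accu + prob; /* unsigned addition wraps modulo 2^64 */

        if (next < accu)
            return true;             /* wrapped past UINT64_MAX */
        accu = next;
    }
    return false;
}
```

With the old scale, two additions of the maximum value already wrap. With the new scale, 256 additions sum to 2^64 - 256, which still fits, and only the 257th would wrap.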
    • pie: use term backlog instead of qlen · 90baeb9d
      Leslie Monis committed
      Remove ambiguity by using the term backlog instead of qlen when
      representing the queue length in bytes.
      Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
      Signed-off-by: Gautam Ramakrishnan <gautamramk@gmail.com>
      Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 04 Mar, 2020 8 commits
  3. 03 Mar, 2020 6 commits
  4. 01 Mar, 2020 13 commits
  5. 29 Feb, 2020 6 commits
  6. 28 Feb, 2020 1 commit
    • bpf: inet_diag: Dump bpf_sk_storages in inet_diag_dump() · 085c20ca
      Martin KaFai Lau committed
      This patch will dump out the bpf_sk_storages of a sk
      if the request has the INET_DIAG_REQ_SK_BPF_STORAGES nlattr.
      
      An array of SK_DIAG_BPF_STORAGE_REQ_MAP_FD can be specified in
      INET_DIAG_REQ_SK_BPF_STORAGES to select which bpf_sk_storage to dump.
      If no map_fd is specified, all bpf_sk_storages of a sk will be dumped.
      
      bpf_sk_storages can be added to the system at runtime.  It is difficult
      to find a proper static value for cb->min_dump_alloc.
      
      This patch learns the nlattr size required to dump the bpf_sk_storages
      of a sk.  If it happens to be the very first nlmsg of a dump and it
      cannot fit the needed bpf_sk_storages,  it will try to expand the
      skb by "pskb_expand_head()".
      
      Instead of expanding it in inet_sk_diag_fill(), it is expanded at a
      sleepable context in __inet_diag_dump() so __GFP_DIRECT_RECLAIM can
      be used.  In __inet_diag_dump(), it will retry as long as the
      skb is empty and the cb->min_dump_alloc becomes larger than before.
      cb->min_dump_alloc is bounded by KMALLOC_MAX_SIZE.  The min_dump_alloc
      is also changed from 'u16' to 'u32' to accommodate a sk that may have
      a few large bpf_sk_storages.
      
      The updated cb->min_dump_alloc will also be used to allocate the skb in
      the next dump.  This logic already exists in netlink_dump().
      
      Here is the sample output of a locally modified 'ss' and it could be made
      more readable by using BTF later:
      [root@arch-fb-vm1 ~]# ss --bpf-map-id 14 --bpf-map-id 13 -t6an 'dst [::1]:8989'
      State Recv-Q Send-Q Local Address:Port  Peer Address:Port Process
      ESTAB 0      0              [::1]:51072        [::1]:8989
      	 bpf_map_id:14 value:[ 3feb ]
      	 bpf_map_id:13 value:[ 3f ]
      ESTAB 0      0              [::1]:51070        [::1]:8989
      	 bpf_map_id:14 value:[ 3feb ]
      	 bpf_map_id:13 value:[ 3f ]
      
      [root@arch-fb-vm1 ~]# ~/devshare/github/iproute2/misc/ss --bpf-maps -t6an 'dst [::1]:8989'
      State         Recv-Q         Send-Q                   Local Address:Port                    Peer Address:Port         Process
      ESTAB         0              0                                [::1]:51072                          [::1]:8989
      	 bpf_map_id:14 value:[ 3feb ]
      	 bpf_map_id:13 value:[ 3f ]
      	 bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ]
      ESTAB         0              0                                [::1]:51070                          [::1]:8989
      	 bpf_map_id:14 value:[ 3feb ]
      	 bpf_map_id:13 value:[ 3f ]
      	 bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ]
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20200225230427.1976129-1-kafai@fb.com
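The retry rule described for __inet_diag_dump() (retry while the buffer is still empty and cb->min_dump_alloc has grown, bounded by a maximum) can be sketched in userspace as follows. This is a simplified model, not the kernel code: `struct dump_cb`, `try_fill()`, `dump_with_retry()`, and the `MAX_ALLOC` value are all hypothetical stand-ins, and growing `buf_len` merely models what pskb_expand_head() would do to the skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for KMALLOC_MAX_SIZE; the value is illustrative only. */
#define MAX_ALLOC (4u * 1024u * 1024u)

/* Hypothetical model of the dump callback state. */
struct dump_cb {
    uint32_t min_dump_alloc; /* u32, as in the patch, to allow large storages */
};

/*
 * One fill attempt. Returns the bytes written, or 0 if the record does not
 * fit, in which case the required size is learned (clamped to MAX_ALLOC).
 */
static size_t try_fill(struct dump_cb *cb, size_t buf_len, size_t needed)
{
    uint32_t want;

    if (needed <= buf_len)
        return needed;

    want = needed > MAX_ALLOC ? MAX_ALLOC : (uint32_t)needed;
    if (want > cb->min_dump_alloc)
        cb->min_dump_alloc = want;
    return 0;
}

/* Retry while nothing was written and the learned size keeps growing. */
static size_t dump_with_retry(struct dump_cb *cb, size_t buf_len, size_t needed)
{
    assert(cb != NULL);

    for (;;) {
        uint32_t prev = cb->min_dump_alloc;
        size_t written = try_fill(cb, buf_len, needed);

        if (written > 0 || cb->min_dump_alloc <= prev)
            return written;           /* success, or no progress is possible */
        buf_len = cb->min_dump_alloc; /* models expanding the skb */
    }
}
```

The loop terminates because each retry requires `min_dump_alloc` to grow strictly and the value is capped at `MAX_ALLOC`. The learned size stays in `cb`, mirroring the message's note that the updated value is reused to size the next dump.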