1. 11 Nov, 2017 11 commits
    • D
      net: netlink: Update attr validation to require exact length for some types · 28033ae4
      David Ahern committed
      Attributes using NLA_U* and NLA_S* (where * is 8, 16, 32 and 64) are
      expected to be an exact length. Split these data types from
      nla_attr_minlen into nla_attr_len and update validate_nla to require
      the attribute to have exact length for them.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      28033ae4
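The exact-length rule described in this commit can be sketched as a small user-space model. This is only an illustration in Python, not the kernel's C code: `validate_attr`, the `EXACT_LEN` table and the entries in `MIN_LEN` are assumptions made for the example.

```python
# Illustrative model only; the real check is in the kernel's validate_nla() (C).
# Payload sizes in bytes for the fixed-width integer attribute types.
EXACT_LEN = {
    "NLA_U8": 1, "NLA_U16": 2, "NLA_U32": 4, "NLA_U64": 8,
    "NLA_S8": 1, "NLA_S16": 2, "NLA_S32": 4, "NLA_S64": 8,
}

# Assumed example entries for types that only need a minimum payload length.
MIN_LEN = {"NLA_MSECS": 8, "NLA_NESTED": 0}


def validate_attr(attr_type, payload):
    """Return True if the payload length is acceptable for attr_type."""
    if attr_type in EXACT_LEN:
        # Fixed-width integers must match their size exactly.
        return len(payload) == EXACT_LEN[attr_type]
    # Every other type only needs to reach its minimum length.
    return len(payload) >= MIN_LEN.get(attr_type, 0)
```

With this split, an 8-byte payload passed as `NLA_U32` is rejected instead of being silently accepted as "long enough".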
    • M
      net: ipv6: sysctl to specify IPv6 ND traffic class · 2210d6b2
      Maciej Żenczykowski committed
      Add a per-device sysctl to specify the default traffic class to use for
      kernel originated IPv6 Neighbour Discovery packets.
      
      Currently this includes:
      
        - Router Solicitation (ICMPv6 type 133)
          ndisc_send_rs() -> ndisc_send_skb() -> ip6_nd_hdr()
      
        - Neighbour Solicitation (ICMPv6 type 135)
          ndisc_send_ns() -> ndisc_send_skb() -> ip6_nd_hdr()
      
        - Neighbour Advertisement (ICMPv6 type 136)
          ndisc_send_na() -> ndisc_send_skb() -> ip6_nd_hdr()
      
        - Redirect (ICMPv6 type 137)
          ndisc_send_redirect() -> ndisc_send_skb() -> ip6_nd_hdr()
      
      and if the kernel ever gets around to generating RA's,
      it would presumably also include:
      
        - Router Advertisement (ICMPv6 type 134)
          (radvd daemon could pick up on the kernel setting and use it)
      
      Interface drivers may examine the Traffic Class value and translate
      the DiffServ Code Point into a link-layer appropriate traffic
      prioritization scheme.  An example of mapping IETF DSCP values to
      IEEE 802.11 User Priority values can be found here:
      
          https://tools.ietf.org/html/draft-ietf-tsvwg-ieee-802-11
      
      The expected primary use case is to properly prioritize ND over wifi.
      
      Testing:
        jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        0
        jzem22:~# echo -1 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        -bash: echo: write error: Invalid argument
        jzem22:~# echo 256 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        -bash: echo: write error: Invalid argument
        jzem22:~# echo 0 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        jzem22:~# echo 255 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        255
        jzem22:~# echo 34 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        34
      
        jzem22:~# echo $[0xDC] > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
        jzem22:~# tcpdump -v -i eth0 icmp6 and src host jzem22.pgc and dst host fe80::1
        tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
        IP6 (class 0xdc, hlim 255, next-header ICMPv6 (58) payload length: 24)
        jzem22.pgc > fe80::1: [icmp6 sum ok] ICMP6, neighbor advertisement,
        length 24, tgt is jzem22.pgc, Flags [solicited]
      
      (based on original change written by Erik Kline, with minor changes)
      
      v2: fix 'suspicious rcu_dereference_check() usage'
          by explicitly grabbing the rcu_read_lock.
      
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: Erik Kline <ek@google.com>
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2210d6b2
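The sysctl accepts values 0 to 255 because the IPv6 Traffic Class is an 8-bit field: the upper six bits carry the DiffServ Code Point and the lower two bits carry ECN. A minimal sketch of that split follows; the helper name `split_traffic_class` is made up for this example.

```python
def split_traffic_class(tclass):
    """Split an 8-bit IPv6 Traffic Class value into (DSCP, ECN)."""
    if not 0 <= tclass <= 255:
        # Mirrors the sysctl rejecting -1 and 256 in the test log above.
        raise ValueError("traffic class must be in 0..255")
    dscp = tclass >> 2   # upper 6 bits
    ecn = tclass & 0x3   # lower 2 bits
    return dscp, ecn
```

For the value `0xDC` used in the tcpdump test, this gives DSCP 55 with ECN 0.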
    • S
      net/ncsi: Don't return error on normal response · 04bad8bd
      Samuel Mendoza-Jonas committed
      Several response handlers return EBUSY if the data corresponding to the
      command/response pair is already set. There is no reason to return an
      error here; the channel is advertising something as enabled because we
      told it to enable it, and it's possible that the feature has been
      enabled previously.
      Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04bad8bd
    • S
      net/ncsi: Improve general state logging · 9ef8690b
      Samuel Mendoza-Jonas committed
      The NCSI driver is mostly silent, which becomes a headache when trying to
      determine what has occurred on the NCSI connection. This adds additional
      logging in a few key areas such as state transitions and calling out
      certain errors more visibly.
      Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ef8690b
    • D
      Merge branch 'bpftool-show-filenames-of-pinned-objects' · a8a6f1e4
      David S. Miller committed
      Prashant Bhole says:
      
      ====================
      tools: bpftool: show filenames of pinned objects
      
      This patchset adds support to show pinned objects in object details.
      
      Patch1 adds functionality to open a path in bpf-fs regardless of its object
      type.
      
      Patch2 adds the actual functionality by scanning the bpf-fs once and adding
      object information to a hash table, with the object id as the key. One object may be
      associated with multiple paths because an object can be pinned multiple times.
      
      Patch3 adds a command line option to enable this functionality. It is optional
      because scanning bpf-fs can be costly.
      ====================
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      a8a6f1e4
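The table built by Patch2 is essentially a multimap from object id to pinned paths. A rough Python model of that idea is shown here; bpftool itself is written in C, and `build_pinned_table` plus the `(path, id)` input shape are assumptions for illustration.

```python
from collections import defaultdict


def build_pinned_table(scan_results):
    """Group pinned paths by object id.

    scan_results is an iterable of (path, object_id) pairs, standing in for
    one walk over the bpf filesystem.
    """
    table = defaultdict(list)
    for path, obj_id in scan_results:
        # The same object may be pinned more than once, so keep every path.
        table[obj_id].append(path)
    return dict(table)
```

Because the scan happens once up front, listing each object afterwards is a single lookup by id, and an object with no entry simply has no pinned paths.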
    • P
      tools: bpftool: optionally show filenames of pinned objects · c541b734
      Prashant Bhole committed
      Make showing the file names of pinned objects optional, because
      it scans the complete bpf-fs filesystem, which is costly.
      Added option -f|--bpffs. Documentation updated.
      Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c541b734
    • P
      tools: bpftool: show filenames of pinned objects · 4990f1f4
      Prashant Bhole committed
      Added support to show filenames of pinned objects.
      
      For example:
      
      root@test# ./bpftool prog
      3: tracepoint  name tracepoint__irq  tag f677a7dd722299a3
          loaded_at Oct 26/11:39  uid 0
          xlated 160B  not jited  memlock 4096B  map_ids 4
          pinned /sys/fs/bpf/softirq_prog
      
      4: tracepoint  name tracepoint__irq  tag ea5dc530d00b92b6
          loaded_at Oct 26/11:39  uid 0
          xlated 392B  not jited  memlock 4096B  map_ids 4,6
      
      root@test# ./bpftool --json --pretty prog
      [{
              "id": 3,
              "type": "tracepoint",
              "name": "tracepoint__irq",
              "tag": "f677a7dd722299a3",
              "loaded_at": "Oct 26/11:39",
              "uid": 0,
              "bytes_xlated": 160,
              "jited": false,
              "bytes_memlock": 4096,
              "map_ids": [4
              ],
              "pinned": ["/sys/fs/bpf/softirq_prog"
              ]
          },{
              "id": 4,
              "type": "tracepoint",
              "name": "tracepoint__irq",
              "tag": "ea5dc530d00b92b6",
              "loaded_at": "Oct 26/11:39",
              "uid": 0,
              "bytes_xlated": 392,
              "jited": false,
              "bytes_memlock": 4096,
              "map_ids": [4,6
              ],
              "pinned": []
          }
      ]
      
      root@test# ./bpftool map
      4: hash  name start  flags 0x0
          key 4B  value 16B  max_entries 10240  memlock 1003520B
          pinned /sys/fs/bpf/softirq_map1
      5: hash  name iptr  flags 0x0
          key 4B  value 8B  max_entries 10240  memlock 921600B
      
      root@test# ./bpftool --json --pretty map
      [{
              "id": 4,
              "type": "hash",
              "name": "start",
              "flags": 0,
              "bytes_key": 4,
              "bytes_value": 16,
              "max_entries": 10240,
              "bytes_memlock": 1003520,
              "pinned": ["/sys/fs/bpf/softirq_map1"
              ]
          },{
              "id": 5,
              "type": "hash",
              "name": "iptr",
              "flags": 0,
              "bytes_key": 4,
              "bytes_value": 8,
              "max_entries": 10240,
              "bytes_memlock": 921600,
              "pinned": []
          }
      ]
      Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4990f1f4
    • P
      tools: bpftool: open pinned object without type check · 18527196
      Prashant Bhole committed
      This was needed for opening any file in bpf-fs without knowing
      its object type.
      Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      18527196
    • D
      Merge branch 'BPF-directed-error-injection' · 329fca60
      David S. Miller committed
      Josef Bacik says:
      
      ====================
      Add the ability to do BPF directed error injection
      
      I'm sending this through Dave since it'll conflict with other BPF changes in his
      tree, but since it touches tracing as well Dave would like a review from
      somebody on the tracing side.
      
      v4->v5:
      - disallow kprobe_override programs from being put in the prog map array so we
        don't tail call into something we didn't check.  This allows us to make the
        normal path still fast without a bunch of percpu operations.
      
      v3->v4:
      - fix a build error found by kbuild test bot (I didn't wait long enough
        apparently.)
      - Added a warning message as per Daniel's suggestion.
      
      v2->v3:
      - added a ->kprobe_override flag to bpf_prog.
      - added some sanity checks to disallow attaching bpf progs that have
        ->kprobe_override set that aren't for ftrace kprobes.
      - added the trace_kprobe_ftrace helper to check if the trace_event_call is a
        ftrace kprobe.
      - renamed bpf_kprobe_state to bpf_kprobe_override, fixed it so we only read this
        value in the kprobe path, and thus only write to it if we're overriding or
        clearing the override.
      
      v1->v2:
      - moved things around to make sure that bpf_override_return could really only be
        used for an ftrace kprobe.
      - killed the special return values from trace_call_bpf.
      - renamed pc_modified to bpf_kprobe_state so bpf_override_return could tell if
        it was being called from an ftrace kprobe context.
      - reworked the logic in kprobe_perf_func to take advantage of bpf_kprobe_state.
      - updated the test as per Alexei's review.
      
      - Original message -
      
      A lot of our error paths are not well tested because we have no good way of
      injecting errors generically.  Some subsystems (block, memory) have ways to
      inject errors, but they are random so it's hard to get reproducible results.
      
      With BPF we can add determinism to our error injection.  We can use kprobes and
      other things to verify we are injecting errors at the exact case we are trying
      to test.  This patch gives us the tool to actually do the error injection part.
      It is very simple, we just set the return value of the pt_regs we're given to
      whatever we provide, and then override the PC with a dummy function that simply
      returns.
      
      Right now this only works on x86, but it would be simple enough to expand to
      other architectures.  Thanks,
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      329fca60
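The mechanism described in this cover letter (force a chosen return value and skip the probed function body) can be mimicked in user space to show why it makes error injection deterministic. The sketch below is only an analogy in Python, not BPF or kernel code; `probed`, `override_return`, `clear_override` and `mount_fs` are all hypothetical names.

```python
import errno
from functools import wraps

# Armed overrides: function name -> value to return instead of running it.
_overrides = {}


def probed(func):
    """Wrap func so that an armed override can short-circuit it."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        if func.__name__ in _overrides:
            # Analogue of setting the return value and jumping to a stub that
            # simply returns: the original body never runs.
            return _overrides[func.__name__]
        return func(*args, **kwargs)
    return wrapper


def override_return(name, value):
    """Arm an override for the named function."""
    _overrides[name] = value


def clear_override(name):
    """Disarm the override so the function runs normally again."""
    _overrides.pop(name, None)


@probed
def mount_fs(device):
    """Hypothetical stand-in for a mount routine; returns 0 on success."""
    return 0
```

Arming `override_return("mount_fs", -errno.ENOMEM)` makes every call fail with the same error until the override is cleared, which is the reproducibility that random fault injection lacks.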
    • J
      samples/bpf: add a test for bpf_override_return · eafb3401
      Josef Bacik committed
      This adds a basic test for bpf_override_return to verify it works.  We
      override the main function for mounting a btrfs fs so it'll return
      -ENOMEM and then make sure that trying to mount a btrfs fs will fail.
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eafb3401
    • J
      bpf: add a bpf_override_function helper · dd0bb688
      Josef Bacik committed
      Error injection is sloppy and very ad-hoc.  BPF could fill this niche
      perfectly with its kprobe functionality.  We could make sure errors are
      only triggered in specific call chains that we care about with very
      specific situations.  Accomplish this with the bpf_override_function
      helper.  This will modify the probed caller's return value to the
      specified value and set the PC to an override function that simply
      returns, bypassing the originally probed function.  This gives us a nice
      clean way to implement systematic error injection for all of our code
      paths.
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd0bb688
  2. 10 Nov, 2017 26 commits
    • G
      net: fix incorrect comment with regard to VLAN packet handling · 54985120
      Girish Moodalbail committed
      The commit bcc6d479 ("net: vlan: make non-hw-accel rx path similar
      to hw-accel") unified accel and non-accel path for VLAN RX. With that
      fix we do not register any packet_type handler for VLANs anymore, so fix
      the incorrect comment.
      Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      54985120
    • D
      Merge branch 'act_vlan-rcu' · b79c069a
      David S. Miller committed
      Manish Kurup says:
      
      ====================
      net_sched actions: act_vlan now uses RCU
      
      This commit consists of 3 patches:
      
      patch1 (1/3):
      The VLAN action maintains one set of stats across all cores, and uses a
      spinlock to synchronize updates to it from the same. Changed this to use a
      per-CPU stats context instead.
      This change will result in better performance.
      
      patch2 (2/3):
      Modified netronome nfp flower action to use VLAN helper functions instead
      of accessing/referencing TC act_vlan private structures directly.
      
      patch3 (3/3):
      Using a spinlock in the VLAN action causes performance issues when the VLAN
      action is used on multiple cores. Rewrote the VLAN action to use RCU read
      locking for reads and updates instead.
      All functions now use an RCU dereferenced pointer to access the VLAN action
      context. Modified helper functions used by other modules, to use the RCU as
      opposed to directly accessing the structure.
      
      As part of this review, there were some changes suggested by reviewers.
      I have incorporated all the changes that were requested.
      
      Here're the changes:
      v2: Fixed all helper functions to use RCU (rtnl_dereference) - Eric, Jamal
      v2: Fixed indentation, extra line nits - Jamal, Jiri
      v2: Moved rcu_head to the end of the struct - Jiri
      v2: Re-formatted locals to reverse-christmas-tree - Jiri
      v2: Removed mismatched spin_lock() - Cong
      v2: Removed spin_lock_bh() in tcf_vlan_init, rtnl_dereference() should
          suffice - Cong, Jiri
      v4: Modified the nfp flower action code to use the VLAN helper functions
          instead of referencing the structure directly. Isolated this into a
          separate patch - Pieter Jansen
      v5: Got rid of the unlikely() for the allocation case - Simon Horman
      v6: Had forgotten cleanup functions for RCU alloc, added them - Dave Miller
      v7: Re-formatted more locals to reverse-christmas-tree - Pieter V
      v8: Reverted reverse-christmas-tree(v7), not required when dependencies
          make it difficult to implement - Alexander D
      v9: Cover letter subject change - Jamal
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b79c069a
    • M
      act_vlan: VLAN action rewrite to use RCU lock/unlock and update · 4c5b9d96
      Manish Kurup committed
      Using a spinlock in the VLAN action causes performance issues when the VLAN
      action is used on multiple cores. Rewrote the VLAN action to use RCU read
      locking for reads and updates instead.
      All functions now use an RCU dereferenced pointer to access the VLAN action
      context. Modified helper functions used by other modules, to use the RCU as
      opposed to directly accessing the structure.
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Manish Kurup <manish.kurup@verizon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c5b9d96
    • M
      nfp flower action: Modified to use VLAN helper functions · bf068bdd
      Manish Kurup committed
      Modified netronome nfp flower action to use VLAN helper functions instead
      of accessing/referencing TC act_vlan private structures directly.
      Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Signed-off-by: Manish Kurup <manish.kurup@verizon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bf068bdd
    • M
      act_vlan: Change stats update to use per-core stats · e0496cbb
      Manish Kurup committed
      The VLAN action maintains one set of stats across all cores, and uses a
      spinlock to synchronize updates to it from the same. Changed this to use a
      per-CPU stats context instead.
      This change will result in better performance.
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Manish Kurup <manish.kurup@verizon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e0496cbb
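The per-CPU stats idea can be illustrated with a toy model: each CPU updates only its own slot, and a reader sums the slots when totals are requested. This is a Python sketch under that assumption, not the kernel's per-CPU stats API; the `PerCpuStats` class is invented for the example.

```python
class PerCpuStats:
    """Toy model of per-CPU packet and byte counters."""

    def __init__(self, num_cpus):
        self.packets = [0] * num_cpus
        self.bytes = [0] * num_cpus

    def update(self, cpu, nbytes):
        # Each CPU touches only its own slot, so the hot path needs no
        # shared lock between cores.
        self.packets[cpu] += 1
        self.bytes[cpu] += nbytes

    def totals(self):
        # Readers pay the cost instead: they fold all slots together.
        return sum(self.packets), sum(self.bytes)
```

The trade-off is that updates become cheap and contention-free while reads have to walk every CPU's slot, which suits counters that are written per packet but read rarely.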
    • R
      sfc: don't warn on successful change of MAC · cbad52e9
      Robert Stonehouse committed
      Fixes: 535a6177 ("sfc: suppress handled MCDI failures when changing the MAC address")
      Signed-off-by: Bert Kenward <bkenward@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cbad52e9
    • C
      net: vxge: remove redundant assignments and pointers · e4effc09
      Colin Ian King committed
      There are several pointers that are being assigned but never read
      so remove these as they are redundant.  Also remove an assignment
      to function_mode that is never read. Cleans up several clang
      warnings:
      
      vxge-main.c:1139:2: warning: Value stored to 'hldev' is never read
      vxge-main.c:1294:2: warning: Value stored to 'hldev' is never read
      vxge-main.c:2188:2: warning: Value stored to 'dev' is never read
      vxge-main.c:2188:2: warning: Value stored to 'dev' is never read
      vxge-main.c:2723:2: warning: Value stored to 'function_mode' is
      never read
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e4effc09
    • D
      Merge branch 'ip_gre-flags-update' · be61a484
      David S. Miller committed
      Xin Long says:
      
      ====================
      ip_gre: add support for i/o_flags update
      
      ip_gre uses as many ip_tunnel APIs as possible. newlink works
      fine, as gre does its own part in .ndo_init. But when changing a
      link, ip_tunnel_changelink doesn't even update i/o_flags, and
      updating these flags requires some other gre properties to be
      updated or recalculated.
      
      These two patches add the i/o_flags update and then adjust
      some gre properties according to the new i/o_flags.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      be61a484
    • X
      ip_gre: add the support for i/o_flags update via ioctl · a0efab67
      Xin Long committed
      As patch 'ip_gre: add the support for i/o_flags update via netlink'
      did for netlink, we also need to do the same job for these updates
      via ioctl.
      
      This patch is to update i/o_flags and call ipgre_link_update to
      recalculate these gre properties after ip_tunnel_ioctl does the
      common update.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a0efab67
    • X
      ip_gre: add the support for i/o_flags update via netlink · dd9d598c
      Xin Long committed
      Now ip_gre is using ip_tunnel_changelink to update its properties, but
      ip_tunnel_changelink in ip_tunnel doesn't update i/o_flags as a common
      function.
      
      An o_flags update means that tunnel->tun_hlen / hlen and dev->mtu /
      needed_headroom need to be recalculated, and dev->(hw_)features need to
      be updated as well.
      
      Therefore, we can't just add the update into ip_tunnel_update called
      in ip_tunnel_changelink, and it's also better not to touch the ip_tunnel
      code.
      
      This patch updates i/o_flags and calls ipgre_link_update to recalculate
      these gre properties after ip_tunnel_changelink does the common update.
      
      Note that since ipgre_link_update doesn't know the lower dev, it will
      update gre->hlen, dev->mtu and dev->needed_headroom with the value of
      'new tun_hlen - old tun_hlen'. In this way, we can avoid much redundant
      code, unlike ip6_gre.
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd9d598c
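The delta-based recalculation described in this commit can be sketched as follows. This is a simplified Python model, not the kernel's ipgre_link_update(): the flag constants are hypothetical values chosen for the example, the dictionary stands in for the tunnel and device structures, and the assumption that the MTU shrinks as the header grows is labeled in the code.

```python
# Hypothetical flag values for this sketch; the kernel defines its own bits.
FLAG_CSUM = 0x1
FLAG_KEY = 0x2
FLAG_SEQ = 0x4

GRE_BASE_HLEN = 4    # mandatory GRE header, in bytes
GRE_OPTION_LEN = 4   # checksum, key and sequence each add 4 bytes


def gre_hlen(o_flags):
    """GRE header length implied by the output flags."""
    options = sum(1 for flag in (FLAG_CSUM, FLAG_KEY, FLAG_SEQ) if o_flags & flag)
    return GRE_BASE_HLEN + GRE_OPTION_LEN * options


def link_update(tunnel, new_o_flags):
    """Apply 'new tun_hlen - old tun_hlen' to the dependent fields."""
    delta = gre_hlen(new_o_flags) - tunnel["tun_hlen"]
    tunnel["o_flags"] = new_o_flags
    tunnel["tun_hlen"] += delta
    tunnel["hlen"] += delta
    tunnel["needed_headroom"] += delta
    # Assumption: a larger header leaves less room for payload, so the MTU
    # moves in the opposite direction.
    tunnel["mtu"] -= delta
    return delta
```

Working from the difference means the update never needs to know the lower device: whatever overhead was already folded into `hlen` and `mtu` stays untouched.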
    • D
      Merge branch 'tcp-ns-rmem-wmem' · c7947e43
      David S. Miller committed
      Eric Dumazet says:
      
      ====================
      net: Namespace-ify sysctl_tcp_rmem and sysctl_tcp_wmem
      
      We need to get per netns sysctl for sysctl_[proto]_rmem and sysctl_[proto]_wmem
      
      This patch series adds the basic infrastructure allowing per proto
      conversion, and takes care of TCP.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c7947e43
    • E
      tcp: Namespace-ify sysctl_tcp_rmem and sysctl_tcp_wmem · 356d1833
      Eric Dumazet committed
      Note that when a new netns is created, it inherits its
      sysctl_tcp_rmem and sysctl_tcp_wmem from initial netns.
      
      This change is needed so that we can refine TCP rcvbuf autotuning,
      to take RTT into consideration.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      356d1833
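The inheritance rule stated in this commit (a new netns starts from the initial netns values, then diverges independently) can be modeled in a few lines. This Python sketch is illustrative only: the `Netns` class is invented, and the default triples are placeholder numbers, not values taken from the commit.

```python
class Netns:
    """Toy network namespace holding its own TCP buffer sysctls."""

    def __init__(self, init_ns=None):
        if init_ns is None:
            # Placeholder (min, default, max) triples for the initial netns.
            self.tcp_rmem = [4096, 87380, 6291456]
            self.tcp_wmem = [4096, 16384, 4194304]
        else:
            # A new netns copies the current values of the initial netns, so
            # later changes on either side do not affect the other.
            self.tcp_rmem = list(init_ns.tcp_rmem)
            self.tcp_wmem = list(init_ns.tcp_wmem)
```

Copying at creation time, rather than sharing one array, is what lets a container tune its own buffer limits without changing the host's.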
    • E
      net: allow per netns sysctl_rmem and sysctl_wmem for protos · a3dcaf17
      Eric Dumazet committed
      As we want to gradually implement per netns sysctl_rmem and sysctl_wmem
      on per protocol basis, add two new fields in struct proto,
      and two new helpers : sk_get_wmem0() and sk_get_rmem0()
      
      First user will be TCP. Then UDP and SCTP can be easily converted,
      while DECNET probably won't get this support.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a3dcaf17
    • A
      net: dsa: Don't add vlans when vlan filtering is disabled · 2ea7a679
      Andrew Lunn committed
      The software bridge can be built with vlan filtering support
      included. However, by default it is turned off. In its turned off
      state, it still passes VLANs via switchdev, even though they are not to
      be used. Don't pass these VLANs to the hardware. Only do so when vlan
      filtering is enabled.
      
      This fixes at least one corner case. There are still issues in other
      corners, such as when vlan_filtering is later enabled.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2ea7a679
    • D
      Merge tag 'mlx5-updates-2017-11-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 4fdc3023
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2017-11-09
      
      This series introduces vlan offloads related improvements for mlx5
      ethernet netdev driver, from Gal Pressman.
      
       - Add support for 802.1ad vlan filter
       - Add support for 802.1ad vlan insertion
       - Add vlan offloads statistics to ethtool (inserted/stripped vlans)
       - CHECKSUM_COMPLETE support for vlan traffic when vlan stripping is off! (Finally)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4fdc3023
    • D
      Merge branch 'IGMP-snooping-for-local-traffic' · 5d37636a
      David S. Miller committed
      Andrew Lunn says:
      
      ====================
      IGMP snooping for local traffic
      
      The linux bridge supports IGMP snooping. It will listen to IGMP
      reports on bridge ports and keep track of which groups have been
      joined on an interface. It will then forward multicast based on this
      group membership.
      
      When the bridge adds or removes groups from an interface, it uses
      switchdev to request the hardware add an mdb to a port, so the
      hardware can perform the selective forwarding between ports.
      
      What is not covered by the current bridge code, is IGMP joins/leaves
      from the host on the brX interface. These are not reported via
      switchdev so that hardware knows the local host is interested in the
      multicast frames.
      
      Luckily, the bridge does track joins/leaves on the brX interface. The
      code is obfuscated, which is why I missed it with my first attempt.
      So the first patch tries to remove this obfuscation. Currently,
      there are no notifications sent when the bridge interface joins a
      group. The second patch adds them. bridge monitor then shows
      joins/leaves in the same way as for other ports of the bridge.
      
      Then starts the work passing down to the hardware that the host has
      joined/left a group. The existing switchdev mdb object cannot be used,
      since the semantics are different. The existing
      SWITCHDEV_OBJ_ID_PORT_MDB is used to indicate a specific multicast
      group should be forwarded out that port of the switch. However here we
      require the exact opposite. We want multicast frames for the group
      received on the port to be forwarded to the host. Hence add a new
      object SWITCHDEV_OBJ_ID_HOST_MDB, a multicast database entry to
      forward to the host. This new object is then propagated through the
      DSA layers. No DSA driver changes should be needed, this should just
      work...
      
      This version fixes up the nitpick from Nikolay, removes an unrelated
      white space change, and adds in a patch adding a few const attributes
      to a couple of functions taking a port parameter, in order to stop the
      following patch from producing warnings.
      ====================
      Acked-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d37636a
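The difference between the two object types can be shown with a toy forwarding table: a port MDB entry sends a group out of a front-panel port, while a host MDB entry sends it to the host, which in DSA means adding the CPU port to the group. The Python model below is a sketch with invented names (`SwitchModel` and its methods), not the switchdev or DSA API.

```python
from collections import defaultdict


class SwitchModel:
    """Toy multicast database: group address -> set of egress ports."""

    def __init__(self, cpu_port):
        self.cpu_port = cpu_port
        self.mdb = defaultdict(set)

    def port_mdb_add(self, group, port):
        # Forward the group out of a front-panel port.
        self.mdb[group].add(port)

    def host_mdb_add(self, group):
        # The host joined: forward the group towards the CPU port.
        self.mdb[group].add(self.cpu_port)

    def host_mdb_del(self, group):
        # The host left: stop sending the group to the CPU port.
        self.mdb[group].discard(self.cpu_port)

    def egress_ports(self, group, ingress_port):
        # Never reflect a frame back out of the port it arrived on.
        return self.mdb.get(group, set()) - {ingress_port}
```

In this model the CPU port receives a group only after an explicit host join, which matches the follow-up change that stops adding the CPU port to every MDB entry by default.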
    • A
      net: dsa: switch: Don't add CPU port to an mdb by default · ae45102c
      Andrew Lunn committed
      Now that the host indicates when a multicast group should be forwarded
      from the switch to the host, don't do it by default.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ae45102c
    • A
      net: dsa: add more const attributes · bb9f6031
      Andrew Lunn committed
      The notify mechanism does not need to modify the port it is notifying.
      So make the parameter const.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bb9f6031
    • A
      net: dsa: slave: Handle switchdev host mdb add/del · 5f4dbc50
      Andrew Lunn committed
      Add code to handle switchdev host mdb add/del. Since DSA uses one of
      the switch ports as a transport to the host, we just need to add an
      MDB on this port.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5f4dbc50
    • A
      net: bridge: Add/del switchdev object on host join/leave · 47d5b6db
      Andrew Lunn committed
      When the host joins or leaves a multicast group, use switchdev to add
      an object to the hardware to forward traffic for the group to the
      host.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      47d5b6db
    • A
      net: bridge: Send notification when host join/leaves a group · 2a26028d
      Andrew Lunn committed
      The host can join or leave a multicast group on the brX interface, as
      indicated by IGMP snooping.  This is tracked within the bridge
      multicast code. Send a notification when this happens, in the same way
      a notification is sent when a port of the bridge joins/leaves a group
      because of IGMP snooping.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2a26028d
    • A
      net: bridge: Rename mglist to host_joined · ff0fd34e
      Andrew Lunn committed
      The boolean mglist indicates the host has joined a particular
      multicast group on the bridge interface. It is badly named, obscuring
      what it means. Rename it.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ff0fd34e
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 4dc6758d
      David S. Miller committed
      Simple cases of overlapping changes in the packet scheduler.
      
      Much easier to resolve this time.
      
      Which probably means that I screwed it up somehow.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4dc6758d
    • L
      Merge tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 3fefc318
      Linus Torvalds committed
      Pull final power management fixes from Rafael Wysocki:
       "These fix a regression in the schedutil cpufreq governor introduced by
        a recent change and blacklist Dell XPS13 9360 from using the Low Power
        S0 Idle _DSM interface which triggers serious problems on one of these
        machines.
      
        Specifics:
      
         - Prevent the schedutil cpufreq governor from using the utilization
           of a wrong CPU in some cases which started to happen after one of
           the recent changes in it (Chris Redpath).
      
         - Blacklist Dell XPS13 9360 from using the Low Power S0 Idle _DSM
           interface as that causes serious issues (related to NVMe) to appear
           on one of these machines, even though other Dell XPS13 9360s in
           somewhat different HW configurations behave correctly (Rafael
           Wysocki)"
      
      * tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / PM: Blacklist Low Power S0 Idle _DSM for Dell XPS13 9360
        cpufreq: schedutil: Examine the correct CPU when we update util
      3fefc318
    • L
      Merge tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · d93d4ce1
      Linus Torvalds committed
      Pull sound fixes from Takashi Iwai:
       "The amount of changes isn't quite as small as wished, nevertheless
        they are straight fixes that deserve merging to 4.14 final.
      
        Most of the fixes are about ALSA core bugs spotted by fuzzer: a follow-up
        fix for the previous nested rwsem patch, a fix to avoid the resource
        hogs due to too many concurrent ALSA timer invocations, and a fix for
        a crash with SYSEX MIDI transfer over OSS sequencer emulation that is
        used by none but the fuzzer.
      
        The rest are the usual HD-audio and USB-audio device-specific quirks,
        which are safe to apply"
      
      * tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - fix headset mic problem for Dell machines with alc274
        ALSA: seq: Fix OSS sysex delivery in OSS emulation
        ALSA: seq: Avoid invalid lockdep class warning
        ALSA: timer: Limit max instances per timer
        ALSA: usb-audio: support new Amanero Combo384 firmware version
      d93d4ce1
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · d1041cdc
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix use-after-free in IPSEC input parsing: the destination address
          pointer was loaded before pskb_may_pull(), which can change the
          SKB data pointers. From Florian Westphal.
      
       2) Stack out-of-bounds read in xfrm_state_find(), from Steffen
          Klassert.
      
       3) IPVS state of SKB is not properly reset when moving between
          namespaces, from Ye Yin.
      
       4) Fix crash in asix driver suspend and resume, from Andrey Konovalov.
      
       5) Don't deliver ipv6 l2tp tunnel packets to ipv4 l2tp tunnels, and
          vice versa, from Guillaume Nault.
      
       6) Fix DSACK undo on non-dup ACKs, from Priyaranjan Jha.
      
       7) Fix regression in bond_xmit_hash()'s behavior after the TCP port
          selection changes back in 4.2, from Hangbin Liu.
      
       8) Two divide by zero bugs in USB networking drivers when parsing
          descriptors, from Bjorn Mork.
      
       9) Fix bonding slaves being stuck in BOND_LINK_FAIL state, from Jay
          Vosburgh.
      
      10) Missing skb_reset_mac_header() in qmi_wwan, from Kristian Evensen.
      
      11) Fix the destruction of tc action object races properly, from Cong
          Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
        cls_u32: use tcf_exts_get_net() before call_rcu()
        cls_tcindex: use tcf_exts_get_net() before call_rcu()
        cls_rsvp: use tcf_exts_get_net() before call_rcu()
        cls_route: use tcf_exts_get_net() before call_rcu()
        cls_matchall: use tcf_exts_get_net() before call_rcu()
        cls_fw: use tcf_exts_get_net() before call_rcu()
        cls_flower: use tcf_exts_get_net() before call_rcu()
        cls_flow: use tcf_exts_get_net() before call_rcu()
        cls_cgroup: use tcf_exts_get_net() before call_rcu()
        cls_bpf: use tcf_exts_get_net() before call_rcu()
        cls_basic: use tcf_exts_get_net() before call_rcu()
        net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
        Revert "net_sched: hold netns refcnt for each action"
        net: usb: asix: fill null-ptr-deref in asix_suspend
        Revert "net: usb: asix: fill null-ptr-deref in asix_suspend"
        qmi_wwan: Add missing skb_reset_mac_header-call
        bonding: fix slave stuck in BOND_LINK_FAIL state
        qrtr: Move to postcore_initcall
        net: qmi_wwan: fix divide by 0 on bad descriptors
        net: cdc_ether: fix divide by 0 on bad descriptors
        ...
      d1041cdc
  3. 09 Nov, 2017 3 commits
    • H
      ALSA: hda - fix headset mic problem for Dell machines with alc274 · 75ee94b2
      Hui Wang committed
      As confirmed with Kailang of Realtek, pin 0x19 is for the Headset Mic
      and pin 0x1a is for the Headphone Mic. He suggested applying
      ALC269_FIXUP_DELL1_MIC_NO_PRESENCE to fix this problem, and we
      verified that applying this fixup does fix it.
      
      Cc: <stable@vger.kernel.org>
      Cc: Kailang Yang <kailang@realtek.com>
      Signed-off-by: Hui Wang <hui.wang@canonical.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      75ee94b2
    • G
      net/mlx5e: CHECKSUM_COMPLETE offload for VLAN/QinQ packets · f938daee
      Gal Pressman committed
      When the VLAN tag is present in the packet buffer (i.e. VLAN stripping disabled, or QinQ)
      the driver will currently report CHECKSUM_UNNECESSARY.
      Instead of using CHECKSUM_COMPLETE offload for packets with first
      ethertype of IPv4/6, use it for packets with last ethertype of IPv4/6 to
      cover the former cases as well.
      
      The checksum field present in the CQE is calculated from the IP header
      until the end of the packet. When the first ethertype is something other
      than IPv4/6 (e.g. 802.1Q VLAN), a checksum of the VLAN header(s) should
      be added. This small header checksum calculation allows us to use
      CHECKSUM_COMPLETE instead of CHECKSUM_UNNECESSARY.
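
The arithmetic behind adding the VLAN header checksum can be illustrated with a
minimal Python sketch. It is not the driver code: the helper names only mirror
the kernel's csum_partial()/csum_add() in spirit, and the byte strings are
made-up sample data. It shows that the 16-bit ones' complement sum of
"VLAN header + IP packet" equals the sum of the IP part (what the CQE is said
to provide) combined with the sum of the 4-byte VLAN header.

```python
def csum_fold(total: int) -> int:
    """Fold a wide sum into 16 bits with end-around carry."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_partial(data: bytes) -> int:
    """16-bit ones' complement sum of data (zero-padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return csum_fold(total)


def csum_add(a: int, b: int) -> int:
    """Combine two folded sums; valid when the first block has even length."""
    return csum_fold(a + b)


# Hypothetical sample data: a 4-byte 802.1Q header (TCI 0x0064, inner
# ethertype 0x0800) followed by arbitrary bytes standing in for the
# IP header and payload.
vlan_header = bytes.fromhex("00640800")
ip_to_end = bytes.fromhex("4500001c000040004011b861c0a80001c0a800c7deadbeef01")

hw_csum = csum_partial(ip_to_end)              # stands in for the CQE checksum
full_csum = csum_add(csum_partial(vlan_header), hw_csum)
```

Because the VLAN header has an even length, the two partial sums can simply be
added; for QinQ the same step would be repeated once per tag.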
      
      Testing bandwidth of one and 8 TCP streams to a single RQ,
      LRO and VLAN stripping offloads disabled:
      CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
      NIC: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
      
      Before:
      +--------------+--------------------+---------------------+----------------------+
      | Traffic type | 1 Stream BW [Mbps] | 8 Streams BW [Mbps] |   Checksum offload   |
      +--------------+--------------------+---------------------+----------------------+
      | Untagged     |          28,247.35 |           24,716.88 | CHECKSUM_COMPLETE    |
      | VLAN         |          27,516.69 |           23,752.26 | CHECKSUM_UNNECESSARY |
      | QinQ         |           6,961.30 |           20,667.04 | CHECKSUM_UNNECESSARY |
      +--------------+--------------------+---------------------+----------------------+
      
      Now:
      +--------------+--------------------+---------------------+-------------------+
      | Traffic type | 1 Stream BW [Mbps] | 8 Streams BW [Mbps] | Checksum offload  |
      +--------------+--------------------+---------------------+-------------------+
      | Untagged     |          28,521.28 |           24,926.32 | CHECKSUM_COMPLETE |
      | VLAN         |          27,389.37 |           23,715.34 | CHECKSUM_COMPLETE |
      | QinQ         |           6,901.77 |           20,845.73 | CHECKSUM_COMPLETE |
      +--------------+--------------------+---------------------+-------------------+
      
      No performance degradation observed.
      Signed-off-by: Gal Pressman <galp@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      f938daee
    • G
      net/mlx5e: Add VLAN offloads statistics · f24686e8
      Gal Pressman committed
      The following counters are now exposed through ethtool -S:
      rx[i]_removed_vlan_packets (per channel)
      rx_removed_vlan_packets
      tx[i]_added_vlan_packets (per channel)
      tx_added_vlan_packets
      
      rx_removed_vlan_packets: The number of packets that had their
      outer VLAN header stripped to the CQE by the hardware.
      tx_added_vlan_packets: The number of packets that had their
      outer VLAN header inserted by the hardware.
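
As a rough illustration of how the per-channel and aggregate counters relate,
the following Python sketch sums the per-channel values found in text shaped
like `ethtool -S` output. The sample text, its numbers, and the assumption that
`[i]` is rendered as a plain channel index (e.g. `rx0_removed_vlan_packets`)
are hypothetical; the expectation that the aggregate equals the sum over
channels is likewise an assumption, not something stated above.

```python
import re
from collections import defaultdict

# Matches the assumed per-channel names, e.g. "rx0_removed_vlan_packets".
_PER_CHANNEL = re.compile(r"^(rx|tx)(\d+)_(removed|added)_vlan_packets$")


def sum_per_channel(ethtool_output: str) -> dict:
    """Sum per-channel VLAN counters, keyed by the aggregate counter name."""
    totals = defaultdict(int)
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep:
            continue
        match = _PER_CHANNEL.match(name.strip())
        if match:
            direction, _channel, verb = match.groups()
            totals[f"{direction}_{verb}_vlan_packets"] += int(value)
    return dict(totals)


# Hypothetical sample in the general "name: value" layout of `ethtool -S`.
SAMPLE = """\
NIC statistics:
     rx_removed_vlan_packets: 300
     tx_added_vlan_packets: 70
     rx0_removed_vlan_packets: 120
     rx1_removed_vlan_packets: 180
     tx0_added_vlan_packets: 30
     tx1_added_vlan_packets: 40
"""

channel_totals = sum_per_channel(SAMPLE)
```

Comparing `channel_totals` with the aggregate lines in the same output is one
way to sanity-check the counters on a given interface.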
      Signed-off-by: Gal Pressman <galp@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      f24686e8