1. 06 Mar, 2018 1 commit
    • e1000e: Remove Other from EIAC · 745d0bd3
      Committed by Benjamin Poirier
      It was reported that emulated e1000e devices in VMware ESXi 6.5 build
      7526125 do not link up after commit 4aea7a5c ("e1000e: Avoid receiver
      overrun interrupt bursts", v4.15-rc1). Some tracing shows that after
      e1000e_trigger_lsc() is called, ICR reads out as 0x0 in e1000_msix_other()
      on emulated e1000e devices. In comparison, on real e1000e 82574 hardware,
      icr=0x80000004 (_INT_ASSERTED | _LSC) in the same situation.
      
      Some experimentation showed that this flaw in VMware e1000e emulation can
      be worked around by not setting Other in EIAC. This is how it was before
      16ecba59 ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1).
      
      Fixes: 4aea7a5c ("e1000e: Avoid receiver overrun interrupt bursts")
      Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  2. 05 Mar, 2018 19 commits
  3. 04 Mar, 2018 1 commit
  4. 03 Mar, 2018 1 commit
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 4a0c7191
      Committed by David S. Miller
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter fixes for your net tree,
      they are:
      
      1) Put back reference on CLUSTERIP configuration structure from the
         error path, patch from Florian Westphal.
      
      2) Put reference on CLUSTERIP configuration instead of freeing it,
         another cpu may still be walking over it, also from Florian.
      
      3) Refetch pointer to IPv6 header from nf_nat_ipv6_manip_pkt() given
         packet manipulation may reallocate the skbuff header, from Florian.
      
      4) Missing match size sanity checks in ebt_among, from Florian.
      
      5) Convert BUG_ON to WARN_ON in ebtables, from Florian.
      
      6) Sanity check userspace offsets from ebtables kernel, from Florian.
      
      7) Missing checksum replace call in flowtable IPv4 DNAT, from Felix
         Fietkau.
      
      8) Bump the right stats on checksum error from bridge netfilter,
         from Taehee Yoo.
      
      9) Unset interface flag in IPv6 fib lookups otherwise we get
         misleading routing lookup results, from Florian.
      
      10) Missing sk_to_full_sk() in ip6_route_me_harder() from Eric Dumazet.
      
      11) Don't allow devices to be part of multiple flowtables at the same
          time, this may break setups.
      
      12) Missing netlink attribute validation in flowtable deletion.
      
      13) Wrong array index in nf_unregister_net_hook() call from error path
          in flowtable addition path.
      
      14) Fix FTP IPVS helper when NAT mangling is in place, patch from
          Julian Anastasov.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 02 Mar, 2018 7 commits
    • Merge tag 'mac80211-for-davem-2018-03-02' of... · d69242bf
      Committed by David S. Miller
      Merge tag 'mac80211-for-davem-2018-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      Three more patches:
       * fix for a regression in 4-addr mode with fast-RX
       * fix for a Kconfig problem with the new regdb
       * fix for the long-standing TCP performance issue in
         wifi using the new sk_pacing_shift_update()
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds: Incorrect reference counting in TCP socket creation · 84eef2b2
      Committed by Ka-Cheong Poon
      Commit 0933a578 ("rds: tcp: use sock_create_lite() to create the
      accept socket") has a reference counting issue in TCP socket creation
      when accepting a new connection.  The code uses sock_create_lite() to
      create a kernel socket.  But it does not do __module_get() on the
      socket owner.  When the connection is shut down and sock_release() is
      called to free the socket, the owner's reference count is decremented
      and becomes incorrect.  Note that this bug only shows up when the socket
      owner is configured as a kernel module.
      
      v2: Update comments
      
      Fixes: 0933a578 ("rds: tcp: use sock_create_lite() to create the accept socket")
      Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
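
      The reference imbalance described in the rds commit above can be
      sketched in plain C. This is a userspace illustration only, not the
      kernel code: struct demo_module, struct demo_sock and every demo_*
      function are hypothetical stand-ins for the module owner, the socket,
      sock_create_lite(), __module_get() and sock_release().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a kernel module and a socket it owns. */
struct demo_module { int refcnt; };
struct demo_sock { struct demo_module *owner; };

/* Models a "lite" create: records the owner but takes no reference. */
static void demo_create_lite(struct demo_sock *s, struct demo_module *owner)
{
    s->owner = owner;
}

/* Models release: always drops one reference on the owner. */
static void demo_release(struct demo_sock *s)
{
    s->owner->refcnt--;
    s->owner = NULL;
}

/* Accept path with the fix: take the reference that release will drop. */
static void demo_accept(struct demo_sock *s, struct demo_module *owner)
{
    demo_create_lite(s, owner);
    owner->refcnt++; /* without this line, release leaves the count one too low */
}
```

      With the extra get in demo_accept(), every release is paired with a
      get, so the owner's count returns to its starting value after the
      connection is torn down.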
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · a5f7b0ee
      Committed by David S. Miller
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-02-28
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Add schedule points and reduce the number of loop iterations
         the test_bpf kernel module is performing in order to not hog
         the CPU for too long, from Eric.
      
      2) Fix an out of bounds access in tail calls in the ppc64 BPF
         JIT compiler, from Daniel.
      
      3) Fix a crash on arm64 on unaligned BPF xadd operations that
         could be triggered via interpreter and JIT, from Daniel.
      
      Please note that once you merge net into net-next at some point, there
      is a minor merge conflict in test_verifier.c since test cases had
      been added at the end in both trees. Resolution is trivial: keep all
      the test cases from both trees.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: don't ignore return from driver get_fecparam method · a6d50512
      Committed by Edward Cree
      If ethtool_ops->get_fecparam returns an error, pass that error on to the
      user, rather than ignoring it.
      
      Fixes: 1a5f3da2 ("net: ethtool: add support for forward error correction modes")
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
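
      The error propagation described in the ethtool commit above could look
      roughly like the following userspace sketch. It is not the kernel's
      ethtool code: struct demo_ops, struct demo_fecparam and the demo_*
      functions are hypothetical, and only the pattern (check the driver
      callback's return value before using its output) is taken from the
      commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical parameter block and driver callback table. */
struct demo_fecparam { unsigned int fec; };
struct demo_ops { int (*get_fecparam)(struct demo_fecparam *p); };

/* Two hypothetical drivers: one succeeds, one reports an I/O error. */
static int demo_driver_ok(struct demo_fecparam *p) { p->fec = 4; return 0; }
static int demo_driver_fail(struct demo_fecparam *p) { (void)p; return -EIO; }

static int demo_get_fecparam(const struct demo_ops *ops, struct demo_fecparam *out)
{
    struct demo_fecparam tmp = { 0 };
    int rc;

    if (!ops->get_fecparam)
        return -EOPNOTSUPP;

    rc = ops->get_fecparam(&tmp);
    if (rc)
        return rc; /* pass the driver's error on instead of ignoring it */

    *out = tmp; /* only publish the result once the driver succeeded */
    return 0;
}
```

      A caller that receives a negative value knows the output was never
      written, so it cannot mistake zero-initialised data for a real answer.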
    • vrf: check forwarding on the original netdevice when generating ICMP dest unreachable · e2c0dc1f
      Committed by Stephen Suryaputra
      When ip_error() is called the device is the l3mdev master instead of the
      original device. So the forwarding check should be on the original one.
      
      Changes from v2:
      - Handle the original device disappearing (per David Ahern)
      - Minimize the change in code order
      
      Changes from v1:
      - Only need to reset the device on which __in_dev_get_rcu() is done (per
        David Ahern).
      Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: allow interface to be set into VRF if VLAN interface in same VRF · 50d629e7
      Committed by Mike Manning
      Setting an interface into a VRF fails with 'RTNETLINK answers: File
      exists' if one of its VLAN interfaces is already in the same VRF.
      As the VRF is an upper device of the VLAN interface, it is also showing
      up as an upper device of the interface itself. The solution is to
      restrict this check to devices other than master. As only one master
      device can be linked to a device, the check in this case is that the
      upper device (VRF) being linked to is not the same as the master device
      instead of it not being any one of the upper devices.
      
      The following example shows an interface ens12 (with a VLAN interface
      ens12.10) being set into VRF green, which behaves as expected:
      
        # ip link add link ens12 ens12.10 type vlan id 10
        # ip link set dev ens12 master vrfgreen
        # ip link show dev ens12
          3: ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
             master vrfgreen state UP mode DEFAULT group default qlen 1000
             link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff
      
      But if the VLAN interface has previously been set into the same VRF,
      then setting the interface into the VRF fails:
      
        # ip link set dev ens12 nomaster
        # ip link set dev ens12.10 master vrfgreen
        # ip link show dev ens12.10
          39: ens12.10@ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
          qdisc noqueue master vrfgreen state UP mode DEFAULT group default
          qlen 1000 link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff
        # ip link set dev ens12 master vrfgreen
          RTNETLINK answers: File exists
      
      The workaround is to move the VLAN interface back into the default VRF
      beforehand, but it has to be shut down first so as to avoid the risk of
      traffic leaking from the VRF. This fix avoids needing this workaround.
      Signed-off-by: Mike Manning <mmanning@att.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv4: avoid unused variable warning for sysctl · 773daa3c
      Committed by Arnd Bergmann
      The newly introduced ip_min_valid_pmtu variable is only used when
      CONFIG_SYSCTL is set:
      
      net/ipv4/route.c:135:12: error: 'ip_min_valid_pmtu' defined but not used [-Werror=unused-variable]
      
      This moves it to the other variables like it, to avoid the harmless
      warning.
      
      Fixes: c7272c2f ("net: ipv4: don't allow setting net.ipv4.route.min_pmtu below 68")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 01 Mar, 2018 11 commits
    • ipvs: remove IPS_NAT_MASK check to fix passive FTP · 8a949fff
      Committed by Julian Anastasov
      The IPS_NAT_MASK check in 4.12 replaced previous check for nfct_nat()
      which was needed to fix a crash in 2.6.36-rc, see
      commit 7bcbf81a ("ipvs: avoid oops for passive FTP").
      But as IPVS does not set the IPS_SRC_NAT and IPS_DST_NAT bits,
      checking for IPS_NAT_MASK prevents the PASV response from being properly
      mangled and blocks the transfer. Remove the check as it is not
      needed after 3.12 commit 41d73ec0 ("netfilter: nf_conntrack:
      make sequence number adjustments usuable without NAT") which
      replaces nfct_nat() with nfct_seqadj() and especially after 3.13
      commit b25adce1 ("ipvs: correct usage/allocation of seqadj
      ext in ipvs").
      
      Thanks to Li Shuang and Florian Westphal for reporting the problem!
      Reported-by: Li Shuang <shuali@redhat.com>
      Fixes: be7be6e1 ("netfilter: ipvs: fix incorrect conflict resolution")
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge branch 'mlxsw-fixes' · b739012b
      Committed by David S. Miller
      Jiri Pirko says:
      
      ====================
      mlxsw: couple of fixes
      
      Couple of unrelated fixes for mlxsw.
      
      ---
      v1->v2:
      -patch 2:
       - rebase on top of current -net tree
       - removed forgotten empty line
      -patch 3:
       - new patch
      -patch 4:
       - new patch
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • spectrum: Reference count VLAN entries · b3529af6
      Committed by Ido Schimmel
      One of the basic constructs in the device is a port-VLAN pair, which can
      be bound to a FID or a RIF in order to direct packets to the bridge or
      the router, respectively.
      
      Since not all the netdevs are configured with a VLAN (e.g., sw1p1 vs.
      sw1p1.10), VID 1 is used to represent these and thus this VID can be
      used by both upper devices of mlxsw ports and by the driver itself.
      
      However, this VID is not reference counted and therefore might be freed
      prematurely, which can result in various WARNINGs. For example:
      
      $ ip link add name br0 type bridge vlan_filtering 1
      $ teamd -t team0 -d -c '{"runner": {"name": "lacp"}}'
      $ ip link set dev team0 master br0
      $ ip link set dev enp1s0np1 master team0
      $ ip address add 192.0.2.1/24 dev enp1s0np1
      
      The enslavement to team0 will fail because team0 already has an upper
      and thus vlan_vids_del_by_dev() will be executed as part of team's error
      path which will delete VID 1 from enp1s0np1 (added by br0 as PVID). The
      WARNING is generated when the driver realizes it cannot find VID
      1 on the port and bind it to a RIF.
      
      Fix this by adding a reference count to the VLAN entries on the port, in
      a similar fashion to the reference counting used by the corresponding
      'vlan_vid_info' structure in the 8021q driver.
      
      Fixes: c57529e1 ("mlxsw: spectrum: Replace vPorts with Port-VLAN")
      Reported-by: Tal Bar <talb@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Tested-by: Tal Bar <talb@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
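
      The reference counting described in the spectrum commit above can be
      illustrated with a small userspace sketch. It is not the mlxsw driver
      code: struct demo_port and the demo_vlan_* helpers are hypothetical,
      and a flat per-VID counter array is assumed purely for brevity.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_VID_COUNT 4096 /* VIDs 0..4095 */

/* Hypothetical port with one use count per VLAN ID. */
struct demo_port { unsigned int vlan_ref[DEMO_VID_COUNT]; };

/* Each user of a VID (an upper device or the driver itself) takes a reference. */
static void demo_vlan_get(struct demo_port *port, unsigned int vid)
{
    if (vid < DEMO_VID_COUNT)
        port->vlan_ref[vid]++;
}

/* Dropping a reference only removes the entry when the last user is gone. */
static void demo_vlan_put(struct demo_port *port, unsigned int vid)
{
    if (vid < DEMO_VID_COUNT && port->vlan_ref[vid] > 0)
        port->vlan_ref[vid]--;
}

static bool demo_vlan_exists(const struct demo_port *port, unsigned int vid)
{
    return vid < DEMO_VID_COUNT && port->vlan_ref[vid] > 0;
}
```

      In this model, an error path that drops one user's reference on VID 1
      no longer removes the entry while another user still holds it.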
    • mlxsw: spectrum: Treat IPv6 unregistered multicast as broadcast · 9d45deb0
      Committed by Ido Schimmel
      When multicast snooping is enabled, the Linux bridge resorts to flooding
      unregistered multicast packets to all ports only in case it did not
      detect a querier in the network.
      
      The above condition is not reflected to underlying drivers, which is
      especially problematic in IPv6 environments, as multicast snooping is
      enabled by default and since neighbour solicitation packets might be
      treated as unregistered multicast packets in case there is no
      corresponding MDB entry.
      
      Until the Linux bridge reflects its querier state to underlying drivers,
      simply treat unregistered multicast packets as broadcast and allow them
      to reach their destination.
      
      Fixes: 9df552ef ("mlxsw: spectrum: Improve IPv6 unregistered multicast flooding")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Fix handling of resource_size_param · 77d27096
      Committed by Jiri Pirko
      Current code uses global variables, adjusts them and passes pointer down
      to devlink. With every other mlxsw_core instance, the previously passed
      pointer values are rewritten. Fix this by de-globalizing the variables and
      also memcpy size_params during devlink resource registration.
      Also, introduce a convenient size_param_init helper.
      
      Fixes: ef3116e5 ("mlxsw: spectrum: Register KVD resources with devlink")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
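
      The copy-on-registration idea from the resource_size_param commit above
      can be sketched as follows. This is a userspace illustration, not the
      devlink or mlxsw API: struct demo_size_params, struct demo_resource and
      both demo_* functions are hypothetical, and the field names are
      assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size limits attached to a resource. */
struct demo_size_params { uint64_t size_min, size_max; };

/* The resource owns a copy of the limits rather than a pointer to the caller's. */
struct demo_resource { struct demo_size_params params; };

/* Convenience initialiser, in the spirit of the helper the commit mentions. */
static void demo_size_params_init(struct demo_size_params *p,
                                  uint64_t size_min, uint64_t size_max)
{
    p->size_min = size_min;
    p->size_max = size_max;
}

/* Registration copies the values, so later changes by the caller do not leak in. */
static void demo_resource_register(struct demo_resource *res,
                                   const struct demo_size_params *p)
{
    memcpy(&res->params, p, sizeof(res->params));
}
```

      Because each registered resource keeps its own copy, a second instance
      that reuses and rewrites the same parameter variable cannot change the
      limits seen by the first.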
    • mlxsw: core: Fix flex keys scratchpad offset conflict · 2ddc94c7
      Committed by Jiri Pirko
      IP_TTL, IP_ECN and IP_DSCP are using the same offset within the
      scratchpad as L4 ports. Fix this by shifting all up.
      
      Fixes: 5f57e090 ("mlxsw: acl: Add ip ttl acl element")
      Fixes: 80d0fe47 ("mlxsw: acl: Add ip tos acl element")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-smc-fixes' · 7358799c
      Committed by David S. Miller
      Ursula Braun says:
      
      ====================
      net/smc: fixes 2018-02-28
      
      Here are 3 smc bug fixes for the net-tree. Karsten's first patch is
      the reworked version of last week's
         "[PATCH net-next 2/5] net/smc: fix structure size"
      patch, now solved without using __packed, and now targeted for net
      instead of net-next.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: fix NULL pointer dereference on sock_create_kern() error path · a5dcb73b
      Committed by Davide Caratti
      When sock_create_kern(..., a) returns an error, 'a' might not be a valid
      pointer, so it shouldn't be dereferenced to read a->sk->sk_sndbuf and
      a->sk->sk_rcvbuf; dereferencing it anyway caused the following crash:
      
      general protection fault: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
          (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 4254 Comm: syzkaller919713 Not tainted 4.16.0-rc1+ #18
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:smc_create+0x14e/0x300 net/smc/af_smc.c:1410
      RSP: 0018:ffff8801b06afbc8 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff8801b63457c0 RCX: ffffffff85a3e746
      RDX: 0000000000000004 RSI: 00000000ffffffff RDI: 0000000000000020
      RBP: ffff8801b06afbf0 R08: 00000000000007c0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: ffff8801b6345c08 R14: 00000000ffffffe9 R15: ffffffff8695ced0
      FS:  0000000001afb880(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000040 CR3: 00000001b0721004 CR4: 00000000001606f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        __sock_create+0x4d4/0x850 net/socket.c:1285
        sock_create net/socket.c:1325 [inline]
        SYSC_socketpair net/socket.c:1409 [inline]
        SyS_socketpair+0x1c0/0x6f0 net/socket.c:1366
        do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x26/0x9b
      RIP: 0033:0x4404b9
      RSP: 002b:00007fff44ab6908 EFLAGS: 00000246 ORIG_RAX: 0000000000000035
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004404b9
      RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000000002b
      RBP: 00007fff44ab6910 R08: 0000000000000002 R09: 00007fff44003031
      R10: 0000000020000040 R11: 0000000000000246 R12: ffffffffffffffff
      R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000
      Code: 48 c1 ea 03 80 3c 02 00 0f 85 b3 01 00 00 4c 8b a3 48 04 00 00 48 b8
      00 00 00 00 00 fc ff df 49 8d 7c 24 20 48 89 fa 48 c1 ea 03 <80> 3c 02 00
      0f 85 82 01 00 00 4d 8b 7c 24 20 48 b8 00 00 00 00
      RIP: smc_create+0x14e/0x300 net/smc/af_smc.c:1410 RSP: ffff8801b06afbc8
      
      Fixes: cd6851f3 ("smc: remote memory buffers (RMBs)")
      Reported-and-tested-by: syzbot+aa0227369be2dcc26ebe@syzkaller.appspotmail.com
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
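
      The error-path rule from the net/smc commit above (do not touch the
      output pointer when the create call failed) can be sketched in
      userspace C. This is not the smc code: struct demo_sk, struct
      demo_socket, demo_sock_create() and demo_smc_create() are hypothetical,
      and the buffer sizes are made-up values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical socket layout: an outer socket pointing at its inner state. */
struct demo_sk { int sndbuf, rcvbuf; };
struct demo_socket { struct demo_sk *sk; };

static struct demo_sk demo_inner_sk = { 4096, 8192 };
static struct demo_socket demo_inner = { &demo_inner_sk };

/* On failure, *res is left untouched, so the caller must not use it. */
static int demo_sock_create(int should_fail, struct demo_socket **res)
{
    if (should_fail)
        return -ENOMEM;
    *res = &demo_inner;
    return 0;
}

static int demo_smc_create(int should_fail, struct demo_sk *self)
{
    struct demo_socket *clc = NULL;
    int rc = demo_sock_create(should_fail, &clc);

    if (rc)
        return rc; /* bail out before dereferencing clc */

    self->sndbuf = clc->sk->sndbuf;
    self->rcvbuf = clc->sk->rcvbuf;
    return 0;
}
```

      Checking the return value first means the buffer sizes are only read
      from a socket that is known to exist.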
    • net/smc: use link_id of server in confirm link reply · 2be922f3
      Committed by Karsten Graul
      The CONFIRM LINK reply message must contain the link_id sent
      by the server. Also set the link_id explicitly when
      initializing the link.
      Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: use a constant for control message length · cbba07a7
      Committed by Karsten Graul
      The sizeof(struct smc_cdc_msg) evaluates to 48 bytes instead of the
      required 44 bytes. We need to use the constant value of
      SMC_WR_TX_SIZE to set and check the control message length.
      Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
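
      The 48-versus-44 mismatch described in the commit above is the usual
      effect of struct tail padding. The sketch below shows the general
      technique of using a fixed wire-length constant instead of sizeof. It
      does not reproduce struct smc_cdc_msg: struct demo_msg, DEMO_WIRE_SIZE
      and demo_serialize() are hypothetical, and the exact sizeof result
      depends on the target ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_WIRE_SIZE 44 /* hypothetical on-the-wire message length */

struct demo_msg {
    uint64_t token;    /* an 8-byte member raises the struct alignment on many ABIs */
    uint8_t  body[36]; /* 8 + 36 = 44 bytes of real fields */
};

/*
 * Copy exactly the wire length. Using sizeof(struct demo_msg) here could
 * also copy tail padding that the compiler added to satisfy alignment.
 */
static size_t demo_serialize(const struct demo_msg *msg, uint8_t *out, size_t out_len)
{
    if (out_len < DEMO_WIRE_SIZE)
        return 0;
    memcpy(out, msg, DEMO_WIRE_SIZE);
    return DEMO_WIRE_SIZE;
}
```

      On ABIs where uint64_t is 8-byte aligned, sizeof(struct demo_msg) is
      rounded up past 44, which is why the length check and the copy both use
      the constant.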
    • virtio-net: disable NAPI only when enabled during XDP set · 4e09ff53
      Committed by Jason Wang
      We try to disable NAPI to prevent a single XDP TX queue being used by
      multiple cpus. But we don't check if the device is up (NAPI is enabled),
      which could result in a stall because of an infinite wait in
      napi_disable(). Fix this by checking the device state through
      netif_running() first.
      
      Fixes: 4941d472 ("virtio-net: do not reset during XDP set")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>