1. 26 Apr, 2018 4 commits
    • sctp: fix const parameter violation in sctp_make_sack · 47b3ba51
      Marcelo Ricardo Leitner committed
      sctp_make_sack() makes changes to the asoc, and this cast just
      bypasses the const attribute. As there is no need to have the const
      there, remove it and fix the violation.
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neighbour: support for NTF_EXT_LEARNED flag · 9ce33e46
      Roopa Prabhu committed
      This patch extends NTF_EXT_LEARNED support to the neighbour system.
      Example use-case: An Ethernet VPN implementation (e.g. in the FRR routing
      suite) can use this flag to add dynamic reachable external neigh entries
      learned via control plane. The use of neigh NTF_EXT_LEARNED in this
      patch is consistent with its use with bridge and vxlan fdb entries.
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: addrconf: don't evaluate keep_addr_on_down twice · 0aef78aa
      Ivan Vecera committed
      addrconf_ifdown() evaluates the keep_addr_on_down state twice. There
      is no need to do so.

      Cc: David Ahern <dsahern@gmail.com>
      Signed-off-by: Ivan Vecera <cera@cera.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: sr: Compute flowlabel for outer IPv6 header of seg6 encap mode · b5facfdb
      Ahmed Abdelsalam committed
      ECMP (equal-cost multipath) hashes are typically computed on the packets'
      5-tuple (src IP, dst IP, src port, dst port, L4 proto).
      
      For encapsulated packets, the L4 data is not readily available and ECMP
      hashing will often revert to (src IP, dst IP). This will lead to traffic
      polarization on a single ECMP path, causing congestion and waste of network
      capacity.
      
      In IPv6, the 20-bit flow label field is also used as part of the ECMP hash.
      In the lack of L4 data, the hashing will be on (src IP, dst IP, flow
      label). Having a non-zero flow label is thus important for proper traffic
      load balancing when L4 data is unavailable (i.e., when packets are
      encapsulated).
      
      Currently, the seg6_do_srh_encap() function extracts the original packet's
      flow label and sets it as the outer IPv6 flow label. There are two issues
      with this behaviour:
      
      a) There is no guarantee that the inner flow label is set by the source.
      b) If the original packet is not IPv6, the flow label will be set to
      zero (e.g., IPv4 or L2 encap).
      
      This patch adds a function, named seg6_make_flowlabel(), that computes a
      flow label from a given skb. It supports IPv6, IPv4 and L2 payloads, and
      leverages the per-namespace 'seg6_flowlabel' sysctl value.

      The currently supported behaviours are as follows:
      -1 set flowlabel to zero.
      0 copy flowlabel from the inner packet in case of inner IPv6
      (set flowlabel to 0 in case of IPv4/L2)
      1 compute the flowlabel using seg6_make_flowlabel()
      
      This patch has been tested for IPv6, IPv4, and L2 traffic.
      Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com>
      Acked-by: David Lebrun <dlebrun@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
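The three sysctl behaviours listed in the commit above can be sketched as a small user-space function. This is only an illustration and not the kernel code: `pick_outer_flowlabel()`, the `FL_*` names and the `flow_hash` argument (standing in for whatever hash is derived from the skb) are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLOWLABEL_MASK 0x000FFFFFu /* an IPv6 flow label is 20 bits wide */

/* Hypothetical names for the three 'seg6_flowlabel' values in the commit. */
enum flowlabel_mode { FL_ZERO = -1, FL_COPY_INNER = 0, FL_COMPUTE = 1 };

/*
 * Choose the outer IPv6 flow label for an encapsulated packet.
 * flow_hash stands in for a hash computed over the inner packet.
 */
static uint32_t pick_outer_flowlabel(int mode, bool inner_is_ipv6,
				     uint32_t inner_flowlabel,
				     uint32_t flow_hash)
{
	assert(mode >= FL_ZERO && mode <= FL_COMPUTE);

	if (mode == FL_COMPUTE)
		return flow_hash & FLOWLABEL_MASK;
	if (mode == FL_COPY_INNER && inner_is_ipv6)
		return inner_flowlabel & FLOWLABEL_MASK;
	return 0; /* mode -1, or mode 0 with an IPv4/L2 payload */
}
```

With mode 1, a non-IPv6 payload still yields a label derived from the hash, which is what lets ECMP spread encapsulated flows when L4 data is unavailable.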
  2. 25 Apr, 2018 29 commits
    • D
      c749fa18
    • Merge branch 'userns-linus' of... · 3be4aaf4
      Linus Torvalds committed
      Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
      
      Pull userns bug fix from Eric Biederman:
       "Just a small fix to properly set the return code on error"
      
      * 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        commoncap: Handle memory allocation failure.
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 24cac700
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix rtnl deadlock in ipvs, from Julian Anastasov.
      
       2) s390 qeth fixes from Julian Wiedmann (control IO completion stalls,
          bad MAC address update sequence, request side races on command IO
          timeouts).
      
       3) Handle seq_file overflow properly in l2tp, from Guillaume Nault.
      
       4) Fix VLAN priority mappings in cpsw driver, from Ivan Khoronzhuk.
      
       5) Packet scheduler ife action fixes (malformed TLV lengths, etc.) from
          Alexander Aring.
      
       6) Fix out of bounds access in tcp md5 option parser, from Jann Horn.
      
       7) Missing netlink attribute policies in rtm_ipv6_policy table, from
          Eric Dumazet.
      
       8) Missing socket address length checks in l2tp and pppoe connect, from
          Guillaume Nault.
      
       9) Fix netconsole over team and bonding, from Xin Long.
      
      10) Fix race with AF_PACKET socket state bitfields, from Willem de
          Bruijn.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (51 commits)
        ice: Fix insufficient memory issue in ice_aq_manage_mac_read
        sfc: ARFS filter IDs
        net: ethtool: Add missing kernel doc for FEC parameters
        packet: fix bitfield update race
        ice: Do not check INTEVENT bit for OICR interrupts
        ice: Fix incorrect comment for action type
        ice: Fix initialization for num_nodes_added
        igb: Fix the transmission mode of queue 0 for Qav mode
        ixgbevf: ensure xdp_ring resources are free'd on error exit
        team: fix netconsole setup over team
        amd-xgbe: Only use the SFP supported transceiver signals
        amd-xgbe: Improve KR auto-negotiation and training
        amd-xgbe: Add pre/post auto-negotiation phy hooks
        pppoe: check sockaddr length in pppoe_connect()
        l2tp: check sockaddr length in pppol2tp_connect()
        net: phy: marvell: clear wol event before setting it
        ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
        bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
        tcp: don't read out-of-bounds opsize
        ibmvnic: Clean actual number of RX or TX pools
        ...
    • liquidio: Swap VF representor Tx and Rx statistics · 16f4faa4
      Srinivas Jampala committed
      Swap the VF representor tx and rx interface statistics, since it is a
      virtual switchdev port and tx for the VM should be rx for the VF
      representor and vice versa.
      Signed-off-by: Srinivas Jampala <srinivasa.jampala@cavium.com>
      Acked-by: Derek Chickles <derek.chickles@cavium.com>
      Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv6: fix LOCKDEP issue in rt6_remove_exception_rt() · 091311de
      Eric Dumazet committed
      rt6_remove_exception_rt() is called under rcu_read_lock() only.

      We lock rt6_exception_lock a bit later, so we do not hold
      rt6_exception_lock yet.

      Fixes: 8a14e46f ("net/ipv6: Fix missing rcu dereferences on from")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: David Ahern <dsahern@gmail.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · d19efb72
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2018-04-24
      
      This series contains fixes to ixgbevf, igb and ice drivers.
      
      Colin Ian King fixes the return value on error for the new XDP support
      that went into ixgbevf for 4.17.
      
      Vinicius provides a fix for queue 0 for igb, which was not receiving all
      the credits it needed when QAV mode was enabled.
      
      Anirudh provides several fixes for the new ice driver, starting with
      properly initializing num_nodes_added to zero.  Fixed up a code comment
      to better reflect what is really going on in the code.  Fixed how to
      detect if an OICR interrupt has occurred to a more reliable method.
      
      Md Fahad fixes the ice driver to allocate the right amount of memory
      when reading and storing the device's MAC addresses.  The device can have
      up to 2 MAC addresses (LAN and WoL). While WoL is currently not
      supported, we need to ensure it can be properly handled when support is
      added.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tls: remove redundant second null check on sgout · 95ad7544
      Colin Ian King committed
      A duplicated null check on sgout is redundant as it is known to be
      already true because of the identical earlier check. Remove it.
      Detected by cppcheck:

      net/tls/tls_sw.c:696: (warning) Identical inner 'if' condition is always
      true.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fsl/fman_port: remove redundant check on port->rev_info.major · 080aadda
      Colin Ian King committed
      The check port->rev_info.major >= 6 is being performed twice, thus
      the inner second check is always true and is redundant, hence it
      can be removed. Detected by cppcheck.

      [drivers/net/ethernet/freescale/fman/fman_port.c:1394]: (warning)
      Identical inner 'if' condition is always true.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ice: Fix insufficient memory issue in ice_aq_manage_mac_read · d6fef10c
      Md Fahad Iqbal Polash committed
      For the MAC read operation, the device can return up to two (LAN and WoL)
      MAC addresses. Without access to adequate memory, the device will return
      an error. Fixed this by allocating the right amount of memory. Also, logic
      to detect and copy the LAN MAC address into the port_info structure has
      been added. Note that the WoL MAC address is ignored currently as the WoL
      feature isn't supported yet.
      
      Fixes: dc49c772 ("ice: Get MAC/PHY/link info and scheduler topology")
      Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • qed: Fix copying 2 strings · c7d852e3
      Denis Bolotin committed
      The strscpy() was a recent fix (net: qed: use correct strncpy() size) to
      prevent passing the length of the source buffer to strncpy() and to
      guarantee null termination.
      However, it misses the goal of overwriting only the first 3 characters in
      "???_BIG_RAM" and "???_RAM" while keeping the rest of the string.
      Use strncpy() with a length of 3, without null termination.
      Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
      Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
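The difference described in the qed commit can be illustrated in plain user-space C. This is a minimal sketch, not the driver code: `set_mem_prefix()` and the prefix values used with it are hypothetical, and only the "???_BIG_RAM" and "???_RAM" templates come from the commit message.

```c
#include <assert.h>
#include <string.h>

/*
 * Overwrite only the first three characters of a template such as
 * "???_BIG_RAM", keeping the rest of the string and its terminator.
 * strncpy() with n == 3 copies exactly three bytes and appends no
 * terminator when the source is at least three characters long.
 */
static void set_mem_prefix(char *name, const char *prefix)
{
	assert(name != NULL && prefix != NULL);
	assert(strlen(name) >= 3 && strlen(prefix) >= 3);

	strncpy(name, prefix, 3);
}
```

A copy that always null-terminates would instead cut the template off after the prefix, which is the behaviour the commit reverts.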
    • sfc: ARFS filter IDs · f8d62037
      Edward Cree committed
      Associate an arbitrary ID with each ARFS filter, so that expiry can be
      queried properly.  The association is maintained in a hash table, which is
      protected by a spinlock.

      v3: fix build warnings when CONFIG_RFS_ACCEL is disabled (thanks lkp-robot).
      v2: fixed uninitialised variable (thanks davem and lkp-robot).

      Fixes: 3af0f342 ("sfc: replace asynchronous filter operations")
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipconfig-NTP-server-support-bug-fixes-documentation-improvements' · bc0fbc66
      David S. Miller committed
      Chris Novakovic says:
      
      ====================
      ipconfig: NTP server support, bug fixes, documentation improvements
      
      This series (against net-next) makes various improvements to ipconfig:
      
       - Patch #1 correctly documents the behaviour of parameter 4 in the
         "ip=" and "nfsaddrs=" command line parameters.
       - Patch #2 tidies up the printk()s for reporting configured name
         servers.
       - Patch #3 fixes a bug in autoconfiguration via BOOTP whereby the IP
         addresses of IEN-116 name servers are requested from the BOOTP
         server, rather than those of DNS name servers.
       - Patch #4 requests the number of DNS servers specified by
         CONF_NAMESERVERS_MAX when autoconfiguring via BOOTP, rather than
         hardcoding it to 2.
       - Patch #5 fully documents the contents and format of /proc/net/pnp in
         Documentation/filesystems/nfs/nfsroot.txt.
       - Patch #6 fixes a bug whereby bogus information is written to
         /proc/net/pnp when ipconfig is not used.
       - Patch #7 creates a new procfs directory for ipconfig-related
         configuration reports at /proc/net/ipconfig.
       - Patch #8 allows for NTP servers to be configured (manually on the
         kernel command line or automatically via DHCP), enabling systems with
         an NFS root filesystem to synchronise their clock before mounting
         their root filesystem. NTP server IP addresses are written to
         /proc/net/ipconfig/ntp_servers.
      
      Changes from v1:
      
       - David requested that a new directory /proc/net/ipconfig be created to
         contain ipconfig-related configuration reports, which is implemented
         in the new patch #7. NTP server IPs are now written to this directory
         instead of /proc/net/ntp in the new patch #8.
       - Cong and David both requested that the modification to CREDITS be
         dropped. This patch has been removed from the series.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipconfig: Write NTP server IPs to /proc/net/ipconfig/ntp_servers · c04d2cb2
      Chris Novakovic committed
      Distributed filesystems are most effective when the server and client
      clocks are synchronised. Embedded devices often use NFS for their
      root filesystem but typically do not contain an RTC, so the clocks of
      the NFS server and the embedded device will be out-of-sync when the root
      filesystem is mounted (and may not be synchronised until late in the
      boot process).
      
      Extend ipconfig with the ability to export IP addresses of NTP servers
      it discovers to /proc/net/ipconfig/ntp_servers. They can be supplied as
      follows:
      
       - If ipconfig is configured manually via the "ip=" or "nfsaddrs="
         kernel command line parameters, one NTP server can be specified in
         the new "<ntp0-ip>" parameter.
       - If ipconfig is autoconfigured via DHCP, request DHCP option 42 in
         the DHCPDISCOVER message, and record the IP addresses of up to three
         NTP servers sent by the responding DHCP server in the subsequent
         DHCPOFFER message.
      
      ipconfig will only write the NTP server IP addresses it discovers to
      /proc/net/ipconfig/ntp_servers, one per line (in the order received from
      the DHCP server, if DHCP autoconfiguration is used); making use of these
      NTP servers is the responsibility of a user space process (e.g. an
      initrd/initram script that invokes an NTP client before mounting an NFS
      root filesystem).
      Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
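Since the commit above leaves the use of these addresses to user space, a consumer might read the file roughly as follows. This is a hedged sketch: `parse_ntp_servers()` and `MAX_NTP_SERVERS` are names invented for the example, and the only facts taken from the commit are the one-address-per-line format and the limit of three servers recorded from DHCP.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NTP_SERVERS 3 /* assumption: mirrors the "up to three" DHCP limit */

/*
 * Read one IPv4 address per line from fp into out[] and return how many
 * valid addresses were stored. Lines that do not parse are skipped.
 */
static int parse_ntp_servers(FILE *fp, struct in_addr out[MAX_NTP_SERVERS])
{
	char line[64];
	int n = 0;

	assert(fp != NULL && out != NULL);

	while (n < MAX_NTP_SERVERS && fgets(line, sizeof(line), fp)) {
		line[strcspn(line, "\r\n")] = '\0'; /* strip the line ending */
		if (inet_pton(AF_INET, line, &out[n]) == 1)
			n++;
	}
	return n;
}
```

An initrd helper would typically open `/proc/net/ipconfig/ntp_servers` with `fopen()`, pass the stream to this function, and hand the resulting addresses to an NTP client before the NFS root is mounted.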
    • ipconfig: Create /proc/net/ipconfig directory · 4d019b3f
      Chris Novakovic committed
      To allow ipconfig to report IP configuration details to user space
      processes without cluttering /proc/net, create a new subdirectory
      /proc/net/ipconfig. All files containing IP configuration details should
      be written to this directory.
      Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipconfig: Correctly initialise ic_nameservers · 300eec7c
      Chris Novakovic committed
      ic_nameservers, which stores the list of name servers discovered by
      ipconfig, is initialised (i.e. has all of its elements set to NONE, or
      0xffffffff) by ic_nameservers_predef() in the following scenarios:
      
       - before the "ip=" and "nfsaddrs=" kernel command line parameters are
         parsed (in ip_auto_config_setup());
       - before autoconfiguring via DHCP or BOOTP (in ic_bootp_init()), in
         order to clear any values that may have been set after parsing "ip="
         or "nfsaddrs=" and are no longer needed.
      
      This means that ic_nameservers_predef() is not called when neither "ip="
      nor "nfsaddrs=" is specified on the kernel command line. In this
      scenario, every element in ic_nameservers remains set to 0x00000000,
      which is indistinguishable from ANY and causes pnp_seq_show() to write
      the following (bogus) information to /proc/net/pnp:
      
        #MANUAL
        nameserver 0.0.0.0
        nameserver 0.0.0.0
        nameserver 0.0.0.0
      
      This is potentially problematic for systems that blindly link
      /etc/resolv.conf to /proc/net/pnp.
      
      Ensure that ic_nameservers is also initialised when neither "ip=" nor
      "nfsaddrs=" is specified by calling ic_nameservers_predef() in
      ip_auto_config(), but only when ip_auto_config_setup() was not called
      earlier. This causes the following to be written to /proc/net/pnp, and
      is consistent with what gets written when ipconfig is configured
      manually but no name servers are specified on the kernel command line:
      
        #MANUAL
      Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
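The bug described in the commit above comes down to a zero-filled array being read as three valid entries. The following user-space sketch shows the effect; `ADDR_NONE`, `ADDR_ANY`, `NAMESERVERS_MAX` and both function names are stand-ins chosen for this example, with only the values 0xffffffff, 0x00000000 and the count of 3 taken from the surrounding commit messages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_NONE 0xFFFFFFFFu /* "no name server in this slot" */
#define ADDR_ANY  0x00000000u /* what a zero-filled slot looks like */
#define NAMESERVERS_MAX 3

/* Mark every slot as unused, as the predef step in the commit does. */
static void nameservers_predef(uint32_t ns[NAMESERVERS_MAX])
{
	assert(ns != NULL);
	for (size_t i = 0; i < NAMESERVERS_MAX; i++)
		ns[i] = ADDR_NONE;
}

/* Count the slots a reporting routine would print as "nameserver ...". */
static int nameservers_reported(const uint32_t ns[NAMESERVERS_MAX])
{
	int n = 0;

	assert(ns != NULL);
	for (size_t i = 0; i < NAMESERVERS_MAX; i++)
		if (ns[i] != ADDR_NONE)
			n++;
	return n;
}
```

A zero-initialised array reports three entries (each equal to `ADDR_ANY`, i.e. 0.0.0.0), while the same array after `nameservers_predef()` reports none.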
    • ipconfig: Document /proc/net/pnp · 8b0b37c5
      Chris Novakovic committed
      Fully document the format used by the /proc/net/pnp file written by
      ipconfig, explain where its values originate from, and clarify that the
      tertiary name server IP and DNS domain name are only written to the file
      when autoconfiguration is used.
      Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipconfig: BOOTP: Request CONF_NAMESERVERS_MAX name servers · de1fa15b
      Chris Novakovic committed
      When ipconfig is autoconfigured via BOOTP, the request packet
      initialised by ic_bootp_init_ext() always allocates 8 bytes for the name
      server option, limiting the BOOTP server to responding with at most 2
      name servers even though ipconfig in fact supports an arbitrary number
      of name servers (as defined by CONF_NAMESERVERS_MAX, which is currently
      3).
      
      Only request name servers in the request packet if CONF_NAMESERVERS_MAX
      is positive (to comply with [1, §3.8]), and allocate enough space in the
      packet for CONF_NAMESERVERS_MAX name servers to indicate the maximum
      number we can accept in response.
      
      [1] RFC 2132, "DHCP Options and BOOTP Vendor Extensions":
          https://tools.ietf.org/rfc/rfc2132.txt
      Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
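The sizing rule in the commit above (4 bytes per requested name server, and no option at all when the maximum is zero) can be sketched as follows. `put_dns_option()` is a hypothetical helper, not a kernel function; the tag value 6 and the multiple-of-4 length follow RFC 2132 §3.8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOTP_TAG_DNS 6 /* "Domain Name Server" option, RFC 2132 section 3.8 */

/*
 * Write a placeholder DNS option sized for max_servers IPv4 addresses.
 * Returns the number of bytes written, or 0 if nothing was written
 * (zero servers requested, or the option would not fit).
 */
static size_t put_dns_option(uint8_t *buf, size_t buflen, unsigned max_servers)
{
	size_t len = 4u * (size_t)max_servers;

	assert(buf != NULL);

	if (max_servers == 0 || len > 255 || buflen < len + 2)
		return 0;

	buf[0] = BOOTP_TAG_DNS;
	buf[1] = (uint8_t)len;   /* option length excludes tag and length bytes */
	memset(buf + 2, 0, len); /* space the server may fill in its reply */
	return len + 2;
}
```

With a maximum of 3 this reserves 12 bytes of payload instead of the previously hardcoded 8.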
    • ipconfig: BOOTP: Don't request IEN-116 name servers · 4e1a8af2
      Chris Novakovic committed
      When ipconfig is autoconfigured via BOOTP, the request packet
      initialised by ic_bootp_init_ext() allocates 8 bytes for tag 5 ("Name
      Server" [1, §3.7]), but tag 5 in the response isn't processed by
      ic_do_bootp_ext(). Instead, allocate the 8 bytes to tag 6 ("Domain Name
      Server" [1, §3.8]), which is processed by ic_do_bootp_ext(), and appears
      to have been the intended tag to request.
      
      This won't cause any breakage for existing users, as tag 5 responses
      provided by BOOTP servers weren't being processed anyway.
      
      [1] RFC 2132, "DHCP Options and BOOTP Vendor Extensions":
          https://tools.ietf.org/rfc/rfc2132.txt
      Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipconfig: Tidy up reporting of name servers · e18bdc83
      Chris Novakovic committed
      Commit 5e953778 ("ipconfig: add
      nameserver IPs to kernel-parameter ip=") adds the IP addresses of
      discovered name servers to the summary printed by ipconfig when
      configuration is complete. It appears the intention in ip_auto_config()
      was to print the name servers on a new line (especially given the
      spacing and lack of comma before "nameserver0="), but they're actually
      printed on the same line as the NFS root filesystem configuration
      summary:
      
        [    0.686186] IP-Config: Complete:
        [    0.686226]      device=eth0, hwaddr=xx:xx:xx:xx:xx:xx, ipaddr=10.0.0.2, mask=255.255.255.0, gw=10.0.0.1
        [    0.686328]      host=test, domain=example.com, nis-domain=(none)
        [    0.686386]      bootserver=10.0.0.1, rootserver=10.0.0.1, rootpath=     nameserver0=10.0.0.1
      
      This makes it harder to read and parse ipconfig's output. Instead, print
      the name servers on a separate line:
      
        [    0.791250] IP-Config: Complete:
        [    0.791289]      device=eth0, hwaddr=xx:xx:xx:xx:xx:xx, ipaddr=10.0.0.2, mask=255.255.255.0, gw=10.0.0.1
        [    0.791407]      host=test, domain=example.com, nis-domain=(none)
        [    0.791475]      bootserver=10.0.0.1, rootserver=10.0.0.1, rootpath=
        [    0.791476]      nameserver0=10.0.0.1
      Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipconfig: Document setting of NIS domain name · 660de409
      Chris Novakovic committed
      ic_do_bootp_ext() is responsible for parsing the "ip=" and "nfsaddrs="
      kernel parameters. If a "." character is found in parameter 4 (the
      client's hostname), everything before the first "." is used as the
      hostname, and everything after it is used as the NIS domain name (but
      not necessarily the DNS domain name).

      Document this behaviour in Documentation/filesystems/nfs/nfsroot.txt,
      as it is not made explicit.
      Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
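The split at the first "." described in the commit above might look like this in user-space C. `split_hostname()` is a hypothetical helper written for illustration; in particular, clearing the NIS domain when no "." is present is an assumption of this sketch, not something the commit states.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Split parameter 4 at the first '.': the part before it becomes the
 * hostname, the part after it becomes the NIS domain name.
 */
static void split_hostname(const char *param,
			   char *host, size_t host_len,
			   char *nis_domain, size_t domain_len)
{
	const char *dot;

	assert(param != NULL && host != NULL && nis_domain != NULL);
	assert(host_len > 0 && domain_len > 0);

	dot = strchr(param, '.');
	if (dot == NULL) {
		snprintf(host, host_len, "%s", param);
		nis_domain[0] = '\0'; /* assumption: no domain part given */
		return;
	}
	snprintf(host, host_len, "%.*s", (int)(dot - param), param);
	snprintf(nis_domain, domain_len, "%s", dot + 1);
}
```

For the value "test.example.com" this yields the hostname "test" and the NIS domain "example.com".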
    • net: ethtool: Add missing kernel doc for FEC parameters · d805c520
      Florian Fainelli committed
      While adding support for ethtool::get_fecparam and set_fecparam, kernel
      doc for these functions was missed; add it.

      Fixes: 1a5f3da2 ("net: ethtool: add support for forward error correction modes")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rhash-cleanups' · 5cb5ce33
      David S. Miller committed
      NeilBrown says:

      ====================
      A few rhashtables cleanups

      2 patches fix documentation
      1 fixes a bit in rhashtable_walk_start()
      1 improves rhashtable_walk stability.

      All reviewed and Acked.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: improve rhashtable_walk stability when stop/start used. · 5d240a89
      NeilBrown committed
      When a walk of an rhashtable is interrupted with rhashtable_walk_stop()
      and then rhashtable_walk_start(), the location to restart from is based
      on a 'skip' count in the current hash chain, and this can be incorrect
      if insertions or deletions have happened.  This does not happen when
      the walk is not stopped and started as iter->p is a placeholder which
      is safe to use while holding the RCU read lock.
      
      In rhashtable_walk_start() we can revalidate that 'p' is still in the
      same hash chain.  If it isn't then the current method is still used.
      
      With this patch, if a rhashtable walker ensures that the current
      object remains in the table over a stop/start period (possibly by
      elevating the reference count if that is sufficient), it can be sure
      that a walk will not miss objects that were in the hashtable for the
      whole time of the walk.
      
      rhashtable_walk_start() may not find the object even though it is
      still in the hashtable if a rehash has moved it to a new table.  In
      this case it will (eventually) get -EAGAIN and will need to proceed
      through the whole table again to be sure to see everything at least
      once.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: reset iter when rhashtable_walk_start sees new table · b41cc04b
      NeilBrown committed
      The documentation claims that when rhashtable_walk_start_check()
      detects a resize event, it will rewind back to the beginning
      of the table.  This is not true.  We need to set ->slot and
      ->skip to be zero for it to be true.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Revise incorrect comment on r{hl, hash}table_walk_enter() · 82266e98
      NeilBrown committed
      Neither rhashtable_walk_enter() nor rhltable_walk_enter() sleeps, though
      they do take a spinlock without irq protection.
      So revise the comments to accurately state the contexts in which
      these functions can be called.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: remove outdated comments about grow_decision etc · 0c6f69a5
      NeilBrown committed
      grow_decision and shrink_decision no longer exist, so remove
      the remaining references to them.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: md5: only call tp->af_specific->md5_lookup() for md5 sockets · 8c2320e8
      Eric Dumazet committed
      RETPOLINE made calls to tp->af_specific->md5_lookup() quite expensive,
      given they have no result.
      We can omit the calls for sockets that have no md5 keys.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • packet: fix bitfield update race · a6361f0c
      Willem de Bruijn committed
      Updates to the bitfields in struct packet_sock are not atomic.
      Serialize these read-modify-write cycles.
      
      Move po->running into a separate variable. Its writes are protected by
      po->bind_lock (except for one startup case at packet_create). Also
      replace a textual precondition warning with lockdep annotation.
      
      All others are set only in packet_setsockopt. Serialize these
      updates by holding the socket lock. Analogous to other field updates,
      also hold the lock when testing whether a ring is active (pg_vec).
      
      Fixes: 8dc41944 ("[PACKET]: Add optional checksum computation for recvmsg")
      Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
      Reported-by: Byoungyoung Lee <byoungyoung@purdue.edu>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ice: Do not check INTEVENT bit for OICR interrupts · 30d84397
      Ben Shelton committed
      According to the hardware spec, checking the INTEVENT bit isn't a
      reliable way to detect if an OICR interrupt has occurred. This is
      because this bit can be cleared by the hardware/firmware before the
      interrupt service routine has run. So instead, just check for OICR
      events every time.

      Fixes: 940b61af ("ice: Initialize PF and setup miscellaneous interrupt")
      Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  3. 24 Apr, 2018 7 commits