1. 22 8月, 2016 1 次提交
  2. 11 8月, 2016 1 次提交
    • A
      net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key · 1625f452
      Alexey Kodanev 提交于
      Running LTP 'icmp-uni-basic.sh -6 -p ipcomp -m tunnel' test over
      openvswitch + veth can trigger kernel panic:
      
        BUG: unable to handle kernel NULL pointer dereference
        at 00000000000000e0 IP: [<ffffffff8169d1d2>] xfrm_input+0x82/0x750
        ...
        [<ffffffff816d472e>] xfrm6_rcv_spi+0x1e/0x20
        [<ffffffffa082c3c2>] xfrm6_tunnel_rcv+0x42/0x50 [xfrm6_tunnel]
        [<ffffffffa082727e>] tunnel6_rcv+0x3e/0x8c [tunnel6]
        [<ffffffff8169f365>] ip6_input_finish+0xd5/0x430
        [<ffffffff8169fc53>] ip6_input+0x33/0x90
        [<ffffffff8169f1d5>] ip6_rcv_finish+0xa5/0xb0
        ...
      
      It seems that tunnel.ip6 can have garbage values and also dereferenced
      without a proper check, only tunnel.ip4 is being verified. Fix it by
      adding one more if block for AF_INET6 and initialize tunnel.ip6 with NULL
      inside xfrm6_rcv_spi() (which is similar to xfrm4_rcv_spi()).
      
      Fixes: 049f8e2e ("xfrm: Override skb->mark with tunnel->parm.i_key in xfrm_input")
      Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      1625f452
  3. 29 7月, 2016 1 次提交
    • T
      xfrm: Ignore socket policies when rebuilding hash tables · 6916fb3b
      Tobias Brunner 提交于
      Whenever thresholds are changed the hash tables are rebuilt.  This is
      done by enumerating all policies and hashing and inserting them into
      the right table according to the thresholds and direction.
      
      Because socket policies are also contained in net->xfrm.policy_all but
      no hash tables are defined for their direction (dir + XFRM_POLICY_MAX)
      this causes a NULL or invalid pointer dereference after returning from
      policy_hash_bysel() if the rebuild is done while any socket policies
      are installed.
      
      Since the rebuild after changing thresholds is scheduled this crash
      could even occur if the userland sets thresholds seemingly before
      installing any socket policies.
      
      Fixes: 53c2e285 ("xfrm: Do not hash socket policies")
      Signed-off-by: NTobias Brunner <tobias@strongswan.org>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      6916fb3b
  4. 27 7月, 2016 2 次提交
  5. 18 7月, 2016 1 次提交
  6. 17 7月, 2016 4 次提交
  7. 16 7月, 2016 10 次提交
  8. 15 7月, 2016 5 次提交
    • G
      i40e: use valid online CPU on q_vector initialization · 7f6c5539
      Guilherme G. Piccoli 提交于
      Currently, the q_vector initialization routine sets the affinity_mask
      of a q_vector based on v_idx value. Meaning a loop iterates on v_idx,
      which is an incremental value, and the cpumask is created based on
      this value.
      
      This is a problem in systems with multiple logical CPUs per core (like in
      SMT scenarios). If we disable some logical CPUs, by turning SMT off for
      example, we will end up with a sparse cpu_online_mask, i.e., only the first
      CPU in a core is online, and incremental filling in q_vector cpumask might
      lead to multiple offline CPUs being assigned to q_vectors.
      
      Example: if we have a system with 8 cores each one containing 8 logical
      CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
      disabled, only the 1st CPU in each core remains online, so the
      cpu_online_mask in this case would have only 8 bits set, in a sparse way.
      
      In general case, when SMT is off the cpu_online_mask has only C bits set:
      0, 1*N, 2*N, ..., C*(N-1)  where
      C == # of cores;
      N == # of logical CPUs per core.
      In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.
      
      This patch changes the way q_vector's affinity_mask is created: it iterates
      on v_idx, but consumes the CPU index from the cpu_online_mask instead of
      just using the v_idx incremental value.
      
      No functional changes were introduced.
      Signed-off-by: NGuilherme G Piccoli <gpiccoli@linux.vnet.ibm.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      7f6c5539
    • P
      ixgbe: napi_poll must return the work done · 4b732cd4
      Paolo Abeni 提交于
      Currently the function ixgbe_poll() returns 0 when it clean completely
      the rx rings, but this foul budget accounting in core code.
      Fix this returning the actual work done, capped to weight - 1, since
      the core doesn't allow to return the full budget when the driver modifies
      the napi status
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NVenkatesh Srinivas <venkateshs@google.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      4b732cd4
    • K
      i40e: enable VSI broadcast promiscuous mode instead of adding broadcast filter · f6bd0962
      Kiran Patil 提交于
      This patch sets VSI broadcast promiscuous mode during VSI add sequence
      and prevents adding MAC filter if specified MAC address is broadcast.
      
      Change-ID: Ia62251fca095bc449d0497fc44bec3a5a0136773
      Signed-off-by: NKiran Patil <kiran.patil@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      f6bd0962
    • A
      i40e/i40evf: Fix i40e_rx_checksum · 858296c8
      Alexander Duyck 提交于
      There are a couple of issues I found in i40e_rx_checksum while doing some
      recent testing.  As a result I have found the Rx checksum logic is pretty
      much broken and returning that the checksum is valid for tunnels in cases
      where it is not.
      
      First the inner types are not the correct values to use to test for if a
      tunnel is present or not.  In addition the inner protocol types are not a
      bitmask as such performing an OR of the values doesn't make sense.  I have
      instead changed the code so that the inner protocol types are used to
      determine if we report CHECKSUM_UNNECESSARY or not.  For anything that does
      not end in UDP, TCP, or SCTP it doesn't make much sense to report a
      checksum offload since it won't contain a checksum anyway.
      
      This leaves us with the need to set the csum_level based on some value.
      For that purpose I am using the tunnel_type field.  If the tunnel type is
      GRENAT or greater then this means we have a GRE or UDP tunnel with an inner
      header.  In the case of GRE or UDP we will have a possible checksum present
      so for this reason it should be safe to set the csum_level to 1 to indicate
      that we are reporting the state of the inner header.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      858296c8
    • B
      bonding: set carrier off for devices created through netlink · 005db31d
      Beniamino Galvani 提交于
      Commit e826eafa ("bonding: Call netif_carrier_off after
      register_netdevice") moved netif_carrier_off() from bond_init() to
      bond_create(), but the latter is called only for initial default
      devices and ones created through sysfs:
      
       $ modprobe bonding
       $ echo +bond1 > /sys/class/net/bonding_masters
       $ ip link add bond2 type bond
       $ grep "MII Status" /proc/net/bonding/*
       /proc/net/bonding/bond0:MII Status: down
       /proc/net/bonding/bond1:MII Status: down
       /proc/net/bonding/bond2:MII Status: up
      
      Ensure that carrier is initially off also for devices created through
      netlink.
      Signed-off-by: NBeniamino Galvani <bgalvani@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      005db31d
  9. 14 7月, 2016 6 次提交
  10. 13 7月, 2016 5 次提交
    • D
      Merge branch 'ethoc-fixes' · ea43f860
      David S. Miller 提交于
      Florian Fainelli says:
      
      ====================
      net: ethoc: Error path and transmit fixes
      
      This patch series contains two patches for the ethoc driver while testing on a
      TS-7300 board where ethoc is provided by an on-board FPGA.
      
      First patch was cooked after chasing crashes with invalid resources passed to
      the driver.
      
      Second patch was cooked after seeing that an interface configured with IP
      192.168.2.2 was sending ARP packets for 192.168.0.0, no wonder why it could not
      work.
      
      I don't have access to any other platform using an ethoc interface so
      it could be good to some testing on Xtensa for instance.
      
      Changes in v3:
      
      - corrected the error path if skb_put_padto() fails, thanks to Max
        for spotting this!
      
      Changes in v2:
      
      - fixed the first commit message
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ea43f860
    • F
      net: ethoc: Correctly pad short packets · ee6c21b9
      Florian Fainelli 提交于
      Even though the hardware can be doing zero padding, we want the SKB to
      be going out on the wire with the appropriate size. This fixes packet
      truncations observed with e.g: ARP packets.
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ee6c21b9
    • F
      net: ethoc: Fix early error paths · 386512d1
      Florian Fainelli 提交于
      In case any operation fails before we can successfully go the point
      where we would register a MDIO bus, we would be going to an error label
      which involves unregistering then freeing this yet to be created MDIO
      bus. Update all error paths to go to label free which is the only one
      valid until either the clock is enabled, or the MDIO bus is allocated
      and registered. This fixes kernel oops observed while trying to
      dereference the MDIO bus structure which is not yet allocated.
      
      Fixes: a1702857 ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.")
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      386512d1
    • N
      net: nps_enet: Fix PCS reset · 136ab0d0
      Noam Camus 提交于
      During commit b54b8c2d
       ("net: ezchip: adapt driver to little endian architecture")
       adapting to little endian architecture,
       zeroing of controller was left out.
      Signed-off-by: NElad Kanfi <eladkan@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      136ab0d0
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 92a03eb0
      David S. Miller 提交于
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter/IPVS fixes for your net tree.
      they are:
      
      1) Fix leak in the error path of nft_expr_init(), from Liping Zhang.
      
      2) Tracing from nf_tables cannot be disabled, also from Zhang.
      
      3) Fix an integer overflow on 32bit archs when setting the number of
         hashtable buckets, from Florian Westphal.
      
      4) Fix configuration of ipvs sync in backup mode with IPv6 address,
         from Quentin Armitage via Simon Horman.
      
      5) Fix incorrect timeout calculation in nft_ct NFT_CT_EXPIRATION,
         from Florian Westphal.
      
      6) Skip clash resolution in conntrack insertion races if NAT is in
         place.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      92a03eb0
  11. 12 7月, 2016 4 次提交
    • P
      netfilter: conntrack: skip clash resolution if nat is in place · 590b52e1
      Pablo Neira Ayuso 提交于
      The clash resolution is not easy to apply if the NAT table is
      registered. Even if no NAT rules are installed, the nul-binding ensures
      that a unique tuple is used, thus, the packet that loses race gets a
      different source port number, as described by:
      
      http://marc.info/?l=netfilter-devel&m=146818011604484&w=2
      
      Clash resolution with NAT is also problematic if addresses/port range
      ports are used since the conntrack that wins race may describe a
      different mangling that we may have earlier applied to the packet via
      nf_nat_setup_info().
      
      Fixes: 71d8c47f ("netfilter: conntrack: introduce clash resolution on insertion race")
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Tested-by: NMarc Dionne <marc.c.dionne@gmail.com>
      590b52e1
    • D
      Merge branch 'tipc-fixes' · ce9a4f31
      David S. Miller 提交于
      Jon Maloy says:
      
      ====================
      tipc: three small fixes
      
      Fixes for some broadcast link problems that may occur in large systems.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ce9a4f31
    • J
      tipc: reset all unicast links when broadcast send link fails · 1fc07f3e
      Jon Paul Maloy 提交于
      In test situations with many nodes and a heavily stressed system we have
      observed that the transmission broadcast link may fail due to an
      excessive number of retransmissions of the same packet. In such
      situations we need to reset all unicast links to all peers, in order to
      reset and re-synchronize the broadcast link.
      
      In this commit, we add a new function tipc_bearer_reset_all() to be used
      in such situations. The function scans across all bearers and resets all
      their pertaining links.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1fc07f3e
    • J
      tipc: ensure correct broadcast send buffer release when peer is lost · a71eb720
      Jon Paul Maloy 提交于
      After a new receiver peer has been added to the broadcast transmission
      link, we allow immediate transmission of new broadcast packets, trusting
      that the new peer will not accept the packets until it has received the
      previously sent unicast broadcast initialiation message. In the same
      way, the sender must not accept any acknowledges until it has itself
      received the broadcast initialization from the peer, as well as
      confirmation of the reception of its own initialization message.
      
      Furthermore, when a receiver peer goes down, the sender has to produce
      the missing acknowledges from the lost peer locally, in order ensure
      correct release of the buffers that were expected to be acknowledged by
      the said peer.
      
      In a highly stressed system we have observed that contact with a peer
      may come up and be lost before the above mentioned broadcast initial-
      ization and confirmation have been received. This leads to the locally
      produced acknowledges being rejected, and the non-acknowledged buffers
      to linger in the broadcast link transmission queue until it fills up
      and the link goes into permanent congestion.
      
      In this commit, we remedy this by temporarily setting the corresponding
      broadcast receive link state to ESTABLISHED and the 'bc_peer_is_up'
      state to true before we issue the local acknowledges. This ensures that
      those acknowledges will always be accepted. The mentioned state values
      are restored immediately afterwards when the link is reset.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a71eb720