1. 17 Dec, 2016 · 16 commits
    • bpf: fix regression on verifier pruning wrt map lookups · a08dd0da
      Daniel Borkmann committed
      Commit 57a09bf0 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
      registers") introduced a regression where existing programs stopped
      loading due to reaching the verifier's maximum complexity limit,
      whereas prior to this commit they were loading just fine; the affected
      program has roughly 2k instructions.
      
      What was found is that state pruning could no longer be performed
      effectively due to mismatches in the verifier's register state, in
      particular in the id tracking. That doesn't mean 57a09bf0 is incorrect
      per se, but rather that the verifier needs to perform a lot more work
      for the same program with regard to the map lookups involved.
      
      Since commit 57a09bf0 is only about tracking registers with type
      PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
      until they are promoted through pattern matching with a NULL check to
      either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
      id becomes irrelevant for the transitioned types.
      
      For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
      but not so for PTR_TO_MAP_VALUE, where the id becomes stale. It's even
      transferred further into other types that don't make use of it. Among
      others, one example is where UNKNOWN_VALUE is set on function call
      return with RET_INTEGER return type.
      
      states_equal() will then fall through the memcmp() on register state;
      note that the second memcmp() uses offsetofend(), so the id has been
      part of that comparison since d2a4dd37 ("bpf: fix state equivalence").
      But the bisect already pointed to 57a09bf0, where we really exceed the
      complexity limit. What I found was that states_equal() often failed in
      this case due to id mismatches in spilled regs holding registers of type
      PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
      a memcmp() on their reg state and don't have any other optimizations
      in place, so the id was also relevant in this case for making a
      pruning decision.
      
      We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
      For the affected program, this resulted in a ~17-fold reduction in
      complexity and let the program load fine again. The selftest suite also
      runs fine. The only other place where env->id_gen is currently used is
      direct packet access, but in those cases the id is long-lived, so that
      is a different scenario.
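
The pruning problem described above can be illustrated with a toy model. This is only a sketch in Python, not kernel code: `RegState`, `spilled_regs_equal()` and `reset_id()` are hypothetical names, and the dataclass equality check merely stands in for the memcmp() on spilled register state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RegState:
    """Toy stand-in for the verifier's per-register state (hypothetical)."""
    type: str
    id: int = 0


def spilled_regs_equal(old: RegState, cur: RegState) -> bool:
    # Models the memcmp() on spilled registers: every field, id included,
    # has to match for the two states to count as equivalent.
    return old == cur


def reset_id(reg: RegState) -> RegState:
    # Models the fix: after the NULL check has been matched the id is no
    # longer needed, so it is cleared on the converted register.
    return replace(reg, id=0)


# The same pointer type reached via two paths that drew different ids.
old = RegState("PTR_TO_MAP_VALUE", id=3)
cur = RegState("PTR_TO_MAP_VALUE", id=7)

stale_match = spilled_regs_equal(old, cur)                      # ids differ
reset_match = spilled_regs_equal(reset_id(old), reset_id(cur))  # ids cleared
print(stale_match, reset_match)
```

With stale ids the two states never compare equal, so the path cannot be pruned; once the ids are cleared they do.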
      
      Also, the current logic in mark_map_regs() is not fully correct when
      marking the NULL branch with UNKNOWN_VALUE. We need to cache the
      destination reg's id in any case. Otherwise, once we have marked that reg
      as UNKNOWN_VALUE, its id is reset and any subsequent registers that hold
      the original id and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked
      UNKNOWN_VALUE anymore, since mark_map_reg() reuses the uncached
      regs[regno].id that was just overridden. Note, we don't need to cache it
      outside of mark_map_regs(), since it's called once on this_branch and
      once on other_branch, which are two independent verifier states.
      A test case for this is added here, too.
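
The caching issue can be reproduced in a small simulation. The sketch below is a simplified Python model of the NULL branch only; the `Reg` class and the `_uncached`/`_cached` function names are assumptions made for illustration and do not mirror the kernel's actual signatures.

```python
from dataclasses import dataclass

OR_NULL = "PTR_TO_MAP_VALUE_OR_NULL"
UNKNOWN = "UNKNOWN_VALUE"


@dataclass
class Reg:
    type: str
    id: int = 0


def mark_map_reg(regs, i, target_id):
    # Mark one register as UNKNOWN_VALUE if it carries the tracked id.
    reg = regs[i]
    if reg.type == OR_NULL and reg.id == target_id:
        reg.type = UNKNOWN
        reg.id = 0  # the id is reset together with the type


def mark_map_regs_uncached(regs, regno):
    # Models the bug: regs[regno].id is re-read on every iteration, so it
    # is already 0 once regs[regno] itself has been marked.
    for i in range(len(regs)):
        mark_map_reg(regs, i, regs[regno].id)


def mark_map_regs_cached(regs, regno):
    # Models the fix: remember the id before any register is modified.
    target_id = regs[regno].id
    for i in range(len(regs)):
        mark_map_reg(regs, i, target_id)


def types_after(mark_fn):
    # Two registers share id 5; register 0 is the one that was NULL-checked.
    regs = [Reg(OR_NULL, id=5), Reg(OR_NULL, id=5)]
    mark_fn(regs, 0)
    return [r.type for r in regs]


print(types_after(mark_map_regs_uncached))
print(types_after(mark_map_regs_cached))
```

In the uncached variant the second register keeps its PTR_TO_MAP_VALUE_OR_NULL type, which is the missed marking the message describes.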
      
      Fixes: 57a09bf0 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: vrf: Drop conntrack data after pass through VRF device on Tx · eb63ecc1
      David Ahern committed
      Locally originated traffic in a VRF fails in the presence of a POSTROUTING
      rule. For example,
      
          $ iptables -t nat -A POSTROUTING -s 11.1.1.0/24  -j MASQUERADE
          $ ping -I red -c1 11.1.1.3
          ping: Warning: source address might be selected on device other than red.
          PING 11.1.1.3 (11.1.1.3) from 11.1.1.2 red: 56(84) bytes of data.
          ping: sendmsg: Operation not permitted
      
      Worse, the above causes random corruption resulting in a panic in random
      places (I have not seen a consistent backtrace).
      
      Call nf_reset to drop the conntrack info following the pass through the
      VRF device. The nf_reset is needed on Tx but not Rx because of the order
      in which NF_HOOKs are hit: on Rx the VRF device is after the real ingress
      device and on Tx it is before the real egress device. Connection
      tracking should be tied to the real egress device and not the VRF device.
      
      Fixes: 8f58336d ("net: Add ethernet header for pass through VRF device")
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: vrf: Fix NAT within a VRF · a0f37efa
      David Ahern committed
      Connection tracking with VRF is broken because the pass through the VRF
      device drops the connection tracking info. Removing the call to nf_reset
      allows DNAT and MASQUERADE to work across interfaces within a VRF.
      
      Fixes: 73e20b76 ("net: vrf: Add support for PREROUTING rules on vrf device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'cls_flower-mask' · 8a9f5fdf
      David S. Miller committed
      Paul Blakey says:
      
      ====================
      net/sched: cls_flower: Fix mask handling
      
      This series fixes how the mask is handled.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: cls_flower: Use masked key when calling HW offloads · f93bd17b
      Paul Blakey committed
      Zero bits in the mask signify "don't care" for the corresponding bits
      in the key. Some hardware requires those bits in the key to be zero.
      Since these bits are masked anyway, it's okay to provide the masked key
      to all drivers.
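
As a rough illustration of what a masked key means here, the following Python sketch ANDs a key with its mask byte by byte. `masked_key()` is a hypothetical helper, not the driver interface, and the IPv4 address and /24 mask are made-up example values.

```python
def masked_key(key: bytes, mask: bytes) -> bytes:
    """Clear every key bit whose mask bit is zero ("don't care")."""
    if len(key) != len(mask):
        raise ValueError("key and mask must have the same length")
    return bytes(k & m for k, m in zip(key, mask))


# Example values only: match on 192.168.1.0/24 while the key as supplied
# still carries host bits that the mask says to ignore.
key = bytes([192, 168, 1, 77])
mask = bytes([255, 255, 255, 0])

print(list(masked_key(key, mask)))
```

Because the "don't care" bits are ignored during matching anyway, passing the masked form should not change which packets match.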
      
      Fixes: 5b33f488 ('net/flower: Introduce hardware offload support')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: cls_flower: Use mask for addr_type · 970bfcd0
      Paul Blakey committed
      When addr_type is set, mask should also be set.
      
      Fixes: 66530bdf ('sched,cls_flower: set key address type when present')
      Fixes: bc3103f1 ('net/sched: cls_flower: Classify packet in ip tunnels')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: Added PCI wrapper for Platform Driver. · 83a77e9e
      Bartosz Folta committed
      There are hardware PCI implementations of the Cadence GEM network
      controller. This patch allows such hardware to be used by reusing the
      existing platform driver.

      Signed-off-by: Bartosz Folta <bfolta@cadence.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmveth: calculate gso_segs for large packets · 94acf164
      Thomas Falcon committed
      Include calculations to compute the number of segments
      that comprise an aggregated large packet.
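
The commit message does not spell out the formula, so the sketch below rests on an assumption: that the segment count is the payload length (total length minus headers) divided by the segment size, rounded up. `gso_segs()` and its parameter names are hypothetical, and the lengths are example values.

```python
def gso_segs(total_len: int, hdr_len: int, mss: int) -> int:
    """Number of MSS-sized segments in an aggregated packet (assumed formula)."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    if total_len < hdr_len:
        raise ValueError("packet is shorter than its headers")
    payload = total_len - hdr_len
    # Integer ceiling division: a partial trailing segment still counts.
    return -(-payload // mss)


# Example: 54 bytes of Ethernet + IPv4 + TCP headers, 1460-byte segments.
print(gso_segs(14654, 54, 1460))
print(gso_segs(14655, 54, 1460))
```

A single extra payload byte spills into one more segment, which is why the division rounds up rather than down.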

      Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qcom/emac: don't try to claim clocks on ACPI systems · 026acd5f
      Timur Tabi committed
      On ACPI systems, clocks are not available to drivers directly.  They are
      handled exclusively by ACPI and/or firmware, so there is no clock driver.
      Calls to clk_get() always fail, so we should not even attempt to claim
      any clocks on ACPI systems.

      Signed-off-by: Timur Tabi <timur@codeaurora.org>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • encx24j600: bugfix - always move ERXTAIL to next packet in encx24j600_rx_packets · ebe5236d
      Jeroen De Wachter committed
      Before, encx24j600_rx_packets did not update encx24j600_priv's next_packet
      member when an error occurred during packet handling (either because the
      packet's RSV header indicates an error or because the encx24j600_receive_packet
      method can't allocate an sk_buff).
      
      If the next_packet member is not updated, the ERXTAIL register will be set to
      the same value it had before, which means the bad packet remains in the
      component's memory and its RSV header will be read again when a new packet
      arrives. If the RSV header indicates a bad packet or if sk_buff allocation
      continues to fail, new packets will be stored in the component's memory until
      that memory is full, after which packets will be dropped.
      
      The SETPKTDEC command is always executed though, so the encx24j600 hardware has
      an incorrect count of the packets in its memory.
      
      To prevent this, the next_packet member should always be updated, allowing the
      packet to be skipped (either because it's bad, as indicated in its RSV header,
      or because allocating an sk_buff failed). In the allocation failure case, this
      does mean dropping a valid packet, but dropping the oldest packet to keep as
      much memory as possible available for new packets seems preferable to keeping
      old (but valid) packets around while dropping new ones.
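
The effect of always advancing next_packet can be seen in a toy receive loop. This is a simplified Python model, not the driver: the buffer layout, the `rx_packets()` signature and the `always_advance` flag are assumptions made for illustration.

```python
def rx_packets(buffer, start, count, always_advance):
    """Process `count` packets from a toy receive buffer.

    `buffer` maps a packet address to (next_packet_address, is_good).
    Returns the delivered packet addresses and the final next_packet value.
    """
    next_packet = start
    delivered = []
    for _ in range(count):
        next_addr, is_good = buffer[next_packet]
        if is_good:
            delivered.append(next_packet)
        # Old behaviour: advance only after a good packet.
        # New behaviour: advance in every case so a bad packet is skipped.
        if is_good or always_advance:
            next_packet = next_addr
    return delivered, next_packet


# Three packets in memory; the one at address 100 has an error in its header.
buffer = {0: (100, True), 100: (200, False), 200: (300, True)}

print(rx_packets(buffer, 0, 3, always_advance=False))
print(rx_packets(buffer, 0, 3, always_advance=True))
```

Without the unconditional advance the loop keeps re-reading the bad header at address 100 and the good packet behind it is never reached.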

      Signed-off-by: Jeroen De Wachter <jeroen.de_wachter.ext@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hisilicon-netdev-dev' · ea7a2b9a
      David S. Miller committed
      Dongpo Li says:
      
      ====================
      net: ethernet: hisilicon: set dev->dev.parent before PHY connect
      
      This patch series builds atop:
      ec988ad7 ("phy: Don't increment MDIO bus refcount unless it's a different owner")

      I have checked all the HiSilicon Ethernet drivers and found that only two
      need to be fixed to make sure dev->dev.parent is set before PHY connect.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: hip04: Call SET_NETDEV_DEV() · 8cd1f70f
      Dongpo Li committed
      The hip04 driver calls into PHYLIB which now checks for
      net_device->dev.parent, so make sure we do set it before calling into
      any MDIO/PHYLIB related function.
      
      Fixes: ec988ad7 ("phy: Don't increment MDIO bus refcount unless it's a different owner")
      Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: hisi_femac: Call SET_NETDEV_DEV() · 2087d421
      Dongpo Li committed
      The hisi_femac driver calls into PHYLIB which now checks for
      net_device->dev.parent, so make sure we do set it before calling into
      any MDIO/PHYLIB related function.
      
      Fixes: ec988ad7 ("phy: Don't increment MDIO bus refcount unless it's a different owner")
      Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Fix opps when adding vlan bridge · 66e2809d
      Andrew Lunn committed
      A port is not necessarily assigned to a netdev. And a port does not
      need to be a member of a bridge. So when iterating over all ports,
      check before using the netdev and bridge_dev for a port. Otherwise we
      dereference a NULL pointer.
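
The guard pattern described here can be sketched as follows. This is a toy Python model in which `None` plays the role of a NULL pointer; the `Port` and `NetDev` classes and the `conflicting_ports()` helper are hypothetical and only loosely modelled on the check the message describes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NetDev:
    name: str


@dataclass
class Port:
    netdev: Optional[NetDev] = None      # a port may have no netdev
    bridge_dev: Optional[NetDev] = None  # a port may not be in a bridge


def conflicting_ports(ports: List[Port], my_bridge: NetDev) -> List[str]:
    """Names of ports that sit in a bridge other than `my_bridge`."""
    conflicts = []
    for port in ports:
        if port.netdev is None:
            continue  # unused port: nothing to inspect
        if port.bridge_dev is None:
            continue  # not a bridge member: cannot conflict
        if port.bridge_dev.name != my_bridge.name:
            conflicts.append(port.netdev.name)
    return conflicts


ports = [
    Port(),                                  # no netdev at all
    Port(NetDev("lan1")),                    # netdev, but not bridged
    Port(NetDev("lan2"), NetDev("br1")),     # member of another bridge
    Port(NetDev("lan3"), NetDev("br0")),     # member of our bridge
]
print(conflicting_ports(ports, NetDev("br0")))
```

Dropping either `continue` guard would make the loop touch an attribute of `None` for the first two ports, the Python analogue of the NULL dereference.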
      
      Fixes: da9c359e ("net: dsa: mv88e6xxx: check hardware VLAN in use")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/3com/3c515: Fix timer handling, prevent leaks and crashes · e28ceeb1
      Thomas Gleixner committed
      The timer handling in this driver is broken in several ways:
      
      - corkscrew_open() initializes and arms a timer before requesting the
        device interrupt. If the request fails the timer stays armed.
      
        A second call to corkscrew_open() will unconditionally reinitialize the
        queued timer and arm it again. Also, an immediate device removal will
        leave the timer queued because close() is not called (open() failed)
        and therefore nothing issues del_timer().
      
        The reinitialization corrupts the link chain in the timer wheel hash
        bucket and causes a NULL pointer dereference when the timer wheel tries
        to operate on that hash bucket. Immediate device removal lets the link
        chain poke into freed and possibly reused memory.
      
        Solution: Arm the timer after the successful irq request.
      
      - corkscrew_close() uses del_timer()
      
        On close the timer is disarmed with del_timer() which lets the following
        code race against a concurrent timer expiry function.
      
        Solution: Use del_timer_sync() instead
      
      - corkscrew_close() calls del_timer() unconditionally
      
        del_timer() is invoked even if the timer was never initialized. This
        works by chance because the struct containing the timer is zeroed at
        allocation time.
      
        Solution: Move the setup of the timer into corkscrew_setup().
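
The ordering problem in the first item can be shown with a very small model. This Python sketch only tracks whether a timer is left armed after a failed open; `Device`, `open_timer_first()` and `open_irq_first()` are hypothetical names and nothing here reflects the real timer API.

```python
class Device:
    """Toy device that only records whether its timer is armed."""

    def __init__(self):
        self.timer_armed = False


def open_timer_first(dev, request_irq):
    # Old ordering: the timer is armed before the interrupt is requested.
    dev.timer_armed = True
    if not request_irq():
        return False  # open fails, but the timer stays armed
    return True


def open_irq_first(dev, request_irq):
    # Fixed ordering: arm the timer only after the irq request succeeded.
    if not request_irq():
        return False
    dev.timer_armed = True
    return True


def irq_unavailable():
    # Stand-in for an interrupt request that fails.
    return False


old_dev, new_dev = Device(), Device()
open_timer_first(old_dev, irq_unavailable)
open_irq_first(new_dev, irq_unavailable)
print(old_dev.timer_armed, new_dev.timer_armed)
```

Since close() is never called after a failed open, the armed timer in the old ordering has nothing left to disarm it.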

      Reported-by: Matthew Whitehead <tedheadster@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 Dec, 2016 · 1 commit
  3. 13 Dec, 2016 · 7 commits
  4. 12 Dec, 2016 · 16 commits