1. 18 December 2016, 28 commits
    • net/x25: use designated initializers · e999cb43
      Kees Cook committed
      Prepare to mark sensitive kernel structures for randomization by making
      sure they're using designated initializers. These were identified during
      allyesconfig builds of x86, arm, and arm64, with most initializer fixes
      extracted from grsecurity.
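      As a generic illustration of the change (my own sketch; the struct and
      values below are made up, not from the x25 code), positional initializers
      depend silently on field order, while designated initializers name each
      field and so keep working if the structure layout is randomized:

      /* Hypothetical struct and values, for illustration only. */
      #include <stdio.h>

      struct ops_example {
              int family;
              int (*bind)(int fd);
              int (*connect)(int fd);
      };

      static int ex_bind(int fd)    { return fd; }
      static int ex_connect(int fd) { return fd; }

      /* Positional: breaks if the field order ever changes. */
      static struct ops_example old_style = { 9, ex_bind, ex_connect };

      /* Designated: values are tied to field names, so the layout may change freely. */
      static struct ops_example new_style = {
              .family  = 9,
              .bind    = ex_bind,
              .connect = ex_connect,
      };

      int main(void)
      {
              printf("%d %d\n", old_style.family, new_style.family);
              return 0;
      }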
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e999cb43
    • isdn: use designated initializers · ebf12f13
      Kees Cook committed
      Prepare to mark sensitive kernel structures for randomization by making
      sure they're using designated initializers. These were identified during
      allyesconfig builds of x86, arm, and arm64, with most initializer fixes
      extracted from grsecurity.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ebf12f13
    • bna: use designated initializers · 9751362a
      Kees Cook committed
      Prepare to mark sensitive kernel structures for randomization by making
      sure they're using designated initializers. These were identified during
      allyesconfig builds of x86, arm, and arm64, with most initializer fixes
      extracted from grsecurity.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9751362a
    • WAN: use designated initializers · aabd7ad9
      Kees Cook committed
      Prepare to mark sensitive kernel structures for randomization by making
      sure they're using designated initializers. These were identified during
      allyesconfig builds of x86, arm, and arm64, with most initializer fixes
      extracted from grsecurity.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aabd7ad9
    • net: use designated initializers · 9d1c0ca5
      Kees Cook committed
      Prepare to mark sensitive kernel structures for randomization by making
      sure they're using designated initializers. These were identified during
      allyesconfig builds of x86, arm, and arm64, with most initializer fixes
      extracted from grsecurity.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9d1c0ca5
    • ATM: use designated initializers · 99a5e178
      Kees Cook committed
      Prepare to mark sensitive kernel structures for randomization by making
      sure they're using designated initializers. These were identified during
      allyesconfig builds of x86, arm, and arm64, with most initializer fixes
      extracted from grsecurity.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      99a5e178
    • isdn/gigaset: use designated initializers · 47941950
      Kees Cook committed
      Prepare to mark sensitive kernel structures for randomization by making
      sure they're using designated initializers. These were identified during
      allyesconfig builds of x86, arm, and arm64, with most initializer fixes
      extracted from grsecurity.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      47941950
    • Merge branch 'virtio_net-XDP' · cd333e37
      David S. Miller committed
      John Fastabend says:
      
      ====================
      XDP for virtio_net
      
      This implements XDP for virtio_net in the mergeable buffers and big_packet
      modes. I tested this with vhost_net running on qemu and did not see
      any issues. For testing num_buf > 1 I added a hack to the vhost driver
      to put only 100 bytes per buffer.
      
      There are some restrictions for XDP to be enabled and work well
      (see patch 3 for more details):
      
        1. GUEST_TSO{4|6} must be off
        2. MTU must be less than PAGE_SIZE
        3. queues must be available to dedicate to XDP
        4. num_bufs received in mergeable buffers must be 1
        5. big_packet mode must have all data on single page
      
      To test this I used pktgen in the hypervisor and ran the XDP sample
      programs xdp1 and xdp2 from ./samples/bpf in the host. The default
      mode that is used with these patches with Linux guest and QEMU/Linux
      hypervisor is the mergeable buffers mode. I tested this mode for 2+
      days running xdp2 without issues. Additionally I did a series of
      driver unload/load tests to check the allocate/release paths.
      
      To test the big_packets path I applied the following simple patch against
      the virtio driver forcing big_packets mode,
      
      --- a/drivers/net/virtio_net.c
      +++ b/drivers/net/virtio_net.c
      @@ -2242,7 +2242,7 @@ static int virtnet_probe(struct virtio_device *vdev)
                      vi->big_packets = true;
      
              if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
      -               vi->mergeable_rx_bufs = true;
      +               vi->mergeable_rx_bufs = false;
      
              if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
                  virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
      
      I then repeated the tests with xdp1 and xdp2. After letting them run
      for a few hours I called it good enough.
      
      Testing the unexpected case where virtio receives a packet across
      multiple buffers required patching the hypervisor vhost driver to
      convince it to send these unexpected packets. Then I used ping with
      the -s option to trigger the case with multiple buffers. This mode
      is not expected to be used but as MST pointed out per spec it is
      not strictly speaking illegal to generate multi-buffer packets so we
      need some way to handle these. The following patch can be used to
      generate multiple buffers,
      
      --- a/drivers/vhost/vhost.c
      +++ b/drivers/vhost/vhost.c
      @@ -1777,7 +1777,8 @@ static int translate_desc(struct vhost_virtqueue *vq, u64
      
                      _iov = iov + ret;
                      size = node->size - addr + node->start;
      -               _iov->iov_len = min((u64)len - s, size);
      +               printk("%s: build 100 length headers!\n", __func__);
      +               _iov->iov_len = min((u64)len - s, (u64)100);//size);
                      _iov->iov_base = (void __user *)(unsigned long)
                              (node->userspace_addr + addr - node->start);
                      s += size;
      
      The qemu command I most frequently used for testing (although I did test
      various other combinations of devices) is the following,
      
       ./x86_64-softmmu/qemu-system-x86_64              \
          -hda /var/lib/libvirt/images/Fedora-test0.img \
          -m 4096  -enable-kvm -smp 2                   \
          -netdev tap,id=hn0,queues=4,vhost=on          \
          -device virtio-net-pci,netdev=hn0,mq=on,vectors=9,guest_tso4=off,guest_tso6=off \
          -serial stdio
      
      The options 'guest_tso4=off,guest_tso6=off' are required because we
      do not support LRO with XDP at the moment.
      
      Please review; any comments/feedback welcome as always.
      ====================
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cd333e37
    • virtio_net: xdp, add slowpath case for non contiguous buffers · 72979a6c
      John Fastabend committed
      virtio_net XDP support expects receive buffers to be contiguous.
      If this is not the case we enable a slowpath to allow connectivity
      to continue, but at a significant performance overhead associated with
      linearizing the data. To make users painfully aware that XDP is
      running in a degraded mode we throw an xdp buffer error.
      
      To linearize packets we allocate a page and copy the segments of
      the data, including the header, into it. After this the page can be
      handled by XDP code flow as normal.
      
      Then depending on the return code the page is either freed or sent
      to the XDP xmit path. There is no attempt to optimize this path.
      
      This case is being handled simply as a precaution in case some
      unknown backend were to generate packets in this form. To test this
      I had to hack qemu and force it to generate these packets. I do not
      expect this case to be generated by "real" backends.
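      A rough user-space sketch of the linearization idea (the types and names
      below are mine, not the driver's): copy every scattered segment into one
      freshly allocated page-sized buffer so the rest of the XDP path can treat
      the packet as contiguous.

      #include <stdlib.h>
      #include <string.h>

      #define PAGE_SIZE_EX 4096

      struct seg { const void *data; size_t len; };

      /* Returns a contiguous copy of all segments, or NULL if they don't fit. */
      static void *linearize(const struct seg *segs, int n, size_t *out_len)
      {
              unsigned char *page = malloc(PAGE_SIZE_EX);
              size_t off = 0;

              if (!page)
                      return NULL;
              for (int i = 0; i < n; i++) {
                      if (off + segs[i].len > PAGE_SIZE_EX) {
                              free(page);        /* too big even for the slowpath */
                              return NULL;
                      }
                      memcpy(page + off, segs[i].data, segs[i].len);
                      off += segs[i].len;
              }
              *out_len = off;
              return page;
      }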
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      72979a6c
    • virtio_net: add XDP_TX support · 56434a01
      John Fastabend committed
      This adds support for the XDP_TX action to virtio_net. When an XDP
      program is run and returns the XDP_TX action the virtio_net XDP
      implementation will transmit the packet on a TX queue that aligns
      with the current CPU that the XDP packet was processed on.
      
      Before sending the packet the header is zeroed.  Also XDP is expected
      to handle checksum correctly so no checksum offload  support is
      provided.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56434a01
    • virtio_net: add dedicated XDP transmit queues · 672aafd5
      John Fastabend committed
      XDP requires using isolated transmit queues to avoid interference
      with the normal networking stack (BQL, NETDEV_TX_BUSY, etc). This patch
      adds an XDP queue per cpu when an XDP program is loaded and does not
      expose the queues to the OS via the normal API call to
      netif_set_real_num_tx_queues(). This way the stack will never push
      an skb to these queues.
      
      However, the virtio/vhost/qemu implementation only allows for creating
      TX/RX queue pairs at this time, so creating only TX queues was not
      possible. And because the associated RX queues are being created I
      went ahead and exposed these to the stack and let the backend use
      them. This creates more RX queues visible to the network stack than
      TX queues which is worth mentioning but does not cause any issues as
      far as I can tell.
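      A back-of-the-envelope sketch of the queue accounting described above (the
      numbers are purely illustrative): extra per-CPU TX queues are allocated for
      XDP, but only the original count is reported to the stack.

      #include <stdio.h>

      int main(void)
      {
              int curr_queue_pairs = 4;   /* pairs used by the normal stack  */
              int ncpus = 8;              /* one extra XDP TX queue per CPU  */

              int total_tx_alloc = curr_queue_pairs + ncpus;  /* allocated */
              int real_num_tx    = curr_queue_pairs;  /* count exposed via netif_set_real_num_tx_queues() */

              printf("allocated %d TX queues, stack sees %d\n",
                     total_tx_alloc, real_num_tx);
              return 0;
      }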
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      672aafd5
    • virtio_net: Add XDP support · f600b690
      John Fastabend committed
      This adds XDP support to virtio_net. Some requirements must be
      met for XDP to be enabled depending on the mode. First it will
      only be supported with LRO disabled so that data is not pushed
      across multiple buffers. Second the MTU must be less than a page
      size to avoid having to handle XDP across multiple pages.
      
      If mergeable receive is enabled this patch only supports the case
      where header and data are in the same buf which we can check when
      a packet is received by looking at num_buf. If num_buf is
      greater than 1 and an XDP program is loaded the packet is dropped
      and a warning is thrown. When any_header_sg is set this does not
      happen and both header and data are put in a single buffer as expected,
      so we check this when XDP programs are loaded. Subsequent patches
      will process the packet in a degraded mode to ensure connectivity
      and correctness are not lost even if the backend pushes packets into
      multiple buffers.
      
      If big packets mode is enabled and MTU/LRO conditions above are
      met then XDP is allowed.
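      A minimal sketch of the kind of precondition check described above (the
      struct and field names are mine, not virtio_net's):

      #include <stdbool.h>
      #include <stdio.h>

      struct vnet_caps {
              bool lro_enabled;      /* GUEST_TSO4/6 style offloads        */
              bool any_header_sg;    /* header and data in a single buffer */
              unsigned int mtu;
              unsigned int page_size;
      };

      static bool xdp_allowed(const struct vnet_caps *c)
      {
              if (c->lro_enabled)
                      return false;               /* data may span buffers     */
              if (c->mtu >= c->page_size)
                      return false;               /* packet may span pages     */
              if (!c->any_header_sg)
                      return false;               /* header not in same buffer */
              return true;
      }

      int main(void)
      {
              struct vnet_caps c = {
                      .lro_enabled = false,
                      .any_header_sg = true,
                      .mtu = 1500,
                      .page_size = 4096,
              };

              printf("XDP allowed: %d\n", xdp_allowed(&c));
              return 0;
      }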
      
      This patch was tested with qemu with vhost=on and vhost=off where
      mergeable and big_packet modes were forced via hard coding feature
      negotiation. Multiple buffers per packet were forced via a small
      test patch to vhost.c in the vhost=on qemu mode.
      Suggested-by: Shrijeet Mukherjee <shrijeet@gmail.com>
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f600b690
    • net: xdp: add invalid buffer warning · f23bc46c
      John Fastabend committed
      This adds a warning for drivers to use when encountering an invalid
      buffer for XDP. For normal cases this should not happen, but having a
      standard warning is useful for catching unexpected behavior from the
      emulation layer in virtual/qemu setups.
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f23bc46c
    • sctp: sctp_transport_lookup_process should rcu_read_unlock when transport is null · 08abb795
      Xin Long committed
      Prior to this patch, sctp_transport_lookup_process didn't call rcu_read_unlock
      when it failed to find a transport via sctp_addrs_lookup_transport.
      
      This patch fixes it by moving rcu_read_unlock up to right before the
      transport check, and also removes the out path.
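      A schematic sketch of the fixed control flow, with stubbed helpers rather
      than the real SCTP code: the unlock now happens right after the lookup on
      both paths, so the early return no longer leaves the RCU read side held.

      #include <stdio.h>

      static void rcu_read_lock_ex(void)   { puts("lock");   }  /* stand-ins for */
      static void rcu_read_unlock_ex(void) { puts("unlock"); }  /* the RCU calls */
      static void *lookup_transport_ex(void) { return NULL; }   /* may fail      */

      /* After the fix: unlock on both the found and the not-found path,
       * and the old "out:" label is gone. */
      static int lookup_process_ex(void)
      {
              void *t;

              rcu_read_lock_ex();
              t = lookup_transport_ex();
              rcu_read_unlock_ex();
              if (!t)
                      return -2;   /* previously returned here while still locked */
              return 0;
      }

      int main(void) { return lookup_process_ex() == -2 ? 0 : 1; }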
      
      Fixes: 1cceda78 ("sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      08abb795
    • sctp: sctp_epaddr_lookup_transport should be protected by rcu_read_lock · 5cb2cd68
      Xin Long committed
      Since commit 7fda702f ("sctp: use new rhlist interface on sctp transport
      rhashtable"), sctp has changed to use rhlist_lookup to look up transport, but
      rhlist_lookup doesn't call rcu_read_lock inside, unlike rhashtable_lookup_fast.
      
      It is called in sctp_epaddr_lookup_transport and sctp_addrs_lookup_transport.
      sctp_addrs_lookup_transport is always under the protection of rcu_read_lock(),
      as __sctp_lookup_association is called in the rx path or from
      sctp_lookup_association, both of which are under rcu_read_lock() already.
      
      But sctp_epaddr_lookup_transport is called by sctp_endpoint_lookup_assoc, which
      doesn't call rcu_read_lock, and this may cause a "suspicious rcu_dereference_check
      usage" warning in __rhashtable_lookup.
      
      This patch is to fix it by adding rcu_read_lock in sctp_endpoint_lookup_assoc
      before calling sctp_epaddr_lookup_transport.
      
      Fixes: 7fda702f ("sctp: use new rhlist interface on sctp transport rhashtable")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5cb2cd68
    • Merge branch 'dpaa_eth-fixes' · 10a3ecf4
      David S. Miller committed
      Madalin Bucur says:
      
      ====================
      dpaa_eth: a couple of fixes
      
      This patch set introduces big endian accessors in the dpaa_eth driver
      making sure accesses to the QBMan HW are correct on little endian
      platforms. Removing a redundant Kconfig dependency on FSL_SOC.
      Adding myself as maintainer of the dpaa_eth driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      10a3ecf4
    • MAINTAINERS: net: add entry for Freescale QorIQ DPAA Ethernet driver · 63f4b4b0
      Madalin Bucur committed
      Add a record for the Freescale QorIQ DPAA Ethernet driver, adding myself as
      maintainer.
      Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      63f4b4b0
    • 708f0f4f
    • dpaa_eth: use big endian accessors · 7d6f8dc0
      Claudiu Manoil committed
      Ensure correct access to the big endian QMan HW through proper
      accessors.
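      For illustration (generic C, not the dpaa_eth code): on a little endian CPU
      a 32-bit field written by the hardware in big endian format has to go
      through an explicit conversion, which is what kernel accessors such as
      be32_to_cpu() take care of.

      #include <stdint.h>
      #include <stdio.h>

      /* Convert a 32-bit value from big endian (as stored by the HW) to host order. */
      static uint32_t be32_to_host(const uint8_t *p)
      {
              return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                     ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
      }

      int main(void)
      {
              uint8_t hw_reg[4] = { 0x12, 0x34, 0x56, 0x78 };  /* big endian 0x12345678 */

              printf("0x%08x\n", be32_to_host(hw_reg));        /* correct on any host   */
              return 0;
      }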
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d6f8dc0
    • irda: irnet: add member name to the miscdevice declaration · 616f6b40
      LABBE Corentin committed
      Since struct miscdevice has many members, it is dangerous to initialize
      it without member names, relying only on member order.
      
      This patch adds member names to the init declaration.
      Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      616f6b40
    • irda: irnet: Remove unused IRNET_MAJOR define · 33de4d1b
      LABBE Corentin committed
      The IRNET_MAJOR define is not used, so this patch removes it.
      Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33de4d1b
    • irnet: ppp: move IRNET_MINOR to include/linux/miscdevice.h · 24c946cc
      LABBE Corentin committed
      This patch moves the define for IRNET_MINOR to include/linux/miscdevice.h.
      It is better that all minor number definitions are in the same place.
      Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      24c946cc
    • irda: irnet: Move linux/miscdevice.h include · e2928235
      LABBE Corentin committed
      The only use of miscdevice is irda_ppp, so there is no need to include
      linux/miscdevice.h in all irda files.
      This patch moves the linux/miscdevice.h include to irnet_ppp.h.
      Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e2928235
    • irda: irproc.c: Remove unneeded linux/miscdevice.h include · 078497a4
      LABBE Corentin committed
      irproc.c does not use any miscdevice, so this patch removes this
      unnecessary inclusion.
      Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      078497a4
    • bpf: cgroup: annotate pointers in struct cgroup_bpf with __rcu · dcdc43d6
      Daniel Mack committed
      The member 'effective' in 'struct cgroup_bpf' is protected by RCU.
      Annotate it accordingly to squelch a sparse warning.
      Signed-off-by: Daniel Mack <daniel@zonque.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dcdc43d6
    • Merge branch 'inet_csk_get_port-and-soreusport-fixes' · 28055c97
      David S. Miller committed
      Tom Herbert says:
      
      ====================
      inet: Fixes for inet_csk_get_port and soreuseport
      
      This patch set fixes a couple of issues I noticed while debugging our
      softlockup issue in inet_csk_get_port.
      
      - Don't allow jump into port scan in inet_csk_get_port if function
        was called with non-zero port number (looking up explicit port
        number).
      - When inet_csk_get_port is called with a zero port number (i.e. perform
        a scan) and reuseport is set on the socket, don't match sockets that
        also have reuseport set. The intent from the user should be
        to get a new port number and then explicitly bind other
        sockets to that number using soreuseport (see the sketch below).
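      A small user-space sketch of that intended usage (illustrative only, error
      checking omitted): let the kernel pick a port for the first soreuseport
      listener, read the chosen port back, then bind further soreuseport sockets
      explicitly to it.

      #include <arpa/inet.h>
      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>
      #include <sys/socket.h>
      #include <unistd.h>

      static int reuseport_listener(uint16_t port)
      {
              int one = 1;
              int fd = socket(AF_INET, SOCK_STREAM, 0);
              struct sockaddr_in a;

              memset(&a, 0, sizeof(a));
              a.sin_family = AF_INET;
              a.sin_addr.s_addr = htonl(INADDR_ANY);
              a.sin_port = htons(port);                 /* 0 means "pick one for me" */
              setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
              bind(fd, (struct sockaddr *)&a, sizeof(a));
              listen(fd, 16);
              return fd;
      }

      int main(void)
      {
              struct sockaddr_in a;
              socklen_t alen = sizeof(a);
              int fd1 = reuseport_listener(0);          /* kernel assigns a port */

              getsockname(fd1, (struct sockaddr *)&a, &alen);
              /* Explicitly bind a second listener to the port that was chosen. */
              int fd2 = reuseport_listener(ntohs(a.sin_port));

              printf("sharing port %d\n", ntohs(a.sin_port));
              close(fd1);
              close(fd2);
              return 0;
      }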
      
      Tested:
      
      Ran first patch on production workload with no ill effect.
      
      For second patch, ran a little listener application and first
      demonstrated that unbound sockets with soreuseport can indeed
      be bound to unrelated soreuseport sockets.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      28055c97
    • inet: Fix get port to handle zero port number with soreuseport set · 0643ee4f
      Tom Herbert committed
      A user may call listen without binding an explicit port, with the intent
      that the kernel will assign an available port to the socket. In this
      case inet_csk_get_port does a port scan. For such sockets, the user may
      also set soreuseport with the intent of creating more sockets for the
      port that is selected. The problem is that the initial socket being
      opened could inadvertently choose an existing and unrelated port
      number that was already created with soreuseport.
      
      This patch adds a boolean parameter to inet_bind_conflict that indicates
      whether soreuseport is allowed for the check (in addition to
      sk->sk_reuseport). In calls to inet_bind_conflict from inet_csk_get_port
      the argument is set to true if an explicit port is being looked up (snum
      argument is nonzero), and is false if port scan is done.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0643ee4f
    • inet: Don't go into port scan when looking for specific bind port · 9af7e923
      Tom Herbert committed
      inet_csk_get_port is called with a port number (snum argument) that may be
      zero or nonzero. If it is zero, then the intent is to find an available
      ephemeral port number to bind to. If snum is non-zero then the caller
      is asking to allocate a specific port number. In the latter case we
      never want to perform the scan in the ephemeral port range. It is
      conceivable that this can happen if the "goto again" in "tb_found:"
      is done. This patch adds a check that snum is zero before doing
      the "goto again".
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9af7e923
  2. 17 December 2016, 12 commits
    • bpf, test_verifier: fix a test case error result on unprivileged · 0eb6984f
      Daniel Borkmann committed
      Running ./test_verifier as unprivileged makes 1 out of 98 tests fail:
      
        [...]
        #71 unpriv: check that printk is disallowed FAIL
        Unexpected error message!
        0: (7a) *(u64 *)(r10 -8) = 0
        1: (bf) r1 = r10
        2: (07) r1 += -8
        3: (b7) r2 = 8
        4: (bf) r3 = r1
        5: (85) call bpf_trace_printk#6
        unknown func bpf_trace_printk#6
        [...]
      
      The test case is correct, just that the error outcome changed with
      ebb676da ("bpf: Print function name in addition to function id").
      Same as with e00c7b21 ("bpf: fix multiple issues in selftest suite
      and samples") issue 2), so just fix up the function name.
      
      Fixes: ebb676da ("bpf: Print function name in addition to function id")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0eb6984f
    • bpf: fix regression on verifier pruning wrt map lookups · a08dd0da
      Daniel Borkmann committed
      Commit 57a09bf0 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
      registers") introduced a regression where existing programs stopped
      loading due to reaching the verifier's maximum complexity limit,
      whereas prior to this commit they were loading just fine; the affected
      program has roughly 2k instructions.
      
      What was found is that state pruning couldn't be performed effectively
      anymore due to mismatches of the verifier's register state, in particular
      in the id tracking. It doesn't mean that 57a09bf0 is incorrect per
      se, but rather that the verifier needs to perform a lot more work for the
      same program with regard to the involved map lookups.
      
      Since commit 57a09bf0 is only about tracking registers with type
      PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
      until they are promoted through pattern matching with a NULL check to
      either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
      id becomes irrelevant for the transitioned types.
      
      For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
      but not so for PTR_TO_MAP_VALUE where id is becoming stale. It's even
      transferred further into other types that don't make use of it. Among
      others, one example is where UNKNOWN_VALUE is set on function call
      return with RET_INTEGER return type.
      
      states_equal() will then fall through the memcmp() on register state;
      note that the second memcmp() uses offsetofend(), so the id is part of
      that since d2a4dd37 ("bpf: fix state equivalence"). But the bisect
      pointed already to 57a09bf0, where we really reach beyond complexity
      limit. What I found was that states_equal() often failed in this
      case due to id mismatches in spilled regs with registers in type
      PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
      a memcmp() on their reg state and don't have any other optimizations
      in place, therefore also id was relevant in this case for making a
      pruning decision.
      
      We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
      For the affected program, it resulted in a ~17 fold reduction of
      complexity and let the program load fine again. Selftest suite also
      runs fine. The only other place where env->id_gen is used currently is
      through direct packet access, but for these cases id is long living, thus
      a different scenario.
      
      Also, the current logic in mark_map_regs() is not fully correct when
      marking NULL branch with UNKNOWN_VALUE. We need to cache the destination
      reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
      its id is reset and any subsequent registers that hold the original id
      and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
      anymore, since mark_map_reg() reuses the uncached regs[regno].id that
      was just overridden. Note, we don't need to cache it outside of
      mark_map_regs(), since it's called once on this_branch and the other
      time on other_branch, which are both two independent verifier states.
      A test case for this is added here, too.
      
      Fixes: 57a09bf0 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a08dd0da
    • net: vrf: Drop conntrack data after pass through VRF device on Tx · eb63ecc1
      David Ahern committed
      Locally originated traffic in a VRF fails in the presence of a POSTROUTING
      rule. For example,
      
          $ iptables -t nat -A POSTROUTING -s 11.1.1.0/24  -j MASQUERADE
          $ ping -I red -c1 11.1.1.3
          ping: Warning: source address might be selected on device other than red.
          PING 11.1.1.3 (11.1.1.3) from 11.1.1.2 red: 56(84) bytes of data.
          ping: sendmsg: Operation not permitted
      
      Worse, the above causes random corruption resulting in a panic in random
      places (I have not seen a consistent backtrace).
      
      Call nf_reset to drop the conntrack info following the pass through the
      VRF device.  The nf_reset is needed on Tx but not Rx because of the order
      in which NF_HOOK's are hit: on Rx the VRF device is after the real ingress
      device and on Tx it is before the real egress device. Connection
      tracking should be tied to the real egress device and not the VRF device.
      
      Fixes: 8f58336d ("net: Add ethernet header for pass through VRF device")
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb63ecc1
    • net: vrf: Fix NAT within a VRF · a0f37efa
      David Ahern committed
      Connection tracking with VRF is broken because the pass through the VRF
      device drops the connection tracking info. Removing the call to nf_reset
      allows DNAT and MASQUERADE to work across interfaces within a VRF.
      
      Fixes: 73e20b76 ("net: vrf: Add support for PREROUTING rules on vrf device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a0f37efa
    • Merge branch 'cls_flower-mask' · 8a9f5fdf
      David S. Miller committed
      Paul Blakey says:
      
      ====================
      net/sched: cls_flower: Fix mask handling
      
      The series fixes how the mask is being handled.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a9f5fdf
    • net/sched: cls_flower: Use masked key when calling HW offloads · f93bd17b
      Paul Blakey committed
      Zero bits on the mask signify a "don't care" on the corresponding bits
      in key. Some HWs require those bits on the key to be zero. Since these
      bits are masked anyway, it's okay to provide the masked key to all
      drivers.
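      The idea in a few lines of generic C (a sketch, not the cls_flower code):
      every byte handed to the hardware is key & mask, so bits the mask marks as
      don't-care are forced to zero.

      #include <stdio.h>

      int main(void)
      {
              unsigned char key[4]  = { 0xac, 0x10, 0x01, 0x37 };  /* example values       */
              unsigned char mask[4] = { 0xff, 0xff, 0xff, 0x00 };  /* last byte: don't care */
              unsigned char masked[4];

              for (int i = 0; i < 4; i++)
                      masked[i] = key[i] & mask[i];                /* what gets offloaded  */

              printf("%02x.%02x.%02x.%02x\n",
                     masked[0], masked[1], masked[2], masked[3]);
              return 0;
      }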
      
      Fixes: 5b33f488 ('net/flower: Introduce hardware offload support')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f93bd17b
    • net/sched: cls_flower: Use mask for addr_type · 970bfcd0
      Paul Blakey committed
      When addr_type is set, mask should also be set.
      
      Fixes: 66530bdf ('sched,cls_flower: set key address type when present')
      Fixes: bc3103f1 ('net/sched: cls_flower: Classify packet in ip tunnels')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      970bfcd0
    • net: macb: Added PCI wrapper for Platform Driver. · 83a77e9e
      Bartosz Folta committed
      There are hardware PCI implementations of the Cadence GEM network
      controller. This patch allows such hardware to be used by reusing the
      existing platform driver.
      Signed-off-by: Bartosz Folta <bfolta@cadence.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83a77e9e
    • ibmveth: calculate gso_segs for large packets · 94acf164
      Thomas Falcon committed
      Include calculations to compute the number of segments
      that comprise an aggregated large packet.
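      A hedged sketch of the calculation meant here (names are illustrative): the
      segment count is the TCP payload length divided by the MSS, rounded up,
      which is what the kernel's DIV_ROUND_UP() expresses.

      #include <stdio.h>

      /* gso_segs = ceil(payload_len / mss). */
      static unsigned int gso_segs(unsigned int payload_len, unsigned int mss)
      {
              return (payload_len + mss - 1) / mss;
      }

      int main(void)
      {
              printf("%u\n", gso_segs(64000, 1448));   /* -> 45 segments */
              return 0;
      }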
      Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      94acf164
    • net: qcom/emac: don't try to claim clocks on ACPI systems · 026acd5f
      Timur Tabi committed
      On ACPI systems, clocks are not available to drivers directly.  They are
      handled exclusively by ACPI and/or firmware, so there is no clock driver.
      Calls to clk_get() always fail, so we should not even attempt to claim
      any clocks on ACPI systems.
      Signed-off-by: Timur Tabi <timur@codeaurora.org>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      026acd5f
    • encx24j600: bugfix - always move ERXTAIL to next packet in encx24j600_rx_packets · ebe5236d
      Jeroen De Wachter committed
      Before, encx24j600_rx_packets did not update encx24j600_priv's next_packet
      member when an error occurred during packet handling (either because the
      packet's RSV header indicates an error or because the encx24j600_receive_packet
      method can't allocate an sk_buff).
      
      If the next_packet member is not updated, the ERXTAIL register will be set to
      the same value it had before, which means the bad packet remains in the
      component's memory and its RSV header will be read again when a new packet
      arrives. If the RSV header indicates a bad packet or if sk_buff allocation
      continues to fail, new packets will be stored in the component's memory until
      that memory is full, after which packets will be dropped.
      
      The SETPKTDEC command is always executed though, so the encx24j600 hardware has
      an incorrect count of the packets in its memory.
      
      To prevent this, the next_packet member should always be updated, allowing the
      packet to be skipped (either because it's bad, as indicated in its RSV header,
      or because allocating an sk_buff failed). In the allocation failure case, this
      does mean dropping a valid packet, but dropping the oldest packet to keep as
      much memory as possible available for new packets seems preferable to keeping
      old (but valid) packets around while dropping new ones.
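      A generic sketch of the policy chosen here (not the real driver code): the
      receive loop always advances its tail pointer, even when a packet has to be
      dropped, so a bad or unclaimable packet can never wedge the ring.

      #include <stdbool.h>
      #include <stdio.h>

      struct pkt { bool rsv_ok; int next; };   /* next: index of the following packet */

      static int rx_loop(const struct pkt *ring, int head, int budget)
      {
              int tail = head;

              for (int i = 0; i < budget; i++) {
                      const struct pkt *p = &ring[tail];

                      if (p->rsv_ok)
                              printf("deliver packet at %d\n", tail);
                      else
                              printf("drop bad packet at %d\n", tail);
                      /* Advance unconditionally so a bad packet is skipped
                       * instead of being re-read forever. */
                      tail = p->next;
              }
              return tail;       /* new ERXTAIL-style position */
      }

      int main(void)
      {
              struct pkt ring[3] = { { true, 1 }, { false, 2 }, { true, 0 } };

              rx_loop(ring, 0, 3);
              return 0;
      }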
      Signed-off-by: Jeroen De Wachter <jeroen.de_wachter.ext@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ebe5236d