1. 27 Feb, 2018 12 commits
  2. 24 Feb, 2018 4 commits
    • D
      net: fib_rules: Add new attribute to set protocol · 1b71af60
      Committed by Donald Sharp
      For ages iproute2 has used `struct rtmsg` as the ancillary header for
      FIB rules and in the process set the protocol value to RTPROT_BOOT.
      Until cac56209a66 ("net: Allow a rule to track originating protocol")
      the kernel rules code ignored the protocol value sent from userspace
      and always returned 0 in notifications. To avoid incompatibility with
      existing iproute2, send the protocol as a new attribute.
      
      Fixes: cac56209 ("net: Allow a rule to track originating protocol")
      Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
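The compatibility argument in this commit can be sketched in user-space C. This is only an illustration under stated assumptions: `struct rule_request` and `rule_protocol()` are hypothetical names, and falling back to `RTPROT_UNSPEC` when the attribute is absent is an assumption based on the message saying the kernel used to report 0. The two `RTPROT_*` values match the rtnetlink UAPI.

```c
#include <stddef.h>
#include <stdint.h>

/* Values from the kernel's rtnetlink UAPI. */
#define RTPROT_UNSPEC 0
#define RTPROT_BOOT   3

/* Hypothetical, simplified view of a parsed rule request. */
struct rule_request {
    uint8_t hdr_protocol;         /* legacy header field; old iproute2 sets RTPROT_BOOT */
    const uint8_t *attr_protocol; /* payload of the new attribute, NULL if absent */
};

/* Take the originator only from the dedicated attribute and ignore the
 * header field, so requests from existing tools keep reporting 0. */
static uint8_t rule_protocol(const struct rule_request *req)
{
    return req->attr_protocol ? *req->attr_protocol : RTPROT_UNSPEC;
}
```

With this shape, a request from an unmodified tool (header set to `RTPROT_BOOT`, no attribute) still yields 0, while a tool that sends the attribute gets its own value back.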
    • H
      r8169: simplify and improve check for dash · 9dbe7896
      Committed by Heiner Kallweit
      r8168_check_dash() returns false anyway for all chip versions not
      supporting dash. So we can simplify the check conditions.
      
      In addition, change the check functions to return bool instead of int,
      because they actually return a bool value.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • H
      r8169: disable WOL per default · 7edf6d31
      Committed by Heiner Kallweit
      Currently, if BIOS enables WOL in the chip, settings are inconsistent
      because the device isn't marked as wakeup-enabled (if not done
      explicitly via userspace tools). This causes issues with suspend/
      resume because mdio_bus_phy_may_suspend() checks whether the device is
      wakeup-enabled. In detail, MDIO bus access in phy_suspend() can fail
      because the MDIO bus is disabled.
      
      In the history of the driver we find two competing approaches:
      8f9d5138 "r8169: remember WOL preferences on driver load" prefers
      to preserve what the BIOS may have set, whilst bde135a6
      "r8169: only enable PCI wakeups when WOL is active" disabled PCI
      wakeup per default to work around a bug on one platform.
      
      Nobody seems to have complained about non-working WOL after the latter
      patch, which makes me think that nobody uses WOL without configuring it
      explicitly.
      
      My opinion:
      The vast majority of users don't use WOL even if the BIOS enables it in
      the chip. And keeping WOL active prevents the PHY(s) from powering
      down when idle.
      If somebody needs WOL, they can enable it during boot, e.g. by
      configuring systemd.link/WakeOnLan.
      
      Therefore, to make WOL consistent again, disable it per default.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      gianfar: simplify FCS handling and fix memory leak · d903ec77
      Committed by Andy Spencer
      Previously, buffer descriptors containing only the frame check sequence
      (FCS) were skipped and not added to the skb. However, the page reference
      count was still incremented, leading to a memory leak.
      
      Fixing this inside gfar_add_rx_frag() is difficult due to reserved
      memory handling and page reuse. Instead, move the FCS handling to
      gfar_process_frame() and trim off the FCS before passing the skb up the
      networking stack.
      Signed-off-by: Andy Spencer <aspencer@spacex.com>
      Signed-off-by: Jim Gruen <jgruen@spacex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
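The idea of trimming the FCS once the whole frame has been assembled, rather than skipping descriptors while fragments are added, might look roughly like this. It is a user-space sketch, not the driver code: `struct frame` and `trim_fcs()` are hypothetical stand-ins for the skb and the trimming step, and only `ETH_FCS_LEN` (4 bytes) is a real constant.

```c
#include <stddef.h>

#define ETH_FCS_LEN 4 /* Ethernet frame check sequence length in bytes */

/* Hypothetical stand-in for an skb: just a buffer and its length. */
struct frame {
    const unsigned char *data;
    size_t len;
};

/* Drop the trailing FCS if the hardware left it in the buffer.
 * Every fragment has already been added, so no descriptor is skipped
 * and no page reference is taken without a matching owner. */
static void trim_fcs(struct frame *f, int fcs_included)
{
    if (fcs_included && f->len >= ETH_FCS_LEN)
        f->len -= ETH_FCS_LEN;
}
```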
  3. 23 Feb, 2018 12 commits
  4. 22 Feb, 2018 12 commits
    • M
      ipvlan: selects master_l3 device instead of depending on it · 218798f4
      Committed by Matteo Croce
      The L3 Master device is just glue between the core networking code and
      device drivers, so it should be selected automatically rather than
      having to be enabled explicitly.
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      ipvlan: drop ipv6 dependency · 94333fac
      Committed by Matteo Croce
      IPVlan has a hard dependency on IPv6. Refactor the ipvlan code to allow
      compiling it with IPv6 disabled, move duplicate code into addr_equal(),
      and refactor a series of if-else statements into a switch.
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      net: Allow a rule to track originating protocol · cac56209
      Committed by Donald Sharp
      Allow a rule that is being added/deleted/modified or
      dumped to contain the originating protocol's id.
      
      The protocol is handled just like a route's originating
      protocol.  This is especially useful because a growing
      number of different user space programs are adding rules.
      
      Allow the vrf device to specify that the kernel is the originator
      of the rule created for this device.
      Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      amd-xgbe: Restore PCI interrupt enablement setting on resume · cfd092f2
      Committed by Tom Lendacky
      After resuming from suspend, the PCI device support must re-enable the
      interrupt setting so that interrupts are actually delivered.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • N
      ibmvnic: Correct goto target for tx irq initialization failure · af9090c2
      Committed by Nathan Fontenot
      When a failure occurs during initialization of the tx sub crq
      irqs, we should branch to the cleanup of the tx irqs. The current
      code branches to the rx irq cleanup and attempts to clean up the
      rx irqs, which have not been initialized.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
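The goto-target problem described in this commit follows the usual kernel unwinding pattern, which can be sketched in plain C. The arrays, sizes and label names below are hypothetical, and the sketch assumes tx irqs are set up before rx irqs, which is what makes jumping to the rx cleanup on a tx failure wrong.

```c
#include <stdbool.h>

#define NUM_TX 3
#define NUM_RX 3

/* Hypothetical bookkeeping: true while an irq is registered. */
static bool tx_irq[NUM_TX];
static bool rx_irq[NUM_RX];

/* fail_tx / fail_rx: index at which registration fails, or -1 for none. */
static int init_irqs(int fail_tx, int fail_rx)
{
    int i, j;

    for (i = 0; i < NUM_TX; i++) {
        if (i == fail_tx)
            goto tx_failed; /* rx irqs do not exist yet: skip rx cleanup */
        tx_irq[i] = true;
    }
    for (i = 0; i < NUM_RX; i++) {
        if (i == fail_rx)
            goto rx_failed;
        rx_irq[i] = true;
    }
    return 0;

rx_failed:
    /* Undo the rx irqs registered so far, then fall through to all tx irqs. */
    for (j = 0; j < i; j++)
        rx_irq[j] = false;
    i = NUM_TX;
tx_failed:
    /* Undo only the tx irqs that were actually registered. */
    for (j = 0; j < i; j++)
        tx_irq[j] = false;
    return -1;
}
```

Ordering the labels in reverse order of setup means each failure point only unwinds what was actually initialized before it.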
    • J
      virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDP · 8dcc5b0a
      Committed by Jesper Dangaard Brouer
      When a driver implements the ndo_xdp_xmit() function, there is
      (currently) no generic way to determine whether it is safe to call.
      
      It is unsafe, for example, to call the driver's ndo_xdp_xmit if it has
      not allocated the needed XDP TX queues yet.  This is the case for
      virtio_net, which only allocates the XDP TX queues once an XDP/bpf
      prog is attached (in virtnet_xdp_set()).
      
      Thus, a crash will occur for virtio_net when redirecting to the
      ndo_xdp_xmit of another virtio_net device that has not attached an XDP
      prog. The sample xdp_redirect_map tries to attach a dummy XDP prog to take
      this into account, but that can also easily fail if the virtio_net (or
      actually the underlying vhost driver) has not allocated enough extra
      queues for the device.
      
      Allocating more queues is currently a manual configuration step.
      Hint: for libvirt, add the following to the XML:
      
        <driver name='vhost' queues='16'>
          <host mrg_rxbuf='off'/>
          <guest tso4='off' tso6='off' ecn='off' ufo='off'/>
        </driver>
      
      The solution in this patch is to check that the device has loaded an
      XDP/bpf prog before proceeding.  This is similar to the check
      performed in driver ixgbe.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
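The guard described in this commit, refusing to transmit unless an XDP program is attached, can be sketched as follows. `struct fake_dev` and `fake_xdp_xmit()` are hypothetical names, and returning `-ENXIO` is an assumption made for the illustration; the commit message only says the device is checked for a loaded program before proceeding.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, stripped-down device state. */
struct fake_dev {
    const void *xdp_prog; /* NULL until an XDP program is attached */
    int sent;             /* frames handed to the (simulated) XDP TX queue */
};

static int fake_xdp_xmit(struct fake_dev *dev)
{
    /* No program attached means the XDP TX queues were never allocated,
     * so touching them would crash; bail out early instead. */
    if (!dev->xdp_prog)
        return -ENXIO;

    dev->sent++;
    return 0;
}
```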
    • J
      virtio_net: fix memory leak in XDP_REDIRECT · 11b7d897
      Committed by Jesper Dangaard Brouer
      XDP_REDIRECT calling xdp_do_redirect() can fail for multiple reasons
      (which can be inspected by tracepoints). The current semantics is that
      on failure the driver calling xdp_do_redirect() must handle freeing or
      recycling the page associated with this frame.  This can be seen as an
      optimization, as drivers usually have an optimized XDP_DROP code path
      for frame recycling in place already.
      
      The virtio_net driver didn't handle the case where xdp_do_redirect()
      failed. This caused a memory leak, as the page refcnt wasn't decremented
      on failures.
      
      The function __virtnet_xdp_xmit() did handle one type of failure,
      when the xmit queue used by virtqueue_add_outbuf() is full, which "hides"
      the release of a refcnt on the page.  Instead, __virtnet_xdp_xmit()
      must follow the API of xdp_do_redirect(), which on error leaves it up to
      the caller to free the page of the failed send operation.
      
      Fixes: 186b3c99 ("virtio-net: support XDP_REDIRECT")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
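The ownership rule stated in this commit (on failure, the caller of the redirect must free or recycle the page) can be modelled with a simple reference counter. Everything here is a hypothetical model, not virtio_net code: `fake_redirect()` stands in for the redirect call, and a successful hand-off is modelled as an immediate reference drop even though a real driver releases the page later.

```c
/* Hypothetical page with a bare reference counter. */
struct fake_page {
    int refcnt;
};

/* Hypothetical redirect: fails when the target queue is full and, per the
 * contract described above, leaves the page reference untouched on failure. */
static int fake_redirect(struct fake_page *page, int queue_full)
{
    if (queue_full)
        return -1;
    page->refcnt--; /* ownership passed on; modelled as an immediate drop */
    return 0;
}

/* Caller side: a failed redirect is handled like a drop, so the caller
 * releases the reference itself and nothing leaks. */
static void rx_handle_redirect(struct fake_page *page, int queue_full)
{
    if (fake_redirect(page, queue_full) < 0)
        page->refcnt--;
}
```

In both outcomes the counter ends at zero, which is the property the fix restores.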
    • J
      virtio_net: fix XDP code path in receive_small() · 95dbe9e7
      Committed by Jesper Dangaard Brouer
      When configuring virtio_net to use the code path 'receive_small()'
      in order to get correct XDP_REDIRECT support, I discovered that TCP
      packets would get silently dropped when loading an XDP program with
      action XDP_PASS.
      
      The bug seems to be that receive_small(), when XDP is loaded, checks that
      hdr->hdr.flags is zero, which seems wrong as hdr.flags contains the
      flags VIRTIO_NET_HDR_F_*:
       #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
       #define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */
      
      TCP got dropped as it had the VIRTIO_NET_HDR_F_DATA_VALID flag set.
      
      The flags that are relevant here are the VIRTIO_NET_HDR_GSO_* flags
      stored in hdr->hdr.gso_type. Thus, the fix is just to check that none of
      the gso_type flags have been set.
      
      Fixes: bb91accf ("virtio-net: XDP support for small buffers")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
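The difference between the old and the fixed check can be illustrated with a two-field header. `struct mini_hdr` and the two helper functions are hypothetical; the constant values are those of the virtio-net header definitions, and the helpers are only a reading of what the commit message describes, not the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
#define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */
#define VIRTIO_NET_HDR_GSO_NONE     0 /* Not a GSO frame */
#define VIRTIO_NET_HDR_GSO_TCPV4    1 /* GSO frame, IPv4 TCP (TSO) */

/* Only the two header fields discussed above. */
struct mini_hdr {
    uint8_t flags;
    uint8_t gso_type;
};

/* Old check as described: any flag bit makes the XDP path reject the packet. */
static bool xdp_ok_old(const struct mini_hdr *h)
{
    return h->flags == 0;
}

/* Fixed check as described: only GSO packets are rejected. */
static bool xdp_ok_new(const struct mini_hdr *h)
{
    return h->gso_type == VIRTIO_NET_HDR_GSO_NONE;
}
```

A TCP packet carrying `VIRTIO_NET_HDR_F_DATA_VALID` with no GSO type fails the old check but passes the new one, which matches the silent drops reported above.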
    • J
      virtio_net: disable XDP_REDIRECT in receive_mergeable() case · 7324f539
      Committed by Jesper Dangaard Brouer
      The virtio_net code has three different RX code paths in receive_buf().
      Two of these code paths can handle XDP, but one of them is broken for
      at least XDP_REDIRECT.
      
      Function(1): receive_big() does not support XDP.
      Function(2): receive_small() supports XDP fully and uses build_skb().
      Function(3): receive_mergeable() has broken XDP_REDIRECT and uses napi_alloc_skb().
      
      The simple explanation is that receive_mergeable() is broken because
      it uses napi_alloc_skb(), which violates XDP given that XDP assumes the
      packet header+data are in a single page with enough tail room for
      skb_shared_info.
      
      The longer explanation is that receive_mergeable() tries to
      work around and satisfy these XDP requirements, e.g. by having a
      function xdp_linearize_page() that allocates and memcpys RX buffers
      around (in case the packet is scattered across multiple RX buffers).  This
      does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because
      we have not implemented bpf_xdp_adjust_tail yet).
      
      The XDP_REDIRECT action combined with cpumap is broken and causes
      hard-to-debug crashes.  The main issue is that the RX packet does not have
      the needed tail room (SKB_DATA_ALIGN(skb_shared_info)), causing
      skb_shared_info to overlap the next packet's head room (in which cpumap
      stores info).
      
      Reproducing this depends on the packet payload length and on whether the
      RX buffer size happened to leave tail room for skb_shared_info.  To make
      this even harder to troubleshoot, the RX buffer size is changed dynamically
      at runtime based on an Exponentially Weighted Moving Average
      (EWMA) over the packet length, when refilling RX rings.
      
      This patch only disables XDP_REDIRECT support in the receive_mergeable()
      case, because it can cause a real crash.
      
      IMHO we should consider NOT supporting XDP in receive_mergeable() at
      all, because the principles behind XDP are to gain speed by (1) code
      simplicity, (2) sacrificing memory and (3) where possible moving
      runtime checks to setup time.  These principles are clearly being
      violated in receive_mergeable(), which e.g. tracks the average
      buffer size at runtime to reduce memory consumption.
      
      In the longer run, we should consider introducing a separate receive
      function when attaching an XDP program, and also change the memory
      model to be compatible with XDP when attaching an XDP prog.
      
      Fixes: 186b3c99 ("virtio-net: support XDP_REDIRECT")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
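The tail-room condition behind the crash can be written as a small arithmetic check. The numbers are assumptions chosen for illustration: a 64-byte alignment and a 320-byte `skb_shared_info` are not taken from the commit, and `has_tailroom()` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE 64          /* assumed alignment, stands in for SKB_DATA_ALIGN */
#define SHARED_INFO_SIZE 320   /* assumed sizeof(struct skb_shared_info) */
#define ALIGN_UP(x) (((x) + CACHE_LINE - 1) & ~((size_t)CACHE_LINE - 1))

/* True if head room, packet data and the aligned shared-info area all fit
 * inside one RX buffer, so shared info cannot spill into the next buffer. */
static bool has_tailroom(size_t buf_size, size_t headroom, size_t pkt_len)
{
    return headroom + pkt_len + ALIGN_UP(SHARED_INFO_SIZE) <= buf_size;
}
```

With a 2048-byte buffer and 256 bytes of head room, a 1400-byte packet fits (1976 bytes in total) while a 1500-byte packet does not (2076 bytes). That dependence on payload length and buffer size is why reproduction is described as erratic when the buffer size itself changes at runtime.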
    • N
      ibmvnic: Allocate max queues stats buffers · abcae546
      Committed by Nathan Fontenot
      To avoid losing any stats when the number of sub-crqs changes, allocate
      the max number of stats buffers so a stats buffer exists for all possible
      sub-crqs.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • N
      ibmvnic: Make napi usage dynamic · 86f669b2
      Committed by Nathan Fontenot
      In order to handle the number of rx sub-crqs changing during a driver
      reset, the ibmvnic driver also needs to update the number of napi
      structures. To do this, the code to init and free the napi structures is
      moved into separate routines so they can be called during the reset process.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • N
      ibmvnic: Free and re-allocate scrqs when tx/rx scrqs change · d7c0ef36
      Committed by Nathan Fontenot
      When the driver resets it is possible that the number of tx/rx
      sub-crqs can change. This patch handles this so that the driver does
      not try to access non-existent sub-crqs.
      
      The count for releasing sub-crqs depends on the adapter state. The
      active queue count is not set in probe, so if we are releasing in probe
      state we use the request queue count.
      
      Additionally, a parameter is added to release_sub_crqs() so that
      we know if the h_call to free the sub-crq needs to be made. In
      the reset path we have to do a reset of the main crq, which is
      a free followed by a register of the main crq. The free of the main
      crq results in all of the sub-crqs being freed. When updating the
      sub-crq count in the reset path we do not want to h_free the
      sub-crqs, since they are already freed.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
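The two rules in this commit (which queue count to use, and whether to issue the hypervisor free call) can be sketched with a simplified adapter model. Only the function name `release_sub_crqs()` comes from the commit message; the struct, its fields and the counter that replaces the real h_call are hypothetical.

```c
#include <stdbool.h>

#define MAX_SCRQS 4

/* Hypothetical, simplified adapter state. */
struct fake_adapter {
    bool probing;              /* true while still in the probe state */
    int req_queues;            /* requested count, the only one valid in probe */
    int active_queues;         /* set only after probe */
    bool allocated[MAX_SCRQS];
    int h_free_calls;          /* counts simulated hypervisor free calls */
};

static void release_sub_crqs(struct fake_adapter *a, bool do_h_free)
{
    /* The active count is not set during probe, so use the requested one. */
    int n = a->probing ? a->req_queues : a->active_queues;

    for (int i = 0; i < n && i < MAX_SCRQS; i++) {
        if (!a->allocated[i])
            continue;
        /* In the reset path, freeing the main crq already freed the
         * sub-crqs, so the hypervisor call must be skipped there. */
        if (do_h_free)
            a->h_free_calls++;
        a->allocated[i] = false;
    }
}
```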