1. 06 Sep, 2014 24 commits
    • net: merge cases where sock_efree and sock_edemux are the same function · 82eabd9e
      Alexander Duyck committed
      Since sock_efree and sock_edemux are essentially the same code for non-TCP
      sockets and the case where CONFIG_INET is not defined we can combine the
      code or replace the call to sock_edemux in several spots.  As a result we
      can avoid a bit of unnecessary code or code duplication.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      82eabd9e
    • net-timestamp: Make the clone operation stand-alone from phy timestamping · 62bccb8c
      Alexander Duyck committed
      The phy timestamping takes a different path than the regular timestamping
      does in that it will create a clone first so that the packets needing to be
      timestamped can be placed in a queue, or the context block could be used.
      
      In order to support these use cases I am pulling the core of the code out
      so it can be used in other drivers beyond just phy devices.
      
      In addition I have added a destructor named sock_efree which is meant to
      provide a simple way for dropping the reference to skb exceptions that
      aren't part of either the receive or send windows for the socket, and I
      have removed some duplication in spots where this destructor could be used
      in place of sock_edemux.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      62bccb8c
    • net-timestamp: Merge shared code between phy and regular timestamping · 37846ef0
      Alexander Duyck committed
      This change merges the shared bits that exist between skb_tx_tstamp and
      skb_complete_tx_timestamp.  By doing this we can avoid the two diverging as
      there were already changes pushed into skb_tx_tstamp that hadn't made it
      into the other function.
      
      In addition this resolves issues with the fact that
      skb_complete_tx_timestamp was included in linux/skbuff.h even though it was
      only compiled in if phy timestamping was enabled.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      37846ef0
    • ipv4: harden fnhe_hashfun() · d546c621
      Eric Dumazet committed
      Let's make this hash function a bit more secure, as ICMP attacks are still
      in the wild.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d546c621
    • net-timestamp: fix allocation error in test · 18a47e6d
      Willem de Bruijn committed
      A buffer is incorrectly zeroed to the length of the pointer. If
      cfg_payload_len < sizeof(void *) this can overwrite unrelated memory.
      The buffer contents are never read, so there is no need to zero it.
      
      Fixes: 8fe2f761 ("net-timestamp: expand documentation")
      Reported-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      18a47e6d
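The pitfall described in this commit can be illustrated with a minimal sketch. The helper name `alloc_zeroed_payload` is hypothetical and is not taken from the selftest; the commit itself simply drops the zeroing, whereas the sketch shows what a correct length would look like if zeroing were needed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the selftest's code): allocate a payload buffer
 * and zero exactly the number of bytes that were allocated. */
static char *alloc_zeroed_payload(size_t payload_len)
{
	char *buf = malloc(payload_len);

	if (!buf)
		return NULL;

	/* Buggy pattern: memset(buf, 0, sizeof(buf));
	 * sizeof(buf) is the size of the pointer (commonly 4 or 8 bytes),
	 * independent of payload_len, so a shorter allocation is overrun. */
	memset(buf, 0, payload_len);
	return buf;
}
```

With a 3-byte payload on a 64-bit system, the buggy form would write 8 bytes into a 3-byte allocation, while the form above writes only 3.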
    • hyperv: NULL dereference on error · b1c84927
      Dan Carpenter committed
      We try to call free_netvsc_device(net_device) when "net_device" is NULL.
      It leads to an Oops.
      
      Fixes: f90251c8 ('hyperv: Increase the buffer length for netvsc_channel_cb()')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b1c84927
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · a77f9a28
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2014-09-04
      
      This series contains updates to i40e, i40evf, ixgbe and ixgbevf.
      
      Catherine adds dual speed module support to i40e.  Updates i40e to allow
      the user to change link settings when the link is down.
      
      Serey renames i40e_ndo_set_vf_spoofck() to i40e_ndo_set_vf_spoofchk()
      to be more consistent with what is defined in netdev and removes an
      unnecessary variable assignment.
      
      Jesse makes a malicious driver detection warning only print if extended
      driver string is enabled for i40e.  Fixes a panic under traffic load when
      resetting or if/whenever there was a Tx-timeout because we were enabling
      the Tx queue too early.
      
      Anjali fixes an issue when PF reset fails, where we were trying to restart
      the admin queue which has not been setup at that point.  This resolves an
      occasional kernel panic when PF reset fails for some reason.
      
      Ethan Zhao replaces the use of a local i40e_vfs_are_assigned() with the
      global kernel pci_vfs_assigned() for i40e.
      
      Alex cleans up the FDB handling for ixgbe.  This change makes it so that
      the behavior for FDB handling is consistent between both the SR-IOV and
      non-SR-IOV cases.  The main change is that we perform bounds checking on
      the number of SR-IOV addresses regardless of whether SR-IOV is enabled,
      as we can only support a certain number of addresses in the hardware.
      
      Emil extends the pending Tx work check to the VF interfaces, where the
      driver initiates a reset of the interface on link loss with pending Tx
      work in order to clear the rings.  Introduces a delay for 82599 VFs of
      at least 500 usecs to make sure the VFLINKS value is correct, since this
      bit tends to flap when a DA or SFP+ cable is disconnected.
      
      Jacob adds code comments in ixgbe to make it more obvious that we are
      resetting features based on the fact that we do not have MSI-X enabled,
      and cannot use the previous settings.  Also resolves a kernel NULL
      pointer dereference by limiting the combined total of MACVLAN and
      SR-IOV VFs, since the hardware has a limited number of pools available
      (64).  Previously, no checks were in place to limit the number of
      accelerated MACVLAN devices based on the number of pools, which would
      be ok since there was already a limit for these well below the number of
      available pools.  However, SR-IOV uses the very same pools, therefore
      we need to ensure that the total number of pools does not exceed the
      number of pools available in the hardware.
      
      v2:
       - clean up code comment in patch 5 by replacing "an" with "auto
         negotiation" based on feedback from Sergei Shtylyov
       - removed unnecessary parentheses around function call in patch 8
         based on feedback from Sergei Shtylyov
      ====================
      a77f9a28
    • net: ethernet: cpsw: improve interrupt lookup logic in cpsw_probe() · c2b32e58
      Daniel Mack committed
      Simplify the interrupt resource lookup code in cpsw_probe() by the
      following:
      
       * Only look at the first member of the resource. As the driver only
         works for DT-enabled platforms anyway, a resource of type
         IORESOURCE_IRQ will only contain one single entry
         (res->start == res->end), so there is no need for the iteration.
      
       * Add a bounds check to avoid overflows if we are passed more than
         ARRAY_SIZE(priv->irqs_table) resources.
      
       * Assign 'ret' with the return value of devm_request_irq() so that
         cpsw_probe() returns the appropriate error code.
      
       * If devm_request_irq() fails, report the error code in the log
         message.
      Signed-off-by: Daniel Mack <zonque@gmail.com>
      Acked-by: Mugunthan V N <mugunthanvnm@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c2b32e58
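The bounds check and error propagation listed in this commit follow a common pattern, sketched below outside the kernel. The `irq_table` structure, its size of 4, and the choice of `-EINVAL` are assumptions made for illustration; they are not taken from the cpsw driver.

```c
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for a driver's fixed-size IRQ table. */
struct irq_table {
	unsigned int irqs[4];
	size_t num;
};

/* Record one IRQ number, refusing to write past the end of the table.
 * Returns 0 on success or a negative errno value, so the caller can
 * hand the code straight back from its probe routine. */
static int irq_table_add(struct irq_table *t, unsigned int irq)
{
	if (t->num >= ARRAY_SIZE(t->irqs))
		return -EINVAL;

	t->irqs[t->num++] = irq;
	return 0;
}
```

A caller would stop at the first non-zero return value and log it, rather than silently continuing with a partially filled table.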
    • ipv4: fix a race in update_or_create_fnhe() · caa41527
      Eric Dumazet committed
      nh_exceptions is effectively used under rcu, but lacks proper
      barriers. Between kzalloc() and the setting of nh->nh_exceptions,
      we need a proper memory barrier.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Fixes: 4895c771 ("ipv4: Add FIB nexthop exceptions.")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      caa41527
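The ordering requirement in this commit (initialize the new object, then publish the pointer with a barrier in between) can be sketched in userspace with C11 atomics. This is only an analogue under stated assumptions: kernel code would normally express the same idea with rcu_assign_pointer() and rcu_dereference(), the `exceptions` structure and `get_or_create` helper are hypothetical, and a single writer is assumed (the kernel path serializes writers with a lock).

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lazily allocated exception table. */
struct exceptions {
	int depth;
};

static _Atomic(struct exceptions *) table;

/* Return the published table, allocating and publishing it on first use.
 * Assumes a single writer at a time. */
static struct exceptions *get_or_create(void)
{
	struct exceptions *t;

	/* Acquire pairs with the release store below: a reader that sees
	 * the pointer also sees the initialized contents. */
	t = atomic_load_explicit(&table, memory_order_acquire);
	if (t)
		return t;

	t = calloc(1, sizeof(*t));
	if (!t)
		return NULL;
	t->depth = 1;

	/* Release orders the initialization above before the pointer
	 * becomes visible to other threads. */
	atomic_store_explicit(&table, t, memory_order_release);
	return t;
}
```

Without the release/acquire pairing, a concurrent reader could observe the new pointer before the stores that initialized the object it points to.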
    • l2tp: fix missing line continuation · 29abe2fd
      Andy Zhou committed
      This syntax error was hidden because L2TP_REFCNT_DEBUG is not set by
      default.
      Signed-off-by: Andy Zhou <azhou@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      29abe2fd
    • Merge branch 'amd-xgbe-next' · f35d2a5f
      David S. Miller committed
      Tom Lendacky says:
      
      ====================
      amd-xgbe: AMD XGBE driver updates 2014-09-03
      
      The following series of patches includes fixes/updates to the driver.
      
      - Query the device for the actual speed mode (KR/KX) rather than trying
        to track it
      - Update parallel detection logic to support KR mode
      - Fix new warnings from checkpatch in the amd-xgbe and amd-xgbe-phy
        driver
      
      This patch series is based on net-next.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f35d2a5f
    • amd-xgbe-phy: Checkpatch driver fixes · b73c798b
      Lendacky, Thomas committed
      This patch contains fixes identified by checkpatch when run with the
      strict option.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b73c798b
    • amd-xgbe: Checkpatch driver fixes · a2ea14d7
      Lendacky, Thomas committed
      This patch contains fixes identified by checkpatch when run with the
      strict option.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2ea14d7
    • amd-xgbe-phy: Enhance parallel detection to support KR speed · e6f0562f
      Lendacky, Thomas committed
      Add support to allow parallel detection to work in KR speed. With
      both speed modes of KX and KR supported, KX must be checked first.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6f0562f
    • amd-xgbe-phy: Check device for current speed mode (KR/KX) · e3eec4e7
      Lendacky, Thomas committed
      Since device resets can change the current mode it's possible to think
      the device is in a different mode than it actually is.  Rather than
      trying to determine every place where the current mode needs to be
      set/saved, be safe and check the device's actual mode when needed
      instead of trying to track it.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e3eec4e7
    • Merge branch 'r8152-next' · e4cf0b75
      David S. Miller committed
      Hayes Wang says:
      
      ====================
      r8152: random MAC address
      
      If the interface has an invalid MAC address, it cannot
      be used. To let it work normally, give it a
      random one.
      
      v3:
        Remove
      	ether_addr_copy(dev->perm_addr, dev->dev_addr);
      
      v2:
        Use "%pM" format specifier for printing a MAC address.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e4cf0b75
    • r8152: use eth_hw_addr_random · 179bb6d7
      hayeswang committed
      If the hw doesn't have a valid MAC address, give it a random one and
      write it to the hw.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      179bb6d7
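The fallback described in this commit can be sketched in plain C. This is a userspace illustration, not the driver's code: the helper names `mac_is_valid`, `mac_randomize` and `mac_fixup` are hypothetical, `rand()` stands in for a proper random source, and the validity rule assumed here is "not all-zero and not multicast". The random address is marked as locally administered and unicast, which is the usual convention for generated Ethernet addresses.

```c
#include <stdbool.h>
#include <stdlib.h>

#define ETH_ALEN 6

/* Assumed validity rule: not all-zero and not a multicast address. */
static bool mac_is_valid(const unsigned char addr[ETH_ALEN])
{
	bool all_zero = true;

	for (int i = 0; i < ETH_ALEN; i++)
		if (addr[i])
			all_zero = false;

	return !all_zero && !(addr[0] & 0x01);
}

/* Fill addr with random bytes, then force unicast + locally administered. */
static void mac_randomize(unsigned char addr[ETH_ALEN])
{
	for (int i = 0; i < ETH_ALEN; i++)
		addr[i] = (unsigned char)(rand() & 0xff);

	addr[0] &= 0xfe;	/* clear the multicast bit */
	addr[0] |= 0x02;	/* set the locally administered bit */
}

/* Replace an invalid address with a random one; returns true if replaced. */
static bool mac_fixup(unsigned char addr[ETH_ALEN])
{
	if (mac_is_valid(addr))
		return false;

	mac_randomize(addr);
	return true;
}
```

A valid address read from the hardware is left untouched; only an invalid one is replaced, after which it still has to be written back to the device.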
    • r8152: change the location of rtl8152_set_mac_address · 8ba789ab
      hayeswang committed
      Swap the locations of rtl8152_set_mac_address() and
      set_ethernet_addr() so that set_ethernet_addr() can
      later set the MAC address by calling
      rtl8152_set_mac_address().
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8ba789ab
    • Merge branch 'rx_copybreak' · b52b7275
      David S. Miller committed
      Govindarajulu Varadarajan says:
      
      ====================
      enic: Add support for rx_copybreak
      
      The following series implements rx_copybreak.
      
      dma_map_single()/dma_unmap_single() is more expensive than alloc_skb & memcpy
      for smaller packets. By doing this we can reuse the dma buff which is already
      mapped. This is very useful when iommu is on. The default skb copybreak value
      is 256.
      
      When iommu is on, we can go much higher than 256. All the drivers that support
      rx_copybreak provide a module parameter to change this value. Since a module
      parameter is the least preferred way of changing driver values, this series
      adds ethtool support for setting rx_copybreak.
      
      v4:
      Validate tunable length in ethtool_get_tunable, not in driver implemented
      function.
      
      Drop the tunable_ops array for each tunable type. Define one function and let the
      driver use switch case for each type.
      
      Use double underscore for data type in UAPI headers.
      Use const qualifier where possible.
      
      v3:
      Add tunable namespace to ethtool. Use new ethtool cmd ETHTOOL_S/GTUNABLE to
      set/get rx_copybreak from userspace.
      
      v2:
      Add new ethtool_cmd for DMA buffer parameters, instead of adding new members to
      existing ethtool_ringparam.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b52b7275
    • enic: Add tunable_ops support for rx_copybreak · d4ad30b1
      Govindarajulu Varadarajan committed
      This patch adds support for setting/getting rx_copybreak using
      generic ethtool tunable.
      
      Defines enic_get_tunable() & enic_set_tunable() to get/set rx_copybreak.
      As of now, these two functions support only rx_copybreak.
      Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d4ad30b1
    • ethtool: Add generic options for tunables · f0db9b07
      Govindarajulu Varadarajan committed
      This patch adds new ethtool cmds, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE, for
      getting/setting tunable values from the driver.
      
      Add get_tunable and set_tunable to ethtool_ops. Drivers implement these
      functions for getting/setting tunable values.
      Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f0db9b07
    • enic: implement rx_copybreak · a03bb56e
      Govindarajulu Varadarajan committed
      Calling dma_map_single()/dma_unmap_single() is quite expensive compared
      to copying a small packet. So let's copy short frames and keep the buffers
      mapped.
      Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a03bb56e
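The copybreak decision can be sketched in userspace as follows. This is an illustration under stated assumptions, not the enic code: `rx_slot` and `rx_receive` are hypothetical names, malloc() stands in for skb allocation, swapping the slot's buffer stands in for the unmap/remap step, and copying frames of length up to and including the threshold is an assumed convention.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical receive-ring slot; buf stands in for a DMA-mapped buffer. */
struct rx_slot {
	unsigned char *buf;
	size_t size;
};

/* Return a buffer holding the received frame (caller frees it).
 * *kept_mapped reports whether the slot's original buffer stayed in place. */
static unsigned char *rx_receive(struct rx_slot *slot, size_t frame_len,
				 size_t copybreak, bool *kept_mapped)
{
	unsigned char *out;

	if (frame_len == 0 || frame_len > slot->size)
		return NULL;

	if (frame_len <= copybreak) {
		/* Short frame: copy it out and keep reusing the mapped buffer. */
		out = malloc(frame_len);
		if (!out)
			return NULL;
		memcpy(out, slot->buf, frame_len);
		*kept_mapped = true;
		return out;
	}

	/* Long frame: hand the buffer up and refill the slot with a new one,
	 * which is where a real driver would pay for unmap + map. */
	unsigned char *fresh = malloc(slot->size);
	if (!fresh)
		return NULL;
	out = slot->buf;
	slot->buf = fresh;
	*kept_mapped = false;
	return out;
}
```

With the default threshold of 256 mentioned in the series description, a 64-byte frame is copied while a 1500-byte frame takes the hand-off path.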
    • dev_ioctl: remove dev_load() CAP_SYS_MODULE message · e020836d
      Daniel Borkmann committed
      Marcel reported seeing the following message when autoloading
      is triggered while adding an nlmon device:
      
        Loading kernel module for a network device with
        CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias
        netdev-nlmon instead.
      
      This false positive happens despite having the correct
      capabilities set, e.g. through issuing `ip link del dev nlmon`
      more than once on a valid device with name nlmon, but Marcel
      has also seen it at creation time when no nlmon module was
      previously compiled in or loaded as a module and the device
      name equals a link type name (e.g. nlmon, vxlan, team).
      
      Stephen says:
      
        The netdev module alias is a holdover from the past. For
        normal devices, people used to create an alias eth0 and
        point it to the type of network device used; that was back
        in the bad old ISA days before real discovery.
      
        Also, the tunnels create module alias for the control device
        and ip used to use this to autoload the tunnel device.
      
        The message is bogus and should just be removed, I also see
        it in a couple of other cases where tap devices are renamed
        for other uses.
      
      As mentioned in 8909c9ad ("net: don't allow CAP_NET_ADMIN
      to load non-netdev kernel modules"), we nevertheless still
      might want to leave the old autoloading behaviour in place
      as it could break old scripts, so for now, let's just remove
      the log message as Stephen suggests.
      
      Reference: http://thread.gmane.org/gmane.linux.kernel/1105168
      Reported-by: Marcel Holtmann <marcel@holtmann.org>
      Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vasiliy Kulikov <segoon@openwall.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e020836d
    • net: bpf: make eBPF interpreter images read-only · 60a3b225
      Daniel Borkmann committed
      With eBPF being extended further and exposure to user space on its way,
      hardening the memory range the interpreter uses to steer its command flow
      seems appropriate.  This patch moves the to be interpreted bytecode to
      read-only pages.
      
      In case we execute a corrupted BPF interpreter image for some reason e.g.
      caused by an attacker which got past a verifier stage, it would not only
      provide arbitrary read/write memory access but arbitrary function calls
      as well. After setting up the BPF interpreter image, its contents do not
      change until destruction time, thus we can set up the image on pages made
      immutable in order to mitigate modifications to that code. The idea
      is derived from commit 314beb9b ("x86: bpf_jit_comp: secure bpf jit
      against spraying attacks").
      
      This is possible because bpf_prog is not part of sk_filter anymore.
      After setup bpf_prog cannot be altered during its life-time. This prevents
      any modifications to the entire bpf_prog structure (incl. function/JIT
      image pointer).
      
      Every eBPF program (including migrated classic BPF programs) has to call
      bpf_prog_select_runtime() to select either interpreter or a JIT image
      as a last setup step, and they all are being freed via bpf_prog_free(),
      including non-JIT. Therefore, we can easily integrate this into the
      eBPF life-time, plus since we directly allocate a bpf_prog, we have no
      performance penalty.
      
      Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
      inspection of kernel_page_tables.  Brad Spengler proposed the same idea
      via Twitter during development of this patch.
      
      Joint work with Hannes Frederic Sowa.
      Suggested-by: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Kees Cook <keescook@chromium.org>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      60a3b225
  2. 05 Sep, 2014 3 commits
  3. 04 Sep, 2014 13 commits