1. 27 Oct, 2015 5 commits
  2. 26 Oct, 2015 10 commits
  3. 23 Oct, 2015 16 commits
    • net: sysctl: fix a kmemleak warning · ce9d9b8e
      Li RongQing authored
      The buffer returned by register_sysctl() is stored in the net_header
      variable, but net_header is not used afterwards, so the compiler may
      optimise the variable out, which leads kmemleak to report the warning below:
      
      	comm "swapper/0", pid 1, jiffies 4294937448 (age 267.270s)
      	hex dump (first 32 bytes):
      	90 38 8b 01 c0 ff ff ff 00 00 00 00 01 00 00 00 .8..............
      	01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      	backtrace:
      	[<ffffffc00020f134>] create_object+0x10c/0x2a0
      	[<ffffffc00070ff44>] kmemleak_alloc+0x54/0xa0
      	[<ffffffc0001fe378>] __kmalloc+0x1f8/0x4f8
      	[<ffffffc00028e984>] __register_sysctl_table+0x64/0x5a0
      	[<ffffffc00028eef0>] register_sysctl+0x30/0x40
      	[<ffffffc00099c304>] net_sysctl_init+0x20/0x58
      	[<ffffffc000994dd8>] sock_init+0x10/0xb0
      	[<ffffffc0000842e0>] do_one_initcall+0x90/0x1b8
      	[<ffffffc000966bac>] kernel_init_freeable+0x218/0x2f0
      	[<ffffffc00070ed6c>] kernel_init+0x1c/0xe8
      	[<ffffffc000083bfc>] ret_from_fork+0xc/0x50
      	[<ffffffffffffffff>] 0xffffffffffffffff <<end check kmemleak>>
      
      Before fix, the objdump result on ARM64:
      0000000000000000 <net_sysctl_init>:
         0:   a9be7bfd        stp     x29, x30, [sp,#-32]!
         4:   90000001        adrp    x1, 0 <net_sysctl_init>
         8:   90000000        adrp    x0, 0 <net_sysctl_init>
         c:   910003fd        mov     x29, sp
        10:   91000021        add     x1, x1, #0x0
        14:   91000000        add     x0, x0, #0x0
        18:   a90153f3        stp     x19, x20, [sp,#16]
        1c:   12800174        mov     w20, #0xfffffff4                // #-12
        20:   94000000        bl      0 <register_sysctl>
        24:   b4000120        cbz     x0, 48 <net_sysctl_init+0x48>
        28:   90000013        adrp    x19, 0 <net_sysctl_init>
        2c:   91000273        add     x19, x19, #0x0
        30:   9101a260        add     x0, x19, #0x68
        34:   94000000        bl      0 <register_pernet_subsys>
        38:   2a0003f4        mov     w20, w0
        3c:   35000060        cbnz    w0, 48 <net_sysctl_init+0x48>
        40:   aa1303e0        mov     x0, x19
        44:   94000000        bl      0 <register_sysctl_root>
        48:   2a1403e0        mov     w0, w20
        4c:   a94153f3        ldp     x19, x20, [sp,#16]
        50:   a8c27bfd        ldp     x29, x30, [sp],#32
        54:   d65f03c0        ret
      After:
      0000000000000000 <net_sysctl_init>:
         0:   a9bd7bfd        stp     x29, x30, [sp,#-48]!
         4:   90000000        adrp    x0, 0 <net_sysctl_init>
         8:   910003fd        mov     x29, sp
         c:   a90153f3        stp     x19, x20, [sp,#16]
        10:   90000013        adrp    x19, 0 <net_sysctl_init>
        14:   91000000        add     x0, x0, #0x0
        18:   91000273        add     x19, x19, #0x0
        1c:   f90013f5        str     x21, [sp,#32]
        20:   aa1303e1        mov     x1, x19
        24:   12800175        mov     w21, #0xfffffff4                // #-12
        28:   94000000        bl      0 <register_sysctl>
        2c:   f9002260        str     x0, [x19,#64]
        30:   b40001a0        cbz     x0, 64 <net_sysctl_init+0x64>
        34:   90000014        adrp    x20, 0 <net_sysctl_init>
        38:   91000294        add     x20, x20, #0x0
        3c:   9101a280        add     x0, x20, #0x68
        40:   94000000        bl      0 <register_pernet_subsys>
        44:   2a0003f5        mov     w21, w0
        48:   35000080        cbnz    w0, 58 <net_sysctl_init+0x58>
        4c:   aa1403e0        mov     x0, x20
        50:   94000000        bl      0 <register_sysctl_root>
        54:   14000004        b       64 <net_sysctl_init+0x64>
        58:   f9402260        ldr     x0, [x19,#64]
        5c:   94000000        bl      0 <unregister_sysctl_table>
        60:   f900227f        str     xzr, [x19,#64]
        64:   2a1503e0        mov     w0, w21
        68:   f94013f5        ldr     x21, [sp,#32]
        6c:   a94153f3        ldp     x19, x20, [sp,#16]
        70:   a8c37bfd        ldp     x29, x30, [sp],#48
        74:   d65f03c0        ret
      
      Add error handling that frees net_header on failure; this also removes
      the kmemleak warning.
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
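The error path visible in the "After" disassembly above can be sketched in userspace C. This is a rough illustration only: the `*_stub` functions, `fake_header`, and `pernet_result` are hypothetical stand-ins for the kernel APIs and are not taken from the patch, and the later register_sysctl_root() call is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel objects named in the commit. */
struct ctl_table_header { int registered; };

static struct ctl_table_header fake_header;
static int pernet_result;   /* knob: value returned by register_pernet_subsys_stub() */

static struct ctl_table_header *register_sysctl_stub(void)
{
    fake_header.registered = 1;
    return &fake_header;
}

static void unregister_sysctl_table_stub(struct ctl_table_header *h)
{
    h->registered = 0;
}

static int register_pernet_subsys_stub(void)
{
    return pernet_result;
}

/* File-scope pointer: the returned header stays referenced after init. */
static struct ctl_table_header *net_header;

static int net_sysctl_init_sketch(void)
{
    int ret = -ENOMEM;

    net_header = register_sysctl_stub();
    if (!net_header)
        return ret;

    ret = register_pernet_subsys_stub();
    if (ret) {
        /* Roll back: unregister the table and clear the stale pointer. */
        unregister_sysctl_table_stub(net_header);
        net_header = NULL;
    }
    return ret;
}
```

Setting `pernet_result` to a negative errno before calling `net_sysctl_init_sketch()` exercises the rollback branch; with `pernet_result` at 0 the header stays registered and referenced.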
    • ppp: fix pppoe_dev deletion condition in pppoe_release() · 1acea4f6
      Guillaume Nault authored
      We can't rely on PPPOX_ZOMBIE to decide whether to clear po->pppoe_dev.
      PPPOX_ZOMBIE can be set by pppoe_disc_rcv() even when po->pppoe_dev is
      NULL. So we have no guarantee that (sk->sk_state & PPPOX_ZOMBIE) implies
      (po->pppoe_dev != NULL).
      Since we're releasing a PPPoE socket, we want to release the pppoe_dev
      if it exists and reset sk_state to PPPOX_DEAD, no matter the previous
      value of sk_state. So we can just check for po->pppoe_dev and avoid any
      assumption on sk->sk_state.
      
      Fixes: 2b018d57 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • af_key: fix two typos · f6b8dec9
      Li RongQing authored
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • amd-xgbe: Use wmb before updating current descriptor count · 20a41fba
      Lendacky, Thomas authored
      The code currently uses the lightweight dma_wmb barrier before updating
      the current descriptor count. Under heavy load, the Tx cleanup routine
      was seeing the updated current descriptor count before the updated
      descriptor information. As a result, the Tx descriptor was being cleaned
      up before it was used because it was not "owned" by the hardware yet,
      resulting in a Tx queue hang.
      
      Using the wmb barrier ensures that the descriptor is updated before the
      descriptor counter, preventing the Tx queue hang. For extra insurance,
      the Tx cleanup routine is changed to grab the current descriptor count on
      entry and use that initial value in the processing loop rather than
      trying to chase the current value.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Tested-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
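The ordering requirement described in this commit (descriptor contents visible before the count, and a count snapshot taken once on cleanup entry) can be illustrated with a userspace C11 analogue. This is not the driver code: the ring layout, `ring_xmit`, `ring_clean`, and `RING_SIZE` are hypothetical, and a release/acquire pair stands in for the kernel barrier as an assumption of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u   /* hypothetical ring size, power of two */

struct desc {
    unsigned int data;
    bool owned_by_hw;   /* true until the (simulated) hardware is done with it */
};

struct ring {
    struct desc desc[RING_SIZE];
    atomic_uint cur;      /* next descriptor the transmit path will fill */
    unsigned int dirty;   /* next descriptor the cleanup path will reclaim */
};

/* Transmit path: fill the descriptor first, then publish the new count. */
static void ring_xmit(struct ring *r, unsigned int data)
{
    unsigned int cur = atomic_load_explicit(&r->cur, memory_order_relaxed);
    struct desc *d = &r->desc[cur & (RING_SIZE - 1)];

    d->data = data;
    d->owned_by_hw = true;

    /* Release store: the descriptor writes above become visible no later
     * than the updated count (userspace stand-in for the barrier). */
    atomic_store_explicit(&r->cur, cur + 1, memory_order_release);
}

/* Cleanup path: snapshot the count once on entry instead of chasing it.
 * Returns the number of descriptors reclaimed. */
static unsigned int ring_clean(struct ring *r)
{
    unsigned int cur = atomic_load_explicit(&r->cur, memory_order_acquire);
    unsigned int cleaned = 0;

    while (r->dirty != cur) {
        struct desc *d = &r->desc[r->dirty & (RING_SIZE - 1)];

        if (d->owned_by_hw)
            break;          /* hardware has not finished with this one yet */
        r->dirty++;
        cleaned++;
    }
    return cleaned;
}
```

In this sketch a caller simulates hardware completion by clearing `owned_by_hw` on a descriptor before calling `ring_clean()`. Without the ordering, the cleanup side could see the new count while still reading a stale `owned_by_hw` value, which is the premature cleanup the commit message describes.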
    • net/phy: micrel: Add workaround for bad autoneg · d2fd719b
      Nathan Sullivan authored
      Very rarely, the KSZ9031 will appear to complete autonegotiation, but
      will drop all traffic afterwards.  When this happens, the idle error
      count will read 0xFF after autonegotiation completes.  Reset the PHY
      when in that state.
      Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipv6-overflow-arith' · ec3661b4
      David S. Miller authored
      Hannes Frederic Sowa says:
      
      ====================
      overflow-arith: begin to add support for overflow builtins functions
      
      I add a new header, linux/overflow-arith.h, as the central place to add
      overflow and wrap-around checking functions. The reason I am doing so
      is that it can make use of compiler supported builtin functions which
      can leverage hardware.
      
      As I need this for a fix in the ipv6 stack, which is also included in
      this series, I propose to add it sooner than later over Davem's net
      tree. This is also the reason why I start slowly with only the one
      function I need at this time.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: protect mtu calculation of wrap-around and infinite loop by rounding issues · b72a2b01
      Hannes Frederic Sowa authored
      Raw sockets with hdrincl enabled can insert ipv6 extension headers
      right into the data stream. In case we need to fragment those packets,
      we reparse the options header to find the place where we can insert
      the fragment header. If the extension headers exceed the link's MTU we
      actually cannot make progress in such a case.
      
      Instead of ending up in broken arithmetic or rounding towards 0 and
      entering an endless loop in ip6_fragment, just prevent those cases by
      aborting early and signal -EMSGSIZE to user space.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
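The two failure modes named in this commit, wrap-around in the subtraction and rounding down to zero, can be shown with a small standalone calculation. This is a minimal sketch and not the logic of ip6_fragment: `frag_payload_len` is a hypothetical helper, and the only protocol facts assumed are that the IPv6 fragment header is 8 bytes and that non-final fragment payloads are multiples of 8 bytes.

```c
#include <assert.h>
#include <errno.h>

/* Returns the usable payload bytes per fragment, or -EMSGSIZE when the
 * headers leave no room to make progress. */
static int frag_payload_len(unsigned int mtu, unsigned int hlen)
{
    const unsigned int frag_hdr_len = 8;   /* IPv6 fragment header size */
    unsigned int room;

    /* Check before subtracting so the unsigned arithmetic cannot wrap. */
    if (mtu <= frag_hdr_len || hlen >= mtu - frag_hdr_len)
        return -EMSGSIZE;

    room = mtu - frag_hdr_len - hlen;
    room &= ~7u;                           /* round down to a multiple of 8 */

    /* Rounding to zero would mean a loop that never consumes any data. */
    if (room == 0)
        return -EMSGSIZE;

    return (int)room;
}
```

For example, `frag_payload_len(1280, 40)` leaves 1232 bytes per fragment, while `frag_payload_len(1280, 1270)` has only 2 bytes of room, rounds to zero, and is rejected with -EMSGSIZE.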
    • overflow-arith: begin to add support for overflow builtin functions · 79907146
      Hannes Frederic Sowa authored
      The idea of the overflow-arith.h header is to collect overflow checking
      functions in one central place.
      
      If gcc compiler supports the __builtin_overflow_* builtins we use them
      because they might give better performance, otherwise the code falls
      back to normal overflow checking functions.
      
      The builtin_overflow functions are supported by gcc-5 and clang. The
      matter of supporting clang is to just provide a corresponding
      CC_HAVE_BUILTIN_OVERFLOW, because the specific overflow checking builtins
      don't differ between gcc and clang.
      
      I just provide overflow_usub function here as I intend this to get merged
      into net, more functions will definitely follow as they are needed.
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
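The builtin-with-fallback pattern described in this commit might look roughly like the following. This is a sketch, not the contents of the header: `overflow_usub_sketch` and `SKETCH_HAVE_BUILTIN_OVERFLOW` are hypothetical names (the commit itself refers to CC_HAVE_BUILTIN_OVERFLOW), and the compiler check is a simplification assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this sketch: gcc >= 5 and clang provide the builtin. */
#if defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 5)
#define SKETCH_HAVE_BUILTIN_OVERFLOW 1
#endif

/* Computes a - b into *res; returns true if the subtraction wrapped. */
static inline bool overflow_usub_sketch(unsigned int a, unsigned int b,
                                        unsigned int *res)
{
#ifdef SKETCH_HAVE_BUILTIN_OVERFLOW
    return __builtin_usub_overflow(a, b, res);
#else
    /* Fallback: unsigned subtraction wraps, and a wrapped result is
     * always larger than the minuend. */
    *res = a - b;
    return *res > a;
#endif
}
```

A caller checks the return value before trusting `*res`, for example rejecting a length computation when `overflow_usub_sketch(mtu, hlen, &room)` returns true.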
    • tcp: allow dctcp alpha to drop to zero · c80dbe04
      Andrew Shewmaker authored
      If alpha is strictly reduced by alpha >> dctcp_shift_g and if alpha is less
      than 1 << dctcp_shift_g, then alpha may never reach zero. For example,
      given shift_g=4 and alpha=15, alpha >> dctcp_shift_g yields 0 and alpha
      remains 15. The effect isn't noticeable in this case below cwnd=137, but
      could gradually drive uncongested flows with leftover alpha down to
      cwnd=137. A larger dctcp_shift_g would have a greater effect.
      
      This change causes alpha=15 to drop to 0 instead of being decremented by 1
      as it would when alpha=16. However, it requires one less conditional to
      implement since it doesn't have to guard against subtracting 1 from 0U. A
      decay of 15 is not unreasonable since an equal or greater amount occurs at
      alpha >= 240.
      Signed-off-by: Andrew G. Shewmaker <agshew@gmail.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
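The arithmetic in this commit message can be checked with a small standalone function. This is a minimal sketch of the decay step only, under the assumption that the decrement is alpha >> shift_g unless that is zero, in which case all of alpha is subtracted; `dctcp_alpha_decay` is a hypothetical name, and the part of the real update that adds newly observed congestion is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Decay step only: subtract alpha >> shift_g, or all of alpha when the
 * shifted value would be zero, so small values can reach zero. */
static uint32_t dctcp_alpha_decay(uint32_t alpha, unsigned int shift_g)
{
    uint32_t dec = alpha >> shift_g;

    if (dec == 0)
        dec = alpha;   /* alpha < (1 << shift_g): drop straight to zero */

    return alpha - dec;
}
```

With shift_g=4, alpha=15 decays to 0, alpha=16 decays to 15, and alpha=240 decays by 15 to 225, matching the figures quoted in the message. No separate guard for alpha=0 is needed, because the decrement is then also 0.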
    • ipv6: fix the incorrect return value of throw route · ab997ad4
      lucien authored
      The error condition -EAGAIN, which is signaled by throw routes, tells
      the rules framework to walk on searching for next matches. If the walk
      ends and we stop walking the rules with the result of a throw route we
      have to translate the error conditions to -ENETUNREACH.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
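The translation described in this commit can be sketched as a rule walk over precomputed per-rule results. This is an illustration under stated assumptions, not the kernel's rules framework: `walk_rules` and its array-of-results input are hypothetical, and returning -ENETUNREACH for an empty rule list is a choice made only for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Each entry is one rule's lookup result: 0 for a match, -EAGAIN for a
 * throw route ("keep searching"), or another negative errno. */
static int walk_rules(const int *results, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (results[i] != -EAGAIN)
            return results[i];   /* match or hard error ends the walk */
    }

    /* The walk ended on a throw: -EAGAIN must not leak to the caller. */
    return -ENETUNREACH;
}
```

A walk over `{-EAGAIN, 0}` returns 0 because the second rule matches, while a walk over `{-EAGAIN, -EAGAIN}` returns -ENETUNREACH rather than -EAGAIN.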
    • macvtap: unbreak receiving of gro skb with frag list · f23d538b
      Jason Wang authored
      We don't have frag list support in TAP_FEATURES. This leads to
      software segmentation of GRO skbs with a frag list. Fix this by adding
      frag list support to TAP_FEATURES.
      
      With this patch, single-session netperf receive throughput was restored
      from about 5Gb/s to about 12Gb/s on mlx4.
      
      Fixes a567dd62 ("macvtap: simplify usage of tap_features")
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: Fix egress tunnel info. · fc4099f1
      Pravin B Shelar authored
      While transitioning to netdev based vport we broke OVS
      feature which allows user to retrieve tunnel packet egress
      information for lwtunnel devices.  Following patch fixes it
      by introducing ndo operation to get the tunnel egress info.
      Same ndo operation can be used for lwtunnel devices and compat
      ovs-tnl-vport devices. So after adding such device operation
      we can remove similar operation from ovs-vport.
      
      Fixes: 614732ea ("openvswitch: Use regular VXLAN net_device device").
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 0c472b9b
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2015-10-22
      
      This series contains fixes to i40e only.
      
      Jesse provides two small fixes for i40e, first fixes counters that were
      being displayed incorrectly due to indexing beyond the array of strings
      when printing stats.  Then fixed the fact that the driver was printing
      a message about not being able to assign VMDq because a lack of MSI-X
      vectors, when it was not true.  It was due to a line missing that
      initialized a variable.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VSOCK: Fix lockdep issue. · 8566b86a
      Jorgen Hansen authored
      The recent fix for the vsock sock_put issue used the wrong
      initializer for the transport spin_lock causing an issue when
      running with lockdep checking.
      
      Testing: Verified fix on kernel with lockdep enabled.
      Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • i40e: fix annoying message · e9e53662
      Jesse Brandeburg authored
      The driver was printing a message about not being able
      to assign VMDq because of a lack of MSI-X vectors.
      
      This was because a line was missing that initialized a variable,
      simply a merge error.
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: fix stats offsets · 74a6c665
      Jesse Brandeburg authored
      The code was setting up stats that were not being initialized.
      This caused several counters to be displayed incorrectly, due
      to indexing beyond the array of strings when printing stats.
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  4. 22 Oct, 2015 9 commits
    • qmi_wwan: add Sierra Wireless MC74xx/EM74xx · 0db65fcf
      Bjørn Mork authored
      New device IDs shamelessly lifted from the vendor driver.
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 199c6550
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2015-10-22
      
      1) Fix IPsec pre-encap fragmentation for GSO packets.
         From Herbert Xu.
      
      2) Fix some header checks in _decode_session6.
         We skip the header information if the data pointer already points
         behind the header in question for some protocols.
         This is because we call pskb_may_pull with a negative value
         converted to unsigned int in this case.
         Skipping the header information can lead to incorrect policy
         lookups. From Mathias Krause.
      
      3) Allow to change the replay threshold and expiry timer of a
         state without having to set other attributes like replay
         counter and byte lifetime. Changing these other attributes
         may break the SA. From Michael Rossberg.
      
      4) Fix pmtu discovery for locally generated packets.
         We may fail to dispatch to the inner address family.
         As a result, the local error handler is not called
         and the mtu value is not reported back to userspace.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set · d46a9d67
      David Ahern authored
      741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
      adds the RT6_LOOKUP_F_IFACE flag to make device index mismatch fatal if
      oif is given. Hajime reported that this change breaks the Mobile IPv6
      use case that wants to force the message through one interface yet use
      the source address from another interface. Handle this case by only
      adding the flag if oif is set and saddr is not set.
      
      Fixes: 741a11d9 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set")
      Cc: Hajime Tazaki <thehajime@gmail.com>
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
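The condition described in this commit reduces to a two-input check, which can be sketched as follows. This is an illustration only: `lookup_flags` is a hypothetical helper and `SKETCH_LOOKUP_F_IFACE` is a placeholder value, not the kernel's definition of RT6_LOOKUP_F_IFACE.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_LOOKUP_F_IFACE 0x1   /* placeholder value for this sketch */

/* Require a strict interface match only when an output interface is given
 * and no source address has been set. */
static int lookup_flags(int flags, int oif, bool saddr_set)
{
    if (oif && !saddr_set)
        flags |= SKETCH_LOOKUP_F_IFACE;

    return flags;
}
```

In the Mobile IPv6 case from the message, both an output interface and a source address from another interface are supplied, so `lookup_flags(0, oif, true)` leaves the strict-match flag clear.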
    • Merge branch 'isdn-null-deref' · 92a93fd5
      David S. Miller authored
      Karsten Keil says:
      
      ====================
      Fix potential NULL pointer access and memory leak in ISDN layer2 functions
      
      Insu Yun brought up the issue of not checking the skb_clone() return
      value in the layer2 I-frame ull functions.
      This series fixes the issue in a way that avoids protocol violations/data
      loss on a temporary memory shortage.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mISDN: fix OOM condition for sending queued I-Frames · c96356a9
      Karsten Keil authored
      The old code did not check the return value of skb_clone().
      The extra skb_clone() is not needed at all, if using skb_realloc_headroom()
      instead, which gives us a private copy with enough headroom as well.
      We need to requeue the original skb if the call failed, because we cannot
      inform upper layers about the data loss. Restructure the code to minimise
      rollback effort if it happens.
      This fixes kernel bug #86091.
      
      Thanks to Insu Yun <wuninsu@gmail.com> for reminding me of this issue.
      Signed-off-by: Karsten Keil <keil@b1-systems.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ISDN: fix OOM condition for sending queued I-Frames · c7a7c95e
      Karsten Keil authored
      The skb_clone() return value was not checked and the skb_realloc_headroom()
      usage was wrong, the old skb was not freed. It turned out, that the
      skb_clone is not needed at all, the skb_realloc_headroom() will create a
      private copy with enough headroom and the original SKB can be used for the
      ACK queue.
      We need to requeue the original skb if the call failed, since the upper
      layer cannot be informed about memory shortage.
      
      Thanks to Insu Yun <wuninsu@gmail.com> for reminding me of this issue.
      Signed-off-by: Karsten Keil <keil@b1-systems.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VSOCK: sock_put wasn't safe to call in interrupt context · 4ef7ea91
      Jorgen Hansen authored
      In the vsock vmci_transport driver, sock_put wasn't safe to call
      in interrupt context, since that may call the vsock destructor
      which in turn calls several functions that should only be called
      from process context. This change defers the calling of these
      functions to a worker thread. All of these functions deallocate
      resources related to the transport itself.
      
      Furthermore, an unused callback was removed to simplify the
      cleanup.
      
      Multiple customers have been hitting this issue when using
      VMware tools on vSphere 2015.
      
      Also added a version to the vmci transport module (starting from
      1.0.2.0-k since up until now it appears that this module was
      sharing version with vsock that is currently at 1.0.1.0-k).
      Reviewed-by: Aditya Asarwade <asarwade@vmware.com>
      Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: fix locking around NETLINK_LIST_MEMBERSHIPS · 47191d65
      David Herrmann authored
      Currently, NETLINK_LIST_MEMBERSHIPS grabs the netlink table while copying
      the membership state to user-space. However, grabbing the netlink table is
      effectively a write_lock_irq(), and as such we should not be triggering
      page-faults in the critical section.
      
      This can be easily reproduced by the following snippet:
          int s = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
          void *p = mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
          int r = getsockopt(s, 0x10e, 9, p, (void*)((char*)p + 4092));
      
      This should work just fine, but currently triggers EFAULT and a possible
      WARN_ON below handle_mm_fault().
      
      Fix this by reducing locking of NETLINK_LIST_MEMBERSHIPS to a read-side
      lock. The write-lock was overkill in the first place, and the read-lock
      allows page-faults just fine.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: dp83848: Add TI DP83848 Ethernet PHY · 34e45ad9
      Andrew F. Davis authored
      Add support for the TI DP83848 Ethernet PHY device.
      
      The DP83848 is a highly reliable, feature rich, IEEE 802.3 compliant
      single port 10/100 Mb/s Ethernet Physical Layer Transceiver supporting
      the MII and RMII interfaces.
      Signed-off-by: Andrew F. Davis <afd@ti.com>
      Signed-off-by: Dan Murphy <dmurphy@ti.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Dan Murphy <dmurphy@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>