  1. 11 Sep, 2015 6 commits
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 65c61bc5
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix out-of-bounds array access in netfilter ipset, from Jozsef
          Kadlecsik.
      
       2) Use correct free operation on netfilter conntrack templates, from
          Daniel Borkmann.
      
       3) Fix route leak in SCTP, from Marcelo Ricardo Leitner.
      
       4) Fix sizeof(pointer) in mac80211, from Thierry Reding.
      
       5) Fix cache pointer comparison in ip6mr leading to missed unlock of
          mrt_lock.  From Richard Laing.
      
       6) rds_conn_lookup() needs to consider network namespace in key
          comparison, from Sowmini Varadhan.
      
       7) Fix deadlock in TIPC code wrt broadcast link wakeups, from Kolmakov
          Dmitriy.
      
       8) Fix fd leaks in bpf syscall, from Daniel Borkmann.
      
       9) Fix error recovery when installing ipv6 multipath routes, we would
          delete the old route before we would know if we could fully commit
          to the new set of nexthops.  Fix from Roopa Prabhu.
      
      10) Fix run-time suspend problems in r8152, from Hayes Wang.
      
      11) In fec, don't program the MAC address into the chip when the clocks
          are gated off.  From Fugang Duan.
      
      12) Fix poll behavior for netlink sockets when using rx ring mmap, from
          Daniel Borkmann.
      
      13) Don't allocate memory with GFP_KERNEL from get_stats64 in r8169
          driver, from Corinna Vinschen.
      
      14) In TCP Cubic congestion control, handle idle periods better where we
          are application limited, in order to keep cwnd from growing out of
          control.  From Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
        tcp_cubic: better follow cubic curve after idle period
        tcp: generate CA_EVENT_TX_START on data frames
        xen-netfront: respect user provided max_queues
        xen-netback: respect user provided max_queues
        r8169: Fix sleeping function called during get_stats64, v2
        ether: add IEEE 1722 ethertype - TSN
        netlink, mmap: fix edge-case leakages in nf queue zero-copy
        netlink, mmap: don't walk rx ring on poll if receive queue non-empty
        cxgb4: changes for new firmware 1.14.4.0
        net: fec: add netif status check before set mac address
        r8152: fix the runtime suspend issues
        r8152: split DRIVER_VERSION
        ipv6: fix ifnullfree.cocci warnings
        add microchip LAN88xx phy driver
        stmmac: fix check for phydev being open
        net: qlcnic: delete redundant memsets
        net: mv643xx_eth: use kzalloc
        net: jme: use kzalloc() instead of kmalloc+memset
        net: cavium: liquidio: use kzalloc in setup_glist()
        net: ipv6: use common fib_default_rule_pref
        ...
    • E
      tcp_cubic: better follow cubic curve after idle period · 30927520
      Eric Dumazet committed
      Jana Iyengar found an interesting issue on CUBIC:
      
      The epoch is only updated/reset initially and when experiencing losses.
      The delta "t" of now - epoch_start can be arbitrarily large after app idle,
      as can the bic_target. Consequently the slope (inverse of
      ca->cnt) would be really large, and eventually ca->cnt would be
      lower-bounded in the end to 2 to have delayed-ACK slow-start behavior.
      
      This particularly shows up when slow_start_after_idle is disabled
      as a dangerous cwnd inflation (1.5 x RTT) after a few seconds of idle
      time.
      
      Jana's initial fix was to reset epoch_start if app limited,
      but Neal pointed out it would ask the CUBIC algorithm to recalculate the
      curve so that we again start growing steeply upward from where cwnd is
      now (as CUBIC does just after a loss). Ideally we'd want the cwnd growth
      curve to be the same shape, just shifted later in time by the amount of
      the idle period.
      Reported-by: Jana Iyengar <jri@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Sangtae Ha <sangtae.ha@gmail.com>
      Cc: Lawrence Brakmo <lawrence@brakmo.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
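The idea of shifting the CUBIC curve by the idle time can be sketched in plain C. This is a minimal userspace illustration under stated assumptions, not the kernel code: `struct cubic_state`, `cubic_on_tx_start()` and `cubic_elapsed()` are hypothetical names, time is a plain tick counter, and counter wraparound is ignored.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the CUBIC per-connection state. */
struct cubic_state {
    uint32_t epoch_start;   /* start of the current epoch in ticks, 0 = unset */
};

/*
 * Called when transmission restarts after an idle period.
 * Instead of resetting the epoch, move it forward by the idle time so the
 * growth curve keeps its shape and is only shifted later in time.
 */
static void cubic_on_tx_start(struct cubic_state *ca, uint32_t now,
                              uint32_t last_send)
{
    uint32_t idle = now - last_send;

    if (ca->epoch_start == 0 || idle == 0)
        return;

    ca->epoch_start += idle;
    /* Never let the epoch start lie in the future. */
    if (ca->epoch_start > now)
        ca->epoch_start = now;
}

/* Time spent on the curve so far: the "t" from the commit message. */
static uint32_t cubic_elapsed(const struct cubic_state *ca, uint32_t now)
{
    return now - ca->epoch_start;
}
```

With an epoch start at tick 100, a last send at tick 300 and a restart at tick 5300, the epoch moves to 5100, so the elapsed curve time stays at 200 ticks instead of jumping to 5200.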
    • N
      tcp: generate CA_EVENT_TX_START on data frames · 05c5a46d
      Neal Cardwell committed
      Issuing a CC TX_START event on control frames like pure ACK
      is a waste of time, as a CC should not care.
      
      The following patch needs this change, as we want CUBIC to properly track
      idle time at a low cost, with a single TX_START being generated.
      
      Yuchung might slightly refine the condition triggering TX_START
      in a follow-up patch.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Cc: Jana Iyengar <jri@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Sangtae Ha <sangtae.ha@gmail.com>
      Cc: Lawrence Brakmo <lawrence@brakmo.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • W
      xen-netfront: respect user provided max_queues · 32a84405
      Wei Liu committed
      Originally that parameter was always reset to num_online_cpus during
      module initialisation, which renders it useless.
      
      The fix is to only set max_queues to num_online_cpus when the user has not
      provided a value.
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Tested-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • W
      xen-netback: respect user provided max_queues · 4c82ac3c
      Wei Liu committed
      Originally that parameter was always reset to num_online_cpus during
      module initialisation, which renders it useless.
      
      The fix is to only set max_queues to num_online_cpus when the user has not
      provided a value.
      Reported-by: Johnny Strom <johnny.strom@linuxsolutions.fi>
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Acked-by: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
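The max_queues fix shared by the two Xen commits above comes down to treating "not provided" as a distinct case. A minimal sketch, assuming that an unset parameter reads as 0; `resolve_max_queues()` is a hypothetical helper, and the CPU count is passed in rather than taken from num_online_cpus():

```c
/*
 * Hypothetical helper: pick the effective queue limit.
 * user_max_queues == 0 is assumed to mean "the user gave no value".
 */
static unsigned int resolve_max_queues(unsigned int user_max_queues,
                                       unsigned int online_cpus)
{
    /* Only fall back to the CPU count when nothing was provided. */
    if (user_max_queues == 0)
        return online_cpus;

    /* Otherwise respect the user's choice instead of overwriting it. */
    return user_max_queues;
}
```

On an 8-CPU system, `resolve_max_queues(0, 8)` yields 8, while `resolve_max_queues(2, 8)` keeps the requested 2.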
    • C
      r8169: Fix sleeping function called during get_stats64, v2 · 42020320
      Corinna Vinschen committed
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104031
      Fixes: 6e85d5ad
      
      Based on the discussion starting at
      http://www.spinics.net/lists/netdev/msg342193.html
      
      Tested locally on RTL8168evl/8111evl with various concurrent processes
      accessing /proc/net/dev while changing the link state as well as
      removing/reloading the r8169 module.
      Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
      Tested-by: poma <pomidorabelisima@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 10 Sep, 2015 34 commits
    • H
      ether: add IEEE 1722 ethertype - TSN · 1ab1e895
      Henrik Austad committed
      IEEE 1722 describes AVB (later renamed to TSN - Time Sensitive
      Networking), a protocol, encapsulation and synchronization to utilize
      standard networks for audio/video (and later other time-sensitive)
      streams.
      
      This standard uses ethertype 0x22F0.
      
      http://standards.ieee.org/develop/regauth/ethertype/eth.txt
      
      This is a respin of a previous patch ("ether: add AVB frame type
      ETH_P_AVB")
      
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      CC: linux-api@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: Henrik Austad <henrik@austad.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      netlink, mmap: fix edge-case leakages in nf queue zero-copy · 6bb0fef4
      Daniel Borkmann committed
      When netlink mmap on receive side is the consumer of nf queue data,
      it can happen that in some edge cases, we write skb shared info into
      the user space mmap buffer:
      
      Assume a possible rx ring frame size of only 4096, and the network skb,
      which is being zero-copied into the netlink skb, contains page frags
      with an overall skb->len larger than the linear part of the netlink
      skb.
      
      skb_zerocopy(), which is generic and thus not aware of the fact that
      shared info cannot be accessed for such skbs then tries to write and
      fill frags, thus leaking kernel data/pointers and in some corner cases
      possibly writing out of bounds of the mmap area (when filling the
      last slot in the ring buffer this way).
      
      I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has
      an advertised length larger than 4096, where the linear part is visible
      at the slot beginning, and the leaked sizeof(struct skb_shared_info)
      has been written to the beginning of the next slot (also corrupting
      the struct nl_mmap_hdr slot header incl. status etc), since skb->end
      points to skb->data + ring->frame_size - NL_MMAP_HDRLEN.
      
      The fix adds and lets __netlink_alloc_skb() take the actual needed
      linear room for the network skb + meta data into account. It's completely
      irrelevant for non-mmaped netlink sockets, but in case mmap sockets
      are used, it can be decided whether the available skb_tailroom() is
      really large enough for the buffer, or whether it needs to internally
      fallback to a normal alloc_skb().
      
      From nf queue side, the information whether the destination port is
      an mmap RX ring is not really available without extra port-to-socket
      lookup, thus it can only be determined in lower layers i.e. when
      __netlink_alloc_skb() is called that checks internally for this. I
      chose to add the extra ldiff parameter as mmap will then still work:
      We have data_len and hlen in nfqnl_build_packet_message(), data_len
      is the full length (capped at queue->copy_range) for skb_zerocopy()
      and hlen some possible part of data_len that needs to be copied; the
      rem_len variable indicates the needed remaining linear mmap space.
      
      The only other workaround in nf queue internally would be after
      allocation time by e.g. capping the data_len to the skb_tailroom()
      iff we deal with an mmap skb, but that would 1) expose the fact that
      we use a mmap skb to upper layers, and 2) trim the skb where we
      otherwise could just have moved the full skb into the normal receive
      queue.
      
      After the patch, in my test case the ring slot doesn't fit and therefore
      shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and
      thus needs to be picked up via recv().
      
      Fixes: 3ab1f683 ("nfnetlink: add support for memory mapped netlink")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      netlink, mmap: don't walk rx ring on poll if receive queue non-empty · a66e3656
      Daniel Borkmann committed
      In case of netlink mmap, there can be situations where received frames
      have to be placed into the normal receive queue. The ring buffer indicates
      this through NL_MMAP_STATUS_COPY, so the user is asked to pick them up
      via recvmsg(2) syscall, and to put the slot back to NL_MMAP_STATUS_UNUSED.
      
      Commit 0ef70770 ("netlink: rx mmap: fix POLLIN condition") changed
      polling, so that we walk in the worst case the whole ring through the
      new netlink_has_valid_frame(), for example, when the ring would have no
      NL_MMAP_STATUS_VALID, but at least one NL_MMAP_STATUS_COPY frame.
      
      Since we do a datagram_poll() already earlier to pick up a mask that could
      possibly contain POLLIN | POLLRDNORM already (due to NL_MMAP_STATUS_COPY),
      we can skip checking the rx ring entirely.
      
      In case the kernel is compiled with !CONFIG_NETLINK_MMAP, then all this is
      irrelevant anyway as netlink_poll() is just defined as datagram_poll().
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
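The poll shortcut described in the commit above can be sketched as follows. This is a userspace model under stated assumptions, not the kernel's netlink_poll(): `MASK_IN`/`MASK_RDNORM` stand in for POLLIN/POLLRDNORM, `enum slot_status` for the NL_MMAP_STATUS_* values, and `ring_walks` exists only to make the saved work visible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for POLLIN / POLLRDNORM. */
#define MASK_IN       0x1u
#define MASK_RDNORM   0x2u
#define MASK_READABLE (MASK_IN | MASK_RDNORM)

/* Hypothetical stand-ins for the ring slot states. */
enum slot_status { SLOT_UNUSED, SLOT_VALID, SLOT_COPY };

static unsigned int ring_walks;   /* counts how often the ring is scanned */

/* Worst case this touches every slot of the ring. */
static bool ring_has_valid_frame(const enum slot_status *ring, size_t n)
{
    ring_walks++;
    for (size_t i = 0; i < n; i++)
        if (ring[i] == SLOT_VALID)
            return true;
    return false;
}

/*
 * queue_mask is what the ordinary receive-queue poll already reported.
 * If it is readable (for example because of a COPY frame waiting in the
 * normal queue), the ring does not need to be walked at all.
 */
static unsigned int mmap_poll(unsigned int queue_mask,
                              const enum slot_status *ring, size_t n)
{
    unsigned int mask = queue_mask;

    if ((mask & MASK_READABLE) != MASK_READABLE &&
        ring_has_valid_frame(ring, n))
        mask |= MASK_READABLE;
    return mask;
}
```

The ring scan only runs when the receive queue did not already make the socket readable, which is the case the commit optimises.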
    • H
      cxgb4: changes for new firmware 1.14.4.0 · f2be053c
      Hariprasad Shenai committed
      Incorporate fw_ldst_cmd structure change for new firmware and also
      update version string for the same.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • N
      net: fec: add netif status check before set mac address · 9638d19e
      Nimrod Andy committed
      The following sequence causes a system hang:
      ifconfig eth0 down
      ifconfig eth0 hw ether 00:10:19:19:81:19
      
      After eth0 goes down, all fec clocks are gated off. The .fec_set_mac_address()
      function then writes the new MAC address to the registers, which hangs the system.
      
      So add a netif status check to avoid register access while the clocks are
      gated off. The new MAC address is written to the related registers once eth0 is brought up.
      
      V2:
      At Lucas Stach's suggestion, add a comment in the code to explain why the check is needed.
      
      CC: Lucas Stach <l.stach@pengutronix.de>
      CC: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Fugang Duan <B38611@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
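The deferral described in the fec commit above can be modelled in a few lines. This is a sketch under stated assumptions rather than the driver code: `struct fake_nic` and its helpers are hypothetical, and a counter replaces real register access.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical model of a NIC whose registers need running clocks. */
struct fake_nic {
    bool running;               /* interface is up, clocks enabled */
    uint8_t dev_addr[MAC_LEN];  /* address stored in software */
    uint8_t hw_regs[MAC_LEN];   /* address as programmed into the chip */
    unsigned int reg_writes;    /* number of register programming passes */
};

/* Touches the (modelled) hardware; only safe while clocks are on. */
static void program_mac(struct fake_nic *nic)
{
    memcpy(nic->hw_regs, nic->dev_addr, MAC_LEN);
    nic->reg_writes++;
}

static void set_mac_address(struct fake_nic *nic, const uint8_t *addr)
{
    /* Always remember the new address in software. */
    memcpy(nic->dev_addr, addr, MAC_LEN);

    /* Interface down means clocks gated off: do not touch registers. */
    if (!nic->running)
        return;

    program_mac(nic);
}

/* Bringing the interface up programs whatever address is stored. */
static void bring_up(struct fake_nic *nic)
{
    nic->running = true;
    program_mac(nic);
}
```

Setting the address while the interface is down only updates the stored copy; the registers are written when the interface comes up.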
    • D
      Merge branch 'r8152-autoresume' · 1d68c286
      David S. Miller committed
      Hayes Wang says:
      
      ====================
      r8152: fix the autoresume may fail
      
      Fix the autosuspend issues that occur on link changes.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • H
      r8152: fix the runtime suspend issues · 2dd49e0f
      hayeswang committed
      Fix the runtime suspend issues that result from link changes.
      
      Case 1:
      a) link down occurs.
      b) driver disable tx/rx.
      c) autosuspend occurs.
      d) hw linking up.
      e) device suspends without enabling tx/rx.
      f) couldn't wake up when receiving packets.
      
      Case 2:
      a) Nway results in linking down.
      b) autosuspend occurs.
      c) device suspends.
      d) device may not wake up when linking up.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • H
      r8152: split DRIVER_VERSION · d0942473
      hayeswang committed
      Split DRIVER_VERSION into NETNEXT_VERSION and NET_VERSION. Then,
      according to the value of DRIVER_VERSION, we could know which
      patches are used generally without comparing the source code.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • W
      ipv6: fix ifnullfree.cocci warnings · 52fe51f8
      Wu Fengguang committed
      net/ipv6/route.c:2946:3-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.
      
       NULL check before some freeing functions is not needed.
      
       Based on checkpatch warning
       "kfree(NULL) is safe this check is probably not required"
       and kfreeaddr.cocci by Julia Lawall.
      
      Generated by: scripts/coccinelle/free/ifnullfree.cocci
      
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • W
      add microchip LAN88xx phy driver · 792aec47
      Woojung.Huh@microchip.com committed
      Add Microchip LAN88XX phy driver for phylib.
      Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      stmmac: fix check for phydev being open · dfc50fca
      Alexey Brodkin committed
      The current check of phydev with IS_ERR(phydev) makes little sense
      because of_phy_connect() returns NULL on failure instead of an error value.
      
      For checking the result of phy_connect(), however, IS_ERR() still makes perfect sense.
      
      So let's use the combined check IS_ERR_OR_NULL(), which covers both cases.
      
      Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
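Why IS_ERR() alone misses a NULL return can be shown with a small userspace re-creation of the error-pointer convention. The lower-case helpers below are illustrative stand-ins written for this sketch, not the kernel macros themselves; the constant 4095 mirrors the kernel's MAX_ERRNO.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* True only for pointers in the top MAX_ERRNO addresses. */
static inline bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Covers both failure conventions: NULL and encoded errno. */
static inline bool is_err_or_null(const void *ptr)
{
    return !ptr || is_err(ptr);
}
```

A NULL pointer is not in the error range, so `is_err()` reports success for it; only the combined check rejects both a NULL and an encoded errno such as `err_ptr(-19)`.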
    • R
      net: qlcnic: delete redundant memsets · 1f0ca208
      Rasmus Villemoes committed
      In all cases, mbx->req.arg and mbx->rsp.arg have just been allocated
      using kcalloc(), so these six memsets are redundant.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • R
      net: mv643xx_eth: use kzalloc · b66a6085
      Rasmus Villemoes committed
      The double memset is a little ugly; using kzalloc avoids it altogether.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • R
      net: jme: use kzalloc() instead of kmalloc+memset · e9b5ac27
      Rasmus Villemoes committed
      Using kzalloc saves a tiny bit on .text.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • R
      net: cavium: liquidio: use kzalloc in setup_glist() · ce8e5c70
      Rasmus Villemoes committed
      We save a little .text and get rid of the sizeof(...) style
      inconsistency.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
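The kzalloc cleanups above have a direct userspace analogue: replacing malloc() plus memset() with calloc(). The sketch below uses that analogue because kzalloc() is not available outside the kernel; `struct glist_entry` and both helper names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, used only for illustration. */
struct glist_entry {
    void *next;
    unsigned long dma_addr;
};

/* Old pattern: allocate, then zero in a second step. */
static struct glist_entry *alloc_entry_old(void)
{
    struct glist_entry *p = malloc(sizeof(struct glist_entry));

    if (!p)
        return NULL;
    memset(p, 0, sizeof(struct glist_entry));
    return p;
}

/*
 * New pattern: one call returns zeroed memory, and sizeof(*p) keeps the
 * size tied to the variable's type.
 */
static struct glist_entry *alloc_entry_new(void)
{
    struct glist_entry *p = calloc(1, sizeof(*p));

    return p;
}
```

Both helpers hand back a zero-filled object; the second does it with one call and no separate memset().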
    • P
      net: ipv6: use common fib_default_rule_pref · f53de1e9
      Phil Sutter committed
      This switches IPv6 policy routing to use the shared
      fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
      multicast routing for IPv4 as well as IPv6.
      
      The motivation for this patch is a complaint about iproute2 behaving
      inconsistently between IPv4 and IPv6 when adding policy rules: Formerly,
      IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
      assigned priority value was decreased with each rule added.
      
      Since then all users of the default_pref field have been converted to
      assign the generic function fib_default_rule_pref(), fib_nl_newrule()
      may just use it directly instead. Therefore get rid of the function
      pointer altogether and make fib_default_rule_pref() static, as it's not
      used outside fib_rules.c anymore.
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      net: ethoc: Remove unnecessary #ifdef CONFIG_OF · 444c5f92
      Tobias Klauser committed
      For !CONFIG_OF of_get_property() is defined to always return NULL. Thus
      there's no need to protect the call to of_get_property() with #ifdef
      CONFIG_OF.
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • F
      net: dsa: bcm_sf2: Fix 64-bits register writes · 03679a14
      Florian Fainelli committed
      The macro to write 64-bit quantities to the 32-bit registers swapped
      the value and offset arguments. We want to preserve the ordering of the
      arguments with respect to how writel() is implemented, for instance:
      value first, offset/base second.
      
      Fixes: 246d7f77 ("net: dsa: add Broadcom SF2 switch driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
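The "value first, offset second" convention from the bcm_sf2 commit above can be illustrated with a fake register file. This is a simplified sketch and not the driver's macro: `regs`, `reg_writel()` and `reg_writeq()` are hypothetical, and placing the low word at the lower offset is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical register file of four 32-bit words. */
static uint32_t regs[4];

/* Same argument order as writel(): value first, offset second. */
static void reg_writel(uint32_t val, unsigned int off)
{
    regs[off / 4] = val;
}

/*
 * 64-bit write built from two 32-bit writes, keeping the writel() order.
 * Assumption: low word at off, high word at off + 4.
 */
static void reg_writeq(uint64_t val, unsigned int off)
{
    reg_writel((uint32_t)val, off);
    reg_writel((uint32_t)(val >> 32), off + 4);
}
```

Because both helpers take the value first, a caller that swaps the two arguments writes the offset as data. Keeping one consistent order across the 32-bit and 64-bit accessors is what the fix restores.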
    • A
      bpf: fix out of bounds access in verifier log · 687f0715
      Alexei Starovoitov committed
      When the verifier log is enabled, print_bpf_insn() does
      bpf_alu_string[BPF_OP(insn->code) >> 4]
      and
      bpf_jmp_string[BPF_OP(insn->code) >> 4]
      where BPF_OP is a 4-bit instruction opcode.
      Malformed insns can cause out-of-bounds access.
      Fix it by sizing the arrays appropriately.
      
      The bug was found by clang address sanitizer with libfuzzer.
      Reported-by: Yonghong Song <yhs@plumgrid.com>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
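The sizing fix in the verifier-log commit above follows from the index range: a 4-bit opcode shifted down by four can be any value from 0 to 15, so the table needs 16 slots even if only some are named. A minimal sketch, with a shortened table; `alu_string`, `alu_op_name()` and the "?" fallback are assumptions of this example:

```c
#include <stddef.h>
#include <stdint.h>

/* The operation lives in the upper four bits of the opcode byte. */
#define BPF_OP(code) ((code) & 0xf0)

/*
 * Explicitly sized to 16 entries so every possible index is in bounds.
 * An array sized only by its initialisers would be shorter whenever the
 * highest opcodes have no name. Unnamed slots are NULL.
 */
static const char *const alu_string[16] = {
    [0x00 >> 4] = "+=",   /* BPF_ADD */
    [0x10 >> 4] = "-=",   /* BPF_SUB */
    [0x20 >> 4] = "*=",   /* BPF_MUL */
    [0x30 >> 4] = "/=",   /* BPF_DIV */
    [0x40 >> 4] = "|=",   /* BPF_OR  */
    [0x50 >> 4] = "&=",   /* BPF_AND */
};

/* Safe for any byte value, including malformed instructions. */
static const char *alu_op_name(uint8_t code)
{
    const char *s = alu_string[BPF_OP(code) >> 4];

    return s ? s : "?";
}
```

An opcode byte of 0x07 selects slot 0 and prints as "+=", while a malformed 0xf7 selects slot 15, which exists and falls back to "?".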
    • R
      ipv6: fix multipath route replace error recovery · 6b9ea5a6
      Roopa Prabhu committed
      Problem:
      The ecmp route replace support for ipv6 in the kernel deletes the
      existing ecmp route too early, i.e. when it installs the first nexthop.
      If there is an error in installing the subsequent nexthops, it's too late
      to recover the already deleted existing route, leaving the fib
      in an inconsistent state.
      
      This patch reduces the possibility of this by doing the following:
      a) Changes the existing multipath route add code to a two stage process:
        build rt6_infos + insert them
      	ip6_route_add rt6_info creation code is moved into
      	ip6_route_info_create.
      b) This ensures that most errors are caught during building rt6_infos
        and we fail early
      c) Separates multipath add and del code. Because add needs the special
        two stage mode in a) and delete essentially does not care.
      d) In any event if the code fails during inserting a route again, a
        warning is printed (This should be unlikely)
      
      Before the patch:
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      /* Try replacing the route with a duplicate nexthop */
      $ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
      fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
      swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
      RTNETLINK answers: File exists
      
      $ip -6 route show
      /* previously added ecmp route 3000:1000:1000:1000::2 disappears from
       * kernel */
      
      After the patch:
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      /* Try replacing the route with a duplicate nexthop */
      $ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
      fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
      swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
      RTNETLINK answers: File exists
      
      $ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      Fixes: 27596472 ("ipv6: fix ECMP route replacement")
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
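The two-stage approach from the multipath commit above (build everything first, touch the installed route only afterwards) can be sketched generically. This is an illustration under stated assumptions, not the kernel's route code: nexthops are plain integers, `struct route` and both functions are hypothetical, and rejecting duplicates during the build stage is a simplification chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NH 8

/* Hypothetical route: a small set of nexthop identifiers. */
struct route {
    int nexthops[MAX_NH];
    size_t count;
};

/* Stage 1: validate and build. Nothing installed is modified here. */
static bool build_route(struct route *out, const int *nh, size_t n)
{
    if (n == 0 || n > MAX_NH)
        return false;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < i; j++)
            if (nh[j] == nh[i])
                return false;   /* duplicate nexthop, fail early */
        out->nexthops[i] = nh[i];
    }
    out->count = n;
    return true;
}

/* Stage 2: replace the installed route only if stage 1 succeeded. */
static bool replace_route(struct route *installed, const int *nh, size_t n)
{
    struct route staged = { .count = 0 };

    if (!build_route(&staged, nh, n))
        return false;           /* old route is left untouched */

    *installed = staged;
    return true;
}
```

A replace request containing a duplicate nexthop fails during the build stage, so the previously installed route is still there afterwards, matching the "After the patch" output in the commit message.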
    • D
      ebpf: fix fd refcount leaks related to maps in bpf syscall · 592867bf
      Daniel Borkmann committed
      We may already have gotten a proper fd struct through fdget(), so
      whenever we return at the end of a map operation, we need to call
      fdput(). However, each map operation from syscall side first probes
      CHECK_ATTR() to verify that unused fields in the bpf_attr union are
      zero.
      
      In case of malformed input, we return with error, but the lookup to
      the map_fd was already performed at that time, so that we return
      without a corresponding fdput(). Fix it by performing an fdget()
      only right before bpf_map_get(). The fdget() invocation on maps in
      the verifier is not affected.
      
      Fixes: db20fd2b ("bpf: add lookup/update/delete/iterate methods to BPF maps")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
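The leak pattern in the bpf commit above is an ordering problem: a reference is taken before the input is validated, and the early error return skips the release. A minimal model, in which `fd_get()`, `fd_put()` and the `live_refs` counter are hypothetical stand-ins for fdget()/fdput() and the file reference count:

```c
#include <stdbool.h>

#define ERR_INVAL (-22)   /* stand-in for -EINVAL */

static int live_refs;     /* references currently held */

static void fd_get(void) { live_refs++; }
static void fd_put(void) { live_refs--; }

/* Buggy order: acquire first, validate second. */
static int map_op_buggy(bool attr_ok)
{
    fd_get();
    if (!attr_ok)
        return ERR_INVAL;   /* returns without fd_put(): reference leaks */
    fd_put();
    return 0;
}

/* Fixed order: validate first, acquire only right before use. */
static int map_op_fixed(bool attr_ok)
{
    if (!attr_ok)
        return ERR_INVAL;   /* nothing acquired yet, nothing to release */
    fd_get();
    /* ... the map operation would run here ... */
    fd_put();
    return 0;
}
```

With malformed input the fixed variant leaves `live_refs` unchanged, while the buggy variant leaves one reference behind per call.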
    • S
      RDS: verify the underlying transport exists before creating a connection · 74e98eb0
      Sasha Levin committed
      There was no verification that an underlying transport exists when creating
      a connection, this would cause dereferencing a NULL ptr.
      
      It might happen on sockets that weren't properly bound before attempting to
      send a message, which will cause a NULL ptr deref:
      
      [135546.047719] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
      [135546.051270] Modules linked in:
      [135546.051781] CPU: 4 PID: 15650 Comm: trinity-c4 Not tainted 4.2.0-next-20150902-sasha-00041-gbaa1222-dirty #2527
      [135546.053217] task: ffff8800835bc000 ti: ffff8800bc708000 task.ti: ffff8800bc708000
      [135546.054291] RIP: __rds_conn_create (net/rds/connection.c:194)
      [135546.055666] RSP: 0018:ffff8800bc70fab0  EFLAGS: 00010202
      [135546.056457] RAX: dffffc0000000000 RBX: 0000000000000f2c RCX: ffff8800835bc000
      [135546.057494] RDX: 0000000000000007 RSI: ffff8800835bccd8 RDI: 0000000000000038
      [135546.058530] RBP: ffff8800bc70fb18 R08: 0000000000000001 R09: 0000000000000000
      [135546.059556] R10: ffffed014d7a3a23 R11: ffffed014d7a3a21 R12: 0000000000000000
      [135546.060614] R13: 0000000000000001 R14: ffff8801ec3d0000 R15: 0000000000000000
      [135546.061668] FS:  00007faad4ffb700(0000) GS:ffff880252000000(0000) knlGS:0000000000000000
      [135546.062836] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [135546.063682] CR2: 000000000000846a CR3: 000000009d137000 CR4: 00000000000006a0
      [135546.064723] Stack:
      [135546.065048]  ffffffffafe2055c ffffffffafe23fc1 ffffed00493097bf ffff8801ec3d0008
      [135546.066247]  0000000000000000 00000000000000d0 0000000000000000 ac194a24c0586342
      [135546.067438]  1ffff100178e1f78 ffff880320581b00 ffff8800bc70fdd0 ffff880320581b00
      [135546.068629] Call Trace:
      [135546.069028] ? __rds_conn_create (include/linux/rcupdate.h:856 net/rds/connection.c:134)
      [135546.069989] ? rds_message_copy_from_user (net/rds/message.c:298)
      [135546.071021] rds_conn_create_outgoing (net/rds/connection.c:278)
      [135546.071981] rds_sendmsg (net/rds/send.c:1058)
      [135546.072858] ? perf_trace_lock (include/trace/events/lock.h:38)
      [135546.073744] ? lockdep_init (kernel/locking/lockdep.c:3298)
      [135546.074577] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.075508] ? __might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3795)
      [135546.076349] ? __might_fault (mm/memory.c:3795)
      [135546.077179] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.078114] sock_sendmsg (net/socket.c:611 net/socket.c:620)
      [135546.078856] SYSC_sendto (net/socket.c:1657)
      [135546.079596] ? SYSC_connect (net/socket.c:1628)
      [135546.080510] ? trace_dump_stack (kernel/trace/trace.c:1926)
      [135546.081397] ? ring_buffer_unlock_commit (kernel/trace/ring_buffer.c:2479 kernel/trace/ring_buffer.c:2558 kernel/trace/ring_buffer.c:2674)
      [135546.082390] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.083410] ? trace_event_raw_event_sys_enter (include/trace/events/syscalls.h:16)
      [135546.084481] ? do_audit_syscall_entry (include/trace/events/syscalls.h:16)
      [135546.085438] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.085515] rds_ib_laddr_check(): addr 36.74.25.172 ret -99 node type -1
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      xen-netback: require fewer guest Rx slots when not using GSO · 1d5d4852
      David Vrabel committed
      Commit f48da8b1 (xen-netback: fix unlimited guest Rx internal queue
      and carrier flapping) introduced a regression.
      
      The PV frontend in IPXE only places 4 requests on the guest Rx ring.
      Since netback required at least (MAX_SKB_FRAGS + 1) slots, IPXE could
      not receive any packets.
      
      a) If GSO is not enabled on the VIF, fewer guest Rx slots are required
         for the largest possible packet.  Calculate the required slots
         based on the maximum GSO size or the MTU.
      
         This calculation of the number of required slots relies on
         1650d545 (xen-netback: always fully coalesce guest Rx packets)
         which is present in 4.0-rc1 and later.
      
      b) Reduce the Rx stall detection to checking for at least one
         available Rx request.  This is fine since we're predominately
         concerned with detecting interfaces which are down and thus have
         zero available Rx requests.
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Reviewed-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
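Point a) of the commit above, fewer slots when GSO is off, can be sketched as a page-count calculation. This follows the description in the commit message only. The function and parameter names are hypothetical, and a 4096-byte page with one slot per page is an assumption of the example.

```c
#include <stdbool.h>

#define PAGE_SZ 4096u   /* assumed page size */

static unsigned int div_round_up(unsigned int a, unsigned int b)
{
    return (a + b - 1) / b;
}

/*
 * Slots needed for the largest packet the interface can receive:
 * bounded by the maximum GSO size when GSO is enabled, otherwise by
 * the MTU. One slot per page is assumed.
 */
static unsigned int rx_slots_needed(bool gso_enabled,
                                    unsigned int gso_max_size,
                                    unsigned int mtu)
{
    unsigned int largest = gso_enabled ? gso_max_size : mtu;

    return div_round_up(largest, PAGE_SZ);
}
```

For a 1500-byte MTU without GSO this gives a single slot, which fits within the 4 requests that the IPXE frontend places on the ring. With a 65536-byte maximum GSO size it gives 16.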
    • D
      Merge branch 'cxgb4-fixes' · 9b57ab8b
      David S. Miller committed
      Hariprasad Shenai says:
      
      ====================
      cxgb4: Fix tx flit calculation and wc stat configuration
      
      This patch series fixes the following:
      Patch 1/2 fixes tx flit calculation, which if wrong can lead to
      stall, hang, data corruption, write combining failure. Patch 2/2 fixes
      PCI-E write combining stats configuration.
      
      This patch series has been created against net tree and includes
      patches on cxgb4 driver.
      
      We have included all the maintainers of respective drivers. Kindly review
      the change and let us know in case of any review comments.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • H
      cxgb4: Fix for write-combining stats configuration · 2a485cf7
      Hariprasad Shenai committed
      The write-combining configuration register SGE_STAT_CFG_A needs to
      be configured after FW initializes the adapter, else FW will reset
      the configuration.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • H
      cxgb4: Fix tx flit calculation · fd1754fb
      Hariprasad Shenai 提交于
      Commit 0aac3f56 ("cxgb4: Add comment for calculate tx flits
      and sge length code") introduced a regression in the tx flit
      calculation, which can lead to data corruption, hangs, stalls and
      write-combining failure. Fix it.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fd1754fb
    • A
      net: eth: altera: Fix the initial device operstate · d43cefcd
      Atsushi Nemoto authored
      Call netif_carrier_off() prior to register_netdev(), otherwise
      userspace can see incorrect link state.
      Signed-off-by: Atsushi Nemoto <nemoto@toshiba-tops.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d43cefcd
    • L
      Merge tag 'tty-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · b8889c4f
      Linus Torvalds authored
      Pull tty driver reverts from Greg KH:
       "Here are some reverts for some tty patches (specifically the pl011
        driver) that ended up breaking a bunch of machines (i.e. almost all
        of the ones with this chip).
      
        People are working on a fix for this, but in the meantime, it's best
        to just revert all 5 patches to restore people's serial consoles.
      
        These reverts have been in linux-next for many days now"
      
      * tag 'tty-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "uart: pl011: Rename regs with enumeration"
        Revert "uart: pl011: Introduce register accessor"
        Revert "uart: pl011: Introduce register look up table"
        Revert "uart: pl011: Improve LCRH register access decision"
        Revert "uart: pl011: Add support to ZTE ZX296702 uart"
      b8889c4f
    • L
      Merge tag 'for-linus-20150909' of git://git.infradead.org/linux-mtd · fac33bfd
      Linus Torvalds authored
      Pull more MTD updates from Brian Norris:
       "There was one significant bug in my first pull request, fixed here.  I
        also threw in a few trivial ID additions and a small module rename.
      
        Details:
      
         - SPI NOR: bug fix for a "end of table" check that resulted in a NULL
           dereference in some cases
      
         - SPI NOR: a few new IDs / feature flags
      
         - OMAP2 NAND: rename module so it doesn't conflict with onenand
           omap2.ko"
      
      * tag 'for-linus-20150909' of git://git.infradead.org/linux-mtd:
        mtd: spi-nor: fix NULL dereference when no match found in spi_nor_ids[]
        mtd: spi-nor: s25sl064p supports both dual and quad I/O
        mtd: spi-nor: allow dual/quad reads on S25FL129P
        mtd: nand: omap2: Rename shippable module to omap2_nand
        mtd: spi-nor: Add support for sst25wf020a
        mtd: spi-nor: Add support for Micron n25q064a serial flash
      fac33bfd
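The SPI NOR "end of table" fix in this pull concerns a common pattern: walking a sentinel-terminated ID table and handling the no-match case. A minimal userspace sketch of that pattern follows. It is not the kernel's spi_nor_ids[] code; `struct flash_id`, `flash_ids`, both function names, and the ID values are hypothetical placeholders.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry; a NULL name marks the end of the table. */
struct flash_id {
    const char *name;
    unsigned int jedec_id;  /* illustrative value, not authoritative */
};

static const struct flash_id flash_ids[] = {
    { "n25q064a",    0x20bb17 },
    { "sst25wf020a", 0x621612 },
    { NULL, 0 },            /* sentinel */
};

/* Return the matching entry, or NULL once the walk reaches the sentinel. */
static const struct flash_id *flash_lookup(unsigned int jedec_id)
{
    const struct flash_id *id;

    for (id = flash_ids; id->name != NULL; id++) {
        if (id->jedec_id == jedec_id)
            return id;
    }
    return NULL;
}

/* The caller must test for NULL before touching the entry. */
static const char *flash_name_or_unknown(unsigned int jedec_id)
{
    const struct flash_id *id = flash_lookup(jedec_id);

    return id ? id->name : "unknown";
}
```

Returning NULL for "no match" keeps the sentinel entry from leaking to callers, who would otherwise dereference its NULL name.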
    • L
      Merge tag 'pwm/for-4.3-rc1' of... · 82278fc0
      Linus Torvalds authored
      Merge tag 'pwm/for-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm updates from Thierry Reding:
       "This set of changes introduces the beginnings of a new API that's
        based around the concept of states that can be atomically applied.
        Drivers go to various lengths to implement something similar, which
        indicates that the core should really be providing the necessary
        framework.
      
        On top of that, there is a bit of cleanup as well as improved
        kerneldoc and integration into the device-drivers DocBook.
      
        Regarding drivers there is a new one for the NXP LPC18xx family of
        SoCs and a couple of fixes for existing drivers (pca9685, Broadcom
        Kona and Atmel HLCDC)"
      
      * tag 'pwm/for-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
        ARM: at91: pwm: atmel-hlcdc: Add at91sam9n12 errata
        pwm: Add NXP LPC18xx PWM/SCT DT binding documentation
        pwm: NXP LPC18xx PWM/SCT driver
        pwm-pca9685: Support changing the output frequency
        pwm-pca9685: Fix several driver bugs
        pwm: kona: Modify settings application sequence
        pwm: pca9685: Drop owner assignment
        pwm: Add to device-drivers documentation
        pwm: Clean up kerneldoc
        pwm: Remove useless whitespace
        pwm: sysfs: Remove unnecessary padding
        pwm: sysfs: Properly convert from enum to string
        pwm: Make use of pwm_get_xxx() helpers where appropriate
        pwm: Add pwm_get_polarity() helper function
        pwm: Constify PWM device where possible
        pwm: Add the pwm_is_enabled() helper
      82278fc0
    • A
      fix ufs write vs readpage race when writing into a hole · bd2843fe
      Al Viro authored
      Followup to the UFS series - with the way we clear the new blocks (via
      buffer cache, possibly on more than a page worth of file) we really
      should not insert a reference to new block into inode block tree until
      after we'd cleared it.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bd2843fe
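The ordering rule in the message above (clear the new block first, make it reachable last) is an instance of initialize-before-publish. The sketch below is a userspace analogy using C11 atomics, not the UFS code, which works through the buffer cache and the inode block tree. `struct block_slot`, `fill_hole()`, `read_byte()`, and `BLK_SIZE` are all hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define BLK_SIZE 4096

/* Hypothetical stand-in for one slot in an inode's block tree. */
struct block_slot {
    _Atomic(unsigned char *) block;  /* NULL means the slot is a hole */
};

/* Allocate a block, clear it, and only then make it reachable. */
static int fill_hole(struct block_slot *slot)
{
    unsigned char *blk = malloc(BLK_SIZE);  /* contents are stale here */

    if (!blk)
        return -1;
    memset(blk, 0, BLK_SIZE);               /* step 1: clear */
    /* step 2: publish; release ordering keeps the clear before it */
    atomic_store_explicit(&slot->block, blk, memory_order_release);
    return 0;
}

/* A reader sees either a hole or a fully cleared block, never stale data. */
static int read_byte(struct block_slot *slot, size_t off)
{
    unsigned char *blk =
        atomic_load_explicit(&slot->block, memory_order_acquire);

    return blk ? blk[off] : 0;
}
```

If the two steps in `fill_hole()` were swapped, a concurrent `read_byte()` could observe the pointer before the `memset()` and return stale bytes, which is the kind of write vs readpage race the commit title names.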
    • L
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · daf0e1ed
      Linus Torvalds authored
      Pull virtio updates from Michael Tsirkin:
       "Virtio fixes and features for 4.3:
      
         - virtio-mmio can now be auto-loaded through acpi.
         - virtio blk supports extended partitions.
         - total memory is better reported when using virtio balloon with
           auto-deflate.
         - cache control is re-enabled when using virtio-blk in modern mode"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_balloon: do not change memory amount visible via /proc/meminfo
        virtio_ballon: change stub of release_pages_by_pfn
        virtio-blk: Allow extended partitions
        virtio_mmio: add ACPI probing
        virtio-blk: use VIRTIO_BLK_F_WCE and VIRTIO_BLK_F_CONFIG_WCE in virtio1
      daf0e1ed
    • L
      Merge tag 'metag-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag · 065d80b4
      Linus Torvalds authored
      Pull metag updates from James Hogan:
       "Metag architecture changes for v4.3.
      
        Just a couple of changes for v4.3-rc1.  A preparatory IRQ patch to
        prepare for moving irq_data struct members, and a tweak to
        Documentation/features since Meta2 could support THP"
      
      * tag 'metag-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
        Documentation/features/vm: Meta2 is capable of THP
        metag/irq: Use access helper irq_data_get_affinity_mask()
      065d80b4
    • L
      Merge tag 'nios2-v4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 · 949feacb
      Linus Torvalds authored
      Pull nios2 updates from Ley Foon Tan:
      
       - add defconfig and device tree for max 10 support
       - migrate to new 'set-state' interface for timer
       - fix unaligned handler
       - MAINTAINERS: update nios2 git repo
      
      * tag 'nios2-v4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
        nios2: add Max10 defconfig
        nios2: Add Max10 device tree
        MAINTAINERS: update nios2 git repo
        nios2: remove unused statistic counters
        nios2: fixed variable imm16 to s16
        nios2/time: Migrate to new 'set-state' interface
      949feacb