1. 17 12月, 2015 1 次提交
    • H
      rhashtable: Fix walker list corruption · c6ff5268
      Herbert Xu 提交于
      The commit ba7c95ea ("rhashtable:
      Fix sleeping inside RCU critical section in walk_stop") introduced
      a new spinlock for the walker list.  However, it did not convert
      all existing users of the list over to the new spin lock.  Some
      continued to use the old mutext for this purpose.  This obviously
      led to corruption of the list.
      
      The fix is to use the spin lock everywhere where we touch the list.
      
      This also allows us to do rcu_rad_lock before we take the lock in
      rhashtable_walk_start.  With the old mutex this would've deadlocked
      but it's safe with the new spin lock.
      
      Fixes: ba7c95ea ("rhashtable: Fix sleeping inside RCU...")
      Reported-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c6ff5268
  2. 16 12月, 2015 17 次提交
  3. 15 12月, 2015 12 次提交
    • V
      skbuff: Fix offset error in skb_reorder_vlan_header · f6548615
      Vlad Yasevich 提交于
      skb_reorder_vlan_header is called after the vlan header has
      been pulled.  As a result the offset of the begining of
      the mac header has been incrased by 4 bytes (VLAN_HLEN).
      When moving the mac addresses, include this incrase in
      the offset calcualation so that the mac addresses are
      copied correctly.
      
      Fixes: a6e18ff1 (vlan: Fix untag operations of stacked vlans with REORDER_HEADER off)
      CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      CC: Patrick McHardy <kaber@trash.net>
      Signed-off-by: NVladislav Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f6548615
    • W
    • K
      ravb: Add disable 10base · 54499969
      Kazuya Mizuguchi 提交于
      Ethernet AVB does not support 10 Mbps transfer speed.
      Signed-off-by: NKazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
      Signed-off-by: NYoshihiro Kaneko <ykaneko0929@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      54499969
    • S
      sh_eth: fix descriptor access endianness · 1299653a
      Sergei Shtylyov 提交于
      The driver never  calls cpu_to_edmac() when writing the descriptor address
      and edmac_to_cpu() when reading it, although it should -- fix this.
      
      Note that the frame/buffer length descriptor field accesses also need fixing
      but since they are both 16-bit we can't  use {cpu|edmac}_to_{edmac|cpu}()...
      Signed-off-by: NSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1299653a
    • S
      sh_eth: fix TX buffer byte-swapping · 3e230993
      Sergei Shtylyov 提交于
      For the little-endian SH771x kernels the driver has to byte-swap the RX/TX
      buffers,  however yet unset physcial address from the TX descriptor is used
      to call sh_eth_soft_swap(). Use 'skb->data' instead...
      
      Fixes: 31fcb99d ("net: sh_eth: remove __flush_purge_region")
      Signed-off-by: NSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3e230993
    • E
      net: fix IP early demux races · 5037e9ef
      Eric Dumazet 提交于
      David Wilder reported crashes caused by dst reuse.
      
      <quote David>
        I am seeing a crash on a distro V4.2.3 kernel caused by a double
        release of a dst_entry.  In ipv4_dst_destroy() the call to
        list_empty() finds a poisoned next pointer, indicating the dst_entry
        has already been removed from the list and freed. The crash occurs
        18 to 24 hours into a run of a network stress exerciser.
      </quote>
      
      Thanks to his detailed report and analysis, we were able to understand
      the core issue.
      
      IP early demux can associate a dst to skb, after a lookup in TCP/UDP
      sockets.
      
      When socket cache is not properly set, we want to store into
      sk->sk_dst_cache the dst for future IP early demux lookups,
      by acquiring a stable refcount on the dst.
      
      Problem is this acquisition is simply using an atomic_inc(),
      which works well, unless the dst was queued for destruction from
      dst_release() noticing dst refcount went to zero, if DST_NOCACHE
      was set on dst.
      
      We need to make sure current refcount is not zero before incrementing
      it, or risk double free as David reported.
      
      This patch, being a stable candidate, adds two new helpers, and use
      them only from IP early demux problematic paths.
      
      It might be possible to merge in net-next skb_dst_force() and
      skb_dst_force_safe(), but I prefer having the smallest patch for stable
      kernels : Maybe some skb_dst_force() callers do not expect skb->dst
      can suddenly be cleared.
      
      Can probably be backported back to linux-3.6 kernels
      Reported-by: NDavid J. Wilder <dwilder@us.ibm.com>
      Tested-by: NDavid J. Wilder <dwilder@us.ibm.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5037e9ef
    • S
      sh_eth: uninline sh_eth_{write|read}() · 2274d375
      Sergei Shtylyov 提交于
      Commit 3365711d ("sh_eth: WARN on access to a register not implemented in
      in  a particular chip") added WARN_ON() to sh_eth_{read|write}(), thus making
      it  unacceptable for these functions to be *inline* anymore. Remove *inline*
      and move the functions from the header to the driver itself. Below   is our
      code economy with ARM gcc 4.7.3:
      
      $ size drivers/net/ethernet/renesas/sh_eth.o{~,}
         text	   data	    bss	    dec	    hex	filename
        32489	   1140	      0	  33629	   835d	drivers/net/ethernet/renesas/sh_eth.o~
        25413	   1140	      0	  26553	   67b9	drivers/net/ethernet/renesas/sh_eth.o
      Suggested-by: NBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: NSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2274d375
    • C
      stmmac: dwmac-sunxi: Call exit cleanup function in probe error path · d856c16d
      Chen-Yu Tsai 提交于
      dwmac-sunxi has 2 callbacks that were called from stmmac_platform as
      part of the probe and remove sequences.
      
      Ater the conversion of dwmac-sunxi into a standalone platform driver,
      the .init function is called before calling into the stmmac driver
      core, but .exit is not called to clean up if stmmac returns an error.
      
      This patch fixes the probe error path. This properly cleans up and
      releases resources when the driver core fails to probe.
      
      Cc: Joachim Eastwood <manabian@gmail.com>
      Fixes: 9a9e9a1e ("stmmac: dwmac-sunxi: turn setup callback into a
      		      probe function")
      Signed-off-by: NChen-Yu Tsai <wens@csie.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d856c16d
    • H
      net: add validation for the socket syscall protocol argument · 79462ad0
      Hannes Frederic Sowa 提交于
      郭永刚 reported that one could simply crash the kernel as root by
      using a simple program:
      
      	int socket_fd;
      	struct sockaddr_in addr;
      	addr.sin_port = 0;
      	addr.sin_addr.s_addr = INADDR_ANY;
      	addr.sin_family = 10;
      
      	socket_fd = socket(10,3,0x40000000);
      	connect(socket_fd , &addr,16);
      
      AF_INET, AF_INET6 sockets actually only support 8-bit protocol
      identifiers. inet_sock's skc_protocol field thus is sized accordingly,
      thus larger protocol identifiers simply cut off the higher bits and
      store a zero in the protocol fields.
      
      This could lead to e.g. NULL function pointer because as a result of
      the cut off inet_num is zero and we call down to inet_autobind, which
      is NULL for raw sockets.
      
      kernel: Call Trace:
      kernel:  [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
      kernel:  [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
      kernel:  [<ffffffff81645069>] SYSC_connect+0xd9/0x110
      kernel:  [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
      kernel:  [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
      kernel:  [<ffffffff81645e0e>] SyS_connect+0xe/0x10
      kernel:  [<ffffffff81779515>] tracesys_phase2+0x84/0x89
      
      I found no particular commit which introduced this problem.
      
      CVE: CVE-2015-8543
      Cc: Cong Wang <cwang@twopensource.com>
      Reported-by: N郭永刚 <guoyonggang@360.cn>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79462ad0
    • T
      net: phy: mdio-mux: Check return value of mdiobus_alloc() · 20b08e1a
      Tobias Klauser 提交于
      mdiobus_alloc() might return NULL, but its return value is not
      checked in mdio_mux_init(). This could potentially lead to a NULL
      pointer dereference. Fix it by checking the return value
      
      Fixes: 0ca2997d ("netdev/of/phy: Add MDIO bus multiplexer support.")
      Signed-off-by: NTobias Klauser <tklauser@distanz.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      20b08e1a
    • P
      openvswitch: fix trivial comment typo · e5f5d747
      Paolo Abeni 提交于
      The commit 33db4125 ("openvswitch: Rename LABEL->LABELS") left
      over an old OVS_CT_ATTR_LABEL instance, fix it.
      
      Fixes: 33db4125 ("openvswitch: Rename LABEL->LABELS")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NJoe Stringer <joe@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e5f5d747
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 9e5be5bd
      David S. Miller 提交于
      Pablo Neira Ayuso says:
      
      ====================
      netfilter fixes for net
      
      The following patchset contains Netfilter fixes for you net tree,
      specifically for nf_tables and nfnetlink_queue, they are:
      
      1) Avoid a compilation warning in nfnetlink_queue that was introduced
         in the previous merge window with the simplification of the conntrack
         integration, from Arnd Bergmann.
      
      2) nfnetlink_queue is leaking the pernet subsystem registration from
         a failure path, patch from Nikolay Borisov.
      
      3) Pass down netns pointer to batch callback in nfnetlink, this is the
         largest patch and it is not a bugfix but it is a dependency to
         resolve a splat in the correct way.
      
      4) Fix a splat due to incorrect socket memory accounting with nfnetlink
         skbuff clones.
      
      5) Add missing conntrack dependencies to NFT_DUP_IPV4 and NFT_DUP_IPV6.
      
      6) Traverse the nftables commit list in reverse order from the commit
         path, otherwise we crash when the user applies an incremental update
         via 'nft -f' that deletes an object that was just introduced in this
         batch, from Xin Long.
      
      Regarding the compilation warning fix, many people have sent us (and
      keep sending us) patches to address this, that's why I'm including this
      batch even if this is not critical.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9e5be5bd
  4. 14 12月, 2015 4 次提交
    • D
      net: Flush local routes when device changes vrf association · 7f49e7a3
      David Ahern 提交于
      The VRF driver cycles netdevs when an interface is enslaved or released:
      the down event is used to flush neighbor and route tables and the up
      event (if the interface was already up) effectively moves local and
      connected routes to the proper table.
      
      As of 4f823def the local route is left hanging around after a link
      down, so when a netdev is moved from one VRF to another (or released
      from a VRF altogether) local routes are left in the wrong table.
      
      Fix by handling the NETDEV_CHANGEUPPER event. When the upper dev is
      an L3mdev then call fib_disable_ip to flush all routes, local ones
      to.
      
      Fixes: 4f823def ("ipv4: fix to not remove local route on link down")
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7f49e7a3
    • A
      net:hns: print MAC with %pM · 98900a80
      Andy Shevchenko 提交于
      printf() has a dedicated specifier to print MAC addresses. Use it instead of
      pushing each byte via stack.
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      98900a80
    • A
      net:hns: annotate IO address space properly · 946973a3
      Andy Shevchenko 提交于
      Mark address pointer with __iomem in the IO accessors.
      
      Otherwise we will get a sparse complain like following
      
      .../hns/hns_dsaf_reg.h:991:36: warning: incorrect type in argument 1 (different address spaces)
      .../hns/hns_dsaf_reg.h:991:36:    expected unsigned char [noderef] [usertype] <asn:2>*base
      .../hns/hns_dsaf_reg.h:991:36:    got void *base
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      946973a3
    • X
      netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort · a907e36d
      Xin Long 提交于
      When we use 'nft -f' to submit rules, it will build multiple rules into
      one netlink skb to send to kernel, kernel will process them one by one.
      meanwhile, it add the trans into commit_list to record every commit.
      if one of them's return value is -EAGAIN, status |= NFNL_BATCH_REPLAY
      will be marked. after all the process is done. it will roll back all the
      commits.
      
      now kernel use list_add_tail to add trans to commit, and use
      list_for_each_entry_safe to roll back. which means the order of adding
      and rollback is the same. that will cause some cases cannot work well,
      even trigger call trace, like:
      
      1. add a set into table foo  [return -EAGAIN]:
         commit_list = 'add set trans'
      2. del foo:
         commit_list = 'add set trans' -> 'del set trans' -> 'del tab trans'
      then nf_tables_abort will be called to roll back:
      firstly process 'add set trans':
                         case NFT_MSG_NEWSET:
                              trans->ctx.table->use--;
                              list_del_rcu(&nft_trans_set(trans)->list);
      
        it will del the set from the table foo, but it has removed when del
        table foo [step 2], then the kernel will panic.
      
      the right order of rollback should be:
        'del tab trans' -> 'del set trans' -> 'add set trans'.
      which is opposite with commit_list order.
      
      so fix it by rolling back commits with reverse order in nf_tables_abort.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      a907e36d
  5. 12 12月, 2015 6 次提交
    • D
      Merge branch 'mpls-fixes' · 6d13cab4
      David S. Miller 提交于
      Robert Shearman says:
      
      ====================
      mpls: fixes for nexthops without via addresses
      
      These four fixes all apply to the case of having an mpls route with an
      output device, but without a nexthop.
      
      Patches 2 and 3 could really have been combined in one patch, but I
      wanted to separate the fix for some recent breakage from the fix for a
      day-1 issue.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d13cab4
    • R
      mpls: make via address optional for multipath routes · f20367df
      Robert Shearman 提交于
      The via address is optional for a single path route, yet is mandatory
      when the multipath attribute is used:
      
        # ip -f mpls route add 100 dev lo
        # ip -f mpls route add 101 nexthop dev lo
        RTNETLINK answers: Invalid argument
      
      Make them consistent by making the via address optional when the
      RTA_MULTIPATH attribute is being parsed so that both forms of
      specifying the route work.
      Signed-off-by: NRobert Shearman <rshearma@brocade.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f20367df
    • R
      mpls: fix out-of-bounds access when via address not specified · eb7809f0
      Robert Shearman 提交于
      When a via address isn't specified, the via table is left initialised
      to 0 (NEIGH_ARP_TABLE), and the via address length also left
      initialised to 0. This results in a via address array of length 0
      being allocated (contiguous with route and nexthop array), meaning
      that when a packet is sent using neigh_xmit the neighbour lookup and
      creation will cause an out-of-bounds access when accessing the 4 bytes
      of the IPv4 address it assumes it has been given a pointer to.
      
      This could be fixed by allocating the 4 bytes of via address necessary
      and leaving it as all zeroes. However, it seems wrong to me to use an
      ipv4 nexthop (including possibly ARPing for 0.0.0.0) when the user
      didn't specify to do so.
      
      Instead, set the via address table to NEIGH_NR_TABLES to signify it
      hasn't been specified and use this at forwarding time to signify a
      neigh_xmit using an L2 address consisting of the device address. This
      mechanism is the same as that used for both ARP and ND for loopback
      interfaces and those flagged as no-arp, which are all we can really
      support in this case.
      
      Fixes: cf4b24f0 ("mpls: reduce memory usage of routes")
      Signed-off-by: NRobert Shearman <rshearma@brocade.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      eb7809f0
    • R
      mpls: don't dump RTA_VIA attribute if not specified · 72dcac96
      Robert Shearman 提交于
      The problem seen is that when adding a route with a nexthop with no
      via address specified, iproute2 generates bogus output:
      
        # ip -f mpls route add 100 dev lo
        # ip -f mpls route list
        100 via inet 0.0.8.0 dev lo
      
      The reason for this is that the kernel generates an RTA_VIA attribute
      with the family set to AF_INET, but the via address data having zero
      length. The cause of family being AF_INET is that on route insert
      cfg->rc_via_table is left set to 0, which just happens to be
      NEIGH_ARP_TABLE which is then translated into AF_INET.
      
      iproute2 doesn't validate the length prior to printing and so prints
      garbage. Although it could be fixed to do the validation, I would
      argue that AF_INET addresses should always be exactly 4 bytes so the
      kernel is really giving userspace bogus data.
      
      Therefore, avoid generating the RTA_VIA attribute when dumping the
      route if the via address wasn't specified on add/modify. This is
      indicated by NEIGH_ARP_TABLE and a zero via address length - if the
      user specified a via address the address length would have been
      validated such that it was 4 bytes. Although this is a change in
      behaviour that is visible to userspace, I believe that what was
      generated before was invalid and as such userspace wouldn't be
      expecting it.
      Signed-off-by: NRobert Shearman <rshearma@brocade.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72dcac96
    • R
      mpls: validate L2 via address length · a3e948e8
      Robert Shearman 提交于
      If an L2 via address for an mpls nexthop is specified, the length of
      the L2 address must match that expected by the output device,
      otherwise it could access memory beyond the end of the via address
      buffer in the route.
      
      This check was present prior to commit f8efb73c ("mpls: multipath
      route support"), but got lost in the refactoring, so add it back,
      applying it to all nexthops in multipath routes.
      
      Fixes: f8efb73c ("mpls: multipath route support")
      Signed-off-by: NRobert Shearman <rshearma@brocade.com>
      Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a3e948e8
    • B
      sfc: only use RSS filters if we're using RSS · f1c2ef40
      Bert Kenward 提交于
      Without this, filter insertion on a VF would fail if only one channel
      was in use. This would include the unicast station filter and therefore
      no traffic would be received.
      Signed-off-by: NBert Kenward <bkenward@solarflare.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f1c2ef40