1. 06 Jan 2016, 1 commit
  2. 30 Dec 2015, 1 commit
    • mm/vmstat: fix overflow in mod_zone_page_state() · 6cdb18ad
      Committed by Heiko Carstens
      mod_zone_page_state() takes a "delta" integer argument.  delta contains
      the number of pages that should be added to or subtracted from a struct
      zone's vm_stat field.
      
      If a zone is larger than 8TB this will cause overflows.  E.g.  for a
      zone with a size slightly larger than 8TB the line
      
          mod_zone_page_state(zone, NR_ALLOC_BATCH, zone->managed_pages);
      
      in mm/page_alloc.c:free_area_init_core() will result in a negative
      value for the NR_ALLOC_BATCH entry within the zone's vm_stat, since 8TB
      contains 0x8xxxxxxx pages, which will be sign-extended to a negative
      value.
      
      Fix this by changing the delta argument to long type.
      
      This could fix an early boot problem seen on s390, where we have a 9TB
      system with only one node.  ZONE_DMA contains 2GB and ZONE_NORMAL the
      rest.  The system is trying to allocate a GFP_DMA page but ZONE_DMA is
      completely empty, so it tries to reclaim pages in an endless loop.
      
      This was seen on a heavily patched 3.10 kernel.  One possible
      explanation seems to be the overflows caused by mod_zone_page_state().
      Unfortunately I did not have the chance to verify that this patch
      actually fixes the problem, since I don't have access to the system
      right now.  However the overflow problem does exist anyway.
      
      Given the description that a system with slightly less than 8TB does
      work, this seems to be a candidate for the observed problem.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
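      A minimal userspace sketch (not part of the commit above; the value is
      illustrative) of the sign extension just described: a zone slightly
      larger than 8TB holds more than 0x7fffffff 4KB pages, so passing the
      page count through an int delta turns it negative, while a long delta
      preserves it on 64-bit systems.

          #include <stdio.h>

          int main(void)
          {
                  /* Roughly 8TB worth of 4KB pages; this no longer fits
                   * in a signed 32-bit int. */
                  unsigned long managed_pages = 0x80100000UL;

                  int  delta_int  = managed_pages;  /* wraps to a negative value */
                  long delta_long = managed_pages;  /* keeps the full count */

                  printf("delta as int:  %d\n",  delta_int);
                  printf("delta as long: %ld\n", delta_long);
                  return 0;
          }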
  3. 29 Dec 2015, 1 commit
  4. 24 Dec 2015, 1 commit
    • net: cdc_ncm: avoid changing RX/TX buffers on MTU changes · 1dfddff5
      Committed by Bjørn Mork
      NCM buffer sizes are negotiated with the device independently of
      the network device MTU.  The RX buffers are allocated by the
      usbnet framework based on the rx_urb_size value set by cdc_ncm.  A
      single RX buffer can hold a number of MTU-sized packets.
      
      The default usbnet change_mtu ndo only modifies rx_urb_size if it
      is equal to hard_mtu.  The cdc_ncm driver, in turn, sets rx_urb_size
      and hard_mtu independently of each other, based on dwNtbInMaxSize
      and dwNtbOutMaxSize respectively.  It was therefore assumed that
      usbnet_change_mtu() would never touch rx_urb_size.  This failed to
      consider the case where dwNtbInMaxSize and dwNtbOutMaxSize happen
      to be equal.
      
      Fix by implementing an NCM-specific change_mtu ndo that modifies the
      netdev MTU without touching the buffer size settings.
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
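      A hedged sketch, not the actual patch: the function name and the bounds
      check below are illustrative, but they show the general shape of an
      NCM-specific change_mtu ndo that updates only the netdev MTU and leaves
      rx_urb_size and hard_mtu untouched.

          #include <linux/errno.h>
          #include <linux/netdevice.h>

          /* Accept the new MTU without recomputing any of the negotiated
           * NCM buffer sizes (rx_urb_size and hard_mtu stay as they are). */
          static int cdc_ncm_change_mtu_sketch(struct net_device *net, int new_mtu)
          {
                  if (new_mtu <= 0)       /* a real driver would also cap this
                                           * at the negotiated datagram size */
                          return -EINVAL;

                  net->mtu = new_mtu;
                  return 0;
          }

      The driver would then point the ndo_change_mtu member of its
      net_device_ops at such a helper instead of the usbnet default.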
  5. 19 Dec 2015, 1 commit
  6. 18 Dec 2015, 1 commit
    • xen: Add RING_COPY_REQUEST() · 454d5d88
      Committed by David Vrabel
      RING_GET_REQUEST() on a shared ring is easy to use incorrectly
      (i.e., by not considering that the other end may alter the data in the
      shared ring while it is being inspected).  Safe usage of a request
      generally requires taking a local copy.
      
      Provide a RING_COPY_REQUEST() macro to use instead of
      RING_GET_REQUEST() and an open-coded memcpy().  This takes care of
      ensuring that the copy is done correctly regardless of any possible
      compiler optimizations.
      
      Use a volatile source to prevent the compiler from reordering or
      omitting the copy.
      
      This is part of XSA155.
      
      CC: stable@vger.kernel.org
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
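      A small standalone illustration, not the Xen header itself (the type and
      function names are invented): reading the shared slot through a
      volatile-qualified pointer forces one full copy into private memory, so
      later validation and use operate on data the other end can no longer
      change.

          /* Stand-in for a request structure living in a shared ring. */
          struct demo_request {
                  unsigned long id;
                  unsigned int  op;
                  unsigned int  len;
          };

          /* Copy the shared slot into a private structure.  The volatile
           * cast keeps the compiler from turning this into repeated reads
           * of the shared memory after the copy has supposedly been taken. */
          static inline void demo_copy_request(struct demo_request *dst,
                                               const struct demo_request *slot)
          {
                  *dst = *(const volatile struct demo_request *)slot;
          }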
  7. 17 Dec 2015, 1 commit
  8. 16 Dec 2015, 1 commit
  9. 15 Dec 2015, 3 commits
    • net: fix IP early demux races · 5037e9ef
      Committed by Eric Dumazet
      David Wilder reported crashes caused by dst reuse.
      
      <quote David>
        I am seeing a crash on a distro V4.2.3 kernel caused by a double
        release of a dst_entry.  In ipv4_dst_destroy() the call to
        list_empty() finds a poisoned next pointer, indicating the dst_entry
        has already been removed from the list and freed. The crash occurs
        18 to 24 hours into a run of a network stress exerciser.
      </quote>
      
      Thanks to his detailed report and analysis, we were able to understand
      the core issue.
      
      IP early demux can associate a dst with an skb after a lookup in
      TCP/UDP sockets.

      When the socket cache is not properly set, we want to store the dst
      into sk->sk_dst_cache for future IP early demux lookups, by acquiring
      a stable refcount on the dst.

      The problem is that this acquisition simply uses an atomic_inc(),
      which works well unless the dst was already queued for destruction by
      dst_release(), after it noticed the dst refcount had dropped to zero
      while DST_NOCACHE was set on the dst.
      
      We need to make sure current refcount is not zero before incrementing
      it, or risk double free as David reported.
      
      This patch, being a stable candidate, adds two new helpers and uses
      them only in the problematic IP early demux paths.

      It might be possible to merge skb_dst_force() and skb_dst_force_safe()
      in net-next, but I prefer having the smallest possible patch for stable
      kernels: maybe some skb_dst_force() callers do not expect that skb->dst
      can suddenly be cleared.

      This can probably be backported to kernels as far back as linux-3.6.
      Reported-by: David J. Wilder <dwilder@us.ibm.com>
      Tested-by: David J. Wilder <dwilder@us.ibm.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
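      A userspace sketch in C11 atomics (the function name is invented; it is
      not the kernel helper) of the take-a-reference-only-if-nonzero pattern
      the commit relies on: a plain increment can resurrect a dst that
      dst_release() has already queued for destruction, so the helper must
      refuse to take a reference once the count has reached zero.

          #include <stdatomic.h>
          #include <stdbool.h>

          /* Return true if a reference was taken, false if the object is
           * already on its way to being freed (refcount observed at zero). */
          static inline bool ref_get_unless_zero(atomic_int *refcnt)
          {
                  int old = atomic_load(refcnt);

                  while (old != 0) {
                          /* On failure, 'old' is reloaded with the current
                           * value and the loop decides again. */
                          if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
                                  return true;
                  }
                  return false;
          }

      In the kernel, atomic_inc_not_zero() provides these semantics, which is
      what a safe dst hold needs instead of a bare atomic_inc().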
    • net: add validation for the socket syscall protocol argument · 79462ad0
      Committed by Hannes Frederic Sowa
      郭永刚 reported that one could simply crash the kernel as root by
      using a simple program:
      
          #include <netinet/in.h>
          #include <sys/socket.h>

          int main(void)
          {
                  int socket_fd;
                  struct sockaddr_in addr;

                  addr.sin_port = 0;
                  addr.sin_addr.s_addr = INADDR_ANY;
                  addr.sin_family = 10;

                  socket_fd = socket(10, 3, 0x40000000);
                  connect(socket_fd, (struct sockaddr *)&addr, 16);
                  return 0;
          }
      
      AF_INET and AF_INET6 sockets actually only support 8-bit protocol
      identifiers.  inet_sock's skc_protocol field is sized accordingly, so
      larger protocol identifiers simply have their higher bits cut off and
      a zero stored in the protocol field.

      This can lead to e.g. a NULL function pointer dereference, because as
      a result of the cut-off inet_num is zero and we call down to
      inet_autobind, which is NULL for raw sockets.
      
      kernel: Call Trace:
      kernel:  [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
      kernel:  [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
      kernel:  [<ffffffff81645069>] SYSC_connect+0xd9/0x110
      kernel:  [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
      kernel:  [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
      kernel:  [<ffffffff81645e0e>] SyS_connect+0xe/0x10
      kernel:  [<ffffffff81779515>] tracesys_phase2+0x84/0x89
      
      I found no particular commit which introduced this problem.
      
      CVE: CVE-2015-8543
      Cc: Cong Wang <cwang@twopensource.com>
      Reported-by: 郭永刚 <guoyonggang@360.cn>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
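      A hedged sketch, not the actual patch (the constant and function names
      are invented): the kind of range check the commit describes, rejecting
      protocol numbers that do not fit the 8-bit skc_protocol field before
      they can be silently truncated to zero.

          #include <errno.h>

          /* 8-bit protocol space for AF_INET/AF_INET6 sockets. */
          #define DEMO_IPPROTO_MAX 256

          static inline int demo_validate_protocol(int protocol)
          {
                  if (protocol < 0 || protocol >= DEMO_IPPROTO_MAX)
                          return -EINVAL;
                  return 0;
          }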
    • openvswitch: fix trivial comment typo · e5f5d747
      Committed by Paolo Abeni
      Commit 33db4125 ("openvswitch: Rename LABEL->LABELS") left behind an
      old OVS_CT_ATTR_LABEL instance; fix it.
      
      Fixes: 33db4125 ("openvswitch: Rename LABEL->LABELS")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Joe Stringer <joe@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 14 Dec 2015, 1 commit
  11. 13 Dec 2015, 2 commits
  12. 12 Dec 2015, 5 commits
  13. 11 Dec 2015, 1 commit
  14. 10 Dec 2015, 3 commits
  15. 09 Dec 2015, 4 commits
  16. 08 Dec 2015, 6 commits
  17. 07 Dec 2015, 4 commits
  18. 06 Dec 2015, 2 commits
  19. 05 Dec 2015, 1 commit
    • rhashtable: Prevent spurious EBUSY errors on insertion · 3cf92222
      Committed by Herbert Xu
      Thomas and Phil observed that under stress rhashtable insertion
      sometimes failed with EBUSY, even though this error should only
      ever be seen when we're under attack and our hash chain length
      has grown to an unacceptable level, even after a rehash.
      
      It turns out that the logic for detecting whether there is an
      existing rehash is faulty.  In particular, when two threads both
      try to grow the same table at the same time, one of them may see
      the newly grown table and thus erroneously conclude that it had
      been rehashed.  This is what leads to the EBUSY error.
      
      This patch fixes this by remembering the current last table we
      used during insertion so that rhashtable_insert_rehash can detect
      when another thread has also done a resize/rehash.  When this is
      detected we will give up our resize/rehash and simply retry the
      insertion with the new table.
      Reported-by: Thomas Graf <tgraf@suug.ch>
      Reported-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Tested-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
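      A hedged sketch with invented types and names, not the rhashtable code
      itself, of the retry decision the commit describes: EBUSY is only
      justified if the table we are looking at is the same one we last used
      ourselves; if the table pointer has changed since our previous attempt,
      another thread has resized or rehashed, and the insertion should simply
      be retried on the new table.

          #include <errno.h>

          struct demo_table;              /* stand-in for a bucket table */

          /* State an inserter carries across attempts. */
          struct demo_insert_state {
                  struct demo_table *last_table;
          };

          /* Called when an insertion attempt finds an over-long chain. */
          static inline int demo_insert_overflow(struct demo_insert_state *st,
                                                 struct demo_table *cur_table)
          {
                  if (st->last_table && st->last_table != cur_table) {
                          /* Another thread already resized/rehashed:
                           * remember the new table and retry on it. */
                          st->last_table = cur_table;
                          return -EAGAIN;
                  }

                  /* We were the last to rehash and the chain is still too
                   * long: this really does look like an attack. */
                  st->last_table = cur_table;
                  return -EBUSY;
          }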