1. 23 Dec, 2015 5 commits
    • ipv6/addrlabel: fix ip6addrlbl_get() · e459dfee
      Andrey Ryabinin committed
      ip6addrlbl_get() has never worked. If ip6addrlbl_hold() succeeds,
      ip6addrlbl_get() exits with '-ESRCH'. If ip6addrlbl_hold() fails,
      ip6addrlbl_get() uses an ip6addrlbl_entry pointer that is about to be freed.
      
      Fix this by inverting ip6addrlbl_hold() check.
      
      Fixes: 2a8cc6c8 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reviewed-by: Cong Wang <cwang@twopensource.com>
      Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
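The inverted check described in the ip6addrlbl_get() commit can be illustrated with a small userspace sketch. Everything here is hypothetical and only mirrors the description in the commit message: `label_entry`, `entry_hold()`, `lookup_buggy()` and `lookup_fixed()` are illustrative names, and the assumption is that the hold helper returns nonzero when it managed to take a reference.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted table entry. */
struct label_entry {
    int refcnt; /* 0 means the entry is already being freed */
};

/* Assumed convention: returns nonzero if a reference was taken. */
static int entry_hold(struct label_entry *e)
{
    if (e->refcnt == 0)
        return 0;
    e->refcnt++;
    return 1;
}

/* Buggy variant: drops the pointer when the hold SUCCEEDED. */
static int lookup_buggy(struct label_entry *p)
{
    if (p && entry_hold(p))
        p = NULL;
    return p ? 0 : -ESRCH;
}

/* Fixed variant: drops the pointer only when the hold FAILED. */
static int lookup_fixed(struct label_entry *p)
{
    if (p && !entry_hold(p))
        p = NULL;
    return p ? 0 : -ESRCH;
}
```

With a live entry the buggy variant reports `-ESRCH`, and with a dying entry it reports success and would go on to use the pointer; the fixed variant does the opposite in both cases.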
    • switchdev: bridge: Pass ageing time as clock_t instead of jiffies · ef9cdd0f
      Ido Schimmel committed
      The bridge's ageing time is offloaded to hardware when:
      	1) A port joins a bridge
      	2) The ageing time of the bridge is changed
      
      In the first case the ageing time is offloaded as jiffies, but in the
      second case it's offloaded as clock_t, which is what existing switchdev
      drivers expect to receive.
      
      Fixes: 6ac311ae ("Adding switchdev ageing notification on port bridged")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
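To see why mixing the two units matters in the switchdev ageing-time commit, here is a rough userspace sketch of the unit conversion. The tick rates `SKETCH_HZ` and `SKETCH_USER_HZ` are assumed values chosen for illustration (the real rates are kernel build settings), and the helper names are hypothetical rather than the kernel's own functions.

```c
#include <stdint.h>

/* Assumed tick rates, for illustration only. */
#define SKETCH_HZ      1000u /* jiffies per second (assumption) */
#define SKETCH_USER_HZ 100u  /* clock_t ticks per second (assumption) */

/* Seconds to jiffies. */
static uint64_t sketch_secs_to_jiffies(uint64_t secs)
{
    return secs * SKETCH_HZ;
}

/*
 * Jiffies to clock_t ticks. This simple division is only valid when
 * SKETCH_HZ is an exact multiple of SKETCH_USER_HZ, as it is here.
 */
static uint64_t sketch_jiffies_to_clock_t(uint64_t j)
{
    return j / (SKETCH_HZ / SKETCH_USER_HZ);
}
```

Under these assumed rates, a 300 second ageing time is 300000 in jiffies but 30000 in clock_t, so a driver expecting clock_t and handed jiffies would read a value ten times too large.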
    • sh_eth: fix 16-bit descriptor field access endianness too · 5cbf20c7
      Sergei Shtylyov committed
      Commit 1299653a ("sh_eth: fix descriptor access endianness") only
      addressed the 32-bit buffer address field byte-swapping, but the driver
      still accesses 16-bit frame/buffer length descriptor fields without the
      necessary byte-swapping -- which should affect the big-endian kernels.
      In order to be able to use {cpu|edmac}_to_{edmac|cpu}(), we need to declare
      the RX/TX descriptor word 1 as a 32-bit field and use shifts/masking to
      access the 16-bit subfields (which gets rid of the ugly #ifdef'ery too)...
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
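The approach in the sh_eth commit (one 32-bit word, with shifts and masks for the 16-bit subfields) can be sketched in portable C. This is not the driver's code: the names are hypothetical, and both the little-endian device byte order and the subfield layout (frame length in the low half, buffer length in the high half) are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed layout of descriptor word 1, for illustration only. */
#define SKETCH_FRAME_LEN_MASK 0x0000ffffu /* low 16 bits: frame length */
#define SKETCH_BUF_LEN_SHIFT  16          /* high 16 bits: buffer length */

/* Load a 32-bit little-endian device word regardless of CPU byte order. */
static uint32_t sketch_le32_load(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Store a CPU value as a 32-bit little-endian device word. */
static void sketch_le32_store(uint8_t b[4], uint32_t v)
{
    b[0] = (uint8_t)v;
    b[1] = (uint8_t)(v >> 8);
    b[2] = (uint8_t)(v >> 16);
    b[3] = (uint8_t)(v >> 24);
}

/* Read the 16-bit frame length subfield. */
static uint16_t sketch_frame_len(const uint8_t word1[4])
{
    return (uint16_t)(sketch_le32_load(word1) & SKETCH_FRAME_LEN_MASK);
}

/* Read the 16-bit buffer length subfield. */
static uint16_t sketch_buf_len(const uint8_t word1[4])
{
    return (uint16_t)(sketch_le32_load(word1) >> SKETCH_BUF_LEN_SHIFT);
}

/* Update the buffer length while preserving the frame length. */
static void sketch_set_buf_len(uint8_t word1[4], uint16_t len)
{
    uint32_t w = sketch_le32_load(word1) & SKETCH_FRAME_LEN_MASK;

    w |= (uint32_t)len << SKETCH_BUF_LEN_SHIFT;
    sketch_le32_store(word1, w);
}
```

Because every access goes through a single 32-bit load or store with a fixed byte order, the same code works on big-endian and little-endian CPUs without any conditional compilation.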
    • veth: don’t modify ip_summed; doing so treats packets with bad checksums as good. · ce8c839b
      Vijay Pandurangan committed
      Packets that arrive from real hardware devices have ip_summed ==
      CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
      CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
      current version of veth will replace CHECKSUM_NONE with
      CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
      a veth device to be delivered to the application. This caused applications
      at Twitter to receive corrupt data when network hardware was corrupting
      packets.
      
      We believe this was added as an optimization to skip computing and
      verifying checksums for communication between containers. However, locally
      generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
      written does nothing for them. As far as we can tell, after removing this
      code, these packets are transmitted from one stack to another unmodified
      (tcpdump shows invalid checksums on both sides, as expected), and they are
      delivered correctly to applications. We didn’t test every possible network
      configuration, but we tried a few common ones such as bridging containers,
      using NAT between the host and a container, and routing from hardware
      devices to containers. We have effectively deployed this in production at
      Twitter (by disabling RX checksum offloading on veth devices).
      
      This code dates back to the first version of the driver, commit
      <e314dbdc> ("[NET]: Virtual ethernet device driver"), so I
      suspect this bug occurred mostly because the driver API has evolved
      significantly since then. Commit <0b796750> ("net/veth: Fix
      packet checksumming") (in December 2010) fixed this for packets that get
      created locally and sent to hardware devices, by not changing
      CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
      in from hardware devices.
      Co-authored-by: Evan Jones <ej@evanjones.ca>
      Signed-off-by: Evan Jones <ej@evanjones.ca>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: Phil Sutter <phil@nwl.cc>
      Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>
      Acked-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
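The behaviour change described in the veth commit can be modelled in a few lines. This is a simplified sketch based only on the commit message: the enum and both functions are hypothetical names, and the "old" function merely encodes the stated behaviour of replacing CHECKSUM_NONE with CHECKSUM_UNNECESSARY.

```c
/* Hypothetical model of the checksum states named in the commit message. */
enum sketch_csum {
    SKETCH_CHECKSUM_NONE,        /* not verified, or known bad */
    SKETCH_CHECKSUM_UNNECESSARY, /* already verified by hardware */
    SKETCH_CHECKSUM_PARTIAL      /* locally generated, checksum pending */
};

/* Old behaviour as described: unverified packets get marked as verified. */
static enum sketch_csum sketch_veth_old(enum sketch_csum s)
{
    if (s == SKETCH_CHECKSUM_NONE)
        return SKETCH_CHECKSUM_UNNECESSARY;
    return s;
}

/* New behaviour as described: the state is passed through unmodified. */
static enum sketch_csum sketch_veth_fixed(enum sketch_csum s)
{
    return s;
}
```

In this model the old path turns a corrupt packet from hardware into one the receiving stack trusts, while locally generated packets (CHECKSUM_PARTIAL) are unaffected by either variant, which matches the observation in the commit message that the original code did nothing for them.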
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · a7c09ae6
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains two netfilter fixes:
      
      1) Oneliner from Florian to dump missing NFT_CT_L3PROTOCOL netlink
         attribute, from Florian Westphal.
      
      2) Another oneliner for nf_tables to use skb->protocol from the new
         netdev family, we can't assume ethernet there.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 22 Dec, 2015 3 commits
  3. 20 Dec, 2015 1 commit
  4. 19 Dec, 2015 14 commits
  5. 18 Dec, 2015 10 commits
    • netfilter: nft_ct: include direction when dumping NFT_CT_L3PROTOCOL key · d5f79b6e
      Florian Westphal committed
      one nft userspace test case fails with
      
      'ct l3proto original ipv4' mismatches 'ct l3proto ipv4'
      
      ... because NFTA_CT_DIRECTION attr is missing.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: use skb->protocol instead of assuming ethernet header · aa47e42c
      Pablo Neira Ayuso committed
      Otherwise we may end up with incorrect network and transport header for
      other protocols.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 73796d8b
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of
          people reported this...  From Arnd Bergmann.
      
       2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.
      
       3) Fix spurious EBUSY in rhashtable, from Herbert Xu.
      
       4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.
      
       5) Fix race with work structure access in pppoe driver causing
          corruptions, from Guillaume Nault.
      
       6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb()
          actually succeeded or not, from Sergei Shtylyov.
      
       7) Don't lose flags when setting IFA_F_OPTIMISTIC in ipv6 code, from
          Bjørn Mork.
      
       8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.
      
       9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo
          Leitner.
      
      10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.
      
      11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling
          properly as well, from Jiri Benc.
      
      12) Handle request sockets properly in xfrm layer, from Eric Dumazet.
      
      13) Double stats update in ipv6 geneve transmit path, fix from Pravin B
          Shelar.
      
      14) sk->sk_policy[] needs RCU protection, and as a result
          xfrm_policy_destroy() needs to free policies using an RCU grace
          period, from Eric Dumazet.
      
      15) SCTP needs to clone ipv6 tx options in order to avoid use after
          free, from Eric Dumazet.
      
      16) Missing kbuild export of ila.h, from Stephen Hemminger.
      
      17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from
          Tobias Klauser.
      
      18) Validate protocol value range in ->create() methods, from Hannes
          Frederic Sowa.
      
      19) Fix early socket demux races that result in illegal dst reuse, from
          Eric Dumazet.
      
      20) Validate socket address length in pptp code, from WANG Cong.
      
      21) skb_reorder_vlan_header() uses incorrect offset and can corrupt
          packets, from Vlad Yasevich.
      
      22) Fix memory leaks in nl80211 registry code, from Ola Olsson.
      
      23) Timeout loop count handling fixes in mISDN, xgbe, qlge, sfc, and
          qlcnic.  From Dan Carpenter.
      
      24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for
          example, AF_ALG will interpret it as an async call.  From Tadeusz
          Struk.
      
      25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from
          Eric Dumazet.
      
      26) rhashtable enforces the minimum table size not early enough,
          breaking how we calculate the per-cpu lock allocations.  From
          Herbert Xu.
      
      27) Fix FCC port lockup in 82xx driver, from Martin Roth.
      
      28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.
      
      29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and
          sock_setsockopt() wrt.  timestamp handling.  From WANG Cong.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
        net: check both type and procotol for tcp sockets
        drivers: net: xgene: fix Tx flow control
        tcp: restore fastopen with no data in SYN packet
        af_unix: Revert 'lock_interruptible' in stream receive code
        fou: clean up socket with kfree_rcu
        82xx: FCC: Fixing a bug causing to FCC port lock-up
        gianfar: Don't enable RX Filer if not supported
        net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
        rhashtable: Fix walker list corruption
        rhashtable: Enforce minimum size on initial hash table
        inet: tcp: fix inetpeer_set_addr_v4()
        ipv6: automatically enable stable privacy mode if stable_secret set
        net: fix uninitialized variable issue
        bluetooth: Validate socket address length in sco_sock_bind().
        net_sched: make qdisc_tree_decrease_qlen() work for non mq
        ser_gigaset: remove unnecessary kfree() calls from release method
        ser_gigaset: fix deallocation of platform device structure
        ser_gigaset: turn nonsense checks into WARN_ON
        ser_gigaset: fix up NULL checks
        qlcnic: fix a timeout loop
        ...
    • net: check both type and procotol for tcp sockets · ac5cc977
      WANG Cong committed
      Dmitry reported the following out-of-bound access:
      
      Call Trace:
       [<ffffffff816cec2e>] __asan_report_load4_noabort+0x3e/0x40
      mm/kasan/report.c:294
       [<ffffffff84affb14>] sock_setsockopt+0x1284/0x13d0 net/core/sock.c:880
       [<     inline     >] SYSC_setsockopt net/socket.c:1746
       [<ffffffff84aed7ee>] SyS_setsockopt+0x1fe/0x240 net/socket.c:1729
       [<ffffffff85c18c76>] entry_SYSCALL_64_fastpath+0x16/0x7a
      arch/x86/entry/entry_64.S:185
      
      This is because we mistake a raw socket as a tcp socket.
      We should check both sk->sk_type and sk->sk_protocol to ensure
      it is a tcp socket.
      
      Willem points out __skb_complete_tx_timestamp() needs to be fixed as well.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
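The reasoning in the tcp socket check commit (a raw socket can be mistaken for a TCP socket unless both fields are tested) can be shown with a minimal sketch. The struct and function names are hypothetical; the protocol-only check is an assumed example of an insufficient test, not a quote of the earlier kernel code.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical, stripped-down stand-in for the two socket fields involved. */
struct sketch_sock {
    int sk_type;     /* e.g. SOCK_STREAM, SOCK_RAW */
    int sk_protocol; /* e.g. IPPROTO_TCP */
};

/* Insufficient (assumed) check: a raw socket opened with IPPROTO_TCP passes. */
static int sketch_protocol_only(const struct sketch_sock *sk)
{
    return sk->sk_protocol == IPPROTO_TCP;
}

/* Check both fields, as the commit message recommends. */
static int sketch_is_tcp(const struct sketch_sock *sk)
{
    return sk->sk_type == SOCK_STREAM && sk->sk_protocol == IPPROTO_TCP;
}
```

A socket created as `SOCK_RAW` with `IPPROTO_TCP` satisfies the protocol-only test but is rejected once the type is checked too, so TCP-specific state is never read from a socket that does not have it.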
    • drivers: net: xgene: fix Tx flow control · 67894eec
      Iyappan Subramanian committed
      Currently the Tx flow control is based on reading the hardware state,
      which is not accurate since it may not reflect descriptors that
      have not yet reached memory.
      
      To control the Tx flow accurately, change it to be software based.
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: restore fastopen with no data in SYN packet · 07e100f9
      Eric Dumazet committed
      Yuchung tracked a regression caused by commit 57be5bda ("ip: convert
      tcp_sendmsg() to iov_iter primitives") for TCP Fast Open.
      
      Some Fast Open users do not actually add any data in the SYN packet.
      
      Fixes: 57be5bda ("ip: convert tcp_sendmsg() to iov_iter primitives")
      Reported-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • af_unix: Revert 'lock_interruptible' in stream receive code · 3822b5c2
      Rainer Weikusat committed
      With b3ca9b02, the AF_UNIX SOCK_STREAM
      receive code was changed from using mutex_lock(&u->readlock) to
      mutex_lock_interruptible(&u->readlock) to prevent signals from being
      delayed for an indefinite time if a thread sleeping on the mutex
      happened to be selected for handling the signal. But this was never a
      problem with the stream receive code (as opposed to its datagram
      counterpart) as that never went to sleep waiting for new messages with the
      mutex held and thus, wouldn't cause secondary readers to block on the
      mutex waiting for the sleeping primary reader. As the interruptible
      locking makes the code more complicated in exchange for no benefit,
      change it back to using mutex_lock.
      Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · ce42af94
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "Some i915 fixes, one omap fix, one core regression fix.
      
        Not even enough fixes for a twelve days of xmas song, which seems
        good"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm: Don't overwrite UNVERFIED mode status to OK
        drm/omap: fix fbdev pix format to support all platforms
        drm/i915: Do a better job at disabling primary plane in the noatomic case.
        drm/i915/skl: Double RC6 WRL always on
        drm/i915/skl: Disable coarse power gating up until F0
        drm/i915: Remove incorrect warning in context cleanup
    • locking/osq: Fix ordering of node initialisation in osq_lock · b4b29f94
      Will Deacon committed
      The Cavium guys reported a soft lockup on their arm64 machine, caused by
      commit c55a6ffa ("locking/osq: Relax atomic semantics"):
      
          mutex_optimistic_spin+0x9c/0x1d0
          __mutex_lock_slowpath+0x44/0x158
          mutex_lock+0x54/0x58
          kernfs_iop_permission+0x38/0x70
          __inode_permission+0x88/0xd8
          inode_permission+0x30/0x6c
          link_path_walk+0x68/0x4d4
          path_openat+0xb4/0x2bc
          do_filp_open+0x74/0xd0
          do_sys_open+0x14c/0x228
          SyS_openat+0x3c/0x48
          el0_svc_naked+0x24/0x28
      
      This is because in osq_lock we initialise the node for the current CPU:
      
          node->locked = 0;
          node->next = NULL;
          node->cpu = curr;
      
      and then publish the current CPU in the lock tail:
      
          old = atomic_xchg_acquire(&lock->tail, curr);
      
      Once the update to lock->tail is visible to another CPU, the node is
      then live and can be both read and updated by concurrent lockers.
      
      Unfortunately, the ACQUIRE semantics of the xchg operation mean that
      there is no guarantee the contents of the node will be visible before
      lock tail is updated.  This can lead to lock corruption when, for
      example, a concurrent locker races to set the next field.
      
      Fixes: c55a6ffa ("locking/osq: Relax atomic semantics"):
      Reported-by: David Daney <ddaney@caviumnetworks.com>
      Reported-by: Andrew Pinski <andrew.pinski@caviumnetworks.com>
      Tested-by: Andrew Pinski <andrew.pinski@caviumnetworks.com>
      Acked-by: Davidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1449856001-21177-1-git-send-email-will.deacon@arm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
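The ordering problem described in the osq_lock commit can be sketched with C11 atomics in userspace. This is only an illustration under assumptions: the types and `sketch_publish_node()` are hypothetical, and using an acquire-release exchange is one way to guarantee that the node initialisation is visible before the tail update; the commit message itself does not spell out which primitive the kernel fix uses.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_UNLOCKED 0 /* assumed encoding of an empty queue tail */

/* Hypothetical per-CPU queue node, mirroring the fields named in the message. */
struct sketch_node {
    struct sketch_node *next;
    int locked;
    int cpu;
};

/* Hypothetical lock holding only the encoded tail CPU. */
struct sketch_queue {
    atomic_int tail;
};

/*
 * Initialise the node, then publish it in the tail.
 * An acquire-only exchange would not order the three plain stores
 * before the tail update; acq_rel adds the release half, so another
 * thread that observes the new tail also observes the initialised node.
 */
static int sketch_publish_node(struct sketch_queue *q,
                               struct sketch_node *node, int curr)
{
    node->locked = 0;
    node->next = NULL;
    node->cpu = curr;

    return atomic_exchange_explicit(&q->tail, curr, memory_order_acq_rel);
}
```

The function returns the previous tail, so a caller can tell whether the queue was empty or whether it must link itself behind an existing waiter.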
    • Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · d7637d01
      Linus Torvalds committed
      Pull libnvdimm fixes from Dan Williams:
      
       - Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
         These are tagged for -stable.  The scatterlist impact is potentially
         corrupted dma addresses on HIGHMEM enabled platforms.
      
       - A minor locking fix for the NFIT hot-add implementation that is new
         in 4.4-rc.  This would only trigger in the case a hot-add raced
         driver removal.
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        dma-debug: Fix dma_debug_entry offset calculation
        Revert "scatterlist: use sg_phys()"
        nfit: acpi_nfit_notify(): Do not leave device locked
  6. 17 Dec, 2015 7 commits