1. 04 11月, 2016 5 次提交
    • D
      net: tcp: check skb is non-NULL for exact match on lookups · da96786e
      David Ahern 提交于
      Andrey reported the following error report while running the syzkaller
      fuzzer:
      
      general protection fault: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 648 Comm: syz-executor Not tainted 4.9.0-rc3+ #333
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      task: ffff8800398c4480 task.stack: ffff88003b468000
      RIP: 0010:[<ffffffff83091106>]  [<     inline     >]
      inet_exact_dif_match include/net/tcp.h:808
      RIP: 0010:[<ffffffff83091106>]  [<ffffffff83091106>]
      __inet_lookup_listener+0xb6/0x500 net/ipv4/inet_hashtables.c:219
      RSP: 0018:ffff88003b46f270  EFLAGS: 00010202
      RAX: 0000000000000004 RBX: 0000000000004242 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: ffffc90000e3c000 RDI: 0000000000000054
      RBP: ffff88003b46f2d8 R08: 0000000000004000 R09: ffffffff830910e7
      R10: 0000000000000000 R11: 000000000000000a R12: ffffffff867fa0c0
      R13: 0000000000004242 R14: 0000000000000003 R15: dffffc0000000000
      FS:  00007fb135881700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020cc3000 CR3: 000000006d56a000 CR4: 00000000000006f0
      Stack:
       0000000000000000 000000000601a8c0 0000000000000000 ffffffff00004242
       424200003b9083c2 ffff88003def4041 ffffffff84e7e040 0000000000000246
       ffff88003a0911c0 0000000000000000 ffff88003a091298 ffff88003b9083ae
      Call Trace:
       [<ffffffff831100f4>] tcp_v4_send_reset+0x584/0x1700 net/ipv4/tcp_ipv4.c:643
       [<ffffffff83115b1b>] tcp_v4_rcv+0x198b/0x2e50 net/ipv4/tcp_ipv4.c:1718
       [<ffffffff83069d22>] ip_local_deliver_finish+0x332/0xad0
      net/ipv4/ip_input.c:216
      ...
      
      MD5 has a code path that calls __inet_lookup_listener with a null skb,
      so inet{6}_exact_dif_match needs to check skb against null before pulling
      the flag.
      
      Fixes: a04a480d ("net: Require exact match for TCP socket lookups if
             dif is l3mdev")
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Tested-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      da96786e
    • E
      tcp: fix potential memory corruption · ac9e70b1
      Eric Dumazet 提交于
      Imagine initial value of max_skb_frags is 17, and last
      skb in write queue has 15 frags.
      
      Then max_skb_frags is lowered to 14 or smaller value.
      
      tcp_sendmsg() will then be allowed to add additional page frags
      and eventually go past MAX_SKB_FRAGS, overflowing struct
      skb_shared_info.
      
      Fixes: 5f74f82e ("net:Add sysctl_max_skb_frags")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
      Cc: Håkon Bugge <haakon.bugge@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac9e70b1
    • M
      qede: Correctly map aggregation replacement pages · 9512925a
      Mintz, Yuval 提交于
      Driver allocates replacement buffers before-hand to make
      sure whenever an aggregation begins there would be a replacement
      for the Rx buffers, as we can't release the buffer until
      aggregation is terminated and driver logic assumes the Rx rings
      are always full.
      
      For every other Rx page that's being allocated [I.e., regular]
      the page is being completely mapped while for the replacement
      buffers only the first portion of the page is being mapped.
      This means that:
        a. Once replacement buffer replenishes the regular Rx ring,
      assuming there's more than a single packet on page we'd post unmapped
      memory toward HW [assuming mapping is actually done in granularity
      smaller than page].
        b. Unmaps are being done for the entire page, which is incorrect.
      
      Fixes: 55482edc ("qede: Add slowpath/fastpath support and enable hardware GRO")
      Signed-off-by: NYuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9512925a
    • H
      0b53df1e
    • W
      inet: fix sleeping inside inet_wait_for_connect() · 14135f30
      WANG Cong 提交于
      Andrey reported this kernel warning:
      
        WARNING: CPU: 0 PID: 4608 at kernel/sched/core.c:7724
        __might_sleep+0x14c/0x1a0 kernel/sched/core.c:7719
        do not call blocking ops when !TASK_RUNNING; state=1 set at
        [<ffffffff811f5a5c>] prepare_to_wait+0xbc/0x210
        kernel/sched/wait.c:178
        Modules linked in:
        CPU: 0 PID: 4608 Comm: syz-executor Not tainted 4.9.0-rc2+ #320
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
         ffff88006625f7a0 ffffffff81b46914 ffff88006625f818 0000000000000000
         ffffffff84052960 0000000000000000 ffff88006625f7e8 ffffffff81111237
         ffff88006aceac00 ffffffff00001e2c ffffed000cc4beff ffffffff84052960
        Call Trace:
         [<     inline     >] __dump_stack lib/dump_stack.c:15
         [<ffffffff81b46914>] dump_stack+0xb3/0x10f lib/dump_stack.c:51
         [<ffffffff81111237>] __warn+0x1a7/0x1f0 kernel/panic.c:550
         [<ffffffff8111132c>] warn_slowpath_fmt+0xac/0xd0 kernel/panic.c:565
         [<ffffffff811922fc>] __might_sleep+0x14c/0x1a0 kernel/sched/core.c:7719
         [<     inline     >] slab_pre_alloc_hook mm/slab.h:393
         [<     inline     >] slab_alloc_node mm/slub.c:2634
         [<     inline     >] slab_alloc mm/slub.c:2716
         [<ffffffff81508da0>] __kmalloc_track_caller+0x150/0x2a0 mm/slub.c:4240
         [<ffffffff8146be14>] kmemdup+0x24/0x50 mm/util.c:113
         [<ffffffff8388b2cf>] dccp_feat_clone_sp_val.part.5+0x4f/0xe0 net/dccp/feat.c:374
         [<     inline     >] dccp_feat_clone_sp_val net/dccp/feat.c:1141
         [<     inline     >] dccp_feat_change_recv net/dccp/feat.c:1141
         [<ffffffff8388d491>] dccp_feat_parse_options+0xaa1/0x13d0 net/dccp/feat.c:1411
         [<ffffffff83894f01>] dccp_parse_options+0x721/0x1010 net/dccp/options.c:128
         [<ffffffff83891280>] dccp_rcv_state_process+0x200/0x15b0 net/dccp/input.c:644
         [<ffffffff838b8a94>] dccp_v4_do_rcv+0xf4/0x1a0 net/dccp/ipv4.c:681
         [<     inline     >] sk_backlog_rcv ./include/net/sock.h:872
         [<ffffffff82b7ceb6>] __release_sock+0x126/0x3a0 net/core/sock.c:2044
         [<ffffffff82b7d189>] release_sock+0x59/0x1c0 net/core/sock.c:2502
         [<     inline     >] inet_wait_for_connect net/ipv4/af_inet.c:547
         [<ffffffff8316b2a2>] __inet_stream_connect+0x5d2/0xbb0 net/ipv4/af_inet.c:617
         [<ffffffff8316b8d5>] inet_stream_connect+0x55/0xa0 net/ipv4/af_inet.c:656
         [<ffffffff82b705e4>] SYSC_connect+0x244/0x2f0 net/socket.c:1533
         [<ffffffff82b72dd4>] SyS_connect+0x24/0x30 net/socket.c:1514
         [<ffffffff83fbf701>] entry_SYSCALL_64_fastpath+0x1f/0xc2
        arch/x86/entry/entry_64.S:209
      
      Unlike commit 26cabd31
      ("sched, net: Clean up sk_wait_event() vs. might_sleep()"), the
      sleeping function is called before schedule_timeout(), this is indeed
      a bug. Fix this by moving the wait logic to the new API, it is similar
      to commit ff960a73
      ("netdev, sched/wait: Fix sleeping inside wait event").
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14135f30
  2. 03 11月, 2016 4 次提交
  3. 02 11月, 2016 7 次提交
  4. 01 11月, 2016 17 次提交
  5. 31 10月, 2016 1 次提交
    • M
      r8152: Fix broken RX checksums. · b9a321b4
      Mark Lord 提交于
      The r8152 driver has been broken since (approx) 3.16.xx
      when support was added for hardware RX checksums
      on newer chip versions.  Symptoms include random
      segfaults and silent data corruption over NFS.
      
      The hardware checksum logig does not work on the VER_02
      dongles I have here when used with a slow embedded system CPU.
      Google reveals others reporting similar issues on Raspberry Pi.
      
      So, disable hardware RX checksum support for VER_02, and fix
      an obvious coding error for IPV6 checksums in the same function.
      
      Because this bug results in silent data corruption,
      it is a good candidate for back-porting to -stable >= 3.16.xx.
      Signed-off-by: NMark Lord <mlord@pobox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b9a321b4
  6. 30 10月, 2016 6 次提交
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 2a26d99b
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
       "Lots of fixes, mostly drivers as is usually the case.
      
         1) Don't treat zero DMA address as invalid in vmxnet3, from Alexey
            Khoroshilov.
      
         2) Fix element timeouts in netfilter's nft_dynset, from Anders K.
            Pedersen.
      
         3) Don't put aead_req crypto struct on the stack in mac80211, from
            Ard Biesheuvel.
      
         4) Several uninitialized variable warning fixes from Arnd Bergmann.
      
         5) Fix memory leak in cxgb4, from Colin Ian King.
      
         6) Fix bpf handling of VLAN header push/pop, from Daniel Borkmann.
      
         7) Several VRF semantic fixes from David Ahern.
      
         8) Set skb->protocol properly in ip6_tnl_xmit(), from Eli Cooper.
      
         9) Socket needs to be locked in udp_disconnect(), from Eric Dumazet.
      
        10) Div-by-zero on 32-bit fix in mlx4 driver, from Eugenia Emantayev.
      
        11) Fix stale link state during failover in NCSCI driver, from Gavin
            Shan.
      
        12) Fix netdev lower adjacency list traversal, from Ido Schimmel.
      
        13) Propvide proper handle when emitting notifications of filter
            deletes, from Jamal Hadi Salim.
      
        14) Memory leaks and big-endian issues in rtl8xxxu, from Jes Sorensen.
      
        15) Fix DESYNC_FACTOR handling in ipv6, from Jiri Bohac.
      
        16) Several routing offload fixes in mlxsw driver, from Jiri Pirko.
      
        17) Fix broadcast sync problem in TIPC, from Jon Paul Maloy.
      
        18) Validate chunk len before using it in SCTP, from Marcelo Ricardo
            Leitner.
      
        19) Revert a netns locking change that causes regressions, from Paul
            Moore.
      
        20) Add recursion limit to GRO handling, from Sabrina Dubroca.
      
        21) GFP_KERNEL in irq context fix in ibmvnic, from Thomas Falcon.
      
        22) Avoid accessing stale vxlan/geneve socket in data path, from
            Pravin Shelar"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (189 commits)
        geneve: avoid using stale geneve socket.
        vxlan: avoid using stale vxlan socket.
        qede: Fix out-of-bound fastpath memory access
        net: phy: dp83848: add dp83822 PHY support
        enic: fix rq disable
        tipc: fix broadcast link synchronization problem
        ibmvnic: Fix missing brackets in init_sub_crq_irqs
        ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context
        Revert "ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context"
        arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold
        net/mlx4_en: Save slave ethtool stats command
        net/mlx4_en: Fix potential deadlock in port statistics flow
        net/mlx4: Fix firmware command timeout during interrupt test
        net/mlx4_core: Do not access comm channel if it has not yet been initialized
        net/mlx4_en: Fix panic during reboot
        net/mlx4_en: Process all completions in RX rings after port goes up
        net/mlx4_en: Resolve dividing by zero in 32-bit system
        net/mlx4_core: Change the default value of enable_qos
        net/mlx4_core: Avoid setting ports to auto when only one port type is supported
        net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec
        ...
      2a26d99b
    • P
      geneve: avoid using stale geneve socket. · fceb9c3e
      pravin shelar 提交于
      This patch is similar to earlier vxlan patch.
      Geneve device close operation frees geneve socket. This
      operation can race with geneve-xmit function which
      dereferences geneve socket. Following patch uses RCU
      mechanism to avoid this situation.
      Signed-off-by: NPravin B Shelar <pshelar@ovn.org>
      Acked-by: NJohn W. Linville <linville@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fceb9c3e
    • P
      vxlan: avoid using stale vxlan socket. · c6fcc4fc
      pravin shelar 提交于
      When vxlan device is closed vxlan socket is freed. This
      operation can race with vxlan-xmit function which
      dereferences vxlan socket. Following patch uses RCU
      mechanism to avoid this situation.
      Signed-off-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c6fcc4fc
    • M
      qede: Fix out-of-bound fastpath memory access · 087892d2
      Mintz, Yuval 提交于
      Driver allocates a shadow array for transmitted SKBs with X entries;
      That means valid indices are {0,...,X - 1}. [X == 8191]
      Problem is the driver also uses X as a mask for a
      producer/consumer in order to choose the right entry in the
      array which allows access to entry X which is out of bounds.
      
      To fix this, simply allocate X + 1 entries in the shadow array.
      Signed-off-by: NYuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      087892d2
    • R
      net: phy: dp83848: add dp83822 PHY support · 30347834
      Roger Quadros 提交于
      This PHY has a compatible register set with DP83848x so
      add support for it.
      Acked-by: NAndrew F. Davis <afd@ti.com>
      Signed-off-by: NRoger Quadros <rogerq@ti.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      30347834
    • G
      enic: fix rq disable · 9fe1c98a
      Govindarajulu Varadarajan 提交于
      When MTU is changed from 9000 to 1500 while there is burst of inbound 9000
      bytes packets, adaptor sometimes delivers 9000 bytes packets to 1500 bytes
      buffers. This causes memory corruption and sometimes crash.
      
      This is because of a race condition in adaptor between "RQ disable"
      clearing descriptor mini-cache and mini-cache valid bit being set by
      completion of descriptor fetch. This can result in stale RQ desc being
      cached and used when packets arrive. In this case, the stale descriptor
      have old MTU value.
      
      Solution is to write RQ->disable twice. The first write will stop any
      further desc fetches, allowing the second disable to clear the mini-cache
      valid bit without danger of a race.
      
      Also, the check for rq->running becoming 0 after writing rq->enable to 0
      is not done properly. When incoming packets are flooding the interface,
      rq->running will pulse high for each dropped packet. Since the driver was
      waiting for 10us between each poll, it is possible to see rq->running = 1
      1000 times in a row, even though it is not actually stuck running.
      This results in false failure of vnic_rq_disable(). Fix is to try more
      than 1000 time without delay between polls to ensure we do not miss when
      running goes low.
      
      In old adaptors rq->enable needs to be re-written to 0 when posted_index
      is reset in vnic_rq_clean() in order to keep rq->prefetch_index in sync.
      Signed-off-by: NGovindarajulu Varadarajan <_govind@gmx.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9fe1c98a