1. 09 2月, 2018 1 次提交
    • D
      rxrpc: Don't put crypto buffers on the stack · 8c2f826d
      David Howells 提交于
      Don't put buffers of data to be handed to crypto on the stack as this may
      cause an assertion failure in the kernel (see below).  Fix this by using an
      kmalloc'd buffer instead.
      
      kernel BUG at ./include/linux/scatterlist.h:147!
      ...
      RIP: 0010:rxkad_encrypt_response.isra.6+0x191/0x1b0 [rxrpc]
      RSP: 0018:ffffbe2fc06cfca8 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff989277d59900 RCX: 0000000000000028
      RDX: 0000259dc06cfd88 RSI: 0000000000000025 RDI: ffffbe30406cfd88
      RBP: ffffbe2fc06cfd60 R08: ffffbe2fc06cfd08 R09: ffffbe2fc06cfd08
      R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff7c5f80d9f95
      R13: ffffbe2fc06cfd88 R14: ffff98927a3f7aa0 R15: ffffbe2fc06cfd08
      FS:  0000000000000000(0000) GS:ffff98927fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055b1ff28f0f8 CR3: 000000001b412003 CR4: 00000000003606f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       rxkad_respond_to_challenge+0x297/0x330 [rxrpc]
       rxrpc_process_connection+0xd1/0x690 [rxrpc]
       ? process_one_work+0x1c3/0x680
       ? __lock_is_held+0x59/0xa0
       process_one_work+0x249/0x680
       worker_thread+0x3a/0x390
       ? process_one_work+0x680/0x680
       kthread+0x121/0x140
       ? kthread_create_worker_on_cpu+0x70/0x70
       ret_from_fork+0x3a/0x50
      Reported-by: NJonathan Billings <jsbillings@jsbillings.org>
      Reported-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NJonathan Billings <jsbillings@jsbillings.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8c2f826d
  2. 08 2月, 2018 5 次提交
    • S
      tcp: tracepoint: only call trace_tcp_send_reset with full socket · 5c487bb9
      Song Liu 提交于
      tracepoint tcp_send_reset requires a full socket to work. However, it
      may be called when in TCP_TIME_WAIT:
      
              case TCP_TW_RST:
                      tcp_v6_send_reset(sk, skb);
                      inet_twsk_deschedule_put(inet_twsk(sk));
                      goto discard_it;
      
      To avoid this problem, this patch checks the socket with sk_fullsock()
      before calling trace_tcp_send_reset().
      
      Fixes: c24b14c4 ("tcp: add tracepoint trace_tcp_send_reset")
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NLawrence Brakmo <brakmo@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5c487bb9
    • M
      sch_netem: Bug fixing in calculating Netem interval · 043e337f
      Md. Islam 提交于
      In Kernel 4.15.0+, Netem does not work properly.
      
      Netem setup:
      
      tc qdisc add dev h1-eth0 root handle 1: netem delay 10ms 2ms
      
      Result:
      
      PING 172.16.101.2 (172.16.101.2) 56(84) bytes of data.
      64 bytes from 172.16.101.2: icmp_seq=1 ttl=64 time=22.8 ms
      64 bytes from 172.16.101.2: icmp_seq=2 ttl=64 time=10.9 ms
      64 bytes from 172.16.101.2: icmp_seq=3 ttl=64 time=10.9 ms
      64 bytes from 172.16.101.2: icmp_seq=5 ttl=64 time=11.4 ms
      64 bytes from 172.16.101.2: icmp_seq=6 ttl=64 time=11.8 ms
      64 bytes from 172.16.101.2: icmp_seq=4 ttl=64 time=4303 ms
      64 bytes from 172.16.101.2: icmp_seq=10 ttl=64 time=11.2 ms
      64 bytes from 172.16.101.2: icmp_seq=11 ttl=64 time=10.3 ms
      64 bytes from 172.16.101.2: icmp_seq=7 ttl=64 time=4304 ms
      64 bytes from 172.16.101.2: icmp_seq=8 ttl=64 time=4303 ms
      
      Patch:
      
      (rnd % (2 * sigma)) - sigma was overflowing s32. After applying the
      patch, I found following output which is desirable.
      
      PING 172.16.101.2 (172.16.101.2) 56(84) bytes of data.
      64 bytes from 172.16.101.2: icmp_seq=1 ttl=64 time=21.1 ms
      64 bytes from 172.16.101.2: icmp_seq=2 ttl=64 time=8.46 ms
      64 bytes from 172.16.101.2: icmp_seq=3 ttl=64 time=9.00 ms
      64 bytes from 172.16.101.2: icmp_seq=4 ttl=64 time=11.8 ms
      64 bytes from 172.16.101.2: icmp_seq=5 ttl=64 time=8.36 ms
      64 bytes from 172.16.101.2: icmp_seq=6 ttl=64 time=11.8 ms
      64 bytes from 172.16.101.2: icmp_seq=7 ttl=64 time=8.11 ms
      64 bytes from 172.16.101.2: icmp_seq=8 ttl=64 time=10.0 ms
      64 bytes from 172.16.101.2: icmp_seq=9 ttl=64 time=11.3 ms
      64 bytes from 172.16.101.2: icmp_seq=10 ttl=64 time=11.5 ms
      64 bytes from 172.16.101.2: icmp_seq=11 ttl=64 time=10.2 ms
      Reviewed-by: NStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      043e337f
    • D
      net/ipv6: onlink nexthop checks should default to main table · 44750f84
      David Ahern 提交于
      Because of differences in how ipv4 and ipv6 handle fib lookups,
      verification of nexthops with onlink flag need to default to the main
      table rather than the local table used by IPv4. As it stands an
      address within a connected route on device 1 can be used with
      onlink on device 2. Updating the table properly rejects the route
      due to the egress device mismatch.
      
      Update the extack message as well to show it could be a device
      mismatch for the nexthop spec.
      
      Fixes: fc1e64e1 ("net/ipv6: Add support for onlink flag")
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44750f84
    • D
      net/ipv6: Handle reject routes with onlink flag · 58e354c0
      David Ahern 提交于
      Verification of nexthops with onlink flag need to handle unreachable
      routes. The lookup is only intended to validate the gateway address
      is not a local address and if the gateway resolves the egress device
      must match the given device. Hence, hitting any default reject route
      is ok.
      
      Fixes: fc1e64e1 ("net/ipv6: Add support for onlink flag")
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      58e354c0
    • D
      rxrpc: Fix received abort handling · 17e9e23b
      David Howells 提交于
      AF_RXRPC is incorrectly sending back to the server any abort it receives
      for a client connection.  This is due to the final-ACK offload to the
      connection event processor patch.  The abort code is copied into the
      last-call information on the connection channel and then the event
      processor is set.
      
      Instead, the following should be done:
      
       (1) In the case of a final-ACK for a successful call, the ACK should be
           scheduled as before.
      
       (2) In the case of a locally generated ABORT, the ABORT details should be
           cached for sending in response to further packets related to that
           call and no further action scheduled at call disconnect time.
      
       (3) In the case of an ACK received from the peer, the call should be
           considered dead, no ABORT should be transmitted at this time.  In
           response to further non-ABORT packets from the peer relating to this
           call, an RX_USER_ABORT ABORT should be transmitted.
      
       (4) In the case of a call killed due to network error, an RX_USER_ABORT
           ABORT should be cached for transmission in response to further
           packets, but no ABORT should be sent at this time.
      
      Fixes: 3136ef49 ("rxrpc: Delay terminal ACK transmission on a client call")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      17e9e23b
  3. 07 2月, 2018 10 次提交
    • F
      netfilter: nf_flow_offload: fix use-after-free and a resource leak · 0ff90b6c
      Felix Fietkau 提交于
      flow_offload_del frees the flow, so all associated resource must be
      freed before.
      
      Since the ct entry in struct flow_offload_entry was allocated by
      flow_offload_alloc, it should be freed by flow_offload_free to take care
      of the error handling path when flow_offload_add fails.
      
      While at it, make flow_offload_del static, since it should never be
      called directly, only from the gc step
      Signed-off-by: NFelix Fietkau <nbd@nbd.name>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      0ff90b6c
    • M
      build_bug.h: remove BUILD_BUG_ON_NULL() · 32b395a1
      Masahiro Yamada 提交于
      This macro is only used by net/ipv6/mcast.c, but there is no reason
      why it must be BUILD_BUG_ON_NULL().
      
      Replace it with BUILD_BUG_ON_ZERO(), and remove BUILD_BUG_ON_NULL()
      definition from <linux/build_bug.h>.
      
      Link: http://lkml.kernel.org/r/1515121833-3174-3-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Ian Abbott <abbotti@mev.co.uk>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      32b395a1
    • Y
      bitmap: replace bitmap_{from,to}_u32array · 3aa56885
      Yury Norov 提交于
      with bitmap_{from,to}_arr32 over the kernel. Additionally to it:
      * __check_eq_bitmap() now takes single nbits argument.
      * __check_eq_u32_array is not used in new test but may be used in
        future. So I don't remove it here, but annotate as __used.
      
      Tested on arm64 and 32-bit BE mips.
      
      [arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32]
        Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com
      [ynorov@caviumnetworks.com: fix net/core/ethtool.c]
        Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad
      Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.comSigned-off-by: NYury Norov <ynorov@caviumnetworks.com>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: David Decotigny <decot@googlers.com>,
      Cc: David S. Miller <davem@davemloft.net>,
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3aa56885
    • P
      netfilter: nf_tables: fix flowtable free · b408c5b0
      Pablo Neira Ayuso 提交于
      Every flow_offload entry is added into the table twice. Because of this,
      rhashtable_free_and_destroy can't be used, since it would call kfree for
      each flow_offload object twice.
      
      This patch cleans up the flowtable via nf_flow_table_iterate() to
      schedule removal of entries by setting on the dying bit, then there is
      an explicitly invocation of the garbage collector to release resources.
      
      Based on patch from Felix Fietkau.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      b408c5b0
    • P
      netfilter: nft_flow_offload: move flowtable cleanup routines to nf_flow_table · c0ea1bcb
      Pablo Neira Ayuso 提交于
      Move the flowtable cleanup routines to nf_flow_table and expose the
      nf_flow_table_cleanup() helper function.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      c0ea1bcb
    • C
      netfilter: xt_RATEEST: acquire xt_rateest_mutex for hash insert · 7dc68e98
      Cong Wang 提交于
      rateest_hash is supposed to be protected by xt_rateest_mutex,
      and, as suggested by Eric, lookup and insert should be atomic,
      so we should acquire the xt_rateest_mutex once for both.
      
      So introduce a non-locking helper for internal use and keep the
      locking one for external.
      
      Reported-by: <syzbot+5cb189720978275e4c75@syzkaller.appspotmail.com>
      Fixes: 5859034d ("[NETFILTER]: x_tables: add RATEEST target")
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: NFlorian Westphal <fw@strlen.de>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      7dc68e98
    • G
      RDS: IB: Fix null pointer issue · 2c0aa086
      Guanglei Li 提交于
      Scenario:
      1. Port down and do fail over
      2. Ap do rds_bind syscall
      
      PID: 47039  TASK: ffff89887e2fe640  CPU: 47  COMMAND: "kworker/u:6"
       #0 [ffff898e35f159f0] machine_kexec at ffffffff8103abf9
       #1 [ffff898e35f15a60] crash_kexec at ffffffff810b96e3
       #2 [ffff898e35f15b30] oops_end at ffffffff8150f518
       #3 [ffff898e35f15b60] no_context at ffffffff8104854c
       #4 [ffff898e35f15ba0] __bad_area_nosemaphore at ffffffff81048675
       #5 [ffff898e35f15bf0] bad_area_nosemaphore at ffffffff810487d3
       #6 [ffff898e35f15c00] do_page_fault at ffffffff815120b8
       #7 [ffff898e35f15d10] page_fault at ffffffff8150ea95
          [exception RIP: unknown or invalid address]
          RIP: 0000000000000000  RSP: ffff898e35f15dc8  RFLAGS: 00010282
          RAX: 00000000fffffffe  RBX: ffff889b77f6fc00  RCX:ffffffff81c99d88
          RDX: 0000000000000000  RSI: ffff896019ee08e8  RDI:ffff889b77f6fc00
          RBP: ffff898e35f15df0   R8: ffff896019ee08c8  R9:0000000000000000
          R10: 0000000000000400  R11: 0000000000000000  R12:ffff896019ee08c0
          R13: ffff889b77f6fe68  R14: ffffffff81c99d80  R15: ffffffffa022a1e0
          ORIG_RAX: ffffffffffffffff  CS: 0010 SS: 0018
       #8 [ffff898e35f15dc8] cma_ndev_work_handler at ffffffffa022a228 [rdma_cm]
       #9 [ffff898e35f15df8] process_one_work at ffffffff8108a7c6
       #10 [ffff898e35f15e58] worker_thread at ffffffff8108bda0
       #11 [ffff898e35f15ee8] kthread at ffffffff81090fe6
      
      PID: 45659  TASK: ffff880d313d2500  CPU: 31  COMMAND: "oracle_45659_ap"
       #0 [ffff881024ccfc98] __schedule at ffffffff8150bac4
       #1 [ffff881024ccfd40] schedule at ffffffff8150c2cf
       #2 [ffff881024ccfd50] __mutex_lock_slowpath at ffffffff8150cee7
       #3 [ffff881024ccfdc0] mutex_lock at ffffffff8150cdeb
       #4 [ffff881024ccfde0] rdma_destroy_id at ffffffffa022a027 [rdma_cm]
       #5 [ffff881024ccfe10] rds_ib_laddr_check at ffffffffa0357857 [rds_rdma]
       #6 [ffff881024ccfe50] rds_trans_get_preferred at ffffffffa0324c2a [rds]
       #7 [ffff881024ccfe80] rds_bind at ffffffffa031d690 [rds]
       #8 [ffff881024ccfeb0] sys_bind at ffffffff8142a670
      
      PID: 45659                          PID: 47039
      rds_ib_laddr_check
        /* create id_priv with a null event_handler */
        rdma_create_id
        rdma_bind_addr
          cma_acquire_dev
            /* add id_priv to cma_dev->id_list */
            cma_attach_to_dev
                                          cma_ndev_work_handler
                                            /* event_hanlder is null */
                                            id_priv->id.event_handler
      Signed-off-by: NGuanglei Li <guanglei.li@oracle.com>
      Signed-off-by: NHonglei Wang <honglei.wang@oracle.com>
      Reviewed-by: NJunxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: NYanjun Zhu <yanjun.zhu@oracle.com>
      Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>
      Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Acked-by: NDoug Ledford <dledford@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c0aa086
    • W
      net: erspan: fix erspan config overwrite · 39f57f67
      William Tu 提交于
      When an erspan tunnel device receives an erpsan packet with different
      tunnel metadata (ex: version, index, hwid, direction), existing code
      overwrites the tunnel device's erspan configuration with the received
      packet's metadata.  The patch fixes it.
      
      Fixes: 1a66a836 ("gre: add collect_md mode to ERSPAN tunnel")
      Fixes: f551c91d ("net: erspan: introduce erspan v2 for ip_gre")
      Fixes: ef7baf5e ("ip6_gre: add ip6 erspan collect_md mode")
      Fixes: 94d7d8f2 ("ip6_gre: add erspan v2 support")
      Signed-off-by: NWilliam Tu <u9012063@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      39f57f67
    • W
      net: erspan: fix metadata extraction · 3df19283
      William Tu 提交于
      Commit d350a823 ("net: erspan: create erspan metadata uapi header")
      moves the erspan 'version' in front of the 'struct erspan_md2' for
      later extensibility reason.  This breaks the existing erspan metadata
      extraction code because the erspan_md2 then has a 4-byte offset
      to between the erspan_metadata and erspan_base_hdr.  This patch
      fixes it.
      
      Fixes: 1a66a836 ("gre: add collect_md mode to ERSPAN tunnel")
      Fixes: ef7baf5e ("ip6_gre: add ip6 erspan collect_md mode")
      Fixes: 1d7e2ed2 ("net: erspan: refactor existing erspan code")
      Signed-off-by: NWilliam Tu <u9012063@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3df19283
    • P
      cls_u32: fix use after free in u32_destroy_key() · d7cdee5e
      Paolo Abeni 提交于
      Li Shuang reported an Oops with cls_u32 due to an use-after-free
      in u32_destroy_key(). The use-after-free can be triggered with:
      
      dev=lo
      tc qdisc add dev $dev root handle 1: htb default 10
      tc filter add dev $dev parent 1: prio 5 handle 1: protocol ip u32 divisor 256
      tc filter add dev $dev protocol ip parent 1: prio 5 u32 ht 800:: match ip dst\
       10.0.0.0/8 hashkey mask 0x0000ff00 at 16 link 1:
      tc qdisc del dev $dev root
      
      Which causes the following kasan splat:
      
       ==================================================================
       BUG: KASAN: use-after-free in u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
       Read of size 4 at addr ffff881b83dae618 by task kworker/u48:5/571
      
       CPU: 17 PID: 571 Comm: kworker/u48:5 Not tainted 4.15.0+ #87
       Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
       Workqueue: tc_filter_workqueue u32_delete_key_freepf_work [cls_u32]
       Call Trace:
        dump_stack+0xd6/0x182
        ? dma_virt_map_sg+0x22e/0x22e
        print_address_description+0x73/0x290
        kasan_report+0x277/0x360
        ? u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
        u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
        u32_delete_key_freepf_work+0x1c/0x30 [cls_u32]
        process_one_work+0xae0/0x1c80
        ? sched_clock+0x5/0x10
        ? pwq_dec_nr_in_flight+0x3c0/0x3c0
        ? _raw_spin_unlock_irq+0x29/0x40
        ? trace_hardirqs_on_caller+0x381/0x570
        ? _raw_spin_unlock_irq+0x29/0x40
        ? finish_task_switch+0x1e5/0x760
        ? finish_task_switch+0x208/0x760
        ? preempt_notifier_dec+0x20/0x20
        ? __schedule+0x839/0x1ee0
        ? check_noncircular+0x20/0x20
        ? firmware_map_remove+0x73/0x73
        ? find_held_lock+0x39/0x1c0
        ? worker_thread+0x434/0x1820
        ? lock_contended+0xee0/0xee0
        ? lock_release+0x1100/0x1100
        ? init_rescuer.part.16+0x150/0x150
        ? retint_kernel+0x10/0x10
        worker_thread+0x216/0x1820
        ? process_one_work+0x1c80/0x1c80
        ? lock_acquire+0x1a5/0x540
        ? lock_downgrade+0x6b0/0x6b0
        ? sched_clock+0x5/0x10
        ? lock_release+0x1100/0x1100
        ? compat_start_thread+0x80/0x80
        ? do_raw_spin_trylock+0x190/0x190
        ? _raw_spin_unlock_irq+0x29/0x40
        ? trace_hardirqs_on_caller+0x381/0x570
        ? _raw_spin_unlock_irq+0x29/0x40
        ? finish_task_switch+0x1e5/0x760
        ? finish_task_switch+0x208/0x760
        ? preempt_notifier_dec+0x20/0x20
        ? __schedule+0x839/0x1ee0
        ? kmem_cache_alloc_trace+0x143/0x320
        ? firmware_map_remove+0x73/0x73
        ? sched_clock+0x5/0x10
        ? sched_clock_cpu+0x18/0x170
        ? find_held_lock+0x39/0x1c0
        ? schedule+0xf3/0x3b0
        ? lock_downgrade+0x6b0/0x6b0
        ? __schedule+0x1ee0/0x1ee0
        ? do_wait_intr_irq+0x340/0x340
        ? do_raw_spin_trylock+0x190/0x190
        ? _raw_spin_unlock_irqrestore+0x32/0x60
        ? process_one_work+0x1c80/0x1c80
        ? process_one_work+0x1c80/0x1c80
        kthread+0x312/0x3d0
        ? kthread_create_worker_on_cpu+0xc0/0xc0
        ret_from_fork+0x3a/0x50
      
       Allocated by task 1688:
        kasan_kmalloc+0xa0/0xd0
        __kmalloc+0x162/0x380
        u32_change+0x1220/0x3c9e [cls_u32]
        tc_ctl_tfilter+0x1ba6/0x2f80
        rtnetlink_rcv_msg+0x4f0/0x9d0
        netlink_rcv_skb+0x124/0x320
        netlink_unicast+0x430/0x600
        netlink_sendmsg+0x8fa/0xd60
        sock_sendmsg+0xb1/0xe0
        ___sys_sendmsg+0x678/0x980
        __sys_sendmsg+0xc4/0x210
        do_syscall_64+0x232/0x7f0
        return_from_SYSCALL_64+0x0/0x75
      
       Freed by task 112:
        kasan_slab_free+0x71/0xc0
        kfree+0x114/0x320
        rcu_process_callbacks+0xc3f/0x1600
        __do_softirq+0x2bf/0xc06
      
       The buggy address belongs to the object at ffff881b83dae600
        which belongs to the cache kmalloc-4096 of size 4096
       The buggy address is located 24 bytes inside of
        4096-byte region [ffff881b83dae600, ffff881b83daf600)
       The buggy address belongs to the page:
       page:ffffea006e0f6a00 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
       flags: 0x17ffffc0008100(slab|head)
       raw: 0017ffffc0008100 0000000000000000 0000000000000000 0000000100070007
       raw: dead000000000100 dead000000000200 ffff880187c0e600 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
        ffff881b83dae500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffff881b83dae580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       >ffff881b83dae600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                   ^
        ffff881b83dae680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        ffff881b83dae700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ==================================================================
      
      The problem is that the htnode is freed before the linked knodes and the
      latter will try to access the first at u32_destroy_key() time.
      This change addresses the issue using the htnode refcnt to guarantee
      the correct free order. While at it also add a RCU annotation,
      to keep sparse happy.
      
      v1 -> v2: use rtnl_derefence() instead of RCU read locks
      v2 -> v3:
        - don't check refcnt in u32_destroy_hnode()
        - cleaned-up u32_destroy() implementation
        - cleaned-up code comment
      v3 -> v4:
        - dropped unneeded comment
      Reported-by: NLi Shuang <shuali@redhat.com>
      Fixes: c0d378ef ("net_sched: use tcf_queue_work() in u32 filter")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7cdee5e
  4. 06 2月, 2018 2 次提交
    • T
      sctp: fix dst refcnt leak in sctp_v4_get_dst · 4a31a6b1
      Tommi Rantala 提交于
      Fix dst reference count leak in sctp_v4_get_dst() introduced in commit
      410f0383 ("sctp: add routing output fallback"):
      
      When walking the address_list, successive ip_route_output_key() calls
      may return the same rt->dst with the reference incremented on each call.
      
      The code would not decrement the dst refcount when the dst pointer was
      identical from the previous iteration, causing the dst refcnt leak.
      
      Testcase:
        ip netns add TEST
        ip netns exec TEST ip link set lo up
        ip link add dummy0 type dummy
        ip link add dummy1 type dummy
        ip link add dummy2 type dummy
        ip link set dev dummy0 netns TEST
        ip link set dev dummy1 netns TEST
        ip link set dev dummy2 netns TEST
        ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0
        ip netns exec TEST ip link set dummy0 up
        ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1
        ip netns exec TEST ip link set dummy1 up
        ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2
        ip netns exec TEST ip link set dummy2 up
        ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3
        ip netns del TEST
      
      In 4.4 and 4.9 kernels this results to:
        [  354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1
        [  364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1
        [  374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1
        [  384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1
        [  395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1
        [  405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1
        ...
      
      Fixes: 410f0383 ("sctp: add routing output fallback")
      Fixes: 0ca50d12 ("sctp: fix src address selection if using secondary addresses")
      Signed-off-by: NTommi Rantala <tommi.t.rantala@nokia.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4a31a6b1
    • A
      sctp: fix dst refcnt leak in sctp_v6_get_dst() · 957d761c
      Alexey Kodanev 提交于
      When going through the bind address list in sctp_v6_get_dst() and
      the previously found address is better ('matchlen > bmatchlen'),
      the code continues to the next iteration without releasing currently
      held destination.
      
      Fix it by releasing 'bdst' before continue to the next iteration, and
      instead of introducing one more '!IS_ERR(bdst)' check for dst_release(),
      move the already existed one right after ip6_dst_lookup_flow(), i.e. we
      shouldn't proceed further if we get an error for the route lookup.
      
      Fixes: dbc2b5e9 ("sctp: fix src address selection if using secondary addresses for ipv6")
      Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      957d761c
  5. 03 2月, 2018 6 次提交
    • R
      Revert "defer call to mem_cgroup_sk_alloc()" · edbe69ef
      Roman Gushchin 提交于
      This patch effectively reverts commit 9f1c2674 ("net: memcontrol:
      defer call to mem_cgroup_sk_alloc()").
      
      Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks
      memcg socket memory accounting, as packets received before memcg
      pointer initialization are not accounted and are causing refcounting
      underflow on socket release.
      
      Actually the free-after-use problem was fixed by
      commit c0576e39 ("net: call cgroup_sk_alloc() earlier in
      sk_clone_lock()") for the cgroup pointer.
      
      So, let's revert it and call mem_cgroup_sk_alloc() just before
      cgroup_sk_alloc(). This is safe, as we hold a reference to the socket
      we're cloning, and it holds a reference to the memcg.
      
      Also, let's drop BUG_ON(mem_cgroup_is_root()) check from
      mem_cgroup_sk_alloc(). I see no reasons why bumping the root
      memcg counter is a good reason to panic, and there are no realistic
      ways to hit it.
      Signed-off-by: NRoman Gushchin <guro@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      edbe69ef
    • E
      soreuseport: fix mem leak in reuseport_add_sock() · 4db428a7
      Eric Dumazet 提交于
      reuseport_add_sock() needs to deal with attaching a socket having
      its own sk_reuseport_cb, after a prior
      setsockopt(SO_ATTACH_REUSEPORT_?BPF)
      
      Without this fix, not only a WARN_ONCE() was issued, but we were also
      leaking memory.
      
      Thanks to sysbot and Eric Biggers for providing us nice C repros.
      
      ------------[ cut here ]------------
      socket already in reuseport group
      WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119  
      reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245
      Hardware name: Google Google Compute Engine/Google Compute Engine,
      BIOS  
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:53
        panic+0x1e4/0x41c kernel/panic.c:183
        __warn+0x1dc/0x200 kernel/panic.c:547
        report_bug+0x211/0x2d0 lib/bug.c:184
        fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
        fixup_bug arch/x86/kernel/traps.c:247 [inline]
        do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
        do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
        invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079
      
      Fixes: ef456144 ("soreuseport: define reuseport groups")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: syzbot+c0ea2226f77a42936bf7@syzkaller.appspotmail.com
      Acked-by: NCraig Gallek <kraig@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4db428a7
    • P
      cls_u32: add missing RCU annotation. · 058a6c03
      Paolo Abeni 提交于
      In a couple of points of the control path, n->ht_down is currently
      accessed without the required RCU annotation. The accesses are
      safe, but sparse complaints. Since we already held the
      rtnl lock, let use rtnl_dereference().
      
      Fixes: a1b7c5fd ("net: sched: add cls_u32 offload hooks for netdevs")
      Fixes: de5df632 ("net: sched: cls_u32 changes to knode must appear atomic to readers")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      058a6c03
    • P
      netfilter: nft_flow_offload: no need to flush entries on module removal · 992cfc7c
      Pablo Neira Ayuso 提交于
      nft_flow_offload module removal does not require to flush existing
      flowtables, it is valid to remove this module while keeping flowtables
      around.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      992cfc7c
    • P
      netfilter: nft_flow_offload: wait for garbage collector to run after cleanup · c7f0030b
      Pablo Neira Ayuso 提交于
      If netdevice goes down, then flowtable entries are scheduled to be
      removed. Wait for garbage collector to have a chance to run so it can
      delete them from the hashtable.
      
      The flush call might sleep, so hold the nfnl mutex from
      nft_flow_table_iterate() instead of rcu read side lock. The use of the
      nfnl mutex is also implicitly fixing races between updates via nfnetlink
      and netdevice event.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      c7f0030b
    • C
      netfilter: xt_cgroup: initialize info->priv in cgroup_mt_check_v1() · ba7cd5d9
      Cong Wang 提交于
      xt_cgroup_info_v1->priv is an internal pointer only used for kernel,
      we should not trust what user-space provides.
      
      Reported-by: <syzbot+4fbcfcc0d2e6592bd641@syzkaller.appspotmail.com>
      Fixes: c38c4597 ("netfilter: implement xt_cgroup cgroup2 path match")
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      ba7cd5d9
  6. 02 2月, 2018 4 次提交
    • P
      netfilter: flowtable infrastructure depends on NETFILTER_INGRESS · 6be3bcd7
      Pablo Neira Ayuso 提交于
      config NF_FLOW_TABLE depends on NETFILTER_INGRESS. If users forget to
      enable this toggle, flowtable registration fails with EOPNOTSUPP.
      
      Moreover, turn 'select NF_FLOW_TABLE' in every flowtable family flavour
      into dependency instead, otherwise this new dependency on
      NETFILTER_INGRESS causes a warning. This also allows us to remove the
      explicit dependency between family flowtables <-> NF_TABLES and
      NF_CONNTRACK, given they depend on the NF_FLOW_TABLE core that already
      expresses the general dependencies for this new infrastructure.
      
      Moreover, NF_FLOW_TABLE_INET depends on NF_FLOW_TABLE_IPV4 and
      NF_FLOWTABLE_IPV6, which already depends on NF_FLOW_TABLE. So we can get
      rid of direct dependency with NF_FLOW_TABLE.
      
      In general, let's avoid 'select', it just makes things more complicated.
      Reported-by: NJohn Crispin <john@phrozen.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      6be3bcd7
    • S
      netfilter: ipv6: nf_defrag: Kill frag queue on RFC2460 failure · ea23d5e3
      Subash Abhinov Kasiviswanathan 提交于
      Failures were seen in ICMPv6 fragmentation timeout tests if they were
      run after the RFC2460 failure tests. Kernel was not sending out the
      ICMPv6 fragment reassembly time exceeded packet after the fragmentation
      reassembly timeout of 1 minute had elapsed.
      
      This happened because the frag queue was not released if an error in
      IPv6 fragmentation header was detected by RFC2460.
      
      Fixes: 83f1999c ("netfilter: ipv6: nf_defrag: Pass on packets to stack per RFC2460")
      Signed-off-by: NSubash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      ea23d5e3
    • M
      netfilter: x_tables: make allocation less aggressive · 0537250f
      Michal Hocko 提交于
      syzbot has noticed that xt_alloc_table_info can allocate a lot of memory.
      This is an admin only interface but an admin in a namespace is sufficient
      as well.  eacd86ca ("net/netfilter/x_tables.c: use kvmalloc() in
      xt_alloc_table_info()") has changed the opencoded kmalloc->vmalloc
      fallback into kvmalloc.  It has dropped __GFP_NORETRY on the way because
      vmalloc has simply never fully supported __GFP_NORETRY semantic.  This is
      still the case because e.g.  page tables backing the vmalloc area are
      hardcoded GFP_KERNEL.
      
      Revert back to __GFP_NORETRY as a poors man defence against excessively
      large allocation request here.  We will not rule out the OOM killer
      completely but __GFP_NORETRY should at least stop the large request in
      most cases.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Fixes: eacd86ca ("net/netfilter/x_tables.c: use kvmalloc() in xt_alloc_tableLink: http://lkml.kernel.org/r/20180130140104.GE21609@dhcp22.suse.czSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NFlorian Westphal <fw@strlen.de>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      0537250f
    • E
      net: igmp: add a missing rcu locking section · e7aadb27
      Eric Dumazet 提交于
      Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
      
      Timer callbacks do not ensure this locking.
      
      =============================
      WARNING: suspicious RCU usage
      4.15.0+ #200 Not tainted
      -----------------------------
      ./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      3 locks held by syzkaller616973/4074:
       #0:  (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
       #1:  ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
       #1:  ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
       #2:  (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
       #2:  (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
      
      stack backtrace:
      CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:53
       lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
       __in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
       igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
       igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
       add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
       add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
       igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
       igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
       igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
       call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
       expire_timers kernel/time/timer.c:1363 [inline]
       __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
       run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
       __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
       invoke_softirq kernel/softirq.c:365 [inline]
       irq_exit+0x1cc/0x200 kernel/softirq.c:405
       exiting_irq arch/x86/include/asm/apic.h:541 [inline]
       smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
      
      Fixes: a46182b0 ("net: igmp: Use correct source address on IGMPv3 reports")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e7aadb27
  7. 01 2月, 2018 7 次提交
  8. 31 1月, 2018 5 次提交
    • P
      netfilter: on sockopt() acquire sock lock only in the required scope · 3f34cfae
      Paolo Abeni 提交于
      Syzbot reported several deadlocks in the netfilter area caused by
      rtnl lock and socket lock being acquired with a different order on
      different code paths, leading to backtraces like the following one:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.15.0-rc9+ #212 Not tainted
      ------------------------------------------------------
      syzkaller041579/3682 is trying to acquire lock:
        (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock
      include/net/sock.h:1463 [inline]
        (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>]
      do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
      
      but task is already holding lock:
        (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (rtnl_mutex){+.+.}:
              __mutex_lock_common kernel/locking/mutex.c:756 [inline]
              __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
              mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
              rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
              register_netdevice_notifier+0xad/0x860 net/core/dev.c:1607
              tee_tg_check+0x1a0/0x280 net/netfilter/xt_TEE.c:106
              xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:845
              check_target net/ipv6/netfilter/ip6_tables.c:538 [inline]
              find_check_entry.isra.7+0x935/0xcf0
      net/ipv6/netfilter/ip6_tables.c:580
              translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:749
              do_replace net/ipv6/netfilter/ip6_tables.c:1165 [inline]
              do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1691
              nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
              nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
              ipv6_setsockopt+0x115/0x150 net/ipv6/ipv6_sockglue.c:928
              udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
              sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
              SYSC_setsockopt net/socket.c:1849 [inline]
              SyS_setsockopt+0x189/0x360 net/socket.c:1828
              entry_SYSCALL_64_fastpath+0x29/0xa0
      
      -> #0 (sk_lock-AF_INET6){+.+.}:
              lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914
              lock_sock_nested+0xc2/0x110 net/core/sock.c:2780
              lock_sock include/net/sock.h:1463 [inline]
              do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
              ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922
              udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
              sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
              SYSC_setsockopt net/socket.c:1849 [inline]
              SyS_setsockopt+0x189/0x360 net/socket.c:1828
              entry_SYSCALL_64_fastpath+0x29/0xa0
      
      other info that might help us debug this:
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(rtnl_mutex);
                                      lock(sk_lock-AF_INET6);
                                      lock(rtnl_mutex);
         lock(sk_lock-AF_INET6);
      
        *** DEADLOCK ***
      
      1 lock held by syzkaller041579/3682:
        #0:  (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      The problem, as Florian noted, is that nf_setsockopt() is always
      called with the socket held, even if the lock itself is required only
      for very tight scopes and only for some operation.
      
      This patch addresses the issues moving the lock_sock() call only
      where really needed, namely in ipv*_getorigdst(), so that nf_setsockopt()
      does not need anymore to acquire both locks.
      
      Fixes: 22265a5c ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
      Reported-by: syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com
      Suggested-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      3f34cfae
    • V
      tls: Add support for encryption using async offload accelerator · a54667f6
      Vakul Garg 提交于
      Async crypto accelerators (e.g. drivers/crypto/caam) support offloading
      GCM operation. If they are enabled, crypto_aead_encrypt() return error
      code -EINPROGRESS. In this case tls_do_encryption() needs to wait on a
      completion till the time the response for crypto offload request is
      received.
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a54667f6
    • N
      ip6mr: fix stale iterator · 4adfa79f
      Nikolay Aleksandrov 提交于
      When we dump the ip6mr mfc entries via proc, we initialize an iterator
      with the table to dump but we don't clear the cache pointer which might
      be initialized from a prior read on the same descriptor that ended. This
      can result in lock imbalance (an unnecessary unlock) leading to other
      crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
      Thanks for the reliable reproducer.
      
      Here's syzbot's trace:
       WARNING: bad unlock balance detected!
       4.15.0-rc3+ #128 Not tainted
       syzkaller971460/3195 is trying to release lock (mrt_lock) at:
       [<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
       but there are no more locks to release!
      
       other info that might help us debug this:
       1 lock held by syzkaller971460/3195:
        #0:  (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
       fs/seq_file.c:165
      
       stack backtrace:
       CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
       Google 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:53
        print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
        __lock_release kernel/locking/lockdep.c:3775 [inline]
        lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
        __raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
        _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
        ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
        traverse+0x3bc/0xa00 fs/seq_file.c:135
        seq_read+0x96a/0x13d0 fs/seq_file.c:189
        proc_reg_read+0xef/0x170 fs/proc/inode.c:217
        do_loop_readv_writev fs/read_write.c:673 [inline]
        do_iter_read+0x3db/0x5b0 fs/read_write.c:897
        compat_readv+0x1bf/0x270 fs/read_write.c:1140
        do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
        C_SYSC_preadv fs/read_write.c:1209 [inline]
        compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
        do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
        do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
        entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
       RIP: 0023:0xf7f73c79
       RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
       RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
       RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
       RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       BUG: sleeping function called from invalid context at lib/usercopy.c:25
       in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
       INFO: lockdep is turned off.
       CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
       Google 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:53
        ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
        __might_sleep+0x95/0x190 kernel/sched/core.c:6013
        __might_fault+0xab/0x1d0 mm/memory.c:4525
        _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
        copy_to_user include/linux/uaccess.h:155 [inline]
        seq_read+0xcb4/0x13d0 fs/seq_file.c:279
        proc_reg_read+0xef/0x170 fs/proc/inode.c:217
        do_loop_readv_writev fs/read_write.c:673 [inline]
        do_iter_read+0x3db/0x5b0 fs/read_write.c:897
        compat_readv+0x1bf/0x270 fs/read_write.c:1140
        do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
        C_SYSC_preadv fs/read_write.c:1209 [inline]
        compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
        do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
        do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
        entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
       RIP: 0023:0xf7f73c79
       RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
       RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
       RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
       RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
       lib/usercopy.c:26
      Reported-by: Nsyzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669@syzkaller.appspotmail.com>
      Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4adfa79f
    • U
      net/sched: kconfig: Remove blank help texts · 11eab148
      Ulf Magnusson 提交于
      Blank help texts are probably either a typo, a Kconfig misunderstanding,
      or some kind of half-committing to adding a help text (in which case a
      TODO comment would be clearer, if the help text really can't be added
      right away).
      
      Best to remove them, IMO.
      Signed-off-by: NUlf Magnusson <ulfalizer@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      11eab148
    • G
      openvswitch: meter: Use 64-bit arithmetic instead of 32-bit · 5b7789e8
      Gustavo A. R. Silva 提交于
      Add suffix LL to constant 1000 in order to give the compiler
      complete information about the proper arithmetic to use. Notice
      that this constant is used in a context that expects an expression
      of type long long int (64 bits, signed).
      
      The expression (band->burst_size + band->rate) * 1000 is currently
      being evaluated using 32-bit arithmetic.
      
      Addresses-Coverity-ID: 1461563 ("Unintentional integer overflow")
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5b7789e8