1. 12 10月, 2019 5 次提交
    • J
      nl80211: validate beacon head · 1bd17a73
      Johannes Berg 提交于
      commit f88eb7c0d002a67ef31aeb7850b42ff69abc46dc upstream.
      
      We currently don't validate the beacon head, i.e. the header,
      fixed part and elements that are to go in front of the TIM
      element. This means that the variable elements there can be
      malformed, e.g. have a length exceeding the buffer size, but
      most downstream code from this assumes that this has already
      been checked.
      
      Add the necessary checks to the netlink policy.
      
      Cc: stable@vger.kernel.org
      Fixes: ed1b6cc7 ("cfg80211/nl80211: add beacon settings")
      Link: https://lore.kernel.org/r/1569009255-I7ac7fbe9436e9d8733439eab8acbbd35e55c74ef@changeidSigned-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1bd17a73
    • J
      cfg80211: add and use strongly typed element iteration macros · ad180cac
      Johannes Berg 提交于
      commit 0f3b07f027f87a38ebe5c436490095df762819be upstream.
      
      Rather than always iterating elements from frames with pure
      u8 pointers, add a type "struct element" that encapsulates
      the id/datalen/data format of them.
      
      Then, add the element iteration macros
       * for_each_element
       * for_each_element_id
       * for_each_element_extid
      
      which take, as their first 'argument', such a structure and
      iterate through a given u8 array interpreting it as elements.
      
      While at it and since we'll need it, also add
       * for_each_subelement
       * for_each_subelement_id
       * for_each_subelement_extid
      
      which instead of taking data/length just take an outer element
      and use its data/datalen.
      
      Also add for_each_element_completed() to determine if any of
      the loops above completed, i.e. it was able to parse all of
      the elements successfully and no data remained.
      
      Use for_each_element_id() in cfg80211_find_ie_match() as the
      first user of this.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad180cac
    • F
      netfilter: nf_tables: allow lookups in dynamic sets · f7ace7f2
      Florian Westphal 提交于
      [ Upstream commit acab713177377d9e0889c46bac7ff0cfb9a90c4d ]
      
      This un-breaks lookups in sets that have the 'dynamic' flag set.
      Given this active example configuration:
      
      table filter {
        set set1 {
          type ipv4_addr
          size 64
          flags dynamic,timeout
          timeout 1m
        }
      
        chain input {
           type filter hook input priority 0; policy accept;
        }
      }
      
      ... this works:
      nft add rule ip filter input add @set1 { ip saddr }
      
      -> whenever rule is triggered, the source ip address is inserted
      into the set (if it did not exist).
      
      This won't work:
      nft add rule ip filter input ip saddr @set1 counter
      Error: Could not process rule: Operation not supported
      
      In other words, we can add entries to the set, but then can't make
      matching decision based on that set.
      
      That is just wrong -- all set backends support lookups (else they would
      not be very useful).
      The failure comes from an explicit rejection in nft_lookup.c.
      
      Looking at the history, it seems like NFT_SET_EVAL used to mean
      'set contains expressions' (aka. "is a meter"), for instance something like
      
       nft add rule ip filter input meter example { ip saddr limit rate 10/second }
       or
       nft add rule ip filter input meter example { ip saddr counter }
      
      The actual meaning of NFT_SET_EVAL however, is
      'set can be updated from the packet path'.
      
      'meters' and packet-path insertions into sets, such as
      'add @set { ip saddr }' use exactly the same kernel code (nft_dynset.c)
      and thus require a set backend that provides the ->update() function.
      
      The only set that provides this also is the only one that has the
      NFT_SET_EVAL feature flag.
      
      Removing the wrong check makes the above example work.
      While at it, also fix the flag check during set instantiation to
      allow supported combinations only.
      
      Fixes: 8aeff920 ("netfilter: nf_tables: add stateful object reference to set elements")
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      f7ace7f2
    • L
      9p: Transport error uninitialized · 07f3596c
      Lu Shuaibing 提交于
      [ Upstream commit 0ce772fe79b68f83df40f07f28207b292785c677 ]
      
      The p9_tag_alloc() does not initialize the transport error t_err field.
      The struct p9_req_t *req is allocated and stored in a struct p9_client
      variable. The field t_err is never initialized before p9_conn_cancel()
      checks its value.
      
      KUMSAN(KernelUninitializedMemorySantizer, a new error detection tool)
      reports this bug.
      
      ==================================================================
      BUG: KUMSAN: use of uninitialized memory in p9_conn_cancel+0x2d9/0x3b0
      Read of size 4 at addr ffff88805f9b600c by task kworker/1:2/1216
      
      CPU: 1 PID: 1216 Comm: kworker/1:2 Not tainted 5.2.0-rc4+ #28
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      Workqueue: events p9_write_work
      Call Trace:
       dump_stack+0x75/0xae
       __kumsan_report+0x17c/0x3e6
       kumsan_report+0xe/0x20
       p9_conn_cancel+0x2d9/0x3b0
       p9_write_work+0x183/0x4a0
       process_one_work+0x4d1/0x8c0
       worker_thread+0x6e/0x780
       kthread+0x1ca/0x1f0
       ret_from_fork+0x35/0x40
      
      Allocated by task 1979:
       save_stack+0x19/0x80
       __kumsan_kmalloc.constprop.3+0xbc/0x120
       kmem_cache_alloc+0xa7/0x170
       p9_client_prepare_req.part.9+0x3b/0x380
       p9_client_rpc+0x15e/0x880
       p9_client_create+0x3d0/0xac0
       v9fs_session_init+0x192/0xc80
       v9fs_mount+0x67/0x470
       legacy_get_tree+0x70/0xd0
       vfs_get_tree+0x4a/0x1c0
       do_mount+0xba9/0xf90
       ksys_mount+0xa8/0x120
       __x64_sys_mount+0x62/0x70
       do_syscall_64+0x6d/0x1e0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Freed by task 0:
      (stack is not available)
      
      The buggy address belongs to the object at ffff88805f9b6008
       which belongs to the cache p9_req_t of size 144
      The buggy address is located 4 bytes inside of
       144-byte region [ffff88805f9b6008, ffff88805f9b6098)
      The buggy address belongs to the page:
      page:ffffea00017e6d80 refcount:1 mapcount:0 mapping:ffff888068b63740 index:0xffff88805f9b7d90 compound_mapcount: 0
      flags: 0x100000000010200(slab|head)
      raw: 0100000000010200 ffff888068b66450 ffff888068b66450 ffff888068b63740
      raw: ffff88805f9b7d90 0000000000100001 00000001ffffffff 0000000000000000
      page dumped because: kumsan: bad access detected
      ==================================================================
      
      Link: http://lkml.kernel.org/r/20190613070854.10434-1-shuaibinglu@126.comSigned-off-by: NLu Shuaibing <shuaibinglu@126.com>
      [dominique.martinet@cea.fr: grouped the added init with the others]
      Signed-off-by: NDominique Martinet <dominique.martinet@cea.fr>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      07f3596c
    • J
      cfg80211: initialize on-stack chandefs · 3a0e6733
      Johannes Berg 提交于
      commit f43e5210c739fe76a4b0ed851559d6902f20ceb1 upstream.
      
      In a few places we don't properly initialize on-stack chandefs,
      resulting in EDMG data to be non-zero, which broke things.
      
      Additionally, in a few places we rely on the driver to init the
      data completely, but perhaps we shouldn't as non-EDMG drivers
      may not initialize the EDMG data, also initialize it there.
      
      Cc: stable@vger.kernel.org
      Fixes: 2a38075cd0be ("nl80211: Add support for EDMG channels")
      Reported-by: NDmitry Osipenko <digetx@gmail.com>
      Tested-by: NDmitry Osipenko <digetx@gmail.com>
      Link: https://lore.kernel.org/r/1569239475-I2dcce394ecf873376c386a78f31c2ec8b538fa25@changeidSigned-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3a0e6733
  2. 08 10月, 2019 14 次提交
    • A
      NFC: fix attrs checks in netlink interface · c8a65ec0
      Andrey Konovalov 提交于
      commit 18917d51472fe3b126a3a8f756c6b18085eb8130 upstream.
      
      nfc_genl_deactivate_target() relies on the NFC_ATTR_TARGET_INDEX
      attribute being present, but doesn't check whether it is actually
      provided by the user. Same goes for nfc_genl_fw_download() and
      NFC_ATTR_FIRMWARE_NAME.
      
      This patch adds appropriate checks.
      
      Found with syzkaller.
      Signed-off-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c8a65ec0
    • E
      sch_cbq: validate TCA_CBQ_WRROPT to avoid crash · 74e2a311
      Eric Dumazet 提交于
      [ Upstream commit e9789c7cc182484fc031fd88097eb14cb26c4596 ]
      
      syzbot reported a crash in cbq_normalize_quanta() caused
      by an out of range cl->priority.
      
      iproute2 enforces this check, but malicious users do not.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN PTI
      Modules linked in:
      CPU: 1 PID: 26447 Comm: syz-executor.1 Not tainted 5.3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:cbq_normalize_quanta.part.0+0x1fd/0x430 net/sched/sch_cbq.c:902
      RSP: 0018:ffff8801a5c333b0 EFLAGS: 00010206
      RAX: 0000000020000003 RBX: 00000000fffffff8 RCX: ffffc9000712f000
      RDX: 00000000000043bf RSI: ffffffff83be8962 RDI: 0000000100000018
      RBP: ffff8801a5c33420 R08: 000000000000003a R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000002ef
      R13: ffff88018da95188 R14: dffffc0000000000 R15: 0000000000000015
      FS:  00007f37d26b1700(0000) GS:ffff8801dad00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004c7cec CR3: 00000001bcd0a006 CR4: 00000000001626f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       [<ffffffff83be9d57>] cbq_normalize_quanta include/net/pkt_sched.h:27 [inline]
       [<ffffffff83be9d57>] cbq_addprio net/sched/sch_cbq.c:1097 [inline]
       [<ffffffff83be9d57>] cbq_set_wrr+0x2d7/0x450 net/sched/sch_cbq.c:1115
       [<ffffffff83bee8a7>] cbq_change_class+0x987/0x225b net/sched/sch_cbq.c:1537
       [<ffffffff83b96985>] tc_ctl_tclass+0x555/0xcd0 net/sched/sch_api.c:2329
       [<ffffffff83a84655>] rtnetlink_rcv_msg+0x485/0xc10 net/core/rtnetlink.c:5248
       [<ffffffff83cadf0a>] netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2510
       [<ffffffff83a7db6d>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5266
       [<ffffffff83cac2c6>] netlink_unicast_kernel net/netlink/af_netlink.c:1324 [inline]
       [<ffffffff83cac2c6>] netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1350
       [<ffffffff83cacd4a>] netlink_sendmsg+0x89a/0xd50 net/netlink/af_netlink.c:1939
       [<ffffffff8399d46e>] sock_sendmsg_nosec net/socket.c:673 [inline]
       [<ffffffff8399d46e>] sock_sendmsg+0x12e/0x170 net/socket.c:684
       [<ffffffff8399f1fd>] ___sys_sendmsg+0x81d/0x960 net/socket.c:2359
       [<ffffffff839a2d05>] __sys_sendmsg+0x105/0x1d0 net/socket.c:2397
       [<ffffffff839a2df9>] SYSC_sendmsg net/socket.c:2406 [inline]
       [<ffffffff839a2df9>] SyS_sendmsg+0x29/0x30 net/socket.c:2404
       [<ffffffff8101ccc8>] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305
       [<ffffffff84400091>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74e2a311
    • T
      tipc: fix unlimited bundling of small messages · ed9420dd
      Tuong Lien 提交于
      [ Upstream commit e95584a889e1902fdf1ded9712e2c3c3083baf96 ]
      
      We have identified a problem with the "oversubscription" policy in the
      link transmission code.
      
      When small messages are transmitted, and the sending link has reached
      the transmit window limit, those messages will be bundled and put into
      the link backlog queue. However, bundles of data messages are counted
      at the 'CRITICAL' level, so that the counter for that level, instead of
      the counter for the real, bundled message's level is the one being
      increased.
      Subsequent, to-be-bundled data messages at non-CRITICAL levels continue
      to be tested against the unchanged counter for their own level, while
      contributing to an unrestrained increase at the CRITICAL backlog level.
      
      This leaves a gap in congestion control algorithm for small messages
      that can result in starvation for other users or a "real" CRITICAL
      user. Even that eventually can lead to buffer exhaustion & link reset.
      
      We fix this by keeping a 'target_bskb' buffer pointer at each levels,
      then when bundling, we only bundle messages at the same importance
      level only. This way, we know exactly how many slots a certain level
      have occupied in the queue, so can manage level congestion accurately.
      
      By bundling messages at the same level, we even have more benefits. Let
      consider this:
      - One socket sends 64-byte messages at the 'CRITICAL' level;
      - Another sends 4096-byte messages at the 'LOW' level;
      
      When a 64-byte message comes and is bundled the first time, we put the
      overhead of message bundle to it (+ 40-byte header, data copy, etc.)
      for later use, but the next message can be a 4096-byte one that cannot
      be bundled to the previous one. This means the last bundle carries only
      one payload message which is totally inefficient, as for the receiver
      also! Later on, another 64-byte message comes, now we make a new bundle
      and the same story repeats...
      
      With the new bundling algorithm, this will not happen, the 64-byte
      messages will be bundled together even when the 4096-byte message(s)
      comes in between. However, if the 4096-byte messages are sent at the
      same level i.e. 'CRITICAL', the bundling algorithm will again cause the
      same overhead.
      
      Also, the same will happen even with only one socket sending small
      messages at a rate close to the link transmit's one, so that, when one
      message is bundled, it's transmitted shortly. Then, another message
      comes, a new bundle is created and so on...
      
      We will solve this issue radically by another patch.
      
      Fixes: 365ad353 ("tipc: reduce risk of user starvation during link congestion")
      Reported-by: NHoang Le <hoang.h.le@dektech.com.au>
      Acked-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NTuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed9420dd
    • D
      net/rds: Fix error handling in rds_ib_add_one() · 36a4043c
      Dotan Barak 提交于
      [ Upstream commit d64bf89a75b65f83f06be9fb8f978e60d53752db ]
      
      rds_ibdev:ipaddr_list and rds_ibdev:conn_list are initialized
      after allocation some resources such as protection domain.
      If allocation of such resources fail, then these uninitialized
      variables are accessed in rds_ib_dev_free() in failure path. This
      can potentially crash the system. The code has been updated to
      initialize these variables very early in the function.
      Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il>
      Signed-off-by: NSudhakar Dindukurti <sudhakar.dindukurti@oracle.com>
      Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      36a4043c
    • J
      udp: only do GSO if # of segs > 1 · 012363f5
      Josh Hunt 提交于
      [ Upstream commit 4094871db1d65810acab3d57f6089aa39ef7f648 ]
      
      Prior to this change an application sending <= 1MSS worth of data and
      enabling UDP GSO would fail if the system had SW GSO enabled, but the
      same send would succeed if HW GSO offload is enabled. In addition to this
      inconsistency the error in the SW GSO case does not get back to the
      application if sending out of a real device so the user is unaware of this
      failure.
      
      With this change we only perform GSO if the # of segments is > 1 even
      if the application has enabled segmentation. I've also updated the
      relevant udpgso selftests.
      
      Fixes: bec1f6f6 ("udp: generate gso with UDP_SEGMENT")
      Signed-off-by: NJosh Hunt <johunt@akamai.com>
      Reviewed-by: NWillem de Bruijn <willemb@google.com>
      Reviewed-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      012363f5
    • D
      vsock: Fix a lockdep warning in __vsock_release() · 3c1f0704
      Dexuan Cui 提交于
      [ Upstream commit 0d9138ffac24cf8b75366ede3a68c951e6dcc575 ]
      
      Lockdep is unhappy if two locks from the same class are held.
      
      Fix the below warning for hyperv and virtio sockets (vmci socket code
      doesn't have the issue) by using lock_sock_nested() when __vsock_release()
      is called recursively:
      
      ============================================
      WARNING: possible recursive locking detected
      5.3.0+ #1 Not tainted
      --------------------------------------------
      server/1795 is trying to acquire lock:
      ffff8880c5158990 (sk_lock-AF_VSOCK){+.+.}, at: hvs_release+0x10/0x120 [hv_sock]
      
      but task is already holding lock:
      ffff8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(sk_lock-AF_VSOCK);
        lock(sk_lock-AF_VSOCK);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by server/1795:
       #0: ffff8880c5d05ff8 (&sb->s_type->i_mutex_key#10){+.+.}, at: __sock_release+0x2d/0xa0
       #1: ffff8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]
      
      stack backtrace:
      CPU: 5 PID: 1795 Comm: server Not tainted 5.3.0+ #1
      Call Trace:
       dump_stack+0x67/0x90
       __lock_acquire.cold.67+0xd2/0x20b
       lock_acquire+0xb5/0x1c0
       lock_sock_nested+0x6d/0x90
       hvs_release+0x10/0x120 [hv_sock]
       __vsock_release+0x24/0xf0 [vsock]
       __vsock_release+0xa0/0xf0 [vsock]
       vsock_release+0x12/0x30 [vsock]
       __sock_release+0x37/0xa0
       sock_close+0x14/0x20
       __fput+0xc1/0x250
       task_work_run+0x98/0xc0
       do_exit+0x344/0xc60
       do_group_exit+0x47/0xb0
       get_signal+0x15c/0xc50
       do_signal+0x30/0x720
       exit_to_usermode_loop+0x50/0xa0
       do_syscall_64+0x24e/0x270
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x7f4184e85f31
      Tested-by: NStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: NDexuan Cui <decui@microsoft.com>
      Reviewed-by: NStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3c1f0704
    • J
      udp: fix gso_segs calculations · 544aee54
      Josh Hunt 提交于
      [ Upstream commit 44b321e5020d782ad6e8ae8183f09b163be6e6e2 ]
      
      Commit dfec0ee2 ("udp: Record gso_segs when supporting UDP segmentation offload")
      added gso_segs calculation, but incorrectly got sizeof() the pointer and
      not the underlying data type. In addition let's fix the v6 case.
      
      Fixes: bec1f6f6 ("udp: generate gso with UDP_SEGMENT")
      Fixes: dfec0ee2 ("udp: Record gso_segs when supporting UDP segmentation offload")
      Signed-off-by: NJosh Hunt <johunt@akamai.com>
      Reviewed-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      544aee54
    • E
      sch_dsmark: fix potential NULL deref in dsmark_init() · 79fd59ae
      Eric Dumazet 提交于
      [ Upstream commit 474f0813a3002cb299bb73a5a93aa1f537a80ca8 ]
      
      Make sure TCA_DSMARK_INDICES was provided by the user.
      
      syzbot reported :
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 8799 Comm: syz-executor235 Not tainted 5.3.0+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:nla_get_u16 include/net/netlink.h:1501 [inline]
      RIP: 0010:dsmark_init net/sched/sch_dsmark.c:364 [inline]
      RIP: 0010:dsmark_init+0x193/0x640 net/sched/sch_dsmark.c:339
      Code: 85 db 58 0f 88 7d 03 00 00 e8 e9 1a ac fb 48 8b 9d 70 ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 ca
      RSP: 0018:ffff88809426f3b8 EFLAGS: 00010247
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff85c6eb09
      RDX: 0000000000000000 RSI: ffffffff85c6eb17 RDI: 0000000000000004
      RBP: ffff88809426f4b0 R08: ffff88808c4085c0 R09: ffffed1015d26159
      R10: ffffed1015d26158 R11: ffff8880ae930ac7 R12: ffff8880a7e96940
      R13: dffffc0000000000 R14: ffff88809426f8c0 R15: 0000000000000000
      FS:  0000000001292880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000080 CR3: 000000008ca1b000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       qdisc_create+0x4ee/0x1210 net/sched/sch_api.c:1237
       tc_modify_qdisc+0x524/0x1c50 net/sched/sch_api.c:1653
       rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5223
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5241
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:657
       ___sys_sendmsg+0x803/0x920 net/socket.c:2311
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2356
       __do_sys_sendmsg net/socket.c:2365 [inline]
       __se_sys_sendmsg net/socket.c:2363 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
       do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x440369
      
      Fixes: 758cc43c ("[PKT_SCHED]: Fix dsmark to apply changes consistent")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      79fd59ae
    • E
      nfc: fix memory leak in llcp_sock_bind() · dd9c580a
      Eric Dumazet 提交于
      [ Upstream commit a0c2dc1fe63e2869b74c1c7f6a81d1745c8a695d ]
      
      sysbot reported a memory leak after a bind() has failed.
      
      While we are at it, abort the operation if kmemdup() has failed.
      
      BUG: memory leak
      unreferenced object 0xffff888105d83ec0 (size 32):
        comm "syz-executor067", pid 7207, jiffies 4294956228 (age 19.430s)
        hex dump (first 32 bytes):
          00 69 6c 65 20 72 65 61 64 00 6e 65 74 3a 5b 34  .ile read.net:[4
          30 32 36 35 33 33 30 39 37 5d 00 00 00 00 00 00  026533097]......
        backtrace:
          [<0000000036bac473>] kmemleak_alloc_recursive /./include/linux/kmemleak.h:43 [inline]
          [<0000000036bac473>] slab_post_alloc_hook /mm/slab.h:522 [inline]
          [<0000000036bac473>] slab_alloc /mm/slab.c:3319 [inline]
          [<0000000036bac473>] __do_kmalloc /mm/slab.c:3653 [inline]
          [<0000000036bac473>] __kmalloc_track_caller+0x169/0x2d0 /mm/slab.c:3670
          [<000000000cd39d07>] kmemdup+0x27/0x60 /mm/util.c:120
          [<000000008e57e5fc>] kmemdup /./include/linux/string.h:432 [inline]
          [<000000008e57e5fc>] llcp_sock_bind+0x1b3/0x230 /net/nfc/llcp_sock.c:107
          [<000000009cb0b5d3>] __sys_bind+0x11c/0x140 /net/socket.c:1647
          [<00000000492c3bbc>] __do_sys_bind /net/socket.c:1658 [inline]
          [<00000000492c3bbc>] __se_sys_bind /net/socket.c:1656 [inline]
          [<00000000492c3bbc>] __x64_sys_bind+0x1e/0x30 /net/socket.c:1656
          [<0000000008704b2a>] do_syscall_64+0x76/0x1a0 /arch/x86/entry/common.c:296
          [<000000009f4c57a4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 30cc4587 ("NFC: Move LLCP code to the NFC top level diirectory")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dd9c580a
    • M
      net: Unpublish sk from sk_reuseport_cb before call_rcu · d5b1db1c
      Martin KaFai Lau 提交于
      [ Upstream commit 8c7138b33e5c690c308b2a7085f6313fdcb3f616 ]
      
      The "reuse->sock[]" array is shared by multiple sockets.  The going away
      sk must unpublish itself from "reuse->sock[]" before making call_rcu()
      call.  However, this unpublish-action is currently done after a grace
      period and it may cause use-after-free.
      
      The fix is to move reuseport_detach_sock() to sk_destruct().
      Due to the above reason, any socket with sk_reuseport_cb has
      to go through the rcu grace period before freeing it.
      
      It is a rather old bug (~3 yrs).  The Fixes tag is not necessary
      the right commit but it is the one that introduced the SOCK_RCU_FREE
      logic and this fix is depending on it.
      
      Fixes: a4298e45 ("net: add SOCK_RCU_FREE socket flag")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5b1db1c
    • P
      net: ipv4: avoid mixed n_redirects and rate_tokens usage · 124b64fe
      Paolo Abeni 提交于
      [ Upstream commit b406472b5ad79ede8d10077f0c8f05505ace8b6d ]
      
      Since commit c09551c6ff7f ("net: ipv4: use a dedicated counter
      for icmp_v4 redirect packets") we use 'n_redirects' to account
      for redirect packets, but we still use 'rate_tokens' to compute
      the redirect packets exponential backoff.
      
      If the device sent to the relevant peer any ICMP error packet
      after sending a redirect, it will also update 'rate_token' according
      to the leaking bucket schema; typically 'rate_token' will raise
      above BITS_PER_LONG and the redirect packets backoff algorithm
      will produce undefined behavior.
      
      Fix the issue using 'n_redirects' to compute the exponential backoff
      in ip_rt_send_redirect().
      
      Note that we still clear rate_tokens after a redirect silence period,
      to avoid changing an established behaviour.
      
      The root cause predates git history; before the mentioned commit in
      the critical scenario, the kernel stopped sending redirects, after
      the mentioned commit the behavior more randomic.
      Reported-by: NXiumei Mu <xmu@redhat.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Fixes: c09551c6ff7f ("net: ipv4: use a dedicated counter for icmp_v4 redirect packets")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      124b64fe
    • D
      ipv6: Handle missing host route in __ipv6_ifa_notify · 6f8564ed
      David Ahern 提交于
      [ Upstream commit 2d819d250a1393a3e725715425ab70a0e0772a71 ]
      
      Rajendra reported a kernel panic when a link was taken down:
      
          [ 6870.263084] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
          [ 6870.271856] IP: [<ffffffff8efc5764>] __ipv6_ifa_notify+0x154/0x290
      
          <snip>
      
          [ 6870.570501] Call Trace:
          [ 6870.573238] [<ffffffff8efc58c6>] ? ipv6_ifa_notify+0x26/0x40
          [ 6870.579665] [<ffffffff8efc98ec>] ? addrconf_dad_completed+0x4c/0x2c0
          [ 6870.586869] [<ffffffff8efe70c6>] ? ipv6_dev_mc_inc+0x196/0x260
          [ 6870.593491] [<ffffffff8efc9c6a>] ? addrconf_dad_work+0x10a/0x430
          [ 6870.600305] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
          [ 6870.606732] [<ffffffff8ea93a7a>] ? process_one_work+0x18a/0x430
          [ 6870.613449] [<ffffffff8ea93d6d>] ? worker_thread+0x4d/0x490
          [ 6870.619778] [<ffffffff8ea93d20>] ? process_one_work+0x430/0x430
          [ 6870.626495] [<ffffffff8ea99dd9>] ? kthread+0xd9/0xf0
          [ 6870.632145] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
          [ 6870.638573] [<ffffffff8ea99d00>] ? kthread_park+0x60/0x60
          [ 6870.644707] [<ffffffff8f01ae77>] ? ret_from_fork+0x57/0x70
          [ 6870.650936] Code: 31 c0 31 d2 41 b9 20 00 08 02 b9 09 00 00 0
      
      addrconf_dad_work is kicked to be scheduled when a device is brought
      up. There is a race between addrcond_dad_work getting scheduled and
      taking the rtnl lock and a process taking the link down (under rtnl).
      The latter removes the host route from the inet6_addr as part of
      addrconf_ifdown which is run for NETDEV_DOWN. The former attempts
      to use the host route in __ipv6_ifa_notify. If the down event removes
      the host route due to the race to the rtnl, then the BUG listed above
      occurs.
      
      Since the DAD sequence can not be aborted, add a check for the missing
      host route in __ipv6_ifa_notify. The only way this should happen is due
      to the previously mentioned race. The host route is created when the
      address is added to an interface; it is only removed on a down event
      where the address is kept. Add a warning if the host route is missing
      AND the device is up; this is a situation that should never happen.
      
      Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional")
      Reported-by: NRajendra Dendukuri <rajendra.dendukuri@broadcom.com>
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f8564ed
    • E
      ipv6: drop incoming packets having a v4mapped source address · 658d7ee4
      Eric Dumazet 提交于
      [ Upstream commit 6af1799aaf3f1bc8defedddfa00df3192445bbf3 ]
      
      This began with a syzbot report. syzkaller was injecting
      IPv6 TCP SYN packets having a v4mapped source address.
      
      After an unsuccessful 4-tuple lookup, TCP creates a request
      socket (SYN_RECV) and calls reqsk_queue_hash_req()
      
      reqsk_queue_hash_req() calls sk_ehashfn(sk)
      
      At this point we have AF_INET6 sockets, and the heuristic
      used by sk_ehashfn() to either hash the IPv4 or IPv6 addresses
      is to use ipv6_addr_v4mapped(&sk->sk_v6_daddr)
      
      For the particular spoofed packet, we end up hashing V4 addresses
      which were not initialized by the TCP IPv6 stack, so KMSAN fired
      a warning.
      
      I first fixed sk_ehashfn() to test both source and destination addresses,
      but then faced various problems, including user-space programs
      like packetdrill that had similar assumptions.
      
      Instead of trying to fix the whole ecosystem, it is better
      to admit that we have a dual stack behavior, and that we
      can not build linux kernels without V4 stack anyway.
      
      The dual stack API automatically forces the traffic to be IPv4
      if v4mapped addresses are used at bind() or connect(), so it makes
      no sense to allow IPv6 traffic to use the same v4mapped class.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      658d7ee4
    • H
      erspan: remove the incorrect mtu limit for erspan · 7f30c44b
      Haishuang Yan 提交于
      [ Upstream commit 0e141f757b2c78c983df893e9993313e2dc21e38 ]
      
      erspan driver calls ether_setup(), after commit 61e84623
      ("net: centralize net_device min/max MTU checking"), the range
      of mtu is [min_mtu, max_mtu], which is [68, 1500] by default.
      
      It causes the dev mtu of the erspan device to not be greater
      than 1500, this limit value is not correct for ipgre tap device.
      
      Tested:
      Before patch:
      # ip link set erspan0 mtu 1600
      Error: mtu greater than device maximum.
      After patch:
      # ip link set erspan0 mtu 1600
      # ip -d link show erspan0
      21: erspan0@NONE: <BROADCAST,MULTICAST> mtu 1600 qdisc noop state DOWN
      mode DEFAULT group default qlen 1000
          link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 0
      
      Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
      Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7f30c44b
  3. 05 10月, 2019 12 次提交
  4. 01 10月, 2019 6 次提交
  5. 21 9月, 2019 3 次提交
    • T
      netfilter: nf_conntrack_ftp: Fix debug output · 6075729f
      Thomas Jarosch 提交于
      [ Upstream commit 3a069024d371125227de3ac8fa74223fcf473520 ]
      
      The find_pattern() debug output was printing the 'skip' character.
      This can be a NULL-byte and messes up further pr_debug() output.
      
      Output without the fix:
      kernel: nf_conntrack_ftp: Pattern matches!
      kernel: nf_conntrack_ftp: Skipped up to `<7>nf_conntrack_ftp: find_pattern `PORT': dlen = 8
      kernel: nf_conntrack_ftp: find_pattern `EPRT': dlen = 8
      
      Output with the fix:
      kernel: nf_conntrack_ftp: Pattern matches!
      kernel: nf_conntrack_ftp: Skipped up to 0x0 delimiter!
      kernel: nf_conntrack_ftp: Match succeeded!
      kernel: nf_conntrack_ftp: conntrack_ftp: match `172,17,0,100,200,207' (20 bytes at 4150681645)
      kernel: nf_conntrack_ftp: find_pattern `PORT': dlen = 8
      Signed-off-by: NThomas Jarosch <thomas.jarosch@intra2net.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      6075729f
    • T
      netfilter: xt_physdev: Fix spurious error message in physdev_mt_check · 7ac5947f
      Todd Seidelmann 提交于
      [ Upstream commit 3cf2f450fff304be9cf4868bf0df17f253bc5b1c ]
      
      Simplify the check in physdev_mt_check() to emit an error message
      only when passed an invalid chain (ie, NF_INET_LOCAL_OUT).
      This avoids cluttering up the log with errors against valid rules.
      
      For large/heavily modified rulesets, current behavior can quickly
      overwhelm the ring buffer, because this function gets called on
      every change, regardless of the rule that was changed.
      Signed-off-by: NTodd Seidelmann <tseidelmann@linode.com>
      Acked-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      7ac5947f
    • I
      bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0 · d9f79f0a
      Ilya Leoshkevich 提交于
      [ Upstream commit 2c238177bd7f4b14bdf7447cc1cd9bb791f147e6 ]
      
      test_select_reuseport fails on s390 due to verifier rejecting
      test_select_reuseport_kern.o with the following message:
      
      	; data_check.eth_protocol = reuse_md->eth_protocol;
      	18: (69) r1 = *(u16 *)(r6 +22)
      	invalid bpf_context access off=22 size=2
      
      This is because on big-endian machines casts from __u32 to __u16 are
      generated by referencing the respective variable as __u16 with an offset
      of 2 (as opposed to 0 on little-endian machines).
      
      The verifier already has all the infrastructure in place to allow such
      accesses, it's just that they are not explicitly enabled for
      eth_protocol field. Enable them for eth_protocol field by using
      bpf_ctx_range instead of offsetof.
      
      Ditto for ip_protocol, bind_inany and len, since they already allow
      narrowing, and the same problem can arise when working with them.
      
      Fixes: 2dbb9b9e ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT")
      Signed-off-by: NIlya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      d9f79f0a