1. 30 Jan, 2019 15 commits
  2. 29 Jan, 2019 25 commits
    • liquidio: fix the validation of rx checksum status from NIC hardware · ac93e2fa
      Veerasenareddy Burru committed
      Fix the code that misinterpreted the rx checksum validation status
      reported by hardware. It told the kernel that a packet had arrived with
      a correct checksum even when the checksum was wrong and the hardware
      had flagged it as incorrect.
      Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
      Acked-by: Derek Chickles <dchickles@marvell.com>
      Signed-off-by: Felix Manlunas <fmanlunas@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles · b3379a42
      Yang Wei committed
      dev_consume_skb_irq() should be called in cpmac_end_xmit() when
      xmit is done. This keeps successfully transmitted packets out of drop profiles.
      Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
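The distinction behind this and the other dev_consume_skb_irq() conversions in this log can be sketched in user space. Both kernel helpers free the skb; what differs is the reason recorded for the free, and drop profiling only wants to see real drops. The model below is an illustration under that assumption: every name in it (model_free_skb, model_tx_done, struct free_stats) is hypothetical and none of it is driver code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum free_reason { REASON_CONSUMED, REASON_DROPPED };

struct free_stats {
    unsigned long consumed; /* buffers freed after a successful transmit */
    unsigned long dropped;  /* buffers freed because they were discarded */
};

/* Free a buffer and record why, so a drop profiler only sees real drops. */
static void model_free_skb(void *skb, enum free_reason reason,
                           struct free_stats *stats)
{
    assert(stats != NULL);
    if (reason == REASON_CONSUMED)
        stats->consumed++;
    else
        stats->dropped++;
    free(skb);
}

/* TX-completion path: the packet went out, so it is consumed, not dropped. */
static void model_tx_done(void *skb, struct free_stats *stats)
{
    model_free_skb(skb, REASON_CONSUMED, stats);
}
```

Calling model_tx_done() for every completed transmit leaves the dropped counter untouched, which is the effect these patches aim for in drop profiles.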
    • net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles · 10009115
      Yang Wei committed
      dev_consume_skb_irq() should be called in bmac_txdma_intr() when
      xmit is done. This keeps successfully transmitted packets out of drop profiles.
      Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: amd8111e: replace dev_kfree_skb_irq by dev_consume_skb_irq · 3afa73dd
      AlbinYang committed
      dev_consume_skb_irq() should be called in amd8111e_tx() when xmit
      is done. This keeps successfully transmitted packets out of drop profiles.
      Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: alteon: replace dev_kfree_skb_irq by dev_consume_skb_irq · f48af114
      AlbinYang committed
      dev_consume_skb_irq() should be called in ace_tx_int() when xmit
      is done. This keeps successfully transmitted packets out of drop profiles.
      Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tls: Fix deadlock in free_resources tx · 10231213
      Dave Watson committed
      If there are outstanding async tx requests (when crypto returns EINPROGRESS),
      there is a potential deadlock: the tx work acquires the lock, while we
      call cancel_delayed_work_sync() with the lock held.  Drop the lock while
      waiting for the work to complete.
      
      Fixes: a42055e8 ("Add support for async encryption of records...")
      Signed-off-by: Dave Watson <davejwatson@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tls: Save iv in tls_rec for async crypto requests · 32eb67b9
      Dave Watson committed
      aead_request_set_crypt takes an iv pointer, and we change the iv
      soon after setting it.  Some async crypto algorithms don't save the iv,
      so we need to save it in the tls_rec for async requests.
      
      Found by hardcoding x64 aesni to use async crypto manager (to test the async
      codepath); however, I don't think this combination can happen in the wild.
      Presumably other hardware offloads will need this fix, but there have been
      no user reports.
      
      Fixes: a42055e8 ("Add support for async encryption of records...")
      Signed-off-by: Dave Watson <davejwatson@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
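A minimal user-space sketch of the idea in the commit above: an async request keeps only the iv pointer it was handed, so the pointer has to refer to per-record storage rather than to a buffer that is about to change. The structures, the 8-byte size, and the function name are hypothetical and chosen for illustration; the real tls_rec and crypto request types are not reproduced here.

```c
#include <assert.h>
#include <string.h>

#define MODEL_IV_SIZE 8 /* hypothetical size, for illustration only */

/* Hypothetical per-record state; the real tls_rec has many more fields. */
struct model_rec {
    unsigned char iv_data[MODEL_IV_SIZE];
};

/* An async request keeps only the pointer it was given. */
struct model_req {
    const unsigned char *iv;
};

/* Copy the connection IV into the record and point the request at the copy. */
static void model_queue_async(struct model_req *req, struct model_rec *rec,
                              const unsigned char *conn_iv)
{
    memcpy(rec->iv_data, conn_iv, MODEL_IV_SIZE);
    req->iv = rec->iv_data;
}
```

After model_queue_async() returns, the caller can advance the connection IV for the next record without disturbing the value the pending request will read.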
    • vhost: fix OOB in get_rx_bufs() · b46a0bf7
      Jason Wang committed
      After batched used ring updating was introduced in commit e2b3b35e
      ("vhost_net: batch used ring update in rx"), we tend to batch heads in
      vq->heads for more than one packet. But the quota passed to
      get_rx_bufs() was not correctly limited, which can result in an OOB
      write in vq->heads.
      
              headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx,
                          vhost_len, &in, vq_log, &log,
                          likely(mergeable) ? UIO_MAXIOV : 1);
      
      UIO_MAXIOV was still used, which is wrong since heads may already be
      batched in vq->heads. This causes an OOB write if the next buffer needs
      more than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after
      we've batched 64 (VHOST_NET_BATCH) heads:
      Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
      
      =============================================================================
      BUG kmalloc-8k (Tainted: G    B            ): Redzone overwritten
      -----------------------------------------------------------------------------
      
      INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc
      INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674
          kmem_cache_alloc_trace+0xbb/0x140
          alloc_pd+0x22/0x60
          gen8_ppgtt_create+0x11d/0x5f0
          i915_ppgtt_create+0x16/0x80
          i915_gem_create_context+0x248/0x390
          i915_gem_context_create_ioctl+0x4b/0xe0
          drm_ioctl_kernel+0xa5/0xf0
          drm_ioctl+0x2ed/0x3a0
          do_vfs_ioctl+0x9f/0x620
          ksys_ioctl+0x6b/0x80
          __x64_sys_ioctl+0x11/0x20
          do_syscall_64+0x43/0xf0
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x          (null) flags=0x200000000010201
      INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b
      
      Fix this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for
      vhost-net. This is done by setting the limit through
      vhost_dev_init(), so that set_owner can allocate the number of iovs
      on a per-device basis.
      
      This fixes CVE-2018-16880.
      
      Fixes: e2b3b35e ("vhost_net: batch used ring update in rx")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
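The arithmetic in the vhost commit above can be checked with a small bounds helper. This is a sketch, not vhost code: heads_write_fits() is a hypothetical name, and the two constants simply reuse the values quoted in the commit message (1024 and 64).

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in the commit message above. */
#define UIO_MAXIOV      1024
#define VHOST_NET_BATCH 64

/*
 * Hypothetical helper, not part of vhost: returns 1 if writing `quota`
 * heads starting at index `done_idx` stays inside an array of
 * `capacity` entries, and 0 if it would run past the end.
 */
static int heads_write_fits(size_t capacity, size_t done_idx, size_t quota)
{
    assert(done_idx <= capacity);
    return quota <= capacity - done_idx;
}
```

With a 1024-entry array and 64 heads already batched, a quota of UIO_MAXIOV does not fit, while 960 is the largest quota that does. Growing the array to UIO_MAXIOV + VHOST_NET_BATCH entries makes the unchanged quota of UIO_MAXIOV safe, which matches the fix described in the commit.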
    • enetc: include linux/vmalloc.h for vzalloc etc · bbcbf2ee
      Stephen Rothwell committed
      Fixes: d4fd0404 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · ec7146db
      David S. Miller committed
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2019-01-29
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      The main changes are:
      
      1) Teach the verifier dead code removal. This also allows optimizing /
         removing conditional branches around dead code and shrinking the
         resulting image. Code-store-constrained architectures like nfp would
         have a hard time doing this at JIT level, from Jakub.
      
      2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
         code generation for 32-bit sub-registers. Evaluation shows that this
         can result in code reduction of ~5-20% compared to 64 bit-only code
         generation. Also add implementation for most JITs, from Jiong.
      
      3) Add support for __int128 types in BTF which is also needed for
         vmlinux's BTF conversion to work, from Yonghong.
      
      4) Add a new command to bpftool in order to dump a list of BPF-related
         parameters from the system or for a specific network device e.g. in
         terms of available prog/map types or helper functions, from Quentin.
      
      5) Add AF_XDP sock_diag interface for querying sockets from user
         space which provides information about the RX/TX/fill/completion
         rings, umem, memory usage etc, from Björn.
      
      6) Add skb context access for skb_shared_info->gso_segs field, from Eric.
      
      7) Add support for testing flow dissector BPF programs by extending
         existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.
      
      8) Split BPF kselftest's test_verifier into various subgroups of tests
         in order to better deal with merge conflicts in this area, from Jakub.
      
      9) Add support for queue/stack manipulations in bpftool, from Stanislav.
      
      10) Document BTF, from Yonghong.
      
      11) Dump supported ELF section names in libbpf on program load
          failure, from Taeung.
      
      12) Silence a false positive compiler warning in verifier's BTF
          handling, from Peter.
      
      13) Fix help string in bpftool's feature probing, from Prashant.
      
      14) Remove duplicate includes in BPF kselftests, from Yue.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
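Item 2 of the pull request above adds JMP32 instructions that operate on 32-bit sub-registers. As a rough illustration of the semantic difference (an assumption-level model, not verifier or JIT code, with hypothetical function names), a 64-bit conditional jump compares whole registers while a 32-bit one looks only at the low halves:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a "jump if equal" test. A 64-bit jump compares
 * whole registers; a JMP32 jump compares only the low 32 bits (the
 * sub-registers) and ignores whatever is in the upper halves.
 */
static int jeq64_taken(uint64_t dst, uint64_t src)
{
    return dst == src;
}

static int jeq32_taken(uint64_t dst, uint64_t src)
{
    return (uint32_t)dst == (uint32_t)src;
}
```

When a program works on 32-bit values, the 32-bit form lets a code generator skip the extra instructions that would otherwise normalize the upper bits before a 64-bit comparison, which is one plausible source of the code-size reduction mentioned in the pull request.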
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 343917b4
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter/IPVS updates for your net-next tree:
      
      1) Introduce a hashtable to speed up object lookups, from Florian Westphal.
      
      2) Make direct calls to built-in extension, also from Florian.
      
      3) Call helper before confirming the conntrack as it used to be originally,
         from Florian.
      
      4) Call request_module() to autoload br_netfilter when physdev is used
         to relax the dependency, also from Florian.
      
      5) Allow to insert rules at a given position ID that is internal to the
         batch, from Phil Sutter.
      
      6) Several patches to replace conntrack indirections by direct calls,
         and to reduce modularization, from Florian. This also includes
         several follow up patches to deal with minor fallout from this
         rework.
      
      7) Use RCU from conntrack gre helper, from Florian.
      
      8) GRE conntrack module becomes built-in into nf_conntrack, from Florian.
      
      9) Replace nf_ct_invert_tuplepr() by calls to nf_ct_invert_tuple(),
         from Florian.
      
      10) Unify sysctl handling at the core of nf_conntrack, from Florian.
      
      11) Provide modparam to register conntrack hooks.
      
      12) Allow to match on the interface kind string, from wenxu.
      
      13) Remove several exported symbols that are no longer required now
          that a bit of de-modularization work has been done, from Florian.
      
      14) Remove built-in map support in the hash extension, this can be
          done with the existing userspace infrastructure, from laura.
      
      15) Remove indirection to calculate checksums in IPVS, from Matteo Croce.
      
      16) Use call wrappers for indirection in IPVS, also from Matteo.
      
      17) Remove superfluous __percpu parameter in nft_counter, patch from
          Luc Van Oostenryck.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-flow-dissector-tests' · 3d2af27a
      Daniel Borkmann committed
      Stanislav Fomichev says:
      
      ====================
      This patch series adds support for testing flow dissector BPF programs
      by extending the already existing BPF_PROG_TEST_RUN. The goal is to have
      a packet as an input and `struct bpf_flow_keys' as an output. That way
      we can easily test flow dissector programs' behavior. I've also modified
      the existing test_progs.c test to do a simple flow dissector run as well.
      
      * first patch introduces new __skb_flow_bpf_dissect to simplify
        sharing between __skb_flow_dissect and BPF_PROG_TEST_RUN
      * second patch adds actual BPF_PROG_TEST_RUN support
      * third patch adds example usage to the selftests
      
      v3:
      * rebased on top of latest bpf-next
      
      v2:
      * loop over 'kattr->test.repeat' inside of
        bpf_prog_test_run_flow_dissector, don't reuse
        bpf_test_run/bpf_test_run_one
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: add simple BPF_PROG_TEST_RUN examples for flow dissector · bf0f0fd9
      Stanislav Fomichev committed
      Use existing pkt_v4 and pkt_v6 to make sure flow_keys are what we want.
      
      Also, add new bpf_flow_load routine (and flow_dissector_load.h header)
      that loads bpf_flow.o program and does all required setup.
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: add BPF_PROG_TEST_RUN support for flow dissector · b7a1848e
      Stanislav Fomichev committed
      The input is packet data, the output is struct bpf_flow_keys. This should
      make it easy to test flow dissector programs without elaborate
      setup.
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • net/flow_dissector: move bpf case into __skb_flow_bpf_dissect · c8aa7038
      Stanislav Fomichev committed
      This way, we can reuse it for flow dissector in BPF_PROG_TEST_RUN.
      
      No functional changes.
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • tools: bpftool: warn about risky prog array updates · d76198b0
      Jakub Kicinski committed
      When a prog array is updated with bpftool, users often refer
      to the map via its ID.  Unfortunately, that's likely
      to lead to confusion because prog arrays get flushed when
      the last user reference is gone.  If there is no other
      reference, bpftool will create one, update the map successfully,
      and then close it again, causing it to be flushed.
      
      Warn about this case in non-JSON mode.
      
      If the problem continues causing confusion we can remove
      the support for referring to a map by ID for prog array
      update completely.  For now it seems like the potential
      inconvenience to users who know what they're doing outweighs
      the benefit.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests: bpf: remove duplicated include · cdd7b406
      YueHaibing committed
      Remove duplicated include.
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'qed-Bug-fixes' · bfe2599d
      David S. Miller committed
      Manish Chopra says:
      
      ====================
      qed: Bug fixes
      
      This series has SR-IOV and some general fixes.
      Please consider applying it to "net".
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix stack out of bounds bug · ffb057f9
      Manish Chopra committed
      KASAN reported the following bug in qed_init_qm_get_idx_from_flags
      due to inappropriate casting of "pq_flags". Fix the type of "pq_flags".
      
      
      [  196.624707] BUG: KASAN: stack-out-of-bounds in qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
      [  196.624712] Read of size 8 at addr ffff809b00bc7360 by task kworker/0:9/1712
      [  196.624714]
      [  196.624720] CPU: 0 PID: 1712 Comm: kworker/0:9 Not tainted 4.18.0-60.el8.aarch64+debug #1
      [  196.624723] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL024 09/26/2018
      [  196.624733] Workqueue: events work_for_cpu_fn
      [  196.624738] Call trace:
      [  196.624742]  dump_backtrace+0x0/0x2f8
      [  196.624745]  show_stack+0x24/0x30
      [  196.624749]  dump_stack+0xe0/0x11c
      [  196.624755]  print_address_description+0x68/0x260
      [  196.624759]  kasan_report+0x178/0x340
      [  196.624762]  __asan_report_load_n_noabort+0x38/0x48
      [  196.624786]  qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
      [  196.624808]  qed_init_qm_info+0xec0/0x2200 [qed]
      [  196.624830]  qed_resc_alloc+0x284/0x7e8 [qed]
      [  196.624853]  qed_slowpath_start+0x6cc/0x1ae8 [qed]
      [  196.624864]  __qede_probe.isra.10+0x1cc/0x12c0 [qede]
      [  196.624874]  qede_probe+0x78/0xf0 [qede]
      [  196.624879]  local_pci_probe+0xc4/0x180
      [  196.624882]  work_for_cpu_fn+0x54/0x98
      [  196.624885]  process_one_work+0x758/0x1900
      [  196.624888]  worker_thread+0x4e0/0xd18
      [  196.624892]  kthread+0x2c8/0x350
      [  196.624897]  ret_from_fork+0x10/0x18
      [  196.624899]
      [  196.624902] Allocated by task 2:
      [  196.624906]  kasan_kmalloc.part.1+0x40/0x108
      [  196.624909]  kasan_kmalloc+0xb4/0xc8
      [  196.624913]  kasan_slab_alloc+0x14/0x20
      [  196.624916]  kmem_cache_alloc_node+0x1dc/0x480
      [  196.624921]  copy_process.isra.1.part.2+0x1d8/0x4a98
      [  196.624924]  _do_fork+0x150/0xfa0
      [  196.624926]  kernel_thread+0x48/0x58
      [  196.624930]  kthreadd+0x3a4/0x5a0
      [  196.624932]  ret_from_fork+0x10/0x18
      [  196.624934]
      [  196.624937] Freed by task 0:
      [  196.624938] (stack is not available)
      [  196.624940]
      [  196.624943] The buggy address belongs to the object at ffff809b00bc0000
      [  196.624943]  which belongs to the cache thread_stack of size 32768
      [  196.624946] The buggy address is located 29536 bytes inside of
      [  196.624946]  32768-byte region [ffff809b00bc0000, ffff809b00bc8000)
      [  196.624948] The buggy address belongs to the page:
      [  196.624952] page:ffff7fe026c02e00 count:1 mapcount:0 mapping:ffff809b4001c000 index:0x0 compound_mapcount: 0
      [  196.624960] flags: 0xfffff8000008100(slab|head)
      [  196.624967] raw: 0fffff8000008100 dead000000000100 dead000000000200 ffff809b4001c000
      [  196.624970] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
      [  196.624973] page dumped because: kasan: bad access detected
      [  196.624974]
      [  196.624976] Memory state around the buggy address:
      [  196.624980]  ffff809b00bc7200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  196.624983]  ffff809b00bc7280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  196.624985] >ffff809b00bc7300: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
      [  196.624988]                                                        ^
      [  196.624990]  ffff809b00bc7380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  196.624993]  ffff809b00bc7400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  196.624995] ==================================================================
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix system crash in ll2 xmit · 7c81626a
      Manish Chopra committed
      Cache the number of fragments in the skb locally. For a
      linear skb (with zero fragments), tx completion
      (or freeing of the skb) may happen before the driver reads
      the number of fragments from the skb, which could
      lead to a stale access to an already freed skb.
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
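The pattern described in the ll2 xmit fix above, reading a field into a local before the object can be freed, can be sketched in user space as follows. struct model_skb, model_xmit() and model_ll2_start_xmit() are hypothetical stand-ins, and the immediate free() models the worst case in which completion runs before the transmit call returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb. */
struct model_skb {
    unsigned int nr_frags;
};

/*
 * Models a transmit call whose completion may free the buffer before the
 * call returns, as the commit describes for linear skbs.
 */
static void model_xmit(struct model_skb *skb)
{
    free(skb);
}

/*
 * Read nr_frags into a local before handing the buffer off, then use only
 * the local copy afterwards. Returns the number of fragments to process.
 */
static unsigned int model_ll2_start_xmit(struct model_skb *skb)
{
    unsigned int nr_frags;

    assert(skb != NULL);
    nr_frags = skb->nr_frags; /* cached while skb is still valid */

    model_xmit(skb);          /* skb must not be dereferenced after this */
    return nr_frags;
}
```

Any loop over fragments after the hand-off would then iterate on the cached count instead of touching the freed buffer.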
    • qed: Fix VF probe failure while FLR · 327852ec
      Manish Chopra committed
      VFs may hit a VF-PF channel timeout while probing. In some
      cases it was observed that the VF FLR and the VF "acquire" message
      transaction (i.e. the first message from VF to PF in the VF's probe flow)
      could occur simultaneously. The VF then fails to send the
      "acquire" message to the PF because it is marked disabled from the HW
      perspective due to the FLR, which results in a channel timeout and VF
      probe failure.
      
      In such cases, retry the VF "acquire" message so that a later
      attempt can reach the PF once the VF FLR has completed, allowing
      the VF to be probed successfully.
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix LACP pdu drops for VFs · ff929696
      Manish Chopra committed
      A VF is always configured to drop control frames
      (with reserved mac addresses), but for LACP to work
      on the VFs, LACP control frames must be
      forwarded or transmitted successfully.
      
      This patch fixes this by allowing trusted VFs
      (marked through ndo_set_vf_trust) to
      pass control frames such as LACP pdus.
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
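The policy in the LACP fix above can be sketched as a small filter: frames to reserved control addresses are dropped unless the sending VF is trusted. This is a hypothetical model, not qed code; it assumes the IEEE 802.1 reserved group address block 01:80:C2:00:00:00 through 01:80:C2:00:00:0F, in which LACP uses 01:80:C2:00:00:02, as the meaning of "reserved mac addresses".

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical check for the IEEE 802.1 reserved group addresses
 * 01:80:C2:00:00:00 through 01:80:C2:00:00:0F.
 */
static int is_reserved_ctrl_mac(const unsigned char mac[6])
{
    static const unsigned char prefix[5] = { 0x01, 0x80, 0xC2, 0x00, 0x00 };

    assert(mac != NULL);
    return memcmp(mac, prefix, sizeof(prefix)) == 0 && mac[5] <= 0x0F;
}

/* Drop control frames from a VF unless the VF has been marked trusted. */
static int vf_should_drop(const unsigned char mac[6], int vf_trusted)
{
    return is_reserved_ctrl_mac(mac) && !vf_trusted;
}
```

Ordinary unicast traffic is unaffected by the trust flag in this model; only frames addressed to the reserved block change behavior.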
    • qed: Fix bug in tx promiscuous mode settings · 9e71a15d
      Manish Chopra committed
      When running tx switched traffic between VNICs
      created via a bridge (to which VFs are added), the
      adapter drops the unicast packets in the tx flow because
      the VNIC's ucast mac is unknown to it. But the VF interfaces
      being in promiscuous mode should have caused the adapter
      to accept all unknown ucast packets. Later, it
      was found that the driver doesn't actually configure the tx
      promiscuous mode settings to accept all unknown unicast macs.
      
      This patch fixes the tx promiscuous mode settings to accept all
      unknown/unmatched unicast macs, which resolves the scenario.
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'qed-Error-recovery-process' · bb7c778b
      David S. Miller committed
      Michal Kalderon says:
      
      ====================
      qed*: Error recovery process
      
      Parity errors might happen in the device's memories due to momentary bit
      flips which are caused by radiation.
      Errors that are not correctable initiate a process kill event, which blocks
      the device access towards the host and the network, and a recovery process
      is started in the management FW and in the driver.
      
      This series adds the support of this process in the qed core module and in
      the qede driver (patches 2 & 3).
      Patch 1 in the series revises the load sequence, to avoid PCI errors that
      might be observed during a recovery process.
      
      Changes in v2:
      	- Addressed issue found in https://patchwork.ozlabs.org/patch/1030545/
      	  The change was done by removing the enum and passing a boolean to
      	  the related functions.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qede: Error recovery process · ccc67ef5
      Tomer Tayar committed
      This patch adds the error recovery process in the qede driver.
      The process includes a partial/customized driver unload and load, which
      allows it to look like a short suspend period to the kernel while
      preserving the net devices' state.
      Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com>
      Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>