1. 15 May, 2020 7 commits
    • bpf: Restrict bpf_probe_read{, str}() only to archs where they work · 0ebeea8c
      Daniel Borkmann committed
      Given the legacy bpf_probe_read{,str}() BPF helpers are broken on archs
      with overlapping address ranges, we should really take the next step to
      disable them from BPF use there.
      
      To generally fix the situation, we've recently added new helper variants
      bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str().
      For details on them, see 6ae08ae3 ("bpf: Add probe_read_{user, kernel}
      and probe_read_{user,kernel}_str helpers").
      
      Given bpf_probe_read{,str}() have been around for ~5 years by now, there
      are plenty of users at least on x86 still relying on them today, so we
      cannot remove them entirely w/o breaking the BPF tracing ecosystem.
      
      However, their use should be restricted to archs with non-overlapping
      address ranges where they are working in their current form. Therefore,
      move this behind a CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE and
      have x86, arm64, arm select it (other archs supporting it can follow-up
      on it as well).
      
      The remaining archs can easily work around this by relying on the
      feature probe from bpftool, which emits defines that can be used from
      BPF C code to implement a drop-in replacement for old/new kernels
      via: bpftool feature probe macro
      
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/bpf/20200515101118.6508-2-daniel@iogearbox.net
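The drop-in replacement described in the commit above can be sketched in plain C so that it compiles outside a BPF toolchain. This is only an illustration under stated assumptions: the two `#define` lines stand in for the header that the bpftool feature probe is assumed to emit, and `stub_probe_read_kernel()`, `stub_probe_read()` and `read_kernel()` are hypothetical names, not real BPF helpers.

```c
#include <assert.h>
#include <string.h>

/* Assumption: these two defines stand in for the header generated by the
 * bpftool feature probe on a kernel that offers bpf_probe_read_kernel()
 * to kprobe programs. */
#define HAVE_PROG_TYPE_HELPER(prog_type, helper) \
	BPF__PROG_TYPE_ ## prog_type ## __HELPER_ ## helper
#define BPF__PROG_TYPE_kprobe__HELPER_bpf_probe_read_kernel 1

/* Hypothetical user-space stand-ins for the real BPF helpers. */
static int calls_new, calls_legacy;

static inline long stub_probe_read_kernel(void *dst, unsigned int size,
					  const void *src)
{
	assert(dst && src);
	calls_new++;
	memcpy(dst, src, size);
	return 0;
}

static inline long stub_probe_read(void *dst, unsigned int size,
				   const void *src)
{
	assert(dst && src);
	calls_legacy++;
	memcpy(dst, src, size);
	return 0;
}

/* Pick the new helper when the probe says it exists, else fall back. */
#if HAVE_PROG_TYPE_HELPER(kprobe, bpf_probe_read_kernel)
#define read_kernel(dst, size, src) stub_probe_read_kernel(dst, size, src)
#else
#define read_kernel(dst, size, src) stub_probe_read(dst, size, src)
#endif
```

With the define present, `read_kernel()` resolves to the new variant at compile time; removing the define makes the same source fall back to the legacy call, which is the old/new kernel switch the commit message refers to.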
    • selftests/bpf: Enforce returning 0 for fentry/fexit programs · 6d74f64b
      Yonghong Song committed
      There are a few fentry/fexit programs returning non-0.
      The tests with these programs will break with the previous
      patch which enforced return-0 rules. Fix them properly.
      
      Fixes: ac065870 ("selftests/bpf: Add BPF_PROG, BPF_KPROBE, and BPF_KRETPROBE macros")
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200514053207.1298479-1-yhs@fb.com
    • bpf: Enforce returning 0 for fentry/fexit progs · e92888c7
      Yonghong Song committed
      Currently, tracing/fentry and tracing/fexit prog
      return values are not enforced. In trampoline codes,
      the fentry/fexit prog return values are ignored.
      Let us enforce it to be 0 to avoid confusion and
      allow potential future extension.
      
      This patch also explicitly adds return value
      checking for tracing/raw_tp, tracing/fmod_ret,
      and freplace programs such that these program
      return values can be anything. The purpose is
      twofold:
       1. to make return value expectations for these
          programs explicit in the verifier.
       2. for tracing prog_type, if a future attach type
          is added, the default is -ENOTSUPP, which forces
          its return value range to be specified explicitly.
      
      Fixes: fec56f58 ("bpf: Introduce BPF trampoline")
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200514053206.1298415-1-yhs@fb.com
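The return value policy described in the commit above can be sketched as a small switch. This is a simplified model rather than the verifier code: the `attach_type` enum, `check_tracing_return()` and the use of -EINVAL for a rejected value are assumptions made for illustration; only the "must be 0", "anything goes" and "-ENOTSUPP by default" cases come from the commit message.

```c
#include <assert.h>
#include <errno.h>

#ifndef ENOTSUPP
#define ENOTSUPP 524	/* kernel-internal errno, not exported by libc */
#endif

/* Hypothetical attach types for this sketch. */
enum attach_type {
	ATTACH_FENTRY,
	ATTACH_FEXIT,
	ATTACH_RAW_TP,
	ATTACH_MODIFY_RETURN,
	ATTACH_FUTURE,		/* stands for a type added later */
};

/* Returns 0 if `retval` is acceptable for a program of this attach type. */
static int check_tracing_return(enum attach_type type, long retval)
{
	switch (type) {
	case ATTACH_FENTRY:
	case ATTACH_FEXIT:
		return retval == 0 ? 0 : -EINVAL;	/* must return 0 */
	case ATTACH_RAW_TP:
	case ATTACH_MODIFY_RETURN:
		return 0;				/* any value is fine */
	default:
		return -ENOTSUPP;	/* new types must opt in explicitly */
	}
}
```

The default branch is the point of the second item in the commit message: a newly added attach type fails loudly until someone decides which return values it may produce.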
    • security: Fix the default value of secid_to_secctx hook · 625236ba
      Anders Roxell committed
      security_secid_to_secctx is called by the bpf_lsm hook and a successful
      return value (i.e. 0) implies that the parameter will be consumed by the
      LSM framework. With CONFIG_BPF_LSM enabled, the current behaviour is to
      return success even when the pointer isn't initialized, because of the
      default return value from kernel/bpf/bpf_lsm.c.
      
      This is the internal error:
      
      [ 1229.341488][ T2659] usercopy: Kernel memory exposure attempt detected from null address (offset 0, size 280)!
      [ 1229.374977][ T2659] ------------[ cut here ]------------
      [ 1229.376813][ T2659] kernel BUG at mm/usercopy.c:99!
      [ 1229.378398][ T2659] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      [ 1229.380348][ T2659] Modules linked in:
      [ 1229.381654][ T2659] CPU: 0 PID: 2659 Comm: systemd-journal Tainted: G    B   W         5.7.0-rc5-next-20200511-00019-g864e0c6319b8-dirty #13
      [ 1229.385429][ T2659] Hardware name: linux,dummy-virt (DT)
      [ 1229.387143][ T2659] pstate: 80400005 (Nzcv daif +PAN -UAO BTYPE=--)
      [ 1229.389165][ T2659] pc : usercopy_abort+0xc8/0xcc
      [ 1229.390705][ T2659] lr : usercopy_abort+0xc8/0xcc
      [ 1229.392225][ T2659] sp : ffff000064247450
      [ 1229.393533][ T2659] x29: ffff000064247460 x28: 0000000000000000
      [ 1229.395449][ T2659] x27: 0000000000000118 x26: 0000000000000000
      [ 1229.397384][ T2659] x25: ffffa000127049e0 x24: ffffa000127049e0
      [ 1229.399306][ T2659] x23: ffffa000127048e0 x22: ffffa000127048a0
      [ 1229.401241][ T2659] x21: ffffa00012704b80 x20: ffffa000127049e0
      [ 1229.403163][ T2659] x19: ffffa00012704820 x18: 0000000000000000
      [ 1229.405094][ T2659] x17: 0000000000000000 x16: 0000000000000000
      [ 1229.407008][ T2659] x15: 0000000000000000 x14: 003d090000000000
      [ 1229.408942][ T2659] x13: ffff80000d5b25b2 x12: 1fffe0000d5b25b1
      [ 1229.410859][ T2659] x11: 1fffe0000d5b25b1 x10: ffff80000d5b25b1
      [ 1229.412791][ T2659] x9 : ffffa0001034bee0 x8 : ffff00006ad92d8f
      [ 1229.414707][ T2659] x7 : 0000000000000000 x6 : ffffa00015eacb20
      [ 1229.416642][ T2659] x5 : ffff0000693c8040 x4 : 0000000000000000
      [ 1229.418558][ T2659] x3 : ffffa0001034befc x2 : d57a7483a01c6300
      [ 1229.420610][ T2659] x1 : 0000000000000000 x0 : 0000000000000059
      [ 1229.422526][ T2659] Call trace:
      [ 1229.423631][ T2659]  usercopy_abort+0xc8/0xcc
      [ 1229.425091][ T2659]  __check_object_size+0xdc/0x7d4
      [ 1229.426729][ T2659]  put_cmsg+0xa30/0xa90
      [ 1229.428132][ T2659]  unix_dgram_recvmsg+0x80c/0x930
      [ 1229.429731][ T2659]  sock_recvmsg+0x9c/0xc0
      [ 1229.431123][ T2659]  ____sys_recvmsg+0x1cc/0x5f8
      [ 1229.432663][ T2659]  ___sys_recvmsg+0x100/0x160
      [ 1229.434151][ T2659]  __sys_recvmsg+0x110/0x1a8
      [ 1229.435623][ T2659]  __arm64_sys_recvmsg+0x58/0x70
      [ 1229.437218][ T2659]  el0_svc_common.constprop.1+0x29c/0x340
      [ 1229.438994][ T2659]  do_el0_svc+0xe8/0x108
      [ 1229.440587][ T2659]  el0_svc+0x74/0x88
      [ 1229.441917][ T2659]  el0_sync_handler+0xe4/0x8b4
      [ 1229.443464][ T2659]  el0_sync+0x17c/0x180
      [ 1229.444920][ T2659] Code: aa1703e2 aa1603e1 910a8260 97ecc860 (d4210000)
      [ 1229.447070][ T2659] ---[ end trace 400497d91baeaf51 ]---
      [ 1229.448791][ T2659] Kernel panic - not syncing: Fatal exception
      [ 1229.450692][ T2659] Kernel Offset: disabled
      [ 1229.452061][ T2659] CPU features: 0x240002,20002004
      [ 1229.453647][ T2659] Memory Limit: none
      [ 1229.455015][ T2659] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Rework this so the default return value is -EOPNOTSUPP.
      
      There are likely other callbacks, such as security_inode_getsecctx(), that
      may have the same problem; someone who understands the code better needs
      to audit them.
      
      Thank you Arnd for helping me figure out what went wrong.
      
      Fixes: 98e828a0 ("security: Refactor declaration of LSM hooks")
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: James Morris <jamorris@linux.microsoft.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/bpf/20200512174607.9630-1-anders.roxell@linaro.org
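Why the hook's default return value matters can be shown with a user-space model. It is a sketch under assumptions: `noop_secid_to_secctx()` is a hypothetical stand-in for a hook that does nothing except return the default, and `caller_copies_from_null()` models a caller that trusts the output pointer whenever the hook reports success.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a hook with no real implementation: it returns
 * the hook's default value and leaves its outputs untouched. */
static int noop_secid_to_secctx(int default_ret, char **secdata,
				unsigned int *seclen)
{
	assert(secdata && seclen);
	return default_ret;
}

/* Models a caller that treats secdata as valid on success.
 * Returns 1 if it would go on to copy from a NULL pointer. */
static int caller_copies_from_null(int default_ret)
{
	char *secdata = NULL;
	unsigned int seclen = 0;

	if (noop_secid_to_secctx(default_ret, &secdata, &seclen) != 0)
		return 0;	/* error reported: secdata is never used */
	return secdata == NULL;	/* "success", but nothing was filled in */
}
```

With a default of 0 the caller proceeds with an uninitialized pointer, which matches the usercopy abort in the trace; with -EOPNOTSUPP it bails out before touching it.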
    • libbpf: Fix register naming in PT_REGS s390 macros · 516d8d49
      Sumanth Korikkar committed
      Fix register naming in PT_REGS s390 macros
      
      Fixes: b8ebce86 ("libbpf: Provide CO-RE variants of PT_REGS macros")
      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200513154414.29972-1-sumanthk@linux.ibm.com
    • bpf: Fix bug in mmap() implementation for BPF array map · 333291ce
      Andrii Nakryiko committed
      The mmap() subsystem allows a user-space application to memory-map a region
      with an initial page offset. This wasn't taken into account in the initial
      implementation of BPF array memory-mapping. As a result, the wrong pages, not
      taking the requested page shift into account, were memory-mapped into
      user-space. This patch fixes this gap and adds a test for such a scenario.
      
      Fixes: fc970227 ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY")
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20200512235925.3817805-1-andriin@fb.com
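The page-offset handling that the commit above adds can be pictured with a small address computation. This is a user-space sketch, not the kernel implementation: `map_window()`, `SKETCH_PAGE_SIZE` and the flat `base` buffer are hypothetical, and the bounds check is an assumption about what a correct implementation has to reject.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* hypothetical page size for this sketch */

/* Returns the first byte a mapping of `npages` pages starting at page
 * offset `pgoff` should see, or NULL if the window leaves the map.
 * The bug described above corresponds to always returning `base`. */
static const unsigned char *map_window(const unsigned char *base,
				       size_t map_pages, size_t pgoff,
				       size_t npages)
{
	assert(base != NULL);
	if (pgoff > map_pages || npages > map_pages - pgoff)
		return NULL;
	return base + pgoff * SKETCH_PAGE_SIZE;
}
```

A mapping requested at page offset 2 must start at the third page of the array data; ignoring the offset silently hands back the first page instead.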
    • samples: bpf: Fix build error · 23ad0466
      Matteo Croce committed
      GCC 10 is very strict about symbol clashes, and lwt_len_hist_user contains
      a symbol which clashes with libbpf:
      
      /usr/bin/ld: samples/bpf/lwt_len_hist_user.o:(.bss+0x0): multiple definition of `bpf_log_buf'; samples/bpf/bpf_load.o:(.bss+0x8c0): first defined here
      collect2: error: ld returned 1 exit status
      
      bpf_log_buf here seems to be a leftover, so remove it.
      
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20200511113234.80722-1-mcroce@redhat.com
  2. 14 May, 2020 9 commits
  3. 13 May, 2020 8 commits
  4. 12 May, 2020 3 commits
    • Merge branch 'net-ipa-fix-cleanup-after-modem-crash' · 1abfb181
      David S. Miller committed
      Alex Elder says:
      
      ====================
      net: ipa: fix cleanup after modem crash
      
      The first patch in this series fixes a bug where the size of a data
      transfer request was never set, meaning it was 0.  The consequence
      of this was that such a transfer request would never complete if
      attempted, and led to a hung task timeout.
      
      This data transfer is required for cleaning up IPA hardware state
      when recovering from a modem crash.  The code to implement this
      cleanup is already present, but its use was commented out because
      it hit the bug described above.  So the second patch in this series
      enables the use of that "tag process" cleanup code.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: use tag process on modem crash · 2c4bb809
      Alex Elder committed
      One part of recovering from a modem crash is performing a "tag
      sequence" of several IPA immediate commands, to clear the hardware
      pipeline.  The sequence ends with a data transfer request on the
      command endpoint (which is not otherwise done).  Unfortunately,
      attempting to do the data transfer led to a hang, so that request
      plus two other commands were commented out.
      
      The previous commit fixes the bug that was causing that hang.  And
      with that bug fixed we can properly issue the tag sequence when the
      modem crashes, to return the hardware to a known state.
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: set DMA length in gsi_trans_cmd_add() · c781e1d4
      Alex Elder committed
      When a command gets added to a transaction for the AP->command
      channel we set the DMA address of its scatterlist entry, but not
      its DMA length.  Fix this bug.
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 11 May, 2020 2 commits
  6. 10 May, 2020 2 commits
    • netprio_cgroup: Fix unlimited memory leak of v2 cgroups · 090e28b2
      Zefan Li committed
      If systemd is configured to use hybrid mode, which enables the use of
      both cgroup v1 and v2, systemd will create a new cgroup on both the default
      root (v2) and the netprio_cgroup hierarchy (v1) for a new session and attach
      the task to the two cgroups. If the task does any networking, the v2
      cgroup can never be freed after the session exits.
      
      One of our machines ran into OOM due to this memory leak.
      
      In the scenario described above, when sk_alloc() is called,
      cgroup_sk_alloc() thinks it's in v2 mode, so it stores
      the cgroup pointer in sk->sk_cgrp_data and increments
      the cgroup refcnt, but then sock_update_netprioidx()
      thinks it's in v1 mode, so it stores the netprioidx value
      in sk->sk_cgrp_data, and the cgroup reference is never dropped.
      
      Currently we do the mode switch when someone writes to the ifpriomap
      cgroup control file. The easiest fix is to also do the switch when
      a task is attached to a new cgroup.
      
      Fixes: bd1060a1 ("sock, cgroup: add sock->sk_cgroup")
      Reported-by: Yang Yingliang <yangyingliang@huawei.com>
      Tested-by: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
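The leak described in the commit above comes from two code paths disagreeing about what one storage slot holds. The following user-space model is a sketch under assumptions: the `struct sock` layout, the `sk_alloc_disabled` flag and `netprio_attach()` are simplified, hypothetical stand-ins for the kernel's sk_cgrp_data handling and its v1/v2 mode switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cgroup { int refcnt; };

/* Simplified sk_cgrp_data: one storage slot, used either way. */
struct sock {
	bool is_data;			/* true: slot holds v1 data */
	union {
		struct cgroup *cgrp;	/* v2: referenced cgroup */
		unsigned int prioidx;	/* v1: netprio index */
	};
};

static bool sk_alloc_disabled;	/* models the switch to v1 mode */

static void sk_alloc(struct sock *sk, struct cgroup *cgrp)
{
	assert(sk && cgrp);
	sk->is_data = false;
	sk->cgrp = NULL;
	if (sk_alloc_disabled)
		return;			/* v1 mode: take no reference */
	sk->cgrp = cgrp;
	cgrp->refcnt++;
}

static void update_netprioidx(struct sock *sk, unsigned int idx)
{
	sk->is_data = true;		/* overwrites any stored pointer */
	sk->prioidx = idx;
}

static void sk_free(struct sock *sk)
{
	if (!sk->is_data && sk->cgrp)
		sk->cgrp->refcnt--;	/* only reachable in v2 mode */
}

/* The fix: attaching a task to a netprio cgroup also switches modes. */
static void netprio_attach(void)
{
	sk_alloc_disabled = true;
}
```

Without the mode switch, the reference taken in `sk_alloc()` is overwritten by the index and can never be dropped; once `netprio_attach()` has run, no reference is taken in the first place.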
    • net: freescale: select CONFIG_FIXED_PHY where needed · 99352c79
      Arnd Bergmann committed
      I ran into a randconfig build failure with CONFIG_FIXED_PHY=m
      and CONFIG_GIANFAR=y:
      
      x86_64-linux-ld: drivers/net/ethernet/freescale/gianfar.o:(.rodata+0x418): undefined reference to `fixed_phy_change_carrier'
      
      It seems the same thing can happen with dpaa and ucc_geth, so change
      all three to do an explicit 'select FIXED_PHY'.
      
      The fixed-phy driver actually has an alternative stub function that
      theoretically allows building network drivers when fixed-phy is
      disabled, but I don't see how that would help here, as the drivers
      presumably would not work then.
      
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  7. 09 May, 2020 8 commits
  8. 08 May, 2020 1 commit
    • net: fix a potential recursive NETDEV_FEAT_CHANGE · dd912306
      Cong Wang committed
      syzbot managed to trigger a recursive NETDEV_FEAT_CHANGE event
      between bonding master and slave. I managed to find a reproducer
      for this:
      
        ip li set bond0 up
        ifenslave bond0 eth0
        brctl addbr br0
        ethtool -K eth0 lro off
        brctl addif br0 bond0
        ip li set br0 up
      
      When a NETDEV_FEAT_CHANGE event is triggered on a bonding slave,
      the bonding driver captures it and calls bond_compute_features() to
      fix up its master's and other slaves' features. However, when syncing
      with its lower devices by netdev_sync_lower_features(), this event is
      triggered again on slaves when the LRO feature fails to change,
      so it goes back and forth recursively until the kernel stack is
      exhausted.
      
      Commit 17b85d29 intentionally lets __netdev_update_features()
      return -1 for such a failure case, so we have to just rely on
      the existing check inside netdev_sync_lower_features() and skip
      NETDEV_FEAT_CHANGE event only for this specific failure case.
      
      Fixes: fd867d51 ("net/core: generic support for disabling netdev features down stack")
      Reported-by: syzbot+e73ceacfd8560cc8a3ca@syzkaller.appspotmail.com
      Reported-by: syzbot+c2fb6f9ddcea95ba49b5@syzkaller.appspotmail.com
      Cc: Jarod Wilson <jarod@redhat.com>
      Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
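The rule the commit above settles on, namely skipping the event when the feature could not actually be changed, can be sketched as follows. This is a simplified model, not the netdev core code: `struct dev`, `FEAT_LRO`, `try_disable()` and `sync_lower_feature()` are hypothetical names, and a counter stands in for sending NETDEV_FEAT_CHANGE.

```c
#include <assert.h>

#define FEAT_LRO 0x1u	/* hypothetical feature bit for this sketch */

struct dev {
	unsigned int features;	/* currently enabled features */
	unsigned int fixed;	/* features the device refuses to change */
	int notifications;	/* NETDEV_FEAT_CHANGE events sent */
};

/* Try to clear a feature; bits listed in `fixed` stay set. */
static void try_disable(struct dev *d, unsigned int feature)
{
	d->features &= ~(feature & ~d->fixed);
}

/* Sync one lower device after the upper device dropped `feature`. */
static void sync_lower_feature(struct dev *lower, unsigned int feature)
{
	assert(lower);
	if (!(lower->features & feature))
		return;			/* nothing to do */
	try_disable(lower, feature);
	if (lower->features & feature)
		return;			/* failed: warn only, no event */
	lower->notifications++;		/* changed: notify once */
}
```

Because a device that refuses the change never produces an event, the upper device has nothing to react to and the back-and-forth between master and slave cannot start.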