1. 07 Jan, 2020 4 commits
    • L
      iommu/vt-d: debugfs: Add support to show page table internals · e2726dae
      Lu Baolu committed
      Export page table internals of the domain attached to each device.
      Example of such dump on a Skylake machine:
      
      $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
      [ ... ]
      Device 0000:00:14.0 with pasid 0 @0x15f3d9000
      IOVA_PFN                PML5E                   PML4E
      0x000000008ced0 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced1 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced2 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced3 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced4 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced5 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced6 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced7 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced8 |       0x0000000000000000      0x000000015f3da003
      0x000000008ced9 |       0x0000000000000000      0x000000015f3da003
      
      PDPE                    PDE                     PTE
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced0003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced1003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced2003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced3003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced4003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced5003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced6003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced7003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced8003
      0x000000015f3db003      0x000000015f3dc003      0x000000008ced9003
      [ ... ]
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      e2726dae
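To read the dump above, it can help to see how an IOVA page-frame number splits into per-level table indexes. The following is a minimal userspace sketch, not driver code: the helper names (`pfn_level_index`, `entry_addr`, `entry_low_flags`) are hypothetical, and it assumes 4 KiB pages, 9 index bits per paging level, and that the low 12 bits of each entry carry flags while no high flag bits are set (as in the entries shown).

```c
#include <stdint.h>

/* Assumption: every paging level indexes a 512-entry table (9 bits). */
#define LEVEL_STRIDE 9
#define LEVEL_MASK   ((1u << LEVEL_STRIDE) - 1)

/* level 1 = PTE, 2 = PDE, 3 = PDPE, 4 = PML4E, 5 = PML5E */
static unsigned int pfn_level_index(uint64_t iova_pfn, int level)
{
	return (unsigned int)((iova_pfn >> ((level - 1) * LEVEL_STRIDE)) & LEVEL_MASK);
}

/* Address part of an entry: clear the low 12 bits, which hold flags. */
static uint64_t entry_addr(uint64_t entry)
{
	return entry & ~0xfffULL;
}

/* Low two flag bits of an entry; every entry in the dump has both set. */
static unsigned int entry_low_flags(uint64_t entry)
{
	return (unsigned int)(entry & 0x3);
}
```

For the first row, `entry_addr(0x000000008ced0003)` yields `0x8ced0000`, whose frame number equals the IOVA_PFN `0x8ced0`, which suggests that the range shown is identity-mapped.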
    • L
      iommu/vt-d: Flush PASID-based iotlb for iova over first level · 33cd6e64
      Lu Baolu committed
      When software has changed first-level tables, it should invalidate
      the affected IOTLB and the paging-structure-caches using the PASID-
      based-IOTLB Invalidate Descriptor defined in spec 6.5.2.4.
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      33cd6e64
    • L
      iommu/vt-d: Setup pasid entries for iova over first level · ddf09b6d
      Lu Baolu committed
      Intel VT-d in scalable mode supports two types of page tables for
      IOVA translation: first level and second level. The IOMMU driver
      can choose either of them for IOVA translation according to the use
      case. This sets up the pasid entry if a domain is selected to use
      the first-level page table for iova translation.
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      ddf09b6d
    • J
      iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks · ff3dc652
      Jacob Pan committed
      Shared Virtual Memory (SVM) is based on a collective set of hardware
      features detected at runtime. There are requirements for matching CPU
      and IOMMU capabilities.
      
      The current code checks CPU and IOMMU feature set for SVM support but
      the result is never stored nor used. Therefore, SVM can still be used
      even when these checks failed. The consequences can be:
      1. CPU uses 5-level paging mode for virtual address of 57 bits, but
      IOMMU can only support 4-level paging mode with 48 bits address for DMA.
      2. 1GB page size is used by CPU but IOMMU does not support it. VT-d
      unrecoverable faults may be generated.
      
      The best solution to fix these problems is to prevent them in the first
      place.
      
      This patch consolidates the code for checking PASID and CPU vs. IOMMU
      paging mode compatibility, and provides a specific error message for
      each failed check. On sane hardware configurations, these error messages
      shall never appear in the kernel log.
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      ff3dc652
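The matching rule described in this commit can be illustrated with a small standalone sketch. Everything here is hypothetical: the `cpu_caps` and `iommu_caps` structs, their fields, and the message texts are stand-ins for whatever the driver reads from CPU feature flags and IOMMU capability registers. The point is only that the result of the checks is returned and must be acted upon by the caller.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical capability summaries, for illustration only. */
struct cpu_caps   { bool la57; bool gbpages; };
struct iommu_caps { bool pasid; bool five_level; bool gbpages; };

/* Returns true only when SVM may be enabled; each failure is reported. */
static bool svm_caps_compatible(const struct cpu_caps *cpu,
				const struct iommu_caps *iommu)
{
	if (!iommu->pasid) {
		fprintf(stderr, "SVM disabled: IOMMU lacks PASID support\n");
		return false;
	}
	if (cpu->gbpages && !iommu->gbpages) {
		fprintf(stderr, "SVM disabled: CPU uses 1GB pages, IOMMU does not support them\n");
		return false;
	}
	if (cpu->la57 && !iommu->five_level) {
		fprintf(stderr, "SVM disabled: CPU uses 5-level paging, IOMMU supports only 4-level\n");
		return false;
	}
	return true;
}
```

A caller would store the boolean (for example in a per-IOMMU flag) and refuse SVM binds when it is false, which is the step the commit message says was previously missing.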
  2. 21 Dec, 2019 2 commits
  3. 20 Dec, 2019 1 commit
  4. 18 Dec, 2019 4 commits
  5. 17 Dec, 2019 2 commits
  6. 16 Dec, 2019 1 commit
  7. 14 Dec, 2019 1 commit
  8. 13 Dec, 2019 4 commits
    • D
      fs: remove ksys_dup() · 8243186f
      Dominik Brodowski committed
      ksys_dup() is used only at one place in the kernel, namely to duplicate
      fd 0 of /dev/console to stdout and stderr. The same functionality can be
      achieved by using functions already available within the kernel namespace.
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      8243186f
    • D
      init: unify opening /dev/console as stdin/stdout/stderr · b49a733d
      Dominik Brodowski committed
      Merge the two instances where /dev/console is opened as
      stdin/stdout/stderr.
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      b49a733d
    • R
      cpufreq: Avoid leaving stale IRQ work items during CPU offline · 85572c2c
      Rafael J. Wysocki committed
      The scheduler code calling cpufreq_update_util() may run during CPU
      offline on the target CPU after the IRQ work lists have been flushed
      for it, so the target CPU should be prevented from running code that
      may queue up an IRQ work item on it at that point.
      
      Unfortunately, that may not be the case if dvfs_possible_from_any_cpu
      is set for at least one cpufreq policy in the system, because that
      allows the CPU going offline to run the utilization update callback
      of the cpufreq governor on behalf of another (online) CPU in some
      cases.
      
      If that happens, the cpufreq governor callback may queue up an IRQ
      work on the CPU running it, which is going offline, and the IRQ work
      may not be flushed after that point.  Moreover, that IRQ work cannot
      be flushed until the "offlining" CPU goes back online, so if any
      other CPU calls irq_work_sync() to wait for the completion of that
      IRQ work, it will have to wait until the "offlining" CPU is back
      online and that may not happen forever.  In particular, a system-wide
      deadlock may occur during CPU online as a result of that.
      
      The failing scenario is as follows.  CPU0 is the boot CPU, so it
      creates a cpufreq policy and becomes the "leader" of it
      (policy->cpu).  It cannot go offline, because it is the boot CPU.
      Next, other CPUs join the cpufreq policy as they go online and they
      leave it when they go offline.  The last CPU to go offline, say CPU3,
      may queue up an IRQ work while running the governor callback on
      behalf of CPU0 after leaving the cpufreq policy because of the
      dvfs_possible_from_any_cpu effect described above.  Then, CPU0 is
      the only online CPU in the system and the stale IRQ work is still
      queued on CPU3.  When, say, CPU1 goes back online, it will run
      irq_work_sync() to wait for that IRQ work to complete and so it
      will wait for CPU3 to go back online (which may never happen even
      in principle), but (worse yet) CPU0 is waiting for CPU1 at that
      point too and a system-wide deadlock occurs.
      
      To address this problem notice that CPUs which cannot run cpufreq
      utilization update code for themselves (for example, because they
      have left the cpufreq policies that they belonged to), should also
      be prevented from running that code on behalf of the other CPUs that
      belong to a cpufreq policy with dvfs_possible_from_any_cpu set and so
      in that case the cpufreq_update_util_data pointer of the CPU running
      the code must not be NULL as well as for the CPU which is the target
      of the cpufreq utilization update in progress.
      
      Accordingly, change cpufreq_this_cpu_can_update() into a regular
      function in kernel/sched/cpufreq.c (instead of a static inline in a
      header file) and make it check the cpufreq_update_util_data pointer
      of the local CPU if dvfs_possible_from_any_cpu is set for the target
      cpufreq policy.
      
      Also update the schedutil governor to do the
      cpufreq_this_cpu_can_update() check in the non-fast-switch
      case too to avoid the stale IRQ work issues.
      
      Fixes: 99d14d0e ("cpufreq: Process remote callbacks from any CPU if the platform permits")
      Link: https://lore.kernel.org/linux-pm/20191121093557.bycvdo4xyinbc5cb@vireshk-i7/
      Reported-by: Anson Huang <anson.huang@nxp.com>
      Tested-by: Anson Huang <anson.huang@nxp.com>
      Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Tested-by: Peng Fan <peng.fan@nxp.com> (i.MX8QXP-MEK)
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      85572c2c
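The check described in this commit can be modeled in plain C. This is a simplified sketch under stated assumptions, not the kernel function: `policy_sketch`, `update_util_data[]` and `this_cpu_can_update()` are hypothetical stand-ins for the cpufreq policy, the per-CPU cpufreq_update_util_data pointer and cpufreq_this_cpu_can_update(), and the CPU number is passed explicitly instead of being looked up.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical, reduced view of a cpufreq policy. */
struct policy_sketch {
	bool cpus[NR_CPUS];               /* CPUs that belong to the policy */
	bool dvfs_possible_from_any_cpu;  /* remote updates permitted */
};

/* Stand-in for the per-CPU update-hook pointer; NULL means the CPU
 * cannot run utilization update code for itself. */
static void *update_util_data[NR_CPUS];

static bool this_cpu_can_update(const struct policy_sketch *policy, int this_cpu)
{
	/* A member of the policy may always update it. A non-member may do
	 * so only if remote updates are allowed AND it still has a hook. */
	return policy->cpus[this_cpu] ||
	       (policy->dvfs_possible_from_any_cpu &&
		update_util_data[this_cpu] != NULL);
}
```

In the failing scenario above, CPU3 has left the policy and its hook pointer is NULL, so the function returns false and no IRQ work is queued on the CPU that is going offline.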
    • G
      blk-cgroup: remove blkcg_drain_queue · 5addeae1
      Guoqing Jiang committed
      Since blk_drain_queue has already been removed, this function
      is no longer needed.
      Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      5addeae1
  9. 12 Dec, 2019 3 commits
    • D
      init: use do_mount() instead of ksys_mount() · cccaa5e3
      Dominik Brodowski committed
      In prepare_namespace(), do_mount() can be used instead of ksys_mount()
      as the first and third argument are const strings in the kernel, the
      second and fourth argument are passed through anyway, and the fifth
      argument is NULL.
      
      In do_mount_root(), ksys_mount() is called with the first and third
      argument being already kernelspace strings, which do not need to be
      copied over from userspace to kernelspace (again). The second and
      fourth arguments are passed through to do_mount() anyway. The fifth
      argument, while already residing in kernelspace, needs to be put into
      a page of its own. Then, do_mount() can be used instead of
      ksys_mount().
      
      Once this is done, there are no in-kernel users of ksys_mount() left,
      which can therefore be removed.
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      cccaa5e3
    • D
      devtmpfs: use do_mount() instead of ksys_mount() · 5e787dbf
      Dominik Brodowski committed
      In devtmpfs, do_mount() can be called directly instead of complex wrapping
      by ksys_mount():
      - the first and third arguments are const strings in the kernel,
        and do not need to be copied over from userspace;
      - the fifth argument is NULL, and therefore no page needs to be
        copied over from userspace;
      - the second and fourth argument are passed through anyway.
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      5e787dbf
    • A
      bpf: Make BPF trampoline use register_ftrace_direct() API · b91e014f
      Alexei Starovoitov committed
      Make BPF trampoline attach its generated assembly code to kernel functions via
      register_ftrace_direct() API. It helps ftrace-based tracers co-exist with BPF
      trampoline on the same kernel function. It also switches attaching logic from
      arch specific text_poke to generic ftrace that is available on many
      architectures. text_poke is still necessary for bpf-to-bpf attach and for
      bpf_tail_call optimization.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20191209000114.1876138-3-ast@kernel.org
      b91e014f
  10. 11 Dec, 2019 4 commits
  11. 10 Dec, 2019 2 commits
  12. 09 Dec, 2019 2 commits
  13. 08 Dec, 2019 3 commits
    • A
      efi: Fix efi_loaded_image_t::unload type · 9fa76ca7
      Arvind Sankar committed
      The ::unload field is a function pointer, so it should be u32 for 32-bit,
      u64 for 64-bit. Add a prototype for it in the native efi_loaded_image_t
      type. Also change type of parent_handle and device_handle from void * to
      efi_handle_t for documentation purposes.
      
      The unload method is not used, so no functional change.
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: https://lkml.kernel.org/r/20191206165542.31469-6-ardb@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9fa76ca7
    • L
      pipe: remove 'waiting_writers' merging logic · a28c8b9d
      Linus Torvalds committed
      This code is ancient, and goes back to when we only had a single page
      for the pipe buffers.  The exact history is hidden in the mists of time
      (ie "before git", and in fact predates the BK repository too).
      
      At that long-ago point in time, it actually helped to try to merge big
      back-and-forth pipe reads and writes, and not limit pipe reads to the
      single pipe buffer in length just because that was all we had at a time.
      
      However, since then we've expanded the pipe buffers to multiple pages,
      and this logic really doesn't seem to make sense.  And a lot of it is
      somewhat questionable (ie "hmm, the user asked for a non-blocking read,
      but we see that there's a writer pending, so let's wait anyway to get
      the extra data that the writer will have").
      
      But more importantly, it makes the "go to sleep" logic much less
      obvious, and considering the wakeup issues we've had, I want to make for
      less of those kinds of things.
      
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a28c8b9d
    • E
      inet: protect against too small mtu values. · 501a90c9
      Eric Dumazet committed
      syzbot was once again able to crash a host by setting a very small mtu
      on loopback device.
      
      Let's make inetdev_valid_mtu() available in include/net/ip.h,
      and use it in ip_setup_cork(), so that we protect both ip_append_page()
      and __ip_append_data()
      
      Also add a READ_ONCE() when the device mtu is read.
      
      Pairs this lockless read with one WRITE_ONCE() in __dev_set_mtu(),
      even if other code paths might write over this field.
      
      Add a big comment in include/linux/netdevice.h about dev->mtu
      needing READ_ONCE()/WRITE_ONCE() annotations.
      
      Hopefully we will add the missing ones in followup patches.
      
      [1]
      
      refcount_t: saturated; leaking memory.
      WARNING: CPU: 0 PID: 9464 at lib/refcount.c:22 refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 9464 Comm: syz-executor850 Not tainted 5.4.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       panic+0x2e3/0x75c kernel/panic.c:221
       __warn.cold+0x2f/0x3e kernel/panic.c:582
       report_bug+0x289/0x300 lib/bug.c:195
       fixup_bug arch/x86/kernel/traps.c:174 [inline]
       fixup_bug arch/x86/kernel/traps.c:169 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
       invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
      RIP: 0010:refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Code: 06 31 ff 89 de e8 c8 f5 e6 fd 84 db 0f 85 6f ff ff ff e8 7b f4 e6 fd 48 c7 c7 e0 71 4f 88 c6 05 56 a6 a4 06 01 e8 c7 a8 b7 fd <0f> 0b e9 50 ff ff ff e8 5c f4 e6 fd 0f b6 1d 3d a6 a4 06 31 ff 89
      RSP: 0018:ffff88809689f550 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff815e4336 RDI: ffffed1012d13e9c
      RBP: ffff88809689f560 R08: ffff88809c50a3c0 R09: fffffbfff15d31b1
      R10: fffffbfff15d31b0 R11: ffffffff8ae98d87 R12: 0000000000000001
      R13: 0000000000040100 R14: ffff888099041104 R15: ffff888218d96e40
       refcount_add include/linux/refcount.h:193 [inline]
       skb_set_owner_w+0x2b6/0x410 net/core/sock.c:1999
       sock_wmalloc+0xf1/0x120 net/core/sock.c:2096
       ip_append_page+0x7ef/0x1190 net/ipv4/ip_output.c:1383
       udp_sendpage+0x1c7/0x480 net/ipv4/udp.c:1276
       inet_sendpage+0xdb/0x150 net/ipv4/af_inet.c:821
       kernel_sendpage+0x92/0xf0 net/socket.c:3794
       sock_sendpage+0x8b/0xc0 net/socket.c:936
       pipe_to_sendpage+0x2da/0x3c0 fs/splice.c:458
       splice_from_pipe_feed fs/splice.c:512 [inline]
       __splice_from_pipe+0x3ee/0x7c0 fs/splice.c:636
       splice_from_pipe+0x108/0x170 fs/splice.c:671
       generic_splice_sendpage+0x3c/0x50 fs/splice.c:842
       do_splice_from fs/splice.c:861 [inline]
       direct_splice_actor+0x123/0x190 fs/splice.c:1035
       splice_direct_to_actor+0x3b4/0xa30 fs/splice.c:990
       do_splice_direct+0x1da/0x2a0 fs/splice.c:1078
       do_sendfile+0x597/0xd00 fs/read_write.c:1464
       __do_sys_sendfile64 fs/read_write.c:1525 [inline]
       __se_sys_sendfile64 fs/read_write.c:1511 [inline]
       __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441409
      Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffb64c4f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441409
      RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000005
      RBP: 0000000000073b8a R08: 0000000000000010 R09: 0000000000000010
      R10: 0000000000010001 R11: 0000000000000246 R12: 0000000000402180
      R13: 0000000000402210 R14: 0000000000000000 R15: 0000000000000000
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 1470ddf7 ("inet: Remove explicit write references to sk/inet in ip_append_data")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      501a90c9
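The guard this commit describes amounts to rejecting a device MTU that is below the IPv4 minimum before any fragment sizing is derived from it. A minimal userspace sketch follows; the `_sketch` function names are hypothetical, the value 68 is the IPv4 minimum MTU, the `-ENETUNREACH` return code is an assumption, and the READ_ONCE()/WRITE_ONCE() pairing mentioned above is not modeled.

```c
#include <errno.h>
#include <stdbool.h>

#define IPV4_MIN_MTU 68  /* smallest MTU an IPv4 link must support */

/* Mirrors the idea of inetdev_valid_mtu(): anything smaller is unusable. */
static bool inetdev_valid_mtu_sketch(unsigned int mtu)
{
	return mtu >= IPV4_MIN_MTU;
}

/* Hypothetical cork setup: refuse to proceed with a too-small MTU
 * instead of computing fragment sizes from it. */
static int setup_cork_sketch(unsigned int dev_mtu, unsigned int *fragsize)
{
	if (!inetdev_valid_mtu_sketch(dev_mtu))
		return -ENETUNREACH;  /* assumed error code */
	*fragsize = dev_mtu;
	return 0;
}
```

Because both ip_append_page() and __ip_append_data() rely on the cork set up here, a single check at this point covers both paths.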
  14. 07 Dec, 2019 1 commit
    • G
      tcp: fix rejected syncookies due to stale timestamps · 04d26e7b
      Guillaume Nault committed
      If no synflood happens for a long enough period of time, then the
      synflood timestamp isn't refreshed and jiffies can advance so much
      that time_after32() can't accurately compare them any more.
      
      Therefore, we can end up in a situation where time_after32(now,
      last_overflow + HZ) returns false, just because these two values are
      too far apart. In that case, the synflood timestamp isn't updated as
      it should be, which can trick tcp_synq_no_recent_overflow() into
      rejecting valid syncookies.
      
      For example, let's consider the following scenario on a system
      with HZ=1000:
      
        * The synflood timestamp is 0, either because that's the timestamp
          of the last synflood or, more commonly, because we're working with
          a freshly created socket.
      
        * We receive a new SYN, which triggers synflood protection. Let's say
          that this happens when jiffies == 2147484649 (that is,
          'synflood timestamp' + HZ + 2^31 + 1).
      
        * Then tcp_synq_overflow() doesn't update the synflood timestamp,
          because time_after32(2147484649, 1000) returns false.
          With:
            - 2147484649: the value of jiffies, aka. 'now'.
            - 1000: the value of 'last_overflow' + HZ.
      
        * A bit later, we receive the ACK completing the 3WHS. But
          cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow()
          says that we're not under synflood. That's because
          time_after32(2147484649, 120000) returns true.
          With:
            - 2147484649: the value of jiffies, aka. 'now'.
            - 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID.
      
          Of course, in reality jiffies would have increased a bit, but this
          condition will last for the next 119 seconds, which is long enough
          to accommodate the growth of jiffies.
      
      Fix this by updating the overflow timestamp whenever jiffies isn't
      within the [last_overflow, last_overflow + HZ] range. That shouldn't
      have any performance impact since the update still happens at most once
      per second.
      
      Now we're guaranteed to have fresh timestamps while under synflood, so
      tcp_synq_no_recent_overflow() can safely use it with time_after32() in
      such situations.
      
      Stale timestamps can still make tcp_synq_no_recent_overflow() return
      the wrong verdict when not under synflood. This will be handled in the
      next patch.
      
      For 64-bit architectures, the problem was introduced with the
      conversion of ->tw_ts_recent_stamp to a 32-bit integer by commit
      cca9bab1 ("tcp: use monotonic timestamps for PAWS").
      The problem has always been there on 32-bit architectures.
      
      Fixes: cca9bab1 ("tcp: use monotonic timestamps for PAWS")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04d26e7b
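The wraparound arithmetic in the scenario above can be checked with a small standalone sketch. The two comparison helpers follow the usual signed-difference and unsigned-range idioms for 32-bit tick counters; `should_refresh_old()` and `should_refresh_new()` are hypothetical names for the update rule before and after this fix, and HZ=1000 is taken from the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define HZ 1000u
#define TCP_SYNCOOKIE_VALID (120u * HZ)  /* 120000 ticks at HZ=1000 */

/* True if a is after b, judged by the sign of the 32-bit difference.
 * Only meaningful while a and b are less than 2^31 ticks apart. */
static bool time_after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* True if t lies in [l, h], using unsigned wraparound arithmetic. */
static bool time_between32(uint32_t t, uint32_t l, uint32_t h)
{
	return (uint32_t)(h - l) >= (uint32_t)(t - l);
}

/* Old rule: refresh only if now is "after" last_overflow + HZ. */
static bool should_refresh_old(uint32_t now, uint32_t last_overflow)
{
	return time_after32(now, last_overflow + HZ);
}

/* New rule: refresh whenever now is outside [last_overflow, last_overflow + HZ]. */
static bool should_refresh_new(uint32_t now, uint32_t last_overflow)
{
	return !time_between32(now, last_overflow, last_overflow + HZ);
}
```

With `now = 2147484649` and a timestamp of 0, `should_refresh_old()` is false (the stale timestamp is kept) while `should_refresh_new()` is true. Under the same definition, `time_after32(2147484649, TCP_SYNCOOKIE_VALID)` evaluates to true, which is the "no recent overflow" verdict that causes the cookie to be rejected.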
  15. 06 Dec, 2019 1 commit
  16. 05 Dec, 2019 5 commits
    • M
      mm: remove __ARCH_HAS_4LEVEL_HACK and include/asm-generic/4level-fixup.h · f949286c
      Mike Rapoport committed
      There are no architectures that use include/asm-generic/4level-fixup.h,
      so it can be removed along with the __ARCH_HAS_4LEVEL_HACK define.
      
      Link: http://lkml.kernel.org/r/1572938135-31886-14-git-send-email-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Anatoly Pugachev <matorola@gmail.com>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Peter Rosin <peda@axentia.se>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Sam Creasey <sammy@sammy.net>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f949286c
    • A
      lib/bitmap: introduce bitmap_replace() helper · 30544ed5
      Andy Shevchenko committed
      In some drivers we want a single operation over a bitmap that is
      equivalent to:
      
      	*dst = (*old & ~(*mask)) | (*new & *mask)
      
      Introduce bitmap_replace() helper for this.
      
      Link: http://lkml.kernel.org/r/20191022172922.61232-8-andriy.shevchenko@linux.intel.com
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Marek Vasut <marek.vasut+renesas@gmail.com>
      Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
      Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      30544ed5
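The expression in this commit message extends naturally from a single word to a multi-word bitmap. Below is a userspace sketch of that idea, assuming bitmaps stored as arrays of `unsigned long`; the name `bitmap_replace_sketch` and the local `BITS_TO_LONGS_SKETCH` macro are illustrative and are not the kernel helper itself.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS_SKETCH(n) \
	(((n) + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

/* For every bit position: take the bit from new_bits where mask is set,
 * otherwise keep the bit from old_bits. Operates on whole words, so bits
 * past nbits in the last word are processed as well. */
static void bitmap_replace_sketch(unsigned long *dst,
				  const unsigned long *old_bits,
				  const unsigned long *new_bits,
				  const unsigned long *mask,
				  size_t nbits)
{
	size_t k, nr = BITS_TO_LONGS_SKETCH(nbits);

	for (k = 0; k < nr; k++)
		dst[k] = (old_bits[k] & ~mask[k]) | (new_bits[k] & mask[k]);
}
```

For example, with old bits `0xF0`, new bits `0x0F` and mask `0x3C` over 8 bits, the destination word becomes `0xCC`: bits 2 to 5 come from the new value and the rest from the old one.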
    • A
      kcov: remote coverage support · eec028c9
      Andrey Konovalov committed
      Patch series "kcov: collect coverage from usb and vhost", v3.
      
      This patchset extends kcov to allow collecting coverage from background
      kernel threads.  This extension requires custom annotations for each of
      the places where coverage collection is desired.  This patchset
      implements this for hub events in the USB subsystem and for vhost
      workers.  See the first patch description for details about the kcov
      extension.  The other two patches apply this kcov extension to USB and
      vhost.
      
      Examples of other subsystems that might potentially benefit from this
      when custom annotations are added (the list is based on
      process_one_work() callers for bugs recently reported by syzbot):
      
      1. fs: writeback wb_workfn() worker,
      2. net: addrconf_dad_work()/addrconf_verify_work() workers,
      3. net: neigh_periodic_work() worker,
      4. net/p9: p9_write_work()/p9_read_work() workers,
      5. block: blk_mq_run_work_fn() worker.
      
      These patches have been used to enable coverage-guided USB fuzzing with
      syzkaller for the last few years, see the details here:
      
        https://github.com/google/syzkaller/blob/master/docs/linux/external_fuzzing_usb.md
      
      This patchset has been pushed to the public Linux kernel Gerrit
      instance:
      
        https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/1524
      
      This patch (of 3):
      
      Add background thread coverage collection ability to kcov.
      
      With KCOV_ENABLE coverage is collected only for syscalls that are issued
      from the current process.  With KCOV_REMOTE_ENABLE it's possible to
      collect coverage for arbitrary parts of the kernel code, provided that
      those parts are annotated with kcov_remote_start()/kcov_remote_stop().
      
      This allows collecting coverage from two types of kernel background
      threads: the global ones, that are spawned during kernel boot in a
      limited number of instances (e.g.  one USB hub_event() worker thread is
      spawned per USB HCD); and the local ones, that are spawned when a user
      interacts with some kernel interface (e.g.  vhost workers).
      
      To enable collecting coverage from a global background thread, a unique
      global handle must be assigned and passed to the corresponding
      kcov_remote_start() call.  Then a userspace process can pass a list of
      such handles to the KCOV_REMOTE_ENABLE ioctl in the handles array field
      of the kcov_remote_arg struct.  This will attach the used kcov device to
      the code sections, that are referenced by those handles.
      
      Since there might be many local background threads spawned from
      different userspace processes, we can't use a single global handle per
      annotation.  Instead, the userspace process passes a non-zero handle
      through the common_handle field of the kcov_remote_arg struct.  This
      common handle gets saved to the kcov_handle field in the current
      task_struct and needs to be passed to the newly spawned threads via
      custom annotations.  Those threads should in turn be annotated with
      kcov_remote_start()/kcov_remote_stop().
      
      Internally kcov stores handles as u64 integers.  The top byte of a
      handle is used to denote the id of a subsystem that this handle belongs
      to, and the lower 4 bytes are used to denote the id of a thread instance
      within that subsystem.  A reserved value 0 is used as a subsystem id for
      common handles as they don't belong to a particular subsystem.  The
      bytes 4-7 are currently reserved and must be zero.  In the future the
      number of bytes used for the subsystem or handle ids might be increased.
      
      When a particular userspace process collects coverage via a common
      handle, kcov will collect coverage for each code section that is
      annotated to use the common handle obtained as kcov_handle from the
      current task_struct.  However, non-common handles allow collecting
      coverage selectively from different subsystems.
      
      Link: http://lkml.kernel.org/r/e90e315426a384207edbec1d6aa89e43008e4caf.1572366574.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: David Windsor <dwindsor@gmail.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eec028c9
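The handle layout described in this commit (subsystem id in the top byte, instance id in the lower 4 bytes, remaining bits zero) can be sketched as plain bit manipulation. The macro and function names below are hypothetical and chosen for this illustration; only the layout itself is taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SUBSYSTEM_SHIFT 56
#define SUBSYSTEM_MASK  (0xffULL << SUBSYSTEM_SHIFT)  /* top byte */
#define INSTANCE_MASK   0xffffffffULL                 /* lower 4 bytes */

/* Combine a pre-shifted subsystem id with an instance id.
 * Returns 0 if either argument has bits outside its field. */
static uint64_t make_handle(uint64_t subsys, uint64_t inst)
{
	if ((subsys & ~SUBSYSTEM_MASK) || (inst & ~INSTANCE_MASK))
		return 0;
	return subsys | inst;
}

static unsigned int handle_subsystem(uint64_t handle)
{
	return (unsigned int)((handle & SUBSYSTEM_MASK) >> SUBSYSTEM_SHIFT);
}

static uint32_t handle_instance(uint64_t handle)
{
	return (uint32_t)(handle & INSTANCE_MASK);
}

/* The bits between the two fields are reserved and must be zero. */
static bool handle_reserved_clear(uint64_t handle)
{
	return (handle & ~(SUBSYSTEM_MASK | INSTANCE_MASK)) == 0;
}
```

A handle for instance 7 of a subsystem with id 1 would be `make_handle(1ULL << 56, 7)`, that is `0x0100000000000007`; a common handle uses subsystem id 0, so it is just the non-zero instance value.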
    • K
      uaccess: disallow > INT_MAX copy sizes · 6d13de14
      Kees Cook committed
      As we've done with VFS, string operations, etc, reject usercopy sizes
      larger than INT_MAX, which would be nice to have for catching bugs
      related to size calculation overflows[1].
      
      This adds 10 bytes to x86_64 defconfig text and 1980 bytes to the data
      section:
      
            text       data       bss       dec      hex  filename
        19691167    5134320   1646664  26472151  193eed7  vmlinux.before
        19691177    5136300   1646664  26474141  193f69d  vmlinux.after
      
      [1] https://marc.info/?l=linux-s390&m=156631939010493&w=2
      
      Link: http://lkml.kernel.org/r/201908251612.F9902D7A@keescook
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6d13de14
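Why an INT_MAX ceiling catches size-calculation bugs can be shown with a short sketch. The helper names `copy_size_ok` and `payload_len` are hypothetical; the example assumes the common bug pattern in which an unsigned subtraction underflows and produces a huge length.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Reject any copy length above INT_MAX; legitimate copies are far smaller. */
static bool copy_size_ok(size_t bytes)
{
	return bytes <= (size_t)INT_MAX;
}

/* Buggy-by-design length calculation: if header > total, the unsigned
 * subtraction wraps around to a value close to SIZE_MAX. */
static size_t payload_len(size_t total, size_t header)
{
	return total - header;
}
```

Here `copy_size_ok(payload_len(4, 16))` is false because the wrapped result far exceeds INT_MAX, whereas an ordinary request such as `copy_size_ok(4096)` passes.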
    • H
      lib/genalloc.c: rename addr_in_gen_pool to gen_pool_has_addr · 964975ac
      Huang Shijie committed
      Following the kernel naming conventions, rename addr_in_gen_pool to
      gen_pool_has_addr.
      
      [sjhuang@iluvatar.ai: fix Documentation/ too]
      Link: http://lkml.kernel.org/r/20181229015914.5573-1-sjhuang@iluvatar.ai
      Link: http://lkml.kernel.org/r/20181228083950.20398-1-sjhuang@iluvatar.ai
      Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      964975ac