1. 03 3月, 2017 1 次提交
    • I
      sched/headers: Move task->mm handling methods to <linux/sched/mm.h> · 68e21be2
      Ingo Molnar 提交于
      Move the following task->mm helper APIs into a new header file,
      <linux/sched/mm.h>, to further reduce the size and complexity
      of <linux/sched.h>.
      
      Here are how the APIs are used in various kernel files:
      
        # mm_alloc():
        arch/arm/mach-rpc/ecard.c
        fs/exec.c
        include/linux/sched/mm.h
        kernel/fork.c
      
        # __mmdrop():
        arch/arc/include/asm/mmu_context.h
        include/linux/sched/mm.h
        kernel/fork.c
      
        # mmdrop():
        arch/arm/mach-rpc/ecard.c
        arch/m68k/sun3/mmu_emu.c
        arch/x86/mm/tlb.c
        drivers/gpu/drm/amd/amdkfd/kfd_process.c
        drivers/gpu/drm/i915/i915_gem_userptr.c
        drivers/infiniband/hw/hfi1/file_ops.c
        drivers/vfio/vfio_iommu_spapr_tce.c
        fs/exec.c
        fs/proc/base.c
        fs/proc/task_mmu.c
        fs/proc/task_nommu.c
        fs/userfaultfd.c
        include/linux/mmu_notifier.h
        include/linux/sched/mm.h
        kernel/fork.c
        kernel/futex.c
        kernel/sched/core.c
        mm/khugepaged.c
        mm/ksm.c
        mm/mmu_context.c
        mm/mmu_notifier.c
        mm/oom_kill.c
        virt/kvm/kvm_main.c
      
        # mmdrop_async_fn():
        include/linux/sched/mm.h
      
        # mmdrop_async():
        include/linux/sched/mm.h
        kernel/fork.c
      
        # mmget_not_zero():
        fs/userfaultfd.c
        include/linux/sched/mm.h
        mm/oom_kill.c
      
        # mmput():
        arch/arc/include/asm/mmu_context.h
        arch/arc/kernel/troubleshoot.c
        arch/frv/mm/mmu-context.c
        arch/powerpc/platforms/cell/spufs/context.c
        arch/sparc/include/asm/mmu_context_32.h
        drivers/android/binder.c
        drivers/gpu/drm/etnaviv/etnaviv_gem.c
        drivers/gpu/drm/i915/i915_gem_userptr.c
        drivers/infiniband/core/umem.c
        drivers/infiniband/core/umem_odp.c
        drivers/infiniband/core/uverbs_main.c
        drivers/infiniband/hw/mlx4/main.c
        drivers/infiniband/hw/mlx5/main.c
        drivers/infiniband/hw/usnic/usnic_uiom.c
        drivers/iommu/amd_iommu_v2.c
        drivers/iommu/intel-svm.c
        drivers/lguest/lguest_user.c
        drivers/misc/cxl/fault.c
        drivers/misc/mic/scif/scif_rma.c
        drivers/oprofile/buffer_sync.c
        drivers/vfio/vfio_iommu_type1.c
        drivers/vhost/vhost.c
        drivers/xen/gntdev.c
        fs/exec.c
        fs/proc/array.c
        fs/proc/base.c
        fs/proc/task_mmu.c
        fs/proc/task_nommu.c
        fs/userfaultfd.c
        include/linux/sched/mm.h
        kernel/cpuset.c
        kernel/events/core.c
        kernel/events/uprobes.c
        kernel/exit.c
        kernel/fork.c
        kernel/ptrace.c
        kernel/sys.c
        kernel/trace/trace_output.c
        kernel/tsacct.c
        mm/memcontrol.c
        mm/memory.c
        mm/mempolicy.c
        mm/migrate.c
        mm/mmu_notifier.c
        mm/nommu.c
        mm/oom_kill.c
        mm/process_vm_access.c
        mm/rmap.c
        mm/swapfile.c
        mm/util.c
        virt/kvm/async_pf.c
      
        # mmput_async():
        include/linux/sched/mm.h
        kernel/fork.c
        mm/oom_kill.c
      
        # get_task_mm():
        arch/arc/kernel/troubleshoot.c
        arch/powerpc/platforms/cell/spufs/context.c
        drivers/android/binder.c
        drivers/gpu/drm/etnaviv/etnaviv_gem.c
        drivers/infiniband/core/umem.c
        drivers/infiniband/core/umem_odp.c
        drivers/infiniband/hw/mlx4/main.c
        drivers/infiniband/hw/mlx5/main.c
        drivers/infiniband/hw/usnic/usnic_uiom.c
        drivers/iommu/amd_iommu_v2.c
        drivers/iommu/intel-svm.c
        drivers/lguest/lguest_user.c
        drivers/misc/cxl/fault.c
        drivers/misc/mic/scif/scif_rma.c
        drivers/oprofile/buffer_sync.c
        drivers/vfio/vfio_iommu_type1.c
        drivers/vhost/vhost.c
        drivers/xen/gntdev.c
        fs/proc/array.c
        fs/proc/base.c
        fs/proc/task_mmu.c
        include/linux/sched/mm.h
        kernel/cpuset.c
        kernel/events/core.c
        kernel/exit.c
        kernel/fork.c
        kernel/ptrace.c
        kernel/sys.c
        kernel/trace/trace_output.c
        kernel/tsacct.c
        mm/memcontrol.c
        mm/memory.c
        mm/mempolicy.c
        mm/migrate.c
        mm/mmu_notifier.c
        mm/nommu.c
        mm/util.c
      
        # mm_access():
        fs/proc/base.c
        include/linux/sched/mm.h
        kernel/fork.c
        mm/process_vm_access.c
      
        # mm_release():
        arch/arc/include/asm/mmu_context.h
        fs/exec.c
        include/linux/sched/mm.h
        include/uapi/linux/sched.h
        kernel/exit.c
        kernel/fork.c
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      68e21be2
  2. 02 3月, 2017 10 次提交
  3. 28 2月, 2017 3 次提交
  4. 25 2月, 2017 1 次提交
  5. 24 2月, 2017 3 次提交
    • M
      arm64/cpufeature: check correct field width when updating sys_val · 638f863d
      Mark Rutland 提交于
      When we're updating a register's sys_val, we use arm64_ftr_value() to
      find the new field value. We use cpuid_feature_extract_field() to find
      the new value, but this implicitly assumes a 4-bit field, so we may
      extract more bits than we mean to for fields like CTR_EL0.L1ip.
      
      This affects update_cpu_ftr_reg(), where we may extract erroneous values
      for ftr_cur and ftr_new. Depending on the additional bits extracted in
      either case, we may erroneously detect that the value is mismatched, and
      we'll try to compute a new safe value.
      
      Dependent on these extra bits and feature type, arm64_ftr_safe_value()
      may pessimistically select the always-safe value, or may erroneously
      choose either the extracted cur or new value as the safe option. The
      extra bits will subsequently be masked out in arm64_ftr_set_value(), so
      we may choose a higher value, yet write back a lower one.
      
      Fix this by passing the width down explicitly in arm64_ftr_value(), so
      we always extract the correct amount.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      638f863d
    • M
      Revert "arm64: mm: set the contiguous bit for kernel mappings where appropriate" · d81bbe6d
      Mark Rutland 提交于
      This reverts commit 0bfc445d.
      
      When we change the permissions of regions mapped using contiguous
      entries, the architecture requires us to follow a Break-Before-Make
      strategy, breaking *all* associated entries before we can change any of
      the following properties from the entries:
      
       - presence of the contiguous bit
       - output address
       - attributes
       - permissiones
      
      Failure to do so can result in a number of problems (e.g. TLB conflict
      aborts and/or erroneous results from TLB lookups).
      
      See ARM DDI 0487A.k_iss10775, "Misprogramming of the Contiguous bit",
      page D4-1762.
      
      We do not take this into account when altering the permissions of kernel
      segments in mark_rodata_ro(), where we change the permissions of live
      contiguous entires one-by-one, leaving them transiently inconsistent.
      This has been observed to result in failures on some fast model
      configurations.
      
      Unfortunately, we cannot follow Break-Before-Make here as we'd have to
      unmap kernel text and data used to perform the sequence.
      
      For the timebeing, revert commit 0bfc445d so as to avoid issues
      resulting from this misuse of the contiguous bit.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reported-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <Will.Deacon@arm.com>
      Cc: stable@vger.kernel.org # v4.10
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d81bbe6d
    • S
      arm64: Avoid clobbering mm in erratum workaround on QDF2400 · ea6eac90
      Shanker Donthineni 提交于
      Commit 38fd94b0 ("arm64: Work around Falkor erratum 1003") tried to
      work around a hardware erratum, but actually caused a system crash of
      its own during switch_mm:
      
       cpu_do_switch_mm+0x20/0x40
       efi_virtmap_load+0x34/0x40
       virt_efi_get_next_variable+0x64/0xc8
       efivar_init+0x8c/0x348
       efisubsys_init+0xd4/0x270
       do_one_initcall+0x80/0x110
       kernel_init_freeable+0x19c/0x240
       kernel_init+0x10/0x100
       ret_from_fork+0x10/0x50
      
       Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
      
      In cpu_do_switch_mm, x1 contains the mm_struct pointer, which needs to
      be preserved by the pre_ttbr0_update_workaround macro rather than passed
      as a temporary.
      
      This patch clobbers x2 and x3 instead, keeping the mm_struct intact
      after the workaround has run.
      
      Fixes: 38fd94b0 ("arm64: Work around Falkor erratum 1003")
      Tested-by: NManoj Iyer <manoj.iyer@canonical.com>
      Signed-off-by: NShanker Donthineni <shankerd@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      ea6eac90
  6. 22 2月, 2017 3 次提交
  7. 18 2月, 2017 2 次提交
    • D
      bpf: make jited programs visible in traces · 74451e66
      Daniel Borkmann 提交于
      Long standing issue with JITed programs is that stack traces from
      function tracing check whether a given address is kernel code
      through {__,}kernel_text_address(), which checks for code in core
      kernel, modules and dynamically allocated ftrace trampolines. But
      what is still missing is BPF JITed programs (interpreted programs
      are not an issue as __bpf_prog_run() will be attributed to them),
      thus when a stack trace is triggered, the code walking the stack
      won't see any of the JITed ones. The same for address correlation
      done from user space via reading /proc/kallsyms. This is read by
      tools like perf, but the latter is also useful for permanent live
      tracing with eBPF itself in combination with stack maps when other
      eBPF types are part of the callchain. See offwaketime example on
      dumping stack from a map.
      
      This work tries to tackle that issue by making the addresses and
      symbols known to the kernel. The lookup from *kernel_text_address()
      is implemented through a latched RB tree that can be read under
      RCU in fast-path that is also shared for symbol/size/offset lookup
      for a specific given address in kallsyms. The slow-path iteration
      through all symbols in the seq file done via RCU list, which holds
      a tiny fraction of all exported ksyms, usually below 0.1 percent.
      Function symbols are exported as bpf_prog_<tag>, in order to aide
      debugging and attribution. This facility is currently enabled for
      root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
      is active in any mode. The rationale behind this is that still a lot
      of systems ship with world read permissions on kallsyms thus addresses
      should not get suddenly exposed for them. If that situation gets
      much better in future, we always have the option to change the
      default on this. Likewise, unprivileged programs are not allowed
      to add entries there either, but that is less of a concern as most
      such programs types relevant in this context are for root-only anyway.
      If enabled, call graphs and stack traces will then show a correct
      attribution; one example is illustrated below, where the trace is
      now visible in tooling such as perf script --kallsyms=/proc/kallsyms
      and friends.
      
      Before:
      
        7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)
      
      After:
      
        7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
        [...]
        7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
        7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
               f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      74451e66
    • D
      bpf: remove stubs for cBPF from arch code · 9383191d
      Daniel Borkmann 提交于
      Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make
      that a single __weak function in the core that can be overridden
      similarly to the eBPF one. Also remove stale pr_err() mentions
      of bpf_jit_compile.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9383191d
  8. 15 2月, 2017 7 次提交
  9. 11 2月, 2017 2 次提交
  10. 10 2月, 2017 2 次提交
    • C
      arm64: Work around Falkor erratum 1003 · 38fd94b0
      Christopher Covington 提交于
      The Qualcomm Datacenter Technologies Falkor v1 CPU may allocate TLB entries
      using an incorrect ASID when TTBRx_EL1 is being updated. When the erratum
      is triggered, page table entries using the new translation table base
      address (BADDR) will be allocated into the TLB using the old ASID. All
      circumstances leading to the incorrect ASID being cached in the TLB arise
      when software writes TTBRx_EL1[ASID] and TTBRx_EL1[BADDR], a memory
      operation is in the process of performing a translation using the specific
      TTBRx_EL1 being written, and the memory operation uses a translation table
      descriptor designated as non-global. EL2 and EL3 code changing the EL1&0
      ASID is not subject to this erratum because hardware is prohibited from
      performing translations from an out-of-context translation regime.
      
      Consider the following pseudo code.
      
        write new BADDR and ASID values to TTBRx_EL1
      
      Replacing the above sequence with the one below will ensure that no TLB
      entries with an incorrect ASID are used by software.
      
        write reserved value to TTBRx_EL1[ASID]
        ISB
        write new value to TTBRx_EL1[BADDR]
        ISB
        write new value to TTBRx_EL1[ASID]
        ISB
      
      When the above sequence is used, page table entries using the new BADDR
      value may still be incorrectly allocated into the TLB using the reserved
      ASID. Yet this will not reduce functionality, since TLB entries incorrectly
      tagged with the reserved ASID will never be hit by a later instruction.
      
      Based on work by Shanker Donthineni <shankerd@codeaurora.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NChristopher Covington <cov@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      38fd94b0
    • W
      arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2 · 2bf47e19
      Will Deacon 提交于
      The SPE architecture requires each exception level to enable access
      to the SPE controls for the exception level below it, since additional
      context-switch logic may be required to handle the buffer safely.
      
      This patch allows EL1 (host) access to the SPE controls when entered at
      EL2.
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2bf47e19
  11. 09 2月, 2017 5 次提交
  12. 08 2月, 2017 1 次提交