1. 14 12月, 2017 1 次提交
  2. 06 12月, 2017 2 次提交
    • M
      powerpc/xmon: Don't print hashed pointers in xmon · d8104182
      Michael Ellerman 提交于
      Since commit ad67b74d ("printk: hash addresses printed with %p")
      pointers printed with %p are hashed, ie. you don't see the actual
      pointer value but rather a cryptographic hash of its value.
      
      In xmon we want to see the actual pointer values, because xmon is a
      debugger, so replace %p with %px which prints the actual pointer
      value.
      
      We justify doing this in xmon because 1) xmon is a kernel crash
      debugger, it's only accessible via the console 2) xmon doesn't print
      to dmesg, so the pointers it prints are not able to be leaked that
      way.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d8104182
    • N
      powerpc/64s: Initialize ISAv3 MMU registers before setting partition table · 371b8044
      Nicholas Piggin 提交于
      kexec can leave MMU registers set when booting into a new kernel,
      the PIDR (Process Identification Register) in particular. The boot
      sequence does not zero PIDR, so it only gets set when CPUs first
      switch to a userspace processes (until then it's running a kernel
      thread with effective PID = 0).
      
      This leaves a window where a process table entry and page tables are
      set up due to user processes running on other CPUs, that happen to
      match with a stale PID. The CPU with that PID may cause speculative
      accesses that address quadrant 0 (aka userspace addresses), which will
      result in cached translations and PWC (Page Walk Cache) for that
      process, on a CPU which is not in the mm_cpumask and so they will not
      be invalidated properly.
      
      The most common result is the kernel hanging in infinite page fault
      loops soon after kexec (usually in schedule_tail, which is usually the
      first non-speculative quadrant 0 access to a new PID) due to a stale
      PWC. However being a stale translation error, it could result in
      anything up to security and data corruption problems.
      
      Fix this by zeroing out PIDR at boot and kexec.
      
      Fixes: 7e381c0f ("powerpc/mm/radix: Add mmu context handling callback for radix")
      Cc: stable@vger.kernel.org # v4.7+
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      371b8044
  3. 05 12月, 2017 2 次提交
    • H
      bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type · c895f6f7
      Hendrik Brueckner 提交于
      Commit 0515e599 ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT
      program type") introduced the bpf_perf_event_data structure which
      exports the pt_regs structure.  This is OK for multiple architectures
      but fail for s390 and arm64 which do not export pt_regs.  Programs
      using them, for example, the bpf selftest fail to compile on these
      architectures.
      
      For s390, exporting the pt_regs is not an option because s390 wants
      to allow changes to it.  For arm64, there is a user_pt_regs structure
      that covers parts of the pt_regs structure for use by user space.
      
      To solve the broken uapi for s390 and arm64, introduce an abstract
      type for pt_regs and add an asm/bpf_perf_event.h file that concretes
      the type.  An asm-generic header file covers the architectures that
      export pt_regs today.
      
      The arch-specific enablement for s390 and arm64 follows in separate
      commits.
      Reported-by: NThomas Richter <tmricht@linux.vnet.ibm.com>
      Fixes: 0515e599 ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type")
      Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-and-tested-by: NThomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      c895f6f7
    • D
      Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier" · ab9dbf77
      David Gibson 提交于
      This reverts commit a3b2cb30.
      
      That commit tried to fix problems with panic on powerpc in certain
      circumstances, where some output from the generic panic code was being
      dropped.
      
      Unfortunately, it breaks things worse in other circumstances. In
      particular when running a PAPR guest, it will now attempt to reboot
      instead of informing the hypervisor (KVM or PowerVM) that the guest
      has crashed. The crash notification is important to some
      virtualization management layers.
      
      Revert it for now until we can come up with a better solution.
      
      Fixes: a3b2cb30 ("powerpc: Do not call ppc_md.panic in fadump panic notifier")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      [mpe: Tweak change log a bit]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ab9dbf77
  4. 04 12月, 2017 1 次提交
    • R
      powerpc/perf: Fix oops when grouping different pmu events · 5aa04b3e
      Ravi Bangoria 提交于
      When user tries to group imc (In-Memory Collections) event with
      normal event, (sometime) kernel crashes with following log:
      
          Faulting instruction address: 0x00000000
          [link register   ] c00000000010ce88 power_check_constraints+0x128/0x980
          ...
          c00000000010e238 power_pmu_event_init+0x268/0x6f0
          c0000000002dc60c perf_try_init_event+0xdc/0x1a0
          c0000000002dce88 perf_event_alloc+0x7b8/0xac0
          c0000000002e92e0 SyS_perf_event_open+0x530/0xda0
          c00000000000b004 system_call+0x38/0xe0
      
      'event_base' field of 'struct hw_perf_event' is used as flags for
      normal hw events and used as memory address for imc events. While
      grouping these two types of events, collect_events() tries to
      interpret imc 'event_base' as a flag, which causes a corruption
      resulting in a crash.
      
      Consider only those events which belongs to 'perf_hw_context' in
      collect_events().
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-By: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      5aa04b3e
  5. 30 11月, 2017 1 次提交
  6. 29 11月, 2017 2 次提交
    • V
      powerpc: Do not assign thread.tidr if already assigned · 7e4d4233
      Vaibhav Jain 提交于
      If set_thread_tidr() is called twice for same task_struct then it will
      allocate a new tidr value to it leaving the previous value still
      dangling in the vas_thread_ida table.
      
      To fix this the patch changes set_thread_tidr() to check if a tidr
      value is already assigned to the task_struct and if yes then returns
      zero.
      
      Fixes: ec233ede("powerpc: Add support for setting SPRN_TIDR")
      Signed-off-by: NVaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      [mpe: Modify to return 0 in the success case, not the TID value]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7e4d4233
    • V
      powerpc: Avoid signed to unsigned conversion in set_thread_tidr() · aca7573f
      Vaibhav Jain 提交于
      There is an unsafe signed to unsigned conversion in set_thread_tidr()
      that may cause an error value to be assigned to SPRN_TIDR register and
      used as thread-id.
      
      The issue happens as assign_thread_tidr() returns an int and
      thread.tidr is an unsigned-long. So a negative error code returned
      from assign_thread_tidr() will fail the error check and gets assigned
      as tidr as a large positive value.
      
      To fix this the patch assigns the return value of assign_thread_tidr()
      to a temporary int and assigns it to thread.tidr iff its '> 0'.
      
      The patch shouldn't impact the calling convention of set_thread_tidr()
      i.e all -ve return-values are error codes and a return value of '0'
      indicates success.
      
      Fixes: ec233ede("powerpc: Add support for setting SPRN_TIDR")
      Signed-off-by: NVaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Reviewed-by: Christophe Lombard clombard@linux.vnet.ibm.com
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      aca7573f
  7. 28 11月, 2017 1 次提交
    • J
      KVM: Let KVM_SET_SIGNAL_MASK work as advertised · 20b7035c
      Jan H. Schönherr 提交于
      KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
      "any unblocked signal received [...] will cause KVM_RUN to return with
      -EINTR" and that "the signal will only be delivered if not blocked by
      the original signal mask".
      
      This, however, is only true, when the calling task has a signal handler
      registered for a signal. If not, signal evaluation is short-circuited for
      SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
      returning or the whole process is terminated.
      
      Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
      to that in do_sigtimedwait() to avoid short-circuiting of signals.
      Signed-off-by: NJan H. Schönherr <jschoenh@amazon.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      20b7035c
  8. 24 11月, 2017 1 次提交
  9. 23 11月, 2017 2 次提交
    • M
      powerpc/powernv: Fix kexec crashes caused by tlbie tracing · a3961f82
      Mahesh Salgaonkar 提交于
      Rebooting into a new kernel with kexec fails in trace_tlbie() which is
      called from native_hpte_clear(). This happens if the running kernel
      has CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints
      always execute few RCU checks regardless of whether tracing is on or
      off. We are already in the last phase of kexec sequence in real mode
      with HILE_BE set. At this point the RCU check ends up in
      RCU_LOCKDEP_WARN and causes kexec to fail.
      
      Fix this by not calling trace_tlbie() from native_hpte_clear().
      
      mpe: It's not safe to call trace points at this point in the kexec
      path, even if we could avoid the RCU checks/warnings. The only
      solution is to not call them.
      
      Fixes: 0428491c ("powerpc/mm: Trace tlbie(l) instructions")
      Cc: stable@vger.kernel.org # v4.13+
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Reported-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Suggested-by: NMichael Ellerman <mpe@ellerman.id.au>
      Acked-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      a3961f82
    • P
      KVM: PPC: Book3S HV: Fix migration and HPT resizing of HPT guests on radix hosts · ded13fc1
      Paul Mackerras 提交于
      This fixes two errors that prevent a guest using the HPT MMU from
      successfully migrating to a POWER9 host in radix MMU mode, or resizing
      its HPT when running on a radix host.
      
      The first bug was that commit 8dc6cca5 ("KVM: PPC: Book3S HV:
      Don't rely on host's page size information", 2017-09-11) missed two
      uses of hpte_base_page_size(), one in the HPT rehashing code and
      one in kvm_htab_write() (which is used on the destination side in
      migrating a HPT guest).  Instead we use kvmppc_hpte_base_page_shift().
      Having the shift count means that we can use left and right shifts
      instead of multiplication and division in a few places.
      
      Along the way, this adds a check in kvm_htab_write() to ensure that the
      page size encoding in the incoming HPTEs is recognized, and if not
      return an EINVAL error to userspace.
      
      The second bug was that kvm_htab_write was performing some but not all
      of the functions of kvmhv_setup_mmu(), resulting in the destination VM
      being left in radix mode as far as the hardware is concerned.  The
      simplest fix for now is make kvm_htab_write() call
      kvmppc_setup_partition_table() like kvmppc_hv_setup_htab_rma() does.
      In future it would be better to refactor the code more extensively
      to remove the duplication.
      
      Fixes: 8dc6cca5 ("KVM: PPC: Book3S HV: Don't rely on host's page size information")
      Fixes: 7a84084c ("KVM: PPC: Book3S HV: Set partition table rather than SDR1 on POWER9")
      Reported-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com>
      Tested-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      ded13fc1
  10. 22 11月, 2017 7 次提交
    • M
      powerpc/64s: Fix Power9 DD2.1 logic in DT CPU features · 4d6c51b1
      Michael Ellerman 提交于
      I got the logic wrong in the DT CPU features code when I added the
      Power9 DD2.1 feature. We should be setting the bit if we detect a
      DD2.1, not clearing it if we detect a DD2.0.
      
      This code isn't actually exercised at the moment so nothing is
      actually broken.
      
      Fixes: 3ffa9d9e ("powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4d6c51b1
    • M
      powerpc/perf: Fix IMC_MAX_PMU macro · 73ce9aec
      Madhavan Srinivasan 提交于
      IMC_MAX_PMU is used for static storage (per_nest_pmu_arr) which holds
      nest pmu information. Current value for the macro is 32 based on
      the initial number of nest pmu units supported by the nest microcode.
      But going forward, microcode could support more nest units. Instead
      of static storage, patch to fix the code to dynamically allocate an
      array based on the number of nest imc units found in the device tree.
      
      Fixes:8f95faaa ('powerpc/powernv: Detect and create IMC device')
      Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      73ce9aec
    • M
      powerpc/perf: Fix pmu_count to count only nest imc pmus · de34787f
      Madhavan Srinivasan 提交于
      "pmu_count" in opal_imc_counters_probe() is intended to hold
      the number of successful nest imc pmu registerations. But
      current code also counts other imc units like core_imc and
      thread_imc. Patch add a check to count only nest imc pmus.
      Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      de34787f
    • C
      powerpc: Fix boot on BOOK3S_32 with CONFIG_STRICT_KERNEL_RWX · 252eb558
      Christophe Leroy 提交于
      On powerpc32, patch_instruction() is called by apply_feature_fixups()
      which is called from early_init()
      
      There is the following note in front of early_init():
       * Note that the kernel may be running at an address which is different
       * from the address that it was linked at, so we must use RELOC/PTRRELOC
       * to access static data (including strings).  -- paulus
      
      Therefore, slab_is_available() cannot be called yet, and
      text_poke_area must be addressed with PTRRELOC()
      
      Fixes: 95902e6c ("powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32")
      Cc: stable@vger.kernel.org # v4.14+
      Reported-by: NMeelis Roos <mroos@linux.ee>
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      252eb558
    • M
      powerpc/perf/imc: Use cpu_to_node() not topology_physical_package_id() · f3f1dfd6
      Michael Ellerman 提交于
      init_imc_pmu() uses topology_physical_package_id() to detect the
      node id of the processor it is on to get local memory, but that's
      wrong, and can lead to crashes. Fix it to use cpu_to_node().
      
      Fixes: 885dcd70 ("powerpc/perf: Add nest IMC PMU support")
      Cc: stable@vger.kernel.org # v4.14+
      Reported-By: NRob Lippert <rlippert@google.com>
      Tested-By: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f3f1dfd6
    • K
      treewide: setup_timer() -> timer_setup() (2 field) · 86cb30ec
      Kees Cook 提交于
      This converts all remaining setup_timer() calls that use a nested field
      to reach a struct timer_list. Coccinelle does not have an easy way to
      match multiple fields, so a new script is needed to change the matches of
      "&_E->_timer" into "&_E->_field1._timer" in all the rules.
      
      spatch --very-quiet --all-includes --include-headers \
      	-I ./arch/x86/include -I ./arch/x86/include/generated \
      	-I ./include -I ./arch/x86/include/uapi \
      	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
      	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
      	--dir . \
      	--cocci-file ~/src/data/timer_setup-2fields.cocci
      
      @fix_address_of depends@
      expression e;
      @@
      
       setup_timer(
      -&(e)
      +&e
       , ...)
      
      // Update any raw setup_timer() usages that have a NULL callback, but
      // would otherwise match change_timer_function_usage, since the latter
      // will update all function assignments done in the face of a NULL
      // function initialization in setup_timer().
      @change_timer_function_usage_NULL@
      expression _E;
      identifier _field1;
      identifier _timer;
      type _cast_data;
      @@
      
      (
      -setup_timer(&_E->_field1._timer, NULL, _E);
      +timer_setup(&_E->_field1._timer, NULL, 0);
      |
      -setup_timer(&_E->_field1._timer, NULL, (_cast_data)_E);
      +timer_setup(&_E->_field1._timer, NULL, 0);
      |
      -setup_timer(&_E._field1._timer, NULL, &_E);
      +timer_setup(&_E._field1._timer, NULL, 0);
      |
      -setup_timer(&_E._field1._timer, NULL, (_cast_data)&_E);
      +timer_setup(&_E._field1._timer, NULL, 0);
      )
      
      @change_timer_function_usage@
      expression _E;
      identifier _field1;
      identifier _timer;
      struct timer_list _stl;
      identifier _callback;
      type _cast_func, _cast_data;
      @@
      
      (
      -setup_timer(&_E->_field1._timer, _callback, _E);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, &_callback, _E);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, _callback, (_cast_data)_E);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, &_callback, (_cast_data)_E);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, (_cast_func)_callback, _E);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, (_cast_func)&_callback, _E);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, (_cast_func)_callback, (_cast_data)_E);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, (_cast_func)&_callback, (_cast_data)_E);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, _callback, (_cast_data)_E);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, _callback, (_cast_data)&_E);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, &_callback, (_cast_data)_E);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, &_callback, (_cast_data)&_E);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, (_cast_func)_callback, (_cast_data)_E);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, (_cast_func)_callback, (_cast_data)&_E);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, (_cast_func)&_callback, (_cast_data)_E);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, (_cast_func)&_callback, (_cast_data)&_E);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
       _E->_field1._timer@_stl.function = _callback;
      |
       _E->_field1._timer@_stl.function = &_callback;
      |
       _E->_field1._timer@_stl.function = (_cast_func)_callback;
      |
       _E->_field1._timer@_stl.function = (_cast_func)&_callback;
      |
       _E._field1._timer@_stl.function = _callback;
      |
       _E._field1._timer@_stl.function = &_callback;
      |
       _E._field1._timer@_stl.function = (_cast_func)_callback;
      |
       _E._field1._timer@_stl.function = (_cast_func)&_callback;
      )
      
      // callback(unsigned long arg)
      @change_callback_handle_cast
       depends on change_timer_function_usage@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._field1;
      identifier change_timer_function_usage._timer;
      type _origtype;
      identifier _origarg;
      type _handletype;
      identifier _handle;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *t
       )
       {
      (
      	... when != _origarg
      	_handletype *_handle =
      -(_handletype *)_origarg;
      +from_timer(_handle, t, _field1._timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle =
      -(void *)_origarg;
      +from_timer(_handle, t, _field1._timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle;
      	... when != _handle
      	_handle =
      -(_handletype *)_origarg;
      +from_timer(_handle, t, _field1._timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle;
      	... when != _handle
      	_handle =
      -(void *)_origarg;
      +from_timer(_handle, t, _field1._timer);
      	... when != _origarg
      )
       }
      
      // callback(unsigned long arg) without existing variable
      @change_callback_handle_cast_no_arg
       depends on change_timer_function_usage &&
                           !change_callback_handle_cast@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._field1;
      identifier change_timer_function_usage._timer;
      type _origtype;
      identifier _origarg;
      type _handletype;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *t
       )
       {
      +	_handletype *_origarg = from_timer(_origarg, t, _field1._timer);
      +
      	... when != _origarg
      -	(_handletype *)_origarg
      +	_origarg
      	... when != _origarg
       }
      
      // Avoid already converted callbacks.
      @match_callback_converted
       depends on change_timer_function_usage &&
                  !change_callback_handle_cast &&
      	    !change_callback_handle_cast_no_arg@
      identifier change_timer_function_usage._callback;
      identifier t;
      @@
      
       void _callback(struct timer_list *t)
       { ... }
      
      // callback(struct something *handle)
      @change_callback_handle_arg
       depends on change_timer_function_usage &&
      	    !match_callback_converted &&
                  !change_callback_handle_cast &&
                  !change_callback_handle_cast_no_arg@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._field1;
      identifier change_timer_function_usage._timer;
      type _handletype;
      identifier _handle;
      @@
      
       void _callback(
      -_handletype *_handle
      +struct timer_list *t
       )
       {
      +	_handletype *_handle = from_timer(_handle, t, _field1._timer);
      	...
       }
      
      // If change_callback_handle_arg ran on an empty function, remove
      // the added handler.
      @unchange_callback_handle_arg
       depends on change_timer_function_usage &&
      	    change_callback_handle_arg@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._field1;
      identifier change_timer_function_usage._timer;
      type _handletype;
      identifier _handle;
      identifier t;
      @@
      
       void _callback(struct timer_list *t)
       {
      -	_handletype *_handle = from_timer(_handle, t, _field1._timer);
       }
      
      // We only want to refactor the setup_timer() data argument if we've found
      // the matching callback. This undoes changes in change_timer_function_usage.
      @unchange_timer_function_usage
       depends on change_timer_function_usage &&
                  !change_callback_handle_cast &&
                  !change_callback_handle_cast_no_arg &&
      	    !change_callback_handle_arg@
      expression change_timer_function_usage._E;
      identifier change_timer_function_usage._field1;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type change_timer_function_usage._cast_data;
      @@
      
      (
      -timer_setup(&_E->_field1._timer, _callback, 0);
      +setup_timer(&_E->_field1._timer, _callback, (_cast_data)_E);
      |
      -timer_setup(&_E._field1._timer, _callback, 0);
      +setup_timer(&_E._field1._timer, _callback, (_cast_data)&_E);
      )
      
      // If we fixed a callback from a .function assignment, fix the
      // assignment cast now.
      @change_timer_function_assignment
       depends on change_timer_function_usage &&
                  (change_callback_handle_cast ||
                   change_callback_handle_cast_no_arg ||
                   change_callback_handle_arg)@
      expression change_timer_function_usage._E;
      identifier change_timer_function_usage._field1;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type _cast_func;
      typedef TIMER_FUNC_TYPE;
      @@
      
      (
       _E->_field1._timer.function =
      -_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_field1._timer.function =
      -&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_field1._timer.function =
      -(_cast_func)_callback;
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_field1._timer.function =
      -(_cast_func)&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._field1._timer.function =
      -_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._field1._timer.function =
      -&_callback;
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._field1._timer.function =
      -(_cast_func)_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._field1._timer.function =
      -(_cast_func)&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      )
      
      // Sometimes timer functions are called directly. Replace matched args.
      @change_timer_function_calls
       depends on change_timer_function_usage &&
                  (change_callback_handle_cast ||
                   change_callback_handle_cast_no_arg ||
                   change_callback_handle_arg)@
      expression _E;
      identifier change_timer_function_usage._field1;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type _cast_data;
      @@
      
       _callback(
      (
      -(_cast_data)_E
      +&_E->_field1._timer
      |
      -(_cast_data)&_E
      +&_E._field1._timer
      |
      -_E
      +&_E->_field1._timer
      )
       )
      
      // If a timer has been configured without a data argument, it can be
      // converted without regard to the callback argument, since it is unused.
      @match_timer_function_unused_data@
      expression _E;
      identifier _field1;
      identifier _timer;
      identifier _callback;
      @@
      
      (
      -setup_timer(&_E->_field1._timer, _callback, 0);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, _callback, 0L);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E->_field1._timer, _callback, 0UL);
      +timer_setup(&_E->_field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, _callback, 0);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, _callback, 0L);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_E._field1._timer, _callback, 0UL);
      +timer_setup(&_E._field1._timer, _callback, 0);
      |
      -setup_timer(&_field1._timer, _callback, 0);
      +timer_setup(&_field1._timer, _callback, 0);
      |
      -setup_timer(&_field1._timer, _callback, 0L);
      +timer_setup(&_field1._timer, _callback, 0);
      |
      -setup_timer(&_field1._timer, _callback, 0UL);
      +timer_setup(&_field1._timer, _callback, 0);
      |
      -setup_timer(_field1._timer, _callback, 0);
      +timer_setup(_field1._timer, _callback, 0);
      |
      -setup_timer(_field1._timer, _callback, 0L);
      +timer_setup(_field1._timer, _callback, 0);
      |
      -setup_timer(_field1._timer, _callback, 0UL);
      +timer_setup(_field1._timer, _callback, 0);
      )
      
      @change_callback_unused_data
       depends on match_timer_function_unused_data@
      identifier match_timer_function_unused_data._callback;
      type _origtype;
      identifier _origarg;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *unused
       )
       {
      	... when != _origarg
       }
      Signed-off-by: NKees Cook <keescook@chromium.org>
      86cb30ec
    • K
      treewide: setup_timer() -> timer_setup() · e99e88a9
      Kees Cook 提交于
      This converts all remaining cases of the old setup_timer() API into using
      timer_setup(), where the callback argument is the structure already
      holding the struct timer_list. These should have no behavioral changes,
      since they just change which pointer is passed into the callback with
      the same available pointers after conversion. It handles the following
      examples, in addition to some other variations.
      
      Casting from unsigned long:
      
          void my_callback(unsigned long data)
          {
              struct something *ptr = (struct something *)data;
          ...
          }
          ...
          setup_timer(&ptr->my_timer, my_callback, ptr);
      
      and forced object casts:
      
          void my_callback(struct something *ptr)
          {
          ...
          }
          ...
          setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);
      
      become:
      
          void my_callback(struct timer_list *t)
          {
              struct something *ptr = from_timer(ptr, t, my_timer);
          ...
          }
          ...
          timer_setup(&ptr->my_timer, my_callback, 0);
      
      Direct function assignments:
      
          void my_callback(unsigned long data)
          {
              struct something *ptr = (struct something *)data;
          ...
          }
          ...
          ptr->my_timer.function = my_callback;
      
      have a temporary cast added, along with converting the args:
      
          void my_callback(struct timer_list *t)
          {
              struct something *ptr = from_timer(ptr, t, my_timer);
          ...
          }
          ...
          ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;
      
      And finally, callbacks without a data assignment:
      
          void my_callback(unsigned long data)
          {
          ...
          }
          ...
          setup_timer(&ptr->my_timer, my_callback, 0);
      
      have their argument renamed to verify they're unused during conversion:
      
          void my_callback(struct timer_list *unused)
          {
          ...
          }
          ...
          timer_setup(&ptr->my_timer, my_callback, 0);
      
      The conversion is done with the following Coccinelle script:
      
      spatch --very-quiet --all-includes --include-headers \
      	-I ./arch/x86/include -I ./arch/x86/include/generated \
      	-I ./include -I ./arch/x86/include/uapi \
      	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
      	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
      	--dir . \
      	--cocci-file ~/src/data/timer_setup.cocci
      
      @fix_address_of@
      expression e;
      @@
      
       setup_timer(
      -&(e)
      +&e
       , ...)
      
      // Update any raw setup_timer() usages that have a NULL callback, but
      // would otherwise match change_timer_function_usage, since the latter
      // will update all function assignments done in the face of a NULL
      // function initialization in setup_timer().
      @change_timer_function_usage_NULL@
      expression _E;
      identifier _timer;
      type _cast_data;
      @@
      
      (
      -setup_timer(&_E->_timer, NULL, _E);
      +timer_setup(&_E->_timer, NULL, 0);
      |
      -setup_timer(&_E->_timer, NULL, (_cast_data)_E);
      +timer_setup(&_E->_timer, NULL, 0);
      |
      -setup_timer(&_E._timer, NULL, &_E);
      +timer_setup(&_E._timer, NULL, 0);
      |
      -setup_timer(&_E._timer, NULL, (_cast_data)&_E);
      +timer_setup(&_E._timer, NULL, 0);
      )
      
      @change_timer_function_usage@
      expression _E;
      identifier _timer;
      struct timer_list _stl;
      identifier _callback;
      type _cast_func, _cast_data;
      @@
      
      (
      -setup_timer(&_E->_timer, _callback, _E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, &_callback, _E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, _callback, (_cast_data)_E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, (_cast_func)_callback, _E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, (_cast_data)_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, (_cast_data)&_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, &_callback, (_cast_data)_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
      +timer_setup(&_E._timer, _callback, 0);
      |
       _E->_timer@_stl.function = _callback;
      |
       _E->_timer@_stl.function = &_callback;
      |
       _E->_timer@_stl.function = (_cast_func)_callback;
      |
       _E->_timer@_stl.function = (_cast_func)&_callback;
      |
       _E._timer@_stl.function = _callback;
      |
       _E._timer@_stl.function = &_callback;
      |
       _E._timer@_stl.function = (_cast_func)_callback;
      |
       _E._timer@_stl.function = (_cast_func)&_callback;
      )
      
      // callback(unsigned long arg)
      @change_callback_handle_cast
       depends on change_timer_function_usage@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._timer;
      type _origtype;
      identifier _origarg;
      type _handletype;
      identifier _handle;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *t
       )
       {
      (
      	... when != _origarg
      	_handletype *_handle =
      -(_handletype *)_origarg;
      +from_timer(_handle, t, _timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle =
      -(void *)_origarg;
      +from_timer(_handle, t, _timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle;
      	... when != _handle
      	_handle =
      -(_handletype *)_origarg;
      +from_timer(_handle, t, _timer);
      	... when != _origarg
      |
      	... when != _origarg
      	_handletype *_handle;
      	... when != _handle
      	_handle =
      -(void *)_origarg;
      +from_timer(_handle, t, _timer);
      	... when != _origarg
      )
       }
      
      // callback(unsigned long arg) without existing variable
      @change_callback_handle_cast_no_arg
       depends on change_timer_function_usage &&
                           !change_callback_handle_cast@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._timer;
      type _origtype;
      identifier _origarg;
      type _handletype;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *t
       )
       {
      +	_handletype *_origarg = from_timer(_origarg, t, _timer);
      +
      	... when != _origarg
      -	(_handletype *)_origarg
      +	_origarg
      	... when != _origarg
       }
      
      // Avoid already converted callbacks.
      @match_callback_converted
       depends on change_timer_function_usage &&
                  !change_callback_handle_cast &&
      	    !change_callback_handle_cast_no_arg@
      identifier change_timer_function_usage._callback;
      identifier t;
      @@
      
       void _callback(struct timer_list *t)
       { ... }
      
      // callback(struct something *handle)
      @change_callback_handle_arg
       depends on change_timer_function_usage &&
      	    !match_callback_converted &&
                  !change_callback_handle_cast &&
                  !change_callback_handle_cast_no_arg@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._timer;
      type _handletype;
      identifier _handle;
      @@
      
       void _callback(
      -_handletype *_handle
      +struct timer_list *t
       )
       {
      +	_handletype *_handle = from_timer(_handle, t, _timer);
      	...
       }
      
      // If change_callback_handle_arg ran on an empty function, remove
      // the added handler.
      @unchange_callback_handle_arg
       depends on change_timer_function_usage &&
      	    change_callback_handle_arg@
      identifier change_timer_function_usage._callback;
      identifier change_timer_function_usage._timer;
      type _handletype;
      identifier _handle;
      identifier t;
      @@
      
       void _callback(struct timer_list *t)
       {
      -	_handletype *_handle = from_timer(_handle, t, _timer);
       }
      
      // We only want to refactor the setup_timer() data argument if we've found
      // the matching callback. This undoes changes in change_timer_function_usage.
      @unchange_timer_function_usage
       depends on change_timer_function_usage &&
                  !change_callback_handle_cast &&
                  !change_callback_handle_cast_no_arg &&
      	    !change_callback_handle_arg@
      expression change_timer_function_usage._E;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type change_timer_function_usage._cast_data;
      @@
      
      (
      -timer_setup(&_E->_timer, _callback, 0);
      +setup_timer(&_E->_timer, _callback, (_cast_data)_E);
      |
      -timer_setup(&_E._timer, _callback, 0);
      +setup_timer(&_E._timer, _callback, (_cast_data)&_E);
      )
      
      // If we fixed a callback from a .function assignment, fix the
      // assignment cast now.
      @change_timer_function_assignment
       depends on change_timer_function_usage &&
                  (change_callback_handle_cast ||
                   change_callback_handle_cast_no_arg ||
                   change_callback_handle_arg)@
      expression change_timer_function_usage._E;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type _cast_func;
      typedef TIMER_FUNC_TYPE;
      @@
      
      (
       _E->_timer.function =
      -_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_timer.function =
      -&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_timer.function =
      -(_cast_func)_callback;
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E->_timer.function =
      -(_cast_func)&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._timer.function =
      -_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._timer.function =
      -&_callback;
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._timer.function =
      -(_cast_func)_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      |
       _E._timer.function =
      -(_cast_func)&_callback
      +(TIMER_FUNC_TYPE)_callback
       ;
      )
      
      // Sometimes timer functions are called directly. Replace matched args.
      @change_timer_function_calls
       depends on change_timer_function_usage &&
                  (change_callback_handle_cast ||
                   change_callback_handle_cast_no_arg ||
                   change_callback_handle_arg)@
      expression _E;
      identifier change_timer_function_usage._timer;
      identifier change_timer_function_usage._callback;
      type _cast_data;
      @@
      
       _callback(
      (
      -(_cast_data)_E
      +&_E->_timer
      |
      -(_cast_data)&_E
      +&_E._timer
      |
      -_E
      +&_E->_timer
      )
       )
      
      // If a timer has been configured without a data argument, it can be
      // converted without regard to the callback argument, since it is unused.
      @match_timer_function_unused_data@
      expression _E;
      identifier _timer;
      identifier _callback;
      @@
      
      (
      -setup_timer(&_E->_timer, _callback, 0);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, _callback, 0L);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E->_timer, _callback, 0UL);
      +timer_setup(&_E->_timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, 0);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, 0L);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_E._timer, _callback, 0UL);
      +timer_setup(&_E._timer, _callback, 0);
      |
      -setup_timer(&_timer, _callback, 0);
      +timer_setup(&_timer, _callback, 0);
      |
      -setup_timer(&_timer, _callback, 0L);
      +timer_setup(&_timer, _callback, 0);
      |
      -setup_timer(&_timer, _callback, 0UL);
      +timer_setup(&_timer, _callback, 0);
      |
      -setup_timer(_timer, _callback, 0);
      +timer_setup(_timer, _callback, 0);
      |
      -setup_timer(_timer, _callback, 0L);
      +timer_setup(_timer, _callback, 0);
      |
      -setup_timer(_timer, _callback, 0UL);
      +timer_setup(_timer, _callback, 0);
      )
      
      @change_callback_unused_data
       depends on match_timer_function_unused_data@
      identifier match_timer_function_unused_data._callback;
      type _origtype;
      identifier _origarg;
      @@
      
       void _callback(
      -_origtype _origarg
      +struct timer_list *unused
       )
       {
      	... when != _origarg
       }
      Signed-off-by: NKees Cook <keescook@chromium.org>
      e99e88a9
  11. 21 11月, 2017 1 次提交
  12. 20 11月, 2017 1 次提交
  13. 18 11月, 2017 1 次提交
    • G
      pid: replace pid bitmap implementation with IDR API · 95846ecf
      Gargi Sharma 提交于
      Patch series "Replacing PID bitmap implementation with IDR API", v4.
      
      This series replaces kernel bitmap implementation of PID allocation with
      IDR API.  These patches are written to simplify the kernel by replacing
      custom code with calls to generic code.
      
      The following are the stats for pid and pid_namespace object files
      before and after the replacement.  There is a noteworthy change between
      the IDR and bitmap implementation.
      
      Before
         text       data        bss        dec        hex    filename
         8447       3894         64      12405       3075    kernel/pid.o
      After
         text       data        bss        dec        hex    filename
         3397        304          0       3701        e75    kernel/pid.o
      
      Before
         text       data        bss        dec        hex    filename
         5692       1842        192       7726       1e2e    kernel/pid_namespace.o
      After
         text       data        bss        dec        hex    filename
         2854        216         16       3086        c0e    kernel/pid_namespace.o
      
      The following are the stats for ps, pstree and calling readdir on /proc
      for 10,000 processes.
      
      ps:
              With IDR API    With bitmap
      real    0m1.479s        0m2.319s
      user    0m0.070s        0m0.060s
      sys     0m0.289s        0m0.516s
      
      pstree:
              With IDR API    With bitmap
      real    0m1.024s        0m1.794s
      user    0m0.348s        0m0.612s
      sys     0m0.184s        0m0.264s
      
      proc:
              With IDR API    With bitmap
      real    0m0.059s        0m0.074s
      user    0m0.000s        0m0.004s
      sys     0m0.016s        0m0.016s
      
      This patch (of 2):
      
      Replace the current bitmap implementation for Process ID allocation.
      Functions that are no longer required, for example, free_pidmap(),
      alloc_pidmap(), etc.  are removed.  The rest of the functions are
      modified to use the IDR API.  The change was made to make the PID
      allocation less complex by replacing custom code with calls to generic
      API.
      
      [gs051095@gmail.com: v6]
        Link: http://lkml.kernel.org/r/1507760379-21662-2-git-send-email-gs051095@gmail.com
      [avagin@openvz.org: restore the old behaviour of the ns_last_pid sysctl]
        Link: http://lkml.kernel.org/r/20171106183144.16368-1-avagin@openvz.org
      Link: http://lkml.kernel.org/r/1507583624-22146-2-git-send-email-gs051095@gmail.comSigned-off-by: NGargi Sharma <gs051095@gmail.com>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      95846ecf
  14. 16 11月, 2017 3 次提交
  15. 15 11月, 2017 2 次提交
    • J
      fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall · 4d2dc2cc
      Jeff Layton 提交于
      Currently, we're capping the values too low in the F_GETLK64 case. The
      fields in that structure are 64-bit values, so we shouldn't need to do
      any sort of fixup there.
      
      Make sure we check that assumption at build time in the future however
      by ensuring that the sizes we're copying will fit.
      
      With this, we no longer need COMPAT_LOFF_T_MAX either, so remove it.
      
      Fixes: 94073ad7 (fs/locks: don't mess with the address limit in compat_fcntl64)
      Reported-by: NVitaly Lipatov <lav@etersoft.ru>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Reviewed-by: NDavid Howells <dhowells@redhat.com>
      4d2dc2cc
    • M
      powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature · 3ffa9d9e
      Michael Ellerman 提交于
      Recently we added a CPU feature for Power9 DD2.0, to capture the fact
      that some workarounds are required only on Power9 DD1 and DD2.0 but
      not DD2.1 or later.
      
      Then in commit 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and
      DD2.0 ERAT workaround on DD2.1") and commit e3646330
      "powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on
      DD2.1") we changed CPU_FTR_SECTIONs to check for DD1 or DD20, eg:
      
        BEGIN_FTR_SECTION
                PPC_INVALIDATE_ERAT
        END_FTR_SECTION_IFSET(CPU_FTR_POWER9_DD1 | CPU_FTR_POWER9_DD20)
      
      Unfortunately although this reads as "if set DD1 or DD2.0", the or is
      a bitwise or and actually generates a mask of both bits. The code that
      does the feature patching then checks that the value of the CPU
      features masked with that mask are equal to the mask.
      
      So the end result is we're checking for DD1 and DD20 being set, which
      never happens. Yes the API is terrible.
      
      Removing the ERAT workaround on DD2.0 results in random SEGVs, the
      system tends to boot, but things randomly die including sometimes
      dhclient, udev etc.
      
      To fix the problem and hopefully avoid it in future, we remove the
      DD2.0 CPU feature and instead add a DD2.1 (or later) feature. This
      allows us to easily express that the workarounds are required if DD2.1
      is not set.
      
      At some point we will drop the DD1 workarounds entirely and some of
      this can be cleaned up.
      
      Fixes: 9d2f510a ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1")
      Fixes: e3646330 ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      3ffa9d9e
  16. 14 11月, 2017 1 次提交
    • M
      powerpc/64s: Fix masking of SRR1 bits on instruction fault · 475b581f
      Michael Ellerman 提交于
      On 64-bit Book3s, when we take an instruction fault the reason for the
      fault may be reported in SRR1. For data faults the reason is reported
      in DSISR (Data Storage Instruction Status Register).
      
      The reasons reported in each do not necessarily correspond, so we mask
      the SRR1 bits before copying them to the DSISR, which is then used by
      the page fault code.
      
      Prior to commit b4c001dc ("powerpc/mm: Use symbolic constants for
      filtering SRR1 bits on ISIs") we used a hard-coded mask of 0x58200000,
      which corresponds to:
      
        DSISR_NOHPTE		0x40000000 /* no translation found */
        DSISR_NOEXEC_OR_G	0x10000000 /* exec of no-exec or guarded */
        DSISR_PROTFAULT	0x08000000 /* protection fault */
        DSISR_KEYFAULT	0x00200000 /* Storage Key fault */
      
      That commit added a #define for the mask, DSISR_SRR1_MATCH_64S, but
      incorrectly used a different similarly named DSISR_BAD_FAULT_64S.
      
      This had the effect of changing the mask to 0xa43a0000, which omits
      everything but DSISR_KEYFAULT.
      
      Luckily this had no visible effect, because in practice we hardly use
      the DSISR bits. The lack of DSISR_NOHPTE means a TLB flush
      optimisation was missed in the native HPTE code, and DSISR_NOEXEC_OR_G
      and DSISR_PROTFAULT are both only used to trigger rare warnings.
      
      So we got lucky, but let's fix it. The new value only has bits between
      17 and 30 set, so we can continue to use andis.
      
      Fixes: b4c001dc ("powerpc/mm: Use symbolic constants for filtering SRR1 bits on ISIs")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      475b581f
  17. 13 11月, 2017 11 次提交