1. 21 Oct, 2019 1 commit
    • KVM: PPC: Report single stepping capability · 1a9167a2
      Committed by Fabiano Rosas
      When calling the KVM_SET_GUEST_DEBUG ioctl, userspace might request
      the next instruction to be single stepped via the
      KVM_GUESTDBG_SINGLESTEP control bit of the kvm_guest_debug structure.
      
      This patch adds the KVM_CAP_PPC_GUEST_DEBUG_SSTEP capability in order
      to inform userspace about the state of single stepping support.
      
      We currently don't have support for guest single stepping implemented
      in Book3S HV so the capability is only present for Book3S PR and
      BookE.

      Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
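A minimal sketch of how userspace might probe for the capability described in this commit, using the generic KVM_CHECK_EXTENSION ioctl. The helper name kvm_has_guest_sstep() is hypothetical, and the query is issued on the /dev/kvm descriptor as an assumption; the commit message does not say which descriptor userspace is expected to use. Headers that predate the capability simply yield "unknown".

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: returns 1 if KVM reports guest single-step support,
 * 0 if it does not, and -1 if the answer cannot be determined (no usable
 * /dev/kvm, or kernel headers that predate the capability).
 */
int kvm_has_guest_sstep(void)
{
#ifdef KVM_CAP_PPC_GUEST_DEBUG_SSTEP
	int kvm_fd = open("/dev/kvm", O_RDWR);
	int ret;

	if (kvm_fd < 0)
		return -1;

	/* KVM_CHECK_EXTENSION returns 0 when a capability is absent. */
	ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_GUEST_DEBUG_SSTEP);
	close(kvm_fd);
	if (ret < 0)
		return -1;
	return ret > 0;
#else
	return -1;
#endif
}
```

A debugger stub would call this once at start-up and only set KVM_GUESTDBG_SINGLESTEP in kvm_guest_debug when the result is 1.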
  2. 15 Oct, 2019 1 commit
    • KVM: PPC: Book3S HV: XIVE: Ensure VP isn't already in use · 12ade69c
      Committed by Greg Kurz
      Connecting a vCPU to a XIVE KVM device means establishing a 1:1
      association between a vCPU id and the offset (VP id) of a VP
      structure within a fixed size block of VPs. We currently try to
      enforce the 1:1 relationship by checking that a vCPU with the
      same id isn't already connected. This is good but unfortunately
      not enough because we don't map VP ids to raw vCPU ids but to
      packed vCPU ids, and the packing function kvmppc_pack_vcpu_id()
      isn't bijective by design. We got away with it because QEMU passes
      vCPU ids that fit well in the packing pattern. But nothing prevents
      userspace from coming up with a forged vCPU id resulting in a packed id
      collision, which causes the KVM device to associate two vCPUs with the
      same VP. This greatly confuses the irq layer and ultimately crashes
      the kernel, as shown below.
      
      Example: a guest with 1 guest thread per core, a core stride of
      8 and 300 vCPUs has vCPU ids 0,8,16...2392. If QEMU is patched to
      inject at some point an invalid vCPU id 348, which is the packed
      version of itself and 2392, we get:
      
      genirq: Flags mismatch irq 199. 00010000 (kvm-2-2392) vs. 00010000 (kvm-2-348)
      CPU: 24 PID: 88176 Comm: qemu-system-ppc Not tainted 5.3.0-xive-nr-servers-5.3-gku+ #38
      Call Trace:
      [c000003f7f9937e0] [c000000000c0110c] dump_stack+0xb0/0xf4 (unreliable)
      [c000003f7f993820] [c0000000001cb480] __setup_irq+0xa70/0xad0
      [c000003f7f9938d0] [c0000000001cb75c] request_threaded_irq+0x13c/0x260
      [c000003f7f993940] [c00800000d44e7ac] kvmppc_xive_attach_escalation+0x104/0x270 [kvm]
      [c000003f7f9939d0] [c00800000d45013c] kvmppc_xive_connect_vcpu+0x424/0x620 [kvm]
      [c000003f7f993ac0] [c00800000d444428] kvm_arch_vcpu_ioctl+0x260/0x448 [kvm]
      [c000003f7f993b90] [c00800000d43593c] kvm_vcpu_ioctl+0x154/0x7c8 [kvm]
      [c000003f7f993d00] [c0000000004840f0] do_vfs_ioctl+0xe0/0xc30
      [c000003f7f993db0] [c000000000484d44] ksys_ioctl+0x104/0x120
      [c000003f7f993e00] [c000000000484d88] sys_ioctl+0x28/0x80
      [c000003f7f993e20] [c00000000000b278] system_call+0x5c/0x68
      xive-kvm: Failed to request escalation interrupt for queue 0 of VCPU 2392
      ------------[ cut here ]------------
      remove_proc_entry: removing non-empty directory 'irq/199', leaking at least 'kvm-2-348'
      WARNING: CPU: 24 PID: 88176 at /home/greg/Work/linux/kernel-kvm-ppc/fs/proc/generic.c:684 remove_proc_entry+0x1ec/0x200
      Modules linked in: kvm_hv kvm dm_mod vhost_net vhost tap xt_CHECKSUM iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter squashfs loop fuse i2c_dev sg ofpart ocxl powernv_flash at24 xts mtd uio_pdrv_genirq vmx_crypto opal_prd ipmi_powernv uio ipmi_devintf ipmi_msghandler ibmpowernv ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables ext4 mbcache jbd2 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq libcrc32c raid1 raid0 linear sd_mod ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci libahci libata tg3 drm_panel_orientation_quirks [last unloaded: kvm]
      CPU: 24 PID: 88176 Comm: qemu-system-ppc Not tainted 5.3.0-xive-nr-servers-5.3-gku+ #38
      NIP:  c00000000053b0cc LR: c00000000053b0c8 CTR: c0000000000ba3b0
      REGS: c000003f7f9934b0 TRAP: 0700   Not tainted  (5.3.0-xive-nr-servers-5.3-gku+)
      MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 48228222  XER: 20040000
      CFAR: c000000000131a50 IRQMASK: 0
      GPR00: c00000000053b0c8 c000003f7f993740 c0000000015ec500 0000000000000057
      GPR04: 0000000000000001 0000000000000000 000049fb98484262 0000000000001bcf
      GPR08: 0000000000000007 0000000000000007 0000000000000001 9000000000001033
      GPR12: 0000000000008000 c000003ffffeb800 0000000000000000 000000012f4ce5a1
      GPR16: 000000012ef5a0c8 0000000000000000 000000012f113bb0 0000000000000000
      GPR20: 000000012f45d918 c000003f863758b0 c000003f86375870 0000000000000006
      GPR24: c000003f86375a30 0000000000000007 c0002039373d9020 c0000000014c4a48
      GPR28: 0000000000000001 c000003fe62a4f6b c00020394b2e9fab c000003fe62a4ec0
      NIP [c00000000053b0cc] remove_proc_entry+0x1ec/0x200
      LR [c00000000053b0c8] remove_proc_entry+0x1e8/0x200
      Call Trace:
      [c000003f7f993740] [c00000000053b0c8] remove_proc_entry+0x1e8/0x200 (unreliable)
      [c000003f7f9937e0] [c0000000001d3654] unregister_irq_proc+0x114/0x150
      [c000003f7f993880] [c0000000001c6284] free_desc+0x54/0xb0
      [c000003f7f9938c0] [c0000000001c65ec] irq_free_descs+0xac/0x100
      [c000003f7f993910] [c0000000001d1ff8] irq_dispose_mapping+0x68/0x80
      [c000003f7f993940] [c00800000d44e8a4] kvmppc_xive_attach_escalation+0x1fc/0x270 [kvm]
      [c000003f7f9939d0] [c00800000d45013c] kvmppc_xive_connect_vcpu+0x424/0x620 [kvm]
      [c000003f7f993ac0] [c00800000d444428] kvm_arch_vcpu_ioctl+0x260/0x448 [kvm]
      [c000003f7f993b90] [c00800000d43593c] kvm_vcpu_ioctl+0x154/0x7c8 [kvm]
      [c000003f7f993d00] [c0000000004840f0] do_vfs_ioctl+0xe0/0xc30
      [c000003f7f993db0] [c000000000484d44] ksys_ioctl+0x104/0x120
      [c000003f7f993e00] [c000000000484d88] sys_ioctl+0x28/0x80
      [c000003f7f993e20] [c00000000000b278] system_call+0x5c/0x68
      Instruction dump:
      2c230000 41820008 3923ff78 e8e900a0 3c82ff69 3c62ff8d 7fa6eb78 7fc5f378
      3884f080 3863b948 4bbf6925 60000000 <0fe00000> 4bffff7c fba10088 4bbf6e41
      ---[ end trace b925b67a74a1d8d1 ]---
      BUG: Kernel NULL pointer dereference at 0x00000010
      Faulting instruction address: 0xc00800000d44fc04
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
      Modules linked in: kvm_hv kvm dm_mod vhost_net vhost tap xt_CHECKSUM iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter squashfs loop fuse i2c_dev sg ofpart ocxl powernv_flash at24 xts mtd uio_pdrv_genirq vmx_crypto opal_prd ipmi_powernv uio ipmi_devintf ipmi_msghandler ibmpowernv ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables ext4 mbcache jbd2 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq libcrc32c raid1 raid0 linear sd_mod ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci libahci libata tg3 drm_panel_orientation_quirks [last unloaded: kvm]
      CPU: 24 PID: 88176 Comm: qemu-system-ppc Tainted: G        W         5.3.0-xive-nr-servers-5.3-gku+ #38
      NIP:  c00800000d44fc04 LR: c00800000d44fc00 CTR: c0000000001cd970
      REGS: c000003f7f9938e0 TRAP: 0300   Tainted: G        W          (5.3.0-xive-nr-servers-5.3-gku+)
      MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24228882  XER: 20040000
      CFAR: c0000000001cd9ac DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0
      GPR00: c00800000d44fc00 c000003f7f993b70 c00800000d468300 0000000000000000
      GPR04: 00000000000000c7 0000000000000000 0000000000000000 c000003ffacd06d8
      GPR08: 0000000000000000 c000003ffacd0738 0000000000000000 fffffffffffffffd
      GPR12: 0000000000000040 c000003ffffeb800 0000000000000000 000000012f4ce5a1
      GPR16: 000000012ef5a0c8 0000000000000000 000000012f113bb0 0000000000000000
      GPR20: 000000012f45d918 00007ffffe0d9a80 000000012f4f5df0 000000012ef8c9f8
      GPR24: 0000000000000001 0000000000000000 c000003fe4501ed0 c000003f8b1d0000
      GPR28: c0000033314689c0 c000003fe4501c00 c000003fe4501e70 c000003fe4501e90
      NIP [c00800000d44fc04] kvmppc_xive_cleanup_vcpu+0xfc/0x210 [kvm]
      LR [c00800000d44fc00] kvmppc_xive_cleanup_vcpu+0xf8/0x210 [kvm]
      Call Trace:
      [c000003f7f993b70] [c00800000d44fc00] kvmppc_xive_cleanup_vcpu+0xf8/0x210 [kvm] (unreliable)
      [c000003f7f993bd0] [c00800000d450bd4] kvmppc_xive_release+0xdc/0x1b0 [kvm]
      [c000003f7f993c30] [c00800000d436a98] kvm_device_release+0xb0/0x110 [kvm]
      [c000003f7f993c70] [c00000000046730c] __fput+0xec/0x320
      [c000003f7f993cd0] [c000000000164ae0] task_work_run+0x150/0x1c0
      [c000003f7f993d30] [c000000000025034] do_notify_resume+0x304/0x440
      [c000003f7f993e20] [c00000000000dcc4] ret_from_except_lite+0x70/0x74
      Instruction dump:
      3bff0008 7fbfd040 419e0054 847e0004 2fa30000 419effec e93d0000 8929203c
      2f890000 419effb8 4800821d e8410018 <e9230010> e9490008 9b2a0039 7c0004ac
      ---[ end trace b925b67a74a1d8d2 ]---
      
      Kernel panic - not syncing: Fatal exception
      
      This affects both XIVE and XICS-on-XIVE devices since the beginning.
      
      Check the VP id instead of the vCPU id when a new vCPU is connected.
      The allocation of the XIVE CPU structure in kvmppc_xive_connect_vcpu()
      is moved after the check to avoid the need for rollback.
      
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Reviewed-by: Cédric Le Goater <clg@kaod.org>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
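The collision in the example above (raw ids 348 and 2392 both mapping to VP id 348) can be reproduced with a small userspace model. This is a sketch under stated assumptions, not the kernel's code: the function names are hypothetical, the limit of 2048 is taken from NR_CPUS=2048 in the oops output, and the block-offset table is an assumed stand-in modeled loosely on kvmppc_pack_vcpu_id().

```c
#include <stdint.h>

#define MODEL_MAX_VCPUS       2048u /* assumption: NR_CPUS=2048 in the log */
#define MODEL_MAX_SMT_THREADS 8u    /* assumption */

/*
 * Simplified packing model: ids at or above MODEL_MAX_VCPUS are folded
 * back into [0, MODEL_MAX_VCPUS) by adding a small per-block offset.
 * stride must be between 1 and MODEL_MAX_SMT_THREADS.
 */
uint32_t model_pack_vcpu_id(uint32_t id, uint32_t stride)
{
	/* Assumed offset table; illustrative only. */
	static const uint32_t block_offsets[MODEL_MAX_SMT_THREADS] = {
		0, 4, 2, 6, 1, 5, 3, 7
	};
	uint32_t block = (id / MODEL_MAX_VCPUS) *
			 (MODEL_MAX_SMT_THREADS / stride);

	if (block >= MODEL_MAX_SMT_THREADS)
		return 0; /* id too large to pack in this model */
	return (id % MODEL_MAX_VCPUS) + block_offsets[block];
}

/* Returns 1 when two distinct raw ids land on the same packed (VP) id. */
int model_ids_collide(uint32_t a, uint32_t b, uint32_t stride)
{
	return a != b &&
	       model_pack_vcpu_id(a, stride) == model_pack_vcpu_id(b, stride);
}
```

With a stride of 8, ids 348 and 2392 collide in this model even though the raw ids differ, which is why comparing raw vCPU ids is not a sufficient check and the fix compares VP ids instead.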
  3. 14 Oct, 2019 2 commits
    • Linux 5.4-rc3 · 4f5cafb5
      Committed by Linus Torvalds
    • Merge tag 'trace-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · d4615e5a
      Committed by Linus Torvalds
      Pull tracing fixes from Steven Rostedt:
       "A few tracing fixes:
      
         - Remove lockdown from tracefs itself and moved it to the trace
           directory. Have the open functions there do the lockdown checks.
      
         - Fix a few races with opening an instance file and the instance
           being deleted (Discovered during the lockdown updates). Kept
           separate from the clean up code such that they can be backported to
           stable easier.
      
         - Clean up and consolidated the checks done when opening a trace
           file, as there were multiple checks that need to be done, and it
           did not make sense having them done in each open instance.
      
         - Fix a regression in the record mcount code.
      
         - Small hw_lat detector tracer fixes.
      
         - A trace_pipe read fix due to not initializing trace_seq"
      
      * tag 'trace-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Initialize iter->seq after zeroing in tracing_read_pipe()
        tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
        tracing/hwlat: Report total time spent in all NMIs during the sample
        recordmcount: Fix nop_mcount() function
        tracing: Do not create tracefs files if tracefs lockdown is in effect
        tracing: Add locked_down checks to the open calls of files created for tracefs
        tracing: Add tracing_check_open_get_tr()
        tracing: Have trace events system open call tracing_open_generic_tr()
        tracing: Get trace_array reference for available_tracers files
        ftrace: Get a reference counter for the trace_array on filter files
        tracefs: Revert ccbd54ff ("tracefs: Restrict tracefs when the kernel is locked down")
  4. 13 Oct, 2019 28 commits
    • Merge tag 'hwmon-for-v5.4-rc3' of... · 2581efa9
      Committed by Linus Torvalds
      Merge tag 'hwmon-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
      
       - Update/fix inspur-ipsps1 and k10temp Documentation
      
       - Fix nct7904 driver
      
       - Fix HWMON_P_MIN_ALARM mask in hwmon core
      
      * tag 'hwmon-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: docs: Extend inspur-ipsps1 title underline
        hwmon: (nct7904) Add array fan_alarm and vsen_alarm to store the alarms in nct7904_data struct.
        docs: hwmon: Include 'inspur-ipsps1.rst' into docs
        hwmon: Fix HWMON_P_MIN_ALARM mask
        hwmon: (k10temp) Update documentation and add temp2_input info
        hwmon: (nct7904) Fix the incorrect value of vsen_mask in nct7904_data struct
    • Merge tag 'fixes-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · 71b1b553
      Committed by Linus Torvalds
      Pull MTD fixes from Richard Weinberger:
       "Two fixes for MTD:
      
         - spi-nor: Fix for a regression in write_sr()
      
         - rawnand: Regression fix for the au1550nd driver"
      
      * tag 'fixes-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
        mtd: rawnand: au1550nd: Fix au_read_buf16() prototype
        mtd: spi-nor: Fix direction of the write_sr() transfer
    • Merge tag 'for-linus-20191012' of git://git.kernel.dk/linux-block · b27528b0
      Committed by Linus Torvalds
      Pull io_uring fix from Jens Axboe:
       "Single small fix for a regression in the sequence logic for linked
        commands"
      
      * tag 'for-linus-20191012' of git://git.kernel.dk/linux-block:
        io_uring: fix sequence logic for timeout requests
    • tracing: Initialize iter->seq after zeroing in tracing_read_pipe() · d303de1f
      Committed by Petr Mladek
      A customer reported the following softlockup:
      
      [899688.160002] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [test.sh:16464]
      [899688.160002] CPU: 0 PID: 16464 Comm: test.sh Not tainted 4.12.14-6.23-azure #1 SLE12-SP4
      [899688.160002] RIP: 0010:up_write+0x1a/0x30
      [899688.160002] Kernel panic - not syncing: softlockup: hung tasks
      [899688.160002] RIP: 0010:up_write+0x1a/0x30
      [899688.160002] RSP: 0018:ffffa86784d4fde8 EFLAGS: 00000257 ORIG_RAX: ffffffffffffff12
      [899688.160002] RAX: ffffffff970fea00 RBX: 0000000000000001 RCX: 0000000000000000
      [899688.160002] RDX: ffffffff00000001 RSI: 0000000000000080 RDI: ffffffff970fea00
      [899688.160002] RBP: ffffffffffffffff R08: ffffffffffffffff R09: 0000000000000000
      [899688.160002] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b59014720d8
      [899688.160002] R13: ffff8b59014720c0 R14: ffff8b5901471090 R15: ffff8b5901470000
      [899688.160002]  tracing_read_pipe+0x336/0x3c0
      [899688.160002]  __vfs_read+0x26/0x140
      [899688.160002]  vfs_read+0x87/0x130
      [899688.160002]  SyS_read+0x42/0x90
      [899688.160002]  do_syscall_64+0x74/0x160
      
      It caught the process in the middle of trace_access_unlock(). There is
      no loop. So, it must be looping in the caller tracing_read_pipe()
      via the "waitagain" label.
      
      Crash dump analysis uncovered that iter->seq was completely zeroed
      at this point, including iter->seq.seq.size. It means that
      print_trace_line() was never able to print anything and
      there was no forward progress.
      
      The culprit seems to be in the code:
      
      	/* reset all but tr, trace, and overruns */
      	memset(&iter->seq, 0,
      	       sizeof(struct trace_iterator) -
      	       offsetof(struct trace_iterator, seq));
      
      It was added by commit 53d0aa77 ("ftrace: add logic to record
      overruns") in v2.6.27-rc1. At that time, iter->seq looked like:
      
           struct trace_seq {
      	unsigned char		buffer[PAGE_SIZE];
      	unsigned int		len;
           };
      
      There was no "size" variable and zeroing was perfectly fine.
      
      The solution is to reinitialize the structure after or without
      zeroing.
      
      Link: http://lkml.kernel.org/r/20191011142134.11997-1-pmladek@suse.com

      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
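The failure mode described in this commit, and the "reinitialize after zeroing" remedy, can be illustrated with a small userspace model. All names below (model_seq, model_iter and the helpers) are hypothetical stand-ins, not the kernel's trace_seq or trace_iterator; the point is only that a structure carrying its own capacity field cannot be reset by memset() alone.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for trace_seq and trace_iterator. */
struct model_seq {
	char   buffer[64];
	size_t size; /* capacity; must be non-zero for printing to work */
	size_t len;
};

struct model_iter {
	void            *tr;       /* preserved across resets */
	unsigned long    overruns; /* preserved across resets */
	struct model_seq seq;      /* first field that gets reset */
	long             pos;
};

void model_seq_init(struct model_seq *s)
{
	s->size = sizeof(s->buffer);
	s->len = 0;
}

/* Zero everything from seq onward, then restore seq's capacity. */
void model_iter_reset(struct model_iter *iter)
{
	memset(&iter->seq, 0,
	       sizeof(struct model_iter) - offsetof(struct model_iter, seq));
	model_seq_init(&iter->seq);
}

/* Returns the number of bytes appended; 0 means no forward progress. */
size_t model_seq_puts(struct model_seq *s, const char *str)
{
	size_t n = strlen(str);

	if (s->len + n > s->size)
		return 0;
	memcpy(s->buffer + s->len, str, n);
	s->len += n;
	return n;
}
```

Without the model_seq_init() call in model_iter_reset(), size stays 0, every model_seq_puts() returns 0, and a caller that retries until something is printed would loop forever, which mirrors the soft lockup reported above.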
    • tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency · fc64e4ad
      Committed by Srivatsa S. Bhat (VMware)
      max_latency is intended to record the maximum ever observed hardware
      latency, which may occur in either part of the loop (inner/outer). So
      we need to also consider the outer-loop sample when updating
      max_latency.
      
      Link: http://lkml.kernel.org/r/157073345463.17189.18124025522664682811.stgit@srivatsa-ubuntu
      
      Fixes: e7c15cd8 ("tracing: Added hardware latency tracer")
      Cc: stable@vger.kernel.org
      Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
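As a rough illustration of the change described in this commit, the running maximum has to take the larger of the inner-loop and outer-loop samples. The function and parameter names here are hypothetical, and the sketch leaves out the tracer's threshold handling.

```c
#include <stdint.h>

/*
 * Fold one sample into the running maximum. inner_ns and outer_ns are the
 * largest gaps seen in the inner and outer parts of the sampling loop.
 */
uint64_t model_update_max_latency(uint64_t max_latency,
				  uint64_t inner_ns, uint64_t outer_ns)
{
	uint64_t latency = inner_ns > outer_ns ? inner_ns : outer_ns;

	return latency > max_latency ? latency : max_latency;
}
```

Considering only inner_ns, as the pre-fix behaviour is described, would under-report whenever the largest gap fell in the outer part of the loop.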
    • tracing/hwlat: Report total time spent in all NMIs during the sample · 98dc19c1
      Committed by Srivatsa S. Bhat (VMware)
      nmi_total_ts is supposed to record the total time spent in *all* NMIs
      that occur on the given CPU during the (active portion of the)
      sampling window. However, the code seems to be overwriting this
      variable for each NMI, thereby only recording the time spent in the
      most recent NMI. Fix it by accumulating the duration instead.
      
      Link: http://lkml.kernel.org/r/157073343544.17189.13911783866738671133.stgit@srivatsa-ubuntu
      
      Fixes: 7b2c8625 ("tracing: Add NMI tracing in hwlat detector")
      Cc: stable@vger.kernel.org
      Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
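The difference between overwriting and accumulating can be sketched as follows. The structure and function names are hypothetical; only nmi_total_ts is taken from the commit message, and the field layout is an assumption made for illustration.

```c
#include <stdint.h>

struct model_nmi_stats {
	uint64_t     nmi_ts_start; /* timestamp taken at NMI entry */
	uint64_t     nmi_total_ts; /* total time spent in NMIs this sample */
	unsigned int nmi_count;
};

void model_nmi_enter(struct model_nmi_stats *s, uint64_t now)
{
	s->nmi_ts_start = now;
}

void model_nmi_exit(struct model_nmi_stats *s, uint64_t now)
{
	/* "+=" accumulates; a plain "=" would keep only the latest NMI. */
	s->nmi_total_ts += now - s->nmi_ts_start;
	s->nmi_count++;
}
```

Two NMIs lasting 30 and 50 time units then report a total of 80 rather than 50.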
    • recordmcount: Fix nop_mcount() function · 7f8557b8
      Committed by Steven Rostedt (VMware)
      The removal of the longjmp code in recordmcount.c mistakenly made
      nop_mcount() exit whenever make_nop() returns a negative value. It should
      not exit the routine, but instead just not process that part of the code.
      Exiting with an error code caused recordmcount to fail on some files,
      which in turn failed the build if ftrace function tracing was enabled.
      
      Link: http://lkml.kernel.org/r/20191009110538.5909fec6@gandalf.local.home

      Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Fixes: 3f1df120 ("recordmcount: Rewrite error/success handling")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Do not create tracefs files if tracefs lockdown is in effect · bf8e6021
      Committed by Steven Rostedt (VMware)
      If on boot up, lockdown is activated for tracefs, don't even bother creating
      the files. This can also prevent instances from being created if lockdown is
      in effect.
      
      Link: http://lkml.kernel.org/r/CAHk-=whC6Ji=fWnjh2+eS4b15TnbsS4VPVtvBOwCy1jjEG_JHQ@mail.gmail.com

      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Add locked_down checks to the open calls of files created for tracefs · 17911ff3
      Committed by Steven Rostedt (VMware)
      Added various checks on open tracefs calls to see if tracefs is in lockdown
      mode, and if so, to return -EPERM.
      
      Note, the event format files (which are basically standard on all machines)
      as well as the enabled_functions file (which shows what is currently being
      traced) are not locked down. Perhaps they should be, but it seems
      counterintuitive to lock down information that helps you know if the system
      has been modified.
      
      Link: http://lkml.kernel.org/r/CAHk-=wj7fGPKUspr579Cii-w_y60PtRaiDgKuxVtBAMK0VNNkA@mail.gmail.com

      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Add tracing_check_open_get_tr() · 8530dec6
      Committed by Steven Rostedt (VMware)
      Currently, most files in the tracefs directory test if tracing_disabled is
      set. If so, the open should return -ENODEV. tracing_disabled is set when
      tracing is found to be broken. Originally this was done in case the ring
      buffer was found to be corrupted, and we wanted to prevent reading it from
      crashing the kernel. But it's also set if a tracing selftest fails on
      boot. It's a one way switch. That is, once it is triggered, tracing is
      disabled until reboot.
      
      As most tracefs files can also be used by instances in the tracefs
      directory, they need to be carefully done. Each instance has a trace_array
      associated to it, and when the instance is removed, the trace_array is
      freed. But if an instance is opened with a reference to the trace_array,
      then it requires looking up the trace_array to get its ref counter (as there
      could be a race with it being deleted and the open itself). Once it is
      found, a reference is added to prevent the instance from being removed (and
      the trace_array associated with it freed).
      
      Combine the two checks (tracing_disabled and trace_array_get()) into a
      single helper function. This will also make it easier to add lockdown to
      tracefs later.
      
      Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home

      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
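The idea of folding the two checks into one helper can be sketched in userspace as below. Everything prefixed model_ is hypothetical; the real helper operates on the kernel's trace_array and performs the lookup under locking, which this single-threaded sketch leaves out.

```c
#include <errno.h>

struct model_trace_array {
	int ref;     /* open references held on this instance */
	int removed; /* set once the instance has been deleted */
};

static int model_tracing_disabled; /* one-way switch; stays set until reboot */

/* Take a reference unless the instance is already gone. */
static int model_trace_array_get(struct model_trace_array *tr)
{
	if (tr->removed)
		return -ENODEV;
	tr->ref++;
	return 0;
}

/* Single entry point for both checks an open call has to make. */
int model_check_open_get_tr(struct model_trace_array *tr)
{
	if (model_tracing_disabled)
		return -ENODEV;
	if (tr && model_trace_array_get(tr) < 0)
		return -ENODEV;
	return 0;
}

void model_set_tracing_disabled(void)
{
	model_tracing_disabled = 1;
}
```

Each open path then makes one call and, on success, owns a reference it must drop on release. A further check, such as a lockdown test, could be added in this one place instead of in every open function.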
    • tracing: Have trace events system open call tracing_open_generic_tr() · aa07d71f
      Committed by Steven Rostedt (VMware)
      Instead of having the trace events system open call open code the taking of
      the trace_array descriptor (with trace_array_get()) and then calling
      trace_open_generic(), have it use the tracing_open_generic_tr() that does
      the combination of the two. This requires making tracing_open_generic_tr()
      global.

      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Get trace_array reference for available_tracers files · 194c2c74
      Committed by Steven Rostedt (VMware)
      As instances may have different tracers available, we need to look at the
      trace_array descriptor that shows the list of the available tracers for the
      instance. But there's a race between opening the file and an admin
      deleting the instance. The trace_array_get() needs to be called before
      accessing the trace_array.
      
      Cc: stable@vger.kernel.org
      Fixes: 607e2ea1 ("tracing: Set up infrastructure to allow tracers for instances")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Get a reference counter for the trace_array on filter files · 9ef16693
      Committed by Steven Rostedt (VMware)
      The ftrace set_ftrace_filter and set_ftrace_notrace files are specific for
      an instance now. They need to take a reference to the instance otherwise
      there could be a race between accessing the files and deleting the instance.
      
      It wasn't until the :mod: caching where these file operations started
      referencing the trace_array directly.
      
      Cc: stable@vger.kernel.org
      Fixes: 673feb9d ("ftrace: Add :mod: caching infrastructure to trace_array")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracefs: Revert ccbd54ff ("tracefs: Restrict tracefs when the kernel is locked down") · 3ed270b1
      Committed by Steven Rostedt (VMware)
      Running the latest kernel through my "make instances" stress tests, I
      triggered the following bug (with KASAN and kmemleak enabled):
      
      mkdir invoked oom-killer:
      gfp_mask=0x40cd0(GFP_KERNEL|__GFP_COMP|__GFP_RECLAIMABLE), order=0,
      oom_score_adj=0
      CPU: 1 PID: 2229 Comm: mkdir Not tainted 5.4.0-rc2-test #325
      Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
      Call Trace:
       dump_stack+0x64/0x8c
       dump_header+0x43/0x3b7
       ? trace_hardirqs_on+0x48/0x4a
       oom_kill_process+0x68/0x2d5
       out_of_memory+0x2aa/0x2d0
       __alloc_pages_nodemask+0x96d/0xb67
       __alloc_pages_node+0x19/0x1e
       alloc_slab_page+0x17/0x45
       new_slab+0xd0/0x234
       ___slab_alloc.constprop.86+0x18f/0x336
       ? alloc_inode+0x2c/0x74
       ? irq_trace+0x12/0x1e
       ? tracer_hardirqs_off+0x1d/0xd7
       ? __slab_alloc.constprop.85+0x21/0x53
       __slab_alloc.constprop.85+0x31/0x53
       ? __slab_alloc.constprop.85+0x31/0x53
       ? alloc_inode+0x2c/0x74
       kmem_cache_alloc+0x50/0x179
       ? alloc_inode+0x2c/0x74
       alloc_inode+0x2c/0x74
       new_inode_pseudo+0xf/0x48
       new_inode+0x15/0x25
       tracefs_get_inode+0x23/0x7c
       ? lookup_one_len+0x54/0x6c
       tracefs_create_file+0x53/0x11d
       trace_create_file+0x15/0x33
       event_create_dir+0x2a3/0x34b
       __trace_add_new_event+0x1c/0x26
       event_trace_add_tracer+0x56/0x86
       trace_array_create+0x13e/0x1e1
       instance_mkdir+0x8/0x17
       tracefs_syscall_mkdir+0x39/0x50
       ? get_dname+0x31/0x31
       vfs_mkdir+0x78/0xa3
       do_mkdirat+0x71/0xb0
       sys_mkdir+0x19/0x1b
       do_fast_syscall_32+0xb0/0xed
      
      I bisected this down to the addition of the proxy_ops into tracefs for
      lockdown. It appears that the allocation of the proxy_ops and then freeing
      it in the destroy_inode callback, is causing havoc with the memory system.
      Reading the documentation about destroy_inode and talking with Linus about
      this, this is buggy and wrong. When defining the destroy_inode() method, it
      is expected that the destroy_inode() will also free the inode, and not just
      the extra allocations done in the creation of the inode. The faulty commit
      causes a memory leak of the inode data structure when they are deleted.
      
      Instead of allocating the proxy_ops (and then having to free it) the checks
      should be done by the open functions themselves, and not hack into the
      tracefs directory. First revert the tracefs updates for locked_down and then
      later we can add the locked_down checks in the kernel/trace files.
      
      Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home
      
      Fixes: ccbd54ff ("tracefs: Restrict tracefs when the kernel is locked down")
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • Merge tag 'char-misc-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · da940012
      Committed by Linus Torvalds
      Pull char/misc driver fixes from Greg KH:
       "Here are some small char/misc driver fixes for 5.4-rc3.
      
        Nothing huge here. Some binder driver fixes (although it is still
        being discussed if these all fix the reported issues or not, so more
        might be coming later), some mei device ids and fixes, and a google
        firmware driver bugfix that fixes a regression, as well as some other
        tiny fixes.
      
        All have been in linux-next with no reported issues"
      
      * tag 'char-misc-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        firmware: google: increment VPD key_len properly
        w1: ds250x: Fix build error without CRC16
        virt: vbox: fix memory leak in hgcm_call_preprocess_linaddr
        binder: Fix comment headers on binder_alloc_prepare_to_free()
        binder: prevent UAF read in print_binder_transaction_log_entry()
        misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach
        mei: avoid FW version request on Ibex Peak and earlier
        mei: me: add comet point (lake) LP device ids
    • Merge tag 'staging-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 9cbc6348
      Committed by Linus Torvalds
      Pull staging/IIO driver fixes from Greg KH:
       "Here are some staging and IIO driver fixes for 5.4-rc3.
      
        The "biggest" thing here is a removal of the fbtft device and flexfb
        code as they have been abandoned by their authors and are no longer
        needed for that hardware.
      
        Other than that, the usual amount of staging driver and iio driver
        fixes for reported issues, and some speakup sysfs file documentation,
        which has been long awaited for.
      
        All have been in linux-next with no reported issues"
      
      * tag 'staging-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (32 commits)
        iio: Fix an undefied reference error in noa1305_probe
        iio: light: opt3001: fix mutex unlock race
        iio: adc: ad799x: fix probe error handling
        iio: light: add missing vcnl4040 of_compatible
        iio: light: fix vcnl4000 devicetree hooks
        iio: imu: st_lsm6dsx: fix waitime for st_lsm6dsx i2c controller
        iio: adc: axp288: Override TS pin bias current for some models
        iio: imu: adis16400: fix memory leak
        iio: imu: adis16400: release allocated memory on failure
        iio: adc: stm32-adc: fix a race when using several adcs with dma and irq
        iio: adc: stm32-adc: move registers definitions
        iio: accel: adxl372: Perform a reset at start up
        iio: accel: adxl372: Fix push to buffers lost samples
        iio: accel: adxl372: Fix/remove limitation for FIFO samples
        iio: adc: hx711: fix bug in sampling of data
        staging: vt6655: Fix memory leak in vt6655_probe
        staging: exfat: Use kvzalloc() instead of kzalloc() for exfat_sb_info
        Staging: fbtft: fix memory leak in fbtft_framebuffer_alloc
        staging: speakup: document sysfs attributes
        staging: rtl8188eu: fix HighestRate check in odm_ARFBRefresh_8188E()
        ...
    • Merge tag 'tty-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 82c87e7d
      Committed by Linus Torvalds
      Pull tty/serial driver fixes from Greg KH:
       "Here are some small tty and serial driver fixes for 5.4-rc3 that
        resolve a number of reported issues and regressions.
      
        None of these are huge, full details are in the shortlog. There's also
        a MAINTAINERS update that I think you might have already taken in your
        tree already, but git should handle that merge easily.
      
        All have been in linux-next with no reported issues"
      
      * tag 'tty-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        MAINTAINERS: kgdb: Add myself as a reviewer for kgdb/kdb
        tty: serial: imx: Use platform_get_irq_optional() for optional IRQs
        serial: fix kernel-doc warning in comments
        serial: 8250_omap: Fix gpio check for auto RTS/CTS
        serial: mctrl_gpio: Check for NULL pointer
        tty: serial: fsl_lpuart: Fix lpuart_flush_buffer()
        tty: serial: Fix PORT_LINFLEXUART definition
        tty: n_hdlc: fix build on SPARC
        serial: uartps: Fix uartps_major handling
        serial: uartlite: fix exit path null pointer
        tty: serial: linflexuart: Fix magic SysRq handling
        serial: sh-sci: Use platform_get_irq_optional() for optional interrupts
        dt-bindings: serial: sh-sci: Document r8a774b1 bindings
        serial/sifive: select SERIAL_EARLYCON
        tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
        tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
      82c87e7d
    • L
      Merge tag 'usb-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 6c90bbd0
      Linus Torvalds committed
      Pull USB fixes from Greg KH:
       "Here are a lot of small USB driver fixes for 5.4-rc3.
      
        syzbot has stepped up its testing of the USB driver stack, now able to
        trigger fun race conditions between disconnect and probe functions.
        Because of that we have a lot of fixes in here from Johan and others
        fixing these reported issues that have been around since almost all
        time.
      
        We also are just deleting the rio500 driver, making all of the syzbot
        bugs found in it moot as it turns out no one has been using it for
        years as there is a userspace version that is being used instead.
      
        There are also a number of other small fixes in here, all resolving
        reported issues or regressions.
      
        All have been in linux-next without any reported issues"
      
      * tag 'usb-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (65 commits)
        USB: yurex: fix NULL-derefs on disconnect
        USB: iowarrior: use pr_err()
        USB: iowarrior: drop redundant iowarrior mutex
        USB: iowarrior: drop redundant disconnect mutex
        USB: iowarrior: fix use-after-free after driver unbind
        USB: iowarrior: fix use-after-free on release
        USB: iowarrior: fix use-after-free on disconnect
        USB: chaoskey: fix use-after-free on release
        USB: adutux: fix use-after-free on release
        USB: ldusb: fix NULL-derefs on driver unbind
        USB: legousbtower: fix use-after-free on release
        usb: cdns3: Fix for incorrect DMA mask.
        usb: cdns3: fix cdns3_core_init_role()
        usb: cdns3: gadget: Fix full-speed mode
        USB: usb-skeleton: drop redundant in-urb check
        USB: usb-skeleton: fix use-after-free after driver unbind
        USB: usb-skeleton: fix NULL-deref on disconnect
        usb:cdns3: Fix for CV CH9 running with g_zero driver.
        usb: dwc3: Remove dev_err() on platform_get_irq() failure
        usb: dwc3: Switch to platform_get_irq_byname_optional()
        ...
      6c90bbd0
    • L
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 328fefad
      Linus Torvalds committed
      Pull scheduler fixes from Ingo Molnar:
       "Two fixes: a guest-cputime accounting fix, and a cgroup bandwidth
        quota precision fix"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/vtime: Fix guest/system mis-accounting on task switch
        sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision
      328fefad
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 465a7e29
      Linus Torvalds committed
      Pull perf fixes from Ingo Molnar:
       "Mostly tooling fixes, but also a couple of updates for new Intel
        models (which are technically hw-enablement, but to users it's a fix
        to perf behavior on those new CPUs - hope this is fine), an AUX
        inheritance fix, event time-sharing fix, and a fix for lost non-perf
        NMI events on AMD systems"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
        perf/x86/cstate: Add Tiger Lake CPU support
        perf/x86/msr: Add Tiger Lake CPU support
        perf/x86/intel: Add Tiger Lake CPU support
        perf/x86/cstate: Update C-state counters for Ice Lake
        perf/x86/msr: Add new CPU model numbers for Ice Lake
        perf/x86/cstate: Add Comet Lake CPU support
        perf/x86/msr: Add Comet Lake CPU support
        perf/x86/intel: Add Comet Lake CPU support
        perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp
        perf/core: Fix corner case in perf_rotate_context()
        perf/core: Rework memory accounting in perf_mmap()
        perf/core: Fix inheritance of aux_output groups
        perf annotate: Don't return -1 for error when doing BPF disassembly
        perf annotate: Return appropriate error code for allocation failures
        perf annotate: Fix arch specific ->init() failure errors
        perf annotate: Propagate the symbol__annotate() error return
        perf annotate: Fix the signedness of failure returns
        perf annotate: Propagate perf_env__arch() error
        perf evsel: Fall back to global 'perf_env' in perf_evsel__env()
        perf tools: Propagate get_cpuid() error
        ...
      465a7e29
    • L
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9b4e40c8
      Linus Torvalds committed
      Pull EFI fixes from Ingo Molnar:
       "Misc EFI fixes all across the map: CPER error report fixes, fixes to
        TPM event log parsing, fix for a kexec hang, a Sparse fix and other
        fixes"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi/tpm: Fix sanity check of unsigned tbl_size being less than zero
        efi/x86: Do not clean dummy variable in kexec path
        efi: Make unexported efi_rci2_sysfs_init() static
        efi/tpm: Only set 'efi_tpm_final_log_size' after successful event log parsing
        efi/tpm: Don't traverse an event log with no events
        efi/tpm: Don't access event->count when it isn't mapped
        efivar/ssdt: Don't iterate over EFI vars if no SSDT override was specified
        efi/cper: Fix endianness of PCIe class code
      9b4e40c8
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fcb45a28
      Linus Torvalds committed
      Pull x86 fixes from Ingo Molnar:
       "A handful of fixes: a kexec linking fix, an AMD MWAITX fix, a vmware
        guest support fix when built under Clang, and new CPU model number
        definitions"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Add Comet Lake to the Intel CPU models header
        lib/string: Make memzero_explicit() inline instead of external
        x86/cpu/vmware: Use the full form of INL in VMWARE_PORT
        x86/asm: Fix MWAITX C-state hint value
      fcb45a28
    • L
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e9ec3588
      Linus Torvalds committed
      Pull x86 license tag fixlets from Ingo Molnar:
       "Fix a couple of SPDX tags in x86 headers to follow the canonical
        pattern"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86: Use the correct SPDX License Identifier in headers
      e9ec3588
    • L
      Merge tag 'riscv/for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 48acba98
      Linus Torvalds committed
      Pull RISC-V fixes from Paul Walmsley:
      
       - Fix several bugs in the breakpoint trap handler
      
       - Drop an unnecessary loop around calls to preempt_schedule_irq()
      
      * tag 'riscv/for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: entry: Remove unneeded need_resched() loop
        riscv: Correct the handling of unexpected ebreak in do_trap_break()
        riscv: avoid sending a SIGTRAP to a user thread trapped in WARN()
        riscv: avoid kernel hangs when trapped in BUG()
      48acba98
    • L
      Merge tag 'mips_fixes_5.4_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 63f9bff5
      Linus Torvalds committed
      Pull MIPS fixes from Paul Burton:
      
       - Build fixes for CONFIG_OPTIMIZE_INLINING=y builds in which the
         compiler may choose not to inline __xchg() & __cmpxchg().
      
       - A build fix for Loongson configurations with GCC 9.x.
      
       - Expose some extra HWCAP bits to indicate support for various
         instruction set extensions to userland.
      
       - Fix bad stack access in firmware handling code for old SNI
         RM200/300/400 machines.
      
      * tag 'mips_fixes_5.4_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: Disable Loongson MMI instructions for kernel build
        MIPS: elf_hwcap: Export userspace ASEs
        MIPS: fw: sni: Fix out of bounds init of o32 stack
        MIPS: include: Mark __xchg as __always_inline
        MIPS: include: Mark __cmpxchg as __always_inline
      63f9bff5
    • L
      Merge tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · db60a5a0
      Linus Torvalds committed
      Pull powerpc fixes from Michael Ellerman:
       "Fix a kernel crash in spufs_create_root() on Cell machines, since the
        new mount API went in.
      
        Fix a regression in our KVM code caused by our recent PCR changes.
      
        Avoid a warning message about a failing hypervisor API on systems that
        don't have that API.
      
        A couple of minor build fixes.
      
        Thanks to: Alexey Kardashevskiy, Alistair Popple, Desnes A. Nunes do
        Rosario, Emmanuel Nicolet, Jordan Niethe, Laurent Dufour, Stephen
        Rothwell"
      
      * tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        spufs: fix a crash in spufs_create_root()
        powerpc/kvm: Fix kvmppc_vcore->in_guest value in kvmhv_switch_to_host
        selftests/powerpc: Fix compile error on tlbie_test due to newer gcc
        powerpc/pseries: Remove confusing warning message.
        powerpc/64s/radix: Fix build failure with RADIX_MMU=n
      db60a5a0
    • L
      Merge tag 'for-linus-5.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 680b5b3c
      Linus Torvalds committed
      Pull xen fixes from Juergen Gross:
      
       - correct panic handling when running as a Xen guest
      
       - cleanup the Xen grant driver to remove printing a pointer being
         always NULL
      
       - remove a soon to be wrong call of of_dma_configure()
      
      * tag 'for-linus-5.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: Stop abusing DT of_dma_configure API
        xen/grant-table: remove unnecessary printing
        x86/xen: Return from panic notifier
      680b5b3c
    • L
      Merge tag 's390-5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · f154988a
      Linus Torvalds committed
      Pull s390 fixes from Vasily Gorbik:
      
       - Fix virtio-ccw DMA regression
      
       - Fix compiler warnings in uaccess
      
      * tag 's390-5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/uaccess: avoid (false positive) compiler warnings
        s390/cio: fix virtio-ccw DMA without PV
      f154988a
  5. 12 Oct, 2019 8 commits