1. 02 Nov, 2017 3 commits
    • P
      irqchip: mips-gic: Use irq_cpu_online to (un)mask all-VP(E) IRQs · da61fcf9
      Paul Burton committed
      The gic_all_vpes_local_irq_controller chip currently attempts to operate
      on all CPUs/VPs in the system when masking or unmasking an interrupt.
      This has a few drawbacks:
      
       - In multi-cluster systems we may not always have access to all CPUs in
         the system. When all CPUs in a cluster are powered down that
         cluster's GIC may also power down, in which case we cannot configure
         its state.
      
       - Relatedly, if we power down a cluster after having configured
         interrupts for CPUs within it then the cluster's GIC may lose state &
         we need to reconfigure it. The current approach doesn't take this
         into account.
      
       - It's wasteful if we run Linux on fewer VPs than are present in the
         system. For example if we run a uniprocessor kernel on CPU0 of a
         system with 16 CPUs then there's no point in us configuring CPUs
         1-15.
      
       - The implementation is also lacking in that it expects the range
         0..gic_vpes-1 to represent valid Linux CPU numbers which may not
         always be the case - for example if we run on a system with more VPs
         than the kernel is configured to support.
      
      Fix all of these issues by only configuring the affected interrupts for
      CPUs which are online at the time, and recording the configuration in a
      new struct gic_all_vpes_chip_data for later use by CPUs being brought
      online. We register a CPU hotplug state (reusing
      CPUHP_AP_IRQ_GIC_STARTING which the ARM GIC driver uses, and which seems
      suitably generic for reuse with the MIPS GIC) and execute
      irq_cpu_online() in order to configure the interrupts on the newly
      onlined CPU.
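
The record-and-replay idea described above can be sketched in plain C. This is only an illustration under stated assumptions: `MAX_CPUS`, `struct all_cpus_irq_cfg`, `struct cpu_state`, `cfg_set_mask()` and `cpu_online_hook()` are hypothetical names, and per-CPU hardware state is modeled as a simple array rather than GIC registers.

```c
#include <stdbool.h>

#define MAX_CPUS 4	/* hypothetical system size */

/* Recorded configuration, loosely modeled on the "chip data" idea. */
struct all_cpus_irq_cfg {
	bool mask;		/* desired mask state for every CPU */
};

/* Stand-in for per-CPU interrupt controller state. */
struct cpu_state {
	bool online;
	bool irq_masked;
};

static struct cpu_state cpus[MAX_CPUS];

/* Record the setting, then apply it only to CPUs that are online now. */
static void cfg_set_mask(struct all_cpus_irq_cfg *cfg, bool mask)
{
	cfg->mask = mask;
	for (int cpu = 0; cpu < MAX_CPUS; cpu++)
		if (cpus[cpu].online)
			cpus[cpu].irq_masked = mask;
}

/* Hotplug-style hook: replay the recorded setting on a newly onlined CPU. */
static void cpu_online_hook(const struct all_cpus_irq_cfg *cfg, int cpu)
{
	cpus[cpu].online = true;
	cpus[cpu].irq_masked = cfg->mask;
}
```

Offline CPUs are never touched by `cfg_set_mask()`, which is the property the commit message asks for; they pick up the recorded state only when the online hook runs.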
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      da61fcf9
    • D
      irqdomain: Update the comments of fwnode field of irq_domain structure · 4b821300
      Dou Liyang committed
      Commit:
      
      f110711a ("irqdomain: Convert irqdomain->of_node to fwnode")
      
      converted of_node field to fwnode, but didn't update its comments.
      
      Update it.
      
      Fixes: f110711a ("irqdomain: Convert irqdomain->of_node to fwnode")
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      4b821300
    • M
      irqchip/gic-v3-its: Setup VLPI properties at map time · d4d7b4ad
      Marc Zyngier committed
      So far, we require the hypervisor to update the VLPI properties
      once the VLPI mapping has been established. While this
      makes it easy for the ITS driver, it creates a window where
      an incoming interrupt can be delivered with an unknown set
      of properties. Not very nice.
      
      Instead, let's add a "properties" field to the mapping structure,
      and use that to configure the VLPI before it actually gets mapped.
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      d4d7b4ad
  2. 19 Oct, 2017 4 commits
    • M
      irqchip/gic-v3-its: Limit scope of VPE mapping to be per ITS · 2247e1bf
      Marc Zyngier committed
      So far, we map all VPEs on all ITSs. While this is not wrong,
      this is quite a big hammer, as moving a VPE around requires
      all ITSs to be synchronized. Needless to say, this is an
      expensive proposition.
      
      Instead, let's switch to a mode where we issue VMAPP commands
      only on ITSs that are actually involved in reporting interrupts
      to the given VM.
      
      For that purpose, we refcount the number of interrupts that are
      mapped for this VM on each ITS, performing the map/unmap
      operations as required. It then allows us to use this refcount
      to only issue VMOVP to the ITSs that need to know about this
      VM.
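
The per-ITS refcounting described above can be sketched as follows. This is a minimal illustration, not the driver's code: `ITS_LIST_MAX`, `struct vm_its_state` and the three helpers are hypothetical names, and issuing the actual map, unmap and move commands is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>

#define ITS_LIST_MAX 16	/* hypothetical upper bound on the number of ITSs */

/* Hypothetical per-VM bookkeeping: one mapped-interrupt count per ITS. */
struct vm_its_state {
	unsigned int vlpi_count[ITS_LIST_MAX];
};

/* Returns true when the caller should issue the map command (first user). */
static bool vm_its_get(struct vm_its_state *vm, size_t its)
{
	return vm->vlpi_count[its]++ == 0;
}

/* Returns true when the caller should issue the unmap command (last user). */
static bool vm_its_put(struct vm_its_state *vm, size_t its)
{
	return --vm->vlpi_count[its] == 0;
}

/* Bitmask of the ITSs that currently need to be told about a VPE move. */
static unsigned int vm_its_move_targets(const struct vm_its_state *vm)
{
	unsigned int mask = 0;

	for (size_t i = 0; i < ITS_LIST_MAX; i++)
		if (vm->vlpi_count[i] > 0)
			mask |= 1u << i;
	return mask;
}
```

Only ITSs with a non-zero count show up in the move mask, so a VM that uses a single ITS never forces the others to synchronize.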
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      2247e1bf
    • M
      irqchip/gic-v3-its: Make GICv4_ITS_LIST_MAX globally available · ab60491e
      Marc Zyngier committed
      As we're about to make use of the maximum number of ITSs in
      a GICv4 system, let's make this value global (and rename it to
      GICv4_ITS_LIST_MAX).
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      ab60491e
    • S
      irqchip/gic-v3: Add support for Range Selector (RS) feature · eda0d04a
      Shanker Donthineni committed
      A new feature Range Selector (RS) has been added to GIC specification
      in order to support more than 16 CPUs at affinity level 0. New fields
      are introduced in SGI system registers (ICC_SGI0R_EL1, ICC_SGI1R_EL1
      and ICC_ASGI1R_EL1) to relax an artificial limit of 16 at level 0.
      
      - A new RSS field in ICC_CTLR_EL3, ICC_CTLR_EL1 and ICV_CTLR_EL1:
        [18] - Range Selector Support (RSS)
        0b0 = Targeted SGIs with affinity level 0 values of 0-15 are supported.
        0b1 = Targeted SGIs with affinity level 0 values of 0-255 are supported.
      
      - A new RS field in ICC_SGI0R_EL1, ICC_SGI1R_EL1 and ICC_ASGI1R_EL1:
        [47:44] - RangeSelector (RS): selects which group of 16 values the TargetList[n] field represents.
                  TargetList[n] represents aff0 value ((RS*16)+n)
                  When ICC_CTLR_EL3.RSS==0 or ICC_CTLR_EL1.RSS==0, RS is RES0.
      
      - A new RSS field in GICD_TYPER:
        [26] - Range Selector Support (RSS)
        0b0 = Targeted SGIs with affinity level 0 values of 0-15 are supported.
        0b1 = Targeted SGIs with affinity level 0 values of 0-255 are supported.
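
The relation aff0 = (RS*16)+n can be illustrated with a small encoder. This is a sketch under assumptions: `sgi_rs_and_target()` is a hypothetical helper, the RS position is the one given above, and TargetList is assumed to occupy the low 16 bits of the register value. The other SGI register fields are left out.

```c
#include <stdint.h>

#define SGI_RS_SHIFT	44	/* RS occupies bits [47:44], per the text above */
#define SGI_RS_MASK	0xfULL

/*
 * Hypothetical helper: for an affinity level 0 value (0-255), compute RS
 * and the TargetList bit n such that aff0 == (RS * 16) + n, and place both
 * in a 64-bit value. TargetList is assumed to be bits [15:0].
 */
static uint64_t sgi_rs_and_target(uint8_t aff0)
{
	uint64_t rs = aff0 / 16;		/* which group of 16 CPUs */
	uint64_t target = 1ULL << (aff0 % 16);	/* bit n within that group */

	return ((rs & SGI_RS_MASK) << SGI_RS_SHIFT) | target;
}
```

For example, aff0 = 37 falls in group RS = 2 with n = 5, since 2*16 + 5 = 37. When RSS is 0 only aff0 values 0-15 are usable, which corresponds to RS staying 0.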
      Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      eda0d04a
    • M
      irqdomain: Move revmap_trees_mutex to struct irq_domain · f1d78358
      Masahiro Yamada committed
      The revmap_trees_mutex protects domain->revmap_tree.  There is no
      need to make it global because it is allowed to modify revmap_tree
      of two different domains concurrently.  Having said that, this would
      not be an actual bottleneck because the interrupt map/unmap does not
      occur quite often.  Rather, the motivation is to tidy up the code
      from a data structure point of view.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      f1d78358
  3. 17 Oct, 2017 1 commit
  4. 29 Sep, 2017 6 commits
  5. 28 Sep, 2017 1 commit
    • K
      timer: Prepare to change timer callback argument type · 686fef92
      Kees Cook committed
      Modern kernel callback systems pass the structure associated with a
      given callback to the callback function. The timer callback remains one
      of the legacy cases where an arbitrary unsigned long argument continues
      to be passed as the callback argument. This has several problems:
      
      - This bloats the timer_list structure with a normally redundant
        .data field.
      
      - No type checking is being performed, forcing callbacks to do
        explicit type casts of the unsigned long argument into the object
        that was passed, rather than using container_of(), as done in most
        of the other callback infrastructure.
      
      - Neighboring buffer overflows can overwrite both the .function and
        the .data field, providing attackers with a way to elevate from a buffer
        overflow into a simplistic ROP-like mechanism that allows calling
        arbitrary functions with a controlled first argument.
      
      - For future Control Flow Integrity work, this creates a unique function
        prototype for timer callbacks, instead of allowing them to continue to
        be clustered with other void functions that take a single unsigned long
        argument.
      
      This adds a new timer initialization API, which will ultimately replace
      the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family,
      named timer_setup() (to mirror hrtimer_setup(), making instances of its
      use much easier to grep for).
      
      In order to support the migration of existing timers into the new
      callback arguments, timer_setup() casts its arguments to the existing
      legacy types, and explicitly passes the timer pointer as the legacy
      data argument. Once all setup_*timer() callers have been replaced with
      timer_setup(), the casts can be removed, and the data argument can be
      dropped with the timer expiration code changed to just pass the timer
      to the callback directly.
      
      Since the regular pattern of using container_of() during local variable
      declaration repeats the need for the variable type declaration
      to be included, this adds a helper modeled after other from_*()
      helpers that wrap container_of(), named from_timer(). This helper uses
      typeof(*variable), removing the type redundancy and minimizing the need
      for line wraps in forthcoming conversions from "unsigned long data" to
      "struct timer_list *" in the timer callbacks:
      
      -void callback(unsigned long data)
      +void callback(struct timer_list *t)
      {
      -   struct some_data_structure *local = (struct some_data_structure *)data;
      +   struct some_data_structure *local = from_timer(local, t, timer);
      
      Finally, in order to support the handful of timer users that perform
      open-coded assignments of the .function (and .data) fields, provide
      cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used
      temporarily. Once conversion has been completed, these can be globally
      trivially removed.
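
The container_of()-based callback pattern described in this message can be tried out in userspace. The definitions below are simplified stand-ins written for illustration, not the kernel's own: `struct timer_list` is reduced to a single function pointer, and `__typeof__` is a GCC/Clang extension.

```c
#include <stddef.h>

/* Userspace stand-in; the real kernel macro adds type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-in for the kernel structure of the same name. */
struct timer_list {
	void (*function)(struct timer_list *t);
};

/* Modeled on the helper described above: the type comes from the variable. */
#define from_timer(var, timer_ptr, field) \
	container_of(timer_ptr, __typeof__(*(var)), field)

/* Hypothetical driver structure that embeds its timer. */
struct some_data_structure {
	int expirations;
	struct timer_list timer;
};

/* The callback receives the timer and recovers its containing object. */
static void callback(struct timer_list *t)
{
	struct some_data_structure *local = from_timer(local, t, timer);

	local->expirations++;
}
```

Because the callback is handed the timer itself, no separate data word is needed and no cast from `unsigned long` appears anywhere.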
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20170928133817.GA113410@beast
      686fef92
  6. 27 Sep, 2017 1 commit
  7. 26 Sep, 2017 13 commits
    • P
      smp/hotplug: Hotplug state fail injection · 1db49484
      Peter Zijlstra committed
      Add a sysfs file to one-time fail a specific state. This can be used
      to test the state rollback code paths.
      
      Something like this (hotplug-up.sh):
      
        #!/bin/bash
      
        echo 0 > /debug/sched_debug
        echo 1 > /debug/tracing/events/cpuhp/enable
      
        ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
        STATES=${1:-$ALL_STATES}
      
        for state in $STATES
        do
      	  echo 0 > /sys/devices/system/cpu/cpu1/online
      	  echo 0 > /debug/tracing/trace
      	  echo Fail state: $state
      	  echo $state > /sys/devices/system/cpu/cpu1/hotplug/fail
      	  cat /sys/devices/system/cpu/cpu1/hotplug/fail
      	  echo 1 > /sys/devices/system/cpu/cpu1/online
      
      	  cat /debug/tracing/trace > hotfail-${state}.trace
      
      	  sleep 1
        done
      
      Can be used to test for all possible rollback (barring multi-instance)
      scenarios on CPU-up, CPU-down is a trivial modification of the above.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bigeasy@linutronix.de
      Cc: efault@gmx.de
      Cc: rostedt@goodmis.org
      Cc: max.byungchul.park@gmail.com
      Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org
      
      1db49484
    • P
      smp/hotplug: Add state diagram · fac1c204
      Peter Zijlstra committed
      Add a state diagram to clarify when which states are run where.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bigeasy@linutronix.de
      Cc: efault@gmx.de
      Cc: rostedt@goodmis.org
      Cc: max.byungchul.park@gmail.com
      Link: https://lkml.kernel.org/r/20170920170546.661598270@infradead.org
      
      fac1c204
    • J
      nvmet-fc: sync header templates with comments · 6b71f9e1
      James Smart committed
      Comments were incorrect:
      - defer_rcv was in host port template. moved to target port template
      - Added Mandatory statements for target port template items
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      6b71f9e1
    • T
      genirq/matrix: Add tracepoints · ec0f7cd2
      Thomas Gleixner committed
      Add tracepoints for the irq bitmap matrix allocator.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213153.279468022@linutronix.de
      ec0f7cd2
    • T
      genirq: Implement bitmap matrix allocator · 2f75d9e1
      Thomas Gleixner committed
      Implement the infrastructure for a simple bitmap based allocator, which
      will replace the x86 vector allocator. It's in the core code as other
      architectures might be able to reuse/extend it. For now it only implements
      allocations for single CPUs, but it's simple to add multi CPU allocation
      support if required.
      
      The concept is rather simple:
      
       Global information:
       	system_vector bitmap
      	global accounting
      
       PerCPU information:
       	allocation bitmap
      	managed allocation bitmap
      	local accounting
      
      The system vector bitmap is used to exclude vectors system wide from the
      allocation space.
      
      The allocation bitmap is used to keep track of per cpu used vectors.
      
      The managed allocation bitmap is used to reserve vectors for managed
      interrupts.
      
      When a regular (non managed) interrupt allocation happens then the
      following rule applies:
      
            tmpmap = system_map | alloc_map | managed_map
            find_zero_bit(tmpmap)
      
      Oring the bitmaps together gives the real available space. The same rule
      applies for reserving a managed interrupt vector. But contrary to the
      regular interrupts the reservation only marks the bit in the managed map
      and therefore excludes it from the regular allocations. The managed map is
      only cleaned out when a managed interrupt is completely released and it
      stays alive across CPU offline/online operations.
      
      For managed interrupt allocations the rule is:
      
            tmpmap = managed_map & ~alloc_map
            find_first_bit(tmpmap)
      
      This returns the first bit which is in the managed map, but not yet
      allocated in the allocation map. The allocation marks it in the allocation
      map and hands it back to the caller for use.
      
      The rest of the code are helper functions to handle the various
      requirements and the accounting which are necessary to replace the x86
      vector allocation code. The result is a single patch as the evolution of
      this infrastructure cannot be represented in bits and pieces.
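
The two allocation rules above can be sketched with single-word bitmaps. This is an illustration only: `MATRIX_BITS`, `struct cpu_map` and the function names are hypothetical, each bitmap is assumed to fit in one 64-bit word, and all accounting is omitted.

```c
#include <stdint.h>

#define MATRIX_BITS 64	/* hypothetical: one 64-bit word per bitmap */

struct cpu_map {
	uint64_t alloc_map;	/* vectors in use on this CPU */
	uint64_t managed_map;	/* vectors reserved for managed interrupts */
};

/* Index of the first bit in map equal to 'want' (0 or 1), or -1 if none. */
static int first_bit(uint64_t map, int want)
{
	for (int bit = 0; bit < MATRIX_BITS; bit++)
		if ((int)((map >> bit) & 1) == want)
			return bit;
	return -1;
}

/* Regular allocation: first zero bit in system | alloc | managed. */
static int alloc_regular(uint64_t system_map, struct cpu_map *cm)
{
	int bit = first_bit(system_map | cm->alloc_map | cm->managed_map, 0);

	if (bit >= 0)
		cm->alloc_map |= 1ULL << bit;
	return bit;
}

/* Managed reservation: same search, but only the managed map is marked. */
static int reserve_managed(uint64_t system_map, struct cpu_map *cm)
{
	int bit = first_bit(system_map | cm->alloc_map | cm->managed_map, 0);

	if (bit >= 0)
		cm->managed_map |= 1ULL << bit;
	return bit;
}

/* Managed allocation: first set bit in managed & ~alloc. */
static int alloc_managed(struct cpu_map *cm)
{
	int bit = first_bit(cm->managed_map & ~cm->alloc_map, 1);

	if (bit >= 0)
		cm->alloc_map |= 1ULL << bit;
	return bit;
}
```

A reserved managed vector is invisible to `alloc_regular()` but stays available to `alloc_managed()` until it is actually allocated, which matches the behavior the message describes.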
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213153.185437174@linutronix.de
      2f75d9e1
    • T
      genirq/irqdomain: Add force reactivation flag to irq domains · 22d0b12f
      Thomas Gleixner committed
      Allow irqdomains to tell the core code, that after early activation the
      interrupt needs to be reactivated at request_irq() time.
      
      This allows reservation of vectors at early activation time and actual
      vector assignment at request_irq() time.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213153.106242536@linutronix.de
      22d0b12f
    • T
      genirq/irqdomain: Propagate early activation · 42e1cc2d
      Thomas Gleixner committed
      Propagate the early activation mode to the irqdomain activate()
      callbacks. This is required for the upcoming reservation, late vector
      assignment scheme, so that the early activation call can act accordingly.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213153.028353660@linutronix.de
      42e1cc2d
    • T
      genirq/irqdomain: Allow irq_domain_activate_irq() to fail · bb9b428a
      Thomas Gleixner committed
      Allow irq_domain_activate_irq() to fail. This is required to support a
      reservation and late vector assignment scheme.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213152.933882227@linutronix.de
      bb9b428a
    • T
      genirq/irqdomain: Update irq_domain_ops.activate() signature · 72491643
      Thomas Gleixner committed
      The irq_domain_ops.activate() callback has no return value and no way to
      tell the function that the activation is early.
      
      The upcoming changes to support a reservation scheme which allows to assign
      interrupt vectors on x86 only when the interrupt is actually requested
      requires:
      
        - A return value, so activation can fail at request_irq() time
        
        - Information that the activate invocation is early, i.e. before
          request_irq().
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213152.848490816@linutronix.de
      72491643
    • T
      genirq: Make state consistent for !IRQ_DOMAIN_HIERARCHY · 457f6d35
      Thomas Gleixner committed
      In the !IRQ_DOMAIN_HIERARCHY case the activation stubs are not
      setting/clearing the activation status bits. This is not a problem at the
      moment, but upcoming changes require a correct status.
      
      Add the set/clear invocations to the stub functions and move them to the
      core internal header to avoid duplication and visibility outside the core.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213152.591985591@linutronix.de
      457f6d35
    • T
      irqdomain/debugfs: Provide domain specific debug callback · c3e7239a
      Thomas Gleixner committed
      Some interrupt domains like the X86 vector domain have special requirements
      for debugging, like showing the vector usage on the CPUs.
      
      Add a callback to the irqdomain ops which can be filled in by domains which
      require it and add conditional invocations to the irqdomain and the per irq
      debug files.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213152.512937505@linutronix.de
      c3e7239a
    • T
      genirq/msi: Capture device name for debugfs · 07557ccb
      Thomas Gleixner committed
      For debugging the allocation of unused or potentially leaked interrupt
      descriptor it's helpful to have some information about the site which
      allocated them. In case of MSI this is simple because the caller hands the
      device struct pointer into the domain allocation function.
      
      Duplicate the device name and show it in the debugfs entry of the interrupt
      descriptor.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213152.433038426@linutronix.de
      07557ccb
    • G
      PCI: Add dummy pci_acs_enabled() for CONFIG_PCI=n build · fe594932
      Geert Uytterhoeven committed
      If CONFIG_PCI=n and gcc (e.g. 4.1.2) decides not to inline
      get_pci_function_alias_group(), the build fails with:
      
        drivers/iommu/iommu.o: In function `get_pci_function_alias_group':
        iommu.c:(.text+0xfdc): undefined reference to `pci_acs_enabled'
      
      Due to the various dummies for PCI calls in the CONFIG_PCI=n case,
      pci_acs_enabled() is never called, but not all versions of gcc are smart
      enough to realize that.
      
      While explicitly marking get_pci_function_alias_group() inline would fix
      the build, this would inflate the code for the CONFIG_PCI=y case, as
      get_pci_function_alias_group() is a not-so-small function called from two
      places.
      
      Hence fix the issue by introducing a dummy for pci_acs_enabled() instead.
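
The dummy-stub approach can be sketched in isolation. The names here are hypothetical stand-ins chosen for illustration: `EXAMPLE_HAVE_PCI` plays the role of the config symbol, and `example_acs_enabled()` plays the role of the function that gets a dummy.

```c
#include <stdbool.h>

struct pci_dev;	/* opaque stand-in; only pointers to it are used here */

/* EXAMPLE_HAVE_PCI is a hypothetical stand-in for the real config symbol. */
#ifdef EXAMPLE_HAVE_PCI
bool example_acs_enabled(struct pci_dev *pdev, unsigned short acs_flags);
#else
/* Dummy: without PCI support nothing can have ACS enabled. */
static inline bool example_acs_enabled(struct pci_dev *pdev,
					unsigned short acs_flags)
{
	(void)pdev;
	(void)acs_flags;
	return false;
}
#endif

/* A caller that links in both configurations, inlined or not. */
static bool example_needs_isolation(struct pci_dev *pdev)
{
	return !example_acs_enabled(pdev, 0);
}
```

With the dummy in place the caller always has a definition to link against, so the build no longer depends on the compiler discarding the unreachable call.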
      
      Fixes: 0ae349a0 ("iommu/qcom: Add qcom_iommu")
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
      fe594932
  8. 25 Sep, 2017 6 commits
  9. 22 Sep, 2017 3 commits
    • E
      net: prevent dst uses after free · 222d7dbd
      Eric Dumazet committed
      In linux-4.13, Wei worked hard to convert dst to a traditional
      refcounted model, removing GC.
      
      We now want to make sure a dst refcount can not transition from 0 back
      to 1.
      
      The problem here is that input path attached a not refcounted dst to an
      skb. Then later, because packet is forwarded and hits skb_dst_force()
      before exiting RCU section, we might try to take a refcount on one dst
      that is about to be freed, if another cpu saw 1 -> 0 transition in
      dst_release() and queued the dst for freeing after one RCU grace period.
      
      Let's unify skb_dst_force() and skb_dst_force_safe(), since we should
      always perform the complete check against dst refcount, and not assume
      it is not zero.
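
The "never go from 0 back to 1" rule can be sketched with C11 atomics. This is a userspace illustration under assumptions: `struct obj` and `obj_hold_safe()` are hypothetical names standing in for a refcounted dst entry and its safe hold operation, and RCU is not modeled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for a dst entry. */
struct obj {
	atomic_int refcnt;
};

/*
 * Take a reference only if the count is still non-zero. Once the count has
 * reached zero the object may already be queued for freeing, so the hold
 * must fail instead of resurrecting it.
 */
static bool obj_hold_safe(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is updated to the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller that gets `false` back has to treat the object as gone, which is the complete check the message argues should always be performed.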
      
      Bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=197005
      
      [  989.919496]  skb_dst_force+0x32/0x34
      [  989.919498]  __dev_queue_xmit+0x1ad/0x482
      [  989.919501]  ? eth_header+0x28/0xc6
      [  989.919502]  dev_queue_xmit+0xb/0xd
      [  989.919504]  neigh_connected_output+0x9b/0xb4
      [  989.919507]  ip_finish_output2+0x234/0x294
      [  989.919509]  ? ipt_do_table+0x369/0x388
      [  989.919510]  ip_finish_output+0x12c/0x13f
      [  989.919512]  ip_output+0x53/0x87
      [  989.919513]  ip_forward_finish+0x53/0x5a
      [  989.919515]  ip_forward+0x2cb/0x3e6
      [  989.919516]  ? pskb_trim_rcsum.part.9+0x4b/0x4b
      [  989.919518]  ip_rcv_finish+0x2e2/0x321
      [  989.919519]  ip_rcv+0x26f/0x2eb
      [  989.919522]  ? vlan_do_receive+0x4f/0x289
      [  989.919523]  __netif_receive_skb_core+0x467/0x50b
      [  989.919526]  ? tcp_gro_receive+0x239/0x239
      [  989.919529]  ? inet_gro_receive+0x226/0x238
      [  989.919530]  __netif_receive_skb+0x4d/0x5f
      [  989.919532]  netif_receive_skb_internal+0x5c/0xaf
      [  989.919533]  napi_gro_receive+0x45/0x81
      [  989.919536]  ixgbe_poll+0xc8a/0xf09
      [  989.919539]  ? kmem_cache_free_bulk+0x1b6/0x1f7
      [  989.919540]  net_rx_action+0xf4/0x266
      [  989.919543]  __do_softirq+0xa8/0x19d
      [  989.919545]  irq_exit+0x5d/0x6b
      [  989.919546]  do_IRQ+0x9c/0xb5
      [  989.919548]  common_interrupt+0x93/0x93
      [  989.919548]  </IRQ>
      
      Similarly dst_clone() can use dst_hold() helper to have additional
      debugging, as a follow up to commit 44ebe791 ("net: add debug
      atomic_inc_not_zero() in dst_hold()")
      
      In net-next we will convert dst atomic_t to refcount_t for peace of
      mind.
      
      Fixes: a4c2fd7f ("net: remove DST_NOCACHE flag")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
      Bisected-by: Paweł Staszewski <pstaszewski@itcare.pl>
      Acked-by: Wei Wang <weiwan@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      222d7dbd
    • D
      Input: uinput - avoid FF flush when destroying device · e8b95728
      Dmitry Torokhov committed
      Normally, when input device supporting force feedback effects is being
      destroyed, we try to "flush" currently playing effects, so that the
      physical device does not continue vibrating (or executing other effects).
      Unfortunately this does not work well for uinput as flushing of the effects
      deadlocks with the destroy action:
      
      - if device is being destroyed because the file descriptor is being closed,
        then there is no one to even service FF requests;
      
      - if device is being destroyed because userspace sent UI_DEV_DESTROY,
        while theoretically it could be possible to service FF requests,
        userspace is unlikely to do so (they'd need to make sure FF handling
        happens on a separate thread) even if kernel solves the issue with FF
        ioctls deadlocking with UI_DEV_DESTROY ioctl on udev->mutex.
      
      To avoid lockups like the one below, let's install a custom input device
      flush handler, and avoid trying to flush force feedback effects when we
      are destroying the device, and instead rely on uinput to shut off the device
      properly.
      
      NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
      ...
       <<EOE>>  [<ffffffff817a0307>] _raw_spin_lock_irqsave+0x37/0x40
       [<ffffffff810e633d>] complete+0x1d/0x50
       [<ffffffffa00ba08c>] uinput_request_done+0x3c/0x40 [uinput]
       [<ffffffffa00ba587>] uinput_request_submit.part.7+0x47/0xb0 [uinput]
       [<ffffffffa00bb62b>] uinput_dev_erase_effect+0x5b/0x76 [uinput]
       [<ffffffff815d91ad>] erase_effect+0xad/0xf0
       [<ffffffff815d929d>] flush_effects+0x4d/0x90
       [<ffffffff815d4cc0>] input_flush_device+0x40/0x60
       [<ffffffff815daf1c>] evdev_cleanup+0xac/0xc0
       [<ffffffff815daf5b>] evdev_disconnect+0x2b/0x60
       [<ffffffff815d74ac>] __input_unregister_device+0xac/0x150
       [<ffffffff815d75f7>] input_unregister_device+0x47/0x70
       [<ffffffffa00bac45>] uinput_destroy_device+0xb5/0xc0 [uinput]
       [<ffffffffa00bb2de>] uinput_ioctl_handler.isra.9+0x65e/0x740 [uinput]
       [<ffffffff811231ab>] ? do_futex+0x12b/0xad0
       [<ffffffffa00bb3f8>] uinput_ioctl+0x18/0x20 [uinput]
       [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
       [<ffffffff81337553>] ? security_file_ioctl+0x43/0x60
       [<ffffffff812414a9>] SyS_ioctl+0x79/0x90
       [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
      Reported-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
      Reported-by: Clément VUCHENER <clement.vuchener@gmail.com>
      Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=193741
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      e8b95728
    • F
      net: ethtool: Add back transceiver type · 19cab887
      Florian Fainelli committed
      Commit 3f1ac7a7 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
      deprecated the ethtool_cmd::transceiver field, which was fine in
      premise, except that the PHY library was actually using it to report the
      type of transceiver: internal or external.
      
      Use the first word of the reserved field to put this __u8 transceiver
      field back in. It is made read-only, and we don't expect the
      ETHTOOL_xLINKSETTINGS API to be doing anything with this anyway, so this
      is mostly for the legacy path where we do:
      
      ethtool_get_settings()
      -> dev->ethtool_ops->get_link_ksettings()
         -> convert_link_ksettings_to_legacy_settings()
      
      to have no information loss compared to the legacy get_settings API.
      
      Fixes: 3f1ac7a7 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      19cab887
  10. 21 Sep, 2017 2 commits