1. 04 Oct, 2006 40 commits
    • J
      [PATCH] RCU: add fake writers to rcutorture · b772e1dd
      Josh Triplett committed
      rcutorture currently has one writer and an arbitrary number of readers.  To
      better exercise some of the code paths in RCU implementations, add fake
      writer threads which call the synchronize function for the RCU variant in a
      loop, with a delay between calls to arrange for different numbers of
      writers running in parallel.
      
      [bunk@stusta.de: cleanup]
      Acked-by: Paul McKenney <paulmck@us.ibm.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Josh Triplett <josh@freedesktop.org>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b772e1dd
    • J
      [PATCH] rcu: Fix sign bug making rcu_random always return the same sequence · 75cfef32
      Josh Triplett committed
      rcu_random uses a counter rrs_count to occasionally mix data from
      get_random_bytes into the state of its pseudorandom generator.  However,
      rrs_count gets declared as an unsigned long, and rcu_random checks
      for --rrs_count < 0, so this code will never mix any real random data into
      the state, and will thus always return the same sequence of random numbers.
      
      Also, change the return value of rcu_random from long to unsigned long, to
      avoid potential issues caused by the use of the % operator, which can
      return negative values for negative left operands.
      Signed-off-by: Josh Triplett <josh@freedesktop.org>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      75cfef32
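The sign bug described in the commit above can be reproduced in a few lines of userspace C. This is only a sketch: the function names and the reseed interval of 100 are assumptions made for illustration, not the rcutorture code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the counter check; not the kernel code. */

/* Buggy shape: an unsigned counter wraps around instead of going negative,
 * so the "< 0" test can never be true. */
static bool reseed_due_unsigned(unsigned long *count)
{
	return --(*count) < 0;
}

/* Fixed shape: a signed counter does drop below zero. */
static bool reseed_due_signed(long *count)
{
	if (--(*count) < 0) {
		*count = 100;	/* assumed reseed interval */
		return true;
	}
	return false;
}

/* In C99 and later, % takes the sign of its left operand. */
static long remainder_signed(long x, long n)
{
	return x % n;
}
```

Decrementing an unsigned zero yields the largest unsigned value, so the buggy check silently never fires. The last helper shows why an unsigned return type is the safer choice for a generator whose callers use `%`.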
    • J
      [PATCH] rcu: Avoid kthread_stop on invalid pointer if rcutorture reader startup fails · 2860aaba
      Josh Triplett committed
      rcu_torture_init kmallocs the array of reader threads, then creates each
      one with kthread_run, cleaning up with rcu_torture_cleanup if this fails.
      rcu_torture_cleanup calls kthread_stop on any non-NULL pointer in the
      array; however, any readers after the one that failed to start up will have
      invalid pointers, not null pointers.  Avoid this by using kzalloc instead.
      Signed-off-by: Josh Triplett <josh@freedesktop.org>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      2860aaba
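A userspace analogue of the fix above, with calloc standing in for kzalloc, might look like the following sketch. The `struct worker` type and both function names are hypothetical; the point is only that a zeroed array lets the cleanup loop skip slots that were never started.

```c
#include <stdlib.h>

struct worker { int id; };	/* hypothetical stand-in for a reader task */

/* Stop every worker that was started; NULL slots are skipped. */
static int cleanup_workers(struct worker **w, int n)
{
	int stopped = 0;

	for (int i = 0; i < n; i++) {
		if (w[i]) {
			free(w[i]);
			w[i] = NULL;
			stopped++;
		}
	}
	return stopped;
}

/* Start n workers, simulating a startup failure at index fail_at.
 * Returns how many workers the cleanup path had to stop. */
static int start_workers(int n, int fail_at)
{
	/* Zeroed allocation: slots after a failure stay NULL, not garbage. */
	struct worker **w = calloc((size_t)n, sizeof(*w));
	int stopped;

	if (!w)
		return -1;
	for (int i = 0; i < n; i++) {
		if (i == fail_at)
			break;		/* simulated startup failure */
		w[i] = malloc(sizeof(**w));
		if (!w[i])
			break;
		w[i]->id = i;
	}
	stopped = cleanup_workers(w, n);
	free(w);
	return stopped;
}
```

With an unzeroed allocation, the cleanup loop would test uninitialized pointers and could try to stop workers that never existed.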
    • J
      [PATCH] rcu: Mention rcu_bh in description of rcutorture's torture_type parameter · 3c29e03d
      Josh Triplett committed
      The comment for rcutorture's torture_type parameter only lists the RCU
      variants rcu and srcu, but not rcu_bh; add rcu_bh to the list.
      Signed-off-by: Josh Triplett <josh@freedesktop.org>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      3c29e03d
    • A
      [PATCH] cpufreq: make the transition_notifier chain use SRCU · b4dfdbb3
      Alan Stern committed
      This patch (as762) changes the cpufreq_transition_notifier_list from a
      blocking_notifier_head to an srcu_notifier_head.  This will prevent errors
      caused by attempting to call down_read() to access the notifier chain at a
      time when interrupts must remain disabled, during system suspend.
      
      It's not clear to me whether this is really necessary; perhaps the chain
      could be made into an atomic_notifier.  However a couple of the callout
      routines do use blocking operations, so this approach seems safer.
      
      The head of the notifier chain needs to be initialized before use; this is
      done by an __init routine at core_initcall time.  If this turns out not to
      be a good choice, it can easily be changed.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: Jesse Brandeburg <jesse.brandeburg@gmail.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b4dfdbb3
    • A
      [PATCH] SRCU: report out-of-memory errors · e6a92013
      Alan Stern committed
      Currently the init_srcu_struct() routine has no way to report out-of-memory
      errors.  This patch (as761) makes it return -ENOMEM when the per-cpu data
      allocation fails.
      
      The patch also makes srcu_init_notifier_head() report a BUG if a notifier
      head can't be initialized.  Perhaps it should return -ENOMEM instead, but
      in the most likely cases where this might occur I don't think any recovery
      is possible.  Notifier chains generally are not created dynamically.
      
      [akpm@osdl.org: avoid statement-with-side-effect in macro]
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e6a92013
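The error-reporting pattern in the commit above (return -ENOMEM when the backing allocation fails) can be sketched in userspace as follows. Everything here is hypothetical: the struct, the function names, and the injectable allocator exist only so the failure path can be exercised; this is not the kernel's init_srcu_struct().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a structure that needs per-cpu backing data. */
struct toy_srcu_struct {
	int *per_cpu_ref;
};

/* Allocator signature, so a failing allocator can be injected. */
typedef void *(*toy_alloc_fn)(size_t);

/* Allocator that always fails, used to exercise the error path. */
static void *toy_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Returns 0 on success or -ENOMEM when the allocation fails. */
static int toy_init_srcu_struct(struct toy_srcu_struct *sp, toy_alloc_fn alloc)
{
	sp->per_cpu_ref = alloc(2 * sizeof(int));
	return sp->per_cpu_ref ? 0 : -ENOMEM;
}
```

A caller that can recover checks the return value; a caller that cannot recover may treat a nonzero result as fatal, which is the choice the commit message describes for notifier heads.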
    • A
      [PATCH] Add SRCU-based notifier chains · eabc0694
      Alan Stern committed
      This patch (as751) adds a new type of notifier chain, based on the SRCU
      (Sleepable Read-Copy Update) primitives recently added to the kernel.  An
      SRCU notifier chain is much like a blocking notifier chain, in that it must
      be called in process context and its callout routines are allowed to sleep.
      The difference is that the chain's links are protected by the SRCU
      mechanism rather than by an rw-semaphore, so calling the chain has
      extremely low overhead: no memory barriers and no cache-line bouncing.  On
      the other hand, unregistering from the chain is expensive and the chain
      head requires special runtime initialization (plus cleanup if it is to be
      deallocated).
      
      SRCU notifiers are appropriate for notifiers that will be called very
      frequently and for which unregistration occurs very seldom.  The proposed
      "task notifier" scheme qualifies, as may some of the network notifiers.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
      Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      eabc0694
    • P
      [PATCH] srcu-3: add SRCU operations to rcutorture · b2896d2e
      Paul E. McKenney committed
      Adds SRCU operations to rcutorture and updates rcutorture documentation.
      Also increases the stress imposed by the rcutorture test.

      [bunk@stusta.de: make needlessly global code static]
      Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
      Cc: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b2896d2e
    • P
      [PATCH] srcu-3: RCU variant permitting read-side blocking · 621934ee
      Paul E. McKenney committed
      Updated patch adding a variant of RCU that permits sleeping in read-side
      critical sections.  SRCU is as follows:
      
      o	Each use of SRCU creates its own srcu_struct, and each
      	srcu_struct has its own set of grace periods.  This is
      	critical, as it prevents one subsystem with a blocking
      	reader from holding up SRCU grace periods for other
      	subsystems.
      
      o	The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
      	and synchronize_srcu()) all take a pointer to a srcu_struct.
      
      o	The SRCU primitives must be called from process context.
      
      o	srcu_read_lock() returns an int that must be passed to
      	the matching srcu_read_unlock().  Realtime RCU avoids the
      	need for this by storing the state in the task struct,
      	but SRCU needs to allow a given code path to pass through
      	multiple SRCU domains -- storing state in the task struct
      	would therefore require either arbitrary space in the
      	task struct or arbitrary limits on SRCU nesting.  So I
      	kicked the state-storage problem up to the caller.
      
      	Of course, it is not permitted to call synchronize_srcu()
      	while in an SRCU read-side critical section.
      
      o	There is no call_srcu().  It would not be hard to implement
      	one, but it seems like too easy a way to OOM the system.
      	(Hey, we have enough trouble with call_rcu(), which does
      	-not- permit readers to sleep!!!)  So, if you want it,
      	please tell me why...
      
      [josht@us.ibm.com: sparse notation]
      Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Josh Triplett <josh@freedesktop.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      621934ee
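The calling convention in the list above (each domain has its own structure, and the lock call returns an int that must be handed back to the matching unlock) can be illustrated with a toy, single-threaded model. This is a sketch under strong assumptions: the `toy_` names are hypothetical, there are no per-CPU counters or memory barriers, and nothing here blocks. It is not the kernel implementation.

```c
/* Toy model of one SRCU domain; names and layout are assumptions. */
struct toy_srcu {
	int completed;	/* number of index flips so far */
	int c[2];	/* active readers counted per index */
};

/* Enter a read-side section; the returned index is the caller's state. */
static int toy_srcu_read_lock(struct toy_srcu *sp)
{
	int idx = sp->completed & 1;

	sp->c[idx]++;
	return idx;
}

/* Leave a read-side section using the index returned by the lock call. */
static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
	sp->c[idx]--;
}

/* Flip the active index, as a grace period would begin by doing.
 * Returns how many pre-existing readers still have to finish. */
static int toy_srcu_flip(struct toy_srcu *sp)
{
	int old = sp->completed & 1;

	sp->completed++;
	return sp->c[old];
}
```

Because the reader's state travels in the returned int rather than in the task, one code path can hold read-side sections in several domains at once, and a slow reader in one domain does not delay a flip in another.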
    • E
      [PATCH] htirq: tidy up the htirq code · 95d77884
      Eric W. Biederman committed
      This moves the declarations for the architecture helpers into
      include/linux/htirq.h from the generic include/linux/pci.h.  Hopefully this
      will make this distinction clearer.
      
      htirq.h is included where it is needed.
      
      The dependency on the msi code is fixed and removed.
      
      The Makefile is tidied up.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      95d77884
    • E
      [PATCH] msi: move the ia64 code into arch/ia64 · 03571e11
      Eric W. Biederman committed
      This is just a few makefile tweaks and some file renames.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      03571e11
    • E
      [PATCH] msi: refactor and move the msi irq_chip into the arch code · 3b7d1921
      Eric W. Biederman committed
      It turns out msi_ops was simply not enough to abstract the architecture
      specific details of msi.  So I have moved the responsibility of constructing
      the struct irq_chip to the architectures, and have two architecture specific
      functions, arch_setup_msi_irq and arch_teardown_msi_irq.
      
      For simple architectures those functions can do all of the work.  For
      architectures with platform dependencies they can call into the appropriate
      platform code.
      
      With this msi.c is finally free of assuming you have an apic, and this
      actually takes less code.
      
      The helpers for the architecture specific code are declared in linux/msi.h
      to keep them separate from the msi functions used by drivers in linux/pci.h.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      3b7d1921
    • E
      [PATCH] msi: only use a single irq_chip for msi interrupts · 277bc33b
      Eric W. Biederman committed
      The logic works like this.

      Since we no longer track the state logic by hand in msi.c, startup and shutdown
      are no longer needed.
      
      By updating msi_set_mask_bit to work on msi devices that do not implement a
      mask bit we can always call the mask/unmask functions.
      
      What we really have are mask and unmask so we use them to implement the .mask
      and .unmask functions instead of .enable and .disable.
      
      By switching to the handle_edge_irq handler we only need an ack function that
      moves the irq if necessary, which removes the old end and ack functions and
      their peculiar logic of sometimes disabling an irq.
      
      This removes the reliance on pre genirq irq handling methods.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      277bc33b
    • E
      [PATCH] msi: simplify msi sanity checks by adding with generic irq code · 1f80025e
      Eric W. Biederman committed
      Currently msi.c is doing sanity checks that make certain before an irq is
      destroyed it has no more users.

      By adding irq_has_action I can perform the test in a generic way, instead of
      relying on a msi specific data structure.
      
      By performing the core check in dynamic_irq_cleanup I ensure every user of
      dynamic irqs has a test present and we don't free resources that are in use.
      
      In msi.c this allows me to kill the attrib.state member of msi_desc and all of
      the associated code to maintain it.

      To keep from freeing data structures when irq cleanup code is called too soon,
      changing dynamic_irq_cleanup is insufficient because there are msi specific
      data structures that are also not safe to free.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg KH <greg@kroah.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      1f80025e
    • E
      [PATCH] Initial generic hypertransport interrupt support · 8b955b0d
      Eric W. Biederman committed
      This patch implements two functions ht_create_irq and ht_destroy_irq for
      use by drivers.  Several other functions are implemented as helpers for
      arch specific irq_chip handlers.
      
      The driver for the card I tested this on isn't yet ready to be merged.
      However this code is and hypertransport irqs are in use in a few other
      places in the kernel.  Not that any of this will get merged before 2.6.19.
      
      Because the ipath-ht400 is slightly out of spec this code will need to be
      generalized to work there.
      
      I think all of the powerpc uses are for a plain interrupt controller in a
      chipset so support for native hypertransport devices is a little less
      interesting.
      
      However I think this is a half way decent model on how to separate arch
      specific and generic helper code, and I think this is a functional model of
      how to get the architecture dependencies out of the msi code.
      
      [akpm@osdl.org: Kconfig fix]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      8b955b0d
    • E
      [PATCH] Add Hypertransport capability defines · e78d0169
      Eric W. Biederman committed
      This adds defines for the hypertransport capability subtypes and starts
      using them a little.

      [akpm@osdl.org: fix typo]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e78d0169
    • E
      [PATCH] genirq: x86_64 irq: Kill irq compression · cd1182f5
      Eric W. Biederman committed
      With more irqs in the system we don't need this.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      cd1182f5
    • E
      [PATCH] genirq: x86_64 irq: Kill gsi_irq_sharing · f023d764
      Eric W. Biederman committed
      After raising the number of irqs the system supports this function is no
      longer necessary.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      f023d764
    • E
      [PATCH] genirq: x86_64 irq: make vector_irq per cpu · 550f2299
      Eric W. Biederman committed
      This refactors the irq handling code to make the vectors a per cpu resource so
      the same vector number can be simultaneously used on multiple cpus for
      different irqs.
      
      This should make systems that were hitting limits on the total number of irqs
      much more livable.
      
      [akpm@osdl.org: build fix]
      [akpm@osdl.org: __target_IO_APIC_irq is unneeded on UP]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      550f2299
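The idea of a per-cpu vector table, where the same vector number maps to different irqs on different cpus, can be sketched as a two-dimensional lookup. The table sizes and all `toy_` names below are assumptions for illustration and do not reflect the kernel's actual data layout.

```c
#define TOY_NR_CPUS	4	/* assumed small values for illustration */
#define TOY_NR_VECTORS	256

/* vector -> irq mapping, one table per cpu; -1 marks an unused slot */
static int toy_vector_irq[TOY_NR_CPUS][TOY_NR_VECTORS];

/* Mark every (cpu, vector) slot as unused. */
static void toy_vector_init(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		for (int v = 0; v < TOY_NR_VECTORS; v++)
			toy_vector_irq[cpu][v] = -1;
}

/* Bind irq to a vector on one cpu; fails if that slot is already taken. */
static int toy_assign_vector(int cpu, int vector, int irq)
{
	if (toy_vector_irq[cpu][vector] != -1)
		return -1;
	toy_vector_irq[cpu][vector] = irq;
	return 0;
}

/* Translate the vector seen on a given cpu back to its irq number. */
static int toy_vector_to_irq(int cpu, int vector)
{
	return toy_vector_irq[cpu][vector];
}
```

With a single global table, vector 0x31 could serve only one irq system-wide; with one table per cpu, it can serve a different irq on each cpu.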
    • E
      [PATCH] genirq: x86_64 irq: Make the external irq handlers report their vector, not the irq number · e500f574
      Eric W. Biederman committed
      This is a small pessimization but it paves the way for making this information
      per cpu, which allows the maximum number of IRQs to become NR_CPUS*224.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e500f574
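The NR_CPUS*224 figure in the commit above is consistent with x86 having 256 interrupt vectors, of which the first 32 are reserved for processor exceptions. The commit message does not spell out this derivation, so the following arithmetic is an assumption, and the macro and function names are hypothetical.

```c
#define TOY_TOTAL_VECTORS	256	/* size of the x86 interrupt vector space */
#define TOY_RESERVED_VECTORS	32	/* vectors 0-31 are used for exceptions */

/* Upper bound on irqs when every cpu has its own vector space. */
static unsigned long toy_max_irqs(unsigned long nr_cpus)
{
	return nr_cpus * (TOY_TOTAL_VECTORS - TOY_RESERVED_VECTORS);
}
```

For a single cpu this gives 224, and the bound grows linearly with the number of cpus.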
    • E
      [PATCH] genirq: irq: generalize the check for HARDIRQ_BITS · 23d0b8b0
      Eric W. Biederman committed
      This patch adds support for systems that cannot receive every interrupt on a
      single cpu simultaneously, in the check to see if we have enough HARDIRQ_BITS.

      MAX_HARDIRQS_PER_CPU becomes the count of the maximum number of hardware
      generated interrupts per cpu.
      
      On architectures that support per cpu interrupt delivery this can be a
      significant space savings and scalability bonus.
      
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      23d0b8b0
    • E
      [PATCH] genirq: irq: remove msi hacks · 323a01c5
      Eric W. Biederman committed
      Because of the nasty way that CONFIG_PCI_MSI was implemented we wound up with
      set_irq_info and set_native_irq_info, with move_irq and move_native_irq.  Both
      functions did the same thing but they were built and called under different
      circumstances.  Now that the msi hacks are gone we can kill move_irq and
      set_irq_info.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      323a01c5
    • E
      [PATCH] genirq: i386 irq: Remove the msi assumption that irq == vector · ace80ab7
      Eric W. Biederman committed
      This patch removes the change in behavior of the irq allocation code when
      CONFIG_PCI_MSI is defined, removing all instances of the assumption that irq
      == vector.
      
      create_irq is rewritten to first allocate a free irq and then to assign that
      irq a vector.
      
      assign_irq_vector is made static and the AUTO_ASSIGN case, which allocates a
      vector not bound to an irq, is removed.
      
      The ioapic vector methods are removed, and everything now works with irqs.
      
      The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI.

      [akpm@osdl.org: cleanup]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      ace80ab7
    • E
      [PATCH] genirq: x86_64 irq: Remove the msi assumption that irq == vector · 04b9267b
      Eric W. Biederman committed
      This patch removes the change in behavior of the irq allocation code when
      CONFIG_PCI_MSI is defined, removing all instances of the assumption that irq
      == vector.
      
      create_irq is rewritten to first allocate a free irq and then to assign that
      irq a vector.
      
      assign_irq_vector is made static and the AUTO_ASSIGN case, which allocates a
      vector not bound to an irq, is removed.
      
      The ioapic vector methods are removed, and everything now works with irqs.
      
      The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      04b9267b
    • E
      [PATCH] genirq: msi: only build msi-apic.c on ia64 · 4b2fabb9
      Eric W. Biederman committed
      After the previous changes ia64 is the only architecture using msi-apic.c.

      [akpm@osdl.org: unbreak MSI on ia64]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4b2fabb9
    • E
      [PATCH] genirq: i386 irq: Move msi message composition into io_apic.c · 2d3fcc1c
      Eric W. Biederman committed
      This removes the hardcoded assumption that irq == vector in the msi
      composition code, and it allows the msi message composition to set up logical
      mode, or lowest priority delivery mode as we do for other apic interrupts,
      and with the same selection criteria.
      
      Basically this moves the problem of what is in the msi message into the
      architecture irq management code where it belongs.  Not in a generic layer
      that doesn't have enough information to compose msi messages properly.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      2d3fcc1c
    • E
      [PATCH] genirq: x86_64 irq: Move msi message composition into io_apic.c · 589e367f
      Eric W. Biederman committed
      This removes the hardcoded assumption that irq == vector in the msi
      composition code, and it allows the msi message composition to set up logical
      mode, or lowest priority delivery mode as we do for other apic interrupts,
      and with the same selection criteria.
      
      Basically this moves the problem of what is in the msi message into the
      architecture irq management code where it belongs.  Not in a generic layer
      that doesn't have enough information to compose msi messages properly.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      589e367f
    • E
      [PATCH] genirq: msi: make the msi code irq based and not vector based · 1ce03373
      Eric W. Biederman committed
      The msi code currently allocates irqs backwards.  First it allocates a platform
      dependent routing value for an interrupt, the ``vector'', and then it figures
      out from the vector which irq you are on.

      For ia64 this is fine.  For x86 and x86_64 this is complete nonsense and makes
      an enormous mess of the irq handling code and prevents some pretty
      significant cleanups in the code for handling large numbers of irqs.
      
      This patch refactors msi.c to work in terms of irqs and create_irq/destroy_irq
      for dynamically managing irqs.
      
      Hopefully this is finally a version of msi.c that is useful on more than just
      x86 derivatives.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      1ce03373
    • E
      [PATCH] genirq: x86_64 irq: Dynamic irq support · c4fa0bbf
      Eric W. Biederman committed
      The current implementation of create_irq() is a hack but it is the current
      hack that msi.c uses, and unfortunately the ``generic'' apic msi ops depend on
      this hack.  Thus we are stuck with this hack of assuming irq == vector until the
      dependencies in the generic irq code are removed.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c4fa0bbf
    • E
      [PATCH] genirq: i386 irq: Dynamic irq support · 3fc471ed
      Eric W. Biederman committed
      The current implementation of create_irq() is a hack but it is the current
      hack that msi.c uses, and unfortunately the ``generic'' apic msi ops depend on
      this hack.  Thus we are stuck with this hack of assuming irq == vector until the
      dependencies in the generic msi code are removed.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      3fc471ed
    • E
      [PATCH] genirq: ia64 irq: Dynamic irq support · b6cf2583
      Eric W. Biederman committed
      [akpm@osdl.org: build fix]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b6cf2583
    • E
      [PATCH] genirq: irq: add a dynamic irq creation API · 3a16d713
      Eric W. Biederman committed
      With the msi support comes a new concept in irq handling, irqs that are
      created dynamically at run time.
      
      Currently the msi code allocates irqs backwards.  First it allocates a
      platform dependent routing value for an interrupt, the ``vector'', and then it
      figures out from the vector which irq you are on.

      This msi backwards allocator suffers from two basic problems.  The allocator
      suffers because it is trying to do something that is architecture specific in
      a generic way, making it brittle, inflexible, and tied too tightly to the
      architecture implementation.  The allocator also suffers from its very
      backwards nature as it has tied things together that should have no
      dependencies.
      
      To solve the basic dynamic irq allocation problem two new architecture
      specific functions are added: create_irq and destroy_irq.
      
      create_irq takes no input and returns an unused irq number that won't be
      reused until it is returned to the free pool with destroy_irq.  The irq can
      then be used for any purpose, although the only initial consumer is the msi
      code.
      
      destroy_irq takes an irq number allocated with create_irq and returns it to
      the free pool.
      
      Making this functionality per architecture simplifies the irq allocation code
      and increases its flexibility.

      dynamic_irq_init() and dynamic_irq_cleanup() are added to automate the
      irq_desc initialization that should happen for dynamic irqs.
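
      The create_irq/destroy_irq contract described above can be sketched in user
      space as follows.  This is only an illustration under stated assumptions: the
      names demo_create_irq, demo_destroy_irq, DEMO_NR_IRQS and
      DEMO_FIRST_DYNAMIC_IRQ are hypothetical, the lowest-free-number policy is an
      arbitrary choice, and the locking and irq_desc setup that a real
      per-architecture implementation needs are omitted.

```c
#include <stdbool.h>

#define DEMO_NR_IRQS            32  /* assumed size of the irq table */
#define DEMO_FIRST_DYNAMIC_IRQ  16  /* assumed: lower numbers are fixed irqs */

/* One flag per irq number; true while the irq is handed out. */
static bool demo_irq_in_use[DEMO_NR_IRQS];

/* Takes no input; returns an unused irq number, or -1 if none is free. */
static int demo_create_irq(void)
{
	for (int irq = DEMO_FIRST_DYNAMIC_IRQ; irq < DEMO_NR_IRQS; irq++) {
		if (!demo_irq_in_use[irq]) {
			demo_irq_in_use[irq] = true;
			return irq;
		}
	}
	return -1;
}

/* Returns an irq obtained from demo_create_irq() to the free pool. */
static void demo_destroy_irq(int irq)
{
	if (irq >= DEMO_FIRST_DYNAMIC_IRQ && irq < DEMO_NR_IRQS)
		demo_irq_in_use[irq] = false;
}
```

      A number handed out by demo_create_irq() is not handed out again until
      demo_destroy_irq() has been called on it, which is the property the msi code
      relies on.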
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      3a16d713
    • E
      [PATCH] genirq: msi: simplify the msi irq limit policy · 92db6d10
      Eric W. Biederman committed
      Currently we attempt to predict how many irqs we will be able to allocate with
      msi using pci_vector_resources and some complicated accounting, and then we
      only allow each device as many irqs as we think are available on average.
      
      Only the s2io driver even takes advantage of this feature; all other drivers
      have a fixed number of irqs they need and bail if they can't get them.

      pci_vector_resources is inaccurate if anyone ever frees an irq.  The whole
      implementation is racy.  The current irq limit policy does not appear to make
      sense with current drivers.  So I have simplified things.  We can revisit this
      if we need a more sophisticated policy.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      92db6d10
    • E
      [PATCH] genirq: msi: refactor the msi_ops · 38bc0361
      Eric W. Biederman committed
      The current msi_ops are short-sighted in a number of ways; this patch attempts
      to fix the glaring deficiencies.
      
      - Report in msi_ops if a 64bit address is needed in the msi message, so we
        can fail 32bit only msi structures.
      
      - Send and receive a full struct msi_msg in both setup and target.  This is
        a little cleaner and allows for architectures that need to modify the data
        to retarget the msi interrupt to a different cpu.
      
      - In target pass in the full cpu mask instead of just the first cpu in case
        we can make use of the full cpu mask.
      
      - Operate in terms of irqs and not vectors.  Currently there is still a 1-1
        relationship, but on architectures other than ia64 I expect this will change.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      38bc0361
    • E
      [PATCH] genirq: msi: implement helper functions read_msi_msg and write_msi_msg · 0366f8f7
      Eric W. Biederman committed
      In support of this I also add a struct msi_msg that captures the two
      address fields and one data field in a typical msi message, and I remember the
      pos and whether the address is 64bit in struct msi_desc.

      This makes the code a little more readable and easier to maintain, and paves
      the way to further simplifications.
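
      As a rough illustration of what such helpers do, the sketch below reads and
      writes a message against a simulated configuration space.  It assumes the
      standard PCI MSI capability layout (message address at offset 4 from the
      capability position; upper address at 8 and data at 12 for 64bit capable
      functions, data at 8 otherwise).  All demo_* names are hypothetical and this
      is not the kernel's implementation; MSI-X, which keeps its messages in a
      memory-mapped table, is not covered.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the message: two address words and one data word. */
struct demo_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint16_t data;
};

/* Simulated PCI configuration space of a single function. */
static uint8_t demo_cfg[256];

static void demo_cfg_write(int off, const void *val, size_t len)
{
	memcpy(&demo_cfg[off], val, len);
}

static void demo_cfg_read(int off, void *val, size_t len)
{
	memcpy(val, &demo_cfg[off], len);
}

/* pos is the capability offset; is_64 says whether the upper address exists. */
static void demo_write_msi_msg(int pos, int is_64, const struct demo_msi_msg *msg)
{
	demo_cfg_write(pos + 4, &msg->address_lo, 4);
	if (is_64) {
		demo_cfg_write(pos + 8, &msg->address_hi, 4);
		demo_cfg_write(pos + 12, &msg->data, 2);
	} else {
		demo_cfg_write(pos + 8, &msg->data, 2);
	}
}

static void demo_read_msi_msg(int pos, int is_64, struct demo_msi_msg *msg)
{
	demo_cfg_read(pos + 4, &msg->address_lo, 4);
	if (is_64) {
		demo_cfg_read(pos + 8, &msg->address_hi, 4);
		demo_cfg_read(pos + 12, &msg->data, 2);
	} else {
		msg->address_hi = 0;  /* no upper address register to read */
		demo_cfg_read(pos + 8, &msg->data, 2);
	}
}
```

      Remembering pos and the 64bit flag once, as the commit does in struct
      msi_desc, means callers of the helpers never have to recompute where the data
      register lives.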
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      0366f8f7
    • E
      [PATCH] genirq: msi: make the msi boolean tests return either 0 or 1 · dd159eec
      Eric W. Biederman committed
      This allows the output of the msi tests to be stored directly in a bit field.
      If you don't do this, a value greater than one will be truncated and become 0,
      changing true to false with bizarre consequences.
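
      The truncation can be shown with a small stand-alone sketch.  The struct,
      the mask value 0x80 and the function names are hypothetical and chosen only
      for illustration; the point is that a one-bit unsigned bit field keeps just
      the low bit of whatever is assigned to it.

```c
/* Hypothetical descriptor with a single-bit flag. */
struct demo_flags {
	unsigned int is_64 : 1;
};

#define DEMO_CTRL_64BIT 0x80u  /* assumed mask; any bit above bit 0 behaves alike */

/* Stores the raw masked value (0 or 0x80) and returns what the field holds. */
static unsigned int demo_store_raw(unsigned int control)
{
	struct demo_flags f = { 0 };

	f.is_64 = control & DEMO_CTRL_64BIT;  /* 0x80 is reduced to its low bit: 0 */
	return f.is_64;
}

/* Normalizes the test to 0 or 1 with !! before storing it. */
static unsigned int demo_store_normalized(unsigned int control)
{
	struct demo_flags f = { 0 };

	f.is_64 = !!(control & DEMO_CTRL_64BIT);
	return f.is_64;
}
```

      With control set to 0x80, demo_store_raw() reads back 0 while
      demo_store_normalized() reads back 1.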
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      dd159eec
    • E
      [PATCH] genirq: msi: simplify msi enable and disable · 7bd007e4
      Eric W. Biederman committed
      The problem: because the disable routines leave the msi interrupts in all
      sorts of half enabled states, the enable routines become impossible to
      implement correctly, and almost impossible to understand.

      Simplifying this allows me to simply kill the buggy reroute_msix_table, and
      generally makes the code more maintainable.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      7bd007e4
    • E
      [PATCH] genirq: x86_64 irq: Reenable migrating irqs to other cpus · 0be6652f
      Eric W. Biederman committed
      In the latest changes the code for migrating x86_64 irqs was dropped.  This
      re-adds it in a fashion that will work even if we change the vector on level
      triggered irqs when we migrate them.

      [akpm@osdl.org: build fix]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      0be6652f
    • E
      [PATCH] genirq: irq: add move_masked_irq · e7b946e9
      Eric W. Biederman committed
      Currently move_native_irq disables and re-enables the irq we are migrating to
      ensure we don't take that irq when we are actually doing the migration
      operation.  Disabling the irq needs to happen, but sometimes doing the work in
      move_native_irq is too late.

      On x86 with ioapics the irq move sequence needs to be:
      edge_triggered:
        mask irq.
        move irq.
        unmask irq.
        ack irq.
      level_triggered:
        mask irq.
        ack irq.
        move irq.
        unmask irq.
      
      We can easily perform the edge triggered sequence with the current definition
      of move_native_irq.  However, the level triggered case does not map well.  For
      that I have added move_masked_irq, to allow me to disable the irqs around both
      the ack and the move.
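
      The two orderings listed above can be written out as a toy sketch that only
      records the order of the steps.  Nothing here touches hardware, and the
      demo_* names are hypothetical; the sketch exists purely to make the
      difference between the edge and level triggered cases explicit.

```c
#include <string.h>

/* Records the steps in order, separated by spaces. */
static char demo_log[64];

static void demo_step(const char *name)
{
	strcat(demo_log, name);
	strcat(demo_log, " ");
}

/* Edge triggered: the move fits between mask and unmask, the ack comes last. */
static void demo_move_edge(void)
{
	demo_log[0] = '\0';
	demo_step("mask");
	demo_step("move");
	demo_step("unmask");
	demo_step("ack");
}

/* Level triggered: the ack must happen while masked and before the move. */
static void demo_move_level(void)
{
	demo_log[0] = '\0';
	demo_step("mask");
	demo_step("ack");
	demo_step("move");
	demo_step("unmask");
}
```

      In the level triggered case the mask/unmask pair has to bracket both the ack
      and the move, which is why the masking cannot live inside the move helper
      itself.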
      
      Q: Why have we not seen this problem earlier?
      
      A: The only symptom I have been able to reproduce is that if we change
         the vector before acknowledging an irq, the wrong irq is acknowledged.
         Since we currently are not reprogramming the irq vector during
         migration, no problems show up.
      
         We have to mask the irq before we acknowledge the irq or else we could
         hit a window where an irq is asserted just before we acknowledge it.
      
         Edge triggered irqs do not have this problem because acknowledgements
         do not propagate in the same way.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e7b946e9