- 22 July 2012, 1 commit
-
-
Committed by Ingo Molnar
This reverts commit fbd24153. This commit is subtly buggy: kstrto*int() can return an error, but it is not checked in every path. simple_strtoul(), on the other hand, cannot fail, so this patch subtly introduces new failure modes.
Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Link: http://lkml.kernel.org/r/1338424803.3569.5.camel@lorien2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
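A minimal sketch of the behavioural difference behind the revert (function names and values here are illustrative, not the actual early-serial code): simple_strtoul() always produces a number, while the kstrto*() family can fail and must be checked on every path.

	/* simple_strtoul() cannot fail: it consumes the digits it finds. */
	unsigned long old_way(const char *s)
	{
		return simple_strtoul(s, NULL, 10);	/* "9600x" -> 9600 */
	}

	/* kstrtoul() can return -EINVAL or -ERANGE; an unchecked call is
	 * exactly the new failure mode the revert complains about. */
	int new_way(const char *s, unsigned long *val)
	{
		int ret = kstrtoul(s, 10, val);		/* "9600x" -> -EINVAL */

		if (ret)
			return ret;	/* every caller must handle this */
		return 0;
	}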
-
- 06 July 2012, 3 commits
-
-
Committed by Suresh Siddha
x86/apic/x2apic: Use multiple cluster members for the irq destination only with the explicit affinity
During boot or driver load, the interrupt destination is set up using default target CPUs. Later the user (irqbalance etc.) or the driver (irq_set_affinity/irq_set_affinity_hint) can request that the interrupt be migrated to some specific set of CPUs. In the x2apic cluster routing, use a single CPU as the interrupt destination in the default scenario, and when there is an explicit interrupt affinity request, route the interrupt to the multiple members of the x2apic cluster specified in the cpumask of the migration request. This will minimize the vector pressure when there are a lot of interrupt sources and relatively few x2apic clusters (for example, a single-socket server). It allows performance-critical interrupts to be routed to multiple CPUs in the x2apic cluster (irqbalance, for example, uses the cache siblings when specifying the interrupt destination) while non-critical interrupts are serviced by a single logical CPU.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/r/1340656709-11423-4-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
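A hedged sketch of the policy this describes; the function shape mirrors the vector_allocation_domain() callback, but names and details here are written from the description above, not copied from the driver.

	/* Default requests target one CPU; explicit affinity requests fan
	 * out to the requested mask's members within this CPU's cluster. */
	static void cluster_vector_allocation_domain(int cpu,
						     struct cpumask *retmask,
						     const struct cpumask *mask)
	{
		if (cpumask_equal(mask, cpu_online_mask)) {
			/* Default scenario: route to a single CPU only. */
			cpumask_copy(retmask, cpumask_of(cpu));
		} else {
			/* cpus_in_cluster: per-cpu sibling mask (assumed). */
			cpumask_and(retmask, mask,
				    per_cpu(cpus_in_cluster, cpu));
		}
	}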
-
Committed by Suresh Siddha
For the x2apic cluster mode, the vector for an interrupt is currently reserved on all the CPUs that are part of the x2apic cluster. But the interrupts will be routed only to the cluster members (derived from the first CPU in the mask) specified in the mask. So there is no need to reserve the vector on the unused cluster members. Modify __assign_irq_vector() to reserve the vectors based on the user-specified irq destination mask. If the new mask is a proper subset of the currently used mask, clean up the vector allocation on the unused CPU members. Also, allow the apic driver to tune the vector domain based on the affinity mask (which in most cases is the user-specified mask).
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/r/1340656709-11423-3-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Suresh Siddha
Currently __assign_irq_vector() goes through each CPU in the specified mask until it finds a free vector in all the CPUs that are part of the same interrupt domain. We visit all the interrupt domain sibling CPUs to reserve the free vector. So, when we fail to find a free vector in an interrupt domain, it is safe to continue our search with a CPU belonging to a new interrupt domain. There is no need to go through each CPU if the domain containing that CPU has already been visited. Use the irq_cfg's old_domain to track the visited domains and optimize the CPU traversal while finding a free vector in the given cpumask. NOTE: We could also optimize the search by using for_each_cpu() and skipping the current CPU if it is not the first CPU in the mask returned by vector_allocation_domain(). But reusing cfg->old_domain to track the visited domains is slightly faster.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/r/1340656709-11423-2-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
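An illustrative sketch of the traversal optimization; the helper find_free_vector() is hypothetical (the real function open-codes the search):

	/* Reuse cfg->old_domain as a "domains already visited" scratch mask. */
	cpumask_clear(cfg->old_domain);
	for_each_cpu_and(cpu, mask, cpu_online_mask) {
		if (cpumask_test_cpu(cpu, cfg->old_domain))
			continue;	/* this cpu's domain already failed */

		apic->vector_allocation_domain(cpu, tmp_mask, mask);
		if (find_free_vector(tmp_mask))
			break;		/* success */

		/* Mark every sibling visited so the domain is never rescanned. */
		cpumask_or(cfg->old_domain, cfg->old_domain, tmp_mask);
	}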
-
- 18 June 2012, 1 commit
-
-
Committed by Ido Yariv
Commit 8637e38a ("x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector()") modified vector_allocation_domain() to return a boolean indicating whether the cpumask is dynamic or static. Adjust vSMP's callback implementation accordingly.
Signed-off-by: Ido Yariv <ido@wizery.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/1339773055-27397-1-git-send-email-ido@wizery.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 15 June 2012, 2 commits
-
-
Committed by Suresh Siddha
Move the ->irq_set_affinity() routines out of the #ifdef CONFIG_SMP sections and use config_enabled(CONFIG_SMP) checks inside those routines. This makes those routines simple null stubs for !CONFIG_SMP kernels while retaining them, with no additional runtime overhead, for CONFIG_SMP kernels. Cleans up the #ifdef CONFIG_SMP in and around the routines related to irq_set_affinity in the io_apic and irq_remapping subsystems.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: torvalds@linux-foundation.org
Cc: joerg.roedel@amd.com
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1339723729.3475.63.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
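A minimal sketch of the pattern, assuming an illustrative affinity routine: config_enabled() evaluates to a compile-time 0/1, so the compiler discards the dead branch on UP kernels while the file stays free of #ifdefs.

	static int example_set_affinity(struct irq_data *data,
					const struct cpumask *mask, bool force)
	{
		if (!config_enabled(CONFIG_SMP))
			return -1;	/* null stub on !CONFIG_SMP kernels */

		/* ...the real affinity update, dead code when UP... */
		return 0;
	}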
-
Committed by Ido Yariv
set_vsmp_pv_ops() references no_irq_affinity, which is undeclared if CONFIG_PROC_FS isn't set. Fix this by adding an #ifdef around this variable's access.
Reported-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: Ido Yariv <ido@wizery.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Link: http://lkml.kernel.org/r/1339688588-12674-1-git-send-email-ido@wizery.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 14 June 2012, 6 commits
-
-
Committed by Alexander Gordeev
cpu_mask_to_apicid_and() always returns the apicid of a single CPU, even when multiple CPUs were requested. This update fixes a typo and forces the apicid of a cluster to be returned.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614075043.GI3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Alexander Gordeev
In case of invalid parameters, cpu_mask_to_apicid_and() might return an apicid value of 0 (on Summit) or an uninitialized value (on ES7000), although it is supposed to return at least the apicid of cpu-0. Fix the operation to always return a valid apicid.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614075026.GH3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Alexander Gordeev
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614075010.GG3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Alexander Gordeev
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614074954.GF3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Alexander Gordeev
Since cpu_mask_to_apicid() is called from only two locations, remove the operation and use only cpu_mask_to_apicid_and() instead.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Suggested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614074935.GE3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Alexander Gordeev
Since commit 8637e38a ("x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector()") the vector_allocation_domain() operation indicates whether a cpumask is dynamic or static. This update fixes an oversight and makes the operation return a value.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614103933.GJ3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 11 June 2012, 1 commit
-
-
Committed by Ravikiran Thirumalai
vSMP can route interrupts more optimally based on internal knowledge the OS does not have. In order to support this optimization, all CPUs must be able to handle all possible IOAPIC interrupts. Fix this by setting the vector allocation domain for all CPUs and by enabling this feature in vSMP.
Signed-off-by: Ravikiran Thirumalai <kiran.thirumalai@gmail.com>
Signed-off-by: Shai Fultheim <shai@scalemp.com>
[ Rebased, simplified, and reworded the commit message. ]
Signed-off-by: Ido Yariv <ido@wizery.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 08 June 2012, 5 commits
-
-
Committed by Alexander Gordeev
Currently cpu_mask_to_apicid() should not be given an offline CPU in the cpumask; otherwise some apic drivers might try to access non-existent per-cpu variables (e.g. x2apic). In that regard the cpu_mask_to_apicid() and cpu_mask_to_apicid_and() operations are inconsistent. This fix makes the two operations no longer rely on their callers and always return the apicid for online CPUs only. As a result, the meaning and implementations of the cpu_mask_to_apicid() and cpu_mask_to_apicid_and() operations become straightforward.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131624.GG4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
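A sketch of the idea in flat mode, where the destination bitmap is itself the logical apicid; the AND with cpu_online_mask is the point of the change (illustrative, not the verbatim implementation):

	static unsigned int
	flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask,
				    const struct cpumask *andmask)
	{
		/* Reduce the request to online CPUs before deriving an apicid. */
		unsigned long mask = cpumask_bits(cpumask)[0] &
				     cpumask_bits(andmask)[0] &
				     cpumask_bits(cpu_online_mask)[0];

		return (unsigned int)mask;	/* flat mode: bitmap == apicid */
	}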
-
Committed by Alexander Gordeev
The current cpu_mask_to_apicid() and cpu_mask_to_apicid_and() implementations have a few shortcomings:
1. A value returned by cpu_mask_to_apicid() is written to hardware registers unconditionally. Should BAD_APICID ever get returned, it will be written to hardware too. But the value of BAD_APICID is not universal across all hardware in all modes and might cause unexpected results, i.e. interrupts might get routed to CPUs that are not configured to receive them.
2. Because the value of BAD_APICID is not universal, it is counter-intuitive to return it for hardware where it does not make sense (i.e. x2apic).
3. The cpu_mask_to_apicid_and() operation is thought of as a complement to cpu_mask_to_apicid() that only applies an AND mask on top of the cpumask being passed. Yet, as a consequence of commit 18374d89, the two operations are inconsistent in that cpu_mask_to_apicid() should not be given an offline CPU in the cpumask, while cpu_mask_to_apicid_and() should not fail and return BAD_APICID. These limitations are impossible to infer just from looking at the operations' prototypes.
Most of these shortcomings are resolved by returning an error code instead of BAD_APICID. As a result, faults are reported back early rather than left open as a source of unexpected behaviour (in case of [1]). The only exception is the setup_timer_IRQ0_pin() routine. Although obviously at odds with this fix, its existing behaviour is preserved so as not to break the fragile check_timer(), and it would be better addressed in a separate fix.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131559.GF4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
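A hedged sketch of the resulting calling convention (the prototype shape follows the description above; variable names are illustrative):

	/* Return 0 or a negative error and hand the apicid back via a
	 * pointer, instead of overloading the return value with BAD_APICID. */
	int cpu_mask_to_apicid_and(const struct cpumask *cpumask,
				   const struct cpumask *andmask,
				   unsigned int *apicid);

	/* Caller side: the fault surfaces before any register write. */
	err = apic->cpu_mask_to_apicid_and(mask, andmask, &dest);
	if (err)
		return err;	/* never program BAD_APICID into hardware */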
-
Committed by Alexander Gordeev
In case of static vector allocation domains (i.e. flat), if all vector numbers are exhausted, an attempt to assign a new vector will lead to useless scans through all CPUs in the cpumask, even though it is known that each new pass would fail. Make this corner case less painful by letting vector_allocation_domain() report whether the allocation domain depends on the passed arguments or not, and stop scanning early. The same could have been achieved by introducing a static flag in the apic operations. But let's allow vector_allocation_domain() to have more intelligence here and decide dynamically, in case we need it in the future.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131542.GE4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
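An illustrative reading of the early stop; the bool's exact polarity and the helper try_cpu() are assumptions made for this sketch:

	for_each_cpu_and(cpu, mask, cpu_online_mask) {
		/* Reports whether the domain varies with the inputs. */
		bool dynamic = apic->vector_allocation_domain(cpu, tmp_mask);

		if (try_cpu(tmp_mask))
			break;		/* found a free vector */
		if (!dynamic)
			break;		/* static domain: retries can't differ */
	}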
-
Committed by Alexander Gordeev
When assigning a new vector, it is primarily done by adding 8 to the previously given-out vector number. Hence, two consecutively allocated vector numbers would likely fall into the same priority level. Try to spread vector numbers across priority levels better by changing the step from 8 to 16.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131514.GD4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
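The arithmetic behind the change, as a worked sketch: x86 derives an interrupt's priority level from the vector's high nibble, so 16 vectors share one level.

	/* level = vector >> 4, i.e. 16 vectors per priority level. */
	unsigned int next_vector_candidate(unsigned int vector)
	{
		return vector + 16;	/* a step of 8 kept pairs in one level */
	}

With a step of 8, consecutive allocations such as 0x41 and 0x49 both land in level 4; with a step of 16, each new allocation moves to a new level.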
-
Committed by Alexander Gordeev
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131449.GC4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 06 June 2012, 7 commits
-
-
Committed by Shuah Khan
Change early_serial_init() to call kstrtoul() instead of the obsolete simple_strtoul().
Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/1338424803.3569.5.camel@lorien2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Alexander Gordeev
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120605112340.GA11454@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Alexander Gordeev
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120605112324.GA11449@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Alexander Gordeev
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120605112310.GA11443@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Committed by Suresh Siddha
x86/x2apic/cluster: Use all the members of one cluster specified in the smp_affinity mask for the interrupt destination
If the HW implements round-robin interrupt delivery, this enables multiple CPUs (which are part of the user-specified interrupt smp_affinity mask and belong to the same x2apic cluster) to service the interrupt. Also, if the platform supports Power Aware Interrupt Routing, this enables the interrupt to be routed to an idle CPU or a busy CPU depending on the perf/power bias tunable. We are now grouping all the CPUs in a cluster into one vector domain, which will limit the total number of interrupt sources handled by Linux. Previously we supported "cpu-count * available-vectors-per-cpu" interrupt sources; this now reduces to "cpu-count/16 * available-vectors-per-cpu".
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: gorcunov@openvz.org
Cc: agordeev@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337644682-19854-2-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
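A worked example of that trade-off, under the hypothetical assumption of roughly 200 freely assignable vectors per CPU: a 32-CPU single-socket server previously supported about 32 * 200 = 6400 interrupt sources, while grouping the 16 logical CPUs of each x2apic cluster into one vector domain brings that to about (32/16) * 200 = 400.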
-
Committed by Suresh Siddha
Until now, the irq_cfg domain is mostly static: either all CPUs (used by flat mode) or the one CPU (the first CPU in the irq affinity mask) to which the irq is being migrated (used by the rest of the apic modes). The upcoming x2apic cluster mode optimization patch allows the irq to be sent to any CPU in the x2apic cluster (if supported by the HW), so the irq_cfg domain changes on the fly (depending on which CPU in the x2apic cluster is online). Instead of checking for any intersection between the new irq affinity mask and the current irq_cfg domain, check whether the new irq affinity mask is a subset of the current irq_cfg domain. Otherwise, proceed with updating the irq_cfg domain as well as assigning vectors on all the CPUs specified in the new mask. This also cleans up a workaround in updating the irq_cfg domain for legacy irqs that are handled by the IO-APIC.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: gorcunov@openvz.org
Cc: agordeev@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337644682-19854-1-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
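The check itself, as a minimal hedged sketch (field names illustrative):

	/* Fast path: the requested mask already lies inside the current
	 * domain, so the assigned vector can simply be kept. */
	if (cpumask_subset(tmp_mask, cfg->domain))
		return 0;

	/* Otherwise fall through: update cfg->domain and assign a vector
	 * on every CPU named in the new mask. */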
-
Committed by Ido Yariv
Some subarchitectures (such as vSMP) need to slightly adjust the underlying APIC structure. Add an APIC post-initialization callback to 'struct x86_platform_ops' for this purpose and use it for adjusting the APIC structure on vSMP systems.
Signed-off-by: Ido Yariv <ido@wizery.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Link: http://lkml.kernel.org/r/1338675095-27260-1-git-send-email-ido@wizery.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
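A sketch of the hook's shape, assuming the callback is a bare void function as the description suggests; the member and function names are written from that description, not copied from the source:

	struct x86_platform_ops {
		/* ...existing members... */
		void (*apic_post_init)(void);	/* run after apic selection */
	};

	static void __init vsmp_apic_post_init(void)
	{
		/* e.g. override apic callbacks for vSMP's routing needs */
	}

	/* Registration at platform setup time: */
	x86_platform.apic_post_init = vsmp_apic_post_init;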
-
- 02 June 2012, 12 commits
-
-
Committed by H.J. Lu
When I added x32 ptrace to the 3.4 kernel, I also included PTRACE_ARCH_PRCTL support for x32 GDB. For ARCH_GET_FS/GS, it takes a pointer to int64. But at user level, ARCH_GET_FS/GS takes a pointer to int32. So I had to add x32 ptrace support to glibc to handle it, with a temporary int64 passed to the kernel and copied back to GDB as int32. Roland suggested that PTRACE_ARCH_PRCTL is obsolete and that x32 GDB should use the fs_base and gs_base fields of user_regs_struct instead. Accordingly, remove PTRACE_ARCH_PRCTL completely from the x32 code to avoid a possible memory overrun when a pointer to int32 is passed to the kernel.
Link: http://lkml.kernel.org/r/CAMe9rOpDzHfS7NH7m1vmD9QRw8SSj4Sc%2BaNOgcWm_WJME2eRsQ@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org> v3.4
-
Committed by Al Viro
If we end up calling do_notify_resume() with !user_mode(regs), it does nothing (do_signal() explicitly bails out, and we can't get there with TIF_NOTIFY_RESUME in such situations). Then we jump to resume_userspace_sig, which rechecks the same thing and bails out to resume_kernel, thus breaking the loop. It's easier and cheaper to check *before* calling do_notify_resume() and bail out to resume_kernel immediately. And kill the check in do_signal()... Note that on amd64 we can't get there with !user_mode() at all - the asm glue takes care of that.
Acked-and-reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Does block_sigmask() + tracehook_signal_handler(); called when the sigframe has been successfully built. All architectures are converted to it; block_sigmask() itself is gone now (merged into this one). I'm still not too happy with the signature, but that's a separate story (IMO we need a structure that would contain signal number + siginfo + k_sigaction, so that get_signal_to_deliver() would fill one, and signal_delivered(), handle_signal() and probably setup...frame() would take one).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Only 3 out of 63 do not. Renamed the current variant to __set_current_blocked(), added set_current_blocked() that will exclude unblockable signals, and switched open-coded instances to it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
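A sketch of the resulting pair, assuming the split described above (the unblockable signals being SIGKILL and SIGSTOP):

	/* Public helper: strip signals that can never be blocked, then
	 * delegate to the raw variant that applies the mask verbatim. */
	void set_current_blocked(sigset_t *newset)
	{
		sigdelsetmask(newset, sigmask(SIGKILL) | sigmask(SIGSTOP));
		__set_current_blocked(newset);
	}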
-
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Replace the boilerplate "should we use ->saved_sigmask or ->blocked?" with calls to an obvious inlined helper...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
First fruits of the ..._restore_sigmask() helpers: now we can move the boilerplate "signal didn't have a handler, clear RESTORE_SIGMASK and restore the blocked mask from ->saved_sigmask" into a common helper. Open-coded instances switched...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Helpers parallel to set_restore_sigmask(), used in the next commits.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Matt Fleming
Since we can't expect every user to read the EFI boot stub code, it seems prudent to have a couple of paragraphs explaining what it is and how it works. The "initrd=" option in particular is tricky, because it only understands absolute EFI-style paths (backslashes as directory separators), and until now this hasn't been documented anywhere, which has tripped up a couple of users.
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1331907517-3985-4-git-send-email-matt@console-pimps.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
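The documented pitfall in one example (the file layout here is hypothetical): the EFI boot stub wants an absolute, backslash-separated path, not a Unix-style one.

	initrd=\EFI\distro\initrd.img     <- EFI-style absolute path: accepted
	initrd=/EFI/distro/initrd.img     <- Unix-style path: not understood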
-
Committed by Matt Fleming
We need a way of printing useful messages to the user, for example when we fail to open an initrd file, instead of just hanging the machine without giving the user any indication of what went wrong. So sprinkle some error messages throughout the EFI boot stub code to make it easier for users to diagnose/report problems.
Reported-by: Keshav P R <the.ridikulus.rat@gmail.com>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1331907517-3985-3-git-send-email-matt@console-pimps.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
Committed by Matt Fleming
The loop at the 'close_handles' label in handle_ramdisks() should be using 'i', which represents the number of initrd files that were successfully opened, not 'nr_initrds', which is the number of initrd= arguments passed on the command line. Currently, if we execute the loop to close all file handles and we failed to open any initrds, we'll try to call the close function on a garbage pointer, causing the machine to hang.
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1331907517-3985-2-git-send-email-matt@console-pimps.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
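The shape of the fix, as an illustrative sketch (the array and call names are stand-ins, not the stub's actual code):

	close_handles:
		/* Unwind only the 'i' handles actually opened; bounding the
		 * loop by 'nr_initrds' would touch uninitialized entries. */
		for (k = 0; k < i; k++)
			efi_call_phys1(fh->close, files[k].handle);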
-
- 01 June 2012, 2 commits
-
-
Committed by Steven Rostedt
When both DYNAMIC_FTRACE and LOCKDEP are set, the TRACE_IRQS_ON/OFF will call into the lockdep code. The lockdep code can call lots of functions that may be traced by ftrace. When ftrace is updating its code and hits a breakpoint, the breakpoint handler will call into lockdep. If lockdep happens to call a function that also has a breakpoint attached, it will jump back into the breakpoint handler, resetting the stack to the debug stack and corrupting the contents currently on that stack. The 'do_sym' call that calls do_int3() is protected by modifying the IST table to point to a different location if another breakpoint is hit. But the TRACE_IRQS_OFF/ON are outside that protection, and if a breakpoint is hit from those, the stack will get corrupted, and the kernel will crash:

[ 1013.243754] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
[ 1013.272665] IP: [<ffff880145cc0000>] 0xffff880145cbffff
[ 1013.285186] PGD 1401b2067 PUD 14324c067 PMD 0
[ 1013.298832] Oops: 0010 [#1] PREEMPT SMP
[ 1013.310600] CPU 2
[ 1013.317904] Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr iTCO_wdt i2c_i801 iTCO_vendor_support e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
[ 1013.401848]
[ 1013.407399] Pid: 112, comm: kworker/2:1 Not tainted 3.4.0+ #30
[ 1013.437943] RIP: 8eb8:[<ffff88014630a000>] [<ffff88014630a000>] 0xffff880146309fff
[ 1013.459871] RSP: ffffffff8165e919:ffff88014780f408 EFLAGS: 00010046
[ 1013.477909] RAX: 0000000000000001 RBX: ffffffff81104020 RCX: 0000000000000000
[ 1013.499458] RDX: ffff880148008ea8 RSI: ffffffff8131ef40 RDI: ffffffff82203b20
[ 1013.521612] RBP: ffffffff81005751 R08: 0000000000000000 R09: 0000000000000000
[ 1013.543121] R10: ffffffff82cdc318 R11: 0000000000000000 R12: ffff880145cc0000
[ 1013.564614] R13: ffff880148008eb8 R14: 0000000000000002 R15: ffff88014780cb40
[ 1013.586108] FS: 0000000000000000(0000) GS:ffff880148000000(0000) knlGS:0000000000000000
[ 1013.609458] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1013.627420] CR2: 0000000000000002 CR3: 0000000141f10000 CR4: 00000000001407e0
[ 1013.649051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1013.670724] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1013.692376] Process kworker/2:1 (pid: 112, threadinfo ffff88013fe0e000, task ffff88014020a6a0)
[ 1013.717028] Stack:
[ 1013.724131] ffff88014780f570 ffff880145cc0000 0000400000004000 0000000000000000
[ 1013.745918] cccccccccccccccc ffff88014780cca8 ffffffff811072bb ffffffff81651627
[ 1013.767870] ffffffff8118f8a7 ffffffff811072bb ffffffff81f2b6c5 ffffffff81f11bdb
[ 1013.790021] Call Trace:
[ 1013.800701] Code: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a <e7> d7 64 81 ff ff ff ff 01 00 00 00 00 00 00 00 65 d9 64 81 ff
[ 1013.861443] RIP [<ffff88014630a000>] 0xffff880146309fff
[ 1013.884466] RSP <ffff88014780f408>
[ 1013.901507] CR2: 0000000000000002

The solution was to reuse the NMI functions that change the IDT table to make the debug stack keep its current stack (in kernel mode) when hitting a breakpoint:

call debug_stack_set_zero
TRACE_IRQS_ON
call debug_stack_reset

If the TRACE_IRQS_ON happens to hit a breakpoint, it will then keep the current stack and not crash the box.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt
When the NMI handler runs, it checks if it preempted a debug handler and if that handler is using the debug stack. If it is, it changes the IDT table so as not to update the stack; otherwise it would reset the debug stack and corrupt the debug handler it preempted. Now that ftrace uses breakpoints to change functions from nops to callers, many more places may hit a breakpoint. Unfortunately this includes some of the calls that lockdep performs, which causes issues with the debug stack. It too needs to change the debug stack before tracing (if called from the debug handler). Allow debug_stack_set_zero() and debug_stack_reset() to be nested so that the debug handlers can take advantage of them too.
[ Used this_cpu_*() over __get_cpu_var() as suggested by H. Peter Anvin ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
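A hedged sketch of how such nesting is typically made safe: a per-cpu depth counter so that only the outermost reset restores the IDT (the IDT-switch helpers named here are illustrative, not the kernel's actual symbols):

	static DEFINE_PER_CPU(int, debug_stack_use_ctr);

	void debug_stack_set_zero(void)
	{
		this_cpu_inc(debug_stack_use_ctr);
		load_idt_with_zeroed_ist();	/* illustrative helper */
	}

	void debug_stack_reset(void)
	{
		/* Only the outermost caller restores the normal IDT. */
		if (this_cpu_dec_return(debug_stack_use_ctr) == 0)
			load_idt_with_normal_ist();	/* illustrative */
	}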
-