提交 · e73d84e33f15c099ed1df60437700093cb14e46e · openanolis / cloud-kernel

09 12月, 2013 9 次提交

posix-timers: Remove remaining uses of tasklist_lock · e73d84e3

由 Frederic Weisbecker 提交于 10月 11, 2013

The remaining uses of tasklist_lock were mostly about synchronizing
against sighand modifications, getting coherent and safe group samples
and also thread/process wide timers list handling.

All of this is already safely synchronizable with the target's
sighand lock. Let's use it on these places instead.

Also update the comments about locking.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

e73d84e3

posix-timers: Use sighand lock instead of tasklist_lock on timer deletion · 3d7a1427

由 Frederic Weisbecker 提交于 10月 11, 2013

Timer deletion doesn't need the tasklist lock.
We need to protect against:

* concurrent access to the lists p->cputime_expires and
  p->sighand->cputime_expires

* task reaping that may also delete the timer list entry

* timer firing

We already hold the timer lock which protects us against concurrent
timer firing.

The rest only need the targets sighand to be locked.
So hold it and drop the use of tasklist_lock there.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

3d7a1427

posix-timers: Use sighand lock instead of tasklist_lock for task clock sample · 50875788

由 Frederic Weisbecker 提交于 10月 11, 2013

There is no need for the tasklist_lock just to take a process
wide clock sample.

All we need is to get a coherent sample that doesn't race with
exit() and exec():

* exit() may be concurrently reaping a task and flushing its time

* sighand is unstable under exit() and exec(), and the latter also
  result in group leader that can change

To protect against these, locking the target's sighand is enough.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

50875788

posix-timers: Consolidate posix_cpu_clock_get() · 33ab0fec

由 Frederic Weisbecker 提交于 10月 11, 2013

Consolidate the clock sampling common code used for both local
and remote targets.

Note that this introduces a tiny user ABI change: if a
PID is passed to clock_gettime() along the clockid,
we used to forbid a process wide clock sample when that
PID doesn't belong to a group leader. Now after this patch
we allow process wide clock samples if that PID belongs to
the current task, even if the current task is not the
group leader.

But local process wide clock samples are allowed if PID == 0
(current task) even if the current task is not the group leader.
So in the end this should be no big deal as this actually harmonize
the behaviour when the remote sample is actually a local one.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

33ab0fec

posix-timers: Remove useless clock sample on timers cleanup · af82eb3c

由 Frederic Weisbecker 提交于 10月 11, 2013

a0b2062b
("posix_timers: fix racy timer delta caching on task exit") forgot
to remove the arguments used for timer caching.

Fix this leftover.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

af82eb3c

posix-timers: Remove dead task special case · a3222f88

由 Frederic Weisbecker 提交于 10月 11, 2013

Now that we've removed all the optimizations that could
result in NULL timer's targets, we can remove all the
associated special case handling.

Also add some warnings on NULL targets to spot any possible
leftover.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

a3222f88

posix-timers: Cleanup reaped target handling · e26d70d2

由 Frederic Weisbecker 提交于 10月 11, 2013

When a timer's target is seen to be buried, for example on calls
to timer_gettime(), the posix cpu timers code behaves a bit
like a garbage collector and releases early the reference to the
task.

Then again, this optimization complicates the code for no much
value: it's up to the user to release the timer and its associated
ressources by calling timer_delete() after it buries the target
tasks.

Remove this to simplify the code.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

e26d70d2

posix-timers: Remove dead process posix cpu timers caching · d430b917

由 Frederic Weisbecker 提交于 10月 10, 2013

Now that we removed dead thread posix cpu timers caching,
lets remove the dead process wide version. This caching
is similar to the per thread version but it should be even
more rare:

* If the process id dead, we are not reading its timers
status from a thread belonging to its group since they
are all dead. So this caching only concern remote process
timers reads. Now posix cpu timers using itimers or timer_settime()
can't do remote process timers anyway so it's not even clear if there
is actually a user for this caching.

* Unlike per thread timers caching, this only applies to
zombies targets. Buried targets' process wide timers return
0 values. But then again, timer_gettime() can't read remote
process timers, so if the process is dead, there can't be
any reader left anyway.

Then again this caching seem to complicate the code for
corner cases that are probably not worth it. So lets get
rid of it.

Also remove the sample snapshot on dying process timer
that is now useless, as suggested by Kosaki.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

d430b917

posix-timers: Remove dead thread posix cpu timers caching · 724a3713

由 Frederic Weisbecker 提交于 10月 10, 2013

When a task is exiting or has exited, its posix cpu timers
don't tick anymore and won't elapse further. It's too late
for them to expire.

So any further call to timer_gettime() on these timers will
return the same remaining expiry time.

The current code optimize this by caching the remaining delta
and storing it where we use to save the absolute expiration time.
This way, the future calls to timer_gettime() won't need to
compute the difference between the absolute expiration time and
the current time anymore.

Now this optimization doesn't seem to bring much value. Computing
the timer remaining delta is not very costly. Fetching the timer
value OTOH can be costly in two ways:

* CPUCLOCK_SCHED read requires to lock the target's rq. But some
optimizations are on the way to make task_sched_runtime() not holding
the rq lock of a non-running target.

* CPUCLOCK_VIRT/CPUCLOCK_PROF read simply consist in fetching
current->utime/current->stime except when the system uses full
dynticks cputime accounting. The latter requires a per task lock
in order to correctly compute user and system time. But once the
target is dead, this lock shouldn't be contended anyway.

All in one this caching doesn't seem to be justified.
Given that it complicates the code significantly for
few wins, let's remove it on single thread timers.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

724a3713

04 12月, 2013 1 次提交

Merge branch 'timers/core-v2' of... · a934a56e

由 Ingo Molnar 提交于 12月 04, 2013

Merge branch 'timers/core-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/core

Pull dynticks updates from Frederic Weisbecker:

  * Fix a bug where posix cpu timers requeued due to interval got ignored on full
    dynticks CPUs (not a regression though as it only impacts full dynticks and the
    bug is there since we merged full dynticks).

  * Optimizations and cleanups on the use of per CPU APIs to improve code readability,
    performance and debuggability in the nohz subsystem;

  * Optimize posix cpu timer by sparing stub workqueue queue with full dynticks off case

  * Rename some functions to extend with *_this_cpu() suffix for clarity

  * Refine the naming of some context tracking subsystem state accessors

  * Trivial spelling fix by Paul Gortmaker
Signed-off-by: NIngo Molnar <mingo@kernel.org>

a934a56e

03 12月, 2013 16 次提交

Merge branch 'leds-fixes-for-3.13' of... · dea4f48a

由 Linus Torvalds 提交于 12月 02, 2013

Merge branch 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds

Pull LED subsystem bugfix from Bryan Wu.

* 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
  leds: pwm: Fix for deferred probe in DT booted mode

dea4f48a

leds: pwm: Fix for deferred probe in DT booted mode · aa1a6d6d

由 Peter Ujfalusi 提交于 11月 28, 2013

We need to make sure that the error code from devm_of_pwm_get() is the one
the module returns in case of failure.
Restructure the code to make this possible for DT booted case.
With this patch the driver can ask for deferred probing when the board is
booted with DT.
Fixes for example omap4-sdp board's keyboard backlight led.
Signed-off-by: NPeter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: NBryan Wu <cooloney@gmail.com>

aa1a6d6d

uio: we cannot mmap unaligned page contents · b6550287

由 Linus Torvalds 提交于 12月 02, 2013

In commit 7314e613 ("Fix a few incorrectly checked
[io_]remap_pfn_range() calls") the uio driver started more properly
checking the passed-in user mapping arguments against the size of the
actual uio driver data.

That in turn exposed that some driver authors apparently didn't realize
that mmap can only work on a page granularity, and had tried to use it
with smaller mappings, with the new size check catching that out.

So since it's not just the user mmap() arguments that can be confused,
make the uio mmap code also verify that the uio driver has the memory
allocated at page boundaries in order for mmap to work.  If the device
memory isn't properly aligned, we return

  [ENODEV]
    The fildes argument refers to a file whose type is not supported by mmap().

as per the open group documentation on mmap.
Reported-by: NHolger Brunck <holger.brunck@keymile.com>
Acked-by: NGreg KH <gregkh@linuxfoundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b6550287

posix-timers: Fix full dynticks CPUs kick on timer rescheduling · c925077c

由 Frederic Weisbecker 提交于 11月 06, 2013

A posix CPU timer can be rearmed while it is firing or after it is
notified with a signal. This can happen for example with timers that
were set with a non zero interval in timer_settime().

This rearming can happen in two places:

1) On timer firing time, which happens on the target's tick. If the timer
can't trigger a signal because it is ignored, it reschedules itself
to honour the timer interval.

2) On signal handling from the timer's notification target. This one
can be a different task than the timer's target itself. Once the
signal is notified, the notification target rearms the timer, again
to honour the timer interval.

When a timer is rearmed, we need to notify the full dynticks CPUs
such that they restart their tick in case they are running tasks that
may have a share in elapsing this timer.

Now the 1st case above handles full dynticks CPUs with a call to
posix_cpu_timer_kick_nohz() from the posix cpu timer firing code. But
the second case ignores the fact that some CPUs may run non-idle tasks
with their tick off. As a result, when a timer is resheduled after its signal
notification, the full dynticks CPUs may completely ignore it and not
tick on the timer as expected

This patch fixes this bug by handling both cases in one. All we need
is to move the kick to the rearming common code in posix_cpu_timer_schedule().
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Olivier Langlois <olivier@olivierlanglois.net>

c925077c

posix-timers: Spare workqueue if there is no full dynticks CPU to kick · d4283c65

由 Frederic Weisbecker 提交于 11月 06, 2013

After a posix cpu timer is set, a workqueue is scheduled in order to
kick the full dynticks CPUs and let them restart their tick if
necessary in case the task they are running is concerned by the
new timer.

This kick is implemented by way of IPIs, which require interrupts
to be enabled, hence the need for a workqueue to raise them because
the posix cpu timer set path has interrupts disabled.

Now if there is no full dynticks CPU on the system, the workqueue is
still scheduled but it simply won't send any IPI and return immediately.

So lets spare that worqueue when it is not needed.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>

d4283c65

context_tracking: Rename context_tracking_active() to context_tracking_cpu_is_enabled() · d0df09eb

由 Frederic Weisbecker 提交于 11月 06, 2013

We currently have a confusing couple of API naming with the existing
context_tracking_active() and context_tracking_is_enabled().

Lets keep the latter one, context_tracking_is_enabled(), for global
context tracking state check and use context_tracking_cpu_is_enabled()
for local state check.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>

d0df09eb

context_tracking: Wrap static key check into more intuitive function name · 58135f57

由 Frederic Weisbecker 提交于 11月 06, 2013

Use a function with a meaningful name to check the global context
tracking state. static_key_false() is a bit confusing for reviewers.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>

58135f57

trivial: fix spelling in CONTEXT_TRACKING_FORCE help text · 99c8b1ea

由 Paul Gortmaker 提交于 10月 24, 2013

Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>

99c8b1ea

nohz: Convert a few places to use local per cpu accesses · e8fcaa5c

由 Frederic Weisbecker 提交于 8月 07, 2013

A few functions use remote per CPU access APIs when they
deal with local values.

Just do the right conversion to improve performance, code
readability and debug checks.

While at it, lets extend some of these function names with *_this_cpu()
suffix in order to display their purpose more clearly.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>

e8fcaa5c

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a45299e7

由 Linus Torvalds 提交于 12月 02, 2013

Pull irq fixes from Thomas Gleixner:
 - Correction of fuzzy and fragile IRQ_RETVAL macro
 - IRQ related resume fix affecting only XEN
 - ARM/GIC fix for chained GIC controllers

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: Gic: fix boot for chained gics
  irq: Enable all irqs unconditionally in irq_resume
  genirq: Correct fuzzy and fragile IRQ_RETVAL() definition

a45299e7

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a0b57ca3

由 Linus Torvalds 提交于 12月 02, 2013

Pull scheduler fixes from Ingo Molnar:
 "Various smaller fixlets, all over the place"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/doc: Fix generation of device-drivers
  sched: Expose preempt_schedule_irq()
  sched: Fix a trivial typo in comments
  sched: Remove unused variable in 'struct sched_domain'
  sched: Avoid NULL dereference on sd_busy
  sched: Check sched_domain before computing group power
  MAINTAINERS: Update file patterns in the lockdep and scheduler entries

a0b57ca3

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e321ae4c

由 Linus Torvalds 提交于 12月 02, 2013

Pull perf fixes from Ingo Molnar:
 "Misc kernel and tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools lib traceevent: Fix conversion of pointer to integer of different size
  perf/trace: Properly use u64 to hold event_id
  perf: Remove fragile swevent hlist optimization
  ftrace, perf: Avoid infinite event generation loop
  tools lib traceevent: Fix use of multiple options in processing field
  perf header: Fix possible memory leaks in process_group_desc()
  perf header: Fix bogus group name
  perf tools: Tag thread comm as overriden

e321ae4c

Merge tag 'stable/for-linus-3.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · bcc2f9b7

由 Linus Torvalds 提交于 12月 02, 2013

Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
 "Fixes to patches that went in this merge window along with a latent
  bug:
   - Fix lazy flushing in case m2p override fails.
   - Fix module compile issues with ARM/Xen
   - Add missing call to DMA map page for Xen SWIOTLB for ARM"

* tag 'stable/for-linus-3.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/gnttab: leave lazy MMU mode in the case of a m2p override failure
  xen/arm: p2m_init and p2m_lock should be static
  arm/xen: Export phys_to_mach to fix Xen module link errors
  swiotlb-xen: add missing xen_dma_map_page call

bcc2f9b7

Merge tag 'spi-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · aeac8103

由 Linus Torvalds 提交于 12月 02, 2013

Pull spi fixes from Mark Brown:
 "A smattering of driver specific fixes here, including a bunch for a
  long standing common pattern in the error handling paths, and a fix
  for an embarrassing thinko in the new devm master registration code"

* tag 'spi-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi/pxa2xx: Restore private register bits.
  spi/qspi: Fix qspi remove path.
  spi/qspi: cleanup pm_runtime error check.
  spi/qspi: set correct platform drvdata in ti_qspi_probe()
  spi/pxa2xx: add new ACPI IDs
  spi: core: invert success test in devm_spi_register_master
  spi: spi-mxs: fix reference leak to master in mxs_spi_remove()
  spi: bcm63xx: fix reference leak to master in bcm63xx_spi_remove()
  spi: txx9: fix reference leak to master in txx9spi_remove()
  spi: mpc512x: fix reference leak to master in mpc512x_psc_spi_do_remove()
  spi: rspi: use platform drvdata correctly in rspi_remove()
  spi: bcm2835: fix reference leak to master in bcm2835_spi_remove()

aeac8103

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 5fc92de3

由 Linus Torvalds 提交于 12月 02, 2013

Pull networking updates from David Miller:
 "Here is a pile of bug fixes that accumulated while I was in Europe"

 1) In fixing kernel leaks to userspace during copying of socket
    addresses, we broke a case that used to work, namely the user
    providing a buffer larger than the in-kernel generic socket address
    structure.  This broke Ruby amongst other things.  Fix from Dan
    Carpenter.

 2) Fix regression added by byte queue limit support in 8139cp driver,
    from Yang Yingliang.

 3) The addition of MSG_SENDPAGE_NOTLAST buggered up a few sendpage
    implementations, they should just treat it the same as MSG_MORE.
    Fix from Richard Weinberger and Shawn Landden.

 4) Handle icmpv4 errors received on ipv6 SIT tunnels correctly, from
    Oussama Ghorbel.  In particular we should send an ICMPv6 unreachable
    in such situations.

 5) Fix some regressions in the recent genetlink fixes, in particular
    get the pmcraid driver to use the new safer interfaces correctly.
    From Johannes Berg.

 6) macvtap was converted to use a per-cpu set of statistics, but some
    code was still bumping tx_dropped elsewhere.  From Jason Wang.

 7) Fix build failure of xen-netback due to missing include on some
    architectures, from Andy Whitecroft.

 8) macvtap double counts received packets in statistics, fix from Vlad
    Yasevich.

 9) Fix various cases of using *_STATS_BH() when *_STATS() is more
    appropriate.  From Eric Dumazet and Hannes Frederic Sowa.

10) Pktgen ipsec mode doesn't update the ipv4 header length and checksum
    properly after encapsulation.  Fix from Fan Du.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
  net/mlx4_en: Remove selftest TX queues empty condition
  {pktgen, xfrm} Update IPv4 header total len and checksum after tranformation
  virtio_net: make all RX paths handle erors consistently
  virtio_net: fix error handling for mergeable buffers
  virtio_net: Fixed a trivial typo (fitler --> filter)
  netem: fix gemodel loss generator
  netem: fix loss 4 state model
  netem: missing break in ge loss generator
  net/hsr: Support iproute print_opt ('ip -details ...')
  net/hsr: Very small fix of comment style.
  MAINTAINERS: Added net/hsr/ maintainer
  ipv6: fix possible seqlock deadlock in ip6_finish_output2
  ixgbe: Make ixgbe_identify_qsfp_module_generic static
  ixgbe: turn NETIF_F_HW_L2FW_DOFFLOAD off by default
  ixgbe: ixgbe_fwd_ring_down needs to be static
  e1000: fix possible reset_task running after adapter down
  e1000: fix lockdep warning in e1000_reset_task
  e1000: prevent oops when adapter is being closed and reset simultaneously
  igb: Fixed Wake On LAN support
  inet: fix possible seqlock deadlocks
  ...

5fc92de3

vfs: fix subtle use-after-free of pipe_inode_info · b0d8d229

由 Linus Torvalds 提交于 12月 02, 2013

The pipe code was trying (and failing) to be very careful about freeing
the pipe info only after the last access, with a pattern like:

        spin_lock(&inode->i_lock);
        if (!--pipe->files) {
                inode->i_pipe = NULL;
                kill = 1;
        }
        spin_unlock(&inode->i_lock);
        __pipe_unlock(pipe);
        if (kill)
                free_pipe_info(pipe);

where the final freeing is done last.

HOWEVER.  The above is actually broken, because while the freeing is
done at the end, if we have two racing processes releasing the pipe
inode info, the one that *doesn't* free it will decrement the ->files
count, and unlock the inode i_lock, but then still use the
"pipe_inode_info" afterwards when it does the "__pipe_unlock(pipe)".

This is *very* hard to trigger in practice, since the race window is
very small, and adding debug options seems to just hide it by slowing
things down.

Simon originally reported this way back in July as an Oops in
kmem_cache_allocate due to a single bit corruption (due to the final
"spin_unlock(pipe->mutex.wait_lock)" incrementing a field in a different
allocation that had re-used the free'd pipe-info), it's taken this long
to figure out.

Since the 'pipe->files' accesses aren't even protected by the pipe lock
(we very much use the inode lock for that), the simple solution is to
just drop the pipe lock early.  And since there were two users of this
pattern, create a helper function for it.

Introduced commit ba5bb147 ("pipe: take allocation and freeing of
pipe_inode_info out of ->i_mutex").
Reported-by: NSimon Kirby <sim@hostway.ca>
Reported-by: NIan Applegate <ia@cloudflare.com>
Acked-by: NAl Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org   # v3.10+
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b0d8d229

02 12月, 2013 6 次提交

net/mlx4_en: Remove selftest TX queues empty condition · 833846e8

由 Eugenia Emantayev 提交于 12月 01, 2013

Remove waiting for TX queues to become empty during selftest.
This check is not necessary for any purpose, and might put
the driver into an infinite loop.
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NAmir Vadai <amirv@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

833846e8

{pktgen, xfrm} Update IPv4 header total len and checksum after tranformation · 3868204d

由 fan.du 提交于 12月 01, 2013

commit a553e4a6 ("[PKTGEN]: IPSEC support")
tried to support IPsec ESP transport transformation for pktgen, but acctually
this doesn't work at all for two reasons(The orignal transformed packet has
bad IPv4 checksum value, as well as wrong auth value, reported by wireshark)

- After transpormation, IPv4 header total length needs update,
  because encrypted payload's length is NOT same as that of plain text.

- After transformation, IPv4 checksum needs re-caculate because of payload
  has been changed.

With this patch, armmed pktgen with below cofiguration, Wireshark is able to
decrypted ESP packet generated by pktgen without any IPv4 checksum error or
auth value error.

pgset "flag IPSEC"
pgset "flows 1"
Signed-off-by: NFan Du <fan.du@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3868204d

virtio_net: make all RX paths handle erors consistently · f121159d

由 Michael S. Tsirkin 提交于 11月 28, 2013

receive mergeable now handles errors internally.
Do same for big and small packet paths, otherwise
the logic is too hard to follow.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Acked-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f121159d

virtio_net: fix error handling for mergeable buffers · 8fc3b9e9

由 Michael S. Tsirkin 提交于 11月 28, 2013

Eric Dumazet noticed that if we encounter an error
when processing a mergeable buffer, we don't
dequeue all of the buffers from this packet,
the result is almost sure to be loss of networking.

Jason Wang noticed that we also leak a page and that we don't decrement
the rq buf count, so we won't repost buffers (a resource leak).

Fix both issues.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael Dalton <mwdalton@google.com>
Reported-by: NEric Dumazet <edumazet@google.com>
Reported-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Acked-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8fc3b9e9

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · e84a2a49

由 Linus Torvalds 提交于 12月 01, 2013

Pull UML fixes from Richard Weinberger:
 "Fixes two regressions which got introduced this merge window"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Build always with -mcmodel=large on 64bit
  um: Rename print_stack_trace to do_stack_trace

e84a2a49

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 1d07489a

由 Linus Torvalds 提交于 12月 01, 2013

Pull ARM fixes from Russell King:
 "Some ARM fixes, the biggest of which is the fix for the signal return
  codes; this came up due to an interaction between the V7M nommu
  changes and the BE8 changes.  Dave Martin spotted that the kexec
  trampoline wasn't being correctly copied (in a way which allows
  Thumb-2 to work).

  I've also fixed a number of breakages on footbridge platforms as I've
  upgraded one of my machines to v3.12...  one which had a 1200 day
  uptime"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation
  ARM: 7897/1: kexec: Use the right ISA for relocate_new_kernel
  ARM: 7895/1: signal: fix armv7-m build issue in sigreturn_codes.S
  ARM: footbridge: fix EBSA285 LEDs
  ARM: footbridge: fix VGA initialisation
  ARM: fix booting low-vectors machines
  ARM: dma-mapping: check DMA mask against available memory

1d07489a

01 12月, 2013 8 次提交

um: Build always with -mcmodel=large on 64bit · fff6540c

由 Richard Weinberger 提交于 11月 29, 2013

On UML SUBARCH can be x86, x86_64 and i386 and if it is x86
we use uname -m to select a defconfig.
Therefore we can no longer use -mcmodel=large only if SUBARCH
is x86_64.
Reported-and-tested-by: NBoaz Harrosh <bharrosh@panasas.com>
Signed-off-by: NRichard Weinberger <richard@nod.at>

fff6540c

um: Rename print_stack_trace to do_stack_trace · 8ed12fcc

由 Richard Weinberger 提交于 11月 21, 2013

We cannot use print_stack_trace because the name conflicts
with linux/stacktrace.h.
Reported-by: NRandy Dunlap <rdunlap@infradead.org>
Acked-by: NRandy Dunlap <rdunlap@infradead.org>
Signed-off-by: NRichard Weinberger <richard@nod.at>

8ed12fcc

ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation · 11d4bb1b

由 Fabio Estevam 提交于 11月 30, 2013

Currently mx53 (CortexA8) running at 1GHz reports:
Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760)

Tom Evans verified that alignments of 0x0 and 0x8 run the two instructions of __loop_delay in one clock cycle (1 clock/loop), while alignments of 0x4 and 0xc take 3 clocks to run the loop twice. (1.5 clock/loop)

The original object code looks like this:

00000010 <__loop_const_udelay>:
  10:	e3e01000 	mvn	r1, #0
  14:	e51f201c 	ldr	r2, [pc, #-28]	; 0 <__loop_udelay-0x8>
  18:	e5922000 	ldr	r2, [r2]
  1c:	e0800921 	add	r0, r0, r1, lsr #18
  20:	e1a00720 	lsr	r0, r0, #14
  24:	e0822b21 	add	r2, r2, r1, lsr #22
  28:	e1a02522 	lsr	r2, r2, #10
  2c:	e0000092 	mul	r0, r2, r0
  30:	e0800d21 	add	r0, r0, r1, lsr #26
  34:	e1b00320 	lsrs	r0, r0, #6
  38:	01a0f00e 	moveq	pc, lr

0000003c <__loop_delay>:
  3c:	e2500001 	subs	r0, r0, #1
  40:	8afffffe 	bhi	3c <__loop_delay>
  44:	e1a0f00e 	mov	pc, lr

After adding the 'align 3' directive to __loop_delay (align to 8 bytes):

00000010 <__loop_const_udelay>:
  10:	e3e01000 	mvn	r1, #0
  14:	e51f201c 	ldr	r2, [pc, #-28]	; 0 <__loop_udelay-0x8>
  18:	e5922000 	ldr	r2, [r2]
  1c:	e0800921 	add	r0, r0, r1, lsr #18
  20:	e1a00720 	lsr	r0, r0, #14
  24:	e0822b21 	add	r2, r2, r1, lsr #22
  28:	e1a02522 	lsr	r2, r2, #10
  2c:	e0000092 	mul	r0, r2, r0
  30:	e0800d21 	add	r0, r0, r1, lsr #26
  34:	e1b00320 	lsrs	r0, r0, #6
  38:	01a0f00e 	moveq	pc, lr
  3c:	e320f000 	nop	{0}

00000040 <__loop_delay>:
  40:	e2500001 	subs	r0, r0, #1
  44:	8afffffe 	bhi	40 <__loop_delay>
  48:	e1a0f00e 	mov	pc, lr
  4c:	e320f000 	nop	{0}

, which now reports:
Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736)

Some more test results:

On mx31 (ARM1136) running at 532 MHz, before the patch:
Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184)

On mx31 (ARM1136) running at 532 MHz after the patch:
Calibrating delay loop... 528.79 BogoMIPS (lpj=2643968)

Also tested on mx6 (CortexA9) and on mx27 (ARM926), which shows the same
BogoMIPS value before and after this patch.
Reported-by: NTom Evans <tom_usenet@optusnet.com.au>
Suggested-by: NTom Evans <tom_usenet@optusnet.com.au>
Signed-off-by: NFabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

11d4bb1b

ARM: 7897/1: kexec: Use the right ISA for relocate_new_kernel · e2ccba49

由 Dave Martin 提交于 11月 25, 2013

Copying a function with memcpy() and then trying to execute the
result isn't trivially portable to Thumb.

This patch modifies the kexec soft restart code to copy its
assembler trampoline relocate_new_kernel() using fncpy() instead,
so that relocate_new_kernel can be in the same ISA as the rest of
the kernel without problems.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Reported-by: NTaras Kondratiuk <taras.kondratiuk@linaro.org>
Tested-by: NTaras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

e2ccba49

ARM: 7895/1: signal: fix armv7-m build issue in sigreturn_codes.S · 50913336

由 Victor Kamensky 提交于 11月 21, 2013

After "ARM: signal: sigreturn_codes should be endian neutral to
work in BE8" commit, thumb only platforms, like armv7m, fails to
compile sigreturn_codes.S. The reason is that for such arch
values '.arm' directive and arm opcodes are not allowed.

Fix conditionally enables arm opcodes only if no CONFIG_CPU_THUMBONLY
defined and it uses .org instructions to keep sigreturn_codes
layout.
Suggested-by: NDave Martin <Dave.Martin@arm.com>
Signed-off-by: NVictor Kamensky <victor.kamensky@linaro.org>
Tested-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: NDave Martin <Dave.Martin@arm.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

50913336

ARM: footbridge: fix EBSA285 LEDs · 67130c54

由 Russell King 提交于 11月 29, 2013

- The LEDs register is write-only: it can't be read-modify-written.
- The LEDs are write-1-for-off not 0.
- The check for the platform was inverted.

Fixes: cf6856d6 ("ARM: mach-footbridge: retire custom LED code")
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Cc: stable@vger.kernel.org

67130c54

virtio_net: Fixed a trivial typo (fitler --> filter) · 99e872ae

由 Thomas Huth 提交于 11月 29, 2013

"MAC filter" sounds more reasonable than "MAC fitler".
Signed-off-by: NThomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

99e872ae

netem: fix gemodel loss generator · eff7979f

由 stephen hemminger 提交于 11月 29, 2013

Patch from developers of the alternative loss models, downloaded from:
   http://netgroup.uniroma2.it/twiki/bin/view.cgi/Main/NetemCLG

 "in case 2, of the switch we change the direction of the inequality to
  net_random()>clg->a3, because clg->a3 is h in the GE model and when h
  is 0 all packets will be lost."
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eff7979f

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功