提交 · 0f57d86787d8b1076ea8f9cbdddda2a46d534a27 · openeuler / Kernel

11 6月, 2015 1 次提交

ALSA: hda - Re-add the lost fake mute support · a686ec4c

由 Takashi Iwai 提交于 6月 11, 2015

Yet another regression by the transition to regmap cache; for better
usability, we had the fake mute control using the zero amp value for
Conexant codecs, and this was forgotten in the new hda core code.

Since the bits 4-7 are unused for the amp registers (as we follow the
syntax of AMP_GET verb), the bit 4 is now used to indicate the fake
mute.  For setting this flag, snd_hda_codec_amp_update() becomes a
function from a simple macro.  The bonus is that it gained a proper
function description.
Signed-off-by: NTakashi Iwai <tiwai@suse.de>

a686ec4c

09 6月, 2015 1 次提交

iommu/vt-d: Change PASID support to bit 40 of Extended Capability Register · bd00c606

由 David Woodhouse 提交于 6月 09, 2015

The existing hardware implementations with PASID support advertised in
bit 28? Forget them. They do not exist. Bit 28 means nothing. When we
have something that works, it'll use bit 40. Do not attempt to infer
anything meaningful from bit 28.

This will be reflected in an updated VT-d spec in the extremely near
future.
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>

bd00c606

01 6月, 2015 2 次提交

include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h · 8a7b19d8

由 Mikko Rapeli 提交于 5月 30, 2015

Fixes userspace compilation error:

error: unknown type name ‘__virtio16’
  __virtio16 tag;
Signed-off-by: NMikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>

8a7b19d8

tcp: fix child sockets to use system default congestion control if not set · 9f950415

由 Neal Cardwell 提交于 5月 29, 2015

Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.

This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1e ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.

In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.

Fixes: 55d8694f ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal <fw@strlen.de>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9f950415

31 5月, 2015 3 次提交

target: Use a PASSTHROUGH flag instead of transport_types · a3541703

由 Andy Grover 提交于 5月 19, 2015

It seems like we only care if a transport is passthrough or not. Convert
transport_type to a flags field and replace TRANSPORT_PLUGIN_* with a
flag, TRANSPORT_FLAG_PASSTHROUGH.
Signed-off-by: NAndy Grover <agrover@redhat.com>
Reviewed-by: NIlias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

a3541703

target: Move passthrough CDB parsing into a common function · 7bfea53b

由 Andy Grover 提交于 5月 19, 2015

Aside from whether they handle BIDI ops or not, parsing of the CDB by
kernel and user SCSI passthrough modules should be identical. Move this
into a new passthrough_parse_cdb() and call it from tcm-pscsi and tcm-user.
Reported-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NIlias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: NAndy Grover <agrover@redhat.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

7bfea53b

target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem · d588cf8f

由 Christoph Hellwig 提交于 5月 03, 2015

There is just one configfs subsystem in the target code, so we might as
well add two helpers to reference / unreference it from the core code
instead of passing pointers to it around.

This fixes a regression introduced for v4.1-rc1 with commit 9ac8928e,
where configfs_depend_item() callers using se_tpg_tfo->tf_subsys would
fail, because the assignment from the original target_core_subsystem[]
is no longer happening at target_register_template() time.

(Fix target_core_exit_configfs pointer dereference - Sagi)
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reported-by: NHimanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>

d588cf8f

29 5月, 2015 5 次提交

tracing/mm: don't trace mm_page_pcpu_drain on offline cpus · 649b8de2

由 Shreyas B. Prabhu 提交于 5月 28, 2015

Since tracepoints use RCU for protection, they must not be called on
offline cpus.  trace_mm_page_pcpu_drain can be called on an offline cpu
in this scenario caught by LOCKDEP:

     ===============================
     [ INFO: suspicious RCU usage. ]
     4.1.0-rc1+ #9 Not tainted
     -------------------------------
     include/trace/events/kmem.h:265 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    RCU used illegally from offline CPU!
    rcu_scheduler_active = 1, debug_locks = 1
     1 lock held by swapper/5/0:
      #0:  (&(&zone->lock)->rlock){..-...}, at: [<c0000000002073b0>] .free_pcppages_bulk+0x70/0x920

    stack backtrace:
     CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.1.0-rc1+ #9
     Call Trace:
       .dump_stack+0x98/0xd4 (unreliable)
       .lockdep_rcu_suspicious+0x108/0x170
       .free_pcppages_bulk+0x60c/0x920
       .free_hot_cold_page+0x208/0x280
       .destroy_context+0x90/0xd0
       .__mmdrop+0x58/0x160
       .idle_task_exit+0xf0/0x100
       .pnv_smp_cpu_kill_self+0x58/0x2c0
       .cpu_die+0x34/0x50
       .arch_cpu_idle_dead+0x20/0x40
       .cpu_startup_entry+0x708/0x7a0
       .start_secondary+0x36c/0x3a0
       start_secondary_prolog+0x10/0x14

Fix this by converting mm_page_pcpu_drain trace point into
TRACE_EVENT_CONDITION where condition is cpu_online(smp_processor_id())
Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

649b8de2

tracing/mm: don't trace mm_page_free on offline cpus · 1f0c27b5

由 Shreyas B. Prabhu 提交于 5月 28, 2015

Since tracepoints use RCU for protection, they must not be called on
offline cpus.  trace_mm_page_free can be called on an offline cpu in this
scenario caught by LOCKDEP:

     ===============================
     [ INFO: suspicious RCU usage. ]
     4.1.0-rc1+ #9 Not tainted
     -------------------------------
     include/trace/events/kmem.h:170 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    RCU used illegally from offline CPU!
    rcu_scheduler_active = 1, debug_locks = 1
     no locks held by swapper/1/0.

    stack backtrace:
     CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.1.0-rc1+ #9
     Call Trace:
       .dump_stack+0x98/0xd4 (unreliable)
       .lockdep_rcu_suspicious+0x108/0x170
       .free_pages_prepare+0x494/0x680
       .free_hot_cold_page+0x50/0x280
       .destroy_context+0x90/0xd0
       .__mmdrop+0x58/0x160
       .idle_task_exit+0xf0/0x100
       .pnv_smp_cpu_kill_self+0x58/0x2c0
       .cpu_die+0x34/0x50
       .arch_cpu_idle_dead+0x20/0x40
       .cpu_startup_entry+0x708/0x7a0
       .start_secondary+0x36c/0x3a0
       start_secondary_prolog+0x10/0x14

Fix this by converting mm_page_free trace point into TRACE_EVENT_CONDITION
where condition is cpu_online(smp_processor_id())
Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1f0c27b5

tracing/mm: don't trace kmem_cache_free on offline cpus · e5feb1eb

由 Shreyas B. Prabhu 提交于 5月 28, 2015

Since tracepoints use RCU for protection, they must not be called on
offline cpus.  trace_kmem_cache_free can be called on an offline cpu in
this scenario caught by LOCKDEP:

    ===============================
    [ INFO: suspicious RCU usage. ]
    4.1.0-rc1+ #9 Not tainted
    -------------------------------
    include/trace/events/kmem.h:148 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    RCU used illegally from offline CPU!
    rcu_scheduler_active = 1, debug_locks = 1
    no locks held by swapper/1/0.

    stack backtrace:
    CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.1.0-rc1+ #9
    Call Trace:
      .dump_stack+0x98/0xd4 (unreliable)
      .lockdep_rcu_suspicious+0x108/0x170
      .kmem_cache_free+0x344/0x4b0
      .__mmdrop+0x4c/0x160
      .idle_task_exit+0xf0/0x100
      .pnv_smp_cpu_kill_self+0x58/0x2c0
      .cpu_die+0x34/0x50
      .arch_cpu_idle_dead+0x20/0x40
      .cpu_startup_entry+0x708/0x7a0
      .start_secondary+0x36c/0x3a0
      start_secondary_prolog+0x10/0x14

Fix this by converting kmem_cache_free trace point into
TRACE_EVENT_CONDITION where condition is cpu_online(smp_processor_id())
Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reported-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: NPreeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e5feb1eb

percpu_counter: batch size aware __percpu_counter_compare() · 80188b0d

由 Dave Chinner 提交于 5月 29, 2015

XFS uses non-stanard batch sizes for avoiding frequent global
counter updates on it's allocated inode counters, as they increment
or decrement in batches of 64 inodes. Hence the standard percpu
counter batch of 32 means that the counter is effectively a global
counter. Currently Xfs uses a batch size of 128 so that it doesn't
take the global lock on every single modification.

However, Xfs also needs to compare accurately against zero, which
means we need to use percpu_counter_compare(), and that has a
hard-coded batch size of 32, and hence will spuriously fail to
detect when it is supposed to use precise comparisons and hence
the accounting goes wrong.

Add __percpu_counter_compare() to take a custom batch size so we can
use it sanely in XFS and factor percpu_counter_compare() to use it.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NDave Chinner <david@fromorbit.com>

80188b0d

block: discard bdi_unregister() in favour of bdi_destroy() · aad653a0

由 NeilBrown 提交于 5月 19, 2015

bdi_unregister() now contains very little functionality.

It contains a "WARN_ON" if bdi->dev is NULL.  This warning is of no
real consequence as bdi->dev isn't needed by anything else in the function,
and it triggers if
   blk_cleanup_queue() -> bdi_destroy()
is called before bdi_unregister, which happens since
  Commit: 6cd18e71 ("block: destroy bdi before blockdev is unregistered.")

So this isn't wanted.

It also calls bdi_set_min_ratio().  This needs to be called after
writes through the bdi have all been flushed, and before the bdi is destroyed.
Calling it early is better than calling it late as it frees up a global
resource.

Calling it immediately after bdi_wb_shutdown() in bdi_destroy()
perfectly fits these requirements.

So bdi_unregister() can be discarded with the important content moved to
bdi_destroy(), as can the
  writeback_bdi_unregister
event which is already not used.
Reported-by: NMike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org (v4.0)
Fixes: c4db59d3 ("fs: don't reassign dirty inodes to default_backing_dev_info")
Fixes: 6cd18e71 ("block: destroy bdi before blockdev is unregistered.")
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: NDan Williams <dan.j.williams@intel.com>
Tested-by: NNicholas Moulin <nicholas.w.moulin@linux.intel.com>
Signed-off-by: NNeilBrown <neilb@suse.de>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

aad653a0

28 5月, 2015 3 次提交

mac80211: Fix mac80211.h docbook comments · 3a7af58f

由 Jonathan Corbet 提交于 4月 13, 2015

A couple of enums in mac80211.h became structures recently, but the
comments didn't follow suit, leading to errors like:

  Error(.//include/net/mac80211.h:367): Cannot parse enum!
  Documentation/DocBook/Makefile:93: recipe for target 'Documentation/DocBook/80211.xml' failed
  make[1]: *** [Documentation/DocBook/80211.xml] Error 1
  Makefile:1361: recipe for target 'mandocs' failed
  make: *** [mandocs] Error 2

Fix the comments comments accordingly.  Added a couple of other small
comment fixes while I was there to silence other recently-added docbook
warnings.
Reported-by: NJim Davis <jim.epost@gmail.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

3a7af58f

cpumask_set_cpu_local_first => cpumask_local_spread, lament · f36963c9

由 Rusty Russell 提交于 5月 09, 2015

da91309e (cpumask: Utility function to set n'th cpu...) created a
genuinely weird function.  I never saw it before, it went through DaveM.
(He only does this to make us other maintainers feel better about our own
mistakes.)

cpumask_set_cpu_local_first's purpose is say "I need to spread things
across N online cpus, choose the ones on this numa node first"; you call
it in a loop.

It can fail.  One of the two callers ignores this, the other aborts and
fails the device open.

It can fail in two ways: allocating the off-stack cpumask, or through a
convoluted codepath which AFAICT can only occur if cpu_online_mask
changes.  Which shouldn't happen, because if cpu_online_mask can change
while you call this, it could return a now-offline cpu anyway.

It contains a nonsensical test "!cpumask_of_node(numa_node)".  This was
drawn to my attention by Geert, who said this causes a warning on Sparc.
It sets a single bit in a cpumask instead of returning a cpu number,
because that's what the callers want.

It could be made more efficient by passing the previous cpu rather than
an index, but that would be more invasive to the callers.

Fixes: da91309e
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (then rebased)
Tested-by: NAmir Vadai <amirv@mellanox.com>
Acked-by: NAmir Vadai <amirv@mellanox.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>

f36963c9

sctp: Fix mangled IPv4 addresses on a IPv6 listening socket · 9302d7bb

由 Jason Gunthorpe 提交于 5月 26, 2015

sctp_v4_map_v6 was subtly writing and reading from members
of a union in a way the clobbered data it needed to read before
it read it.

Zeroing the v6 flowinfo overwrites the v4 sin_addr with 0, meaning
that every place that calls sctp_v4_map_v6 gets ::ffff:0.0.0.0 as the
result.

Reorder things to guarantee correct behaviour no matter what the
union layout is.

This impacts user space clients that open an IPv6 SCTP socket and
receive IPv4 connections. Prior to 299ee user space would see a
sockaddr with AF_INET and a correct address, after 299ee the sockaddr
is AF_INET6, but the address is wrong.

Fixes: 299ee123 (sctp: Fixup v4mapped behaviour to comply with Sock API)
Signed-off-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9302d7bb

27 5月, 2015 2 次提交

perf/x86: Fix event/group validation · b371b594

由 Peter Zijlstra 提交于 5月 21, 2015

Commit 43b45780 ("perf/x86: Reduce stack usage of
x86_schedule_events()") violated the rule that 'fake' scheduling; as
used for event/group validation; should not change the event state.

This went mostly un-noticed because repeated calls of
x86_pmu::get_event_constraints() would give the same result. And
x86_pmu::put_event_constraints() would mostly not do anything.

Commit e979121b ("perf/x86/intel: Implement cross-HT corruption
bug workaround") made the situation much worse by actually setting the
event->hw.constraint value to NULL, so when validation and actual
scheduling interact we get NULL ptr derefs.

Fix it by removing the constraint pointer from the event and move it
back to an array, this time in cpuc instead of on the stack.

validate_group()
  x86_schedule_events()
    event->hw.constraint = c; # store

      <context switch>
        perf_task_event_sched_in()
          ...
            x86_schedule_events();
              event->hw.constraint = c2; # store

              ...

              put_event_constraints(event); # assume failure to schedule
                intel_put_event_constraints()
                  event->hw.constraint = NULL;

      <context switch end>

    c = event->hw.constraint; # read -> NULL

    if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref

This in particular is possible when the event in question is a
cpu-wide event and group-leader, where the validate_group() tries to
add an event to the group.
Reported-by: NVince Weaver <vincent.weaver@maine.edu>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Hunter <ahh@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 43b45780 ("perf/x86: Reduce stack usage of x86_schedule_events()")
Fixes: e979121b ("perf/x86/intel: Implement cross-HT corruption bug workaround")
Signed-off-by: NIngo Molnar <mingo@kernel.org>

b371b594

drivers: of/base: move of_init to driver_init · 194ec936

由 Sudeep Holla 提交于 5月 14, 2015

Commit 5590f319 ("drivers/core/of: Add symlink to device-tree from
devices with an OF node") adds the symlink `of_node` for each device
pointing to it's device tree node while creating/initialising it.

However the devicetree sysfs is created and setup in of_init which is
executed at core_initcall level. For all the devices created before
of_init, the following error is thrown:
	"Error -2(-ENOENT) creating of_node link"

Like many other components in driver model, initialize the sysfs support
for OF/devicetree from driver_init so that it's ready before any devices
are created.

Fixes: 5590f319 ("drivers/core/of: Add symlink to device-tree from
	devices with an OF node")
Suggested-by: NRob Herring <robh+dt@kernel.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NSudeep Holla <sudeep.holla@arm.com>
Tested-by: NRobert Schwebel <r.schwebel@pengutronix.de>
Acked-by: NRob Herring <robh@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

194ec936

25 5月, 2015 1 次提交

net: phy: bcm7xxx: Fix 7425 PHY ID and flags · cc4a84c3

由 Florian Fainelli 提交于 5月 22, 2015

While adding support for 7425 PHY in the 7xxx PHY driver, the ID that
was used was actually coming from an external PHY: a BCM5461x. Fix this
by using the proper ID for the internal 7425 PHY and set the
PHY_IS_INTERNAL flag, otherwise consumers of this PHY driver would not
be able to properly identify it as such.

Fixes: d068b02c ("net: phy: add BCM7425 and BCM7429 PHYs")
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Reviewed-by: NPetri Gynther <pgynther@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cc4a84c3

23 5月, 2015 1 次提交

tcp: fix a potential deadlock in tcp_get_info() · d654976c

由 Eric Dumazet 提交于 5月 21, 2015

Taking socket spinlock in tcp_get_info() can deadlock, as
inet_diag_dump_icsk() holds the &hashinfo->ehash_locks[i],
while packet processing can use the reverse locking order.

We could avoid this locking for TCP_LISTEN states, but lockdep would
certainly get confused as all TCP sockets share same lockdep classes.

[  523.722504] ======================================================
[  523.728706] [ INFO: possible circular locking dependency detected ]
[  523.734990] 4.1.0-dbg-DEV #1676 Not tainted
[  523.739202] -------------------------------------------------------
[  523.745474] ss/18032 is trying to acquire lock:
[  523.750002]  (slock-AF_INET){+.-...}, at: [<ffffffff81669d44>] tcp_get_info+0x2c4/0x360
[  523.758129]
[  523.758129] but task is already holding lock:
[  523.763968]  (&(&hashinfo->ehash_locks[i])->rlock){+.-...}, at: [<ffffffff816bcb75>] inet_diag_dump_icsk+0x1d5/0x6c0
[  523.774661]
[  523.774661] which lock already depends on the new lock.
[  523.774661]
[  523.782850]
[  523.782850] the existing dependency chain (in reverse order) is:
[  523.790326]
-> #1 (&(&hashinfo->ehash_locks[i])->rlock){+.-...}:
[  523.796599]        [<ffffffff811126bb>] lock_acquire+0xbb/0x270
[  523.802565]        [<ffffffff816f5868>] _raw_spin_lock+0x38/0x50
[  523.808628]        [<ffffffff81665af8>] __inet_hash_nolisten+0x78/0x110
[  523.815273]        [<ffffffff816819db>] tcp_v4_syn_recv_sock+0x24b/0x350
[  523.822067]        [<ffffffff81684d41>] tcp_check_req+0x3c1/0x500
[  523.828199]        [<ffffffff81682d09>] tcp_v4_do_rcv+0x239/0x3d0
[  523.834331]        [<ffffffff816842fe>] tcp_v4_rcv+0xa8e/0xc10
[  523.840202]        [<ffffffff81658fa3>] ip_local_deliver_finish+0x133/0x3e0
[  523.847214]        [<ffffffff81659a9a>] ip_local_deliver+0xaa/0xc0
[  523.853440]        [<ffffffff816593b8>] ip_rcv_finish+0x168/0x5c0
[  523.859624]        [<ffffffff81659db7>] ip_rcv+0x307/0x420

Lets use u64_sync infrastructure instead. As a bonus, 64bit
arches get optimized, as these are nop for them.

Fixes: 0df48c26 ("tcp: add tcpi_bytes_acked to tcp_info")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d654976c

20 5月, 2015 2 次提交

Revert "netfilter: bridge: query conntrack about skb dnat" · faecbb45

由 Florian Westphal 提交于 5月 20, 2015

This reverts commit c055d5b0.

There are two issues:
'dnat_took_place' made me think that this is related to
-j DNAT/MASQUERADE.

But thats only one part of the story.  This is also relevant for SNAT
when we undo snat translation in reverse/reply direction.

Furthermore, I originally wanted to do this mainly to avoid
storing ipv6 addresses once we make DNAT/REDIRECT work
for ipv6 on bridges.

However, I forgot about SNPT/DNPT which is stateless.

So we can't escape storing address for ipv6 anyway. Might as
well do it for ipv4 too.
Reported-and-tested-by: NBernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

faecbb45

xen/events: don't bind non-percpu VIRQs with percpu chip · 77bb3dfd

由 David Vrabel 提交于 5月 19, 2015

A non-percpu VIRQ (e.g., VIRQ_CONSOLE) may be freed on a different
VCPU than it is bound to.  This can result in a race between
handle_percpu_irq() and removing the action in __free_irq() because
handle_percpu_irq() does not take desc->lock.  The interrupt handler
sees a NULL action and oopses.

Only use the percpu chip/handler for per-CPU VIRQs (like VIRQ_TIMER).

  # cat /proc/interrupts | grep virq
   40:      87246          0  xen-percpu-virq      timer0
   44:          0          0  xen-percpu-virq      debug0
   47:          0      20995  xen-percpu-virq      timer1
   51:          0          0  xen-percpu-virq      debug1
   69:          0          0   xen-dyn-virq      xen-pcpu
   74:          0          0   xen-dyn-virq      mce
   75:         29          0   xen-dyn-virq      hvc_console
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Cc: <stable@vger.kernel.org>

77bb3dfd

19 5月, 2015 1 次提交

inet: properly align icsk_ca_priv · b5d721d7

由 Eric Dumazet 提交于 5月 18, 2015

tcp_illinois and upcoming tcp_cdg require 64bit alignment of
icsk_ca_priv

x86 does not care, but other architectures might.

Fixes: 05cbc0db ("ipv4: Create probe timer for tcp PMTU as per RFC4821")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Fan Du <fan.du@intel.com>
Acked-by: NFan Du <fan.du@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b5d721d7

17 5月, 2015 1 次提交

rhashtable: Add cap on number of elements in hash table · 07ee0722

由 Herbert Xu 提交于 5月 15, 2015

We currently have no limit on the number of elements in a hash table.
This is a problem because some users (tipc) set a ceiling on the
maximum table size and when that is reached the hash table may
degenerate.  Others may encounter OOM when growing and if we allow
insertions when that happens the hash table perofrmance may also
suffer.

This patch adds a new paramater insecure_max_entries which becomes
the cap on the table.  If unset it defaults to max_size * 2.  If
it is also zero it means that there is no cap on the number of
elements in the table.  However, the table will grow whenever the
utilisation hits 100% and if that growth fails, you will get ENOMEM
on insertion.

As allowing oversubscription is potentially dangerous, the name
contains the word insecure.

Note that the cap is not a hard limit.  This is done for performance
reasons as enforcing a hard limit will result in use of atomic ops
that are heavier than the ones we currently use.

The reasoning is that we're only guarding against a gross over-
subscription of the table, rather than a small breach of the limit.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07ee0722

16 5月, 2015 1 次提交

conntrack: RFC5961 challenge ACK confuse conntrack LAST-ACK transition · b3cad287

由 Jesper Dangaard Brouer 提交于 5月 07, 2015

In compliance with RFC5961, the network stack send challenge ACK in
response to spurious SYN packets, since commit 0c228e83 ("tcp:
Restore RFC5961-compliant behavior for SYN packets").

This pose a problem for netfilter conntrack in state LAST_ACK, because
this challenge ACK is (falsely) seen as ACKing last FIN, causing a
false state transition (into TIME_WAIT).

The challenge ACK is hard to distinguish from real last ACK.  Thus,
solution introduce a flag that tracks the potential for seeing a
challenge ACK, in case a SYN packet is let through and current state
is LAST_ACK.

When conntrack transition LAST_ACK to TIME_WAIT happens, this flag is
used for determining if we are expecting a challenge ACK.

Scapy based reproducer script avail here:
 https://github.com/netoptimizer/network-testing/blob/master/scapy/tcp_hacks_3WHS_LAST_ACK.py

Fixes: 0c228e83 ("tcp: Restore RFC5961-compliant behavior for SYN packets")
Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
Acked-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

b3cad287

15 5月, 2015 3 次提交

rename RTNH_F_EXTERNAL to RTNH_F_OFFLOAD · eea39946

由 Roopa Prabhu 提交于 5月 13, 2015

RTNH_F_EXTERNAL today is printed as "offload" in iproute2 output.

This patch renames the flag to be consistent with what the user sees.
Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eea39946

uidgid: make uid_valid and gid_valid work with !CONFIG_MULTIUSER · 929aa5b2

由 Josh Triplett 提交于 5月 14, 2015

{u,g}id_valid call {u,g}id_eq, which calls __k{u,g}id_val on both
arguments and compares.  With !CONFIG_MULTIUSER, __k{u,g}id_val return a
constant 0, which makes {u,g}id_valid always return false.  Change
{u,g}id_valid to compare their argument against -1 instead.  That produces
identical results in the normal CONFIG_MULTIUSER=y case, but with
!CONFIG_MULTIUSER will make {u,g}id_valid constant-fold into "return
true;" rather than "return false;".

This fixes uses of devpts without CONFIG_MULTIUSER.
Signed-off-by: NJosh Triplett <josh@joshtriplett.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>,
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

929aa5b2

gfp: add __GFP_NOACCOUNT · 8f4fc071

由 Vladimir Davydov 提交于 5月 14, 2015

Not all kmem allocations should be accounted to memcg.  The following
patch gives an example when accounting of a certain type of allocations to
memcg can effectively result in a memory leak.  This patch adds the
__GFP_NOACCOUNT flag which if passed to kmalloc and friends will force the
allocation to go through the root cgroup.  It will be used by the next
patch.

Note, since in case of kmemleak enabled each kmalloc implies yet another
allocation from the kmemleak_object cache, we add __GFP_NOACCOUNT to
gfp_kmemleak_mask.

Alternatively, we could introduce a per kmem cache flag disabling
accounting for all allocations of a particular kind, but (a) we would not
be able to bypass accounting for kmalloc then and (b) a kmem cache with
this flag set could not be merged with a kmem cache without this flag,
which would increase the number of global caches and therefore
fragmentation even if the memory cgroup controller is not used.

Despite its generic name, currently __GFP_NOACCOUNT disables accounting
only for kmem allocations while user page allocations are always charged.
To catch abusing of this flag, a warning is issued on an attempt of
passing it to mem_cgroup_try_charge.
Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>	[4.0.x]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8f4fc071

13 5月, 2015 4 次提交

ktime: Fix ktime_divns to do signed division · f7bcb70e

由 John Stultz 提交于 5月 08, 2015

It was noted that the 32bit implementation of ktime_divns()
was doing unsigned division and didn't properly handle
negative values.

And when a ktime helper was changed to utilize
ktime_divns, it caused a regression on some IR blasters.
See the following bugzilla for details:
  https://bugzilla.redhat.com/show_bug.cgi?id=1200353

This patch fixes the problem in ktime_divns by checking
and preserving the sign bit, and then reapplying it if
appropriate after the division, it also changes the return
type to a s64 to make it more obvious this is expected.

Nicolas also pointed out that negative dividers would
cause infinite loops on 32bit systems, negative dividers
is unlikely for users of this function, but out of caution
this patch adds checks for negative dividers for both
32-bit (BUG_ON) and 64-bit(WARN_ON) versions to make sure
no such use cases creep in.

[ tglx: Hand an u64 to do_div() to avoid the compiler warning ]

Fixes: 166afb64 'ktime: Sanitize ktime_to_us/ms conversion'
Reported-and-tested-by: NTrevor Cordes <trevor@tecnopolis.ca>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>
Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1431118043-23452-1-git-send-email-john.stultz@linaro.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f7bcb70e

net: Remove remaining remnants of pm_qos from netdevice.h · 01d460dd

由 David Ahern 提交于 5月 12, 2015

Commit e2c65448 removed pm_qos from struct net_device but left the
comment and header file. Remove those.
Signed-off-by: NDavid Ahern <dsahern@gmail.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

01d460dd

block: remove export for blk_queue_bio · 336b7e1f

由 Mike Snitzer 提交于 5月 11, 2015

With commit ff36ab34 ("dm: remove request-based logic from
make_request_fn wrapper") DM no longer calls blk_queue_bio() directly,
so remove its export.  Doing so required a forward declaration in
blk-core.c.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

336b7e1f

A
drm/radeon: add new bonaire pci id · fcf3b542
由 Alex Deucher 提交于 5月 12, 2015
```
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
```
fcf3b542

12 5月, 2015 1 次提交

HID: hid-sensor-hub: Fix debug lock warning · 2d94e522

由 Srinivas Pandruvada 提交于 5月 07, 2015

When CONFIG_DEBUG_LOCK_ALLOC is defined, mutex magic is compared and
warned for (l->magic != l), here l is the address of mutex pointer.
In hid-sensor-hub as part of hsdev creation, a per hsdev mutex is
initialized during MFD cell creation. This hsdev, which contains, mutex
is part of platform data for the a cell. But platform_data is copied
in platform_device_add_data() in platform.c. This copy will copy the
whole hsdev structure including mutex. But once copied the magic
will no longer match. So when client driver call
sensor_hub_input_attr_get_raw_value, this will trigger mutex warning.
So to avoid this allocate mutex dynamically. This will be same even
after copy.
Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

2d94e522

11 5月, 2015 1 次提交

pty: Fix input race when closing · 1a48632f

由 Peter Hurley 提交于 4月 13, 2015

A read() from a pty master may mistakenly indicate EOF (errno == -EIO)
after the pty slave has closed, even though input data remains to be read.
For example,

       pty slave       |        input worker        |    pty master
                       |                            |
                       |                            |   n_tty_read()
pty_write()            |                            |     input avail? no
  add data             |                            |     sleep
  schedule worker  --->|                            |     .
                       |---> flush_to_ldisc()       |     .
pty_close()            |       fill read buffer     |     .
  wait for worker      |       wakeup reader    --->|     .
                       |       read buffer full?    |---> input avail ? yes
                       |<---   yes - exit worker    |     copy 4096 bytes to user
  TTY_OTHER_CLOSED <---|                            |<--- kick worker
                       |                            |

		                **** New read() before worker starts ****

                       |                            |   n_tty_read()
                       |                            |     input avail? no
                       |                            |     TTY_OTHER_CLOSED? yes
                       |                            |     return -EIO

Several conditions are required to trigger this race:
1. the ldisc read buffer must become full so the input worker exits
2. the read() count parameter must be >= 4096 so the ldisc read buffer
   is empty
3. the subsequent read() occurs before the kicked worker has processed
   more input

However, the underlying cause of the race is that data is pipelined, while
tty state is not; ie., data already written by the pty slave end is not
yet visible to the pty master end, but state changes by the pty slave end
are visible to the pty master end immediately.

Pipeline the TTY_OTHER_CLOSED state through input worker to the reader.
1. Introduce TTY_OTHER_DONE which is set by the input worker when
   TTY_OTHER_CLOSED is set and either the input buffers are flushed or
   input processing has completed. Readers/polls are woken when
   TTY_OTHER_DONE is set.
2. Reader/poll checks TTY_OTHER_DONE instead of TTY_OTHER_CLOSED.
3. A new input worker is started from pty_close() after setting
   TTY_OTHER_CLOSED, which ensures the TTY_OTHER_DONE state will be
   set if the last input worker is already finished (or just about to
   exit).

Remove tty_flush_to_ldisc(); no in-tree callers.

Fixes: 52bce7f8 ("pty, n_tty: Simplify input processing on final close")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96311
BugLink: http://bugs.launchpad.net/bugs/1429756
Cc: <stable@vger.kernel.org> # 3.19+
Reported-by: NAndy Whitcroft <apw@canonical.com>
Reported-by: NH.J. Lu <hjl.tools@gmail.com>
Signed-off-by: NPeter Hurley <peter@hurleysoftware.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

1a48632f

10 5月, 2015 1 次提交

mpls: Change reserved label names to be consistent with netbsd · 78f5b899

由 Tom Herbert 提交于 5月 07, 2015

Since these are now visible to userspace it is nice to be consistent
with BSD (sys/netmpls/mpls.h in netBSD).
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

78f5b899

09 5月, 2015 1 次提交

clk: si5351: Do not pass struct clk in platform_data · 0cd3be6e

由 Sebastian Hesselbarth 提交于 5月 04, 2015

When registering clk-si5351 by platform_data, we should not pass struct clk
for the reference clocks. Drop struct clk from platform_data and rework the
driver to use devm_clk_get of named clock references.

While at it, check for at least one valid input clock and properly prepare/
enable valid reference clocks.
Signed-off-by: NSebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Reported-by: NMichael Welling <mwelling@ieee.org>
Reported-by: NJean-Francois Moine <moinejf@free.fr>
Reported-by: NRussell King <rmk+linux@arm.linux.org.uk>
Tested-by: NMichael Welling <mwelling@ieee.org>
Tested-by: NJean-Francois Moine <moinejf@free.fr>
Signed-off-by: NMichael Turquette <mturquette@linaro.org>

0cd3be6e

08 5月, 2015 1 次提交

sched: Handle priority boosted tasks proper in setscheduler() · 0782e63b

由 Thomas Gleixner 提交于 5月 05, 2015

Ronny reported that the following scenario is not handled correctly:

	T1 (prio = 10)
	   lock(rtmutex);

	T2 (prio = 20)
	   lock(rtmutex)
	      boost T1

	T1 (prio = 20)
	   sys_set_scheduler(prio = 30)
	   T1 prio = 30
	   ....
	   sys_set_scheduler(prio = 10)
	   T1 prio = 30

The last step is wrong as T1 should now be back at prio 20.

Commit c365c292 ("sched: Consider pi boosting in setscheduler()")
only handles the case where a boosted tasks tries to lower its
priority.

Fix it by taking the new effective priority into account for the
decision whether a change of the priority is required.
Reported-by: NRonny Meeus <ronny.meeus@gmail.com>
Tested-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Fixes: c365c292 ("sched: Consider pi boosting in setscheduler()")
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505051806060.4225@nanosSigned-off-by: NIngo Molnar <mingo@kernel.org>

0782e63b

07 5月, 2015 1 次提交

tracing: Make ftrace_print_array_seq compute buf_len · ac01ce14

由 Alex Bennée 提交于 4月 29, 2015

The only caller to this function (__print_array) was getting it wrong by
passing the array length instead of buffer length. As the element size
was already being passed for other reasons it seems reasonable to push
the calculation of buffer length into the function.

Link: http://lkml.kernel.org/r/1430320727-14582-1-git-send-email-alex.bennee@linaro.orgSigned-off-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

ac01ce14

06 5月, 2015 3 次提交

nilfs2: fix sanity check of btree level in nilfs_btree_root_broken() · d8fd150f

由 Ryusuke Konishi 提交于 5月 05, 2015

The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).

Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.

This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d8fd150f

util_macros.h: have array pointer point to array of constants · 05836c37

由 Guenter Roeck 提交于 5月 05, 2015

Using the new find_closest() macro can result in the following sparse
warnings.

  drivers/hwmon/lm85.c:194:16: warning:
  		incorrect type in initializer (different modifiers)
  drivers/hwmon/lm85.c:194:16:    expected int *__fc_a
  drivers/hwmon/lm85.c:194:16:    got int static const [toplevel] *<noident>
  drivers/hwmon/lm85.c:210:16: warning:
  		incorrect type in initializer (different modifiers)
  drivers/hwmon/lm85.c:210:16:    expected int *__fc_a
  drivers/hwmon/lm85.c:210:16:    got int const *map

This is because the array passed to find_closest() will typically be
declared as array of constants, but the macro declares a non-constant
pointer to it.
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

05836c37

mpls: Move reserved label definitions · c967a087

由 Tom Herbert 提交于 5月 05, 2015

Move to include/uapi/linux/mpls.h to be externally visibile.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c967a087

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功