提交 · b5d721d761fccf491f92eefc611e318561683d0a · openeuler / Kernel

19 5月, 2015 1 次提交

inet: properly align icsk_ca_priv · b5d721d7

由 Eric Dumazet 提交于 5月 18, 2015

tcp_illinois and upcoming tcp_cdg require 64bit alignment of
icsk_ca_priv

x86 does not care, but other architectures might.

Fixes: 05cbc0db ("ipv4: Create probe timer for tcp PMTU as per RFC4821")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Fan Du <fan.du@intel.com>
Acked-by: NFan Du <fan.du@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b5d721d7

17 5月, 2015 1 次提交

rhashtable: Add cap on number of elements in hash table · 07ee0722

由 Herbert Xu 提交于 5月 15, 2015

We currently have no limit on the number of elements in a hash table.
This is a problem because some users (tipc) set a ceiling on the
maximum table size and when that is reached the hash table may
degenerate.  Others may encounter OOM when growing and if we allow
insertions when that happens the hash table perofrmance may also
suffer.

This patch adds a new paramater insecure_max_entries which becomes
the cap on the table.  If unset it defaults to max_size * 2.  If
it is also zero it means that there is no cap on the number of
elements in the table.  However, the table will grow whenever the
utilisation hits 100% and if that growth fails, you will get ENOMEM
on insertion.

As allowing oversubscription is potentially dangerous, the name
contains the word insecure.

Note that the cap is not a hard limit.  This is done for performance
reasons as enforcing a hard limit will result in use of atomic ops
that are heavier than the ones we currently use.

The reasoning is that we're only guarding against a gross over-
subscription of the table, rather than a small breach of the limit.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07ee0722

16 5月, 2015 1 次提交

conntrack: RFC5961 challenge ACK confuse conntrack LAST-ACK transition · b3cad287

由 Jesper Dangaard Brouer 提交于 5月 07, 2015

In compliance with RFC5961, the network stack send challenge ACK in
response to spurious SYN packets, since commit 0c228e83 ("tcp:
Restore RFC5961-compliant behavior for SYN packets").

This pose a problem for netfilter conntrack in state LAST_ACK, because
this challenge ACK is (falsely) seen as ACKing last FIN, causing a
false state transition (into TIME_WAIT).

The challenge ACK is hard to distinguish from real last ACK.  Thus,
solution introduce a flag that tracks the potential for seeing a
challenge ACK, in case a SYN packet is let through and current state
is LAST_ACK.

When conntrack transition LAST_ACK to TIME_WAIT happens, this flag is
used for determining if we are expecting a challenge ACK.

Scapy based reproducer script avail here:
 https://github.com/netoptimizer/network-testing/blob/master/scapy/tcp_hacks_3WHS_LAST_ACK.py

Fixes: 0c228e83 ("tcp: Restore RFC5961-compliant behavior for SYN packets")
Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
Acked-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

b3cad287

15 5月, 2015 1 次提交

rename RTNH_F_EXTERNAL to RTNH_F_OFFLOAD · eea39946

由 Roopa Prabhu 提交于 5月 13, 2015

RTNH_F_EXTERNAL today is printed as "offload" in iproute2 output.

This patch renames the flag to be consistent with what the user sees.
Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eea39946

13 5月, 2015 1 次提交

net: Remove remaining remnants of pm_qos from netdevice.h · 01d460dd

由 David Ahern 提交于 5月 12, 2015

Commit e2c65448 removed pm_qos from struct net_device but left the
comment and header file. Remove those.
Signed-off-by: NDavid Ahern <dsahern@gmail.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

01d460dd

10 5月, 2015 1 次提交

mpls: Change reserved label names to be consistent with netbsd · 78f5b899

由 Tom Herbert 提交于 5月 07, 2015

Since these are now visible to userspace it is nice to be consistent
with BSD (sys/netmpls/mpls.h in netBSD).
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

78f5b899

07 5月, 2015 1 次提交

tracing: Make ftrace_print_array_seq compute buf_len · ac01ce14

由 Alex Bennée 提交于 4月 29, 2015

The only caller to this function (__print_array) was getting it wrong by
passing the array length instead of buffer length. As the element size
was already being passed for other reasons it seems reasonable to push
the calculation of buffer length into the function.

Link: http://lkml.kernel.org/r/1430320727-14582-1-git-send-email-alex.bennee@linaro.orgSigned-off-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

ac01ce14

06 5月, 2015 5 次提交

nilfs2: fix sanity check of btree level in nilfs_btree_root_broken() · d8fd150f

由 Ryusuke Konishi 提交于 5月 05, 2015

The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).

Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.

This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.
Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d8fd150f

util_macros.h: have array pointer point to array of constants · 05836c37

由 Guenter Roeck 提交于 5月 05, 2015

Using the new find_closest() macro can result in the following sparse
warnings.

  drivers/hwmon/lm85.c:194:16: warning:
  		incorrect type in initializer (different modifiers)
  drivers/hwmon/lm85.c:194:16:    expected int *__fc_a
  drivers/hwmon/lm85.c:194:16:    got int static const [toplevel] *<noident>
  drivers/hwmon/lm85.c:210:16: warning:
  		incorrect type in initializer (different modifiers)
  drivers/hwmon/lm85.c:210:16:    expected int *__fc_a
  drivers/hwmon/lm85.c:210:16:    got int const *map

This is because the array passed to find_closest() will typically be
declared as array of constants, but the macro declares a non-constant
pointer to it.
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

05836c37

mpls: Move reserved label definitions · c967a087

由 Tom Herbert 提交于 5月 05, 2015

Move to include/uapi/linux/mpls.h to be externally visibile.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c967a087

IB/core: Fix unaligned accesses · 0d0f738f

由 David Ahern 提交于 5月 03, 2015

Addresses the following kernel logs seen during boot of sparc systems:

Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Signed-off-by: NDavid Ahern <david.ahern@oracle.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

0d0f738f

IB/core: change rdma_gid2ip into void function as it always return zero · 471e7058

由 Honggang LI 提交于 4月 29, 2015

Signed-off-by: NHonggang Li <honli@redhat.com>
Acked-by: NSean Hefty <sean.hefty@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

471e7058

05 5月, 2015 2 次提交

RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the... · 6eec1774

由 Tatyana Nikolova 提交于 4月 21, 2015

RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients

Add functionality to enable the port mapper on the passive side to provide to its
clients the actual (non-mapped) ip/tcp address information of the connecting peer

1) Adding remote_info_cb() to process the address info of the connecting peer
The address info is provided by the user space port mapper service when
the connection is initiated by the peer
2) Adding a hash list to store the remote address info
3) Adding functionality to add/remove the remote address info
After the info has been provided to the port mapper client,
it is removed from the hash list
Signed-off-by: NTatyana Nikolova <tatyana.e.nikolova@intel.com>
Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

6eec1774

blk-mq: fix FUA request hang · b2387ddc

由 Shaohua Li 提交于 5月 01, 2015

When a FUA request enters its DATA stage of flush pipeline, the
request is added to mq requeue list, the request will then be added to
ctx->rq_list. blk_mq_attempt_merge() might merge the request with a bio.
Later when the request is finished the flush pipeline, the
request->__data_len is 0. Then I only saw the bio gets endio called, the
original request never finish.

Adding REQ_FLUSH_SEQ into REQ_NOMERGE_FLAGS looks an easy fix.

stable: 3.15+
Signed-off-by: NShaohua Li <shli@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

b2387ddc

04 5月, 2015 3 次提交

mac80211: fix 90 kernel-doc warnings · ff419b3f

由 Randy Dunlap 提交于 4月 26, 2015

Eliminate 90 of these warnings:

Warning(..//include/net/mac80211.h:1682): No description found for parameter 'drv_priv[0]'
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

ff419b3f

lib: make memzero_explicit more robust against dead store elimination · 7829fb09

由 Daniel Borkmann 提交于 4月 30, 2015

In commit 0b053c95 ("lib: memzero_explicit: use barrier instead
of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
case LTO would decide to inline memzero_explicit() and eventually
find out it could be elimiated as dead store.

While using barrier() works well for the case of gcc, recent efforts
from LLVMLinux people suggest to use llvm as an alternative to gcc,
and there, Stephan found in a simple stand-alone user space example
that llvm could nevertheless optimize and thus elimitate the memset().
A similar issue has been observed in the referenced llvm bug report,
which is regarded as not-a-bug.

Based on some experiments, icc is a bit special on its own, while it
doesn't seem to eliminate the memset(), it could do so with an own
implementation, and then result in similar findings as with llvm.

The fix in this patch now works for all three compilers (also tested
with more aggressive optimization levels). Arguably, in the current
kernel tree it's more of a theoretical issue, but imho, it's better
to be pedantic about it.

It's clearly visible with gcc/llvm though, with the below code: if we
would have used barrier() only here, llvm would have omitted clearing,
not so with barrier_data() variant:

  static inline void memzero_explicit(void *s, size_t count)
  {
    memset(s, 0, count);
    barrier_data(s);
  }

  int main(void)
  {
    char buff[20];
    memzero_explicit(buff, sizeof(buff));
    return 0;
  }

  $ gcc -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x0000000000400400  <+0>: lea   -0x28(%rsp),%rax
   0x0000000000400405  <+5>: movq  $0x0,-0x28(%rsp)
   0x000000000040040e <+14>: movq  $0x0,-0x20(%rsp)
   0x0000000000400417 <+23>: movl  $0x0,-0x18(%rsp)
   0x000000000040041f <+31>: xor   %eax,%eax
   0x0000000000400421 <+33>: retq
  End of assembler dump.

  $ clang -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x00000000004004f0  <+0>: xorps  %xmm0,%xmm0
   0x00000000004004f3  <+3>: movaps %xmm0,-0x18(%rsp)
   0x00000000004004f8  <+8>: movl   $0x0,-0x8(%rsp)
   0x0000000000400500 <+16>: lea    -0x18(%rsp),%rax
   0x0000000000400505 <+21>: xor    %eax,%eax
   0x0000000000400507 <+23>: retq
  End of assembler dump.

As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
this in compiler-gcc.h only to be picked up. For a fallback or otherwise
unsupported compiler, we define it as a barrier. Similarly, for ecc which
does not support gcc inline asm.

Reference: https://llvm.org/bugs/show_bug.cgi?id=15495Reported-by: NStephan Mueller <smueller@chronox.de>
Tested-by: NStephan Mueller <smueller@chronox.de>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Stephan Mueller <smueller@chronox.de>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: mancha security <mancha1@zoho.com>
Cc: Mark Charlebois <charlebm@gmail.com>
Cc: Behan Webster <behanw@converseincode.com>
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>

7829fb09

codel: fix maxpacket/mtu confusion · a5d28090

由 Eric Dumazet 提交于 4月 30, 2015

Under presence of TSO/GSO/GRO packets, codel at low rates can be quite
useless. In following example, not a single packet was ever dropped,
while average delay in codel queue is ~100 ms !

qdisc codel 0: parent 1:12 limit 16000p target 5.0ms interval 100.0ms
 Sent 134376498 bytes 88797 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 13626b 3p requeues 0
  count 0 lastcount 0 ldelay 96.9ms drop_next 0us
  maxpacket 9084 ecn_mark 0 drop_overlimit 0

This comes from a confusion of what should be the minimal backlog. It is
pretty clear it is not 64KB or whatever max GSO packet ever reached the
qdisc.

codel intent was to use MTU of the device.

After the fix, we finally drop some packets, and rtt/cwnd of my single
TCP flow are meeting our expectations.

qdisc codel 0: parent 1:12 limit 16000p target 5.0ms interval 100.0ms
 Sent 102798497 bytes 67912 pkt (dropped 1365, overlimits 0 requeues 0)
 backlog 6056b 3p requeues 0
  count 1 lastcount 1 ldelay 36.3ms drop_next 0us
  maxpacket 10598 ecn_mark 0 drop_overlimit 0
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Dave Taht <dave.taht@gmail.com>
Cc: Van Jacobson <vanj@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a5d28090

02 5月, 2015 1 次提交

virtio: fix typo in vring_need_event() doc comment · e412d3a3

由 Stefan Hajnoczi 提交于 5月 02, 2015

Here the "other side" refers to the guest or host.
Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e412d3a3

01 5月, 2015 2 次提交

cfg802154: pass name_assign_type to rdev_add_virtual_intf() · 5b4a1039

由 Varka Bhadram 提交于 4月 30, 2015

This code is based on commit 6bab2e19
("cfg80211: pass name_assign_type to rdev_add_virtual_intf()")

This will expose in sysfs whether the ifname of a IEEE-802.15.4
device is set by userspace or generated by the kernel.
We are using two types of name_assign_types
 o NET_NAME_ENUM: Default interface name provided by kernel
 o NET_NAME_USER: Interface name provided by user.
Signed-off-by: NVarka Bhadram <varkab@cdac.in>
Signed-off-by: NAlexander Aring <alex.aring@gmail.com>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>

5b4a1039

mac802154: add description to mac802154 APIs · 42fb23e2

由 Varka Bhadram 提交于 4月 30, 2015

This patch adds the proper description to the mac802154 core APIs.
Signed-off-by: NVarka Bhadram <varkab@cdac.in>
Acked-by: NAlexander Aring <alex.aring@gmail.com>
Signed-off-by: NAlexander Aring <alex.aring@gmail.com>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>

42fb23e2

30 4月, 2015 6 次提交

tcp: add TCP_CC_INFO socket option · 6e9250f5

由 Eric Dumazet 提交于 4月 28, 2015

Some Congestion Control modules can provide per flow information,
but current way to get this information is to use netlink.

Like TCP_INFO, let's add TCP_CC_INFO so that applications can
issue a getsockopt() if they have a socket file descriptor,
instead of playing complex netlink games.

Sample usage would be :

  union tcp_cc_info info;
  socklen_t len = sizeof(info);

  if (getsockopt(fd, SOL_TCP, TCP_CC_INFO, &info, &len) == -1)
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6e9250f5

tcp: prepare CC get_info() access from getsockopt() · 64f40ff5

由 Eric Dumazet 提交于 4月 28, 2015

We would like that optional info provided by Congestion Control
modules using netlink can also be read using getsockopt()

This patch changes get_info() to put this information in a buffer,
instead of skb, like tcp_get_info(), so that following patch
can reuse this common infrastructure.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

64f40ff5

tcp: add tcpi_bytes_received to tcp_info · bdd1f9ed

由 Eric Dumazet 提交于 4月 28, 2015

This patch tracks total number of payload bytes received on a TCP socket.
This is the sum of all changes done to tp->rcv_nxt

RFC4898 named this : tcpEStatsAppHCThruOctetsReceived

This is a 64bit field, and can be fetched both from TCP_INFO
getsockopt() if one has a handle on a TCP socket, or from inet_diag
netlink facility (iproute2/ss patch will follow)

Note that tp->bytes_received was placed near tp->rcv_nxt for
best data locality and minimal performance impact.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Eric Salo <salo@google.com>
Cc: Martin Lau <kafai@fb.com>
Cc: Chris Rapier <rapier@psc.edu>
Acked-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bdd1f9ed

tcp: add tcpi_bytes_acked to tcp_info · 0df48c26

由 Eric Dumazet 提交于 4月 28, 2015

This patch tracks total number of bytes acked for a TCP socket.
This is the sum of all changes done to tp->snd_una, and allows
for precise tracking of delivered data.

RFC4898 named this : tcpEStatsAppHCThruOctetsAcked

This is a 64bit field, and can be fetched both from TCP_INFO
getsockopt() if one has a handle on a TCP socket, or from inet_diag
netlink facility (iproute2/ss patch will follow)

Note that tp->bytes_acked was placed near tp->snd_una for
best data locality and minimal performance impact.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Acked-by: NYuchung Cheng <ycheng@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Eric Salo <salo@google.com>
Cc: Martin Lau <kafai@fb.com>
Cc: Chris Rapier <rapier@psc.edu>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0df48c26

bridge/nl: remove wrong use of NLM_F_MULTI · 46c264da

由 Nicolas Dichtel 提交于 4月 28, 2015

NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
it is sent only at the end of a dump.

Libraries like libnl will wait forever for NLMSG_DONE.

Fixes: e5a55a89 ("net: create generic bridge ops")
Fixes: 815cccbf ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
CC: John Fastabend <john.r.fastabend@intel.com>
CC: Sathya Perla <sathya.perla@emulex.com>
CC: Subbu Seetharaman <subbu.seetharaman@emulex.com>
CC: Ajit Khaparde <ajit.khaparde@emulex.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: intel-wired-lan@lists.osuosl.org
CC: Jiri Pirko <jiri@resnulli.us>
CC: Scott Feldman <sfeldma@gmail.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: bridge@lists.linux-foundation.org
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

46c264da

xen: Suspend ticks on all CPUs during suspend · 2b953a5e

由 Boris Ostrovsky 提交于 4月 28, 2015

Commit 77e32c89 ("clockevents: Manage device's state separately for
the core") decouples clockevent device's modes from states. With this
change when a Xen guest tries to resume, it won't be calling its
set_mode op which needs to be done on each VCPU in order to make the
hypervisor aware that we are in oneshot mode.

This happens because clockevents_tick_resume() (which is an intermediate
step of resuming ticks on a processor) doesn't call clockevents_set_state()
anymore and because during suspend clockevent devices on all VCPUs (except
for the one doing the suspend) are left in ONESHOT state. As result, during
resume the clockevents state machine will assume that device is already
where it should be and doesn't need to be updated.

To avoid this problem we should suspend ticks on all VCPUs during
suspend.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

2b953a5e

29 4月, 2015 2 次提交

ALSA: emu10k1: Emu10k2 32 bit DMA mode · 7241ea55

由 Peter Zubaj 提交于 4月 28, 2015

Looks like audigy emu10k2 (probably emu10k1 - sb live too) support two
modes for DMA. Second mode is useful for 64 bit os with more then 2 GB
of ram (fixes problems with big soundfont loading)

1) 32MB from 2 GB address space using 8192 pages (used now as default)
2) 16MB from 4 GB address space using 4096 pages

Mode is set using HCFG_EXPANDED_MEM flag in HCFG register.
Also format of emu10k2 page table is then different.
Signed-off-by: NPeter Zubaj <pzubaj@marticonet.sk>
Tested-by: NTakashi Iwai <tiwai@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: NTakashi Iwai <tiwai@suse.de>

7241ea55

ACPICA: remove duplicate u8 typedef · 9e9d55e6

由 Olaf Hering 提交于 4月 28, 2015

During commit e252652f ("ACPICA: acpidump: Remove integer types
translation protection.") two 'unsigned char' types got converted to 'u8'.

The result does not compile with gcc-4.5, it can not cope with duplicate
typedefs.
Signed-off-by: NOlaf Hering <olaf@aepfle.de>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

9e9d55e6

28 4月, 2015 4 次提交

ASoC: Update email-id of Rajeev Kumar · 9d7dd6cd

由 Rajeev Kumar 提交于 4月 28, 2015

rajeev-dlh.kumar@st.com email-id doesn't exist anymore as I have left the
company. Replace ST's id with Rajeev Kumar <rajeevkumar.linux@gmail.com>
Signed-off-by: NRajeev Kumar <rajeevkumar.linux@gmail.com>
Signed-off-by: NMark Brown <broonie@kernel.org>

9d7dd6cd

tty: Re-add external interface for tty_set_termios() · b00f5c2d

由 Frederic Danis 提交于 4月 10, 2015

This is needed by Bluetooth hci_uart module to be able to change speed
of Bluetooth controller and local UART.
Signed-off-by: NFrederic Danis <frederic.danis@linux.intel.com>
Reviewed-by: NPeter Hurley <peter@hurleysoftware.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

b00f5c2d

uas: Add US_FL_MAX_SECTORS_240 flag · ee136af4

由 Hans de Goede 提交于 4月 21, 2015

The usb-storage driver sets max_sectors = 240 in its scsi-host template,
for uas we do not want to do that for all devices, but testing has shown
that some devices need it.

This commit adds a US_FL_MAX_SECTORS_240 flag for such devices, and
implements support for it in uas.c, while at it it also adds support
for US_FL_MAX_SECTORS_64 to uas.c.

Cc: stable@vger.kernel.org # 3.16
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Acked-by: NAlan Stern <stern@rowland.harvard.edu>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

ee136af4

SCSI: add 1024 max sectors black list flag · 35e9a9f9

由 Mike Christie 提交于 4月 20, 2015

This works around a issue with qnap iscsi targets not handling large IOs
very well.

The target returns:

VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 1 blocks
  Optimal transfer length granularity: 1 blocks
  Maximum transfer length: 4294967295 blocks
  Optimal transfer length: 4294967295 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
  Maximum unmap LBA count: 8388607
  Maximum unmap block descriptor count: 1
  Optimal unmap granularity: 16383
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0xffffffff blocks
  Maximum atomic transfer length: 0
  Atomic alignment: 0
  Atomic transfer length granularity: 0

and it is *sometimes* able to handle at least one IO of size up to 8 MB. We
have seen in traces where it will sometimes work, but other times it
looks like it fails and it looks like it returns failures if we send
multiple large IOs sometimes. Also it looks like it can return 2 different
errors. It will sometimes send iscsi reject errors indicating out of
resources or it will send invalid cdb illegal requests check conditions.
And then when it sends iscsi rejects it does not seem to handle retries
when there are command sequence holes, so I could not just add code to
try and gracefully handle that error code.

The problem is that we do not have a good contact for the company,
so we are not able to determine under what conditions it returns
which error and why it sometimes works.

So, this patch just adds a new black list flag to set targets like this to
the old max safe sectors of 1024. The max_hw_sectors changes added in 3.19
caused this regression, so I also ccing stable.
Reported-by: NChristian Hesse <list@eworm.de>
Signed-off-by: NMike Christie <michaelc@cs.wisc.edu>
Cc: stable@vger.kernel.org
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJames Bottomley <JBottomley@Odin.com>

35e9a9f9

27 4月, 2015 3 次提交

x86: pvclock: Really remove the sched notifier for cross-cpu migrations · 73459e2a

由 Paolo Bonzini 提交于 4月 23, 2015

This reverts commits 0a4e6be9
and 80f7fdb1.

The task migration notifier was originally introduced in order to support
the pvclock vsyscall with non-synchronized TSC, but KVM only supports it
with synchronized TSC.  Hence, on KVM the race condition is only needed
due to a bad implementation on the host side, and even then it's so rare
that it's mostly theoretical.

As far as KVM is concerned it's possible to fix the host, avoiding the
additional complexity in the vDSO and the (re)introduction of the task
migration notifier.

Xen, on the other hand, hasn't yet implemented vsyscall support at
all, so we do not care about its plans for non-synchronized TSC.
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Suggested-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

73459e2a

xen/grant: introduce func gnttab_unmap_refs_sync() · b44166cd

由 Bob Liu 提交于 4月 03, 2015

There are several place using gnttab async unmap and wait for
completion, so move the common code to a function
gnttab_unmap_refs_sync().
Signed-off-by: NBob Liu <bob.liu@oracle.com>
Acked-by: NRoger Pau Monné <roger.pau@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

b44166cd

net/bonding: Make DRV macros private · 73b5a6f2

由 Matan Barak 提交于 4月 26, 2015

The bonding modules currently defines four macros with
general names that pollute the global namespace:
DRV_VERSION
DRV_RELDATE
DRV_NAME
DRV_DESCRIPTION

Fixing that by defining a private bonding_priv.h
header files which includes those defines.
Signed-off-by: NMatan Barak <matanb@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73b5a6f2

26 4月, 2015 3 次提交

libata: Ignore spurious PHY event on LPM policy change · 09c5b480

由 Gabriele Mazzotta 提交于 4月 25, 2015

When the LPM policy is set to ATA_LPM_MAX_POWER, the device might
generate a spurious PHY event that cuases errors on the link.
Ignore this event if it occured within 10s after the policy change.

The timeout was chosen observing that on a Dell XPS13 9333 these
spurious events can occur up to roughly 6s after the policy change.

Link: http://lkml.kernel.org/g/3352987.ugV1Ipy7Z5@xps13Signed-off-by: NGabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org

09c5b480

libata: Add helper to determine when PHY events should be ignored · 8393b811

由 Gabriele Mazzotta 提交于 4月 25, 2015

This is a preparation commit that will allow to add other criteria
according to which PHY events should be dropped.
Signed-off-by: NGabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org

8393b811

net: fix crash in build_skb() · 2ea2f62c

由 Eric Dumazet 提交于 4月 24, 2015

When I added pfmemalloc support in build_skb(), I forgot netlink
was using build_skb() with a vmalloc() area.

In this patch I introduce __build_skb() for netlink use,
and build_skb() is a wrapper handling both skb->head_frag and
skb->pfmemalloc

This means netlink no longer has to hack skb->head_frag

[ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26!
[ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
[ 1567.700067] Dumping ftrace buffer:
[ 1567.700067]    (ftrace buffer empty)
[ 1567.700067] Modules linked in:
[ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167
[ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000
[ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3))
[ 1567.700067] RSP: 0018:ffff8802467779d8  EFLAGS: 00010202
[ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c
[ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049
[ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000
[ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000
[ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000
[ 1567.700067] FS:  00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000
[ 1567.700067] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0
[ 1567.700067] Stack:
[ 1567.700067]  ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000
[ 1567.700067]  ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08
[ 1567.700067]  ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821
[ 1567.700067] Call Trace:
[ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316)
[ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329)
[ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311)
[ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
[ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
[ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623)
[ 1567.774369] sock_write_iter (net/socket.c:823)
[ 1567.774369] ? sock_sendmsg (net/socket.c:806)
[ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491)
[ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249)
[ 1567.774369] ? default_llseek (fs/read_write.c:487)
[ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701)
[ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4))
[ 1567.774369] vfs_write (fs/read_write.c:539)
[ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577)
[ 1567.774369] ? SyS_read (fs/read_write.c:577)
[ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636)
[ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
[ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261)

Fixes: 79930f58 ("net: do not deplete pfmemalloc reserve")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NSasha Levin <sasha.levin@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2ea2f62c

25 4月, 2015 2 次提交

fix I_DIO_WAKEUP definition · ac74d8d6

由 Eric Sandeen 提交于 4月 16, 2015

I_DIO_WAKEUP is never directly used, but fix it up anyway.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

ac74d8d6

direct-io: only inc/dec inode->i_dio_count for file systems · fe0f07d0

由 Jens Axboe 提交于 4月 15, 2015

do_blockdev_direct_IO() increments and decrements the inode
->i_dio_count for each IO operation. It does this to protect against
truncate of a file. Block devices don't need this sort of protection.

For a capable multiqueue setup, this atomic int is the only shared
state between applications accessing the device for O_DIRECT, and it
presents a scaling wall for that. In my testing, as much as 30% of
system time is spent incrementing and decrementing this value. A mixed
read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with
better latencies too. Before:

clat percentiles (usec):
 |  1.00th=[   33],  5.00th=[   34], 10.00th=[   34], 20.00th=[   34],
 | 30.00th=[   34], 40.00th=[   34], 50.00th=[   35], 60.00th=[   35],
 | 70.00th=[   35], 80.00th=[   35], 90.00th=[   37], 95.00th=[   80],
 | 99.00th=[   98], 99.50th=[  151], 99.90th=[  155], 99.95th=[  155],
 | 99.99th=[  165]

After:

clat percentiles (usec):
 |  1.00th=[   95],  5.00th=[  108], 10.00th=[  129], 20.00th=[  149],
 | 30.00th=[  155], 40.00th=[  161], 50.00th=[  167], 60.00th=[  171],
 | 70.00th=[  177], 80.00th=[  185], 90.00th=[  201], 95.00th=[  270],
 | 99.00th=[  390], 99.50th=[  398], 99.90th=[  418], 99.95th=[  422],
 | 99.99th=[  438]

In other setups, Robert Elliott reported seeing good performance
improvements:

https://lkml.org/lkml/2015/4/3/557

The more applications accessing the device, the worse it gets.

Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells
do_blockdev_direct_IO() that it need not worry about incrementing
or decrementing the inode i_dio_count for this caller.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Elliott, Robert (Server Storage) <elliott@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NJens Axboe <axboe@fb.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

fe0f07d0

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功