提交 · ffeedafbf0236f03aeb2e8db273b3e5ae5f5bc89 · openeuler / Kernel

16 6月, 2015 7 次提交

bpf: introduce current->pid, tgid, uid, gid, comm accessors · ffeedafb

由 Alexei Starovoitov 提交于 6月 12, 2015

eBPF programs attached to kprobes need to filter based on
current->pid, uid and other fields, so introduce helper functions:

u64 bpf_get_current_pid_tgid(void)
Return: current->tgid << 32 | current->pid

u64 bpf_get_current_uid_gid(void)
Return: current_gid << 32 | current_uid

bpf_get_current_comm(char *buf, int size_of_buf)
stores current->comm into buf

They can be used from the programs attached to TC as well to classify packets
based on current task fields.

Update tracex2 example to print histogram of write syscalls for each process
instead of aggregated for all.
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ffeedafb

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · ada6c1de

由 David S. Miller 提交于 6月 15, 2015

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

This a bit large (and late) patchset that contains Netfilter updates for
net-next. Most relevantly br_netfilter fixes, ipset RCU support, removal of
x_tables percpu ruleset copy and rework of the nf_tables netdev support. More
specifically, they are:

1) Warn the user when there is a better protocol conntracker available, from
   Marcelo Ricardo Leitner.

2) Fix forwarding of IPv6 fragmented traffic in br_netfilter, from Bernhard
   Thaler. This comes with several patches to prepare the change in first place.

3) Get rid of special mtu handling of PPPoE/VLAN frames for br_netfilter. This
   is not needed anymore since now we use the largest fragment size to
   refragment, from Florian Westphal.

4) Restore vlan tag when refragmenting in br_netfilter, also from Florian.

5) Get rid of the percpu ruleset copy in x_tables, from Florian. Plus another
   follow up patch to refine it from Eric Dumazet.

6) Several ipset cleanups, fixes and finally RCU support, from Jozsef Kadlecsik.

7) Get rid of parens in Netfilter Kconfig files.

8) Attach the net_device to the basechain as opposed to the initial per table
   approach in the nf_tables netdev family.

9) Subscribe to netdev events to detect the removal and registration of a
   device that is referenced by a basechain.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ada6c1de

netfilter: nf_tables_netdev: unregister hooks on net_device removal · 835b8033

由 Pablo Neira Ayuso 提交于 6月 15, 2015

In case the net_device is gone, we have to unregister the hooks and put back
the reference on the net_device object. Once it comes back, register them
again. This also covers the device rename case.

This patch also adds a new flag to indicate that the basechain is disabled, so
their hooks are not registered. This flag is used by the netdev family to
handle the case where the net_device object is gone. Currently this flag is not
exposed to userspace.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

835b8033

P
netfilter: nf_tables: add nft_register_basechain() and nft_unregister_basechain() · d8ee8f7c
由 Pablo Neira Ayuso 提交于 6月 15, 2015
```
This wrapper functions take care of hook registration for basechains.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
```
d8ee8f7c

netfilter: nf_tables: attach net_device to basechain · 2cbce139

由 Pablo Neira Ayuso 提交于 6月 12, 2015

The device is part of the hook configuration, so instead of a global
configuration per table, set it to each of the basechain that we create.

This patch reworks ebddf1a8 ("netfilter: nf_tables: allow to bind table to
net_device").

Note that this adds a dev_name field in the nft_base_chain structure which is
required the netdev notification subscription that follows up in a patch to
handle gone net_devices.
Suggested-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

2cbce139

netfilter: x_tables: remove XT_TABLE_INFO_SZ and a dereference. · 711bdde6

由 Eric Dumazet 提交于 6月 15, 2015

After Florian patches, there is no need for XT_TABLE_INFO_SZ anymore :
Only one copy of table is kept, instead of one copy per cpu.

We also can avoid a dereference if we put table data right after
xt_table_info. It reduces register pressure and helps compiler.

Then, we attempt a kmalloc() if total size is under order-3 allocation,
to reduce TLB pressure, as in many cases, rules fit in 32 KB.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

711bdde6

Merge branch 'master' of git://blackhole.kfki.hu/nf-next · 53b87627

由 Pablo Neira Ayuso 提交于 6月 15, 2015

Jozsef Kadlecsik says:

====================
ipset patches for nf-next

Please consider to apply the next bunch of patches for ipset. First
comes the small changes, then the bugfixes and at the end the RCU
related patches.

* Use MSEC_PER_SEC consistently instead of the number.
* Use SET_WITH_*() helpers to test set extensions from Sergey Popovich.
* Check extensions attributes before getting extensions from Sergey Popovich.
* Permit CIDR equal to the host address CIDR in IPv6 from Sergey Popovich.
* Make sure we always return line number on batch in the case of error
  from Sergey Popovich.
* Check CIDR value only when attribute is given from Sergey Popovich.
* Fix cidr handling for hash:*net* types, reported by Jonathan Johnson.
* Fix parallel resizing and listing of the same set so that the original
  set is kept for the whole dumping.
* Make sure listing doesn't grab a set which is just being destroyed.
* Remove rbtree from ip_set_hash_netiface.c in order to introduce RCU.
* Replace rwlock_t with spinlock_t in "struct ip_set", change the locking
  in the core and simplifications in the timeout routines.
* Introduce RCU locking in bitmap:* types with a slight modification in the
  logic on how an element is added.
* Introduce RCU locking in hash:* types. This is the most complex part of
  the changes.
* Introduce RCU locking in list type where standard rculist is used.
* Fix coding styles reported by checkpatch.pl.
====================
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

53b87627

15 6月, 2015 2 次提交

netfilter: Kconfig: get rid of parens around depends on · f09becc7

由 Pablo Neira Ayuso 提交于 6月 12, 2015

According to the reporter, they are not needed.
Reported-by: NSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

f09becc7

tcp: cdg: use div_u64() · 758f0d4b

由 Kenneth Klette Jonassen 提交于 6月 12, 2015

Fixes cross-compile to mips.
Signed-off-by: NKenneth Klette Jonassen <kennetkl@ifi.uio.no>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

758f0d4b

14 6月, 2015 16 次提交

J
netfilter: ipset: Fix coding styles reported by checkpatch.pl · ca0f6a5c
由 Jozsef Kadlecsik 提交于 6月 13, 2015
```
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
```
ca0f6a5c
J
netfilter: ipset: Introduce RCU locking in list type · 00590fdd
由 Jozsef Kadlecsik 提交于 6月 13, 2015
```
Standard rculist is used.
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
```
00590fdd

netfilter: ipset: Introduce RCU locking in hash:* types · 18f84d41

由 Jozsef Kadlecsik 提交于 6月 13, 2015

Three types of data need to be protected in the case of the hash types:

a. The hash buckets: standard rcu pointer operations are used.
b. The element blobs in the hash buckets are stored in an array and
   a bitmap is used for book-keeping to tell which elements in the array
   are used or free.
c. Networks per cidr values and the cidr values themselves are stored
   in fix sized arrays and need no protection. The values are modified
   in such an order that in the worst case an element testing is repeated
   once with the same cidr value.

The ipset hash approach uses arrays instead of lists and therefore is
incompatible with rhashtable.

Performance is tested by Jesper Dangaard Brouer:

Simple drop in FORWARD
~~~~~~~~~~~~~~~~~~~~~~

Dropping via simple iptables net-mask match::

 iptables -t raw -N simple || iptables -t raw -F simple
 iptables -t raw -I simple  -s 198.18.0.0/15 -j DROP
 iptables -t raw -D PREROUTING -j simple
 iptables -t raw -I PREROUTING -j simple

Drop performance in "raw": 11.3Mpps

Generator: sending 12.2Mpps (tx:12264083 pps)

Drop via original ipset in RAW table
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create a set with lots of elements::

 sudo ./ipset destroy test
 echo "create test hash:ip hashsize 65536" > test.set
 for x in `seq 0 255`; do
    for y in `seq 0 255`; do
        echo "add test 198.18.$x.$y" >> test.set
    done
 done
 sudo ./ipset restore < test.set

Dropping via ipset::

 iptables -t raw -F
 iptables -t raw -N net198 || iptables -t raw -F net198
 iptables -t raw -I net198 -m set --match-set test src -j DROP
 iptables -t raw -I PREROUTING -j net198

Drop performance in "raw" with ipset: 8Mpps

Perf report numbers ipset drop in "raw"::

 +   24.65%  ksoftirqd/1  [ip_set]           [k] ip_set_test
 -   21.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_lock_bh
    - _raw_read_lock_bh
       + 99.88% ip_set_test
 -   19.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_unlock_bh
    - _raw_read_unlock_bh
       + 99.72% ip_set_test
 +    4.31%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_kadt
 +    2.27%  ksoftirqd/1  [ixgbe]            [k] ixgbe_fetch_rx_buffer
 +    2.18%  ksoftirqd/1  [ip_tables]        [k] ipt_do_table
 +    1.81%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_test
 +    1.61%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
 +    1.44%  ksoftirqd/1  [kernel.kallsyms]  [k] build_skb
 +    1.42%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_rcv
 +    1.36%  ksoftirqd/1  [kernel.kallsyms]  [k] __local_bh_enable_ip
 +    1.16%  ksoftirqd/1  [kernel.kallsyms]  [k] dev_gro_receive
 +    1.09%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_unlock
 +    0.96%  ksoftirqd/1  [ixgbe]            [k] ixgbe_clean_rx_irq
 +    0.95%  ksoftirqd/1  [kernel.kallsyms]  [k] __netdev_alloc_frag
 +    0.88%  ksoftirqd/1  [kernel.kallsyms]  [k] kmem_cache_alloc
 +    0.87%  ksoftirqd/1  [xt_set]           [k] set_match_v3
 +    0.85%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_gro_receive
 +    0.83%  ksoftirqd/1  [kernel.kallsyms]  [k] nf_iterate
 +    0.76%  ksoftirqd/1  [kernel.kallsyms]  [k] put_compound_page
 +    0.75%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_lock

Drop via ipset in RAW table with RCU-locking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With RCU locking, the RW-lock is gone.

Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps
Performance-tested-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

18f84d41

netfilter: ipset: Introduce RCU locking in bitmap:* types · 96f51428

由 Jozsef Kadlecsik 提交于 6月 13, 2015

There's nothing much required because the bitmap types use atomic
bit operations. However the logic of adding elements slightly changed:
first the MAC address updated (which is not atomic), then the element
activated (added). The extensions may call kfree_rcu() therefore we
call rcu_barrier() at module removal.
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

96f51428

netfilter: ipset: Prepare the ipset core to use RCU at set level · b57b2d1f

由 Jozsef Kadlecsik 提交于 6月 13, 2015

Replace rwlock_t with spinlock_t in "struct ip_set" and change the locking
accordingly. Convert the comment extension into an rcu-avare object. Also,
simplify the timeout routines.
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

b57b2d1f

netfilter:ipset Remove rbtree from hash:net,iface · bd55389c

由 Jozsef Kadlecsik 提交于 6月 13, 2015

Remove rbtree in order to introduce RCU instead of rwlock in ipset
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

bd55389c

netfilter: ipset: Make sure listing doesn't grab a set which is just being destroyed. · 9c1ba5c8

由 Jozsef Kadlecsik 提交于 6月 13, 2015

There was a small window when all sets are destroyed and a concurrent
listing of all sets could grab a set which is just being destroyed.
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

9c1ba5c8

netfilter: ipset: Fix parallel resizing and listing of the same set · c4c99783

由 Jozsef Kadlecsik 提交于 6月 13, 2015

When elements added to a hash:* type of set and resizing triggered,
parallel listing could start to list the original set (before resizing)
and "continue" with listing the new set. Fix it by references and
using the original hash table for listing. Therefore the destroying of
the original hash table may happen from the resizing or listing functions.
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

c4c99783

netfilter: ipset: Fix cidr handling for hash:*net* types · f690cbae

由 Jozsef Kadlecsik 提交于 6月 12, 2015

Commit "Simplify cidr handling for hash:*net* types" broke the cidr
handling for the hash:*net* types when the sets were used by the SET
target: entries with invalid cidr values were added to the sets.
Reported by Jonathan Johnson.

Testsuite entry is added to verify the fix.
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

f690cbae

netfilter: ipset: Check CIDR value only when attribute is given · aff22758

由 Sergey Popovich 提交于 6月 12, 2015

There is no reason to check CIDR value regardless attribute
specifying CIDR is given.

Initialize cidr array in element structure on element structure
declaration to let more freedom to the compiler to optimize
initialization right before element structure is used.

Remove local variables cidr and cidr2 for netnet and netportnet
hashes as we do not use packed cidr value for such set types and
can store value directly in e.cidr[].
Signed-off-by: NSergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

aff22758

netfilter: ipset: Make sure we always return line number on batch · a212e08e

由 Sergey Popovich 提交于 6月 12, 2015

Even if we return with generic IPSET_ERR_PROTOCOL it is good idea
to return line number if we called in batch mode.

Moreover we are not always exiting with IPSET_ERR_PROTOCOL. For
example hash:ip,port,net may return IPSET_ERR_HASH_RANGE_UNSUPPORTED
or IPSET_ERR_INVALID_CIDR.
Signed-off-by: NSergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

a212e08e

netfilter: ipset: Permit CIDR equal to the host address CIDR in IPv6 · 2c227f27

由 Sergey Popovich 提交于 6月 12, 2015

Permit userspace to supply CIDR length equal to the host address CIDR
length in netlink message. Prohibit any other CIDR length for IPv6
variant of the set.

Also return -IPSET_ERR_HASH_RANGE_UNSUPPORTED instead of generic
-IPSET_ERR_PROTOCOL in IPv6 variant of hash:ip,port,net when
IPSET_ATTR_IP_TO attribute is given.
Signed-off-by: NSergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

2c227f27

netfilter: ipset: Check extensions attributes before getting extensions. · 7dd37bc8

由 Sergey Popovich 提交于 6月 12, 2015

Make all extensions attributes checks within ip_set_get_extensions()
and reduce number of duplicated code.
Signed-off-by: NSergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>

7dd37bc8

S
netfilter: ipset: Use SET_WITH_*() helpers to test set extensions · edda0791
由 Sergey Popovich 提交于 6月 12, 2015
```
Signed-off-by: NSergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
```
edda0791
J
netfilter: ipset: Use MSEC_PER_SEC consistently · aaeb6e24
由 Jozsef Kadlecsik 提交于 6月 12, 2015
```
Signed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
```
aaeb6e24
D

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 25c43bf1
由 David S. Miller 提交于 6月 13, 2015

25c43bf1

13 6月, 2015 13 次提交

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c8d17b45

由 Linus Torvalds 提交于 6月 12, 2015

Pull networking fixes from David Miller:

 1) Fix uninitialized struct station_info in cfg80211_wireless_stats(),
    from Johannes Berg.

 2) Revert commit attempt to fix ipv6 protocol resubmission, it adds
    regressions.

 3) Endless loops can be created in bridge port lists, fix from Nikolay
    Aleksandrov.

 4) Don't WARN_ON() if sk->sk_forward_alloc is non-zero in
    sk_clear_memalloc, it is a legal situation during swap deactivation.
    Fix from Mel Gorman.

 5) Fix order of disabling interrupts and unlocking NAPI in enic driver
    to avoid a race.  From Govindarajulu Varadarajan.

 6) High and low register writes are swapped when programming the start
    of periodic output in igb driver.  From Richard Cochran.

 7) Fix device rename handling in mpls stack, from Robert Shearman.

 8) Do not trigger compaction synchronously when optimistically trying
    to allocate an order 3 page in alloc_skb_with_frags() and
    skb_page_frag_refill().  From Shaohua Li.

 9) Authentication with COOKIE_ECHO is not handled properly in SCTP, fix
    from Marcelo Ricardo Leitner.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  Doc: networking: Fix URL for wiki.wireshark.org in udplite.txt
  sctp: allow authenticating DATA chunks that are bundled with COOKIE_ECHO
  net: don't wait for order-3 page allocation
  mpls: handle device renames for per-device sysctls
  net: igb: fix the start time for periodic output signals
  enic: fix memory leak in rq_clean
  enic: check return value for stat dump
  enic: unlock napi busy poll before unmasking intr
  net, swap: Remove a warning and clarify why sk_mem_reclaim is required when deactivating swap
  bridge: fix multicast router rlist endless loop
  tipc: disconnect socket directly after probe failure
  Revert "ipv6: Fix protocol resubmission"
  cfg80211: wext: clear sinfo struct before calling driver

c8d17b45

tcp: tcp_v6_connect() cleanup · a2f0fad3

由 Eric Dumazet 提交于 6月 12, 2015

Remove dead code from tcp_v6_connect()
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2f0fad3

flow_dissector: fix ipv6 dst, hop-by-hop and routing ext hdrs · 1e98a0f0

由 Eric Dumazet 提交于 6月 12, 2015

__skb_header_pointer() returns a pointer that must be checked.

Fixes infinite loop reported by Alexei, and add __must_check to
catch these errors earlier.

Fixes: 6a74fcf4 ("flow_dissector: add support for dst, hop-by-hop and routing ext hdrs")
Reported-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com>
Tested-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Acked-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1e98a0f0

Fix Cavium Liquidio build related errors and warnings · 5b173cf9

由 Raghu Vatsavayi 提交于 6月 12, 2015

1) Fixed following sparse warnings:
    lio_main.c:213:6: warning: symbol 'octeon_droq_bh' was not
    declared. Should it be static?
    lio_main.c:233:5: warning: symbol 'lio_wait_for_oq_pkts' was
    not declared. Should it be static?
    lio_main.c:3083:5: warning: symbol 'lio_nic_info' was not
    declared. Should it be static?
    lio_main.c:2618:16: warning: cast from restricted __be16
    octeon_device.c:466:6: warning: symbol 'oct_set_config_info'
    was not declared. Should it be static?
    octeon_device.c:573:25: warning: cast to restricted __be32
    octeon_device.c:582:29: warning: cast to restricted __be32
    octeon_device.c:584:39: warning: cast to restricted __be32
    octeon_device.c:594:13: warning: cast to restricted __be32
    octeon_device.c:596:25: warning: cast to restricted __be32
    octeon_device.c:613:25: warning: cast to restricted __be32
    octeon_device.c:614:29: warning: cast to restricted __be64
    octeon_device.c:615:29: warning: cast to restricted __be32
    octeon_device.c:619:37: warning: cast to restricted __be32
    octeon_device.c:623:33: warning: cast to restricted __be32
    cn66xx_device.c:540:6: warning: symbol
    'lio_cn6xxx_get_pcie_qlmport' was not declared. Should it be s
    octeon_mem_ops.c:181:16: warning: cast to restricted __be64
    octeon_mem_ops.c:190:16: warning: cast to restricted __be32
    octeon_mem_ops.c:196:17: warning: incorrect type in initializer
2) Fix build errors corresponding to vmalloc on linux-next 4.1.
3) Liquidio now supports 64 bit only, modified Kconfig accordingly.
4) Fix some code alignment issues based on kernel build warnings.
Signed-off-by: NDerek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: NSatanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: NFelix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: NRaghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5b173cf9

Merge branch 'flow_dissector-next' · ea704770

由 David S. Miller 提交于 6月 12, 2015

Tom Herbert says:

====================
flow_dissector: Fix MPLS parsing and add ext hdr support

Need to shift label. Added parsing of dst, hop-by-hop, and routing
extension headers.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ea704770

flow_dissector: add support for dst, hop-by-hop and routing ext hdrs · 6a74fcf4

由 Tom Herbert 提交于 6月 12, 2015

If dst, hop-by-hop or routing extension headers are present determine
length of the options and skip over them in flow dissection.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6a74fcf4

flow_dissector: Fix MPLS entropy label handling in flow dissector · 611d23c5

由 Tom Herbert 提交于 6月 12, 2015

Need to shift after masking to get label value for comparison.

Fixes: b3baa0fb ("mpls: Add MPLS entropy label in flow_keys")
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

611d23c5

Doc: networking: Fix URL for wiki.wireshark.org in udplite.txt · b07d4961

由 Masanari Iida 提交于 6月 13, 2015

This patch fix URL (http to https) for wiki.wireshark.org.
Signed-off-by: NMasanari Iida <standby24x7@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b07d4961

net: ipv4: un-inline ip_finish_output2 · b60f2f3d

由 Florian Westphal 提交于 6月 12, 2015

text    data     bss     dec     hex filename
old: 16527      44       0   16571    40bb net/ipv4/ip_output.o
new: 14935      44       0   14979    3a83 net/ipv4/ip_output.o
Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b60f2f3d

sctp: allow authenticating DATA chunks that are bundled with COOKIE_ECHO · ae36806a

由 Marcelo Ricardo Leitner 提交于 6月 11, 2015

Currently, we can ask to authenticate DATA chunks and we can send DATA
chunks on the same packet as COOKIE_ECHO, but if you try to combine
both, the DATA chunk will be sent unauthenticated and peer won't accept
it, leading to a communication failure.

This happens because even though the data was queued after it was
requested to authenticate DATA chunks, it was also queued before we
could know that remote peer can handle authenticating, so
sctp_auth_send_cid() returns false.

The fix is whenever we set up an active key, re-check send queue for
chunks that now should be authenticated. As a result, such packet will
now contain COOKIE_ECHO + AUTH + DATA chunks, in that order.
Reported-by: NLiu Wei <weliu@redhat.com>
Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NVlad Yasevich <vyasevich@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ae36806a

Merge branch 'for-linus' of git://git.kernel.dk/linux-block · b85dfd30

由 Linus Torvalds 提交于 6月 12, 2015

Pull block layer fixes from Jens Axboe:
 "Remember about a week ago when I sent the last pull request for 4.1?
  Well, I lied.  Now, I don't want to shift the blame, but Dan, Ming,
  and Richard made a liar out of me.

  Here are three small patches that should go into 4.1.  More
  specifically, this pull request contains:

   - A Kconfig dependency for the pmem block driver, so it can't be
     selected if HAS_IOMEM isn't availble.  From Richard Weinberger.

   - A fix for genhd, making the ext_devt_lock softirq safe.  This makes
     lockdep happier, since we also end up grabbing this lock on release
     off the softirq path.  From Dan Williams.

   - A blk-mq software queue release fix from Ming Lei.

  Last two are headed to stable, first fixes an issue introduced in this
  cycle"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: pmem: Add dependency on HAS_IOMEM
  block: fix ext_dev_lock lockdep report
  blk-mq: free hctx->ctxs in queue's release handler

b85dfd30

Merge tag 'md/4.1-rc7-fixes' of git://neil.brown.name/md · 7b565d9d

由 Linus Torvalds 提交于 6月 12, 2015

Pull three more md fixes from Neil Brown:
 "Hasn't been a good cycle for md has it :-(

  The main issue fixed here is a rare race which can result in two
  reshape threads running at once, which doesn't end well.

  Also a minor issue with a write to a sysfs file returning the wrong
  value.  Backports to 4.0-stable are indicated"

* tag 'md/4.1-rc7-fixes' of git://neil.brown.name/md:
  md: make sure MD_RECOVERY_DONE is clear before starting recovery/resync
  md: Close race when setting 'action' to 'idle'.
  md: don't return 0 from array_state_store

7b565d9d

Merge git://git.infradead.org/intel-iommu · c39f3bc6

由 Linus Torvalds 提交于 6月 12, 2015

Pull VT-d hardware workarounds from David Woodhouse:
 "This contains a workaround for hardware issues which I *thought* were
  never going to be seen on production hardware.  I'm glad I checked
  that before the 4.1 release...

  Firstly, PASID support is so broken on existing chips that we're just
  going to declare the old capability bit 28 as 'reserved' and change
  the VT-d spec to move PASID support to another bit.  So any existing
  hardware doesn't support SVM; it only sets that (now) meaningless bit
  28.

  That patch *wasn't* imperative for 4.1 because we don't have PASID
  support yet.  But *even* the extended context tables are broken — if
  you just enable the wider tables and use none of the new bits in them,
  which is precisely what 4.1 does, you find that translations don't
  work.  It's this problem which I thought was caught in time to be
  fixed before production, but wasn't.

  To avoid triggering this issue, we now *only* enable the extended
  context tables on hardware which also advertises "we have PASID
  support and we actually tested it this time" with the new PASID
  feature bit.

  In addition, I've added an 'intel_iommu=ecs_off' command line
  parameter to allow us to disable it manually if we need to"

* git://git.infradead.org/intel-iommu:
  iommu/vt-d: Only enable extended context tables if PASID is supported
  iommu/vt-d: Change PASID support to bit 40 of Extended Capability Register

c39f3bc6

12 6月, 2015 2 次提交

netfilter: xtables: avoid percpu ruleset duplication · 482cfc31

由 Florian Westphal 提交于 6月 11, 2015

We store the rule blob per (possible) cpu.  Unfortunately this means we can
waste lot of memory on big smp machines. ipt_entry structure ('rule head')
is 112 byte, so e.g. with maxcpu=64 one single rule eats
close to 8k RAM.

Since previous patch made counters percpu it appears there is nothing
left in the rule blob that needs to be percpu.

On my test system (144 possible cpus, 400k dummy rules) this
change saves close to 9 Gigabyte of RAM.
Reported-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

482cfc31

netfilter: xtables: use percpu rule counters · 71ae0dff

由 Florian Westphal 提交于 6月 11, 2015

The binary arp/ip/ip6tables ruleset is stored per cpu.

The only reason left as to why we need percpu duplication are the rule
counters embedded into ipt_entry et al -- since each cpu has its own copy
of the rules, all counters can be lockless.

The downside is that the more cpus are supported, the more memory is
required.  Rules are not just duplicated per online cpu but for each
possible cpu, i.e. if maxcpu is 144, then rule is duplicated 144 times,
not for the e.g. 64 cores present.

To save some memory and also improve utilization of shared caches it
would be preferable to only store the rule blob once.

So we first need to separate counters and the rule blob.

Instead of using entry->counters, allocate this percpu and store the
percpu address in entry->counters.pcnt on CONFIG_SMP.

This change makes no sense as-is; it is merely an intermediate step to
remove the percpu duplication of the rule set in a followup patch.
Suggested-by: NEric Dumazet <edumazet@google.com>
Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
Reported-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

71ae0dff

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功