提交 · a2b81b35f9e5ade210e4df2001f7a30ac390114d · openeuler / Kernel

06 8月, 2014 1 次提交

tcp: reduce spurious retransmits due to transient SACK reneging · 5ae344c9

由 Neal Cardwell 提交于 8月 04, 2014

This commit reduces spurious retransmits due to apparent SACK reneging
by only reacting to SACK reneging that persists for a short delay.

When a sequence space hole at snd_una is filled, some TCP receivers
send a series of ACKs as they apparently scan their out-of-order queue
and cumulatively ACK all the packets that have now been consecutiveyly
received. This is essentially misbehavior B in "Misbehaviors in TCP
SACK generation" ACM SIGCOMM Computer Communication Review, April
2011, so we suspect that this is from several common OSes (Windows
2000, Windows Server 2003, Windows XP). However, this issue has also
been seen in other cases, e.g. the netdev thread "TCP being hoodwinked
into spurious retransmissions by lack of timestamps?" from March 2014,
where the receiver was thought to be a BSD box.

Since snd_una would temporarily be adjacent to a previously SACKed
range in these scenarios, this receiver behavior triggered the Linux
SACK reneging code path in the sender. This led the sender to clear
the SACK scoreboard, enter CA_Loss, and spuriously retransmit
(potentially) every packet from the entire write queue at line rate
just a few milliseconds before the ACK for each packet arrives at the
sender.

To avoid such situations, now when a sender sees apparent reneging it
does not yet retransmit, but rather adjusts the RTO timer to give the
receiver a little time (max(RTT/2, 10ms)) to send us some more ACKs
that will restore sanity to the SACK scoreboard. If the reneging
persists until this RTO then, as before, we clear the SACK scoreboard
and enter CA_Loss.

A 10ms delay tolerates a receiver sending such a stream of ACKs at
56Kbit/sec. And to allow for receivers with slower or more congested
paths, we wait for at least RTT/2.

We validated the resulting max(RTT/2, 10ms) delay formula with a mix
of North American and South American Google web server traffic, and
found that for ACKs displaying transient reneging:

 (1) 90% of inter-ACK delays were less than 10ms
 (2) 99% of inter-ACK delays were less than RTT/2

In tests on Google web servers this commit reduced reneging events by
75%-90% (as measured by the TcpExtTCPSACKReneging counter), without
any measurable impact on latency for user HTTP and SPDY requests.
Signed-off-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5ae344c9

03 8月, 2014 9 次提交

lib: Resizable, Scalable, Concurrent Hash Table · 7e1e7763

由 Thomas Graf 提交于 8月 02, 2014

Generic implementation of a resizable, scalable, concurrent hash table
based on [0]. The implementation supports both, fixed size keys specified
via an offset and length, or arbitrary keys via own hash and compare
functions.

Lookups are lockless and protected as RCU read side critical sections.
Automatic growing/shrinking based on user configurable watermarks is
available while allowing concurrent lookups to take place.

Objects to be hashed must include a struct rhash_head. The reason for not
using the existing struct hlist_head is that the expansion and shrinking
will have two buckets point to a single entry which would lead in obscure
reverse chaining behaviour.

Code includes a boot selftest if CONFIG_TEST_RHASHTABLE is defined.

[0] https://www.usenix.org/legacy/event/atc11/tech/final_files/Triplett.pdfSigned-off-by: NThomas Graf <tgraf@suug.ch>
Reviewed-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e1e7763

inet: frags: use kmem_cache for inet_frag_queue · d4ad4d22

由 Nikolay Aleksandrov 提交于 8月 01, 2014

Use kmem_cache to allocate/free inet_frag_queue objects since they're
all the same size per inet_frags user and are alloced/freed in high volumes
thus making it a perfect case for kmem_cache.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Acked-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4ad4d22

inet: frags: enum the flag definitions and add descriptions · 1ab1934e

由 Nikolay Aleksandrov 提交于 8月 01, 2014

Move the flags to an enum definion, swap FIRST_IN/LAST_IN to be in increasing
order and add comments explaining each flag and the inet_frag_queue struct
members.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ab1934e

inet: frags: rename last_in to flags · 06aa8b8a

由 Nikolay Aleksandrov 提交于 8月 01, 2014

The last_in field has been used to store various flags different from
first/last frag in so give it a more descriptive name: flags.
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

06aa8b8a

net: filter: split 'struct sk_filter' into socket and bpf parts · 7ae457c1

由 Alexei Starovoitov 提交于 7月 30, 2014

clean up names related to socket filtering and bpf in the following way:
- everything that deals with sockets keeps 'sk_*' prefix
- everything that is pure BPF is changed to 'bpf_*' prefix

split 'struct sk_filter' into
struct sk_filter {
	atomic_t        refcnt;
	struct rcu_head rcu;
	struct bpf_prog *prog;
};
and
struct bpf_prog {
        u32                     jited:1,
                                len:31;
        struct sock_fprog_kern  *orig_prog;
        unsigned int            (*bpf_func)(const struct sk_buff *skb,
                                            const struct bpf_insn *filter);
        union {
                struct sock_filter      insns[0];
                struct bpf_insn         insnsi[0];
                struct work_struct      work;
        };
};
so that 'struct bpf_prog' can be used independent of sockets and cleans up
'unattached' bpf use cases

split SK_RUN_FILTER macro into:
    SK_RUN_FILTER to be used with 'struct sk_filter *' and
    BPF_PROG_RUN to be used with 'struct bpf_prog *'

__sk_filter_release(struct sk_filter *) gains
__bpf_prog_release(struct bpf_prog *) helper function

also perform related renames for the functions that work
with 'struct bpf_prog *', since they're on the same lines:

sk_filter_size -> bpf_prog_size
sk_filter_select_runtime -> bpf_prog_select_runtime
sk_filter_free -> bpf_prog_free
sk_unattached_filter_create -> bpf_prog_create
sk_unattached_filter_destroy -> bpf_prog_destroy
sk_store_orig_filter -> bpf_prog_store_orig_filter
sk_release_orig_filter -> bpf_release_orig_filter
__sk_migrate_filter -> bpf_migrate_filter
__sk_prepare_filter -> bpf_prepare_filter

API for attaching classic BPF to a socket stays the same:
sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
which is used by sockets, tun, af_packet

API for 'unattached' BPF programs becomes:
bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7ae457c1

net: filter: rename sk_convert_filter() -> bpf_convert_filter() · 8fb575ca

由 Alexei Starovoitov 提交于 7月 30, 2014

to indicate that this function is converting classic BPF into eBPF
and not related to sockets
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8fb575ca

net: filter: rename sk_chk_filter() -> bpf_check_classic() · 4df95ff4

由 Alexei Starovoitov 提交于 7月 30, 2014

trivial rename to indicate that this functions performs classic BPF checking
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4df95ff4

net: filter: rename sk_filter_proglen -> bpf_classic_proglen · 009937e7

由 Alexei Starovoitov 提交于 7月 30, 2014

trivial rename to better match semantics of macro
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

009937e7

net: filter: simplify socket charging · 278571ba

由 Alexei Starovoitov 提交于 7月 30, 2014

attaching bpf program to a socket involves multiple socket memory arithmetic,
since size of 'sk_filter' is changing when classic BPF is converted to eBPF.
Also common path of program creation has to deal with two ways of freeing
the memory.

Simplify the code by delaying socket charging until program is ready and
its size is known
Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

278571ba

01 8月, 2014 4 次提交

sctp: Fixup v4mapped behaviour to comply with Sock API · 299ee123

由 Jason Gunthorpe 提交于 7月 30, 2014

The SCTP socket extensions API document describes the v4mapping option as
follows:

8.1.15.  Set/Clear IPv4 Mapped Addresses (SCTP_I_WANT_MAPPED_V4_ADDR)

   This socket option is a Boolean flag which turns on or off the
   mapping of IPv4 addresses.  If this option is turned on, then IPv4
   addresses will be mapped to V6 representation.  If this option is
   turned off, then no mapping will be done of V4 addresses and a user
   will receive both PF_INET6 and PF_INET type addresses on the socket.
   See [RFC3542] for more details on mapped V6 addresses.

This description isn't really in line with what the code does though.

Introduce addr_to_user (renamed addr_v4map), which should be called
before any sockaddr is passed back to user space. The new function
places the sockaddr into the correct format depending on the
SCTP_I_WANT_MAPPED_V4_ADDR option.

Audit all places that touched v4mapped and either sanely construct
a v4 or v6 address then call addr_to_user, or drop the
unnecessary v4mapped check entirely.

Audit all places that call addr_to_user and verify they are on a sycall
return path.

Add a custom getname that formats the address properly.

Several bugs are addressed:
 - SCTP_I_WANT_MAPPED_V4_ADDR=0 often returned garbage for
   addresses to user space
 - The addr_len returned from recvmsg was not correct when
   returning AF_INET on a v6 socket
 - flowlabel and scope_id were not zerod when promoting
   a v4 to v6
 - Some syscalls like bind and connect behaved differently
   depending on v4mapped

Tested bind, getpeername, getsockname, connect, and recvmsg for proper
behaviour in v4mapped = 1 and 0 cases.
Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
Tested-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

299ee123

net: kernel-doc compliant documentation for net_device · 536721b1

由 Karoly Kemeny 提交于 7月 30, 2014

Net_device is a vast and important structure, but it has no kernel-doc
compliant documentation. This patch extracts the comments from the structure
to clean it up, and let the scripts extract documentation from it. I know that
the patch is big, but it's just reordering of comments into the appropriate
form, and adding a few more, for the missing members.
Signed-off-by: NKaroly Kemeny <karoly.kemeny@gmail.com>
Acked-by: NRandy Dunlap <rdunlap@infradead.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

536721b1

net: stmmac: Support devicetree configs for mcast and ucast filter entries · 3b57de95

由 Vince Bridgers 提交于 7月 31, 2014

This patch adds and modifies code to support multiple Multicast and Unicast
Synopsys MAC filter configurations. The default configuration is defined to
support legacy driver behavior, which is 64 Multicast bins. The Unicast
filter code previously assumed all controllers support 32 or 16 Unicast
addresses based on controller version number, but this has been corrected
to support a default of 1 Unicast address. The filter configuration may
be specified through the devicetree using a Synopsys specific device tree
entry. This information was verified with Synopsys through
Synopsys Support Case #8000684337 and shared with the maintainer.
Signed-off-by: NVince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b57de95

bcma: use NS prefix for names of Northstar specific cores · dc6be9f5

由 Rafał Miłecki 提交于 7月 30, 2014

It's cleaner and we don't have quite identical names like
BCMA_CORE_PCIEG2 and BCMA_CORE_PCIE2.
Signed-off-by: NRafał Miłecki <zajec5@gmail.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

dc6be9f5

31 7月, 2014 12 次提交

net: libphy: Add phy specific function to access mmd phy registers · 0c1d77df

由 Vince Bridgers 提交于 7月 29, 2014

libphy was originally written assuming all phy devices support clause 45
access extensions to the mmd registers through the indirection registers
located within the first 16 phy registers. This assumption is not true
in all cases, and one specific example is the Micrel ksz9021 10/100/1000
Mbps phy. Using the stmmac driver, accessing the mmd registers to query
and configure energy efficient Ethernet (EEE) features yielded unexpected
behavior.

This patch adds mmd access functions to the phy driver that can be
overriden by the phy specific driver if the phy does not support this
mechanism or uses it's own non-standard access mechanism. By default,
the IEEE Compatible clause 45 access mechanism described in clause 22
is used. With this patch, EEE query/configure functions as expected
using the stmmac and the Micrel ksz9021 phy.
Signed-off-by: NVince Bridgers <vbridgers2013@gmail.com>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0c1d77df

netfilter: xt_bpf: add mising opaque struct sk_filter definition · e10038a8

由 Pablo Neira 提交于 7月 29, 2014

This structure is not exposed to userspace, so fix this by defining
struct sk_filter; so we skip the casting in kernelspace. This is safe
since userspace has no way to lurk with that internal pointer.

Fixes: e6f30c73 ("netfilter: x_tables: add xt_bpf match")
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e10038a8

dcbnl : Fix misleading dcb_app->priority explanation · 16eecd9b

由 Anish Bhatt 提交于 7月 28, 2014

Current explanation of dcb_app->priority is wrong. It says priority is
expected to be a 3-bit unsigned integer which is only true when working with
DCBx-IEEE. Use of dcb_app->priority by DCBx-CEE expects it to be 802.1p user
priority bitmap. Updated accordingly

This affects the cxgb4 driver, but I will post those changes as part of a
larger changeset shortly.

Fixes: 3e29027a ("dcbnl: add support for ieee8021Qaz attributes")
Signed-off-by: NAnish Bhatt <anish@chelsio.com>
Acked-by: NJohn Fastabend <john.r.fastabend@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

16eecd9b

mlx5: Adjust events to use unsigned long param instead of void * · 4d2f9bbb

由 Jack Morgenstein 提交于 7月 28, 2014

In the event flow, we currently pass only a port number in the
void *data argument.  Rather than pass a pointer to the event handlers,
we should use an "unsigned long" parameter, and pass the port number
value directly.

In the future, if necessary for some events, we can use the unsigned long
parameter to pass a pointer.

Based on a patch by Eli Cohen <eli@mellanox.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4d2f9bbb

mlx5: minor fixes (mainly avoidance of hidden casts) · f241e749

由 Jack Morgenstein 提交于 7月 28, 2014

There were many places where parameters which should be u8/u16 were
integer type.

Additionally, in 2 places, a check for a non-null pointer was added
before dereferencing the pointer (this is actually a bug fix).
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f241e749

mlx5: Move pci device handling from mlx5_ib to mlx5_core · 9603b61d

由 Jack Morgenstein 提交于 7月 28, 2014

In preparation for a new mlx5 device which is VPI (i.e., ports can be
either IB or ETH), move the pci device functionality from mlx5_ib
to mlx5_core.

This involves the following changes:
1. Move mlx5_core_dev struct out of mlx5_ib_dev. mlx5_core_dev
   is now an independent structure maintained by mlx5_core.
   mlx5_ib_dev now has a pointer to that struct.
   This requires changing a lot of places where the core_dev
   struct was accessed via mlx5_ib_dev (now, this needs to
   be a pointer dereference).
2. All PCI initializations are now done in mlx5_core. Thus,
   it is now mlx5_core which does pci_register_device (and not
   mlx5_ib, as was previously).
3. mlx5_ib now registers itself with mlx5_core as an "interface"
   driver. This is very similar to the mechanism employed for
   the mlx4 (ConnectX) driver. Once the HCA is initialized
   (by mlx5_core), it invokes the interface drivers to do
   their initializations.
4. There is a new event handler which the core registers:
   mlx5_core_event(). This event handler invokes the
   event handlers registered by the interfaces.

Based on a patch by Eli Cohen <eli@mellanox.com>
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NEli Cohen <eli@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9603b61d

Bluetooth: Rename pairable mgmt setting to bondable · b2939475

由 Johan Hedberg 提交于 7月 30, 2014

This setting maps to the HCI_BONDABLE flag which tracks whether we're
bondable or not. Therefore, rename the mgmt setting and respective
command accordingly.
Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>

b2939475

Bluetooth: Rename HCI_PAIRABLE to HCI_BONDABLE · b6ae8457

由 Johan Hedberg 提交于 7月 30, 2014

The HCI_PAIRABLE flag isn't actually controlling whether we're pairable
but whether we're bondable. Therefore, rename it accordingly.
Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>

b6ae8457

6lowpan: remove unused function · 233351bd

由 Varka Bhadram 提交于 7月 30, 2014

This patch removes the unused function.
Signed-off-by: NVarka Bhadram <varkab@cdac.in>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>

233351bd

6lowpan: remove unused macros · 267ca9fe

由 Varka Bhadram 提交于 7月 30, 2014

This patch removes the unused macros.
Signed-off-by: NVarka Bhadram <varkab@cdac.in>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>

267ca9fe

6lowpan: remove unused LOWPAN_FRAG_SIZE define · 00494244

由 Alexander Aring 提交于 7月 29, 2014

This define is unused since commit
96cb3eb7 ("6lowpan: fix fragmentation on
sending side"). It is a worst case scenario for payload calculation.
Since commit 96cb3eb7 we calculation the
payload to use the optimal size.

This define is also necessary for ieee802154 6lowpan only and the file
include/net/6lowpan.h should contain generic 6lowpan things only.
Signed-off-by: NAlexander Aring <alex.aring@gmail.com>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>

00494244

6lowpan: iphc: use ipv6 api to check address scope · 556a5bfc

由 Alexander Aring 提交于 7月 29, 2014

This patch removes the own implementation to check of link-layer,
broadcast and any address type and use the IPv6 api for that.
Signed-off-by: NAlexander Aring <alex.aring@gmail.com>
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>

556a5bfc

30 7月, 2014 8 次提交

Revert "cdc_subset: deal with a device that needs reset for timeout" · 1d8fcba1

由 Linus Torvalds 提交于 7月 30, 2014

This reverts commit 20fbe3ae.

As reported by Stephen Rothwell, it causes compile failures in certain
configurations:

  drivers/net/usb/cdc_subset.c:360:15: error: 'dummy_prereset' undeclared here (not in a function)
    .pre_reset = dummy_prereset,
                 ^
  drivers/net/usb/cdc_subset.c:361:16: error: 'dummy_postreset' undeclared here (not in a function)
    .post_reset = dummy_postreset,
                  ^
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Acked-by: NDavid Miller <davem@davemloft.net>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1d8fcba1

x86/xen: safely map and unmap grant frames when in atomic context · b7dd0e35

由 David Vrabel 提交于 7月 11, 2014

arch_gnttab_map_frames() and arch_gnttab_unmap_frames() are called in
atomic context but were calling alloc_vm_area() which might sleep.

Also, if a driver attempts to allocate a grant ref from an interrupt
and the table needs expanding, then the CPU may already by in lazy MMU
mode and apply_to_page_range() will BUG when it tries to re-enable
lazy MMU mode.

These two functions are only used in PV guests.

Introduce arch_gnttab_init() to allocates the virtual address space in
advance.

Avoid the use of apply_to_page_range() by using saving and using the
array of PTE addresses from the alloc_vm_area() call (which ensures
that the required page tables are pre-allocated).
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

b7dd0e35

of: Add memory limiting function for flattened devicetrees · 704033ce

由 Laura Abbott 提交于 7月 15, 2014

Buggy bootloaders may pass bogus memory entries in the devicetree.
Add of_fdt_limit_memory to add an upper bound on the number of
entries that can be present in the devicetree.
Signed-off-by: NLaura Abbott <lauraa@codeaurora.org>
Tested-by: NAndreas Färber <afaerber@suse.de>
Signed-off-by: NGrant Likely <grant.likely@linaro.org>

704033ce

of: Split early_init_dt_scan into two parts · 4972a74b

由 Laura Abbott 提交于 7月 15, 2014

Currently, early_init_dt_scan validates the header, sets the
boot params, and scans for chosen/memory all in one function.
Split this up into two separate functions (validation/setting
boot params in one, scanning in another) to allow for
additional setup between boot params and scanning the memory.
Signed-off-by: NLaura Abbott <lauraa@codeaurora.org>
Tested-by: NAndreas Färber <afaerber@suse.de>
[glikely: s/early_init_dt_scan_all/early_init_dt_scan_nodes/]
Signed-off-by: NGrant Likely <grant.likely@linaro.org>

4972a74b

cdc_subset: deal with a device that needs reset for timeout · 20fbe3ae

由 Oliver Neukum 提交于 7月 28, 2014

This device needs to be reset to recover from a timeout.
Unfortunately this can be handled only at the level of
the subdrivers.
Signed-off-by: NOliver Neukum <oneukum@suse.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

20fbe3ae

ipv4: fail early when creating netdev named all or default · 20e61da7

由 WANG Cong 提交于 7月 25, 2014

We create a proc dir for each network device, this will cause
conflicts when the devices have name "all" or "default".

Rather than emitting an ugly kernel warning, we could just
fail earlier by checking the device name.
Reported-by: NStephane Chazelas <stephane.chazelas@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

20e61da7

net: remove deprecated syststamp timestamp · 4d276eb6

由 Willem de Bruijn 提交于 7月 25, 2014

The SO_TIMESTAMPING API defines three types of timestamps: software,
hardware in raw format (hwtstamp) and hardware converted to system
format (syststamp). The last has been deprecated in favor of combining
hwtstamp with a PTP clock driver. There are no active users in the
kernel.

The option was device driver dependent. If set, but without hardware
support, the correct behavior is to return zero in the relevant field
in the SCM_TIMESTAMPING ancillary message. Without device drivers
implementing the option, this field is effectively always zero.

Remove the internal plumbing to dissuage new drivers from implementing
the feature. Keep the SOF_TIMESTAMPING_SYS_HARDWARE flag, however, to
avoid breaking existing applications that request the timestamp.
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4d276eb6

packet: remove deprecated syststamp timestamp · 68a360e8

由 Willem de Bruijn 提交于 7月 25, 2014

No device driver will ever return an skb_shared_info structure with
syststamp non-zero, so remove the branch that tests for this and
optionally marks the packet timestamp as TP_STATUS_TS_SYS_HARDWARE.

Do not remove the definition TP_STATUS_TS_SYS_HARDWARE, as processes
may refer to it.
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

68a360e8

29 7月, 2014 3 次提交

b43: add support for BCM43131 chipset with N-PHY rev 17 · a67d19d4

由 Rafał Miłecki 提交于 7月 24, 2014

It contains radio 0x2057 rev 14 just like a BCM43217, so it doesn't
require any magic. The main difference is that BCM4313 is 1x1:1.
Signed-off-by: NRafał Miłecki <zajec5@gmail.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

a67d19d4

ip: make IP identifiers less predictable · 04ca6973

由 Eric Dumazet 提交于 7月 26, 2014

In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways exploiting linux IP identifier generation to
infer whether two machines are exchanging packets.

With commit 73f156a6 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.

This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.

Note that prandom_u32_max(1) returns 0, so if generator is used at most
once per jiffy, this patch inserts no hole in the ID suite and do not
increase collision probability.

This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.

We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.

For IPv6, adds saddr into the hash as well, but not nexthdr.

If I ping the patched target, we can see ID are now hard to predict.

21:57:11.008086 IP (...)
    A > target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
    target > A: ICMP echo reply, seq 1, length 64

21:57:12.013133 IP (...)
    A > target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
    target > A: ICMP echo reply, seq 2, length 64

21:57:13.016580 IP (...)
    A > target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
    target > A: ICMP echo reply, seq 3, length 64

[1] TCP sessions uses a per flow ID generator not changed by this patch.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NJeffrey Knockel <jeffk@cs.unm.edu>
Reported-by: NJedidiah R. Crandall <crandall@cs.unm.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04ca6973

netlink: Fix shadow warning on jiffies · d87de1f3

由 Mark Rustad 提交于 7月 25, 2014

Change formal parameter name to not shadow the global jiffies.
Signed-off-by: NMark Rustad <mark.d.rustad@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d87de1f3

28 7月, 2014 3 次提交

inet: frag: use seqlock for hash rebuild · ab1c724f

由 Florian Westphal 提交于 7月 24, 2014

rehash is rare operation, don't force readers to take
the read-side rwlock.

Instead, we only have to detect the (rare) case where
the secret was altered while we are trying to insert
a new inetfrag queue into the table.

If it was changed, drop the bucket lock and recompute
the hash to get the 'new' chain bucket that we have to
insert into.

Joint work with Nikolay Aleksandrov.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ab1c724f

inet: frag: remove periodic secret rebuild timer · e3a57d18

由 Florian Westphal 提交于 7月 24, 2014

merge functionality into the eviction workqueue.

Instead of rebuilding every n seconds, take advantage of the upper
hash chain length limit.

If we hit it, mark table for rebuild and schedule workqueue.
To prevent frequent rebuilds when we're completely overloaded,
don't rebuild more than once every 5 seconds.

ipfrag_secret_interval sysctl is now obsolete and has been marked as
deprecated, it still can be changed so scripts won't be broken but it
won't have any effect. A comment is left above each unused secret_timer
variable to avoid confusion.

Joint work with Nikolay Aleksandrov.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e3a57d18

inet: frag: remove lru list · 3fd588eb

由 Florian Westphal 提交于 7月 24, 2014

no longer used.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3fd588eb

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功