提交 · 9cf1f6a8c4cbb7836b838b51b3b02ddf32c6c6a0 · OpenHarmony / kernel_linux

01 11月, 2016 1 次提交

net: Move functions for configuring traffic classes out of inline headers · 9cf1f6a8

由 Alexander Duyck 提交于 10月 28, 2016

The functions for configuring the traffic class to queue mappings have
other effects that need to be addressed. Instead of trying to export a
bunch of new functions just relocate the functions so that we can
instrument them directly with the functionality they will need.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9cf1f6a8

31 10月, 2016 1 次提交

net: add an ioctl to get a socket network namespace · c62cce2c

由 Andrey Vagin 提交于 10月 24, 2016

Each socket operates in a network namespace where it has been created,
so if we want to dump and restore a socket, we have to know its network
namespace.

We have a socket_diag to get information about sockets, it doesn't
report sockets which are not bound or connected.

This patch introduces a new socket ioctl, which is called SIOCGSKNS
and used to get a file descriptor for a socket network namespace.

A task must have CAP_NET_ADMIN in a target network namespace to
use this ioctl.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NAndrei Vagin <avagin@openvz.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c62cce2c

30 10月, 2016 10 次提交

net/mlx5: Add multi dest support · 74491de9

由 Mark Bloch 提交于 8月 31, 2016

Currently when calling mlx5_add_flow_rule we accept
only one flow destination, this commit allows to pass
multiple destinations.

This change forces us to change the return structure to a more
flexible one. We introduce a flow handle (struct mlx5_flow_handle),
it holds internally the number for rules created and holds an array
where each cell points the to a flow rule.

From the consumers (of mlx5_add_flow_rule) point of view this
change is only cosmetic and requires only to change the type
of the returned value they store.

From the core point of view, we now need to use a loop when
allocating and deleting rules (e.g given to us a flow handler).
Signed-off-by: NMark Bloch <markb@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NLeon Romanovsky <leon@kernel.org>

74491de9

net/mlx5: Introduce TSAR manipulation firmware commands · 813f8540

由 Mohamad Haj Yahia 提交于 8月 11, 2016

TSAR (stands for Transmit Scheduling ARbiter) is a hardware component
that is responsible for selecting the next entity to serve on the
transmit path.
The arbitration defines the QoS policy between the agents connected to
the TSAR.
The TSAR is a consist two main features:
1) BW Allocation between agents:
The TSAR implements a defecit weighted round robin between the agents.
Each agent attached to the TSAR is assigned with a weight and it is
awarded transmission tokens according to this weight.
2) Rate limer per agent:
Each agent attached to the TSAR is (optionally) assigned with a rate
limit.
TSAR will not allow scheduling for an agent exceeding its defined rate
limit.

In this patch we implement the API of manipulating the TSAR.
Signed-off-by: NMohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NLeon Romanovsky <leon@kernel.org>

813f8540

net/mlx5: Ensure SRQ physical address structure endianness · dd257efb

由 Artemy Kovalyov 提交于 8月 31, 2016

SRQ physical address structure field should be in big-endian format.
Signed-off-by: NArtemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NLeon Romanovsky <leon@kernel.org>

dd257efb

net/mlx5: Update struct mlx5_ifc_xrqc_bits · 5579e151

由 Artemy Kovalyov 提交于 8月 31, 2016

Update struct mlx5_ifc_xrqc_bits according to last specification
Signed-off-by: NArtemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: NLeon Romanovsky <leon@kernel.org>

5579e151

net/mlx4: Fix firmware command timeout during interrupt test · 6f2e0d2c

由 Eugenia Emantayev 提交于 10月 27, 2016

Currently interrupt test that is part of ethtool selftest runs the
check over all interrupt vectors of the device.
In mlx4_en package part of interrupt vectors are uninitialized since
mlx4_ib doesn't exist. This causes NOP FW command to time out.
Change logic to test current port interrupt vectors only.
Signed-off-by: NEugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: NTariq Toukan <tariqt@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6f2e0d2c

F
flow_dissector: __skb_get_hash_symmetric arg can be const · b917783c
由 Florian Westphal 提交于 10月 26, 2016
```
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
b917783c

Revert "hv_netvsc: report vmbus name in ethtool" · e934f684

由 Stephen Hemminger 提交于 10月 26, 2016

This reverts commit e3f74b84
("hv_netvsc: report vmbus name in ethtool")'
because of problem introduced by commit f9a56e5d6a0ba
("Drivers: hv: make VMBus bus ids persistent").
This changed the format of the vmbus name and this new format is too
long to fit in the bus_info field of ethtool.
Signed-off-by: NStephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e934f684

net/mlx5: PCI error recovery health care simulation · 04c0c1ab

由 Mohamad Haj Yahia 提交于 10月 25, 2016

In case that the kernel PCI error handlers are not called, we will
trigger our own recovery flow.

The health work will give priority to the kernel pci error handlers to
recover the PCI by waiting for a small period, if the pci error handlers
are not triggered the manual recovery flow will be executed.

We don't save pci state in case of manual recovery because it will ruin the
pci configuration space and we will lose dma sync.

Fixes: 89d44f0a ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: NMohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04c0c1ab

net/mlx5: Fix race between PCI error handlers and health work · 05ac2c0b

由 Mohamad Haj Yahia 提交于 10月 25, 2016

Currently there is a race between the health care work and the kernel
pci error handlers because both of them detect the error, the first one
to be called will do the error handling.
There is a chance that health care will disable the pci after resuming
pci slot.
Also create a separate WQ because now we will have two types of health
works, one for the error detection and one for the recovery.

Fixes: 89d44f0a ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: NMohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

05ac2c0b

{net, ib}/mlx5: Make cache line size determination at runtime. · b47bd6ea

由 Daniel Jurgens 提交于 10月 25, 2016

ARM 64B cache line systems have L1_CACHE_BYTES set to 128.
cache_line_size() will return the correct size.

Fixes: cf50b5efa2fe('net/mlx5_core/ib: New device capabilities
handling.')
Signed-off-by: NDaniel Jurgens <danielj@mellanox.com>
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b47bd6ea

28 10月, 2016 6 次提交

perf/powerpc: Don't call perf_event_disable() from atomic context · 5aab90ce

由 Jiri Olsa 提交于 10月 26, 2016

The trinity syscall fuzzer triggered following WARN() on powerpc:

  WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278
  ...
  NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0
  LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0
  Call Trace:
  [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable)
  [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
  [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
  [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
  [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
  [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48

Followed by a lockdep warning:

  ===============================
  [ INFO: suspicious RCU usage. ]
  4.8.0-rc5+ #7 Tainted: G        W
  -------------------------------
  ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section!

  other info that might help us debug this:

  rcu_scheduler_active = 1, debug_locks = 0
  2 locks held by ls/2998:
   #0:  (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0
   #1:  (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0

  stack backtrace:
  CPU: 9 PID: 2998 Comm: ls Tainted: G        W       4.8.0-rc5+ #7
  Call Trace:
  [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable)
  [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180
  [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0
  [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0
  [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380
  [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60
  [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0
  [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
  [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
  [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
  [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
  [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48

While it looks like the first WARN() is probably valid, the other one is
triggered by disabling event via perf_event_disable() from atomic context.

The event is disabled here in case we were not able to emulate
the instruction that hit the breakpoint. By disabling the event
we unschedule the event and make sure it's not scheduled back.

But we can't call perf_event_disable() from atomic context, instead
we need to use the event's pending_disable irq_work method to disable it.
Reported-by: NJan Stancek <jstancek@redhat.com>
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161026094824.GA21397@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org>

5aab90ce

kconfig.h: remove config_enabled() macro · c0a0aba8

由 Masahiro Yamada 提交于 10月 27, 2016

The use of config_enabled() is ambiguous.  For config options,
IS_ENABLED(), IS_REACHABLE(), etc.  will make intention clearer.
Sometimes config_enabled() has been used for non-config options because
it is useful to check whether the given symbol is defined or not.

I have been tackling on deprecating config_enabled(), and now is the
time to finish this work.

Some new users have appeared for v4.9-rc1, but it is trivial to replace
them:

 - arch/x86/mm/kaslr.c
  replace config_enabled() with IS_ENABLED() because
  CONFIG_X86_ESPFIX64 and CONFIG_EFI are boolean.

 - include/asm-generic/export.h
  replace config_enabled() with __is_defined().

Then, config_enabled() can be removed now.

Going forward, please use IS_ENABLED(), IS_REACHABLE(), etc. for config
options, and __is_defined() for non-config symbols.

Link: http://lkml.kernel.org/r/1476616078-32252-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Acked-by: NNicolas Pitre <nicolas.pitre@linaro.org>
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c0a0aba8

genetlink: mark families as __ro_after_init · 56989f6d

由 Johannes Berg 提交于 10月 24, 2016

Now genl_register_family() is the only thing (other than the
users themselves, perhaps, but I didn't find any doing that)
writing to the family struct.

In all families that I found, genl_register_family() is only
called from __init functions (some indirectly, in which case
I've add __init annotations to clarifly things), so all can
actually be marked __ro_after_init.

This protects the data structure from accidental corruption.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

56989f6d

genetlink: statically initialize families · 489111e5

由 Johannes Berg 提交于 10月 24, 2016

Instead of providing macros/inline functions to initialize
the families, make all users initialize them statically and
get rid of the macros.

This reduces the kernel code size by about 1.6k on x86-64
(with allyesconfig).
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

489111e5

genetlink: no longer support using static family IDs · a07ea4d9

由 Johannes Berg 提交于 10月 24, 2016

Static family IDs have never really been used, the only
use case was the workaround I introduced for those users
that assumed their family ID was also their multicast
group ID.

Additionally, because static family IDs would never be
reserved by the generic netlink code, using a relatively
low ID would only work for built-in families that can be
registered immediately after generic netlink is started,
which is basically only the control family (apart from
the workaround code, which I also had to add code for so
it would reserve those IDs)

Thus, anything other than GENL_ID_GENERATE is flawed and
luckily not used except in the cases I mentioned. Move
those workarounds into a few lines of code, and then get
rid of GENL_ID_GENERATE entirely, making it more robust.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a07ea4d9

mm: remove per-zone hashtable of bitlock waitqueues · 9dcb8b68

由 Linus Torvalds 提交于 10月 26, 2016

The per-zone waitqueues exist because of a scalability issue with the
page waitqueues on some NUMA machines, but it turns out that they hurt
normal loads, and now with the vmalloced stacks they also end up
breaking gfs2 that uses a bit_wait on a stack object:

     wait_on_bit(&gh->gh_iflags, HIF_WAIT, TASK_UNINTERRUPTIBLE)

where 'gh' can be a reference to the local variable 'mount_gh' on the
stack of fill_super().

The reason the per-zone hash table breaks for this case is that there is
no "zone" for virtual allocations, and trying to look up the physical
page to get at it will fail (with a BUG_ON()).

It turns out that I actually complained to the mm people about the
per-zone hash table for another reason just a month ago: the zone lookup
also hurts the regular use of "unlock_page()" a lot, because the zone
lookup ends up forcing several unnecessary cache misses and generates
horrible code.

As part of that earlier discussion, we had a much better solution for
the NUMA scalability issue - by just making the page lock have a
separate contention bit, the waitqueue doesn't even have to be looked at
for the normal case.

Peter Zijlstra already has a patch for that, but let's see if anybody
even notices.  In the meantime, let's fix the actual gfs2 breakage by
simplifying the bitlock waitqueues and removing the per-zone issue.
Reported-by: NAndreas Gruenbacher <agruenba@redhat.com>
Tested-by: NBob Peterson <rpeterso@redhat.com>
Acked-by: NMel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9dcb8b68

27 10月, 2016 6 次提交

cfg80211: Add KEK/nonces for FILS association frames · 348bd456

由 Jouni Malinen 提交于 10月 27, 2016

The new nl80211 attributes can be used to provide KEK and nonces to
allow the driver to encrypt and decrypt FILS (Re)Association
Request/Response frames in station mode.
Signed-off-by: NJouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

348bd456

cfg80211: Add Fast Initial Link Setup (FILS) auth algs · 63181060

由 Jouni Malinen 提交于 10月 27, 2016

This defines authentication algorithms for FILS (IEEE 802.11ai).
Signed-off-by: NJouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

63181060

cfg80211: Define IEEE P802.11ai (FILS) information elements · 3f817fe7

由 Jouni Malinen 提交于 10月 27, 2016

Define the Element IDs and Element ID Extensions from IEEE
P802.11ai/D11.0. In addition, add a new cfg80211_find_ext_ie() wrapper
to make it easier to find information elements that used the Element ID
Extension field.
Signed-off-by: NJouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

3f817fe7

doc: update docbook annotations for socket and skb · 293de7de

由 Stephen Hemminger 提交于 10月 23, 2016

The skbuff and sock structure both had missing parameter annotation
values.
Signed-off-by: NStephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

293de7de

net: phy: broadcom: Add support for BCM54612E · d92ead16

由 Xo Wang 提交于 10月 21, 2016

This PHY has internal delays enabled after reset. This clears the
internal delay enables unless the interface specifically requests them.
Signed-off-by: NXo Wang <xow@google.com>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Reviewed-by: NJoel Stanley <joel@jms.id.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d92ead16

net: phy: broadcom: Update Auxiliary Control Register macros · 3cf25904

由 Xo Wang 提交于 10月 21, 2016

Add the RXD-to-RXC skew (delay) time bit in the Miscellaneous Control
shadow register and a mask for the shadow selector field.

Remove a re-definition of MII_BCM54XX_AUXCTL_SHDWSEL_AUXCTL.
Signed-off-by: NXo Wang <xow@google.com>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Reviewed-by: NJoel Stanley <joel@jms.id.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3cf25904

26 10月, 2016 1 次提交

x86/io: add interface to reserve io memtype for a resource range. (v1.1) · 8ef42276

由 Dave Airlie 提交于 10月 24, 2016

A recent change to the mm code in:
87744ab3 mm: fix cache mode tracking in vm_insert_mixed()

started enforcing checking the memory type against the registered list for
amixed pfn insertion mappings. It happens that the drm drivers for a number
of gpus relied on this being broken. Currently the driver only inserted
VRAM mappings into the tracking table when they came from the kernel,
and userspace mappings never landed in the table. This led to a regression
where all the mapping end up as UC instead of WC now.

I've considered a number of solutions but since this needs to be fixed
in fixes and not next, and some of the solutions were going to introduce
overhead that hadn't been there before I didn't consider them viable at
this stage. These mainly concerned hooking into the TTM io reserve APIs,
but these API have a bunch of fast paths I didn't want to unwind to add
this to.

The solution I've decided on is to add a new API like the arch_phys_wc
APIs (these would have worked but wc_del didn't take a range), and
use them from the drivers to add a WC compatible mapping to the table
for all VRAM on those GPUs. This means we can then create userspace
mapping that won't get degraded to UC.

v1.1: use CONFIG_X86_PAT + add some comments in io.h

Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: x86@kernel.org
Cc: mcgrof@suse.com
Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NDave Airlie <airlied@redhat.com>

8ef42276

25 10月, 2016 1 次提交

mm: unexport __get_user_pages() · 0d731759

由 Lorenzo Stoakes 提交于 10月 24, 2016

This patch unexports the low-level __get_user_pages() function.

Recent refactoring of the get_user_pages* functions allow flags to be
passed through get_user_pages() which eliminates the need for access to
this function from its one user, kvm.

We can see that the two calls to get_user_pages() which replace
__get_user_pages() in kvm_main.c are equivalent by examining their call
stacks:

  get_user_page_nowait():
    get_user_pages(start, 1, flags, page, NULL)
    __get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL,
			    false, flags | FOLL_TOUCH)
    __get_user_pages(current, current->mm, start, 1,
		     flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL)

  check_user_page_hwpoison():
    get_user_pages(addr, 1, flags, NULL, NULL)
    __get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL,
			    false, flags | FOLL_TOUCH)
    __get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL,
		     NULL, NULL)
Signed-off-by: NLorenzo Stoakes <lstoakes@gmail.com>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0d731759

24 10月, 2016 1 次提交

ACPI/PCI: pci_link: penalize SCI correctly · f1caa61d

由 Sinan Kaya 提交于 10月 24, 2016

Ondrej reported that IRQs stopped working in v4.7 on several
platforms.  A typical scenario, from Ondrej's VT82C694X/694X, is:

ACPI: Using PIC for interrupt routing
ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 10 *11 12 14 15)
ACPI: No IRQ available for PCI Interrupt Link [LNKA]
8139too 0000:00:0f.0: PCI INT A: no GSI

We're using PIC routing, so acpi_irq_balance == 0, and LNKA is already
active at IRQ 11. In that case, acpi_pci_link_allocate() only tries
to use the active IRQ (IRQ 11) which also happens to be the SCI.

We should penalize the SCI by PIRQ_PENALTY_PCI_USING, but
irq_get_trigger_type(11) returns something other than
IRQ_TYPE_LEVEL_LOW, so we penalize it by PIRQ_PENALTY_ISA_ALWAYS
instead, which makes acpi_pci_link_allocate() assume the IRQ isn't
available and give up.

Add acpi_penalize_sci_irq() so platforms can tell us the SCI IRQ,
trigger, and polarity directly and we don't have to depend on
irq_get_trigger_type().

Fixes: 103544d8 (ACPI,PCI,IRQ: reduce resource requirements)
Link: http://lkml.kernel.org/r/201609251512.05657.linux@rainbow-software.orgReported-by: NOndrej Zary <linux@rainbow-software.org>
Acked-by: NBjorn Helgaas <bhelgaas@google.com>
Signed-off-by: NSinan Kaya <okaya@codeaurora.org>
Tested-by: NJonathan Liu <net147@gmail.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

f1caa61d

23 10月, 2016 2 次提交

qed*: Reduce the memory footprint for Rx path · 0e191827

由 Sudarsana Reddy Kalluru 提交于 10月 21, 2016

With the current default values for Rx path i.e., 8 queues of 8Kb entries
each with 4Kb size, interface will consume 256Mb for Rx. The default values
causing the driver probe to fail when the system memory is low. Based on
the perforamnce results, rx-ring count value of 1Kb gives the comparable
performance with Rx coalesce timeout of 12 seconds. Updating the default
values.
Signed-off-by: NSudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: NYuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0e191827

bpf: add helper for retrieving current numa node id · 2d0e30c3

由 Daniel Borkmann 提交于 10月 21, 2016

Use case is mainly for soreuseport to select sockets for the local
numa node, but since generic, lets also add this for other networking
and tracing program types.
Suggested-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d0e30c3

21 10月, 2016 4 次提交

net: use core MTU range checking in misc drivers · b3e3893e

由 Jarod Wilson 提交于 10月 20, 2016

firewire-net:
- set min/max_mtu
- remove fwnet_change_mtu

nes:
- set max_mtu
- clean up nes_netdev_change_mtu

xpnet:
- set min/max_mtu
- remove xpnet_dev_change_mtu

hippi:
- set min/max_mtu
- remove hippi_change_mtu

batman-adv:
- set max_mtu
- remove batadv_interface_change_mtu
- initialization is a little async, not 100% certain that max_mtu is set
  in the optimal place, don't have hardware to test with

rionet:
- set min/max_mtu
- remove rionet_change_mtu

slip:
- set min/max_mtu
- streamline sl_change_mtu

um/net_kern:
- remove pointless ndo_change_mtu

hsi/clients/ssi_protocol:
- use core MTU range checking
- remove now redundant ssip_pn_set_mtu

ipoib:
- set a default max MTU value
- Note: ipoib's actual max MTU can vary, depending on if the device is in
  connected mode or not, so we'll just set the max_mtu value to the max
  possible, and let the ndo_change_mtu function continue to validate any new
  MTU change requests with checks for CM or not. Note that ipoib has no
  min_mtu set, and thus, the network core's mtu > 0 check is the only lower
  bounds here.

mptlan:
- use net core MTU range checking
- remove now redundant mpt_lan_change_mtu

fddi:
- min_mtu = 21, max_mtu = 4470
- remove now redundant fddi_change_mtu (including export)

fjes:
- min_mtu = 8192, max_mtu = 65536
- The max_mtu value is actually one over IP_MAX_MTU here, but the idea is to
  get past the core net MTU range checks so fjes_change_mtu can validate a
  new MTU against what it supports (see fjes_support_mtu in fjes_hw.c)

hsr:
- min_mtu = 0 (calls ether_setup, max_mtu is 1500)

f_phonet:
- min_mtu = 6, max_mtu = 65541

u_ether:
- min_mtu = 14, max_mtu = 15412

phonet/pep-gprs:
- min_mtu = 576, max_mtu = 65530
- remove redundant gprs_set_mtu

CC: netdev@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: Stefan Richter <stefanr@s5r6.in-berlin.de>
CC: Faisal Latif <faisal.latif@intel.com>
CC: linux-rdma@vger.kernel.org
CC: Cliff Whickman <cpw@sgi.com>
CC: Robin Holt <robinmholt@gmail.com>
CC: Jes Sorensen <jes@trained-monkey.org>
CC: Marek Lindner <mareklindner@neomailbox.ch>
CC: Simon Wunderlich <sw@simonwunderlich.de>
CC: Antonio Quartulli <a@unstable.cc>
CC: Sathya Prakash <sathya.prakash@broadcom.com>
CC: Chaitra P B <chaitra.basappa@broadcom.com>
CC: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
CC: MPT-FusionLinux.pdl@broadcom.com
CC: Sebastian Reichel <sre@kernel.org>
CC: Felipe Balbi <balbi@kernel.org>
CC: Arvid Brodin <arvid.brodin@alten.se>
CC: Remi Denis-Courmont <courmisch@gmail.com>
Signed-off-by: NJarod Wilson <jarod@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b3e3893e

net: use core MTU range checking in WAN drivers · 8b6b4135

由 Jarod Wilson 提交于 10月 20, 2016

- set min/max_mtu in all hdlc drivers, remove hdlc_change_mtu
- sent max_mtu in lec driver, remove lec_change_mtu
- set min/max_mtu in x25_asy driver

CC: netdev@vger.kernel.org
CC: Krzysztof Halasa <khc@pm.waw.pl>
CC: Krzysztof Halasa <khalasa@piap.pl>
CC: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
CC: Francois Romieu <romieu@fr.zoreil.com>
CC: Kevin Curtis <kevin.curtis@farsite.co.uk>
CC: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: NJarod Wilson <jarod@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8b6b4135

net: add recursion limit to GRO · fcd91dd4

由 Sabrina Dubroca 提交于 10月 20, 2016

Currently, GRO can do unlimited recursion through the gro_receive
handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
to one level with encap_mark, but both VLAN and TEB still have this
problem.  Thus, the kernel is vulnerable to a stack overflow, if we
receive a packet composed entirely of VLAN headers.

This patch adds a recursion counter to the GRO layer to prevent stack
overflow.  When a gro_receive function hits the recursion limit, GRO is
aborted for this skb and it is processed normally.  This recursion
counter is put in the GRO CB, but could be turned into a percpu counter
if we run out of space in the CB.

Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.

Fixes: CVE-2016-7039
Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.")
Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan")
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Reviewed-by: NJiri Benc <jbenc@redhat.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fcd91dd4

clocksource: Add J-Core timer/clocksource driver · 9995f4f1

由 Rich Felker 提交于 10月 13, 2016

At the hardware level, the J-Core PIT is integrated with the interrupt
controller, but it is represented as its own device and has an
independent programming interface. It provides a 12-bit countdown
timer, which is not presently used, and a periodic timer. The interval
length for the latter is programmable via a 32-bit throttle register
whose units are determined by a bus-period register. The periodic
timer is used to implement both periodic and oneshot clock event
modes; in oneshot mode the interrupt handler simply disables the timer
as soon as it fires.

Despite its device tree node representing an interrupt for the PIT,
the actual irq generated is programmable, not hard-wired. The driver
is responsible for programming the PIT to generate the hardware irq
number that the DT assigns to it.

On SMP configurations, J-Core provides cpu-local instances of the PIT;
no broadcast timer is needed. This driver supports the creation of the
necessary per-cpu clock_event_device instances.

A nanosecond-resolution clocksource is provided using the J-Core "RTC"
registers, which give a 64-bit seconds count and 32-bit nanoseconds
that wrap every second. The driver converts these to a full-range
32-bit nanoseconds count.
Signed-off-by: NRich Felker <dalias@libc.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/b591ff12cc5ebf63d1edc98da26046f95a233814.1476393790.git.dalias@libc.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

9995f4f1

20 10月, 2016 7 次提交

cpufreq: fix overflow in cpufreq_table_find_index_dl() · c6fe46a7

由 Sergey Senozhatsky 提交于 10月 18, 2016

'best' is always less or equals to 'pos', so `best - pos' returns
a negative value which is then getting casted to `unsigned int'
and passed to __cpufreq_driver_target()->acpi_cpufreq_target()
for policy->freq_table selection. This results in

 BUG: unable to handle kernel paging request at ffff881019b469f8
 IP: [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
 PGD 267f067
 PUD 0

 Oops: 0000 [#1] PREEMPT SMP
 CPU: 6 PID: 70 Comm: kworker/6:1 Not tainted 4.9.0-rc1-next-20161017-dbg-dirty
 Workqueue: events dbs_work_handler
 task: ffff88041b808000 task.stack: ffff88041b810000
 RIP: 0010:[<ffffffffa00356c1>]  [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
 RSP: 0018:ffff88041b813c60  EFLAGS: 00010282
 RAX: ffff880419b46a00 RBX: ffff88041b848400 RCX: ffff880419b20f80
 RDX: 00000000001dff38 RSI: 00000000ffffffff RDI: ffff88041b848400
 RBP: ffff88041b813cb0 R08: 0000000000000006 R09: 0000000000000040
 R10: ffffffff8207f9e0 R11: ffffffff8173595b R12: 0000000000000000
 R13: ffff88041f1dff38 R14: 0000000000262900 R15: 0000000bfffffff4
 FS:  0000000000000000(0000) GS:ffff88041f000000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff881019b469f8 CR3: 000000041a2d3000 CR4: 00000000001406e0
 Stack:
  ffff88041b813cb0 ffffffff813347f9 ffff88041b813ca0 ffffffff81334663
  ffff88041f1d4bc0 ffff88041b848400 0000000000000000 0000000000000000
  0000000000262900 0000000000000000 ffff88041b813d00 ffffffff813355dc
 Call Trace:
  [<ffffffff813347f9>] ? cpufreq_freq_transition_begin+0xf1/0xfc
  [<ffffffff81334663>] ? get_cpu_idle_time+0x97/0xa6
  [<ffffffff813355dc>] __cpufreq_driver_target+0x3b6/0x44e
  [<ffffffff81336ca3>] cs_dbs_timer+0x11a/0x135
  [<ffffffff81336fda>] dbs_work_handler+0x39/0x62
  [<ffffffff81057823>] process_one_work+0x280/0x4a5
  [<ffffffff81058719>] worker_thread+0x24f/0x397
  [<ffffffff810584ca>] ? rescuer_thread+0x30b/0x30b
  [<ffffffff81418380>] ? nl80211_get_key+0x29/0x36a
  [<ffffffff8105d2b7>] kthread+0xfc/0x104
  [<ffffffff8107ceea>] ? put_lock_stats.isra.9+0xe/0x20
  [<ffffffff8105d1bb>] ? kthread_create_on_node+0x3f/0x3f
  [<ffffffff814b2092>] ret_from_fork+0x22/0x30
 Code: 56 4d 6b ff 0c 41 55 41 54 53 48 83 ec 28 48 8b 15 ad 1e 00 00 44 8b 41
 08 48 8b 87 c8 00 00 00 49 89 d5 4e 03 2c c5 80 b2 78 81 <46> 8b 74 38 04 45
 3b 75 00 75 11 31 c0 83 39 00 0f 84 1c 01 00
 RIP  [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
  RSP <ffff88041b813c60>
 CR2: ffff881019b469f8
 ---[ end trace 16d9fc7a17897d37 ]---

[ rjw: In some cases this bug may also cause incorrect frequencies to
  be selected by cpufreq governors. ]

Fixes: 899bb664 (cpufreq: skip invalid entries when searching the frequency)
Link: http://marc.info/?l=linux-kernel&m=147672030714331&w=2Reported-and-tested-by: NSedat Dilek <sedat.dilek@gmail.com>
Reported-and-tested-by: NJörg Otte <jrg.otte@gmail.com>
Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c6fe46a7

sched/core, x86: Make struct thread_info arch specific again · c8061485

由 Heiko Carstens 提交于 10月 19, 2016

The following commit:

  c65eacbe ("sched/core: Allow putting thread_info into task_struct")

... made 'struct thread_info' a generic struct with only a
single ::flags member, if CONFIG_THREAD_INFO_IN_TASK_STRUCT=y is
selected.

This change however seems to be quite x86 centric, since at least the
generic preemption code (asm-generic/preempt.h) assumes that struct
thread_info also has a preempt_count member, which apparently was not
true for x86.

We could add a bit more #ifdefs to solve this problem too, but it seems
to be much simpler to make struct thread_info arch specific
again. This also makes the conversion to THREAD_INFO_IN_TASK_STRUCT a
bit easier for architectures that have a couple of arch specific stuff
in their thread_info definition.

The arch specific stuff _could_ be moved to thread_struct. However
keeping them in thread_info makes it easier: accessing thread_info
members is simple, since it is at the beginning of the task_struct,
while the thread_struct is at the end. At least on s390 the offsets
needed to access members of the thread_struct (with task_struct as
base) are too large for various asm instructions.  This is not a
problem when keeping these members within thread_info.
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: keescook@chromium.org
Cc: linux-arch@vger.kernel.org
Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

c8061485

mm: Change vm_is_stack_for_task() to vm_is_stack_for_current() · d17af505

由 Andy Lutomirski 提交于 9月 30, 2016

Asking for a non-current task's stack can't be done without races
unless the task is frozen in kernel mode.  As far as I know,
vm_is_stack_for_task() never had a safe non-current use case.

The __unused annotation is because some KSTK_ESP implementations
ignore their parameter, which IMO is further justification for this
patch.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux API <linux-api@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

d17af505

iomap: add IOMAP_REPORT · d33fd776

由 Christoph Hellwig 提交于 10月 20, 2016

This allows the file system to tell a FIEMAP from a read operation, and thus
avoids the need to report flags that aren't actually used in the read path.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

d33fd776

nvme.h: add an enum for cns values · 329dd768

由 Christoph Hellwig 提交于 9月 30, 2016

Ported over from nvme-cli.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

329dd768

nvme.h: don't use uuid_be · 8d63687a

由 Christoph Hellwig 提交于 9月 30, 2016

This makes life easier for nvme-cli and we don't really need the uuid
type anyway to start with.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: NJay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

8d63687a

nvme.h: resync with nvme-cli · a446c084

由 Christoph Hellwig 提交于 9月 30, 2016

Import a few updates to nvme.h from nvme-cli.  This mostly includes a few
new fields and error codes, but also a few renames that so far are only
used in user space.  Also one field is moved from an array of two le64
values to one of 16 u8 values so that we can more easily access it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NKeith Busch <keith.busch@intel.com>
Reviewed-by: NGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

a446c084

OpenHarmony / kernel_linux 上一次同步 4 年多

OpenHarmony / kernel_linux
上一次同步 4 年多