- 22 Mar 2016, 3 commits
-
-
Submitted by Leon Romanovsky
The old bitwise device_cap_flags variable was limited to u32, which already has all of its bits defined. To overcome this, we converted the device_cap_flags variable to be of type u64. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
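A minimal sketch of the shape of this change, with a hypothetical flag bit past bit 31 to show what the wider field buys (names other than device_cap_flags are illustrative):

    #include <stdint.h>

    /* sketch: device_cap_flags widens from u32 to u64, freeing bits 32..63 */
    struct ib_device_attr_sketch {
            uint64_t device_cap_flags;      /* was uint32_t, fully occupied */
    };

    #define IB_DEVICE_HYPOTHETICAL_CAP (1ULL << 32) /* hypothetical new bit */

    static int has_hypothetical_cap(const struct ib_device_attr_sketch *attr)
    {
            /* capability tests look exactly as before, just on a wider type */
            return !!(attr->device_cap_flags & IB_DEVICE_HYPOTHETICAL_CAP);
    }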
-
Submitted by Sagi Grimberg
Define the necessary hardware structures for the offload arithmetic capabilities and read/cache them on driver load. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Leon Romanovsky
The device capability function was called the same way everywhere: twice for every queried parameter, with the two calls differing only in the HCA capability mode. This change unifies those calls into one function. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
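A sketch of the unified helper, assuming the mlx5 opmod naming of this period (the enum values shown are illustrative):

    /* sketch: one entry point queries both capability modes */
    enum mlx5_cap_mode {
            HCA_CAP_OPMOD_GET_MAX = 0,      /* values illustrative */
            HCA_CAP_OPMOD_GET_CUR = 1,
    };

    static int mlx5_core_get_caps_mode(struct mlx5_core_dev *dev,
                                       enum mlx5_cap_type cap_type,
                                       enum mlx5_cap_mode cap_mode);

    int mlx5_core_get_caps(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type)
    {
            int err;

            /* previously every call site issued these two queries itself */
            err = mlx5_core_get_caps_mode(dev, cap_type, HCA_CAP_OPMOD_GET_CUR);
            if (err)
                    return err;
            return mlx5_core_get_caps_mode(dev, cap_type, HCA_CAP_OPMOD_GET_MAX);
    }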
-
- 17 Mar 2016, 1 commit
-
-
Submitted by Faisal Latif
Moved port mapper related code from the drivers into common code. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Tatyana E. Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 10 Mar 2016, 3 commits
-
-
Submitted by Maor Gottlieb
Each bypass flow steering priority will be split into two priorities: 1. A priority for don't-trap rules. 2. A priority for normal rules. When a user creates a flow using the IB_FLOW_ATTR_FLAGS_DONT_TRAP flag, the driver creates two flow rules: one used for receiving the traffic, and another that forwards the packet onward so it continues matching in lower or equal priorities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
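A sketch of the user-visible side, assuming the verbs-level flag and call from this series (QP creation and flow specs omitted):

    /* sketch: a NORMAL rule at priority 0 that receives but doesn't steal */
    struct ib_flow_attr attr = {
            .type         = IB_FLOW_ATTR_NORMAL,
            .size         = sizeof(attr),
            .priority     = 0,
            .num_of_specs = 0,              /* specs would follow the attr */
            .port         = 1,
            .flags        = IB_FLOW_ATTR_FLAGS_DONT_TRAP,
    };
    struct ib_flow *flow = ib_create_flow(qp, &attr, IB_FLOW_DOMAIN_USER);

    if (IS_ERR(flow))
            return PTR_ERR(flow);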
-
Submitted by Maor Gottlieb
Add support for creating a flow rule that forwards packets to the first flow table in the next priority (the next priority could be the first priority in the next namespace, or the next priority in the same namespace). This feature can be used for DONT_TRAP rules, or for rules that only want to mark the packet with a flow tag. To do this optimally, each flow table keeps a list of all rules that point to it; when a flow table is created or destroyed, we update the list head correspondingly. This kind of rule is created when the destination is NULL and the action is MLX5_FLOW_CONTEXT_ACTION_FWD_NEXT_PRIO. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
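A sketch of how such a rule would be requested through the mlx5 flow-steering API of this era; the call signature is assumed from contemporaneous mlx5 code, and the match buffers are left empty for brevity:

    struct mlx5_flow_rule *rule;
    u32 match_criteria[MLX5_ST_SZ_DW(fte_match_param)] = {};
    u32 match_value[MLX5_ST_SZ_DW(fte_match_param)] = {};

    /* NULL destination + FWD_NEXT_PRIO = "keep matching at the next priority" */
    rule = mlx5_add_flow_rule(ft, 0 /* match_criteria_enable */,
                              match_criteria, match_value,
                              MLX5_FLOW_CONTEXT_ACTION_FWD_NEXT_PRIO,
                              MLX5_FS_DEFAULT_FLOW_TAG,
                              NULL /* no explicit destination */);
    if (IS_ERR(rule))
            return PTR_ERR(rule);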
-
Submitted by Maor Gottlieb
Create an empty flow table at the end of the NIC RX namespace. Adding this flow table simplifies the implementation of "forward to next prio" rules. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 05 Mar 2016, 1 commit
-
-
Submitted by Sagi Grimberg
Devices capable of registering SG lists with gaps can now expose that to ULPs in the core via a new device capability, IB_DEVICE_SG_GAPS_REG (in a new device_cap_flags_ex field in the device attributes, as we ran out of bits), and a new mr_type, IB_MR_TYPE_SG_GAPS_REG, which allocates a memory region capable of handling SG lists with gaps. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
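A sketch of how a ULP might use the pair introduced here, assuming the capability and mr_type names from this commit:

    /* sketch: allocate a gap-tolerant MR only if the device advertises it */
    static struct ib_mr *alloc_sg_gaps_mr(struct ib_pd *pd, int max_num_sg)
    {
            struct ib_device *dev = pd->device;

            if (!(dev->attrs.device_cap_flags_ex & IB_DEVICE_SG_GAPS_REG))
                    return ERR_PTR(-EOPNOTSUPP);
            return ib_alloc_mr(pd, IB_MR_TYPE_SG_GAPS_REG, max_num_sg);
    }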
-
- 04 Mar 2016, 1 commit
-
-
Submitted by Hal Rosenstock
In ib_mad.h, ib_mad_snoop_handler uses send_buf rather than send_wr. Signed-off-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 02 Mar 2016, 4 commits
-
-
Submitted by Matan Barak
This patch adds user-space support for memory window allocation and deallocation. It also exposes the supported types via the query_device_caps verb. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Tested-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
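For context, a sketch of the kernel verbs that uverbs now forwards to user space (error handling abridged):

    /* sketch: type-1 memory window lifecycle via the existing verbs */
    struct ib_mw *mw = ib_alloc_mw(pd, IB_MW_TYPE_1);

    if (IS_ERR(mw))
            return PTR_ERR(mw);
    /* ... bind and use the window ... */
    ib_dealloc_mw(mw);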
-
Submitted by Matan Barak
Pass udata to the vendor's driver in order to carry data from the user-space driver to the kernel-space driver. This data will be used in downstream patches. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Matan Barak
Mlx5's mkey mechanism is also used for memory windows. The current code base uses MR (memory region) naming, which is inaccurate; change MR to mkey in order to represent its different usages more accurately. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Haggai Eran
In order to create multiple GSI QPs, we need to set the source QP number to one on all these QPs. Add the necessary definitions and infrastructure to do that. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 01 Mar 2016, 7 commits
-
-
Submitted by Erez Shitrit
The HW can supply several offloads for UD QPs: checksum offloads for both TX and RX, and LSO for TX. Two new bits were added in order to expose and enable these offloads: 1. An HCA capability bit that declares support for IPoIB basic offloads. 2. A QPC bit, used in the QP creation flow, which enables these abilities in the QP. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Meny Yossefi
Modify mlx5_ib_process_mad to use the PPCNT and query_vport commands instead of MAD_IFC, as MAD_IFC is deprecated on new firmware versions (and doesn't support RoCE anyway). Traffic counters exist in both 32-bit and 64-bit forms. Declaring support for extended counters means traffic counters are read in their 64-bit form only, via the query_vport command. Error counters exist only in 32-bit form and are read via the PPCNT command. This commit also adds counter support for RoCE. Signed-off-by: Meny Yossefi <menyy@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Meny Yossefi
Added a helper function to read the IB standard error counters via the PPCNT register. The PPCNT register read command provides the 32-bit error counters of both the IB/RoCE link layer and the transport layer. Signed-off-by: Meny Yossefi <menyy@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Meny Yossefi
Added a helper function to read the 64-bit virtual port InfiniBand traffic counters. Signed-off-by: Meny Yossefi <menyy@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Marina Varshaver
Add support for receiving multicast/unicast traffic with the don't-trap rule. Sniffing these packets requires a flow steering rule of type NORMAL at priority 0 with the IB_FLOW_ATTR_FLAGS_DONT_TRAP flag set. Choosing between multicast and unicast is done via the Ethernet L2 dest_mac mask and value: - If the mask is all zeros, both unicast and multicast are matched. - If the mask is non-zero, the only supported mask has the multicast bit set and all other bits clear; the MAC value then selects whether the rule matches multicast or unicast. If the multicast bit of the mask is set together with other bits, that amounts to requesting a specific multicast or unicast address, which is not supported: the rule receives either all multicast or all unicast. Only when these limitations are met will the registered QP receive the requested traffic type, though other QPs registered for the same traffic can receive it as well. If the limitations are not met, an error is returned. Limitations: - The rule must have priority 0. - A0 mode is not supported. - A sniffer QP cannot appear in any other flow steering rule. Signed-off-by: Marina Varshaver <marinav@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
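A sketch of the only supported non-zero mask described above, expressed as an Ethernet flow spec (struct and field names from the verbs API; the surrounding flow attribute setup is omitted):

    /* sketch: match on the L2 multicast bit only */
    struct ib_flow_spec_eth eth = {
            .type = IB_FLOW_SPEC_ETH,
            .size = sizeof(eth),
    };

    eth.mask.dst_mac[0] = 0x01;     /* multicast bit set, all else clear */
    eth.val.dst_mac[0]  = 0x01;     /* 0x01 = all multicast, 0x00 = all unicast */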
-
Submitted by Marina Varshaver
The don't-trap flag (i.e. IB_FLOW_ATTR_FLAGS_DONT_TRAP) indicates that the QP will receive traffic but will not steal it. When a packet matches a flow steering rule that was created with the don't-trap flag, the QPs assigned to this rule get the packet, but matching continues on to other equal/lower-priority rules, letting the QPs assigned to those rules get the packet too. If both a don't-trap rule and other rules have the same priority and match the same packet, the behavior is undefined. The don't-trap flag can't be set on the default rule types (i.e. IB_FLOW_ATTR_ALL_DEFAULT, IB_FLOW_ATTR_MC_DEFAULT), since no rules come after a default rule and don't-trap has no meaning there. Signed-off-by: Marina Varshaver <marinav@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Steve Wise
Add provider-specific drain_sq/drain_rq functions for providers needing special drain logic. Add static functions __ib_drain_sq() and __ib_drain_rq(), which post no-op WRs to the SQ or RQ and block until their completions are processed. This ensures that the application's completions for work requests posted prior to the drain work request have all been processed. Add the API functions ib_drain_sq(), ib_drain_rq(), and ib_drain_qp(). For the drain logic to work, the caller must: ensure there is room in the CQ(s) and QP for the drain work request and completion; allocate the CQ using ib_alloc_cq(), with a CQ poll context other than IB_POLL_DIRECT; and ensure that no other contexts post WRs concurrently, otherwise the drain is not guaranteed. Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
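A sketch of a caller that meets those requirements (sizes and teardown order illustrative):

    /* sketch: reserve room for the drain WR/CQE and use a pollable context */
    struct ib_cq *cq = ib_alloc_cq(dev, NULL, qp_depth + 2, 0,
                                   IB_POLL_WORKQUEUE);  /* not IB_POLL_DIRECT */
    if (IS_ERR(cq))
            return PTR_ERR(cq);
    /* ... create the QP against cq, run I/O, stop posting new WRs ... */
    ib_drain_qp(qp);        /* blocks until prior completions are processed */
    ib_destroy_qp(qp);
    ib_free_cq(cq);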
-
- 28 Feb 2016, 3 commits
-
-
Submitted by Ross Zwisler
Previously calls to dax_writeback_mapping_range() for all DAX filesystems (ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range(). dax_writeback_mapping_range() needs a struct block_device, and it used to get that from inode->i_sb->s_bdev. This is correct for normal inodes mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time files. Instead, call dax_writeback_mapping_range() directly from the filesystem ->writepages function so that it can supply us with a valid block device. This also fixes DAX code to properly flush caches in response to sync(2). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
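A sketch of the new call site, modeled on a simple filesystem such as ext2, where the correct bdev happens to be inode->i_sb->s_bdev; the point of the change is that each filesystem now supplies it explicitly:

    /* sketch: ->writepages routes DAX mappings through dax_writeback */
    static int ext2_writepages(struct address_space *mapping,
                               struct writeback_control *wbc)
    {
            if (dax_mapping(mapping))
                    return dax_writeback_mapping_range(mapping,
                                    mapping->host->i_sb->s_bdev, wbc);
            return mpage_writepages(mapping, wbc, ext2_get_block);
    }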
-
Submitted by Ross Zwisler
dax_clear_blocks() needs a valid struct block_device and previously it was using inode->i_sb->s_bdev in all cases. This is correct for normal inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time devices. Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change its arguments to take a bdev and a sector instead of an inode and a block. This better reflects what the function does, and it allows the filesystem and raw block device code to pass in an appropriate struct block_device. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Daniel Cashman
Commit d07e2259 ("mm: mmap: add new /proc tunable for mmap_base ASLR") added the ability to choose from a range of values to use for entropy count in generating the random offset to the mmap_base address. The maximum value on this range was set to 32 bits for 64-bit x86 systems, but this value could be increased further, requiring more than the 32 bits of randomness provided by get_random_int(), as is already possible for arm64. Add a new function: get_random_long() which more naturally fits with the mmap usage of get_random_int() but operates exactly the same as get_random_int(). Also, fix the shifting constant in mmap_rnd() to be an unsigned long so that values greater than 31 bits generate an appropriate mask without overflow. This is especially important on x86, as its shift instruction uses a 5-bit mask for the shift operand, which meant that any value for mmap_rnd_bits over 31 acts as a no-op and effectively disables mmap_base randomization. Finally, replace calls to get_random_int() with get_random_long() where appropriate. This patch (of 2): Add get_random_long(). Signed-off-by: Daniel Cashman <dcashman@android.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Kralevich <nnk@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
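A sketch of the resulting arch_mmap_rnd() shape, assuming the x86 tunable name from the quoted commit; the unsigned-long constant is the fix for the 5-bit shift-mask truncation:

    /* sketch: wide entropy plus a mask safe for mmap_rnd_bits > 31 */
    static unsigned long arch_mmap_rnd(void)
    {
            unsigned long rnd;

            /* 1UL (not 1): the mask itself must be computed in 64 bits */
            rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);
            return rnd << PAGE_SHIFT;
    }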
-
- 26 Feb 2016, 1 commit
-
-
Submitted by Takashi Iwai
Currently the interrupt handler of the HD-audio driver assumes that no irq update is needed while processing the irq. In reality, however, it has been confirmed that the HW can issue an irq even during irq handling. Since we clear the irq status at the beginning, process the interrupt, then exit from the handler, an interrupt issued in the meantime is left untouched without being properly processed. This patch changes the interrupt handler code to loop over check-and-process: the handler retries as long as IRQ status bits are set and either a stream or CORB/RIRB was handled. To support the stream-handling check, snd_hdac_bus_handle_stream_irq() now returns a value indicating the handled stream index bits. Other than that, the change is confined to the irq handler itself. Reported-by: Libin Yang <libin.yang@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
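A sketch of the loop shape described above; the RIRB helper and the retry bound are hypothetical stand-ins, and snd_hdac_bus_handle_stream_irq() returning the handled-stream bits is the piece this commit adds:

    /* sketch: keep re-reading INTSTS until nothing more was handled */
    int repeat = 0;

    do {
            unsigned int status = azx_readl(chip, INTSTS);
            bool active = false;

            if (status == 0 || status == 0xffffffff)
                    break;
            if (snd_hdac_bus_handle_stream_irq(bus, status, stream_update))
                    active = true;          /* some stream irq was handled */
            if (handle_corb_rirb(chip, status))     /* hypothetical helper */
                    active = true;
            if (!active)
                    break;
    } while (++repeat < 10);        /* bound illustrative */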
-
- 25 Feb 2016, 2 commits
-
-
Submitted by Peter Zijlstra
perf_install_in_context() relies upon the context switch hooks to have scheduled in events when the IPI misses its target -- after all, if the task has moved from the CPU (or wasn't running at all), it will have to context switch to run elsewhere. This however doesn't appear to be happening. It is possible for the IPI to not happen (task wasn't running), only to later observe the task running with an inactive context. The only possible explanation is that the context switch hooks are not called. Therefore put in a sync_sched() after toggling the jump_label, to guarantee all CPUs will have them enabled before we install an event. A simple if (0->1) sync_sched() will not in fact work, because any further increment can race and complete before the sync_sched(). Therefore we must jump through some hoops. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
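A sketch of the "hoops", assuming counter/mutex names in the style of the perf core: the slow path serializes enable + sync under a mutex, and the fast path only succeeds once that has completed:

    /* sketch: enable the static branch exactly once, synchronized */
    if (!atomic_inc_not_zero(&perf_sched_count)) {
            mutex_lock(&perf_sched_mutex);
            if (!atomic_read(&perf_sched_count)) {
                    static_branch_enable(&perf_sched_events);
                    /* every CPU sees the hooks before any event installs */
                    synchronize_sched();
            }
            atomic_inc(&perf_sched_count);  /* 0->1 can no longer race the sync */
            mutex_unlock(&perf_sched_mutex);
    }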
-
Submitted by Peter Zijlstra
Alexander reported that when the 'original' context gets destroyed, no new clones happen. This can happen irrespective of the ctx switch optimization; any task can die, even the parent, and we want to continue monitoring the task hierarchy until we either close the event or no tasks are left in the hierarchy. perf_event_init_context() will attempt to pin the 'parent' context during clone(). At that point current is the parent, and since current cannot have exited while executing clone(), its context cannot have passed through perf_event_exit_task_context(). Therefore perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE. However, since inherit_event() does: if (parent_event->parent) parent_event = parent_event->parent; it looks at the 'original' event when it does: is_orphaned_event(). This can return true if the context that contains this event has passed through perf_event_exit_task_context(), and thus we'll fail to clone the perf context. Fix this by adding a new state: STATE_DEAD, which is set by perf_release() to indicate that the filedesc (or kernel reference) is dead and there are no observers for our data left. Only for STATE_DEAD will is_orphaned_event() be true and inhibit cloning. STATE_EXIT is otherwise preserved such that is_event_hup() remains functional and will report when the observed task hierarchy becomes empty. Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Fixes: c6e5b732 ("perf: Synchronously clean up child events") Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 24 Feb 2016, 1 commit
-
-
Submitted by Dan Williams
The original format of these commands from the "NVDIMM DSM Interface Example" [1] is superseded by the ACPI 6.1 definition of the "NVDIMM Root Device _DSMs" [2]. [1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf [2]: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf "9.20.7 NVDIMM Root Device _DSMs" Changes include: 1/ New 'restart' fields in ars_status; unfortunately these are implemented in the middle of the existing definition, so this change is not backwards compatible. The expectation is that shipping platforms will only ever support the ACPI 6.1 definition. 2/ New status values for ars_start ('busy') and ars_status ('overflow'). Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
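A sketch of the ACPI 6.1 ars_status layout the text refers to; field names follow the spec wording, and the mid-struct placement of the restart fields is what breaks backward compatibility:

    /* sketch of the updated command payload (offsets abridged) */
    struct nd_cmd_ars_status {
            __u32 status;
            __u32 out_length;
            __u64 address;
            __u64 length;
            __u64 restart_address;  /* new in ACPI 6.1, lands mid-struct */
            __u64 restart_length;   /* new in ACPI 6.1 */
            __u16 type;
            __u16 flags;
            __u32 num_records;
            /* variable-length error records follow */
    } __attribute__((packed));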
-
- 22 Feb 2016, 2 commits
-
-
Submitted by Karicheri, Muralidharan
Rename the pad to sw_data, as per the description of this field in the hardware spec (refer to sprugr9 from www.ti.com; the latest version of the document is at http://www.ti.com/lit/ug/sprugr9h/sprugr9h.pdf, and section 3.1 Host Packet Descriptor describes this field). Define and use a constant for the size of the sw_data field, similar to the other fields in the struct for desc, and document the sw_data field in the header. As sw_data is not touched by hw, its type can be changed to u32. Rename the helpers to match the updated dma desc field sw_data. Cc: Wingman Kwok <w-kwok2@ti.com> Cc: Mugunthan V N <mugunthanvnm@ti.com> CC: Arnd Bergmann <arnd@arndb.de> CC: Grygorii Strashko <grygorii.strashko@ti.com> CC: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
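A sketch of the renamed field; the constant name and the surrounding members are assumptions made for illustration, not taken verbatim from the patch:

    #define KNAV_DMA_NUM_SW_DATA_WORDS 4    /* name assumed for illustration */

    struct knav_dma_desc {
            /* ... hardware-owned descriptor words precede ... */
            u32 epib[4];
            u32 psdata[16];
            /* software scratch area: never read or written by the hw */
            u32 sw_data[KNAV_DMA_NUM_SW_DATA_WORDS];
    };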
-
Submitted by Ivaylo Dimitrov
Commit 703df6c0 ("power: bq27xxx_battery: Reorganize I2C into a module") removed the device name numbering from bq27xxx_battery_i2c_probe. Fix that by restoring the code. Fixes: 703df6c0 Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
-
- 20 Feb 2016, 2 commits
-
-
Submitted by Dan Williams
Use the output length specified in the command to size the receive buffer rather than the arbitrary 4K limit. This bug was hiding the fact that the ndctl implementation of ndctl_bus_cmd_new_ars_status() was not specifying an output buffer size. Cc: <stable@vger.kernel.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Submitted by Nikolay Aleksandrov
When I used netdev_for_each_lower_dev in commit bad53162 ("vrf: remove slave queue and private slave struct") I thought that it acts like netdev_for_each_lower_private and can be used to remove the current device from the list while walking, but unfortunately it acts more like netdev_for_each_lower_private_rcu and doesn't allow it. The difference is where the "iter" points to, right now it points to the current element and that makes it impossible to remove it. Change the logic to be similar to netdev_for_each_lower_private and make it point to the "next" element so we can safely delete the current one. VRF is the only such user right now, there's no change for the read-only users. Here's what can happen now: [98423.249858] general protection fault: 0000 [#1] SMP [98423.250175] Modules linked in: vrf bridge(O) stp llc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ppdev aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd evdev serio_raw pcspkr virtio_balloon parport_pc parport i2c_piix4 i2c_core virtio_console acpi_cpufreq button 9pnet_virtio 9p 9pnet fscache ipv6 autofs4 ext4 crc16 mbcache jbd2 sg virtio_blk virtio_net sr_mod cdrom e1000 ata_generic ehci_pci uhci_hcd ehci_hcd usbcore usb_common virtio_pci ata_piix libata floppy virtio_ring virtio scsi_mod [last unloaded: bridge] [98423.255040] CPU: 1 PID: 14173 Comm: ip Tainted: G O 4.5.0-rc2+ #81 [98423.255386] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [98423.255777] task: ffff8800547f5540 ti: ffff88003428c000 task.ti: ffff88003428c000 [98423.256123] RIP: 0010:[<ffffffff81514f3e>] [<ffffffff81514f3e>] netdev_lower_get_next+0x1e/0x30 [98423.256534] RSP: 0018:ffff88003428f940 EFLAGS: 00010207 [98423.256766] RAX: 0002000100000004 RBX: ffff880054ff9000 RCX: 0000000000000000 [98423.257039] RDX: ffff88003428f8b8 RSI: ffff88003428f950 RDI: ffff880054ff90c0 [98423.257287] RBP: ffff88003428f940 R08: 0000000000000000 R09: 0000000000000000 [98423.257537] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88003428f9e0 [98423.257802] R13: ffff880054a5fd00 R14: ffff88003428f970 R15: 0000000000000001 [98423.258055] FS: 00007f3d76881700(0000) GS:ffff88005d000000(0000) knlGS:0000000000000000 [98423.258418] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [98423.258650] CR2: 00007ffe5951ffa8 CR3: 0000000052077000 CR4: 00000000000406e0 [98423.258902] Stack: [98423.259075] ffff88003428f960 ffffffffa0442636 0002000100000004 ffff880054ff9000 [98423.259647] ffff88003428f9b0 ffffffff81518205 ffff880054ff9000 ffff88003428f978 [98423.260208] ffff88003428f978 ffff88003428f9e0 ffff88003428f9e0 ffff880035b35f00 [98423.260739] Call Trace: [98423.260920] [<ffffffffa0442636>] vrf_dev_uninit+0x76/0xa0 [vrf] [98423.261156] [<ffffffff81518205>] rollback_registered_many+0x205/0x390 [98423.261401] [<ffffffff815183ec>] unregister_netdevice_many+0x1c/0x70 [98423.261641] [<ffffffff8153223c>] rtnl_delete_link+0x3c/0x50 [98423.271557] [<ffffffff815335bb>] rtnl_dellink+0xcb/0x1d0 [98423.271800] [<ffffffff811cd7da>] ? __inc_zone_state+0x4a/0x90 [98423.272049] [<ffffffff815337b4>] rtnetlink_rcv_msg+0x84/0x200 [98423.272279] [<ffffffff810cfe7d>] ? trace_hardirqs_on+0xd/0x10 [98423.272513] [<ffffffff8153370b>] ? rtnetlink_rcv+0x1b/0x40 [98423.272755] [<ffffffff81533730>] ? 
rtnetlink_rcv+0x40/0x40 [98423.272983] [<ffffffff8155d6e7>] netlink_rcv_skb+0x97/0xb0 [98423.273209] [<ffffffff8153371a>] rtnetlink_rcv+0x2a/0x40 [98423.273476] [<ffffffff8155ce8b>] netlink_unicast+0x11b/0x1a0 [98423.273710] [<ffffffff8155d2f1>] netlink_sendmsg+0x3e1/0x610 [98423.273947] [<ffffffff814fbc98>] sock_sendmsg+0x38/0x70 [98423.274175] [<ffffffff814fc253>] ___sys_sendmsg+0x2e3/0x2f0 [98423.274416] [<ffffffff810d841e>] ? do_raw_spin_unlock+0xbe/0x140 [98423.274658] [<ffffffff811e1bec>] ? handle_mm_fault+0x26c/0x2210 [98423.274894] [<ffffffff811e19cd>] ? handle_mm_fault+0x4d/0x2210 [98423.275130] [<ffffffff81269611>] ? __fget_light+0x91/0xb0 [98423.275365] [<ffffffff814fcd42>] __sys_sendmsg+0x42/0x80 [98423.275595] [<ffffffff814fcd92>] SyS_sendmsg+0x12/0x20 [98423.275827] [<ffffffff81611bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a [98423.276073] Code: c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 06 55 48 81 c7 c0 00 00 00 48 89 e5 48 8b 00 48 39 f8 74 09 48 89 06 <48> 8b 40 e8 5d c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 [98423.279639] RIP [<ffffffff81514f3e>] netdev_lower_get_next+0x1e/0x30 [98423.279920] RSP <ffff88003428f940> CC: David Ahern <dsa@cumulusnetworks.com> CC: David S. Miller <davem@davemloft.net> CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Vlad Yasevich <vyasevic@redhat.com> Fixes: bad53162 ("vrf: remove slave queue and private slave struct") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
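For reference, a sketch of the fixed iterator step (modeled on net/core/dev.c of this period): the iterator is advanced before the current entry is returned, so the caller may unlink that entry safely, while read-only walkers see no behavioral change:

    void *netdev_lower_get_next(struct net_device *dev, struct list_head **iter)
    {
            struct netdev_adjacent *lower;

            lower = list_entry(*iter, struct netdev_adjacent, list);
            if (&lower->list == &dev->adj_list.lower)
                    return NULL;            /* walked past the last element */
            *iter = lower->list.next;       /* advance before returning */
            return lower->dev;
    }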
-
- 19 Feb 2016, 4 commits
-
-
Submitted by Maarten Lankhorst
Because we record connector_mask using 1 << drm_connector_index now, the connector_mask should stay the same even when other connectors are removed. This was not the case with MST: there, removing a connector could change the index of all other connectors. This is fixed by waiting until the first get_connector_state to allocate connector_state, and forcing reallocation when the state is too small. As a side effect, connector arrays no longer have to be preallocated and can be allocated on first use, which means fewer allocations in the page-flip-only path. Changes since v1: - Whitespace. (Ville) - Call ida_remove when destroying the connector. (Ville) - u32 alloc -> int. (Ville) Fixes: 14de6c44 ("drm/atomic: Remove drm_atomic_connectors_for_crtc.") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lyude <cpaul@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Submitted by Jeff Layton
This reverts commit c510eff6 ("fsnotify: destroy marks with call_srcu instead of dedicated thread"). Eryu reported that he was seeing some OOM kills kick in when running a testcase that adds and removes inotify marks on a file in a tight loop. The above commit changed the code to use call_srcu to clean up the marks. While that does (in principle) work, the srcu callback job is limited to cleaning up entries in small batches and only once per jiffy. It's easily possible to overwhelm that machinery with too many call_srcu callbacks, and Eryu's reproducer did just that. There's also another potential problem with using call_srcu here. While you can obviously sleep while holding the srcu_read_lock, the callbacks run under local_bh_disable, so you can't sleep there. It's possible when putting the last reference to the fsnotify_mark that we'll end up putting a chain of references including the fsnotify_group, uid, and associated keys. While I don't see any obvious ways that could occur, it's probably still best to avoid using call_srcu here after all. This patch reverts the above patch. A later patch will take a different approach to eliminating the dedicated thread here. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reported-by: Eryu Guan <guaneryu@gmail.com> Tested-by: Eryu Guan <guaneryu@gmail.com> Cc: Jan Kara <jack@suse.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Eric Dumazet
Ilya reported the following lockdep splat: kernel: ========================= kernel: [ BUG: held lock freed! ] kernel: 4.5.0-rc1-ceph-00026-g5e0a311 #1 Not tainted kernel: ------------------------- kernel: swapper/5/0 is freeing memory ffff880035c9d200-ffff880035c9dbff, with a lock still held there! kernel: (&(&queue->rskq_lock)->rlock){+.-...}, at: [<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0 kernel: 4 locks held by swapper/5/0: kernel: #0: (rcu_read_lock){......}, at: [<ffffffff8169ef6b>] netif_receive_skb_internal+0x4b/0x1f0 kernel: #1: (rcu_read_lock){......}, at: [<ffffffff816e977f>] ip_local_deliver_finish+0x3f/0x380 kernel: #2: (slock-AF_INET){+.-...}, at: [<ffffffff81685ffb>] sk_clone_lock+0x19b/0x440 kernel: #3: (&(&queue->rskq_lock)->rlock){+.-...}, at: [<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0 To properly fix this issue, inet_csk_reqsk_queue_add() needs to report to its callers whether the child has been queued into the accept queue. We also need to make sure the listener is still there before calling sk->sk_data_ready(), by holding a reference on it, since the reference carried by the child can disappear as soon as the child is put on the accept queue. Reported-by: Ilya Dryomov <idryomov@gmail.com> Fixes: ebb516af ("tcp/dccp: fix race at listener dismantle phase") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Xin Long
Since the gc of ipv4 routes was removed, a cached route has no chance of being removed: even after it has expired it can still be used, because no code checks its expiry. Fix this issue by checking the route cache entry and removing it when we get a route. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 Feb 2016, 3 commits
-
-
Submitted by Bjorn Helgaas
Revert 811a4e6f ("PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed"). This is part of reverting 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") to fix regressions it introduced. Link: https://bugzilla.kernel.org/show_bug.cgi?id=111211 Fixes: 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> CC: Jiang Liu <jiang.liu@linux.intel.com>
-
Submitted by Jessica Yu
Remove the ftrace module notifier in favor of directly calling ftrace_module_enable() and ftrace_release_mod() in the module loader. Hard-coding the function calls directly in the module loader removes dependence on the module notifier call chain and provides better visibility and control over what gets called when, which is important to kernel utilities such as livepatch. This fixes a notifier ordering issue in which the ftrace module notifier (and hence ftrace_module_enable()) for coming modules was being called after klp_module_notify(), which caused livepatch modules to initialize incorrectly. This patch removes dependence on the module notifier call chain in favor of hard coding the corresponding function calls in the module loader. This ensures that ftrace and livepatch code get called in the correct order on patch module load and unload. Fixes: 5156dca3 ("ftrace: Fix the race between ftrace and insmod") Signed-off-by: Jessica Yu <jeyu@redhat.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Petr Mladek <pmladek@suse.cz> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
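A sketch of the direct calls that replace the notifier; the exact placement within kernel/module.c is approximate:

    /* sketch: module load path, while the module is COMING */
    ftrace_module_enable(mod);   /* was: ftrace's MODULE_STATE_COMING notifier */

    /* sketch: module unload and load-error paths */
    ftrace_release_mod(mod);     /* was: ftrace's MODULE_STATE_GOING notifier */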
-
Submitted by Kinglong Mee
unreferenced object 0xffffc90000abf000 (size 16900): comm "fsync02", pid 15765, jiffies 4297431627 (age 423.772s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 a0 c2 19 00 88 ff ff ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8174d54e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811b9b91>] __vmalloc_node_range+0x231/0x280 [<ffffffff811b9c2a>] __vmalloc+0x4a/0x50 [<ffffffffa02c9ec1>] ext_tree_prepare_commit+0x231/0x2e0 [blocklayoutdriver] [<ffffffffa02c700e>] bl_prepare_layoutcommit+0xe/0x10 [blocklayoutdriver] [<ffffffffa0596a6c>] pnfs_layoutcommit_inode+0x29c/0x330 [nfsv4] [<ffffffffa0596b13>] pnfs_generic_sync+0x13/0x20 [nfsv4] [<ffffffffa0585188>] nfs4_file_fsync+0x58/0x150 [nfsv4] [<ffffffff81228e5b>] vfs_fsync_range+0x4b/0xb0 [<ffffffff81228f1d>] do_fsync+0x3d/0x70 [<ffffffff812291d0>] SyS_fsync+0x10/0x20 [<ffffffff81757def>] entry_SYSCALL_64_fastpath+0x12/0x76 [<ffffffffffffffff>] 0xffffffffffffffff v2, add missing include header Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 17 Feb 2016, 2 commits
-
-
Submitted by Huy Nguyen
Problem description: the current code sets the UAR page size equal to the system page size. The ConnectX-3 and ConnectX-3 Pro HWs require a minimum of 128 UAR pages, and the mlx4 kernel drivers are not loaded if there are fewer than 128 UAR pages. Solution: always set the UAR page size to 4KB. This allows more UAR pages if the OS has a PAGE_SIZE larger than 4KB. For example, the PowerPC kernel uses a 64KB system page size; with a 4MB uar region there are 4MB/2/64KB = 32 uars (half for uar, half for blueflame), which does not meet the minimum requirement of 128 UAR pages. With a 4KB UAR page there are 4MB/2/4KB = 512 uars, which does. Note that only code in mlx4_core that deals with the firmware knows that the uar page size is 4KB; code that deals with the usr page in cq and qp contexts (mlx4_ib, mlx4_en and part of mlx4_core) still assumes that the uar page size equals the system page size. With this implementation, on a kernel with a 64KB system page size there are 16 uars per system page but only one uar is used; the other 15 are ignored because of the above assumption. Regarding SR-IOV, mlx4_core in the hypervisor will set the uar page size to 4KB, and mlx4_core in the virtual OS will obtain the uar page size from the firmware. Regarding backward compatibility in SR-IOV: if the hypervisor has this new code, the virtual OS must be updated; if the hypervisor has the old code and the virtual OS has this new code, the new code is backward compatible with the old code. If the uar size is big enough, the new code in the VF continues to work with a 64KB uar page size (on a PowerPC kernel); if the uar size does not meet the 128-uar requirement, the new code does not load in the VF and prints the same error message as the old code did in the hypervisor. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Matan Barak
mlx5_ifc.h is a header file representing the API and ABI between the driver and the firmware and hardware. This file is used by both the mlx5_ib and mlx5_core drivers. Previously, this file used an incrementing counter to name reserved fields, for example:

    struct mlx5_ifc_odp_per_transport_service_cap_bits {
            u8 send[0x1];
            u8 receive[0x1];
            u8 write[0x1];
            u8 read[0x1];
            u8 reserved_0[0x1];
            u8 srq_receive[0x1];
            u8 reserved_1[0x1a];
    };

If one developer implements feature A through net-next using reserved_0, they replace it with featureA and rename reserved_1 to reserved_0. In the same kernel cycle, a second developer could implement feature B through the rdma tree using reserved_1, splitting it into featureB and a smaller reserved_1 field. This causes a conflict when the two trees are merged; the source of the conflict is that the first developer changed *all* the reserved fields. As Linus suggested, we change the layout of the structs to:

    struct mlx5_ifc_odp_per_transport_service_cap_bits {
            u8 send[0x1];
            u8 receive[0x1];
            u8 write[0x1];
            u8 read[0x1];
            u8 reserved_at_4[0x1];
            u8 srq_receive[0x1];
            u8 reserved_at_6[0x1a];
    };

This makes such conflicts much rarer and preserves the locality of changes. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-