- 22 Mar 2016, 3 commits
-
-
Submitted by Leon Romanovsky
The old bitwise device_cap_flags variable was limited to u32, which already has all of its bits defined. To overcome this, we converted the device_cap_flags variable to be of type u64. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
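A minimal sketch of the shape of this change, with a hypothetical flag bit past bit 31 to show what the wider field buys (names other than device_cap_flags are illustrative):

    #include <stdint.h>

    /* sketch: device_cap_flags widens from u32 to u64, freeing bits 32..63 */
    struct ib_device_attr_sketch {
            uint64_t device_cap_flags;      /* was uint32_t, fully occupied */
    };

    #define IB_DEVICE_HYPOTHETICAL_CAP (1ULL << 32) /* hypothetical new bit */

    static int has_hypothetical_cap(const struct ib_device_attr_sketch *attr)
    {
            /* capability tests look exactly as before, just on a wider type */
            return !!(attr->device_cap_flags & IB_DEVICE_HYPOTHETICAL_CAP);
    }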
-
Submitted by Sagi Grimberg
Define the necessary hardware structures for the offload arithmetic capabilities and read/cache them on driver load. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Leon Romanovsky
The device capability function was called the same way everywhere: twice for every queried parameter, with the two calls differing only in the HCA capability mode. This change unifies those calls into one function. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
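A sketch of the unified helper, assuming the mlx5 opmod naming of this period (the enum values shown are illustrative):

    /* sketch: one entry point queries both capability modes */
    enum mlx5_cap_mode {
            HCA_CAP_OPMOD_GET_MAX = 0,      /* values illustrative */
            HCA_CAP_OPMOD_GET_CUR = 1,
    };

    static int mlx5_core_get_caps_mode(struct mlx5_core_dev *dev,
                                       enum mlx5_cap_type cap_type,
                                       enum mlx5_cap_mode cap_mode);

    int mlx5_core_get_caps(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type)
    {
            int err;

            /* previously every call site issued these two queries itself */
            err = mlx5_core_get_caps_mode(dev, cap_type, HCA_CAP_OPMOD_GET_CUR);
            if (err)
                    return err;
            return mlx5_core_get_caps_mode(dev, cap_type, HCA_CAP_OPMOD_GET_MAX);
    }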
-
- 17 Mar 2016, 1 commit
-
-
Submitted by Faisal Latif
Moved port mapper related code from the drivers into common code. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Tatyana E. Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 10 Mar 2016, 3 commits
-
-
Submitted by Maor Gottlieb
Each bypass flow steering priority will be split into two priorities: 1. A priority for don't-trap rules. 2. A priority for normal rules. When a user creates a flow using the IB_FLOW_ATTR_FLAGS_DONT_TRAP flag, the driver creates two flow rules: one used for receiving the traffic, and another that forwards the packet onward so it continues matching in lower or equal priorities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
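A sketch of the user-visible side, assuming the verbs-level flag and call from this series (QP creation and flow specs omitted):

    /* sketch: a NORMAL rule at priority 0 that receives but doesn't steal */
    struct ib_flow_attr attr = {
            .type         = IB_FLOW_ATTR_NORMAL,
            .size         = sizeof(attr),
            .priority     = 0,
            .num_of_specs = 0,              /* specs would follow the attr */
            .port         = 1,
            .flags        = IB_FLOW_ATTR_FLAGS_DONT_TRAP,
    };
    struct ib_flow *flow = ib_create_flow(qp, &attr, IB_FLOW_DOMAIN_USER);

    if (IS_ERR(flow))
            return PTR_ERR(flow);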
-
Submitted by Maor Gottlieb
Add support for creating a flow rule that forwards packets to the first flow table in the next priority (the next priority could be the first priority in the next namespace, or the next priority in the same namespace). This feature can be used for DONT_TRAP rules, or for rules that only want to mark the packet with a flow tag. To do this optimally, each flow table keeps a list of all rules that point to it; when a flow table is created or destroyed, we update the list head correspondingly. This kind of rule is created when the destination is NULL and the action is MLX5_FLOW_CONTEXT_ACTION_FWD_NEXT_PRIO. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
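A sketch of how such a rule would be requested through the mlx5 flow-steering API of this era; the call signature is assumed from contemporaneous mlx5 code, and the match buffers are left empty for brevity:

    struct mlx5_flow_rule *rule;
    u32 match_criteria[MLX5_ST_SZ_DW(fte_match_param)] = {};
    u32 match_value[MLX5_ST_SZ_DW(fte_match_param)] = {};

    /* NULL destination + FWD_NEXT_PRIO = "keep matching at the next priority" */
    rule = mlx5_add_flow_rule(ft, 0 /* match_criteria_enable */,
                              match_criteria, match_value,
                              MLX5_FLOW_CONTEXT_ACTION_FWD_NEXT_PRIO,
                              MLX5_FS_DEFAULT_FLOW_TAG,
                              NULL /* no explicit destination */);
    if (IS_ERR(rule))
            return PTR_ERR(rule);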
-
Submitted by Maor Gottlieb
Create an empty flow table at the end of the NIC RX namespace. Adding this flow table simplifies the implementation of "forward to next prio" rules. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 05 Mar 2016, 1 commit
-
-
Submitted by Sagi Grimberg
Devices capable of registering SG lists with gaps can now expose that to ULPs in the core via a new device capability, IB_DEVICE_SG_GAPS_REG (in a new device_cap_flags_ex field in the device attributes, as we ran out of bits), and a new mr_type, IB_MR_TYPE_SG_GAPS_REG, which allocates a memory region capable of handling SG lists with gaps. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
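A sketch of how a ULP might use the pair introduced here, assuming the capability and mr_type names from this commit:

    /* sketch: allocate a gap-tolerant MR only if the device advertises it */
    static struct ib_mr *alloc_sg_gaps_mr(struct ib_pd *pd, int max_num_sg)
    {
            struct ib_device *dev = pd->device;

            if (!(dev->attrs.device_cap_flags_ex & IB_DEVICE_SG_GAPS_REG))
                    return ERR_PTR(-EOPNOTSUPP);
            return ib_alloc_mr(pd, IB_MR_TYPE_SG_GAPS_REG, max_num_sg);
    }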
-
- 04 Mar 2016, 1 commit
-
-
Submitted by Hal Rosenstock
In ib_mad.h, ib_mad_snoop_handler uses send_buf rather than send_wr. Signed-off-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 02 Mar 2016, 4 commits
-
-
Submitted by Matan Barak
This patch adds user-space support for memory window allocation and deallocation. It also exposes the supported types via the query_device_caps verb. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Tested-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
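For context, a sketch of the kernel verbs that uverbs now forwards to user space (error handling abridged):

    /* sketch: type-1 memory window lifecycle via the existing verbs */
    struct ib_mw *mw = ib_alloc_mw(pd, IB_MW_TYPE_1);

    if (IS_ERR(mw))
            return PTR_ERR(mw);
    /* ... bind and use the window ... */
    ib_dealloc_mw(mw);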
-
Submitted by Matan Barak
Pass udata to the vendor's driver in order to carry data from the user-space driver to the kernel-space driver. This data will be used in downstream patches. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Matan Barak
Mlx5's mkey mechanism is also used for memory windows. The current code base uses MR (memory region) naming, which is inaccurate; change MR to mkey in order to represent its different usages more accurately. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Haggai Eran
In order to create multiple GSI QPs, we need to set the source QP number to one on all these QPs. Add the necessary definitions and infrastructure to do that. Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 01 Mar 2016, 7 commits
-
-
Submitted by Erez Shitrit
The HW can supply several offloads for UD QPs: checksum offloads for both TX and RX, and LSO for TX. Two new bits were added in order to expose and enable these offloads: 1. An HCA capability bit that declares support for IPoIB basic offloads. 2. A QPC bit, used in the QP creation flow, which enables these abilities in the QP. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Meny Yossefi
Modify mlx5_ib_process_mad to use the PPCNT and query_vport commands instead of MAD_IFC, as MAD_IFC is deprecated on new firmware versions (and doesn't support RoCE anyway). Traffic counters exist in both 32-bit and 64-bit forms. Declaring support for extended counters means traffic counters are read in their 64-bit form only, via the query_vport command. Error counters exist only in 32-bit form and are read via the PPCNT command. This commit also adds counter support for RoCE. Signed-off-by: Meny Yossefi <menyy@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Meny Yossefi
Added a helper function to read the IB standard error counters via the PPCNT register. The PPCNT register read command provides the 32-bit error counters of both the IB/RoCE link layer and the transport layer. Signed-off-by: Meny Yossefi <menyy@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Meny Yossefi
Added a helper function to read the 64-bit virtual port InfiniBand traffic counters. Signed-off-by: Meny Yossefi <menyy@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Marina Varshaver
Add support for receiving multicast/unicast traffic with the don't-trap rule. Sniffing these packets requires a flow steering rule of type NORMAL at priority 0 with the IB_FLOW_ATTR_FLAGS_DONT_TRAP flag set. Choosing between multicast and unicast is done via the Ethernet L2 dest_mac mask and value: - If the mask is all zeros, both unicast and multicast are matched. - If the mask is non-zero, the only supported mask has the multicast bit set and all other bits clear; the MAC value then selects whether the rule matches multicast or unicast. If the multicast bit of the mask is set together with other bits, that amounts to requesting a specific multicast or unicast address, which is not supported: the rule receives either all multicast or all unicast. Only when these limitations are met will the registered QP receive the requested traffic type, though other QPs registered for the same traffic can receive it as well. If the limitations are not met, an error is returned. Limitations: - The rule must have priority 0. - A0 mode is not supported. - A sniffer QP cannot appear in any other flow steering rule. Signed-off-by: Marina Varshaver <marinav@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
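A sketch of the only supported non-zero mask described above, expressed as an Ethernet flow spec (struct and field names from the verbs API; the surrounding flow attribute setup is omitted):

    /* sketch: match on the L2 multicast bit only */
    struct ib_flow_spec_eth eth = {
            .type = IB_FLOW_SPEC_ETH,
            .size = sizeof(eth),
    };

    eth.mask.dst_mac[0] = 0x01;     /* multicast bit set, all else clear */
    eth.val.dst_mac[0]  = 0x01;     /* 0x01 = all multicast, 0x00 = all unicast */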
-
Submitted by Marina Varshaver
The don't-trap flag (i.e. IB_FLOW_ATTR_FLAGS_DONT_TRAP) indicates that the QP will receive traffic but will not steal it. When a packet matches a flow steering rule that was created with the don't-trap flag, the QPs assigned to this rule get the packet, but matching continues on to other equal/lower-priority rules, letting the QPs assigned to those rules get the packet too. If both a don't-trap rule and other rules have the same priority and match the same packet, the behavior is undefined. The don't-trap flag can't be set on the default rule types (i.e. IB_FLOW_ATTR_ALL_DEFAULT, IB_FLOW_ATTR_MC_DEFAULT), since no rules come after a default rule and don't-trap has no meaning there. Signed-off-by: Marina Varshaver <marinav@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Submitted by Steve Wise
Add provider-specific drain_sq/drain_rq functions for providers needing special drain logic. Add static functions __ib_drain_sq() and __ib_drain_rq(), which post no-op WRs to the SQ or RQ and block until their completions are processed. This ensures that the application's completions for work requests posted prior to the drain work request have all been processed. Add the API functions ib_drain_sq(), ib_drain_rq(), and ib_drain_qp(). For the drain logic to work, the caller must: ensure there is room in the CQ(s) and QP for the drain work request and completion; allocate the CQ using ib_alloc_cq(), with a CQ poll context other than IB_POLL_DIRECT; and ensure that no other contexts post WRs concurrently, otherwise the drain is not guaranteed. Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
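A sketch of a caller that meets those requirements (sizes and teardown order illustrative):

    /* sketch: reserve room for the drain WR/CQE and use a pollable context */
    struct ib_cq *cq = ib_alloc_cq(dev, NULL, qp_depth + 2, 0,
                                   IB_POLL_WORKQUEUE);  /* not IB_POLL_DIRECT */
    if (IS_ERR(cq))
            return PTR_ERR(cq);
    /* ... create the QP against cq, run I/O, stop posting new WRs ... */
    ib_drain_qp(qp);        /* blocks until prior completions are processed */
    ib_destroy_qp(qp);
    ib_free_cq(cq);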
-
- 28 Feb 2016, 3 commits
-
-
Submitted by Ross Zwisler
Previously calls to dax_writeback_mapping_range() for all DAX filesystems (ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range(). dax_writeback_mapping_range() needs a struct block_device, and it used to get that from inode->i_sb->s_bdev. This is correct for normal inodes mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time files. Instead, call dax_writeback_mapping_range() directly from the filesystem ->writepages function so that it can supply us with a valid block device. This also fixes DAX code to properly flush caches in response to sync(2). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
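A sketch of the new call site, modeled on a simple filesystem such as ext2, where the correct bdev happens to be inode->i_sb->s_bdev; the point of the change is that each filesystem now supplies it explicitly:

    /* sketch: ->writepages routes DAX mappings through dax_writeback */
    static int ext2_writepages(struct address_space *mapping,
                               struct writeback_control *wbc)
    {
            if (dax_mapping(mapping))
                    return dax_writeback_mapping_range(mapping,
                                    mapping->host->i_sb->s_bdev, wbc);
            return mpage_writepages(mapping, wbc, ext2_get_block);
    }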
-
Submitted by Ross Zwisler
dax_clear_blocks() needs a valid struct block_device and previously it was using inode->i_sb->s_bdev in all cases. This is correct for normal inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time devices. Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change its arguments to take a bdev and a sector instead of an inode and a block. This better reflects what the function does, and it allows the filesystem and raw block device code to pass in an appropriate struct block_device. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Daniel Cashman
Commit d07e2259 ("mm: mmap: add new /proc tunable for mmap_base ASLR") added the ability to choose from a range of values to use for entropy count in generating the random offset to the mmap_base address. The maximum value on this range was set to 32 bits for 64-bit x86 systems, but this value could be increased further, requiring more than the 32 bits of randomness provided by get_random_int(), as is already possible for arm64. Add a new function: get_random_long() which more naturally fits with the mmap usage of get_random_int() but operates exactly the same as get_random_int(). Also, fix the shifting constant in mmap_rnd() to be an unsigned long so that values greater than 31 bits generate an appropriate mask without overflow. This is especially important on x86, as its shift instruction uses a 5-bit mask for the shift operand, which meant that any value for mmap_rnd_bits over 31 acts as a no-op and effectively disables mmap_base randomization. Finally, replace calls to get_random_int() with get_random_long() where appropriate. This patch (of 2): Add get_random_long(). Signed-off-by: Daniel Cashman <dcashman@android.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Kralevich <nnk@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
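A sketch of the resulting arch_mmap_rnd() shape, assuming the x86 tunable name from the quoted commit; the unsigned-long constant is the fix for the 5-bit shift-mask truncation:

    /* sketch: wide entropy plus a mask safe for mmap_rnd_bits > 31 */
    static unsigned long arch_mmap_rnd(void)
    {
            unsigned long rnd;

            /* 1UL (not 1): the mask itself must be computed in 64 bits */
            rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);
            return rnd << PAGE_SHIFT;
    }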
-
- 26 Feb 2016, 1 commit
-
-
Submitted by Takashi Iwai
Currently the interrupt handler of the HD-audio driver assumes that no irq update is needed while processing the irq. In reality, however, it has been confirmed that the HW can issue an irq even during irq handling. Since we clear the irq status at the beginning, process the interrupt, then exit from the handler, an interrupt issued in the meantime is left untouched without being properly processed. This patch changes the interrupt handler code to loop over check-and-process: the handler retries as long as IRQ status bits are set and either a stream or CORB/RIRB was handled. To support the stream-handling check, snd_hdac_bus_handle_stream_irq() now returns a value indicating the handled stream index bits. Other than that, the change is confined to the irq handler itself. Reported-by: Libin Yang <libin.yang@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
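A sketch of the loop shape described above; the RIRB helper and the retry bound are hypothetical stand-ins, and snd_hdac_bus_handle_stream_irq() returning the handled-stream bits is the piece this commit adds:

    /* sketch: keep re-reading INTSTS until nothing more was handled */
    int repeat = 0;

    do {
            unsigned int status = azx_readl(chip, INTSTS);
            bool active = false;

            if (status == 0 || status == 0xffffffff)
                    break;
            if (snd_hdac_bus_handle_stream_irq(bus, status, stream_update))
                    active = true;          /* some stream irq was handled */
            if (handle_corb_rirb(chip, status))     /* hypothetical helper */
                    active = true;
            if (!active)
                    break;
    } while (++repeat < 10);        /* bound illustrative */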
-
- 25 Feb 2016, 2 commits
-
-
Submitted by Peter Zijlstra
perf_install_in_context() relies upon the context switch hooks to have scheduled in events when the IPI misses its target -- after all, if the task has moved from the CPU (or wasn't running at all), it will have to context switch to run elsewhere. This however doesn't appear to be happening. It is possible for the IPI to not happen (task wasn't running), only to later observe the task running with an inactive context. The only possible explanation is that the context switch hooks are not called. Therefore put in a sync_sched() after toggling the jump_label, to guarantee all CPUs will have them enabled before we install an event. A simple if (0->1) sync_sched() will not in fact work, because any further increment can race and complete before the sync_sched(). Therefore we must jump through some hoops. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
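A sketch of the "hoops", assuming counter/mutex names in the style of the perf core: the slow path serializes enable + sync under a mutex, and the fast path only succeeds once that has completed:

    /* sketch: enable the static branch exactly once, synchronized */
    if (!atomic_inc_not_zero(&perf_sched_count)) {
            mutex_lock(&perf_sched_mutex);
            if (!atomic_read(&perf_sched_count)) {
                    static_branch_enable(&perf_sched_events);
                    /* every CPU sees the hooks before any event installs */
                    synchronize_sched();
            }
            atomic_inc(&perf_sched_count);  /* 0->1 can no longer race the sync */
            mutex_unlock(&perf_sched_mutex);
    }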
-
Submitted by Peter Zijlstra
Alexander reported that when the 'original' context gets destroyed, no new clones happen. This can happen irrespective of the ctx switch optimization; any task can die, even the parent, and we want to continue monitoring the task hierarchy until we either close the event or no tasks are left in the hierarchy. perf_event_init_context() will attempt to pin the 'parent' context during clone(). At that point current is the parent, and since current cannot have exited while executing clone(), its context cannot have passed through perf_event_exit_task_context(). Therefore perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE. However, since inherit_event() does: if (parent_event->parent) parent_event = parent_event->parent; it looks at the 'original' event when it does: is_orphaned_event(). This can return true if the context that contains this event has passed through perf_event_exit_task_context(), and thus we'll fail to clone the perf context. Fix this by adding a new state: STATE_DEAD, which is set by perf_release() to indicate that the filedesc (or kernel reference) is dead and there are no observers for our data left. Only for STATE_DEAD will is_orphaned_event() be true and inhibit cloning. STATE_EXIT is otherwise preserved such that is_event_hup() remains functional and will report when the observed task hierarchy becomes empty. Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Fixes: c6e5b732 ("perf: Synchronously clean up child events") Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 24 Feb 2016, 1 commit
-
-
Submitted by Dan Williams
The original format of these commands from the "NVDIMM DSM Interface Example" [1] is superseded by the ACPI 6.1 definition of the "NVDIMM Root Device _DSMs" [2]. [1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf [2]: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf "9.20.7 NVDIMM Root Device _DSMs" Changes include: 1/ New 'restart' fields in ars_status; unfortunately these are implemented in the middle of the existing definition, so this change is not backwards compatible. The expectation is that shipping platforms will only ever support the ACPI 6.1 definition. 2/ New status values for ars_start ('busy') and ars_status ('overflow'). Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
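A sketch of the ACPI 6.1 ars_status layout the text refers to; field names follow the spec wording, and the mid-struct placement of the restart fields is what breaks backward compatibility:

    /* sketch of the updated command payload (offsets abridged) */
    struct nd_cmd_ars_status {
            __u32 status;
            __u32 out_length;
            __u64 address;
            __u64 length;
            __u64 restart_address;  /* new in ACPI 6.1, lands mid-struct */
            __u64 restart_length;   /* new in ACPI 6.1 */
            __u16 type;
            __u16 flags;
            __u32 num_records;
            /* variable-length error records follow */
    } __attribute__((packed));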
-
- 22 Feb 2016, 2 commits
-
-
Submitted by Karicheri, Muralidharan
Rename the pad to sw_data, as per the description of this field in the hardware spec (refer to sprugr9 from www.ti.com; the latest version of the document is at http://www.ti.com/lit/ug/sprugr9h/sprugr9h.pdf, and section 3.1 Host Packet Descriptor describes this field). Define and use a constant for the size of the sw_data field, similar to the other fields in the struct for desc, and document the sw_data field in the header. As sw_data is not touched by hw, its type can be changed to u32. Rename the helpers to match the updated dma desc field sw_data. Cc: Wingman Kwok <w-kwok2@ti.com> Cc: Mugunthan V N <mugunthanvnm@ti.com> CC: Arnd Bergmann <arnd@arndb.de> CC: Grygorii Strashko <grygorii.strashko@ti.com> CC: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
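A sketch of the renamed field; the constant name and the surrounding members are assumptions made for illustration, not taken verbatim from the patch:

    #define KNAV_DMA_NUM_SW_DATA_WORDS 4    /* name assumed for illustration */

    struct knav_dma_desc {
            /* ... hardware-owned descriptor words precede ... */
            u32 epib[4];
            u32 psdata[16];
            /* software scratch area: never read or written by the hw */
            u32 sw_data[KNAV_DMA_NUM_SW_DATA_WORDS];
    };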
-
Submitted by Ivaylo Dimitrov
Commit 703df6c0 ("power: bq27xxx_battery: Reorganize I2C into a module") removed the device name numbering from bq27xxx_battery_i2c_probe. Fix that by restoring the code. Fixes: 703df6c0 Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
-
- 20 Feb 2016, 2 commits
-
-
Submitted by Dan Williams
Use the output length specified in the command to size the receive buffer rather than the arbitrary 4K limit. This bug was hiding the fact that the ndctl implementation of ndctl_bus_cmd_new_ars_status() was not specifying an output buffer size. Cc: <stable@vger.kernel.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Submitted by Nikolay Aleksandrov
When I used netdev_for_each_lower_dev in commit bad53162 ("vrf: remove slave queue and private slave struct") I thought that it acts like netdev_for_each_lower_private and can be used to remove the current device from the list while walking, but unfortunately it acts more like netdev_for_each_lower_private_rcu and doesn't allow it. The difference is where the "iter" points to, right now it points to the current element and that makes it impossible to remove it. Change the logic to be similar to netdev_for_each_lower_private and make it point to the "next" element so we can safely delete the current one. VRF is the only such user right now, there's no change for the read-only users. Here's what can happen now: [98423.249858] general protection fault: 0000 [#1] SMP [98423.250175] Modules linked in: vrf bridge(O) stp llc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ppdev aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd evdev serio_raw pcspkr virtio_balloon parport_pc parport i2c_piix4 i2c_core virtio_console acpi_cpufreq button 9pnet_virtio 9p 9pnet fscache ipv6 autofs4 ext4 crc16 mbcache jbd2 sg virtio_blk virtio_net sr_mod cdrom e1000 ata_generic ehci_pci uhci_hcd ehci_hcd usbcore usb_common virtio_pci ata_piix libata floppy virtio_ring virtio scsi_mod [last unloaded: bridge] [98423.255040] CPU: 1 PID: 14173 Comm: ip Tainted: G O 4.5.0-rc2+ #81 [98423.255386] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [98423.255777] task: ffff8800547f5540 ti: ffff88003428c000 task.ti: ffff88003428c000 [98423.256123] RIP: 0010:[<ffffffff81514f3e>] [<ffffffff81514f3e>] netdev_lower_get_next+0x1e/0x30 [98423.256534] RSP: 0018:ffff88003428f940 EFLAGS: 00010207 [98423.256766] RAX: 0002000100000004 RBX: ffff880054ff9000 RCX: 0000000000000000 [98423.257039] RDX: ffff88003428f8b8 RSI: ffff88003428f950 RDI: ffff880054ff90c0 [98423.257287] RBP: ffff88003428f940 R08: 0000000000000000 R09: 0000000000000000 [98423.257537] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88003428f9e0 [98423.257802] R13: ffff880054a5fd00 R14: ffff88003428f970 R15: 0000000000000001 [98423.258055] FS: 00007f3d76881700(0000) GS:ffff88005d000000(0000) knlGS:0000000000000000 [98423.258418] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [98423.258650] CR2: 00007ffe5951ffa8 CR3: 0000000052077000 CR4: 00000000000406e0 [98423.258902] Stack: [98423.259075] ffff88003428f960 ffffffffa0442636 0002000100000004 ffff880054ff9000 [98423.259647] ffff88003428f9b0 ffffffff81518205 ffff880054ff9000 ffff88003428f978 [98423.260208] ffff88003428f978 ffff88003428f9e0 ffff88003428f9e0 ffff880035b35f00 [98423.260739] Call Trace: [98423.260920] [<ffffffffa0442636>] vrf_dev_uninit+0x76/0xa0 [vrf] [98423.261156] [<ffffffff81518205>] rollback_registered_many+0x205/0x390 [98423.261401] [<ffffffff815183ec>] unregister_netdevice_many+0x1c/0x70 [98423.261641] [<ffffffff8153223c>] rtnl_delete_link+0x3c/0x50 [98423.271557] [<ffffffff815335bb>] rtnl_dellink+0xcb/0x1d0 [98423.271800] [<ffffffff811cd7da>] ? __inc_zone_state+0x4a/0x90 [98423.272049] [<ffffffff815337b4>] rtnetlink_rcv_msg+0x84/0x200 [98423.272279] [<ffffffff810cfe7d>] ? trace_hardirqs_on+0xd/0x10 [98423.272513] [<ffffffff8153370b>] ? rtnetlink_rcv+0x1b/0x40 [98423.272755] [<ffffffff81533730>] ? 
rtnetlink_rcv+0x40/0x40 [98423.272983] [<ffffffff8155d6e7>] netlink_rcv_skb+0x97/0xb0 [98423.273209] [<ffffffff8153371a>] rtnetlink_rcv+0x2a/0x40 [98423.273476] [<ffffffff8155ce8b>] netlink_unicast+0x11b/0x1a0 [98423.273710] [<ffffffff8155d2f1>] netlink_sendmsg+0x3e1/0x610 [98423.273947] [<ffffffff814fbc98>] sock_sendmsg+0x38/0x70 [98423.274175] [<ffffffff814fc253>] ___sys_sendmsg+0x2e3/0x2f0 [98423.274416] [<ffffffff810d841e>] ? do_raw_spin_unlock+0xbe/0x140 [98423.274658] [<ffffffff811e1bec>] ? handle_mm_fault+0x26c/0x2210 [98423.274894] [<ffffffff811e19cd>] ? handle_mm_fault+0x4d/0x2210 [98423.275130] [<ffffffff81269611>] ? __fget_light+0x91/0xb0 [98423.275365] [<ffffffff814fcd42>] __sys_sendmsg+0x42/0x80 [98423.275595] [<ffffffff814fcd92>] SyS_sendmsg+0x12/0x20 [98423.275827] [<ffffffff81611bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a [98423.276073] Code: c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 06 55 48 81 c7 c0 00 00 00 48 89 e5 48 8b 00 48 39 f8 74 09 48 89 06 <48> 8b 40 e8 5d c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 [98423.279639] RIP [<ffffffff81514f3e>] netdev_lower_get_next+0x1e/0x30 [98423.279920] RSP <ffff88003428f940> CC: David Ahern <dsa@cumulusnetworks.com> CC: David S. Miller <davem@davemloft.net> CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Vlad Yasevich <vyasevic@redhat.com> Fixes: bad53162 ("vrf: remove slave queue and private slave struct") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
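For reference, a sketch of the fixed iterator step (modeled on net/core/dev.c of this period): the iterator is advanced before the current entry is returned, so the caller may unlink that entry safely, while read-only walkers see no behavioral change:

    void *netdev_lower_get_next(struct net_device *dev, struct list_head **iter)
    {
            struct netdev_adjacent *lower;

            lower = list_entry(*iter, struct netdev_adjacent, list);
            if (&lower->list == &dev->adj_list.lower)
                    return NULL;            /* walked past the last element */
            *iter = lower->list.next;       /* advance before returning */
            return lower->dev;
    }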
-
- 19 Feb 2016, 4 commits
-
-
Submitted by Maarten Lankhorst
Because we record connector_mask using 1 << drm_connector_index now, the connector_mask should stay the same even when other connectors are removed. This was not the case with MST: there, removing a connector could change the index of all other connectors. This is fixed by waiting until the first get_connector_state to allocate connector_state, and forcing reallocation when the state is too small. As a side effect, connector arrays no longer have to be preallocated and can be allocated on first use, which means fewer allocations in the page-flip-only path. Changes since v1: - Whitespace. (Ville) - Call ida_remove when destroying the connector. (Ville) - u32 alloc -> int. (Ville) Fixes: 14de6c44 ("drm/atomic: Remove drm_atomic_connectors_for_crtc.") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lyude <cpaul@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Submitted by Jeff Layton
This reverts commit c510eff6 ("fsnotify: destroy marks with call_srcu instead of dedicated thread"). Eryu reported that he was seeing some OOM kills kick in when running a testcase that adds and removes inotify marks on a file in a tight loop. The above commit changed the code to use call_srcu to clean up the marks. While that does (in principle) work, the srcu callback job is limited to cleaning up entries in small batches and only once per jiffy. It's easily possible to overwhelm that machinery with too many call_srcu callbacks, and Eryu's reproducer did just that. There's also another potential problem with using call_srcu here. While you can obviously sleep while holding the srcu_read_lock, the callbacks run under local_bh_disable, so you can't sleep there. It's possible when putting the last reference to the fsnotify_mark that we'll end up putting a chain of references including the fsnotify_group, uid, and associated keys. While I don't see any obvious ways that could occur, it's probably still best to avoid using call_srcu here after all. This patch reverts the above patch. A later patch will take a different approach to eliminating the dedicated thread here. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reported-by: Eryu Guan <guaneryu@gmail.com> Tested-by: Eryu Guan <guaneryu@gmail.com> Cc: Jan Kara <jack@suse.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Eric Dumazet
Ilya reported the following lockdep splat: kernel: ========================= kernel: [ BUG: held lock freed! ] kernel: 4.5.0-rc1-ceph-00026-g5e0a311 #1 Not tainted kernel: ------------------------- kernel: swapper/5/0 is freeing memory ffff880035c9d200-ffff880035c9dbff, with a lock still held there! kernel: (&(&queue->rskq_lock)->rlock){+.-...}, at: [<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0 kernel: 4 locks held by swapper/5/0: kernel: #0: (rcu_read_lock){......}, at: [<ffffffff8169ef6b>] netif_receive_skb_internal+0x4b/0x1f0 kernel: #1: (rcu_read_lock){......}, at: [<ffffffff816e977f>] ip_local_deliver_finish+0x3f/0x380 kernel: #2: (slock-AF_INET){+.-...}, at: [<ffffffff81685ffb>] sk_clone_lock+0x19b/0x440 kernel: #3: (&(&queue->rskq_lock)->rlock){+.-...}, at: [<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0 To properly fix this issue, inet_csk_reqsk_queue_add() needs to report to its callers whether the child has been queued into the accept queue. We also need to make sure the listener is still there before calling sk->sk_data_ready(), by holding a reference on it, since the reference carried by the child can disappear as soon as the child is put on the accept queue. Reported-by: Ilya Dryomov <idryomov@gmail.com> Fixes: ebb516af ("tcp/dccp: fix race at listener dismantle phase") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Xin Long
Since the gc of ipv4 routes was removed, a cached route has no chance of being removed: even after it has expired it can still be used, because no code checks its expiry. Fix this issue by checking the route cache entry and removing it when we get a route. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 Feb 2016, 3 commits
-
-
Submitted by Bjorn Helgaas
Revert 811a4e6f ("PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed"). This is part of reverting 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") to fix regressions it introduced. Link: https://bugzilla.kernel.org/show_bug.cgi?id=111211 Fixes: 991de2e5 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> CC: Jiang Liu <jiang.liu@linux.intel.com>
-
Submitted by Jessica Yu
Remove the ftrace module notifier in favor of directly calling ftrace_module_enable() and ftrace_release_mod() in the module loader. Hard-coding the function calls directly in the module loader removes dependence on the module notifier call chain and provides better visibility and control over what gets called when, which is important to kernel utilities such as livepatch. This fixes a notifier ordering issue in which the ftrace module notifier (and hence ftrace_module_enable()) for coming modules was being called after klp_module_notify(), which caused livepatch modules to initialize incorrectly. This patch removes dependence on the module notifier call chain in favor of hard coding the corresponding function calls in the module loader. This ensures that ftrace and livepatch code get called in the correct order on patch module load and unload. Fixes: 5156dca3 ("ftrace: Fix the race between ftrace and insmod") Signed-off-by: Jessica Yu <jeyu@redhat.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Petr Mladek <pmladek@suse.cz> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
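A sketch of the direct calls that replace the notifier; the exact placement within kernel/module.c is approximate:

    /* sketch: module load path, while the module is COMING */
    ftrace_module_enable(mod);   /* was: ftrace's MODULE_STATE_COMING notifier */

    /* sketch: module unload and load-error paths */
    ftrace_release_mod(mod);     /* was: ftrace's MODULE_STATE_GOING notifier */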
-
Submitted by Kinglong Mee
unreferenced object 0xffffc90000abf000 (size 16900): comm "fsync02", pid 15765, jiffies 4297431627 (age 423.772s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 a0 c2 19 00 88 ff ff ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8174d54e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811b9b91>] __vmalloc_node_range+0x231/0x280 [<ffffffff811b9c2a>] __vmalloc+0x4a/0x50 [<ffffffffa02c9ec1>] ext_tree_prepare_commit+0x231/0x2e0 [blocklayoutdriver] [<ffffffffa02c700e>] bl_prepare_layoutcommit+0xe/0x10 [blocklayoutdriver] [<ffffffffa0596a6c>] pnfs_layoutcommit_inode+0x29c/0x330 [nfsv4] [<ffffffffa0596b13>] pnfs_generic_sync+0x13/0x20 [nfsv4] [<ffffffffa0585188>] nfs4_file_fsync+0x58/0x150 [nfsv4] [<ffffffff81228e5b>] vfs_fsync_range+0x4b/0xb0 [<ffffffff81228f1d>] do_fsync+0x3d/0x70 [<ffffffff812291d0>] SyS_fsync+0x10/0x20 [<ffffffff81757def>] entry_SYSCALL_64_fastpath+0x12/0x76 [<ffffffffffffffff>] 0xffffffffffffffff v2, add missing include header Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 17 Feb 2016, 2 commits
-
-
Submitted by Huy Nguyen
Problem description: the current code sets the UAR page size equal to the system page size. The ConnectX-3 and ConnectX-3 Pro HWs require a minimum of 128 UAR pages, and the mlx4 kernel drivers are not loaded if there are fewer than 128 UAR pages. Solution: always set the UAR page size to 4KB. This allows more UAR pages if the OS has a PAGE_SIZE larger than 4KB. For example, the PowerPC kernel uses a 64KB system page size; with a 4MB uar region there are 4MB/2/64KB = 32 uars (half for uar, half for blueflame), which does not meet the minimum requirement of 128 UAR pages. With a 4KB UAR page there are 4MB/2/4KB = 512 uars, which does. Note that only code in mlx4_core that deals with the firmware knows that the uar page size is 4KB; code that deals with the usr page in cq and qp contexts (mlx4_ib, mlx4_en and part of mlx4_core) still assumes that the uar page size equals the system page size. With this implementation, on a kernel with a 64KB system page size there are 16 uars per system page but only one uar is used; the other 15 are ignored because of the above assumption. Regarding SR-IOV, mlx4_core in the hypervisor will set the uar page size to 4KB, and mlx4_core in the virtual OS will obtain the uar page size from the firmware. Regarding backward compatibility in SR-IOV: if the hypervisor has this new code, the virtual OS must be updated; if the hypervisor has the old code and the virtual OS has this new code, the new code is backward compatible with the old code. If the uar size is big enough, the new code in the VF continues to work with a 64KB uar page size (on a PowerPC kernel); if the uar size does not meet the 128-uar requirement, the new code does not load in the VF and prints the same error message as the old code did in the hypervisor. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Matan Barak
mlx5_ifc.h is a header file representing the API and ABI between the driver and the firmware and hardware. This file is used by both the mlx5_ib and mlx5_core drivers. Previously, this file used an incrementing counter to name reserved fields, for example:

    struct mlx5_ifc_odp_per_transport_service_cap_bits {
            u8 send[0x1];
            u8 receive[0x1];
            u8 write[0x1];
            u8 read[0x1];
            u8 reserved_0[0x1];
            u8 srq_receive[0x1];
            u8 reserved_1[0x1a];
    };

If one developer implements feature A through net-next using reserved_0, they replace it with featureA and rename reserved_1 to reserved_0. In the same kernel cycle, a second developer could implement feature B through the rdma tree using reserved_1, splitting it into featureB and a smaller reserved_1 field. This causes a conflict when the two trees are merged; the source of the conflict is that the first developer changed *all* the reserved fields. As Linus suggested, we change the layout of the structs to:

    struct mlx5_ifc_odp_per_transport_service_cap_bits {
            u8 send[0x1];
            u8 receive[0x1];
            u8 write[0x1];
            u8 read[0x1];
            u8 reserved_at_4[0x1];
            u8 srq_receive[0x1];
            u8 reserved_at_6[0x1a];
    };

This makes such conflicts much rarer and preserves the locality of changes. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-