- 19 8月, 2015 10 次提交
-
-
由 Tejun Heo 提交于
* Rename blkcg->pd[] to blkcg->cpd[] so that cpd is consistently used for blkcg_policy_data. * Make blkcg_policy->cpd_init_fn() take blkcg_policy_data instead of blkcg. This makes it consistent with blkg_policy_data methods and to-be-added cpd alloc/free methods. * blkcg_policy_data->blkcg and cpd_to_blkcg() added so that cpd_init_fn() can determine the associated blkcg from blkcg_policy_data. v2: blkcg_policy_data->blkcg initializations were missing. Added. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
The newly added ->pd_alloc_fn() and ->pd_free_fn() deal with pd (blkg_policy_data) while the older ones use blkg (blkcg_gq). As using blkg doesn't make sense for ->pd_alloc_fn() and after allocation pd can always be mapped to blkg and given that these are policy-specific methods, it makes sense to converge on pd. This patch makes all methods deal with pd instead of blkg. Most conversions are trivial. In blk-cgroup.c, a couple method invocation sites now test whether pd exists instead of policy state for consistency. This shouldn't cause any behavioral differences. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
With the recent addition of alloc and free methods, things became messier. This patch reorganizes them according to the followings. * ->pd_alloc_fn() Responsible for allocation and static initializations - the ones which can be done independent of where the pd might be attached. * ->pd_init_fn() Initializations which require the knowledge of where the pd is attached. * ->pd_free_fn() The counter part of pd_alloc_fn(). Static de-init and freeing. This leaves ->pd_exit_fn() without any users. Removed. While at it, collapse an one liner function throtl_pd_exit(), which has only one user, into its user. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
A blkg (blkcg_gq) represents the relationship between a cgroup and request_queue. Each active policy has a pd (blkg_policy_data) on each blkg. The pd's were allocated by blkcg core and each policy could request to allocate extra space at the end by setting blkcg_policy->pd_size larger than the size of pd. This is a bit unusual but was done this way mostly to simplify error handling and all the existing use cases could be handled this way; however, this is becoming too restrictive now that percpu memory can be allocated without blocking. This introduces two new mandatory blkcg_policy methods - pd_alloc_fn() and pd_free_fn() - which are used to allocate and release pd for a given policy. As pd allocation is now done from policy side, it can simply allocate a larger area which embeds pd at the beginning. This change makes ->pd_size pointless. Removed. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
When a policy gets activated, it needs to allocate and install its policy data on all existing blkg's (blkcg_gq's). Because blkg iteration is protected by a spinlock, it currently counts the total number of blkg's in the system, allocates the matching number of policy data on a list and installs them during a single iteration. This can be simplified by using speculative GFP_NOWAIT allocations while iterating and falling back to a preallocated policy data on failure. If the preallocated one has already been consumed, it releases the lock, preallocate with GFP_KERNEL and then restarts the iteration. This can be a bit more expensive than before but policy activation is a very cold path and shouldn't matter. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
Since ec13b1d6 ("blkcg: always create the blkcg_gq for the root blkcg"), a request_list always has its blkg associated. Drop unnecessary rl->blkg NULL test from blk_put_rl(). Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
The following tracepoints are updated to report the cgroup used during cgroup writeback. * writeback_write_inode[_start] * writeback_queue * writeback_exec * writeback_start * writeback_written * writeback_wait * writeback_nowork * writeback_wake_background * wbc_writepage * writeback_queue_io * bdi_dirty_ratelimit * balance_dirty_pages * writeback_sb_inodes_requeue * writeback_single_inode[_start] Note that writeback_bdi_register is separated out from writeback_class as reporting cgroup doesn't make sense to it. Tracepoints which take bdi are updated to take bdi_writeback instead. Signed-off-by: NTejun Heo <tj@kernel.org> Suggested-by: NJan Kara <jack@suse.cz> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
Add a function to determine the path length of a kernfs node. This for now will be used by writeback tracepoint updates. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
wb's (bdi_writeback's) are currently keyed by memcg ID; however, in an earlier implementation, wb's were keyed by blkcg ID. bdi_for_each_wb() walks bdi->cgwb_tree in the ascending ID order and allows iterations to start from an arbitrary ID which is used to interrupt and resume iterations. Unfortunately, while changing wb to be keyed by memcg ID instead of blkcg, bdi_for_each_wb() was missed and is still assuming that wb's are keyed by blkcg ID. This doesn't affect iterations which don't get interrupted but bdi_split_work_to_wbs() makes use of iteration resuming on allocation failures and thus may incorrectly skip or repeat wb's. Fix it by changing bdi_for_each_wb() to take memcg IDs instead of blkcg IDs and updating bdi_split_work_to_wbs() accordingly. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
This allows cgroup subsystems to use a different name on the unified hierarchy. cgroup_subsys->name is used on the unified hierarchy, ->legacy_name elsewhere. If ->legacy_name is not explicitly set, it's automatically set to ->name and the userland visible behavior remains unchanged. v2: Make parse_cgroupfs_options() only consider ->legacy_name as mount options are used only on legacy hierarchies. Suggested by Li Zefan. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: cgroups@vger.kernel.org
-
- 12 8月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
It's pretty unusual to have an int as a private data field and it makes it impossible to carray a pointer value through it. Let's make it an unsigned long. AFAICS, this shouldn't break anything. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org>
-
- 05 8月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
Traditionally, each cgroup controller implemented whatever interface it wanted leading to interfaces which are widely inconsistent. Examining the requirements of the controllers readily yield that there are only a few control schemes shared among all. Two major controllers already had to implement new interface for the unified hierarchy due to significant structural changes. Let's take the chance to establish common conventions throughout all controllers. This patch defines CGROUP_WEIGHT_MIN/DFL/MAX to be used on all weight based control knobs and documents the conventions that controllers should follow on the unified hierarchy. Except for io.weight knob, all existing unified hierarchy knobs are already compliant. A follow-up patch will update io.weight. v2: Added descriptions of min, low and high knobs. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 25 7月, 2015 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
Commit 4104d326 ("ftrace: Remove global function list and call function directly") simplified the ftrace code by removing the global_ops list with a new design. But this cleanup also broke the filtering of PIDs that are added to the set_ftrace_pid file. Add back the proper hooks to have pid filtering working once again. Cc: stable@vger.kernel.org # 3.16+ Reported-by: NMatt Fleming <matt@console-pimps.org> Reported-by: NRichard Weinberger <richard.weinberger@gmail.com> Tested-by: NMatt Fleming <matt@console-pimps.org> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 24 7月, 2015 1 次提交
-
-
由 Dong Aisheng 提交于
After commit 8d86e4fc ("mmc: sdhci-esdhc-imx: Call mmc_of_parse()"), it's not used anymore. Signed-off-by: NDong Aisheng <aisheng.dong@freescale.com> Reviewed-by: NJohan Derycke <johan.derycke@barco.com> Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
-
- 23 7月, 2015 1 次提交
-
-
由 Leo Liu 提交于
Signed-off-by: NLeo Liu <leo.liu@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
- 21 7月, 2015 1 次提交
-
-
由 Chris Wilson 提交于
Since the hardware sometimes mysteriously totally flummoxes the 64bit read of a 64bit register when read using a single instruction, split the read into two instructions. Since the read here is of automatically incrementing timestamp counters, we also have to be very careful in order to make sure that it does not increment between the two instructions. However, since userspace tried to workaround this issue and so enshrined this ABI for a broken hardware read and in the process neglected that the read only fails in some environments, we have to introduce a new uABI flag for userspace to request the 2x32 bit accurate read of the timestamp. v2: Fix alignment check and include details of the workaround for userspace. Reported-by: NKarol Herbst <freedesktop@karolherbst.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91317 Testcase: igt/gem_reg_read Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: stable@vger.kernel.org Tested-by: NMichał Winiarski <michal.winiarski@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 19 7月, 2015 1 次提交
-
-
由 Srinivas Pandruvada 提交于
By default all the sensors are runtime suspended state (lowest power state). During Linux suspend process, all the run time suspended devices are resumed and then suspended. This caused all sensors to power up and introduced delay in suspend time, when we introduced runtime PM for HID sensors. The opposite process happens during resume process. To fix this, we do powerup process of the sensors only when the request is issued from user (raw or tiggerred). In this way when runtime, resume calls for powerup it will simply return as this will not match user requested state. Note this is a regression fix as the increase in suspend / resume times can be substantial (report of 8 seconds on Len's laptop!) Signed-off-by: NSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: NLen Brown <len.brown@intel.com> Cc: <Stable@vger.kernel.org> Signed-off-by: NJonathan Cameron <jic23@kernel.org>
-
- 18 7月, 2015 7 次提交
-
-
由 Ingo Molnar 提交于
Don't burden architectures without dynamic task_struct sizing with the overhead of dynamic sizing. Also optimize the x86 code a bit by caching task_struct_size. Acked-and-Tested-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1437128892-9831-3-git-send-email-mingo@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Dave Hansen 提交于
The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1437128892-9831-2-git-send-email-mingo@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Joonsoo Kim 提交于
Currently, we set wrong gfp_mask to page_owner info in case of isolated freepage by compaction and split page. It causes incorrect mixed pageblock report that we can get from '/proc/pagetypeinfo'. This metric is really useful to measure fragmentation effect so should be accurate. This patch fixes it by setting correct information. Without this patch, after kernel build workload is finished, number of mixed pageblock is 112 among roughly 210 movable pageblocks. But, with this fix, output shows that mixed pageblock is just 57. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Laurent Dufour 提交于
Commit 2ae416b1 ("mm: new mm hook framework") introduced an empty header file (mm-arch-hooks.h) for every architecture, even those which doesn't need to define mm hooks. As suggested by Geert Uytterhoeven, this could be cleaned through the use of a generic header file included via each per architecture asm/include/Kbuild file. The PowerPC architecture is not impacted here since this architecture has to defined the arch_remap MM hook. Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com> Suggested-by: NGeert Uytterhoeven <geert@linux-m68k.org> Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org> Acked-by: NVineet Gupta <vgupta@synopsys.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Viresh Kumar 提交于
Switch to my kernel.org alias instead of a badly named gmail address, which I rarely use. Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nicolas Iooss 提交于
Using __printf attributes helps to detect several format string issues at compile time (even though -Wformat-security is currently disabled in Makefile). For example it can detect when formatting a pointer as a number, like the issue fixed in commit a3fa71c4 ("wl18xx: show rx_frames_per_rates as an array as it really is"), or when the arguments do not match the format string, c.f. for example commit 5ce1aca8 ("reiserfs: fix __RASSERT format string"). To prevent similar bugs in the future, add a __printf attribute to every function prototype which needs one in include/linux/ and lib/. These functions were mostly found by using gcc's -Wsuggest-attribute=format flag. Signed-off-by: NNicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Felipe Balbi <balbi@ti.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dominik Dingel 提交于
s390 has a constant hugepage size, by setting HPAGE_SHIFT we also change e.g. the pageblock_order, which should be independent in respect to hugepage support. With this patch every architecture is free to define how to check for hugepage support. Signed-off-by: NDominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 7月, 2015 1 次提交
-
-
由 Arik Nemtsov 提交于
The RTNL is required to check for IR-relaxation conditions that allow more channels to beacon. Export an RTNL locked version of reg_can_beacon and use it where possible in AP/STA interface type flows, where IR-relaxation may be applicable. Fixes: 06f207fc ("cfg80211: change GO_CONCURRENT to IR_CONCURRENT for STA") Signed-off-by: NArik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
- 16 7月, 2015 2 次提交
-
-
由 Eric Dumazet 提交于
ip6_datagram_connect() is doing a lot of socket changes without socket being locked. This looks wrong, at least for udp_lib_rehash() which could corrupt lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Geert Uytterhoeven 提交于
Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
-
- 15 7月, 2015 5 次提交
-
-
由 David Milburn 提交于
Since no longer limiting max_sectors to BLK_DEF_MAX_SECTORS (commit 34b48db6), data corruption may occur on ST380013AS drive configured on 82801JI (ICH10 Family) SATA controller. This patch will allow the driver to limit max_sectors as before # cat /sys/block/sdb/queue/max_sectors_kb 512 I was able to double the max_sectors_kb value up to 16384 on linux-4.2.0-rc2 before seeing corruption, but seems safer to use previous limit. Without this patch max_sectors_kb will be 32767. tj: Minor comment update. Reported-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NDavid Milburn <dmilburn@redhat.com> Signed-off-by: NTejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # v3.19 and later Fixes: 34b48db6 ("block: remove artifical max_hw_sectors cap")
-
由 Arne Fitzenreiter 提交于
Some devices lose data on TRIM whether queued or not. This patch adds a horkage to disable TRIM. tj: Collapsed unnecessary if() nesting. Signed-off-by: NArne Fitzenreiter <arne_f@ipfire.org> Signed-off-by: NTejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org
-
由 Luck, Tony 提交于
The memory error record structure includes as its first field a bitmask of which subsequent fields are valid. The allows new fields to be added to the structure while keeping compatibility with older software that parses these records. This mechanism was used between versions 2.2 and 2.3 to add four new fields, growing the size of the structure from 73 bytes to 80. But Linux just added all the new fields so this test: if (gdata->error_data_length >= sizeof(*mem_err)) cper_print_mem(newpfx, mem_err); else goto err_section_too_small; now make Linux complain about old format records being too short. Add a definition for the old format of the structure and use that for the minimum size check. Pass the actual size to cper_print_mem() so it can sanity check the validation_bits field to ensure that if a BIOS using the old format sets bits as if it were new, we won't access fields beyond the end of the structure. Signed-off-by: NTony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Sagi Grimberg 提交于
We might return res which is not initialized. Also reduce code duplication by exporting srp_parse_tmo so srp_tmo_set can reuse it. Detected by Coverity. Signed-off-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NJenny Falkovich <jennyf@mellanox.com> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Hal Rosenstock 提交于
Persuant to Liran's comments on node_type on linux-rdma mailing list: In an effort to reform the RDMA core and ULPs to minimize use of node_type in struct ib_device, an additional bit is added to struct ib_device for is_switch (IB switch). This is needed to be initialized by any IB switch device driver. This is a NEW requirement on such device drivers which are all "out of tree". In addition, an ib_switch helper was added to ib_verbs.h based on the is_switch device bit rather than node_type (although those should be consistent). The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs) as well as (IPoIB and SRP) ULPs are updated where appropriate to use this new helper. In some cases, the helper is now used under the covers of using rdma_[start end]_port rather than the open coding previously used. Reviewed-by: NSean Hefty <sean.hefty@intel.com> Reviewed-By: NJason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Tested-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NHal Rosenstock <hal@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 13 7月, 2015 3 次提交
-
-
由 Jeff Layton 提交于
They just call file_inode and then the corresponding *_inode_file_wait function. Just make them static inlines instead. Signed-off-by: NJeff Layton <jeff.layton@primarydata.com>
-
由 Jeff Layton 提交于
Allow callers to pass in an inode instead of a filp. Signed-off-by: NJeff Layton <jeff.layton@primarydata.com> Reviewed-by: N"J. Bruce Fields" <bfields@fieldses.org> Tested-by: N"J. Bruce Fields" <bfields@fieldses.org>
-
由 Oliver Hartkopp 提交于
Commit 514ac99c "can: fix multiple delivery of a single CAN frame for overlapping CAN filters" requires the skb->tstamp to be set to check for identical CAN skbs. Without timestamping to be required by user space applications this timestamp was not generated which lead to commit 36c01245 "can: fix loss of CAN frames in raw_rcv" - which forces the timestamp to be set in all CAN related skbuffs by introducing several __net_timestamp() calls. This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb() to add __net_timestamp() after skbuff creation to prevent the frame loss fixed in mainline Linux. This patch removes the timestamp dependency and uses an atomic counter to create an unique identifier together with the skbuff pointer. Btw: the new skbcnt element introduced in struct can_skb_priv has to be initialized with zero in out-of-tree drivers which are not using alloc_can{,fd}_skb() too. Signed-off-by: NOliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
-
- 10 7月, 2015 5 次提交
-
-
由 Paolo Bonzini 提交于
If there are no assigned devices, the guest PAT are not providing any useful information and can be overridden to writeback; VMX always does this because it has the "IPAT" bit in its extended page table entries, but SVM does not have anything similar. Hook into VFIO and legacy device assignment so that they provide this information to KVM. Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Tested-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Enrico Mioso 提交于
NCM specs are not actually mandating a specific position in the frame for the NDP (Network Datagram Pointer). However, some Huawei devices will ignore our aggregates if it is not placed after the datagrams it points to. Add support for doing just this, in a per-device configurable way. While at it, update NCM subdrivers, disabling this functionality in all of them, except in huawei_cdc_ncm where it is enabled instead. We aren't making any distinction between different Huawei NCM devices, based on what the vendor driver does. Standard NCM devices are left unaffected: if they are compliant, they should be always usable, still stay on the safe side. This change has been tested and working with a Huawei E3131 device (which works regardless of NDP position), a Huawei E3531 (also working both ways) and an E3372 (which mandates NDP to be after indexed datagrams). V1->V2: - corrected wrong NDP acronym definition - fixed possible NULL pointer dereference - patch cleanup V2->V3: - Properly account for the NDP size when writing new packets to SKB Signed-off-by: NEnrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
e48453c3 ("block, cgroup: implement policy-specific per-blkcg data") updated per-blkcg policy data to be dynamically allocated. When a policy is registered, its policy data aren't created. Instead, when the policy is activated on a queue, the policy data are allocated if there are blkg's (blkcg_gq's) which are attached to a given blkcg. This is buggy. Consider the following scenario. 1. A blkcg is created. No blkg's attached yet. 2. The policy is registered. No policy data is allocated. 3. The policy is activated on a queue. As the above blkcg doesn't have any blkg's, it won't allocate the matching blkcg_policy_data. 4. An IO is issued from the blkcg and blkg is created and the blkcg still doesn't have the matching policy data allocated. With cfq-iosched, this leads to an oops. It also doesn't free policy data on policy unregistration assuming that freeing of all policy data on blkcg destruction should take care of it; however, this also is incorrect. 1. A blkcg has policy data. 2. The policy gets unregistered but the policy data remains. 3. Another policy gets registered on the same slot. 4. Later, the new policy tries to allocate policy data on the previous blkcg but the slot is already occupied and gets skipped. The policy ends up operating on the policy data of the previous policy. There's no reason to manage blkcg_policy_data lazily. The reason we do lazy allocation of blkg's is that the number of all possible blkg's is the product of cgroups and block devices which can reach a surprising level. blkcg_policy_data is contrained by the number of cgroups and shouldn't be a problem. This patch makes blkcg_policy_data to be allocated for all existing blkcg's on policy registration and freed on unregistration and removes blkcg_policy_data handling from policy [de]activation paths. This makes that blkcg_policy_data are created and removed with the policy they belong to and fixes the above described problems. Signed-off-by: NTejun Heo <tj@kernel.org> Fixes: e48453c3 ("block, cgroup: implement policy-specific per-blkcg data") Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Tejun Heo 提交于
Add all_blkcgs list goes through blkcg->all_blkcgs_node and is protected by blkcg_pol_mutex. This will be used to fix blkcg_policy_data allocation bug. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ilya Dryomov 提交于
Grab a reference on a network namespace of the 'rbd map' (in case of rbd) or 'mount' (in case of ceph) process and use that to open sockets instead of always using init_net and bailing if network namespace is anything but init_net. Be careful to not share struct ceph_client instances between different namespaces and don't add any code in the !CONFIG_NET_NS case. This is based on a patch from Hong Zhiguo <zhiguohong@tencent.com>. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NSage Weil <sage@redhat.com>
-