- 13 3月, 2014 1 次提交
-
-
由 Daniel Vetter 提交于
Thierry created such nice kerneldocs, it's a shame we've left them lingering! For the fun of it also add a bit of kerneldoc to the header so that we can also include that. Just in case someone adds kerneldoc in there. Cc: Thierry Reding <thierry.reding@gmail.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 03 3月, 2014 2 次提交
-
-
由 Marek Olšák 提交于
The statistics are: - VRAM usage in bytes - GTT usage in bytes - number of bytes moved by TTM The last one is actually a counter, so you need to sample it before and after command submission and take the difference. This is useful for finding performance bottlenecks. Userspace queries are also added. v2: use atomic64_t Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
由 Marek Olšák 提交于
When passing buffers between processes, the receiving process needs to know the original buffer domain, so that it doesn't accidentally move the buffer. v2: reserve the buffer Signed-off-by: NMarek Olšák <marek.olsak@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
- 27 2月, 2014 4 次提交
-
-
由 Thierry Reding 提交于
Implements an I2C-over-AUX I2C adapter on top of the generic drm_dp_aux infrastructure. It extracts the retry logic from existing drivers, which should help in porting those drivers to this new helper. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NThierry Reding <treding@nvidia.com> --- Changes in v5: - move comments partially to to header file - keep MOT set between I2C messages - return -EPROTO on short reads Changes in v4: - fix typo "bitrate" -> "bit rate" Changes in v3: - add back DRM_DEBUG_KMS and DRM_ERROR messages - embed i2c_adapter within struct drm_dp_aux - fix typo in comment
-
由 Thierry Reding 提交于
Add a helper to probe a DP link (read out the supported DPCD revision, maximum rate, link count and capabilities) as well as power up the DP link and configure it accordingly. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NThierry Reding <treding@nvidia.com> --- Changes in v5: - export helpers Changes in v4: - fix a couple of typos in comments as pointed out by Alex Deucher Changes in v3: - split into drm_dp_link_power_up() and drm_dp_link_configure() - do not change sink state for DPCD versions earlier than 1.1 - sleep for 1-2 ms after setting local sink to D0 state - read and write consecutive registers where possible - read DPCD revision when link is probed - remove duplicate kerneldoc
-
由 Thierry Reding 提交于
The function reads the link status (6 bytes starting at offset 0x202) from the DPCD so that it can be conveniently passed to other DPCD helpers. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NThierry Reding <treding@nvidia.com>
-
由 Thierry Reding 提交于
This is a superset of the current i2c_dp_aux bus functionality and can be used to transfer native AUX in addition to I2C-over-AUX messages. Helpers are provided to read and write the DPCD, either blockwise or byte-wise. Many of the existing helpers for DisplayPort take a copy of a portion of the DPCD and operate on that, without a way to write data back to the DPCD (e.g. for configuration of the link). Subsequent patches will build upon this infrastructure to provide common functionality in a generic way. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NThierry Reding <treding@nvidia.com> --- Changes in v5: - move comments partially to struct drm_dp_aux_msg in header file - return -EPROTO on short reads in DPCD helpers Changes in v4: - fix a typo in a comment Changes in v3: - reorder drm_dp_dpcd_writeb() arguments to be more intuitive - return number of bytes transferred in drm_dp_dpcd_write() - factor out drm_dp_dpcd_access() - describe error codes
-
- 22 2月, 2014 1 次提交
-
-
由 Peter Zijlstra 提交于
Because of a recent syscall design debate; its deemed appropriate for each syscall to have a flags argument for future extension; without immediately requiring new syscalls. Cc: juri.lelli@gmail.com Cc: Ingo Molnar <mingo@redhat.com> Suggested-by: NMichael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140214161929.GL27965@twins.programming.kicks-ass.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 19 2月, 2014 4 次提交
-
-
由 Lee Jones 提交于
If we compile the TPS65217 for a 64bit architecture we receive the following warnings: drivers/mfd/tps65217.c: In function ‘tps65217_probe’: drivers/mfd/tps65217.c:173:13: warning: cast from pointer to integer of different size chip_id = (unsigned int)match->data; ^ Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Lee Jones 提交于
If we compile the MAX8998 for a 64bit architecture we receive the following warnings: drivers/mfd/max8998.c: In function ‘max8998_i2c_get_driver_data’: drivers/mfd/max8998.c:178:10: warning: cast from pointer to integer of different size return (int)match->data; ^ Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Lee Jones 提交于
If we compile the MAX8997 for a 64bit architecture we receive the following warnings: drivers/mfd/max8997.c: In function ‘max8997_i2c_get_driver_data’: drivers/mfd/max8997.c:173:10: warning: cast from pointer to integer of different size return (int)match->data; ^ Signed-off-by: NLee Jones <lee.jones@linaro.org>
-
由 Alex Deucher 提交于
Some hardware may not support standard 64x64 cursors. Add a drm cap to query the cursor size from the kernel. Some examples include radeon CIK parts (128x128 cursors) and armada (32x64 or 64x32). This allows things like device specific ddxes to remove asics specific logic and also allows xf86-video-modesetting to work properly with hw cursors on this hardware. Default to 64 if the driver doesn't specify a size. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NRob Clark <robdclark@gmail.com>
-
- 18 2月, 2014 4 次提交
-
-
由 Christian König 提交于
Also make the result available to userspace. Signed-off-by: NChristian König <christian.koenig@amd.com>
-
由 Christian König 提交于
Only VCE 2.0 support so far. v2: squashing multiple patches into this one v3: add IRQ support for CIK, major cleanups, basic code documentation v4: remove HAINAN from chipset list Signed-off-by: NChristian König <christian.koenig@amd.com>
-
由 Alexandre Courbot 提交于
Declare 'struct device' explicitly in ttm_page_alloc.h as this file does not include any file declaring it. This removes the following warning: warning: 'struct device' declared inside parameter list Signed-off-by: NAlexandre Courbot <acourbot@nvidia.com> Reviewed-by: NThierry Reding <treding@nvidia.com>
-
由 Yan, Zheng 提交于
For the setxattr request, introduce a new flag CEPH_XATTR_REMOVE to distinguish null value case from the zero-length value case. Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
-
- 17 2月, 2014 4 次提交
-
-
由 Daniel Borkmann 提交于
In order to allow users to invoke netdev_cap_txqueue, it needs to be moved into netdevice.h header file. While at it, also add kernel doc header to document the API. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Add a new argument for ndo_select_queue() callback that passes a fallback handler. This gets invoked through netdev_pick_tx(); fallback handler is currently __netdev_pick_tx() as most drivers invoke this function within their customized implementation in case for skbs that don't need any special handling. This fallback handler can then be replaced on other call-sites with different queue selection methods (e.g. in packet sockets, pktgen etc). This also has the nice side-effect that __netdev_pick_tx() is then only invoked from netdev_pick_tx() and export of that function to modules can be undone. Suggested-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matija Glavinic Pecotic 提交于
Implementation of (a)rwnd calculation might lead to severe performance issues and associations completely stalling. These problems are described and solution is proposed which improves lksctp's robustness in congestion state. 1) Sudden drop of a_rwnd and incomplete window recovery afterwards Data accounted in sctp_assoc_rwnd_decrease takes only payload size (sctp data), but size of sk_buff, which is blamed against receiver buffer, is not accounted in rwnd. Theoretically, this should not be the problem as actual size of buffer is double the amount requested on the socket (SO_RECVBUF). Problem here is that this will have bad scaling for data which is less then sizeof sk_buff. E.g. in 4G (LTE) networks, link interfacing radio side will have a large portion of traffic of this size (less then 100B). An example of sudden drop and incomplete window recovery is given below. Node B exhibits problematic behavior. Node A initiates association and B is configured to advertise rwnd of 10000. A sends messages of size 43B (size of typical sctp message in 4G (LTE) network). On B data is left in buffer by not reading socket in userspace. Lets examine when we will hit pressure state and declare rwnd to be 0 for scenario with above stated parameters (rwnd == 10000, chunk size == 43, each chunk is sent in separate sctp packet) Logic is implemented in sctp_assoc_rwnd_decrease: socket_buffer (see below) is maximum size which can be held in socket buffer (sk_rcvbuf). current_alloced is amount of data currently allocated (rx_count) A simple expression is given for which it will be examined after how many packets for above stated parameters we enter pressure state: We start by condition which has to be met in order to enter pressure state: socket_buffer < currently_alloced; currently_alloced is represented as size of sctp packets received so far and not yet delivered to userspace. x is the number of chunks/packets (since there is no bundling, and each chunk is delivered in separate packet, we can observe each chunk also as sctp packet, and what is important here, having its own sk_buff): socket_buffer < x*each_sctp_packet; each_sctp_packet is sctp chunk size + sizeof(struct sk_buff). socket_buffer is twice the amount of initially requested size of socket buffer, which is in case of sctp, twice the a_rwnd requested: 2*rwnd < x*(payload+sizeof(struc sk_buff)); sizeof(struct sk_buff) is 190 (3.13.0-rc4+). Above is stated that rwnd is 10000 and each payload size is 43 20000 < x(43+190); x > 20000/233; x ~> 84; After ~84 messages, pressure state is entered and 0 rwnd is advertised while received 84*43B ~= 3612B sctp data. This is why external observer notices sudden drop from 6474 to 0, as it will be now shown in example: IP A.34340 > B.12345: sctp (1) [INIT] [init tag: 1875509148] [rwnd: 81920] [OS: 10] [MIS: 65535] [init TSN: 1096057017] IP B.12345 > A.34340: sctp (1) [INIT ACK] [init tag: 3198966556] [rwnd: 10000] [OS: 10] [MIS: 10] [init TSN: 902132839] IP A.34340 > B.12345: sctp (1) [COOKIE ECHO] IP B.12345 > A.34340: sctp (1) [COOKIE ACK] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057017] [SID: 0] [SSEQ 0] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057017] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057018] [SID: 0] [SSEQ 1] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057018] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057019] [SID: 0] [SSEQ 2] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057019] [a_rwnd 9914] [#gap acks 0] [#dup tsns 0] <...> IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057098] [SID: 0] [SSEQ 81] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057098] [a_rwnd 6517] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057099] [SID: 0] [SSEQ 82] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057099] [a_rwnd 6474] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057100] [SID: 0] [SSEQ 83] [PPID 0x18] --> Sudden drop IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] At this point, rwnd_press stores current rwnd value so it can be later restored in sctp_assoc_rwnd_increase. This however doesn't happen as condition to start slowly increasing rwnd until rwnd_press is returned to rwnd is never met. This condition is not met since rwnd, after it hit 0, must first reach rwnd_press by adding amount which is read from userspace. Let us observe values in above example. Initial a_rwnd is 10000, pressure was hit when rwnd was ~6500 and the amount of actual sctp data currently waiting to be delivered to userspace is ~3500. When userspace starts to read, sctp_assoc_rwnd_increase will be blamed only for sctp data, which is ~3500. Condition is never met, and when userspace reads all data, rwnd stays on 3569. IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 1505] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 3010] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057101] [SID: 0] [SSEQ 84] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057101] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0] --> At this point userspace read everything, rwnd recovered only to 3569 IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057102] [SID: 0] [SSEQ 85] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057102] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0] Reproduction is straight forward, it is enough for sender to send packets of size less then sizeof(struct sk_buff) and receiver keeping them in its buffers. 2) Minute size window for associations sharing the same socket buffer In case multiple associations share the same socket, and same socket buffer (sctp.rcvbuf_policy == 0), different scenarios exist in which congestion on one of the associations can permanently drop rwnd of other association(s). Situation will be typically observed as one association suddenly having rwnd dropped to size of last packet received and never recovering beyond that point. Different scenarios will lead to it, but all have in common that one of the associations (let it be association from 1)) nearly depleted socket buffer, and the other association blames socket buffer just for the amount enough to start the pressure. This association will enter pressure state, set rwnd_press and announce 0 rwnd. When data is read by userspace, similar situation as in 1) will occur, rwnd will increase just for the size read by userspace but rwnd_press will be high enough so that association doesn't have enough credit to reach rwnd_press and restore to previous state. This case is special case of 1), being worse as there is, in the worst case, only one packet in buffer for which size rwnd will be increased. Consequence is association which has very low maximum rwnd ('minute size', in our case down to 43B - size of packet which caused pressure) and as such unusable. Scenario happened in the field and labs frequently after congestion state (link breaks, different probabilities of packet drop, packet reordering) and with scenario 1) preceding. Here is given a deterministic scenario for reproduction: >From node A establish two associations on the same socket, with rcvbuf_policy being set to share one common buffer (sctp.rcvbuf_policy == 0). On association 1 repeat scenario from 1), that is, bring it down to 0 and restore up. Observe scenario 1). Use small payload size (here we use 43). Once rwnd is 'recovered', bring it down close to 0, as in just one more packet would close it. This has as a consequence that association number 2 is able to receive (at least) one more packet which will bring it in pressure state. E.g. if association 2 had rwnd of 10000, packet received was 43, and we enter at this point into pressure, rwnd_press will have 9957. Once payload is delivered to userspace, rwnd will increase for 43, but conditions to restore rwnd to original state, just as in 1), will never be satisfied. --> Association 1, between A.y and B.12345 IP A.55915 > B.12345: sctp (1) [INIT] [init tag: 836880897] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 4032536569] IP B.12345 > A.55915: sctp (1) [INIT ACK] [init tag: 2873310749] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3799315613] IP A.55915 > B.12345: sctp (1) [COOKIE ECHO] IP B.12345 > A.55915: sctp (1) [COOKIE ACK] --> Association 2, between A.z and B.12346 IP A.55915 > B.12346: sctp (1) [INIT] [init tag: 534798321] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 2099285173] IP B.12346 > A.55915: sctp (1) [INIT ACK] [init tag: 516668823] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3676403240] IP A.55915 > B.12346: sctp (1) [COOKIE ECHO] IP B.12346 > A.55915: sctp (1) [COOKIE ACK] --> Deplete socket buffer by sending messages of size 43B over association 1 IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315613] [SID: 0] [SSEQ 0] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315613] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] <...> IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315696] [a_rwnd 6388] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315697] [SID: 0] [SSEQ 84] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315697] [a_rwnd 6345] [#gap acks 0] [#dup tsns 0] --> Sudden drop on 1 IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315698] [SID: 0] [SSEQ 85] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315698] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Here userspace read, rwnd 'recovered' to 3698, now deplete again using association 1 so there is place in buffer for only one more packet IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315799] [SID: 0] [SSEQ 186] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315799] [a_rwnd 86] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315800] [SID: 0] [SSEQ 187] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 43] [#gap acks 0] [#dup tsns 0] --> Socket buffer is almost depleted, but there is space for one more packet, send them over association 2, size 43B IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403240] [SID: 0] [SSEQ 0] [PPID 0x18] IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403240] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Immediate drop IP A.60995 > B.12346: sctp (1) [SACK] [cum ack 387491510] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Read everything from the socket, both association recover up to maximum rwnd they are capable of reaching, note that association 1 recovered up to 3698, and association 2 recovered only to 43 IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 1548] [#gap acks 0] [#dup tsns 0] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 3053] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315801] [SID: 0] [SSEQ 188] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315801] [a_rwnd 3698] [#gap acks 0] [#dup tsns 0] IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403241] [SID: 0] [SSEQ 1] [PPID 0x18] IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403241] [a_rwnd 43] [#gap acks 0] [#dup tsns 0] A careful reader might wonder why it is necessary to reproduce 1) prior reproduction of 2). It is simply easier to observe when to send packet over association 2 which will push association into the pressure state. Proposed solution: Both problems share the same root cause, and that is improper scaling of socket buffer with rwnd. Solution in which sizeof(sk_buff) is taken into concern while calculating rwnd is not possible due to fact that there is no linear relationship between amount of data blamed in increase/decrease with IP packet in which payload arrived. Even in case such solution would be followed, complexity of the code would increase. Due to nature of current rwnd handling, slow increase (in sctp_assoc_rwnd_increase) of rwnd after pressure state is entered is rationale, but it gives false representation to the sender of current buffer space. Furthermore, it implements additional congestion control mechanism which is defined on implementation, and not on standard basis. Proposed solution simplifies whole algorithm having on mind definition from rfc: o Receiver Window (rwnd): This gives the sender an indication of the space available in the receiver's inbound buffer. Core of the proposed solution is given with these lines: sctp_assoc_rwnd_update: if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0) asoc->rwnd = (asoc->base.sk->sk_rcvbuf - rx_count) >> 1; else asoc->rwnd = 0; We advertise to sender (half of) actual space we have. Half is in the braces depending whether you would like to observe size of socket buffer as SO_RECVBUF or twice the amount, i.e. size is the one visible from userspace, that is, from kernelspace. In this way sender is given with good approximation of our buffer space, regardless of the buffer policy - we always advertise what we have. Proposed solution fixes described problems and removes necessity for rwnd restoration algorithm. Finally, as proposed solution is simplification, some lines of code, along with some bytes in struct sctp_association are saved. Version 2 of the patch addressed comments from Vlad. Name of the function is set to be more descriptive, and two parts of code are changed, in one removing the superfluous call to sctp_assoc_rwnd_update since call would not result in update of rwnd, and the other being reordering of the code in a way that call to sctp_assoc_rwnd_update updates rwnd. Version 3 corrected change introduced in v2 in a way that existing function is not reordered/copied in line, but it is correctly called. Thanks Vlad for suggesting. Signed-off-by: NMatija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com> Reviewed-by: NAlexander Sverdlin <alexander.sverdlin@nsn.com> Acked-by: NVlad Yasevich <vyasevich@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Aneesh Kumar K.V 提交于
Archs like ppc64 doesn't do tlb flush in set_pte/pmd functions when using a hash table MMU for various reasons (the flush is handled as part of the PTE modification when necessary). ppc64 thus doesn't implement flush_tlb_range for hash based MMUs. Additionally ppc64 require the tlb flushing to be batched within ptl locks. The reason to do that is to ensure that the hash page table is in sync with linux page table. We track the hpte index in linux pte and if we clear them without flushing hash and drop the ptl lock, we can have another cpu update the pte and can end up with duplicate entry in the hash table, which is fatal. We also want to keep set_pte_at simpler by not requiring them to do hash flush for performance reason. We do that by assuming that set_pte_at() is never *ever* called on a PTE that is already valid. This was the case until the NUMA code went in which broke that assumption. Fix that by introducing a new pair of helpers to set _PAGE_NUMA in a way similar to ptep/pmdp_set_wrprotect(), with a generic implementation using set_pte_at() and a powerpc specific one using the appropriate mechanism needed to keep the hash table in sync. Acked-by: NMel Gorman <mgorman@suse.de> Reviewed-by: NRik van Riel <riel@redhat.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 15 2月, 2014 1 次提交
-
-
由 Chris Mason 提交于
This reverts commit 01e219e8. David Sterba found a different way to provide these features without adding a new ioctl. We haven't released any progs with this ioctl yet, so I'm taking this out for now until we finalize things. Signed-off-by: NChris Mason <clm@fb.com> Signed-off-by: NDavid Sterba <dsterba@suse.cz> CC: Jeff Mahoney <jeffm@suse.com>
-
- 14 2月, 2014 6 次提交
-
-
由 Li Zhong 提交于
Tommi noticed a 'funny' lock class name: "%s#5" from a lock acquired in process_one_work(). Maybe #fmt plus #args could be used as the lock_name to give some more information for some fmt string like the above. __builtin_constant_p() check is removed (as there seems no good way to check all the variables in args list). However, by removing the check, it only adds two additional "s for those constants. Some lockdep name examples printed out after the change: lockdep name wq->name "events_long" events_long "%s"("khelper") khelper "xfs-data/%s"mp->m_fsname xfs-data/dm-3 Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Roland Dreier 提交于
On some architectures (for example, arm), we don't end up indirectly pulling in the declaration of kzalloc() and kfree(), and so building anything that includes <linux/mlx5/driver.h> breaks. Fix this by adding an explicit include to get the declaration. Reported-by: Nkbuild test robot <fengguang.wu@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Moni Shoua 提交于
For userspace RoCE UD QPs we need to know the GID format that the kernel uses, e.g when working over older kernels. For that end, add a new port capability IB_PORT_IP_BASED_GIDS and report it when query port is issued. Signed-off-by: NMoni Shoua <monis@mellanox.co.il> Signed-off-by: NMatan Barak <matanb@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
由 Florian Westphal 提交于
Marcelo Ricardo Leitner reported problems when the forwarding link path has a lower mtu than the incoming one if the inbound interface supports GRO. Given: Host <mtu1500> R1 <mtu1200> R2 Host sends tcp stream which is routed via R1 and R2. R1 performs GRO. In this case, the kernel will fail to send ICMP fragmentation needed messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu checks in forward path. Instead, Linux tries to send out packets exceeding the mtu. When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does not fragment the packets when forwarding, and again tries to send out packets exceeding R1-R2 link mtu. This alters the forwarding dstmtu checks to take the individual gso segment lengths into account. For ipv6, we send out pkt too big error for gso if the individual segments are too big. For ipv4, we either send icmp fragmentation needed, or, if the DF bit is not set, perform software segmentation and let the output path create fragments when the packet is leaving the machine. It is not 100% correct as the error message will contain the headers of the GRO skb instead of the original/segmented one, but it seems to work fine in my (limited) tests. Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid sofware segmentation. However it turns out that skb_segment() assumes skb nr_frags is related to mss size so we would BUG there. I don't want to mess with it considering Herbert and Eric disagree on what the correct behavior should be. Hannes Frederic Sowa notes that when we would shrink gso_size skb_segment would then also need to deal with the case where SKB_MAX_FRAGS would be exceeded. This uses sofware segmentation in the forward path when we hit ipv4 non-DF packets and the outgoing link mtu is too small. Its not perfect, but given the lack of bug reports wrt. GRO fwd being broken this is a rare case anyway. Also its not like this could not be improved later once the dust settles. Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Reported-by: NMarcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
Will be used by upcoming ipv4 forward path change that needs to determine feature mask using skb->dst->dev instead of skb->dev. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Gordeev 提交于
The new functions are special cases for pci_enable_msi_range() and pci_enable_msix_range() when a particular number of MSI or MSI-X is needed. By contrast with pci_enable_msi_range() and pci_enable_msix_range() functions, pci_enable_msi_exact() and pci_enable_msix_exact() return zero in case of success, which indicates MSI or MSI-X interrupts have been successfully allocated. Signed-off-by: NAlexander Gordeev <agordeev@redhat.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 13 2月, 2014 6 次提交
-
-
由 Steven Noonan 提交于
I started noticing problems with KVM guest destruction on Linux 3.12+, where guest memory wasn't being cleaned up. I bisected it down to the commit introducing the new 'asm goto'-based atomics, and found this quirk was later applied to those. Unfortunately, even with GCC 4.8.2 (which ostensibly fixed the known 'asm goto' bug) I am still getting some kind of miscompilation. If I enable the asm_volatile_goto quirk for my compiler, KVM guests are destroyed correctly and the memory is cleaned up. So make the quirk unconditional for now, until bug is found and fixed. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSteven Noonan <steven@uplinklabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1392274867-15236-1-git-send-email-steven@uplinklabs.net Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Sumit Semwal 提交于
Russell King observed 'wierd' looking output from debugfs, and also suggested better ways of getting device names (use KBUILD_MODNAME, dev_name()) This patch addresses these issues to make the debugfs output correct and better looking. While at it, replace seq_printf with seq_puts to remove the checkpatch.pl warnings. Reported-by: NRussell King - ARM Linux <linux@arm.linux.org.uk> Signed-off-by: NSumit Semwal <sumit.semwal@linaro.org>
-
由 Dirk Brandewie 提交于
Remove the reporting of energy since it does not provide any useful information about the state of the driver and will be a maintainance headache going forward since the RAPL energy units register is not architectural and subject to change between micro-architectures References: https://bugzilla.kernel.org/show_bug.cgi?id=69831 Fixes: b69880f9 (intel_pstate: Add trace point to report internal state.) Signed-off-by: NDirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Roland Dreier 提交于
The CMD_T_FAILED flag is set used in one place to record the result of a trivial test, and it is only tested once, few lines later. We might as well make the code simpler and easier to read by directly doing the test of "success" where we want to use it. Signed-off-by: NRoland Dreier <roland@purestorage.com> Signed-off-by: NNicholas Bellinger <nab@linux-iscsi.org>
-
由 Jesse Barnes 提交于
This allows drivers to use them in custom initial_config functions. Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org> Acked-by: NDave Airlie <airlied@gmail.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Jesse Barnes 提交于
Just like we have for connector type etc. v2: drop static array (Chris) v3: add kdoc (Daniel) Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org> Acked-by: NDave Airlie <airlied@gmail.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 12 2月, 2014 1 次提交
-
-
由 Charmaine Lee 提交于
This patch queries the register SVGA_REG_MOB_MAX_SIZE for the maximum size of a single mob. Signed-off-by: NCharmaine Lee <charmainel@vmware.com> Reviewed-by: NThomas Hellstrom <thellstrom@vmware.com>
-
- 11 2月, 2014 6 次提交
-
-
由 Kent Overstreet 提交于
Immutable biovecs changed the way bio segments are treated in such a way that bio_for_each_segment() cannot now do what we want for discard/write same bios, since bi_size means something completely different for them. Fortunately discard and write same bios never have more than a single biovec, so bio_for_each_segment() is unnecessary and not terribly meaningful for them, but we still have to special case them in a few places. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Tested-by: NRichard W.M. Jones <rjones@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Li Zefan 提交于
Setup cgroupfs like this: # mount -t cgroup -o cpuacct xxx /cgroup # mkdir /cgroup/sub1 # mkdir /cgroup/sub2 Then run these two commands: # for ((; ;)) { mkdir /cgroup/sub1/tmp && rmdir /mnt/sub1/tmp; } & # for ((; ;)) { mkdir /cgroup/sub2/tmp && rmdir /mnt/sub2/tmp; } & After seconds you may see this warning: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 25243 at lib/idr.c:527 sub_remove+0x87/0x1b0() idr_remove called for id=6 which is not allocated. ... Call Trace: [<ffffffff8156063c>] dump_stack+0x7a/0x96 [<ffffffff810591ac>] warn_slowpath_common+0x8c/0xc0 [<ffffffff81059296>] warn_slowpath_fmt+0x46/0x50 [<ffffffff81300aa7>] sub_remove+0x87/0x1b0 [<ffffffff810f3f02>] ? css_killed_work_fn+0x32/0x1b0 [<ffffffff81300bf5>] idr_remove+0x25/0xd0 [<ffffffff810f2bab>] cgroup_destroy_css_killed+0x5b/0xc0 [<ffffffff810f4000>] css_killed_work_fn+0x130/0x1b0 [<ffffffff8107cdbc>] process_one_work+0x26c/0x550 [<ffffffff8107eefe>] worker_thread+0x12e/0x3b0 [<ffffffff81085f96>] kthread+0xe6/0xf0 [<ffffffff81570bac>] ret_from_fork+0x7c/0xb0 ---[ end trace 2d1577ec10cf80d0 ]--- It's because allocating/removing cgroup ID is not properly synchronized. The bug was introduced when we converted cgroup_ida to cgroup_idr. While synchronization is already done inside ida_simple_{get,remove}(), users are responsible for concurrent calls to idr_{alloc,remove}(). tj: Refreshed on top of b58c8998 ("cgroup: fix error return from cgroup_create()"). Fixes: 4e96ee8e ("cgroup: convert cgroup_ida to cgroup_idr") Cc: <stable@vger.kernel.org> #3.12+ Reported-by: NMichal Hocko <mhocko@suse.cz> Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Paul Bolle 提交于
Commit d52eefb4 ("ia64/xen: Remove Xen support for ia64") removed the Kconfig symbol XEN_XENCOMM. But it didn't remove the code depending on that symbol. Remove that code now. Signed-off-by: NPaul Bolle <pebolle@tiscali.nl> Acked-by: NDavid Vrabel <david.vrabel@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 David Vrabel 提交于
xen/gntdev.h and xen/gntalloc.h both provide userspace ABIs so they should be installed. CC: stable@vger.kernel.org Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Paul Gortmaker 提交于
Use what we already do for arch_disable_smp_support() to fix these: arch/x86/kernel/smpboot.c:1155:6: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static? arch/x86/kernel/smpboot.c:1160:6: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static? kernel/cpu.c:512:13: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static? kernel/cpu.c:516:13: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static? Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
Witch to using a preallocated flush_rq for blk-mq similar to what's done with the old request path. This allows us to set up the request properly with a tag from the actually allowed range and ->rq_disk as needed by some drivers. To make life easier we also switch to dynamic allocation of ->flush_rq for the old path. This effectively reverts most of "blk-mq: fix for flush deadlock" and "blk-mq: Don't reserve a tag for flush request" Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-