- 02 4月, 2018 31 次提交
-
-
由 Ilya Dryomov 提交于
If the layout is "fancy", we need to be able to rearrange the provided bio_vecs in stripe unit chunks to make it possible for the messenger to read/write directly from/to the provided data buffer, without employing a temporary data buffer for assembling the result. Higher level bio_vec arrays are generally immutable, so this requires copying into a private array. Only the bio_vecs themselves are shuffled around, not the actual data. OWN_BVECS doesn't own any pages. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
rbd_parent_request_create() takes a ref on obj_req for child_img_req. There is no point in doing that because child_img_req is created on behalf of obj_req -- obj_req is the initiator and can't be completed before child_img_req. Open-code the rest of rbd_parent_request_create() and remove it along with rbd_parent_request_destroy(). Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
These are set, but no longer used. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
A whole-object layered discard is implemented as a truncate rather than a delete: a dummy object is needed to prevent the CoW machinery from kicking in. However, a truncate on a non-existent object is a no-op. If the object doesn't exist in HEAD, a discard request is effectively ignored, which violates our "discard zeroes data" promise and breaks REQ_OP_WRITE_ZEROES implementation. A non-exclusive create on an existing object is also a no-op, so the fix is to do a compound create+truncate instead. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
In preparation for rbd "fancy" striping, replace obj_req->img_offset with obj_req->img_extents. A single starting offset isn't sufficient because we want only one OSD request per object and will merge adjacent object extents in ceph_file_to_extents(). The final object extent may map into multiple different byte ranges in the image. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
obj_req->object_no -> obj_req->ex.oe_objno obj_req->offset -> obj_req->ex.oe_off obj_req->length -> obj_req->ex.oe_len ... and use ex for linking object requests to image requests. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
ceph_calc_file_object_mapping() has nothing to do with osdmaps. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
All object requests are associated with an image request now -- avoid duplicating the same info in each object request. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
There are no standalone (!IMG_DATA) object requests anymore. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Do away with partial request completions and all the associated complexity. Individual object requests no longer need to be completed in order -- when the last one becomes ready, we complete the entire higher level request all at once. This also wraps up the conversion to a state machine model and eliminates the recursion described in commit 6d69bb53 ("rbd: prevent kernel stack blow up on rbd map"). Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
It should be void now. Also, object requests are unlinked only in image request destructor, which can't run before rbd_img_request_put(), so no need for _safe. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Store op_type in its own field instead of packing it into flags. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
No need to pass rbd_dev and op_type to rbd_osd_req_create(): there are no standalone (!IMG_DATA) object requests anymore and osd_req->r_flags can be set in rbd_osd_req_format_{read,write}(). Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
The notable changes are: - instead of explicitly stat'ing the object to see if it exists before issuing the write, send the write optimistically along with the stat in a single OSD request - zero copyup optimization - all object requests are associated with an image request and have a valid ->img_request pointer; there are no standalone (!IMG_DATA) object requests anymore - code is structured as a state machine (vs a bunch of callbacks with implicit state) Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
rbd needs this for null copyups -- if copyup data is all zeroes, we want to save some I/O and network bandwidth. See rbd_obj_issue_copyup() in the next commit. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Ilya Dryomov 提交于
In preparation for rbd "fancy" striping which requires bio_vec arrays, wire up BVECS data type and kill off PAGES data type. There is nothing wrong with using page vectors for copyup requests -- it's just less iterator boilerplate code to write for the new striping framework. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Ilya Dryomov 提交于
In preparation for rbd "fancy" striping, introduce ceph_bvec_iter for working with bio_vec array data buffers. The wrappers are trivial, but make it look similar to ceph_bio_iter. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
The initiating object request is the proper owner -- save a bit of space. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Ilya Dryomov 提交于
obj_req->pages is for provided data buffers. stat requests are internal and should be NODATA. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Ilya Dryomov 提交于
The reason we clone bios is to be able to give each object request (and consequently each ceph_osd_data/ceph_msg_data item) its own pointer to a (list of) bio(s). The messenger then initializes its cursor with cloned bio's ->bi_iter, so it knows where to start reading from/writing to. That's all the cloned bios are used for: to determine each object request's starting position in the provided data buffer. Introduce ceph_bio_iter to do exactly that -- store position within bio list (i.e. pointer to bio) + position within that bio (i.e. bvec_iter). Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
- make it void - xlen (object extent length) out parameter should be u32 because only a single stripe unit is mapped at a time Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Ilya Dryomov 提交于
bl, stripeno and objsetno should be u64 -- otherwise large enough files get corrupted. How large depends on file layout: - 4M-objects layout (default): any file over 16P - 64K-objects layout (smallest possible object size): any file over 512T Only CephFS is affected, rbd doesn't use ceph_calc_file_object_mapping() yet. Fortunately, CephFS has a max_file_size configurable, the default for which is way below both of the above numbers. Reimplement the logic from scratch with no layout validation -- it's done on the MDS side. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Ilya Dryomov 提交于
Commit 21acdf45 ("rbd: set max_segments to USHRT_MAX") removed the limit on max_segments. Remove the limit on max_segment_size as well. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NAlex Elder <elder@linaro.org>
-
由 Linus Torvalds 提交于
-
- 01 4月, 2018 3 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull perf fixes from Ingo Molnar: "Two fixlets" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/hwbp: Simplify the perf-hwbp code, fix documentation perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull x86 fixes from Ingo Molnar: "Two UV platform fixes, and a kbuild fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/UV: Fix critical UV MMR address error x86/platform/uv/BAU: Add APIC idt entry x86/purgatory: Avoid creating stray .<pid>.d files, remove -MD from KBUILD_CFLAGS
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
Pull x86 PTI fixes from Ingo Molnar: "Two fixes: a relatively simple objtool fix that makes Clang built kernels work with ORC debug info, plus an alternatives macro fix" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Fixup alternative_call_2 objtool: Add Clang support
-
- 31 3月, 2018 6 次提交
-
-
由 Linus Torvalds 提交于
Merge tag 'kbuild-fixes-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - fix missed rebuild of TRIM_UNUSED_KSYMS - fix rpm-pkg for GNU tar >= 1.29 - include scripts/dtc/include-prefixes/* to kernel header deb-pkg - add -no-integrated-as option ealier to fix building with Clang - fix netfilter Makefile for parallel building * tag 'kbuild-fixes-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: netfilter: nf_nat_snmp_basic: add correct dependency to Makefile kbuild: rpm-pkg: Support GNU tar >= 1.29 builddeb: Fix header package regarding dtc source links kbuild: set no-integrated-as before incl. arch Makefile kbuild: make scripts/adjust_autoksyms.sh robust against timestamp races
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net由 Linus Torvalds 提交于
Pull networking fixes from David Miller: 1) Fix RCU locking in xfrm_local_error(), from Taehee Yoo. 2) Fix return value assignments and thus error checking in iwl_mvm_start_ap_ibss(), from Johannes Berg. 3) Don't count header length twice in vti4, from Stefano Brivio. 4) Fix deadlock in rt6_age_examine_exception, from Eric Dumazet. 5) Fix out-of-bounds access in nf_sk_lookup_slow{v4,v6}() from Subash Abhinov. 6) Check nladdr size in netlink_connect(), from Alexander Potapenko. 7) VF representor SQ numbers are 32 not 16 bits, in mlx5 driver, from Or Gerlitz. 8) Out of bounds read in skb_network_protocol(), from Eric Dumazet. 9) r8169 driver sets driver data pointer after register_netdev() which is too late. Fix from Heiner Kallweit. 10) Fix memory leak in mlx4 driver, from Moshe Shemesh. 11) The multi-VLAN decap fix added a regression when dealing with device that lack a MAC header, such as tun. Fix from Toshiaki Makita. 12) Fix integer overflow in dynamic interrupt coalescing code. From Tal Gilboa. 13) Use after free in vrf code, from David Ahern. 14) IPV6 route leak between VRFs fix, also from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits) net: mvneta: fix enable of all initialized RXQs net/ipv6: Fix route leaking between VRFs vrf: Fix use after free and double free in vrf_finish_output ipv6: sr: fix seg6 encap performances with TSO enabled net/dim: Fix int overflow vlan: Fix vlan insertion for packets without ethernet header net: Fix untag for vlan packets without ethernet header atm: iphase: fix spelling mistake: "Receiverd" -> "Received" vhost: validate log when IOTLB is enabled qede: Do not drop rx-checksum invalidated packets. hv_netvsc: enable multicast if necessary ip_tunnel: Resolve ipsec merge conflict properly. lan78xx: Crash in lan78xx_writ_reg (Workqueue: events lan78xx_deferred_multicast_write) qede: Fix barrier usage after tx doorbell write. vhost: correctly remove wait queue during poll failure net/mlx4_core: Fix memory leak while delete slave's resources net/mlx4_en: Fix mixed PFC and Global pause user control requests net/smc: use announced length in sock_recvmsg() llc: properly handle dev_queue_xmit() return value strparser: Fix sign of err codes ...
-
由 Yelena Krivosheev 提交于
In mvneta_port_up() we enable relevant RX and TX port queues by write queues bit map to an appropriate register. q_map must be ZERO in the beginning of this process. Signed-off-by: NYelena Krivosheev <yelena@marvell.com> Signed-off-by: NGregory CLEMENT <gregory.clement@bootlin.com> Acked-by: NThomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Donald reported that IPv6 route leaking between VRFs is not working. The root cause is the strict argument in the call to rt6_lookup when validating the nexthop spec. ip6_route_check_nh validates the gateway and device (if given) of a route spec. It in turn could call rt6_lookup (e.g., lookup in a given table did not succeed so it falls back to a full lookup) and if so sets the strict argument to 1. That means if the egress device is given, the route lookup needs to return a result with the same device. This strict requirement does not work with VRFs (IPv4 or IPv6) because the oif in the flow struct is overridden with the index of the VRF device to trigger a match on the l3mdev rule and force the lookup to its table. The right long term solution is to add an l3mdev index to the flow struct such that the oif is not overridden. That solution will not backport well, so this patch aims for a simpler solution to relax the strict argument if the route spec device is an l3mdev slave. As done in other places, use the FLOWI_FLAG_SKIP_NH_OIF to know that the RT6_LOOKUP_F_IFACE flag needs to be removed. Fixes: ca254490 ("net: Add VRF support to IPv6 stack") Reported-by: NDonald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Miguel reported an skb use after free / double free in vrf_finish_output when neigh_output returns an error. The vrf driver should return after the call to neigh_output as it takes over the skb on error path as well. Patch is a simplified version of Miguel's patch which was written for 4.9, and updated to top of tree. Fixes: 8f58336d ("net: Add ethernet header for pass through VRF device") Signed-off-by: NMiguel Fadon Perlines <mfadon@teldat.com> Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Lebrun 提交于
Enabling TSO can lead to abysmal performances when using seg6 in encap mode, such as with the ixgbe driver. This patch adds a call to iptunnel_handle_offloads() to remove the encapsulation bit if needed. Before: root@comp4-seg6bpf:~# iperf3 -c fc00::55 Connecting to host fc00::55, port 5201 [ 4] local fc45::4 port 36592 connected to fc00::55 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 4] 0.00-1.00 sec 196 KBytes 1.60 Mbits/sec 47 6.66 KBytes [ 4] 1.00-2.00 sec 304 KBytes 2.49 Mbits/sec 100 5.33 KBytes [ 4] 2.00-3.00 sec 284 KBytes 2.32 Mbits/sec 92 5.33 KBytes After: root@comp4-seg6bpf:~# iperf3 -c fc00::55 Connecting to host fc00::55, port 5201 [ 4] local fc45::4 port 43062 connected to fc00::55 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 4] 0.00-1.00 sec 1.03 GBytes 8.89 Gbits/sec 0 743 KBytes [ 4] 1.00-2.00 sec 1.03 GBytes 8.87 Gbits/sec 0 743 KBytes [ 4] 2.00-3.00 sec 1.03 GBytes 8.87 Gbits/sec 0 743 KBytes Reported-by: NTom Herbert <tom@quantonium.net> Fixes: 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: NDavid Lebrun <dlebrun@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-