Commits · 19873eec7e13fda140a0ebc75d6664e57c00bfb1 · openeuler / Kernel

09 Sep, 2020 1 commit

Drivers: hv: vmbus: hibernation: do not hang forever in vmbus_bus_resume() · 19873eec

Committed by Dexuan Cui on Sep 04, 2020

After we Stop and later Start a VM that uses Accelerated Networking (NIC
SR-IOV), currently the VF vmbus device's Instance GUID can change, so after
vmbus_bus_resume() -> vmbus_request_offers(), vmbus_onoffer() can not find
the original vmbus channel of the VF, and hence we can't complete()
vmbus_connection.ready_for_resume_event in check_ready_for_resume_event(),
and the VM hangs in vmbus_bus_resume() forever.

Fix the issue by adding a timeout, so the resuming can still succeed, and
the saved state is not lost, and according to my test, the user can disable
Accelerated Networking and then will be able to SSH into the VM for
further recovery. Also prevent the VM in question from suspending again.

The host will be fixed so in future the Instance GUID will stay the same
across hibernation.

Fixes: d8bd2d44 ("Drivers: hv: vmbus: Resume after fixing up old primary channels")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200905025555.45614-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

07 Aug, 2020 1 commit

Drivers: hv: vmbus: Only notify Hyper-V for die events that are oops · 608a973b

Committed by Michael Kelley on Aug 06, 2020

Hyper-V currently may be notified of a panic for any die event. But
this results in false panic notifications for various user space traps
that are die events. Fix this by ignoring die events that aren't oops.

Fixes: 510f7aef ("Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/1596730935-11564-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
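
The filter described in the commit above can be pictured with a small user-space model. The enum values and the function name `hv_die_should_report()` are stand-ins chosen for illustration; they are not the kernel's definitions, and the real notifier callback has a different signature.

```c
#include <stdbool.h>

/* Hypothetical subset of die-event codes; the real kernel enum differs. */
enum die_val_model {
    MODEL_DIE_OOPS = 1,
    MODEL_DIE_INT3,
    MODEL_DIE_DEBUG,
    MODEL_DIE_GPF,
};

/*
 * Return true only for oops events. Every other die event (for example
 * a breakpoint trap raised on behalf of user space) is ignored, so the
 * host is not told about a panic that never happened.
 */
bool hv_die_should_report(enum die_val_model val)
{
    return val == MODEL_DIE_OOPS;
}
```

With this shape, a trap such as `MODEL_DIE_INT3` falls through without any notification, which is the behaviour the commit message asks for.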

29 Jun, 2020 1 commit

Drivers: hv: Change flag to write log level in panic msg to false · 77b48bea

Committed by Joseph Salisbury on Jun 26, 2020

When the kernel panics, one page of kmsg data may be collected and sent to
Hyper-V to aid in diagnosing the failure. The collected kmsg data typically
contains 50 to 100 lines, each of which has a log level prefix that isn't
very useful from a diagnostic standpoint. So tell kmsg_dump_get_buffer()
to not include the log level, enabling more information that *is* useful to
fit in the page.

Requesting in stable kernels, since many kernels running in production are
stable releases.

Cc: stable@vger.kernel.org
Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1593210497-114310-1-git-send-email-joseph.salisbury@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

23 May, 2020 1 commit

Drivers: hv: vmbus: Resolve more races involving init_vp_index() · afaa33da

Committed by Andrea Parri (Microsoft) on May 22, 2020

init_vp_index() uses the (per-node) hv_numa_map[] masks to record the
CPUs allocated for channel interrupts at a given time, and distribute
the performance-critical channels across the available CPUs: in
particular, the mask of "candidate" target CPUs in a given NUMA node,
for a newly offered channel, is determined by XOR-ing the node's CPU
mask and the node's hv_numa_map. This mechanism assumes that no offline
CPU is set in the hv_numa_map mask, an assumption that does not hold
since this mask is currently not updated when a channel is removed or
assigned to a different CPU.

To address the issues described above, this adds hooks in the channel
removal path (hv_process_channel_removal()) and in target_cpu_store()
in order to clear, resp. to update, the hv_numa_map[] masks as needed.
This also adds a (missed) update of the masks in init_vp_index() (cf.,
e.g., the memory-allocation failure path in this function).

As in the case of init_vp_index(), such hooks need to determine
whether the given channel is performance critical. init_vp_index() does
this by parsing the channel's offer; it cannot rely on the device
data structure (device_obj) to retrieve such information because the
device data structure has not been allocated/linked with the channel
by the time that init_vp_index() executes. A similar situation may
hold in hv_is_alloced_cpu() (defined below); the adopted approach is
to "cache" the device type of the channel, as computed by parsing the
channel's offer, in the channel structure itself.

Fixes: 75278105 ("Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type")
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200522171901.204127-3-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
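
The XOR step in the commit above, and the reason a stale bit breaks it, can be sketched with plain 64-bit bitmasks. This is a user-space model only: `cpumask_model_t`, `candidate_cpus()` and `release_cpu()` are hypothetical names, and the kernel works on `struct cpumask` rather than a single integer.

```c
#include <stdint.h>

/* One bit per CPU; a 64-bit word is enough for this illustration. */
typedef uint64_t cpumask_model_t;

/*
 * Candidate CPUs in a node = CPUs of the node that are not yet recorded
 * as allocated. XOR gives that result only while "allocated" is a
 * subset of "node_cpus".
 */
cpumask_model_t candidate_cpus(cpumask_model_t node_cpus,
                               cpumask_model_t allocated)
{
    return node_cpus ^ allocated;
}

/* Clearing a CPU when its channel goes away keeps the subset property. */
cpumask_model_t release_cpu(cpumask_model_t allocated, unsigned int cpu)
{
    return allocated & ~((cpumask_model_t)1 << cpu);
}
```

If CPU 3 has gone offline (node mask `0x07`) while the allocation mask still carries its bit (`0x0B`), the XOR yields `0x0C`, which wrongly offers the offline CPU 3 as a candidate. Clearing the bit first yields `0x04`, the only CPU that is both online and unallocated.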

20 May, 2020 4 commits

Driver: hv: vmbus: drop a no long applicable comment · 723c425f

Committed by Wei Liu on May 06, 2020

None of the things mentioned in the comment is initialized in hv_init.
They've been moved elsewhere.
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20200506160806.118965-1-wei.liu@kernel.org
Reviewed-by: Michael Kelley <mikelley@microsoft.com>

hyper-v: Replace open-coded variant of %*phN specifier · 0027e3fd

Committed by Andy Shevchenko on Apr 23, 2020

printf()-like functions in the kernel have extensions, such as
%*phN to dump small pieces of memory as hex values.

Replace print_alias_name() with the direct use of %*phN.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200423134505.78221-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
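
`%*phN` exists only in the kernel's printf implementation, so it cannot be tried from user space. As a rough illustration of the kind of open-coded loop that such a specifier replaces, the sketch below formats a byte buffer as hex digits with no separator. The helper name `hex_no_sep()` is hypothetical and is not the removed kernel function.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write len bytes of buf as lowercase hex with no separator between
 * bytes. The caller must provide an output buffer of at least
 * 2 * len + 1 bytes.
 */
void hex_no_sep(const unsigned char *buf, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned int)buf[i]);
    out[2 * len] = '\0';
}
```

In kernel code the whole loop collapses into a single format string, with the buffer length passed as the field width.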

hyper-v: Supply GUID pointer to printf() like functions · 458c4475

Committed by Andy Shevchenko on Apr 23, 2020

Drop dereference when printing the GUID with printf()-like functions.
This allows hiding the uuid_t internals.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200423134505.78221-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

scsi: storvsc: Re-init stor_chns when a channel interrupt is re-assigned · 7769e18c

Committed by Andrea Parri (Microsoft) on Apr 06, 2020

For each storvsc_device, storvsc keeps track of the channel target CPUs
associated to the device (alloced_cpus) and it uses this information to
fill a "cache" (stor_chns) mapping CPU->channel according to a certain
heuristic.  Update the alloced_cpus mask and the stor_chns array when a
channel of the storvsc device is re-assigned to a different CPU.
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: <linux-scsi@vger.kernel.org>
Link: https://lore.kernel.org/r/20200406001514.19876-12-parri.andrea@gmail.com
Reviewed-by: Long Li <longli@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
[ wei: fix a small issue reported by kbuild test robot <lkp@intel.com> ]
Signed-off-by: Wei Liu <wei.liu@kernel.org>

23 Apr, 2020 10 commits

Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type · 75278105

Committed by Andrea Parri (Microsoft) on Apr 06, 2020

VMBus version 4.1 and later support the CHANNELMSG_MODIFYCHANNEL(22)
message type which can be used to request Hyper-V to change the vCPU
that a channel will interrupt.

Introduce the CHANNELMSG_MODIFYCHANNEL message type, and define the
vmbus_send_modifychannel() function to send CHANNELMSG_MODIFYCHANNEL
requests to the host via a hypercall. The function is then used to
define a sysfs "store" operation, which allows changing the (v)CPU
the channel will interrupt by using the sysfs interface. The feature
can be used for load balancing or other purposes.

One interesting catch here is that Hyper-V can *not* currently ACK
CHANNELMSG_MODIFYCHANNEL messages with the promise that (after the ACK
is sent) the channel won't send any more interrupts to the "old" CPU.

The peculiarity of the CHANNELMSG_MODIFYCHANNEL messages is problematic
if the user wants to take a CPU offline, since we don't want to take a
CPU offline (and, potentially, "lose" channel interrupts on such CPU)
if the host is still processing a CHANNELMSG_MODIFYCHANNEL message
associated with that CPU.

It is worth mentioning, however, that we have been unable to observe
the above mentioned "race": in all our tests, CHANNELMSG_MODIFYCHANNEL
requests appeared *as if* they were processed synchronously by the host.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-11-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
[ wei: fix conflict in channel_mgmt.c ]
Signed-off-by: Wei Liu <wei.liu@kernel.org>

Drivers: hv: vmbus: Use a spin lock for synchronizing channel scheduling vs. channel removal · 9403b66e

Committed by Andrea Parri (Microsoft) on Apr 06, 2020

Since vmbus_chan_sched() dereferences the ring buffer pointer, we have
to make sure that the ring buffer data structures don't get freed while
such dereferencing is happening. Current code does this by sending an
IPI to the CPU that is allowed to access that ring buffer from interrupt
level, cf., vmbus_reset_channel_cb(). But with the new functionality
to allow changing the CPU that a channel will interrupt, we can't be
sure what CPU will be running the vmbus_chan_sched() function for a
particular channel, so the current IPI mechanism is infeasible.

Instead synchronize vmbus_chan_sched() and vmbus_reset_channel_cb() by
using the (newly introduced) per-channel spin lock "sched_lock". Move
the test for onchannel_callback being NULL before the "switch" control
statement in vmbus_chan_sched(), in order to not access the ring buffer
if the vmbus_reset_channel_cb() has been completed on the channel.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-7-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels · 8b6a877c

Committed by Andrea Parri (Microsoft) on Apr 06, 2020

When Hyper-V sends an interrupt to the guest, the guest has to figure
out which channel the interrupt is associated with. Hyper-V sets a bit
in a memory page that is shared with the guest, indicating a particular
"relid" that the interrupt is associated with. The current Linux code
then uses a set of per-CPU linked lists to map a given "relid" to a
pointer to a channel structure.

This design introduces a synchronization problem if the CPU that Hyper-V
will interrupt for a certain channel is changed. If the interrupt comes
on the "old CPU" and the channel was already moved to the per-CPU list
of the "new CPU", then the relid -> channel mapping will fail and the
interrupt is dropped. Similarly, if the interrupt comes on the new CPU
but the channel was not moved to the per-CPU list of the new CPU, then
the mapping will fail and the interrupt is dropped.

Relids are integers ranging from 0 to 2047. The mapping from relids to
channel structures can be done by setting up an array with 2048 entries,
each entry being a pointer to a channel structure (hence total size ~16K
bytes, which is not a problem). The array is global, so there are no
per-CPU linked lists to update. The array can be searched and updated
by loading from/storing to the array at the specified index. With no
per-CPU data structures, the above mentioned synchronization problem is
avoided and the relid2channel() function gets simpler.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-4-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
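
The global relid table described in the commit above amounts to an array indexed by relid. The following is a minimal user-space sketch of that idea. The type and function names (`channel_model`, `map_relid()`, `unmap_relid()`, `relid2channel_model()`) are assumptions made for illustration, and the memory-ordering care that kernel code would need is left out.

```c
#include <stddef.h>

#define MAX_CHANNEL_RELIDS_MODEL 2048  /* relids range from 0 to 2047 */

struct channel_model {
    unsigned int relid;
};

/* One global table shared by all CPUs; no per-CPU lists to keep in sync. */
static struct channel_model *channels[MAX_CHANNEL_RELIDS_MODEL];

/* Record a channel under its relid; returns -1 if the relid is out of range. */
int map_relid(struct channel_model *ch)
{
    if (ch->relid >= MAX_CHANNEL_RELIDS_MODEL)
        return -1;
    channels[ch->relid] = ch;
    return 0;
}

/* Forget the channel stored under relid, if any. */
void unmap_relid(unsigned int relid)
{
    if (relid < MAX_CHANNEL_RELIDS_MODEL)
        channels[relid] = NULL;
}

/* Lookup is a single indexed load, whichever CPU took the interrupt. */
struct channel_model *relid2channel_model(unsigned int relid)
{
    if (relid >= MAX_CHANNEL_RELIDS_MODEL)
        return NULL;
    return channels[relid];
}
```

Because the lookup no longer depends on which CPU handles the interrupt, retargeting a channel to another CPU does not need to touch this table at all.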

Drivers: hv: vmbus: Don't bind the offer&rescind works to a specific CPU · b9fa1b87

Committed by Andrea Parri (Microsoft) on Apr 06, 2020

The offer and rescind works are currently scheduled on the so called
"connect CPU". However, this is not really needed: we can synchronize
the works by relying on the usage of the offer_in_progress counter and
of the channel_mutex mutex. This synchronization is already in place.
So, remove this unnecessary "bind to the connect CPU" constraint and
update the inline comments accordingly.
Suggested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-3-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

Drivers: hv: vmbus: Always handle the VMBus messages on CPU0 · 8a857c55

Committed by Andrea Parri (Microsoft) on Apr 06, 2020

A Linux guest has to pick a "connect CPU" to communicate with the
Hyper-V host. This CPU can not be taken offline because Hyper-V does
not provide a way to change that CPU assignment.

Current code sets the connect CPU to whatever CPU ends up running the
function vmbus_negotiate_version(), and this will generate problems if
that CPU is taken offline.

Establish CPU0 as the connect CPU, and add logic to prevent the
connect CPU from being taken offline. We could pick some other CPU,
and we could pick that "other CPU" dynamically if there was a reason to
do so at some point in the future. But for now, #defining the connect
CPU to 0 is the most straightforward and least complex solution.

While at it, add inline comments explaining "why" offer and rescind
messages should not be handled by the same serialized work queue.

Suggested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20200406001514.19876-2-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>

Drivers: hv: check VMBus messages lengths · 52c7803f

Committed by Vitaly Kuznetsov on Apr 06, 2020

VMBus message handlers (channel_message_table) receive a pointer to
'struct vmbus_channel_message_header' and cast it to a structure of their
choice, which is sometimes longer than the header. We, however, don't check
that the message is long enough so in case hypervisor screws up we'll be
accessing memory beyond what was allocated for temporary buffer.

Previously, we used to always allocate and copy 256 bytes from message page
to temporary buffer but this is hardly better: in case the message is
shorter than we expect we'll be trying to consume garbage as some real
data and no memory guarding technique will be able to identify an issue.

Introduce 'min_payload_len' to 'struct vmbus_channel_message_table_entry'
and check against it in vmbus_on_msg_dpc(). Note, we can't require the
exact length as new hypervisor versions may add extra fields to messages,
we only check that the message is not shorter than we expect.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200406104326.45361-1-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
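
A minimum-length check of the kind the commit above describes might look roughly like the sketch below. Only the field name `min_payload_len` comes from the commit message; the message types, payload structures and the helper `payload_len_ok()` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message types and payload layouts, for illustration only. */
enum msg_type_model { MSG_OFFER_MODEL, MSG_RESCIND_MODEL, MSG_COUNT_MODEL };

struct offer_payload_model   { uint32_t relid; uint8_t guid[16]; };
struct rescind_payload_model { uint32_t relid; };

struct msg_table_entry_model {
    enum msg_type_model type;
    size_t min_payload_len;   /* smallest payload the handler may touch */
};

static const struct msg_table_entry_model msg_table[MSG_COUNT_MODEL] = {
    { MSG_OFFER_MODEL,   sizeof(struct offer_payload_model) },
    { MSG_RESCIND_MODEL, sizeof(struct rescind_payload_model) },
};

/*
 * Accept a message only if its payload is at least as long as the
 * handler expects. Longer payloads pass, because a newer host may
 * append fields to an existing message.
 */
int payload_len_ok(enum msg_type_model type, size_t payload_len)
{
    if ((unsigned int)type >= MSG_COUNT_MODEL)
        return 0;
    return payload_len >= msg_table[type].min_payload_len;
}
```

The check is deliberately one-sided: rejecting anything that is not an exact match would break as soon as a newer host extended a message.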

Drivers: hv: make sure that 'struct vmbus_channel_message_header' compiles correctly · b0a284dc

Committed by Vitaly Kuznetsov on Apr 06, 2020

Strictly speaking, the compiler is free to use something different from
'u32' for 'enum vmbus_channel_message_type' (e.g. char). This doesn't
happen in real life, so just add a BUILD_BUG_ON() guard.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200406104316.45303-1-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
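
Outside the kernel, the same kind of compile-time guard can be written with C11 `_Static_assert`. The enum and struct below are hypothetical stand-ins for the ones named in the commit above; the point is only that the build fails if the enum is not 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical enum standing in for the message-type enum in the commit. */
enum msgtype_model {
    MSGTYPE_INVALID_MODEL = 0,
    MSGTYPE_OFFER_MODEL   = 1,
};

/*
 * C11 counterpart of a BUILD_BUG_ON()-style check: compilation stops if
 * the compiler picked a representation that is not 32 bits wide.
 */
_Static_assert(sizeof(enum msgtype_model) == sizeof(uint32_t),
               "message type enum must occupy 32 bits");

/* A header whose layout depends on the enum width checked above. */
struct msg_header_model {
    enum msgtype_model msgtype;
    uint32_t padding;
};
```

A guard like this costs nothing at run time and documents the layout assumption next to the structure that relies on it.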

Drivers: hv: avoid passing opaque pointer to vmbus_onmessage() · 5cc41500

Committed by Vitaly Kuznetsov on Apr 06, 2020

vmbus_onmessage() doesn't need the header of the message, it only
uses it to get to the payload, we can pass the pointer to the
payload directly.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200406104154.45010-4-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

Drivers: hv: allocate the exact needed memory for messages · a276463b

Committed by Vitaly Kuznetsov on Apr 06, 2020

When we need to pass a buffer with Hyper-V message we don't need to always
allocate 256 bytes for the message: the real message length is known from
the header. Change 'struct onmessage_work_context' to make it possible to
not over-allocate.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200406104154.45010-3-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
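
One common way to avoid over-allocating, shown here as a user-space sketch, is a structure that ends in a flexible array member and is allocated with exactly the payload length taken from the header. The layout and names below (`work_context_model`, `alloc_context()`) are assumptions for illustration; the commit above does not spell out how the kernel structure was changed.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical work context: fixed fields followed by the payload. */
struct work_context_model {
    uint8_t payload_size;
    uint8_t payload[];      /* flexible array member, sized at allocation */
};

/* Allocate exactly the fixed part plus payload_size bytes, then copy. */
struct work_context_model *alloc_context(const uint8_t *payload,
                                         uint8_t payload_size)
{
    struct work_context_model *ctx;

    ctx = malloc(sizeof(*ctx) + payload_size);
    if (!ctx)
        return NULL;
    ctx->payload_size = payload_size;
    memcpy(ctx->payload, payload, payload_size);
    return ctx;
}
```

A 3-byte payload then costs 4 bytes of heap in this model instead of a fixed 256-byte buffer; the caller releases it with `free()`.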

Drivers: hv: copy from message page only what's needed · ac0f7d42

Committed by Vitaly Kuznetsov on Apr 06, 2020

Hyper-V Interrupt Message Page (SIMP) has 16 256-byte slots for
messages. Each message comes with a header (16 bytes) which specifies the
payload length (up to 240 bytes). vmbus_on_msg_dpc(), however, doesn't
look at the real message length and copies the whole slot to a temporary
buffer before passing it to message handlers. This is potentially dangerous
as hypervisor doesn't have to clean the whole slot when putting a new
message there and a message handler can get access to some data which
belongs to a previous message.

Note, this is not currently a problem because all message handlers are
in-kernel but eventually we may e.g. get this exported to userspace.

Note also, that this is not a performance critical path: messages (unlike
events) represent rare events so it doesn't really matter (from performance
point of view) if we copy too much.

Fix the issue by taking into account the real message length. The temporary
buffer allocated by vmbus_on_msg_dpc() remains fixed size for now. Also,
check that the supplied payload length is valid (<= 240 bytes).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200406104154.45010-2-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
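
The sizes quoted in the commit above (256-byte slots, a 16-byte header, at most 240 bytes of payload) are enough to sketch the bounded copy. The structure layout and the function `copy_message()` are illustrative assumptions, not the kernel's definitions.

```c
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE_MODEL    256
#define HEADER_SIZE_MODEL  16
#define MAX_PAYLOAD_MODEL  (SLOT_SIZE_MODEL - HEADER_SIZE_MODEL)  /* 240 */

/* Hypothetical 16-byte header; field names are illustrative. */
struct slot_header_model {
    uint32_t message_type;
    uint8_t  payload_size;
    uint8_t  reserved[11];
};

struct slot_model {
    struct slot_header_model header;
    uint8_t payload[MAX_PAYLOAD_MODEL];
};

/*
 * Copy the header plus payload_size bytes into dst and return the
 * number of bytes copied, or -1 if the advertised length is invalid.
 * Bytes past the payload (possibly left over from an earlier message)
 * are never read.
 */
int copy_message(const struct slot_model *slot, void *dst)
{
    size_t len;

    if (slot->header.payload_size > MAX_PAYLOAD_MODEL)
        return -1;
    len = HEADER_SIZE_MODEL + slot->header.payload_size;
    memcpy(dst, slot, len);
    return (int)len;
}
```

For an 8-byte payload only 24 bytes reach the destination buffer, so stale data in the remaining 232 bytes of the slot stays out of the handler's view.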

15 Apr, 2020 1 commit

Drivers: hv: vmbus: Fix Suspend-to-Idle for Generation-2 VM · 1a06d017

Committed by Dexuan Cui on Apr 11, 2020

Before the hibernation patchset (e.g. f53335e3), in a Generation-2
Linux VM on Hyper-V, the user can run "echo freeze > /sys/power/state" to
freeze the system, i.e. Suspend-to-Idle. The user can press the keyboard
or move the mouse to wake up the VM.

With the hibernation patchset, Linux VM on Hyper-V can hibernate to disk,
but Suspend-to-Idle is broken: when the synthetic keyboard/mouse are
suspended, there is no way to wake up the VM.

Fix the issue by not suspending and resuming the vmbus devices upon
Suspend-to-Idle.

Fixes: f53335e3 ("Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation")
Cc: stable@vger.kernel.org
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1586663435-36243-1-git-send-email-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

12 Apr, 2020 3 commits

x86/Hyper-V: Report crash data in die() when panic_on_oops is set · f3a99e76

Committed by Tianyu Lan on Apr 06, 2020

When an oops happens with panic_on_oops unset, the oops
thread is killed by die() and the system continues to run.
In that case, the guest should not report crash register
data to the host, since the system is still running. Fix this by
checking panic_on_oops and returning directly from
hyperv_report_panic() when it is called from die() and
panic_on_oops is unset.

Fixes: 7ed4325a ("Drivers: hv: vmbus: Make panic reporting to be more useful")
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200406155331.2105-7-Tianyu.Lan@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/Hyper-V: Report crash register data when sysctl_record_panic_msg is not set · 040026df

Committed by Tianyu Lan on Apr 06, 2020

When sysctl_record_panic_msg is not set, the panic will
not be reported to Hyper-V via hyperv_report_panic_msg().
So the crash should be reported via hyperv_report_panic().

Fixes: 81b18bce ("Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20200406155331.2105-6-Tianyu.Lan@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/Hyper-V: Trigger crash enlightenment only once during system crash. · 73f26e52

Committed by Tianyu Lan on Apr 06, 2020

When a guest VM panics, Hyper-V should be notified only once via the
crash synthetic MSRs. Current Linux code might write these crash MSRs
twice during a system panic:
1) hyperv_panic/die_event() calling hyperv_report_panic()
2) hv_kmsg_dump() calling hyperv_report_panic_msg()

Fix this by not calling hyperv_report_panic() if a kmsg dump has been
successfully registered. The notification will happen later via
hyperv_report_panic_msg().

Fixes: 7ed4325a ("Drivers: hv: vmbus: Make panic reporting to be more useful")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20200406155331.2105-4-Tianyu.Lan@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

10 Apr, 2020 2 commits

x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump · 7f11a2cc

Committed by Tianyu Lan on Apr 06, 2020

If kmsg_dump_register() fails, hv_panic_page will not be used
anywhere. So free and reset it.

Fixes: 81b18bce ("Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20200406155331.2105-3-Tianyu.Lan@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/Hyper-V: Unload vmbus channel in hv panic callback · 74347a99

Committed by Tianyu Lan on Apr 06, 2020

When kdump is not configured, a Hyper-V VM might still respond to
network traffic after a kernel panic when kernel parameter panic=0.
The panic CPU goes into an infinite loop with interrupts enabled,
and the VMbus driver interrupt handler still works because the
VMbus connection is unloaded only in the kdump path. The network
responses make the other end of the connection think the VM is
still functional even though it has panic'ed, which could affect any
failover actions that should be taken.

Fix this by unloading the VMbus connection during the panic process.
vmbus_initiate_unload() could then be called twice (e.g., by
hyperv_panic_event() and hv_crash_handler()), so reset the connection
state in vmbus_initiate_unload() to ensure the unload is done only
once.

Fixes: 81b18bce ("Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20200406155331.2105-2-Tianyu.Lan@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
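
The "unload only once" guarantee in the commit above can be modelled with an atomic exchange on the connection state: whichever caller flips the state first does the work, and any later caller sees the state already reset and returns. The sketch uses C11 atomics in user space. The names are hypothetical, and the use of an atomic exchange is an assumption about how such a reset could be made race-free, not a statement about the kernel code.

```c
#include <stdatomic.h>

enum conn_state_model { DISCONNECTED_MODEL, CONNECTED_MODEL };

static _Atomic int conn_state = CONNECTED_MODEL;
static int unload_requests_sent;   /* counts simulated unload messages */

/*
 * Atomically mark the connection as disconnected. Only the caller that
 * observes the previous state as "connected" goes on to send the unload
 * request; any later caller returns immediately.
 */
void initiate_unload_model(void)
{
    if (atomic_exchange(&conn_state, DISCONNECTED_MODEL) == DISCONNECTED_MODEL)
        return;
    unload_requests_sent++;
}
```

Calling `initiate_unload_model()` from both a panic path and a crash path therefore results in a single simulated unload request.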

26 Jan, 2020 1 commit

Drivers: hv: vmbus: Ignore CHANNELMSG_TL_CONNECT_RESULT(23) · ddc9d357

Committed by Dexuan Cui on Jan 19, 2020

When a Linux hv_sock app tries to connect to a Service GUID on which no
host app is listening, a recent host (RS3+) sends a
CHANNELMSG_TL_CONNECT_RESULT (23) message to Linux and this triggers such
a warning:

unknown msgtype=23
WARNING: CPU: 2 PID: 0 at drivers/hv/vmbus_drv.c:1031 vmbus_on_msg_dpc

Actually Linux can safely ignore the message because the Linux app's
connect() will time out in 2 seconds: see VSOCK_DEFAULT_CONNECT_TIMEOUT
and vsock_stream_connect(). We don't bother to make use of the message
because: 1) it's only supported on recent hosts; 2) a non-trivial effort
is required to use the message in Linux, but the benefit is small.

So, let's not see the warning by silently ignoring the message.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

22 Nov, 2019 5 commits

Drivers: hv: vmbus: Fix crash handler reset of Hyper-V synic · 7a1323b5

Committed by Michael Kelley on Nov 14, 2019

The crash handler calls hv_synic_cleanup() to shut down the
Hyper-V synthetic interrupt controller.  But if the CPU
that calls hv_synic_cleanup() has a VMbus channel interrupt
assigned to it (which is likely the case in smaller VM sizes),
hv_synic_cleanup() returns an error and the synthetic
interrupt controller isn't shut down.  While the lack of
being shut down hasn't caused a known problem, it still
should be fixed for highest reliability.

So directly call hv_synic_disable_regs() instead of
hv_synic_cleanup(), which ensures that the synic is always
shut down.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

drivers/hv: Replace binary semaphore with mutex · 8aea7f82

Committed by Davidlohr Bueso on Nov 01, 2019

At a slight footprint cost (24 vs 32 bytes), mutexes are more optimal
than semaphores; it's also a nicer interface for mutual exclusion,
which is why they are encouraged over binary semaphores, when possible.

Replace the hyperv_mmio_lock semaphore with a mutex. Its semantics imply
traditional lock ownership; that is, the lock owner is the same for both
lock/unlock operations. Therefore it is safe to convert.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

Drivers: hv: vmbus: Remove dependencies on guest page size · 53edce00

Committed by Himadri Pandya on Jul 30, 2019

Hyper-V assumes the page size to be 4K. This might not be the case for the
ARM64 architecture. Hence use the Hyper-V page size and page allocation
function to avoid conflicts between different host and guest page sizes on
ARM64.

Signed-off-by: Himadri Pandya <himadri18.07@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

drivers: hv: vmbus: Introduce latency testing · af9ca6f9

Committed by Branden Bonaby on Oct 03, 2019

Introduce user-specified latency in the packet reception path
by exposing the test parameters as part of the debugfs channel
attributes. We will control the testing state via these attributes.

Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

Drivers: hv: vmbus: Introduce table of VMBus protocol versions · bedc61a9

Committed by Andrea Parri on Oct 15, 2019

The technique used to get the next VMBus version seems increasingly
clumsy as the number of VMBus versions increases.  Performance is
not a concern since this is only done once during system boot; it's
just that we'll end up with more lines of code than is really needed.

As an alternative, introduce a table with the version numbers listed
in order (from the most recent to the oldest).  vmbus_connect() loops
through the versions listed in the table until it gets an accepted
connection or gets to the end of the table (invalid version).
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
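
A table-driven negotiation loop of the kind described in the commit above could be sketched as follows. The version numbers, the `host_accepts_fn` callback and the two example hosts are illustrative assumptions: the callback stands in for one negotiation round-trip with the host, and the table is not the kernel's actual list.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative version numbers, newest first; not the kernel's table. */
static const uint32_t versions_model[] = {
    (5u << 16) | 2,
    (5u << 16) | 1,
    (5u << 16) | 0,
    (4u << 16) | 0,
    (3u << 16) | 0,
};

#define VERSION_INVAL_MODEL 0xFFFFFFFFu

/* Hypothetical stand-in for one negotiation round-trip with the host. */
typedef int (*host_accepts_fn)(uint32_t version);

/*
 * Offer each version in table order and return the first one the host
 * accepts, or VERSION_INVAL_MODEL when the table is exhausted.
 */
uint32_t negotiate_version(host_accepts_fn host_accepts)
{
    size_t n = sizeof(versions_model) / sizeof(versions_model[0]);

    for (size_t i = 0; i < n; i++) {
        if (host_accepts(versions_model[i]))
            return versions_model[i];
    }
    return VERSION_INVAL_MODEL;
}

/* Example host that only understands versions up to 4.0. */
int old_host_model(uint32_t version)
{
    return version <= ((4u << 16) | 0);
}

/* Example host that rejects every version. */
int deaf_host_model(uint32_t version)
{
    (void)version;
    return 0;
}
```

Supporting a new protocol version then means adding one entry at the top of the table rather than another branch in the selection logic.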

15 Nov, 2019 1 commit

x86/hyperv: Initialize clockevents earlier in CPU onlining · 4df4cb9e

Committed by Michael Kelley on Nov 13, 2019

Hyper-V has historically initialized stimer-based clockevents late in the
process of onlining a CPU because clockevents depend on stimer
interrupts. In the original Hyper-V design, stimer interrupts generate a
VMbus message, so the VMbus machinery must be running first, and VMbus
can't be initialized until relatively late. On x86/64, LAPIC timer based
clockevents are used during early initialization before VMbus and
stimer-based clockevents are ready, and again during CPU offlining after
the stimer clockevents have been shut down.

Unfortunately, this design creates problems when offlining CPUs for
hibernation or other purposes. stimer-based clockevents are shut down
relatively early in the offlining process, so clockevents_unbind_device()
must be used to fall back to the LAPIC-based clockevents for the remainder
of the offlining process. Furthermore, the late initialization and early
shutdown of stimer-based clockevents doesn't work well on ARM64 since there
is no other timer like the LAPIC to fall back to. So CPU onlining and
offlining doesn't work properly.

Fix this by recognizing that stimer Direct Mode is the normal path for
newer versions of Hyper-V on x86/64, and the only path on other
architectures. With stimer Direct Mode, stimer interrupts don't require any
VMbus machinery. stimer clockevents can be initialized and shut down
consistent with how it is done for other clockevent devices. While the old
VMbus-based stimer interrupts must still be supported for backward
compatibility on x86, that mode of operation can be treated as legacy.

So add a new Hyper-V stimer entry in the CPU hotplug state list, and use
that new state when in Direct Mode. Update the Hyper-V clocksource driver
to allocate and initialize stimer clockevents earlier during boot. Update
Hyper-V initialization and the VMbus driver to use this new design. As a
result, the LAPIC timer is no longer used during boot or CPU
onlining/offlining and clockevents_unbind_device() is not called. But
retain the old design as a legacy implementation for older versions of
Hyper-V that don't support Direct Mode.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lkml.kernel.org/r/1573607467-9456-1-git-send-email-mikelley@microsoft.com

02 Oct, 2019 1 commit

Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM_SLEEP · 83b50f83

Committed by Dexuan Cui on Sep 19, 2019

If CONFIG_PM_SLEEP is not set, we can comment out these functions to avoid
the below warnings:

drivers/hv/vmbus_drv.c:2208:12: warning: ‘vmbus_bus_resume’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:2128:12: warning: ‘vmbus_bus_suspend’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:937:12: warning: ‘vmbus_resume’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:918:12: warning: ‘vmbus_suspend’ defined but not used [-Wunused-function]

Fixes: 271b2224 ("Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation")
Fixes: f53335e3 ("Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

07 Sep, 2019 6 commits

Drivers: hv: vmbus: Resume after fixing up old primary channels · d8bd2d44

Committed by Dexuan Cui on Sep 05, 2019

When the host re-offers the primary channels upon resume, the host only
guarantees the Instance GUID doesn't change, so vmbus_bus_suspend()
should invalidate channel->offermsg.child_relid and figure out the
number of primary channels that need to be fixed up upon resume.

Upon resume, vmbus_onoffer() finds the old channel structs, and maps
the new offers to the old channels, and fixes up the old structs,
and finally the resume callbacks of the VSC drivers will re-open
the channels.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

d8bd2d44
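
The suspend/resume fix-up described in d8bd2d44 can be modelled in a few lines of user-space C. This is a simplified sketch under stated assumptions, not the driver code: `struct demo_channel`, `DEMO_INVALID_RELID` and the `demo_*` functions are hypothetical, and the real channel structs carry far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_INVALID_RELID UINT32_MAX	/* assumption: marks "no relid yet" */

/* Hypothetical, heavily reduced channel record. */
struct demo_channel {
	uint8_t instance_guid[16];
	uint32_t child_relid;
};

/* Suspend: invalidate every relid; return how many channels need fix-up. */
size_t demo_suspend_channels(struct demo_channel *chans, size_t n)
{
	for (size_t i = 0; i < n; i++)
		chans[i].child_relid = DEMO_INVALID_RELID;
	return n;
}

/* Resume: map a new offer onto the old channel with the same GUID. */
bool demo_fixup_offer(struct demo_channel *chans, size_t n,
		      const uint8_t guid[16], uint32_t new_relid)
{
	for (size_t i = 0; i < n; i++) {
		if (chans[i].child_relid == DEMO_INVALID_RELID &&
		    memcmp(chans[i].instance_guid, guid, 16) == 0) {
			chans[i].child_relid = new_relid;
			return true;
		}
	}
	return false;	/* no old channel carries this GUID */
}

/* Example run: two old channels, one matching and one unknown offer. */
int demo_resume_scenario(void)
{
	struct demo_channel chans[2] = {
		{ .instance_guid = { 0xaa }, .child_relid = 1 },
		{ .instance_guid = { 0xbb }, .child_relid = 2 },
	};
	const uint8_t known[16] = { 0xbb };
	const uint8_t unknown[16] = { 0xcc };
	size_t pending = demo_suspend_channels(chans, 2);

	if (demo_fixup_offer(chans, 2, known, 7))
		pending--;
	if (demo_fixup_offer(chans, 2, unknown, 8))
		pending--;
	return (int)pending;	/* channels still waiting for a matching offer */
}
```

In this model the lookup depends entirely on the GUID staying stable: an offer with an unfamiliar GUID matches nothing, and the corresponding old channel stays pending.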

Drivers: hv: vmbus: Suspend after cleaning up hv_sock and sub channels · b307b389

Committed by Dexuan Cui on Sep 05, 2019

Before suspend, Linux must make sure all the hv_sock channels have been
properly cleaned up, because a hv_sock connection cannot persist across
hibernation, and the user-space app must be properly notified of the
state change of the connection.

Before suspend, Linux also must make sure all the sub-channels have been
destroyed, i.e. the related channel structs of the sub-channels must be
properly removed, otherwise they would cause a conflict when the
sub-channels are recreated upon resume.

Add a counter to track such channels, and vmbus_bus_suspend() should wait
for the counter to drop to zero.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

b307b389
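
The counter scheme from b307b389 can be illustrated with C11 atomics. This is a single-threaded sketch with hypothetical names (`struct demo_connection`, `demo_*`); a plain flag stands in for whatever wait/wake primitive the driver actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical connection state: channels that must go away before suspend. */
struct demo_connection {
	atomic_int nr_chan_pending;	/* hv_sock and sub-channels still alive */
	bool ready_for_suspend;		/* stands in for a completion-style event */
};

void demo_connection_init(struct demo_connection *c, int nr_channels)
{
	atomic_init(&c->nr_chan_pending, nr_channels);
	c->ready_for_suspend = (nr_channels == 0);
}

/* Called once for each channel that has been fully cleaned up. */
void demo_channel_released(struct demo_connection *c)
{
	/* atomic_fetch_sub returns the old value; 1 means we just hit zero. */
	if (atomic_fetch_sub(&c->nr_chan_pending, 1) == 1)
		c->ready_for_suspend = true;
}

/* Suspend may proceed only after the counter has dropped to zero. */
bool demo_can_suspend(const struct demo_connection *c)
{
	return c->ready_for_suspend;
}

/* Example run: returns 1 if suspend may proceed after nr_released releases. */
int demo_suspend_scenario(int nr_channels, int nr_released)
{
	struct demo_connection c;

	demo_connection_init(&c, nr_channels);
	for (int i = 0; i < nr_released; i++)
		demo_channel_released(&c);
	return demo_can_suspend(&c) ? 1 : 0;
}
```

A real implementation would block the suspending task until the last release, rather than polling a flag as this sketch implies.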

Drivers: hv: vmbus: Clean up hv_sock channels by force upon suspend · 1f48dcf1

Committed by Dexuan Cui on Sep 05, 2019

Fake RESCIND_CHANNEL messages to clean up hv_sock channels by force for
hibernation. There is no better method to clean up the channels since
some of the channels may still be referenced by the userspace apps when
hibernation is triggered: in this case, with this patch, the "rescind"
fields of the channels are set, and the apps will thoroughly destroy
the channels after hibernation.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

1f48dcf1

Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation · f53335e3

Committed by Dexuan Cui on Sep 05, 2019

Before Linux enters hibernation, it sends the CHANNELMSG_UNLOAD message to
the host so all the offers are gone. After hibernation, Linux needs to
re-negotiate with the host using the same vmbus protocol version (which
was in use before hibernation), and ask the host to re-offer the vmbus
devices.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

f53335e3

Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation · 271b2224

Committed by Dexuan Cui on Sep 05, 2019

The high-level VSC drivers will implement device-specific callbacks.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

271b2224

Drivers: hv: vmbus: Suspend/resume the synic for hibernation · 63ecc6d2

Committed by Dexuan Cui on Sep 05, 2019

This is needed when we resume the old kernel from the "current" kernel.

Note: when hv_synic_suspend() and hv_synic_resume() run, all the
non-boot CPUs have been offlined, and interrupts are disabled on CPU0.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

63ecc6d2

19 Jul, 2019 1 commit

proc/sysctl: add shared variables for range check · eec4844f

Committed by Matteo Croce on Jul 18, 2019

In the sysctl code the proc_dointvec_minmax() function is often used to
validate that a user-supplied value lies within an allowed range.  This
function uses the extra1 and extra2 members of struct ctl_table as the
minimum and maximum allowed values.

Where sysctl handlers are declared, each source file defines some
read-only variables that hold a single integer and whose addresses are
assigned to the extra1 and extra2 members, so that the range is enforced.

The special values 0, 1 and INT_MAX are very often used as range
boundaries, leading to duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:

    $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
    248

Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.

This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:

    # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
    add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
    Data                                         old     new   delta
    sysctl_vals                                    -      12     +12
    __kstrtab_sysctl_vals                          -      12     +12
    max                                           14      10      -4
    int_max                                       16       -     -16
    one                                           68       -     -68
    zero                                         128      28    -100
    Total: Before=20583249, After=20583085, chg -0.00%

[mcroce@redhat.com: tipc: remove two unused variables]
  Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
  Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

eec4844f
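
The shared-array idea from eec4844f can be sketched in user-space C. All `demo_`/`DEMO_` names below are hypothetical stand-ins, and `demo_set_minmax()` is only a simplified model of a min/max range check, not the kernel's proc_dointvec_minmax().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* One shared constant array instead of per-file zero/one/int_max copies. */
const int demo_sysctl_vals[] = { 0, 1, INT_MAX };

/* Convenience macros that point at the matching array member. */
#define DEMO_SYSCTL_ZERO    ((void *)&demo_sysctl_vals[0])
#define DEMO_SYSCTL_ONE     ((void *)&demo_sysctl_vals[1])
#define DEMO_SYSCTL_INT_MAX ((void *)&demo_sysctl_vals[2])

/* Hypothetical, reduced table entry: target plus optional bounds. */
struct demo_ctl_table {
	int *data;
	void *extra1;	/* points at the minimum, or NULL for no lower bound */
	void *extra2;	/* points at the maximum, or NULL for no upper bound */
};

/* Store val if it lies within [*extra1, *extra2]; return -1 otherwise. */
int demo_set_minmax(const struct demo_ctl_table *t, int val)
{
	const int *min = t->extra1;
	const int *max = t->extra2;

	if ((min && val < *min) || (max && val > *max))
		return -1;
	*t->data = val;
	return 0;
}

/* Example: a boolean-style knob restricted to the range 0..1. */
int demo_try_flag(int val)
{
	static int flag;
	const struct demo_ctl_table t = {
		.data = &flag,
		.extra1 = DEMO_SYSCTL_ZERO,
		.extra2 = DEMO_SYSCTL_ONE,
	};

	return demo_set_minmax(&t, val);
}
```

Every table that needs 0, 1 or INT_MAX as a bound can point into the same array, which is where the size reduction in the bloat-o-meter output comes from.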

03 Jul, 2019 1 commit

clocksource/drivers: Make Hyper-V clocksource ISA agnostic · fd1fea68

Committed by Michael Kelley on Jul 01, 2019

Hyper-V clock/timer code and data structures are currently mixed
in with other code in the ISA independent drivers/hv directory as
well as the ISA dependent Hyper-V code under arch/x86.

Consolidate this code and data structures into a Hyper-V clocksource driver
to better follow the Linux model. In doing so, separate out the ISA
dependent portions so the new clocksource driver works for x86 and for the
in-process Hyper-V on ARM64 code.

To start, move the existing clockevents code to create the new clocksource
driver. Update the VMbus driver to call initialization and cleanup routines
since the Hyper-V synthetic timers are not independently enumerated in
ACPI.

No behavior is changed and no new functionality is added.
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "bp@alien8.de" <bp@alien8.de>
Cc: "will.deacon@arm.com" <will.deacon@arm.com>
Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com>
Cc: "mark.rutland@arm.com" <mark.rutland@arm.com>
Cc: "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>
Cc: "olaf@aepfle.de" <olaf@aepfle.de>
Cc: "apw@canonical.com" <apw@canonical.com>
Cc: "jasowang@redhat.com" <jasowang@redhat.com>
Cc: "marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com>
Cc: Sunil Muthuswamy <sunilmut@microsoft.com>
Cc: KY Srinivasan <kys@microsoft.com>
Cc: "sashal@kernel.org" <sashal@kernel.org>
Cc: "vincenzo.frascino@arm.com" <vincenzo.frascino@arm.com>
Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Cc: "linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>
Cc: "linux-kselftest@vger.kernel.org" <linux-kselftest@vger.kernel.org>
Cc: "arnd@arndb.de" <arnd@arndb.de>
Cc: "linux@armlinux.org.uk" <linux@armlinux.org.uk>
Cc: "ralf@linux-mips.org" <ralf@linux-mips.org>
Cc: "paul.burton@mips.com" <paul.burton@mips.com>
Cc: "daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>
Cc: "salyzyn@android.com" <salyzyn@android.com>
Cc: "pcc@google.com" <pcc@google.com>
Cc: "shuah@kernel.org" <shuah@kernel.org>
Cc: "0x7f454c46@gmail.com" <0x7f454c46@gmail.com>
Cc: "linux@rasmusvillemoes.dk" <linux@rasmusvillemoes.dk>
Cc: "huw@codeweavers.com" <huw@codeweavers.com>
Cc: "sfr@canb.auug.org.au" <sfr@canb.auug.org.au>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>
Cc: "rkrcmar@redhat.com" <rkrcmar@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>
Link: https://lkml.kernel.org/r/1561955054-1838-2-git-send-email-mikelley@microsoft.com

fd1fea68

openeuler / Kernel: synced successfully more than 1 year ago