Commits · 87e93d61708fe2c44875d1ecdb174aad070dbd08 · openanolis / cloud-kernel

02 Mar, 2015 · 27 commits

Drivers: hv: vmbus: Support an API to send pagebuffers with additional control · 87e93d61

Authored by K. Y. Srinivasan on Feb 28, 2015

Implement an API for sending pagebuffers that gives more control to the client
in terms of setting the vmbus flags as well as deciding when to
notify the host. This will be useful for enabling batch processing.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

87e93d61

Drivers: hv: vmbus: Use a round-robin algorithm for picking the outgoing channel · a13e8bbe

Authored by K. Y. Srinivasan on Feb 28, 2015

The current algorithm for picking an outgoing channel was not distributing
the load well. Implement a simple round-robin scheme to ensure good
distribution of the outgoing traffic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a13e8bbe
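
The round-robin idea can be sketched in user-space C as follows. This is only an illustration: `chan_set` and `pick_outgoing` are hypothetical names, not identifiers from the vmbus driver, and the patch itself remains the authority on how the driver picks a channel.

```c
#include <stddef.h>

/* Hypothetical model: a set of channels addressed by index 0..count-1. */
struct chan_set {
    size_t count;   /* number of usable channels, must be > 0 */
    size_t next;    /* index that the next call will return */
};

/* Return the index of the channel to use and advance the cursor,
 * so consecutive calls cycle through every channel in order. */
size_t pick_outgoing(struct chan_set *cs)
{
    size_t picked = cs->next;

    cs->next = (cs->next + 1) % cs->count;
    return picked;
}
```

With three channels, successive calls yield indices 0, 1, 2 and then wrap back to 0, so no single channel absorbs all outgoing traffic.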

Drivers: hv: vmbus: Add support for VMBus panic notifier handler · 96c1d058

Authored by Nick Meier on Feb 28, 2015

Hyper-V allows a guest to notify the Hyper-V host that a panic
condition occurred. This notification can include up to five 64-bit
values. These 64-bit values are written into crash MSRs.
Once the data has been written into the crash MSRs, the host is
then notified by writing into a Crash Control MSR. On the Hyper-V
host, the panic notification data is captured in the Windows Event
log as an 18590 event.

Crash MSRs are defined in appendix H of the Hypervisor Top Level
Functional Specification. At the time of this patch, v4.0 is the
current functional spec. The URL for the v4.0 document is:

http://download.microsoft.com/download/A/B/4/AB43A34E-BDD0-4FA6-BDEF-79EEF16E880B/Hypervisor Top Level Functional Specification v4.0.docx
Signed-off-by: Nick Meier <nmeier@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

96c1d058
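
The write order described in this commit message (parameters first, control register last) can be modelled in plain C. This is a sketch under stated assumptions: `crash_regs` is a hypothetical in-memory stand-in for the crash MSRs, `report_panic` is an invented name, and the `CRASH_NOTIFY` bit value is illustrative; the functional specification cited in the message is the authority on the real register layout.

```c
#include <stdint.h>

#define CRASH_PARAMS 5               /* commit message: up to five 64-bit values */
#define CRASH_NOTIFY (1ULL << 63)    /* assumed notify bit, for illustration only */

/* Hypothetical stand-in for the crash MSRs: five parameter registers
 * plus one control register. */
struct crash_regs {
    uint64_t param[CRASH_PARAMS];
    uint64_t ctl;
};

/* Store the parameters first, then write the control register, mirroring
 * the order in the commit message. Returns how many values were stored;
 * values beyond the fifth are ignored. */
int report_panic(struct crash_regs *regs, const uint64_t *vals, int n)
{
    int i;

    if (n < 0)
        n = 0;
    if (n > CRASH_PARAMS)
        n = CRASH_PARAMS;
    for (i = 0; i < n; i++)
        regs->param[i] = vals[i];
    regs->ctl = CRASH_NOTIFY;   /* notify only after the data is in place */
    return n;
}
```

Writing the control register last matters because the host may read the parameter registers as soon as it sees the notification.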

Drivers: hv: hv_balloon: refuse to balloon below the floor · 530d15b9

Authored by Vitaly Kuznetsov on Feb 28, 2015

When the host asks us to balloon up we need to be sure we're not committing suicide
by overballooning. Use the already existing 'floor' metric as our lowest possible
value for free RAM.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

530d15b9
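
One way to picture a floor check is the clamp below. It is a simplified sketch, not the hv_balloon code: `pages_safe_to_balloon` is a hypothetical helper, all arguments are page counts, and the real driver may apply further constraints that are not modelled here.

```c
/* Hypothetical helper: how many pages may be handed to the host without
 * letting free memory drop below `floor`. All values are page counts. */
unsigned long pages_safe_to_balloon(unsigned long free_pages,
                                    unsigned long floor,
                                    unsigned long requested)
{
    unsigned long spare;

    if (free_pages <= floor)
        return 0;                   /* already at or below the floor: refuse */
    spare = free_pages - floor;
    return requested < spare ? requested : spare;
}
```

For example, with 1000 free pages and a floor of 200, a request for 900 pages is trimmed to 800, while a request for 500 pages is honoured in full.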

Drivers: hv: hv_balloon: report offline pages as being used · 549fd280

Authored by Vitaly Kuznetsov on Feb 28, 2015

When hot-added memory pages are not brought online or when some memory blocks
are sent offline the subsequent ballooning process kills the guest with the OOM
killer. This happens as we report these pages as neither used nor free
and apparently the host algorithm considers them as being unused. Keep track of
all online/offline operations and report all currently offline pages as being
used so the host won't try to balloon them out.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

549fd280

Drivers: hv: hv_balloon: eliminate the trylock path in acquire/release_region_mutex · b05d8d9e

Authored by Vitaly Kuznetsov on Feb 28, 2015

When many memory regions are being added and automatically onlined the
following lockup is sometimes observed:

INFO: task udevd:1872 blocked for more than 120 seconds.
...
Call Trace:
 [<ffffffff816ec0bc>] schedule_timeout+0x22c/0x350
 [<ffffffff816eb98f>] wait_for_common+0x10f/0x160
 [<ffffffff81067650>] ? default_wake_function+0x0/0x20
 [<ffffffff816eb9fd>] wait_for_completion+0x1d/0x20
 [<ffffffff8144cb9c>] hv_memory_notifier+0xdc/0x120
 [<ffffffff816f298c>] notifier_call_chain+0x4c/0x70
...

When several memory blocks are going online simultaneously we get several
hv_memory_notifier() calls trying to acquire the ha_region_mutex. When this mutex is
being held by hot_add_req() all these competing acquire_region_mutex() calls do
mutex_trylock, fail, and queue themselves into wait_for_completion(..). However,
when we do complete() from release_region_mutex() only one of them wakes up.
This could be solved by changing complete() -> complete_all(), but memory onlining
can be delayed as well; in that case we can still get several
hv_memory_notifier() runners at the same time trying to grab the mutex.
Only one of them will succeed and the others will hang forever as
complete() is not being called. We don't see this issue often because we have
a 5-second onlining timeout in hv_mem_hot_add() and usually all udev events arrive
in this time frame.

Get rid of the trylock path; waiting on the mutex is supposed to provide the
required serialization.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b05d8d9e

Drivers: hv: vmbus: Get rid of some unnecessary messages · 37f492ce

Authored by K. Y. Srinivasan on Feb 28, 2015

Currently we log messages when either we are not able to map an ID to a
channel or when the channel does not have a callback associated
(in the channel interrupt handling path). These messages don't add
any value; get rid of them.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

37f492ce

Drivers: hv: util: On device remove, close the channel after de-initializing the service · 5380b383

Authored by K. Y. Srinivasan on Feb 28, 2015

When the offer is rescinded, vmbus_close() can free up the channel;
deinitialize the service before closing the channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5380b383

Drivers: hv: vmbus: Remove the channel from the channel list(s) on failure · 5b1e5b53

Authored by K. Y. Srinivasan on Feb 28, 2015

Properly roll back state in vmbus_process_offer() in the failure paths.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5b1e5b53

Drivers: hv: vmbus: Handle both rescind and offer messages in the same context · 2dd37cb8

Authored by K. Y. Srinivasan on Feb 28, 2015

Execute both rescind and offer messages in the same work context. This serializes these
operations and naturally addresses the various corner cases.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2dd37cb8

Drivers: hv: vmbus: Introduce a function to remove a rescinded offer · ed6cfcc5

Authored by K. Y. Srinivasan on Feb 28, 2015

In response to a rescind message, we need to remove the channel and the
corresponding device. Clean up this code path by factoring out the code
to remove a channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ed6cfcc5

Drivers: hv: vmbus: Properly handle child device remove · d15a0301

Authored by K. Y. Srinivasan on Feb 28, 2015

Handle the case when the device may be removed when the device has no driver
attached to it.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d15a0301

Drivers: hv: vmbus: Add support for the NetworkDirect GUID · 04653a00

Authored by K. Y. Srinivasan on Feb 27, 2015

NetworkDirect is a service that supports guest RDMA.
Define the GUID for this service.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

04653a00

Drivers: hv: vmbus: Fix a bug in the error path in vmbus_open() · 40384e4b

Authored by K. Y. Srinivasan on Feb 27, 2015

Correctly roll back state if the failure occurs after we have handed over
the ownership of the buffer to the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

40384e4b

hv: hv_balloon: match var type to return type of wait_for_completion · b057b3ad

Authored by Nicholas Mc Guire on Feb 27, 2015

The return type of wait_for_completion_timeout() is unsigned long, not int; this
patch changes the type of t from int to unsigned long.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b057b3ad

hv: channel_mgmt: match var type to return type of wait_for_completion · 51e5181d

Authored by Nicholas Mc Guire on Feb 27, 2015

The return type of wait_for_completion_timeout() is unsigned long, not int; this
patch changes the type of t from int to unsigned long.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

51e5181d

hv: channel: match var type to return type of wait_for_completion · 08a9513f

Authored by Nicholas Mc Guire on Feb 27, 2015

The return type of wait_for_completion_timeout() is unsigned long, not int; this
patch changes the type of t from int to unsigned long.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

08a9513f

hv: vmbus_open(): reset the channel state on ENOMEM · ac0d12b7

Authored by Dexuan Cui on Feb 27, 2015

Without this patch, the state is put to CHANNEL_OPENING_STATE, and when
the driver is loaded next time, vmbus_open() will fail immediately due to
newchannel->state != CHANNEL_OPEN_STATE.

CC: "K. Y. Srinivasan" <kys@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ac0d12b7
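
The failure mode can be reproduced with a small state-machine model. This is a sketch, not the driver: `struct chan`, `chan_open` and the `fail_alloc` switch are hypothetical, and only the two state names quoted in the commit message are taken from the source (`CHANNEL_OPENED_STATE` is added here to complete the model).

```c
#include <errno.h>
#include <stdlib.h>

enum chan_state {
    CHANNEL_OPEN_STATE,      /* ready to be opened */
    CHANNEL_OPENING_STATE,   /* open in progress */
    CHANNEL_OPENED_STATE     /* open completed (added for this model) */
};

struct chan {
    enum chan_state state;
    void *ring;
};

/* Hypothetical open routine; `fail_alloc` simulates an out-of-memory case. */
int chan_open(struct chan *c, size_t ring_bytes, int fail_alloc)
{
    if (c->state != CHANNEL_OPEN_STATE)
        return -EINVAL;                 /* not ready to be opened */
    c->state = CHANNEL_OPENING_STATE;

    c->ring = fail_alloc ? NULL : malloc(ring_bytes);
    if (!c->ring) {
        c->state = CHANNEL_OPEN_STATE;  /* roll back so a later open can succeed */
        return -ENOMEM;
    }
    c->state = CHANNEL_OPENED_STATE;
    return 0;
}
```

Without the rollback line, a failed allocation would leave the channel in `CHANNEL_OPENING_STATE` and every later call would be rejected by the first check.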

hv: vmbus_post_msg: retry the hypercall on some transient errors · 89f9f679

Authored by Dexuan Cui on Feb 27, 2015

I got HV_STATUS_INVALID_CONNECTION_ID on Hyper-V 2008 R2 when repeatedly running
"rmmod hv_netvsc; modprobe hv_netvsc; rmmod hv_utils; modprobe hv_utils"
in a Linux guest. It looks like the host has some kind of throttling mechanism if
some kinds of hypercalls are sent too frequently.
Without the patch, the driver can occasionally fail to load.

Also let's retry HV_STATUS_INSUFFICIENT_MEMORY, though we didn't get it
before.

Removed 'case -ENOMEM', since the hypervisor doesn't return this.

CC: "K. Y. Srinivasan" <kys@microsoft.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

89f9f679
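
A retry loop of the kind described above might look like the following user-space sketch. All names are hypothetical (`hv_status_model`, `post_with_retry`, `flaky_post`), the enum values are illustrative rather than the real hypervisor status codes, and the retry count and back-off used by the actual driver are not modelled.

```c
enum hv_status_model {               /* illustrative values, not real status codes */
    ST_SUCCESS,
    ST_INVALID_CONNECTION_ID,
    ST_INSUFFICIENT_MEMORY,
    ST_OTHER_ERROR
};

typedef enum hv_status_model (*post_fn)(void *ctx);

/* Call `post` until it succeeds, fails permanently, or `max_tries` is used up.
 * Returns 0 on success, -1 on failure. */
int post_with_retry(post_fn post, void *ctx, int max_tries)
{
    int i;

    for (i = 0; i < max_tries; i++) {
        switch (post(ctx)) {
        case ST_SUCCESS:
            return 0;
        case ST_INVALID_CONNECTION_ID:   /* treated as transient, per the message */
        case ST_INSUFFICIENT_MEMORY:
            break;                       /* a real driver would sleep briefly here */
        default:
            return -1;                   /* not worth retrying */
        }
    }
    return -1;
}

/* Test double: reports a transient error until the counter in ctx reaches zero. */
enum hv_status_model flaky_post(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return ST_INVALID_CONNECTION_ID;
    }
    return ST_SUCCESS;
}
```

Bounding the number of attempts keeps a persistently failing host from stalling the caller indefinitely.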

hv: hv_util: move vmbus_open() to a later place · 18965663

Authored by Dexuan Cui on Feb 27, 2015

Before vmbus_open() returns, srv->util_cb can already be running,
and the variables, such as util_fw_version, are needed by srv->util_cb.

So we have to make sure the variables are initialized before vmbus_open() is called.

CC: "K. Y. Srinivasan" <kys@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

18965663

Drivers: hv: vmbus: Teardown clockevent devices on module unload · e086748c

Authored by Vitaly Kuznetsov on Feb 27, 2015

Newly introduced clockevent devices made it impossible to unload the hv_vmbus
module as clockevents_config_and_register() takes an additional reference to
the module. To make it possible again we do the following:
- avoid setting dev->owner for clockevent devices;
- implement hv_synic_clockevents_cleanup() doing clockevents_unbind_device();
- call it from vmbus_exit().

In theory hv_synic_clockevents_cleanup() can be merged with hv_synic_cleanup(),
however, we call hv_synic_cleanup() from smp_call_function_single() and this
doesn't work for clockevents_unbind_device() as it makes such a call on its own. I
opted for a separate function.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e086748c

clockevents: export clockevents_unbind_device instead of clockevents_unbind · 32a15832

Authored by Vitaly Kuznetsov on Feb 27, 2015

It looks like clockevents_unbind is being exported by mistake as:
- it is static;
- it is not listed in include/linux/clockchips.h;
- EXPORT_SYMBOL_GPL(clockevents_unbind) follows clockevents_unbind_device()
  implementation.

I think clockevents_unbind_device should be exported instead. This is going to
be used to teardown Hyper-V clockevent devices on module unload.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

32a15832

drivers: hv: vmbus: Teardown synthetic interrupt controllers on module unload · e72e7ac5

Authored by Vitaly Kuznetsov on Feb 27, 2015

SynIC has to be switched off when we unload the module, otherwise registered
memory pages can get corrupted afterwards (as the Hyper-V host still writes there) and
we see the following crashes for random processes:

[   89.116774] BUG: Bad page map in process sh  pte:4989c716 pmd:36f81067
[   89.159454] addr:0000000000437000 vm_flags:00000875 anon_vma:          (null) mapping:ffff88007bba55a0 index:37
[   89.226146] vma->vm_ops->fault: filemap_fault+0x0/0x410
[   89.257776] vma->vm_file->f_op->mmap: generic_file_mmap+0x0/0x60
[   89.297570] CPU: 0 PID: 215 Comm: sh Tainted: G    B          3.19.0-rc5_bug923184+ #488
[   89.353738] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   89.409138]  0000000000000000 000000004e083d7b ffff880036e9fa18 ffffffff81a68d31
[   89.468724]  0000000000000000 0000000000437000 ffff880036e9fa68 ffffffff811a1e3a
[   89.519233]  000000004989c716 0000000000000037 ffffea0001edc340 0000000000437000
[   89.575751] Call Trace:
[   89.591060]  [<ffffffff81a68d31>] dump_stack+0x45/0x57
[   89.625164]  [<ffffffff811a1e3a>] print_bad_pte+0x1aa/0x250
[   89.667234]  [<ffffffff811a2c95>] vm_normal_page+0x55/0xa0
[   89.703818]  [<ffffffff811a3105>] unmap_page_range+0x425/0x8a0
[   89.737982]  [<ffffffff811a3601>] unmap_single_vma+0x81/0xf0
[   89.780385]  [<ffffffff81184320>] ? lru_deactivate_fn+0x190/0x190
[   89.820130]  [<ffffffff811a4131>] unmap_vmas+0x51/0xa0
[   89.860168]  [<ffffffff811ad12c>] exit_mmap+0xac/0x1a0
[   89.890588]  [<ffffffff810763c3>] mmput+0x63/0x100
[   89.919205]  [<ffffffff811eba48>] flush_old_exec+0x3f8/0x8b0
[   89.962135]  [<ffffffff8123b5bb>] load_elf_binary+0x32b/0x1260
[   89.998581]  [<ffffffff811a14f2>] ? get_user_pages+0x52/0x60

The hv_synic_cleanup() function exists but no one calls it now. Do the following:
- call hv_synic_cleanup() on each cpu from vmbus_exit();
- write global disable bit through MSR;
- use hv_synic_free_cpu() to avoid memory leaks and code duplication.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e72e7ac5

Drivers: hv: vmbus: teardown hv_vmbus_con workqueue and vmbus_connection pages on shutdown · 09a19628

Authored by Vitaly Kuznetsov on Feb 27, 2015

We need to destroy hv_vmbus_con on module shutdown, otherwise the following
crash is sometimes observed:

[   76.569845] hv_vmbus: Hyper-V Host Build:9600-6.3-17-0.17039; Vmbus version:3.0
[   82.598859] BUG: unable to handle kernel paging request at ffffffffa0003480
[   82.599287] IP: [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] PGD 1f34067 PUD 1f35063 PMD 3f72d067 PTE 0
[   82.599287] Oops: 0010 [#1] SMP
[   82.599287] Modules linked in: [last unloaded: hv_vmbus]
[   82.599287] CPU: 0 PID: 26 Comm: kworker/0:1 Not tainted 3.19.0-rc5_bug923184+ #488
[   82.599287] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   82.599287] Workqueue: hv_vmbus_con 0xffffffffa0003480
[   82.599287] task: ffff88007b6ddfa0 ti: ffff88007f8f8000 task.ti: ffff88007f8f8000
[   82.599287] RIP: 0010:[<ffffffffa0003480>]  [<ffffffffa0003480>] 0xffffffffa0003480
[   82.599287] RSP: 0018:ffff88007f8fbe00  EFLAGS: 00010202
...

To avoid memory leaks we need to free monitor_pages and int_page for
vmbus_connection. Implement vmbus_disconnect() function by separating cleanup
path from vmbus_connect().

As we use hv_vmbus_con to release channels (see free_channel() in channel_mgmt.c)
we need to make sure the work is done before we remove the queue; do that with
drain_workqueue(). We also need to avoid handling messages which can (potentially)
create new channels, so set vmbus_connection.conn_state = DISCONNECTED at the very
beginning of vmbus_exit() and check for that in vmbus_onmessage_work().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

09a19628

Drivers: hv: vmbus: avoid double kfree for device_obj · adcde069

Authored by Vitaly Kuznetsov on Feb 27, 2015

On driver shutdown device_obj is being freed twice:
1) In vmbus_free_channels()
2) vmbus_device_release() (which is being triggered by device_unregister() in
   vmbus_device_unregister()).
This double kfree leads to the following sporadic crash on driver unload:

[   23.469876] general protection fault: 0000 [#1] SMP
[   23.470036] Modules linked in: hv_vmbus(-)
[   23.470036] CPU: 2 PID: 213 Comm: rmmod Not tainted 3.19.0-rc5_bug923184+ #488
[   23.470036] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[   23.470036] task: ffff880036ef1cb0 ti: ffff880036ce8000 task.ti: ffff880036ce8000
[   23.470036] RIP: 0010:[<ffffffff811d2e1b>]  [<ffffffff811d2e1b>] __kmalloc_node_track_caller+0xdb/0x1e0
[   23.470036] RSP: 0018:ffff880036cebcc8  EFLAGS: 00010246
...

When this crash does not happen on driver unload, a similar one is expected if
we try to load hv_vmbus again.

Remove kfree from vmbus_free_channels() as freeing it from
vmbus_device_release() seems right.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

adcde069
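
The single-owner rule behind this fix can be illustrated with a reference-counted object whose release path is the only place that frees it. This is a generic sketch with hypothetical names (`dev_model`, `dev_model_new`, `dev_model_put`); it does not reproduce the kernel's device model API.

```c
#include <stdlib.h>

/* Hypothetical refcounted object; the release path is the only place
 * that frees it. */
struct dev_model {
    int refs;
    int *release_count;   /* lets a caller observe how often release ran */
};

struct dev_model *dev_model_new(int *release_count)
{
    struct dev_model *d = malloc(sizeof(*d));

    if (d) {
        d->refs = 1;
        d->release_count = release_count;
    }
    return d;
}

/* Drop one reference; the object is freed exactly once, when the last
 * reference goes away. Callers must not free() the object themselves. */
void dev_model_put(struct dev_model *d)
{
    if (--d->refs == 0) {
        (*d->release_count)++;
        free(d);
    }
}
```

If a second code path also called `free()` on the same pointer, the allocator's bookkeeping would be corrupted, which matches the kind of sporadic crash in the allocator shown in the trace above.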

Drivers: hv: vmbus: rename channel work queues · bc63b6f6

Authored by Vitaly Kuznetsov on Feb 27, 2015

All channel work queues are named 'hv_vmbus_ctl', this makes them
indistinguishable in ps output and makes it hard to link to the corresponding
vmbus device. Rename them to hv_vmbus_ctl/N and make vmbus device names match,
e.g. now vmbus_1 device is served by hv_vmbus_ctl/1 work queue.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bc63b6f6

Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors · e513229b

Authored by Vitaly Kuznetsov on Feb 27, 2015

When an SMP Hyper-V guest is running on top of 2012R2 Server and secondary
cpus are sent offline (with echo 0 > /sys/devices/system/cpu/cpu$cpu/online)
a system freeze is observed. This happens due to the fact that on newer
hypervisors (Win8, WS2012R2, ...) vmbus channel handlers are distributed
across all cpus (see init_vp_index() function in drivers/hv/channel_mgmt.c)
and on cpu offlining nobody reassigns them to CPU0. Prevent cpu offlining
when vmbus is loaded until the issue is fixed host-side.

This patch also disables hibernation but it is OK as it is also broken (MCE
error is hit on resume). Suspend still works.

Tested with WS2008R2 and WS2012R2.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e513229b

23 Feb, 2015 · 13 commits

Linux 4.0-rc1 · c517d838

Authored by Linus Torvalds on Feb 22, 2015

.. after extensive statistical analysis of my G+ polling, I've come to
the inescapable conclusion that internet polls are bad.

Big surprise.

But "Hurr durr I'ma sheep" trounced "I like online polls" by a 62-to-38%
margin, in a poll that people weren't even supposed to participate in.
Who can argue with solid numbers like that? 5,796 votes from people who
can't even follow the most basic directions?

In contrast, "v4.0" beat out "v3.20" by a slimmer margin of 56-to-44%,
but with a total of 29,110 votes right now.

Now, arguably, that vote spread is only about 3,200 votes, which is less
than the almost six thousand votes that the "please ignore" poll got, so
it could be considered noise.

But hey, I asked, so I'll honor the votes.

c517d838

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · feaf2229

Authored by Linus Torvalds on Feb 22, 2015

Pull ext4 fixes from Ted Ts'o:
 "Ext4 bug fixes.

  We also reserved code points for encryption and read-only images (for
  which the implementation is mostly just the reserved code point for a
  read-only feature :-)"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix indirect punch hole corruption
  ext4: ignore journal checksum on remount; don't fail
  ext4: remove duplicate remount check for JOURNAL_CHECKSUM change
  ext4: fix mmap data corruption in nodelalloc mode when blocksize < pagesize
  ext4: support read-only images
  ext4: change to use setup_timer() instead of init_timer()
  ext4: reserve codepoints used by the ext4 encryption feature
  jbd2: complain about descriptor block checksum errors

feaf2229

Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · be5e6616

Authored by Linus Torvalds on Feb 22, 2015

Pull more vfs updates from Al Viro:
 "Assorted stuff from this cycle.  The big ones here are multilayer
  overlayfs from Miklos and beginning of sorting ->d_inode accesses out
  from David"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits)
  autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
  procfs: fix race between symlink removals and traversals
  debugfs: leave freeing a symlink body until inode eviction
  Documentation/filesystems/Locking: ->get_sb() is long gone
  trylock_super(): replacement for grab_super_passive()
  fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
  SELinux: Use d_is_positive() rather than testing dentry->d_inode
  Smack: Use d_is_positive() rather than testing dentry->d_inode
  TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
  Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
  Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
  VFS: Split DCACHE_FILE_TYPE into regular and special types
  VFS: Add a fallthrough flag for marking virtual dentries
  VFS: Add a whiteout dentry type
  VFS: Introduce inode-getting helpers for layered/unioned fs environments
  Infiniband: Fix potential NULL d_inode dereference
  posix_acl: fix reference leaks in posix_acl_create
  autofs4: Wrong format for printing dentry
  ...

be5e6616

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 90c453ca

Authored by Linus Torvalds on Feb 22, 2015

Pull ARM fix from Russell King:
 "Just one fix this time around.  __iommu_alloc_buffer() can cause a
  BUG() if dma_alloc_coherent() is called with either __GFP_DMA32 or
  __GFP_HIGHMEM set.  The patch from Alexandre addresses this"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8305/1: DMA: Fix kzalloc flags in __iommu_alloc_buffer()

90c453ca

autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation · 0a280962

Authored by Al Viro on Feb 21, 2015

X-Coverup: just ask spender
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0a280962

procfs: fix race between symlink removals and traversals · 7e0e953b

Authored by Al Viro on Feb 21, 2015

use_pde()/unuse_pde() in ->follow_link()/->put_link() resp.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7e0e953b

debugfs: leave freeing a symlink body until inode eviction · 0db59e59

Authored by Al Viro on Feb 21, 2015

As it is, we have debugfs_remove() racing with symlink traversals.
Supply ->evict_inode() and do freeing there - inode will remain
pinned until we are done with the symlink body.

And rip the idiocy with checking if dentry is positive right after
we'd verified debugfs_positive(), which is a stronger check...

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0db59e59

Documentation/filesystems/Locking: ->get_sb() is long gone · dca11178

Authored by Al Viro on Feb 21, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dca11178

trylock_super(): replacement for grab_super_passive() · eb6ef3df

Authored by Konstantin Khlebnikov on Feb 19, 2015

I've noticed significant locking contention in memory reclaimer around
sb_lock inside grab_super_passive(). Grab_super_passive() is called from
two places: in icache/dcache shrinkers (function super_cache_scan) and
from writeback (function __writeback_inodes_wb). Both are required for
progress in memory allocator.

Grab_super_passive() acquires sb_lock to increment sb->s_count and check
sb->s_instances. It seems sb->s_umount locked for read is enough here:
super-block deactivation always runs under sb->s_umount locked for write.
Protecting super-block itself isn't a problem: in super_cache_scan() sb
is protected by shrinker_rwsem: it cannot be freed if its slab shrinkers
are still active. Inside writeback super-block comes from inode from bdi
writeback list under wb->list_lock.

This patch removes locking sb_lock and checks s_instances under s_umount:
generic_shutdown_super() unlinks it under sb->s_umount locked for write.
New variant is called trylock_super() and since it only locks semaphore,
callers must call up_read(&sb->s_umount) instead of drop_super(sb) when
they're done.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

eb6ef3df

fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions · 54f2a2f4

Authored by David Howells on Jan 29, 2015

Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup()
rather than d_is_dir() when checking a dir watch and give an error on fake
directories.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

54f2a2f4

Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions · ce40fa78

Authored by David Howells on Jan 29, 2015

Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack
thereof) in cachefiles:

 (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as
     it doesn't want to deal with automounts in its cache.

 (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in
     cachefiles.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ce40fa78

VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) · e36cb0b8

Authored by David Howells on Jan 29, 2015

Convert the following where appropriate:

 (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

 (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

 (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry).  This is actually more
     complicated than it appears as some calls should be converted to
     d_can_lookup() instead.  The difference is whether the directory in
     question is a real dir with a ->lookup op or whether it's a fake dir with
     a ->d_automount op.

In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).

Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer.  In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.

However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.

There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE.  Strictly, this was
intended for special directory entry types that don't have attached inodes.

The following perl+coccinelle script was used:

use strict;

my @callers;
my $fd;    # declared here because 'use strict' rejects an undeclared $fd
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
    die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
    print "No matches\n";
    exit(0);
}

my @cocci = (
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISLNK(E->d_inode->i_mode)',
    '+ d_is_symlink(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISDIR(E->d_inode->i_mode)',
    '+ d_is_dir(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISREG(E->d_inode->i_mode)',
    '+ d_is_reg(E)' );

my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);

foreach my $file (@callers) {
    chomp $file;
    print "Processing ", $file, "\n";
    system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
	die "spatch failed";
}

[AV: overlayfs parts skipped]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e36cb0b8
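
The distinction drawn in point (3) of this commit message, between a real directory with a ->lookup op and a fake directory with a ->d_automount op, can be modelled as below. This is a simplified user-space sketch: the `dtype_model` enum and the `m_*` helpers are hypothetical stand-ins, not the kernel's dentry flags or its d_is_*() helpers.

```c
#include <stdbool.h>

/* Hypothetical dentry type tags for this model. */
enum dtype_model { T_MISS, T_DIRECTORY, T_AUTODIR, T_REGULAR, T_SYMLINK };

struct dentry_model {
    enum dtype_model type;
};

/* True only for a real directory, i.e. one that can be looked up in. */
bool m_can_lookup(const struct dentry_model *d)
{
    return d->type == T_DIRECTORY;
}

/* True for a fake directory that only acts as an automount point. */
bool m_is_autodir(const struct dentry_model *d)
{
    return d->type == T_AUTODIR;
}

/* True for either kind of directory. */
bool m_is_dir(const struct dentry_model *d)
{
    return m_can_lookup(d) || m_is_autodir(d);
}
```

A caller that must not step into automount points, as described for fanotify and cachefiles in the neighbouring commits, would use the stricter `m_can_lookup()` check rather than `m_is_dir()`.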

SELinux: Use d_is_positive() rather than testing dentry->d_inode · 2c616d4d

Authored by David Howells on Jan 29, 2015

Use d_is_positive() rather than testing dentry->d_inode in SELinux to get rid
of direct references to d_inode outside of the VFS.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2c616d4d

openanolis / cloud-kernel · synced successfully about 1 year ago
