- 25 10月, 2015 1 次提交
-
-
由 Sai Praneeth 提交于
When CONFIG_DEBUG_VIRTUAL is enabled, all accesses to __pa(address) are monitored to see whether address falls in direct mapping or kernel text mapping (see Documentation/x86/x86_64/mm.txt for details), if it does not, the kernel panics. During 1:1 mapping of EFI runtime services we access virtual addresses which are == physical addresses, thus the 1:1 mapping and these addresses do not fall in either of the above two regions and hence when passed as arguments to __pa() kernel panics as reported by Dave Hansen here https://lkml.kernel.org/r/5462999A.7090706@intel.com. So, before calling __pa() virtual addresses should be validated which results in skipping call to split_page_count() and that should be fine because it is used to keep track of everything *but* 1:1 mappings. Signed-off-by: NSai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Reported-by: NDave Hansen <dave.hansen@intel.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Glenn P Williamson <glenn.p.williamson@intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
-
- 23 9月, 2015 4 次提交
-
-
由 Toshi Kani 提交于
try_preserve_large_page() checks if new_prot is the same as old_prot. If so, it simply sets do_split to 0, and returns with no-operation. However, old_prot is set as a 4KB pgprot value while new_prot is a large page pgprot value. Now that old_prot is initially set from p?d_pgprot() as a large page pgprot value, fix it by not overwriting old_prot with a 4KB pgprot value. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Robert Elliot <elliott@hpe.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1442514264-12475-12-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Toshi Kani 提交于
__split_large_page() is called from __change_page_attr() to change the mapping attribute by splitting a given large page into smaller pages. This function uses pte_pfn() and pte_pgprot() for PUD/PMD, which do not handle the large PAT bit properly. Fix __split_large_page() by using the corresponding pud/pmd pfn/ pgprot interfaces. Also remove '#ifdef CONFIG_X86_64', which is not necessary. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Robert Elliot <elliott@hpe.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1442514264-12475-11-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Toshi Kani 提交于
try_preserve_large_page() is called from __change_page_attr() to change the mapping attribute of a given large page. This function uses pte_pfn() and pte_pgprot() for PUD/PMD, which do not handle the large PAT bit properly. Fix try_preserve_large_page() by using the corresponding pud/pmd prot/pfn interfaces. Also remove '#ifdef CONFIG_X86_64', which is not necessary. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Robert Elliot <elliott@hpe.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1442514264-12475-10-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Toshi Kani 提交于
slow_virt_to_phys() calls lookup_address() to obtain *pte and its level. It then calls pte_pfn() to obtain a physical address for any level. However, this physical address is not correct when the large PAT bit is set because pte_pfn() does not mask the large PAT bit properly for PUD/PMD. Fix slow_virt_to_phys() to use pud_pfn() and pmd_pfn() for 1GB and 2MB mapping levels. Signed-off-by: NToshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Robert Elliot <elliott@hpe.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1442514264-12475-8-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 25 8月, 2015 1 次提交
-
-
由 Paul Gortmaker 提交于
The file pageattr.c is obj-y and it includes pageattr-test.c based on CPA_DEBUG (a bool), meaning that no code here is currently being built as a module by anyone. Lets remove the couple traces of modularity so that when reading the code there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1440459295-21814-3-git-send-email-paul.gortmaker@windriver.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 07 6月, 2015 2 次提交
-
-
由 Toshi Kani 提交于
Now that reserve_ram_pages_type() accepts the WT type, add set_memory_wt(), set_memory_array_wt() and set_pages_array_wt() in order to be able to set memory to Write-Through page cache mode. Also, extend ioremap_change_attr() to accept the WT type. Signed-off-by: NToshi Kani <toshi.kani@hp.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Elliott@hp.com Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arnd@arndb.de Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: jgross@suse.com Cc: konrad.wilk@oracle.com Cc: linux-mm <linux-mm@kvack.org> Cc: linux-nvdimm@lists.01.org Cc: stefan.bader@canonical.com Cc: yigal@plexistor.com Link: http://lkml.kernel.org/r/1433436928-31903-13-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Borislav Petkov 提交于
Now that we emulate a PAT table when PAT is disabled, there's no need for those checks anymore as the PAT abstraction will handle those cases too. Based on a conglomerate patch from Toshi Kani. Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NToshi Kani <toshi.kani@hp.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Elliott@hp.com Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arnd@arndb.de Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: jgross@suse.com Cc: konrad.wilk@oracle.com Cc: linux-mm <linux-mm@kvack.org> Cc: linux-nvdimm@lists.01.org Cc: stefan.bader@canonical.com Cc: yigal@plexistor.com Link: http://lkml.kernel.org/r/1433436928-31903-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 03 6月, 2015 1 次提交
-
-
由 Stephen Rothwell 提交于
Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so remove it from there and fix up the resulting build problems triggered on x86 {64|32}-bit {def|allmod|allno}configs. The breakages were triggering in places where x86 builds relied on vmalloc() facilities but did not include <linux/vmalloc.h> explicitly and relied on the implicit inclusion via <asm/io.h>. Also add: - <linux/init.h> to <linux/io.h> - <asm/pgtable_types> to <asm/io.h> ... which were two other implicit header file dependencies. Suggested-by: NDavid Miller <davem@davemloft.net> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> [ Tidied up the changelog. ] Acked-by: NDavid Miller <davem@davemloft.net> Acked-by: NTakashi Iwai <tiwai@suse.de> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Acked-by: NVinod Koul <vinod.koul@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Colin Cross <ccross@android.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: James E.J. Bottomley <JBottomley@odin.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Kees Cook <keescook@chromium.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Kristen Carlson Accardi <kristen@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Suma Ramars <sramars@cisco.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 5月, 2015 1 次提交
-
-
由 Luis R. Rodriguez 提交于
We use pat_enabled in x86-specific code to see if PAT is enabled or not but we're granting full access to it even though readers do not need to set it. If, for instance, we granted access to it to modules later they then could override the variable setting... no bueno. This renames pat_enabled to a new static variable __pat_enabled. Folks are redirected to use pat_enabled() now. Code that sets this can only be internal to pat.c. Apart from the early kernel parameter "nopat" to disable PAT, we also have a few cases that disable it later and make use of a helper pat_disable(). It is wrapped under an ifdef but since that code cannot run unless PAT was enabled its not required to wrap it with ifdefs, unwrap that. Likewise, since "nopat" doesn't really change non-PAT systems just remove that ifdef as well. Although we could add and use an early_param_off(), these helpers don't use __read_mostly but we want to keep __read_mostly for __pat_enabled as this is a hot path -- upon boot, for instance, a simple guest may see ~4k accesses to pat_enabled(). Since __read_mostly early boot params are not that common we don't add a helper for them just yet. Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Walls <awalls@md.metrocast.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Doug Ledford <dledford@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kyle McMartin <kyle@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1430425520-22275-3-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1432628901-18044-13-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 11 5月, 2015 3 次提交
-
-
由 Dexuan Cui 提交于
The patch doesn't change any logic. Signed-off-by: NDexuan Cui <decui@microsoft.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1429776428-4475-1-git-send-email-decui@microsoft.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Luis R. Rodriguez 提交于
ioremap_nocache() currently uses UC- by default. Our goal is to eventually make UC the default. Linux maps UC- to PCD=1, PWT=0 page attributes on non-PAT systems. Linux maps UC to PCD=1, PWT=1 page attributes on non-PAT systems. On non-PAT and PAT systems a WC MTRR has different effects on pages with either of these attributes. In order to help with a smooth transition its best to enable use of UC (PCD,1, PWT=1) on a region as that ensures a WC MTRR will have no effect on a region, this however requires us to have an way to declare a region as UC and we currently do not have a way to do this. WC MTRR on non-PAT system with PCD=1, PWT=0 (UC-) yields WC. WC MTRR on non-PAT system with PCD=1, PWT=1 (UC) yields UC. WC MTRR on PAT system with PCD=1, PWT=0 (UC-) yields WC. WC MTRR on PAT system with PCD=1, PWT=1 (UC) yields UC. A flip of the default ioremap_nocache() behaviour from UC- to UC can therefore regress a memory region from effective memory type WC to UC if MTRRs are used. Use of MTRRs should be phased out and in the best case only arch_phys_wc_add() use will remain, even if this happens arch_phys_wc_add() will have an effect on non-PAT systems and changes to default ioremap_nocache() behaviour could regress drivers. Now, ideally we'd use ioremap_nocache() on the regions in which we'd need uncachable memory types and avoid any MTRRs on those regions. There are however some restrictions on MTRRs use, such as the requirement of having the base and size of variable sized MTRRs to be powers of two, which could mean having to use a WC MTRR over a large area which includes a region in which write-combining effects are undesirable. Add ioremap_uc() to help with the both phasing out of MTRR use and also provide a way to blacklist small WC undesirable regions in devices with mixed regions which are size-implicated to use large WC MTRRs. Use of ioremap_uc() helps phase out MTRR use by avoiding regressions with an eventual flip of default behaviour or ioremap_nocache() from UC- to UC. Drivers working with WC MTRRs can use the below table to review and consider the use of ioremap*() and similar helpers to ensure appropriate behaviour long term even if default ioremap_nocache() behaviour changes from UC- to UC. Although ioremap_uc() is being added we leave set_memory_uc() to use UC- as only initial memory type setup is required to be able to accommodate existing device drivers and phase out MTRR use. It should also be clarified that set_memory_uc() cannot be used with IO memory, even though its use will not return any errors, it really has no effect. ---------------------------------------------------------------------- MTRR Non-PAT PAT Linux ioremap value Effective memory type ---------------------------------------------------------------------- Non-PAT | PAT PAT |PCD ||PWT ||| WC 000 WB _PAGE_CACHE_MODE_WB WC | WC WC 001 WC _PAGE_CACHE_MODE_WC WC* | WC WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS WC* | WC WC 011 UC _PAGE_CACHE_MODE_UC UC | UC ---------------------------------------------------------------------- Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Antonino Daplas <adaplas@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Travis <travis@sgi.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thierry Reding <treding@nvidia.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Ville Syrjälä <syrjala@sci.fi> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-fbdev@vger.kernel.org Link: http://lkml.kernel.org/r/1430343851-967-2-git-send-email-mcgrof@do-not-panic.com Link: http://lkml.kernel.org/r/1431332153-18566-9-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ross Zwisler 提交于
The current algorithm used in clflush_cache_range() can cause the last cache line of the buffer to be flushed twice. Fix that algorithm so that each cache line will only be flushed once. Reported-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/1430259192-18802-1-git-send-email-ross.zwisler@linux.intel.com Link: http://lkml.kernel.org/r/1431332153-18566-5-git-send-email-bp@alien8.de [ Changed it to 'void *' to simplify the type conversions. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 05 3月, 2015 1 次提交
-
-
由 Luis R. Rodriguez 提交于
direct_gbpages can be force enabled as an early parameter but not really have taken effect when DEBUG_PAGEALLOC or KMEMCHECK is enabled. You can also enable direct_gbpages right now if you have an x86_64 architecture but your CPU doesn't really have support for this feature. In both cases PG_LEVEL_1G won't actually be enabled but direct_gbpages is used in other areas under the assumptions that PG_LEVEL_1G was set. Fix this by putting together all requirements which make this feature sensible to enable under, and only enable both finally flipping on PG_LEVEL_1G and leaving PG_LEVEL_1G set when this is true. We only enable this feature then to be possible on sensible builds defined by the new ENABLE_DIRECT_GBPAGES. If the CPU has support for it you can either enable this by using the DIRECT_GBPAGES option or using the early kernel parameter. If a platform had support for this you can always force disable it as well. Signed-off-by: NLuis R. Rodriguez <mcgrof@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: JBeulich@suse.com Cc: Jan Beulich <JBeulich@suse.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Lindgren <tony@atomide.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: julia.lawall@lip6.fr Link: http://lkml.kernel.org/r/1425518654-3403-3-git-send-email-mcgrof@do-not-panic.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 28 2月, 2015 1 次提交
-
-
由 Daniel Borkmann 提交于
This effectively unexports set_memory_ro() and set_memory_rw() functions, and thus reverts: a03352d2 ("x86: export set_memory_ro and set_memory_rw"). They have been introduced for debugging purposes in e1000e, but no module user is in mainline kernel (anymore?) and we explicitly do not want modules to use these functions, as they i.e. protect eBPF (interpreted & JIT'ed) images from malicious modifications or bugs. Outside of eBPF scope, I believe also other set_memory_*() functions should be unexported on x86 for modules. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@plumgrid.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Link: http://lkml.kernel.org/r/a064393a0a5d319eebde5c761cfd743132d4f213.1425040940.git.daniel@iogearbox.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 14 12月, 2014 1 次提交
-
-
由 Joonsoo Kim 提交于
Now, we have prepared to avoid using debug-pagealloc in boottime. So introduce new kernel-parameter to disable debug-pagealloc in boottime, and makes related functions to be disabled in this case. Only non-intuitive part is change of guard page functions. Because guard page is effective only if debug-pagealloc is enabled, turning off according to debug-pagealloc is reasonable thing to do. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Dave Hansen <dave@sr71.net> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Jungsoo Son <jungsoo.son@lge.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 12月, 2014 1 次提交
-
-
由 Juergen Gross 提交于
Introduces lookup_pmd_address() to get the address of the pmd entry related to a virtual address in the current address space. This function is needed for support of a virtual mapped sparse p2m list in xen pv domains, as we need the address of the pmd entry, not the one of the pte in that case. Signed-off-by: NJuergen Gross <jgross@suse.com> Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
-
- 16 11月, 2014 4 次提交
-
-
由 Juergen Gross 提交于
The PAT bit in the ptes is not moved to the correct position when copying page protection attributes between entries of different sized pages. Translate the ptes according to their page size. Based-on-patch-by: NStefan Bader <stefan.bader@canonical.com> Signed-off-by: NJuergen Gross <jgross@suse.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-17-git-send-email-jgross@suse.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Juergen Gross 提交于
Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: NStefan Bader <stefan.bader@canonical.com> Signed-off-by: NJuergen Gross <jgross@suse.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-14-git-send-email-jgross@suse.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Juergen Gross 提交于
Instead of directly using the cache mode bits in the pte switch to using the cache mode type in the functions for modifying page attributes. Based-on-patch-by: NStefan Bader <stefan.bader@canonical.com> Signed-off-by: NJuergen Gross <jgross@suse.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-12-git-send-email-jgross@suse.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Juergen Gross 提交于
When modifying page attributes via change_page_attr_set_clr() don't test for setting _PAGE_PAT_LARGE, as this is - never done - PAT support for large pages is not included in the kernel up to now Signed-off-by: NJuergen Gross <jgross@suse.com> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-11-git-send-email-jgross@suse.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 29 10月, 2014 1 次提交
-
-
由 Dexuan Cui 提交于
pte_pfn() returns a PFN of long (32 bits in 32-PAE), so "long << PAGE_SHIFT" will overflow for PFNs above 4GB. Due to this issue, some Linux 32-PAE distros, running as guests on Hyper-V, with 5GB memory assigned, can't load the netvsc driver successfully and hence the synthetic network device can't work (we can use the kernel parameter mem=3000M to work around the issue). Cast pte_pfn() to phys_addr_t before shifting. Fixes: "commit d7656534: x86, mm: Create slow_virt_to_phys()" Signed-off-by: NDexuan Cui <decui@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: gregkh@linuxfoundation.org Cc: linux-mm@kvack.org Cc: olaf@aepfle.de Cc: apw@canonical.com Cc: jasowang@redhat.com Cc: dave.hansen@intel.com Cc: riel@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1414580017-27444-1-git-send-email-decui@microsoft.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 14 3月, 2014 1 次提交
-
-
由 Borislav Petkov 提交于
It is WBINVD, for INValiDate and not "wbindv". Use caps for instruction names, while at it. Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1394633584-5509-4-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 05 3月, 2014 2 次提交
-
-
由 Matt Fleming 提交于
Now that we have EFI-specific page tables we need to lookup the pgd when dumping those page tables, rather than assuming that swapper_pgdir is the current pgdir. Remove the double underscore prefix, which is usually reserved for static functions. Acked-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
We will use it in efi so expose it. Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NToshi Kani <toshi.kani@hp.com> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
- 28 2月, 2014 1 次提交
-
-
由 Ross Zwisler 提交于
If clflushopt is available on the system, use it instead of clflush in clflush_cache_range. Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com> Link: http://lkml.kernel.org/r/1393441612-19729-3-git-send-email-ross.zwisler@linux.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 02 11月, 2013 8 次提交
-
-
由 Borislav Petkov 提交于
Add the ability to map pages in an arbitrary pgd. This wires in the remaining stuff so that there's a new interface with which you can map a region into an arbitrary PGD. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
We try to free the pagetable pages once we've unmapped our portion. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
In case we encounter an error during the mapping of a region, we want to unwind what we've established so far exactly the way we did the mapping. This is the PUD part kept deliberately small for easier review. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
Handle last level by unconditionally writing the PTEs into the PTE page while paying attention to the NX bit. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
Handle PMD-level mappings the same as PUD ones. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
Add the next level of the pagetable populating function, we handle chunks around a 1G boundary by mapping them with the lower level functions - otherwise we use 1G pages for the mappings, thus using as less amount of pagetable pages as possible. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
This allocates, if necessary, and populates the corresponding PGD entry with a PUD page. The next population level is a dummy macro which will be removed by the next patch and it is added here to keep the patch small and easily reviewable but not break bisection, at the same time. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
由 Borislav Petkov 提交于
This is preparatory work in order to be able to map pages into a specified PGD and not implicitly and only into init_mm. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
-
- 12 4月, 2013 1 次提交
-
-
由 Boris Ostrovsky 提交于
When CONFIG_DEBUG_PAGEALLOC is set page table updates made by kernel_map_pages() are not made visible (via TLB flush) immediately if lazy MMU is on. In environments that support lazy MMU (e.g. Xen) this may lead to fatal page faults, for example, when zap_pte_range() needs to allocate pages in __tlb_remove_page() -> tlb_next_batch(). Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Cc: konrad.wilk@oracle.com Link: http://lkml.kernel.org/r/1365703192-2089-1-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 11 4月, 2013 1 次提交
-
-
由 Andrea Arcangeli 提交于
Commit: a8aed3e0 ("x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge") introduced a valid fix but one location that didn't trigger the bug that lead to finding those (small) problems, wasn't updated using the right variable. The wrong variable was also initialized for no good reason, that may have been the source of the confusion. Remove the noop initialization accordingly. Commit a8aed3e0 also erroneously removed one canon_pgprot pass meant to clear pmd bitflags not supported in hardware by older CPUs, that automatically gets corrected by this patch too by applying it to the right variable in the new location. Reported-by: NStefan Bader <stefan.bader@canonical.com> Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Acked-by: NBorislav Petkov <bp@alien8.de> Cc: Andy Whitcroft <apw@canonical.com> Cc: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/r/1365600505-19314-1-git-send-email-aarcange@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 10 4月, 2013 1 次提交
-
-
由 Borislav Petkov 提交于
So basically we're generating the pte_t * from a struct page and we're handing it down to the __split_large_page() internal version which then goes and gets back struct page * from it because it needs it. Change the caller to hand down struct page * directly and the callee can compute the pte_t itself. Net save is one virt_to_page() call and simpler code. While at it, make __split_large_page() static. Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NThomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1363886217-24703-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 24 2月, 2013 2 次提交
-
-
由 Andrea Arcangeli 提交于
Without this patch any kernel code that reads kernel memory in non present kernel pte/pmds (as set by pageattr.c) will crash. With this kernel code: static struct page *crash_page; static unsigned long *crash_address; [..] crash_page = alloc_pages(GFP_KERNEL, 9); crash_address = page_address(crash_page); if (set_memory_np((unsigned long)crash_address, 1)) printk("set_memory_np failure\n"); [..] The kernel will crash if inside the "crash tool" one would try to read the memory at the not present address. crash> p crash_address crash_address = $8 = (long unsigned int *) 0xffff88023c000000 crash> rd 0xffff88023c000000 [ *lockup* ] The lockup happens because _PAGE_GLOBAL and _PAGE_PROTNONE shares the same bit, and pageattr leaves _PAGE_GLOBAL set on a kernel pte which is then mistaken as _PAGE_PROTNONE (so pte_present returns true by mistake and the kernel fault then gets confused and loops). With THP the same can happen after we taught pmd_present to check _PAGE_PROTNONE and _PAGE_PSE in commit 027ef6c8 ("mm: thp: fix pmd_present for split_huge_page and PROT_NONE with THP"). THP has the same problem with _PAGE_GLOBAL as the 4k pages, but it also has a problem with _PAGE_PSE, which must be cleared too. After the patch is applied copy_user correctly returns -EFAULT and doesn't lockup anymore. crash> p crash_address crash_address = $9 = (long unsigned int *) 0xffff88023c000000 crash> rd 0xffff88023c000000 rd: read error: kernel virtual address: ffff88023c000000 type: "64-bit KVADDR" Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Shaohua Li <shaohua.li@intel.com> Cc: "H. Peter Anvin" <hpa@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Wen Congyang 提交于
When memory is removed, the corresponding pagetables should alse be removed. This patch introduces some common APIs to support vmemmap pagetable and x86_64 architecture direct mapping pagetable removing. All pages of virtual mapping in removed memory cannot be freed if some pages used as PGD/PUD include not only removed memory but also other memory. So this patch uses the following way to check whether a page can be freed or not. 1) When removing memory, the page structs of the removed memory are filled with 0FD. 2) All page structs are filled with 0xFD on PT/PMD, PT/PMD can be cleared. In this case, the page used as PT/PMD can be freed. For direct mapping pages, update direct_pages_count[level] when we freed their pagetables. And do not free the pages again because they were freed when offlining. For vmemmap pages, free the pages and their pagetables. For larger pages, do not split them into smaller ones because there is no way to know if the larger page has been split. As a result, there is no way to decide when to split. We deal the larger pages in the following way: 1) For direct mapped pages, all the pages were freed when they were offlined. And since menmory offline is done section by section, all the memory ranges being removed are aligned to PAGE_SIZE. So only need to deal with unaligned pages when freeing vmemmap pages. 2) For vmemmap pages being used to store page_struct, if part of the larger page is still in use, just fill the unused part with 0xFD. And when the whole page is fulfilled with 0xFD, then free the larger page. [akpm@linux-foundation.org: fix typo in comment] [tangchen@cn.fujitsu.com: do not calculate direct mapping pages when freeing vmemmap pagetables] [tangchen@cn.fujitsu.com: do not free direct mapping pages twice] [tangchen@cn.fujitsu.com: do not free page split from hugepage one by one] [tangchen@cn.fujitsu.com: do not split pages when freeing pagetable pages] [akpm@linux-foundation.org: use pmd_page_vaddr()] [akpm@linux-foundation.org: fix used-uninitialised bug] Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: NJianguo Wu <wujianguo@huawei.com> Signed-off-by: NWen Congyang <wency@cn.fujitsu.com> Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 1月, 2013 1 次提交
-
-
由 Dave Hansen 提交于
This is necessary because __pa() does not work on some kinds of memory, like vmalloc() or the alloc_remap() areas on 32-bit NUMA systems. We have some functions to do conversions _like_ this in the vmalloc() code (like vmalloc_to_page()), but they do not work on sizes other than 4k pages. We would potentially need to be able to handle all the page sizes that we use for the kernel linear mapping (4k, 2M, 1G). In practice, on 32-bit NUMA systems, the percpu areas get stuck in the alloc_remap() area. Any __pa() call on them will break and basically return garbage. This patch introduces a new function slow_virt_to_phys(), which walks the kernel page tables on x86 and should do precisely the same logical thing as __pa(), but actually work on a wider range of memory. It should work on the normal linear mapping, vmalloc(), kmap(), etc... Signed-off-by: NDave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130122212433.4D1FCA62@kernel.stglabs.ibm.comAcked-by: NRik van Riel <riel@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-