1. 30 Apr 2013, 15 commits
    • sparse-vmemmap: specify vmemmap population range in bytes · 0aad818b
      Authored by Johannes Weiner
      The sparse code, when asking the architecture to populate the vmemmap,
      specifies the section range as a starting page and a number of pages.
      
      This is an awkward interface, because none of the arch-specific code
      actually thinks of the range in terms of 'struct page' units and always
      translates it to bytes first.
      
      In addition, later patches mix huge page and regular page backing for
      the vmemmap.  For this, they need to call vmemmap_populate_basepages()
      on sub-section ranges with PAGE_SIZE and PMD_SIZE in mind.  But these
      are not necessarily multiples of the 'struct page' size and so this unit
      is too coarse.
      
      Just translate the section range into bytes once in the generic sparse
      code, then pass byte ranges down the stack.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Bernhard Schmidt <Bernhard.Schmidt@lrz.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Tested-by: David S. Miller <davem@davemloft.net>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0aad818b
    • mm, hugetlb: include hugepages in meminfo · 949f7ec5
      Authored by David Rientjes
      Particularly in oom conditions, it's troublesome that hugetlb memory is
      not displayed.  All other meminfo that is emitted will not add up to
      what is expected, and there is no artifact left in the kernel log to
      show that a potentially significant amount of memory is actually
      allocated as hugepages which are not available to be reclaimed.
      
      Booting with hugepages=8192 on the command line, this memory is now
      shown in oom conditions.  For example, with echo m >
      /proc/sysrq-trigger:
      
        Node 0 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 hugepages_size=2048kB
        Node 1 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 hugepages_size=2048kB
        Node 2 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 hugepages_size=2048kB
        Node 3 hugepages_total=2048 hugepages_free=2048 hugepages_surp=0 hugepages_size=2048kB
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      949f7ec5
    • kexec, vmalloc: export additional vmalloc layer information · 13ba3fcb
      Authored by Atsushi Kumagai
      vmap_area_list is now exported as VMCOREINFO so that makedumpfile can
      get the start address of the vmalloc region (vmalloc_start).  The
      address that holds the vmalloc_start value is computed as:
      
        vmap_area_list.next - OFFSET(vmap_area.list) + OFFSET(vmap_area.va_start)
      
      However, neither OFFSET(vmap_area.va_start) nor OFFSET(vmap_area.list)
      is exported as VMCOREINFO.
      
      So this patch exports both of them, along with a small cleanup.
      
      [akpm@linux-foundation.org: vmalloc.h should include list.h for list_head]
      Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Dave Anderson <anderson@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      13ba3fcb
    • mm, vmalloc: export vmap_area_list, instead of vmlist · f1c4069e
      Authored by Joonsoo Kim
      Although our intention is to unexport this internal structure entirely,
      there is one exception for kexec: kexec dumps the address of vmlist,
      and makedumpfile uses that information.
      
      We are about to remove vmlist, so makedumpfile needs another way to
      retrieve information about the vmalloc layer.  For this purpose, export
      vmap_area_list instead of vmlist.
      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Dave Anderson <anderson@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f1c4069e
    • mm, vmalloc: move get_vmalloc_info() to vmalloc.c · db3808c1
      Authored by Joonsoo Kim
      get_vmalloc_info() currently lives in fs/proc/mmu.c.  There is no
      reason this code must be there: its implementation needs vmlist_lock
      and iterates over vmlist, which should be internal data structures of
      the vmalloc subsystem.
      
      For maintainability, it is preferable that vmlist_lock and vmlist be
      used only in vmalloc.c, so move the code there.
      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Dave Anderson <anderson@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      db3808c1
    • mm: make snapshotting pages for stable writes a per-bio operation · 71368511
      Authored by Darrick J. Wong
      Walking a bio's page mappings has proved problematic, so create a new
      bio flag to indicate that a bio's data needs to be snapshotted in order
      to guarantee stable pages during writeback.  Next, for the one user
      (ext3/jbd) of snapshotting, hook all the places where writes can be
      initiated without PG_writeback set, and set BIO_SNAP_STABLE there.
      
      We must also flag journal "metadata" bios for stable writeout, since
      file data can be written through the journal.  Finally, the
      MS_SNAP_STABLE mount flag (only used by ext3) is now superfluous, so get
      rid of it.
      
      [akpm@linux-foundation.org: rename _submit_bh()'s `flags' to `bio_flags', delobotomize the _submit_bh declaration]
      [akpm@linux-foundation.org: teeny cleanup]
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Artem Bityutskiy <dedekind1@gmail.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      71368511
    • fs: don't compile in drop_caches.c when CONFIG_SYSCTL=n · 146732ce
      Authored by Josh Triplett
      drop_caches.c provides code only invokable via sysctl, so don't compile it
      in when CONFIG_SYSCTL=n.
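The change boils down to the usual Kbuild conditional-object pattern; a sketch of the idiom (the exact fs/Makefile line in the patch may differ):

```makefile
# Build drop_caches.o only when sysctl support is configured in.
obj-$(CONFIG_SYSCTL) += drop_caches.o
```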
      Signed-off-by: Josh Triplett <josh@joshtriplett.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      146732ce
    • cgroup: remove css_get_next · 6d2488f6
      Authored by Michal Hocko
      Now that we have generic and well-ordered cgroup tree walkers, there is
      no need to keep css_get_next around.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Ying Han <yinghan@google.com>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6d2488f6
    • mm: introduce free_highmem_page() helper to free highmem pages into buddy system · cfa11e08
      Authored by Jiang Liu
      The original goal of this patchset is to fix the bug reported by
      
        https://bugzilla.kernel.org/show_bug.cgi?id=53501
      
      Now it has also been expanded to reduce common code used by memory
      initialization.
      
      This is the second part, which applies to the previous part at:
        http://marc.info/?l=linux-mm&m=136289696323825&w=2
      
      It introduces a helper function free_highmem_page() to free highmem
      pages into the buddy system when initializing mm subsystem.
      Introducing free_highmem_page() is one step toward cleaning up accesses
      and modifications of totalhigh_pages, totalram_pages,
      zone->managed_pages, etc.  We hope to remove all references to
      totalhigh_pages from the arch/ subdirectory.
      
      We have only tested this patchset on x86 platforms, and have done basic
      compilation tests using cross-compilers from ftp.kernel.org.  That means
      some code may not compile on some architectures, so any help testing
      this patchset is welcome!
      
      There are several other parts still under development:
      Part3: refine code to manage totalram_pages, totalhigh_pages and
      	zone->managed_pages
      Part4: introduce helper functions to simplify mem_init() and remove the
      	global variable num_physpages.
      
      This patch:
      
      Introduce helper function free_highmem_page(), which will be used by
      architectures with HIGHMEM enabled to free highmem pages into the buddy
      system.
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Suzuki K. Poulose" <suzuki@in.ibm.com>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Attilio Rao <attilio.rao@citrix.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Cong Wang <amwang@redhat.com>
      Cc: David Daney <david.daney@cavium.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Reviewed-by: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cfa11e08
    • mm: introduce common help functions to deal with reserved/managed pages · 69afade7
      Authored by Jiang Liu
      The original goal of this patchset is to fix the bug reported by
      
        https://bugzilla.kernel.org/show_bug.cgi?id=53501
      
      Now it has also been expanded to reduce common code used by memory
      initialization.
      
      This is the first part, which applies to v3.9-rc1.
      
      It introduces the following common helper functions to simplify
      free_initmem() and free_initrd_mem() on different architectures:
      
      adjust_managed_page_count():
      	will be used to adjust totalram_pages, totalhigh_pages and
      	zone->managed_pages when reserving/unreserving a page.
      
      __free_reserved_page():
      	free a reserved page into the buddy system without adjusting
      	page statistics info
      
      free_reserved_page():
      	free a reserved page into the buddy system and adjust page
      	statistics info
      
      mark_page_reserved():
      	mark a page as reserved and adjust page statistics info
      
      free_reserved_area():
      	free a contiguous range of pages by calling free_reserved_page()
      
      free_initmem_default():
      	default method to free __init pages.
      
      We have only tested this patchset on x86 platforms, and have done basic
      compilation tests using cross-compilers from ftp.kernel.org.  That means
      some code may not compile on some architectures, so any help testing
      this patchset is welcome!
      
      There are several other parts still under development:
      Part2: introduce free_highmem_page() to simplify freeing highmem pages
      Part3: refine code to manage totalram_pages, totalhigh_pages and
      	zone->managed_pages
      Part4: introduce helper functions to simplify mem_init() and remove the
      	global variable num_physpages.
      
      This patch:
      
      Code to deal with reserved/managed pages is duplicated across many
      architectures, so introduce common helper functions to reduce the
      duplication.  These helpers will also be used to concentrate the code
      that modifies totalram_pages and zone->managed_pages, which makes the
      code much clearer.
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Anatolij Gustschin <agust@denx.de>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      69afade7
    • vm: adjust ifdef for TINY_RCU · 8375ad98
      Authored by Paul E. McKenney
      There is an ifdef in page_cache_get_speculative() that checks for !SMP
      and TREE_RCU, which has been an impossible combination since the advent
      of TINY_RCU.  The ifdef enables a fastpath that is valid when preemption
      is disabled by rcu_read_lock() in UP systems, which is the case when
      TINY_RCU is enabled.  This commit therefore adjusts the ifdef to
      generate the fastpath when TINY_RCU is enabled.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8375ad98
    • mm/shmem.c: remove an ifdef · 250297ed
      Authored by Andrew Morton
      Create a CONFIG_MMU=y stub for ramfs_nommu_expand_for_mapping() in the
      usual fashion.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Wolfram Sang <wsa@the-dreams.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      250297ed
    • mm, show_mem: suppress page counts in non-blockable contexts · 4b59e6c4
      Authored by David Rientjes
      On large systems with a lot of memory, walking all RAM to determine page
      types may take a half second or even more.
      
      In non-blockable contexts, the page allocator will emit a page allocation
      failure warning unless __GFP_NOWARN is specified.  In such contexts, irqs
      are typically disabled and such a lengthy delay may even result in NMI
      watchdog timeouts.
      
      To fix this, suppress the page walk in such contexts when printing the
      page allocation failure warning.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4b59e6c4
    • debug_locks.h: make warning more verbose · 2c2fea11
      Authored by James Hogan
      The WARN_ON(1) in DEBUG_LOCKS_WARN_ON is surprisingly awkward to track
      down when it's hit, as it's usually buried in macros, causing multiple
      instances to land on the same line number.
      
      This patch makes it more useful by switching to:
      
          WARN(1, "DEBUG_LOCKS_WARN_ON(%s)", #c);
      
      so that the particular DEBUG_LOCKS_WARN_ON is more easily identified and
      grep'd for.  For example:
      
          WARNING: at kernel/mutex.c:198 _mutex_lock_nested+0x31c/0x380()
          DEBUG_LOCKS_WARN_ON(l->magic != l)
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2c2fea11
    • drivers/video: add Hyper-V Synthetic Video Frame Buffer Driver · 68a2d20b
      Authored by Haiyang Zhang
      This is the driver for Hyper-V Synthetic Video, which supports screen
      resolutions up to Full HD 1920x1080 on a Windows Server 2012 host, and
      1600x1200 on Windows Server 2008 R2 or earlier.  It also fixes the
      double mouse cursor issue seen with the emulated video mode.
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Olaf Hering <olaf@aepfle.de>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      68a2d20b
  2. 24 Apr 2013, 1 commit
    • usb: phy: tegra: don't call into tegra-ehci directly · ee5d5499
      Authored by Arnd Bergmann
      Both phy-tegra-usb.c and ehci-tegra.c export symbols used by the other,
      which does not work if either or both of them are loadable modules,
      resulting in errors like:
      
      drivers/built-in.o: In function `utmi_phy_clk_disable':
      drivers/usb/phy/phy-tegra-usb.c:302: undefined reference to `tegra_ehci_set_phcd'
      drivers/built-in.o: In function `utmi_phy_clk_enable':
      drivers/usb/phy/phy-tegra-usb.c:324: undefined reference to `tegra_ehci_set_phcd'
      drivers/built-in.o: In function `utmi_phy_power_on':
      drivers/usb/phy/phy-tegra-usb.c:447: undefined reference to `tegra_ehci_set_pts'
      
      This turns the interface into a one-way dependency by letting the tegra ehci
      driver pass two function pointers for callbacks that need to be called by
      the phy driver.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Venu Byravarasu <vbyravarasu@nvidia.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee5d5499
  3. 23 Apr 2013, 3 commits
  4. 19 Apr 2013, 4 commits
    • pinctrl/pinconfig: add debug interface · f07512e6
      Authored by Laurent Meunier
      This update adds a debugfs interface to modify a pin configuration
      for a given state in the pinctrl map.  This allows modifying the
      configuration for a non-active state, typically the sleep state.
      The configuration is not applied right away, but only when the state
      is entered.
      
      This solution is mandated for us by HW validation: it allows testing
      and verifying several pin configurations during sleep without
      recompiling the software.
      
      Change log for this patch set: take into account the latest feedback
      from Stephen Warren:
      - update stale comments
      - improve code efficiency and readability
      - limit the size of the global variable pinconf_dbg_conf
      - remove req_type, as it can easily be added later when
        add/delete request support is implemented
      Signed-off-by: Laurent Meunier <laurent.meunier@st.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      f07512e6
    • mutex: Queue mutex spinners with MCS lock to reduce cacheline contention · 2bd2c92c
      Authored by Waiman Long
      The current mutex spinning code (with the MUTEX_SPIN_ON_OWNER option
      turned on) allows multiple tasks to spin on a single mutex
      concurrently.  A potential problem with the current approach is
      that when the mutex becomes available, all the spinning tasks
      will try to acquire the mutex more or less simultaneously.  As a
      result, there will be a lot of cacheline bouncing, especially on
      systems with a large number of CPUs.
      
      This patch tries to reduce this kind of contention by putting
      the mutex spinners into a queue so that only the first one in
      the queue will try to acquire the mutex. This will reduce
      contention and allow all the tasks to move forward faster.
      
      The queuing of mutex spinners is done using an MCS-lock-based
      implementation, which reduces contention on the mutex cacheline
      more than a comparable ticket-spinlock-based implementation would.
      This patch adds a new field to the mutex data structure for holding
      the MCS lock.  This expands the mutex size by 8 bytes on 64-bit
      systems and 4 bytes on 32-bit systems.  The overhead is avoided if
      the MUTEX_SPIN_ON_OWNER option is turned off.
      
      The following table shows the jobs per minute (JPM) scalability
      data on an 8-node 80-core Westmere box with a 3.7.10 kernel. The
      numactl command is used to restrict the running of the fserver
      workloads to 1/2/4/8 nodes with hyperthreading off.
      
      +-----------------+-----------+-----------+-------------+----------+
      |  Configuration  | Mean JPM  | Mean JPM  |  Mean JPM   | % Change |
      |                 | w/o patch | patch 1   | patches 1&2 |  1->1&2  |
      +-----------------+------------------------------------------------+
      |                 |              User Range 1100 - 2000            |
      +-----------------+------------------------------------------------+
      | 8 nodes, HT off |  227972   |  227237   |   305043    |  +34.2%  |
      | 4 nodes, HT off |  393503   |  381558   |   394650    |   +3.4%  |
      | 2 nodes, HT off |  334957   |  325240   |   338853    |   +4.2%  |
      | 1 node , HT off |  198141   |  197972   |   198075    |   +0.1%  |
      +-----------------+------------------------------------------------+
      |                 |              User Range 200 - 1000             |
      +-----------------+------------------------------------------------+
      | 8 nodes, HT off |  282325   |  312870   |   332185    |   +6.2%  |
      | 4 nodes, HT off |  390698   |  378279   |   393419    |   +4.0%  |
      | 2 nodes, HT off |  336986   |  326543   |   340260    |   +4.2%  |
      | 1 node , HT off |  197588   |  197622   |   197582    |    0.0%  |
      +-----------------+-----------+-----------+-------------+----------+
      
      At low user range 10-100, the JPM differences were within +/-1%.
      So they are not that interesting.
      
      The fserver workload uses mutex spinning extensively. With just
      the mutex change in the first patch, there is no noticeable
      change in performance.  Rather, there is a slight drop in
      performance. This mutex spinning patch more than recovers the
      lost performance and shows a significant increase of +30% at high
      user load with the full 8 nodes. Similar improvements were also
      seen in a 3.8 kernel.
      
      The table below shows the %time spent by different kernel
      functions as reported by perf when running the fserver workload
      at 1500 users with all 8 nodes.
      
      +-----------------------+-----------+---------+-------------+
      |        Function       |  % time   | % time  |   % time    |
      |                       | w/o patch | patch 1 | patches 1&2 |
      +-----------------------+-----------+---------+-------------+
      | __read_lock_failed    |  34.96%   | 34.91%  |   29.14%    |
      | __write_lock_failed   |  10.14%   | 10.68%  |    7.51%    |
      | mutex_spin_on_owner   |   3.62%   |  3.42%  |    2.33%    |
      | mspin_lock            |    N/A    |   N/A   |    9.90%    |
      | __mutex_lock_slowpath |   1.46%   |  0.81%  |    0.14%    |
      | _raw_spin_lock        |   2.25%   |  2.50%  |    1.10%    |
      +-----------------------+-----------+---------+-------------+
      
      The fserver workload for an 8-node system is dominated by the
      contention in the read/write lock. Mutex contention also plays a
      role. With the first patch only, mutex contention is down (as
      shown by the __mutex_lock_slowpath figure), which helps a little;
      we saw only a few percent improvement from that.
      
      By applying patch 2 as well, the single mutex_spin_on_owner
      figure is now split out into an additional mspin_lock figure.
      The time increases from 3.42% to 12.23%. It shows a great
      reduction in contention among the spinners leading to a 30%
      improvement. The time ratio 9.9/2.33=4.3 indicates that there
      are on average 4+ spinners waiting in the spin_lock loop for
      each spinner in the mutex_spin_on_owner loop. Contention in
      other locking functions also goes down by quite a lot.
      
      The table below shows the performance change of both patches 1 &
      2 over patch 1 alone in other AIM7 workloads (at 8 nodes,
      hyperthreading off).
      
      +--------------+---------------+----------------+-----------------+
      |   Workload   | mean % change | mean % change  | mean % change   |
      |              | 10-100 users  | 200-1000 users | 1100-2000 users |
      +--------------+---------------+----------------+-----------------+
      | alltests     |      0.0%     |     -0.8%      |     +0.6%       |
      | five_sec     |     -0.3%     |     +0.8%      |     +0.8%       |
      | high_systime |     +0.4%     |     +2.4%      |     +2.1%       |
      | new_fserver  |     +0.1%     |    +14.1%      |    +34.2%       |
      | shared       |     -0.5%     |     -0.3%      |     -0.4%       |
      | short        |     -1.7%     |     -9.8%      |     -8.3%       |
      +--------------+---------------+----------------+-----------------+
      
      The short workload is the only one that shows a decline in
      performance probably due to the spinner locking and queuing
      overhead.
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Reviewed-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chandramouleeswaran Aswin <aswin@hp.com>
      Cc: Norton Scott J <scott.norton@hp.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1366226594-5506-4-git-send-email-Waiman.Long@hp.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2bd2c92c
    • mutex: Move mutex spinning code from sched/core.c back to mutex.c · 41fcb9f2
      Authored by Waiman Long
      As mentioned by Ingo, the SCHED_FEAT_OWNER_SPIN scheduler feature
      bit was really just an early hack to make mutex spinning testable
      with and without the feature, so it is no longer necessary.
      
      This patch removes the SCHED_FEAT_OWNER_SPIN feature bit and moves
      the mutex spinning code from kernel/sched/core.c back to
      kernel/mutex.c, which is where it belongs.
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chandramouleeswaran Aswin <aswin@hp.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Cc: Norton Scott J <scott.norton@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366226594-5506-2-git-send-email-Waiman.Long@hp.com
Signed-off-by: NIngo Molnar <mingo@kernel.org>
      41fcb9f2
    • L
      Revert "block: add missing block_bio_complete() tracepoint" · 0a82a8d1
Linus Torvalds authored
      This reverts commit 3a366e61.
      
      Wanlong Gao reports that it causes a kernel panic on his machine several
      minutes after boot. Reverting it removes the panic.
      
      Jens says:
       "It's not quite clear why that is yet, so I think we should just revert
        the commit for 3.9 final (which I'm assuming is pretty close).
      
        The wifi is crap at the LSF hotel, so sending this email instead of
        queueing up a revert and pull request."
      Reported-by: NWanlong Gao <gaowanlong@cn.fujitsu.com>
      Requested-by: NJens Axboe <axboe@kernel.dk>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0a82a8d1
5. 18 April 2013, 3 commits
6. 17 April 2013, 2 commits
    • L
      vm: add vm_iomap_memory() helper function · b4cbb197
Linus Torvalds authored
      Various drivers end up replicating the code to mmap() their memory
      buffers into user space, and our core memory remapping function may be
      very flexible but it is unnecessarily complicated for the common cases
      to use.
      
      Our internal VM uses pfn's ("page frame numbers") which simplifies
      things for the VM, and allows us to pass physical addresses around in a
      denser and more efficient format than passing a "phys_addr_t" around,
      and having to shift it up and down by the page size.  But it just means
      that drivers end up doing that shifting instead at the interface level.
      
      It also means that drivers end up mucking around with internal VM things
      like the vma details (vm_pgoff, vm_start/end) way more than they really
      need to.
      
      So this just exports a function to map a certain physical memory range
      into user space (using a phys_addr_t based interface that is much more
      natural for a driver) and hides all the complexity from the driver.
      Some drivers will still end up tweaking the vm_page_prot details for
      things like prefetching or cacheability etc, but that's actually
      relevant to the driver, rather than caring about what the page offset of
      the mapping is into the particular IO memory region.
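
With the helper exported, a driver's mmap() handler collapses to a one-liner. A minimal sketch of such a handler, where `mydrv_mmap`, `phys_base`, and `region_len` are hypothetical names used for illustration (only `vm_iomap_memory()` itself comes from this commit):

```c
/* Hedged sketch of a driver mmap() built on the new helper.
 * mydrv_mmap, phys_base and region_len are hypothetical; the
 * helper internally converts the phys_addr_t range to pfns,
 * checks that the vma fits inside the region, and remaps it.
 */
static int mydrv_mmap(struct file *file, struct vm_area_struct *vma)
{
	/* Map the device's memory region into the caller's vma. */
	return vm_iomap_memory(vma, phys_base, region_len);
}
```

The driver no longer touches vm_pgoff or does pfn shifting itself; it hands the helper a physical base and a length and lets the core VM validate the request.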
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b4cbb197
    • M
      PCI/ACPI: Remove support of ACPI PCI subdrivers · c309dbb4
Myron Stowe authored
      Both sub-drivers of the "PCI Root Bridge ("pci_bridge")" driver, "acpiphp"
      and "pci_slot", have been converted to hook directly into the PCI core.
      
      With the conversions there are no remaining usages of the 'struct
      acpi_pci_driver' list based infrastructure.  This patch removes it.
      Signed-off-by: NMyron Stowe <myron.stowe@redhat.com>
      Signed-off-by: NJiang Liu <jiang.liu@huawei.com>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: NYinghai Lu <yinghai@kernel.org>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      c309dbb4
7. 16 April 2013, 3 commits
8. 15 April 2013, 1 commit
9. 13 April 2013, 6 commits
10. 12 April 2013, 2 commits
    • T
      kthread: Prevent unpark race which puts threads on the wrong cpu · f2530dc7
Thomas Gleixner authored
      The smpboot threads rely on the park/unpark mechanism which binds per
      cpu threads on a particular core. Though the functionality is racy:
      
CPU0                    CPU1                    CPU2
unpark(T)                                       wake_up_process(T)
  clear(SHOULD_PARK)    T runs
                        leave parkme() due to !SHOULD_PARK
  bind_to(CPU2)         BUG_ON(wrong CPU)
      
We cannot let the tasks move themselves to the target CPU, as one of
those tasks is actually the migration thread itself, which requires
that it starts running on the target cpu right away.
      
      The solution to this problem is to prevent wakeups in park mode which
      are not from unpark(). That way we can guarantee that the association
      of the task to the target cpu is working correctly.
      
      Add a new task state (TASK_PARKED) which prevents other wakeups and
      use this state explicitly for the unpark wakeup.
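
The resulting unpark path can be sketched roughly as follows. This is a hedged approximation of the idea rather than the exact patch; the `KTHREAD_SHOULD_PARK`/`KTHREAD_IS_PARKED` flag names follow the existing kthread code:

```c
/* Rough sketch of the unpark path: because the parked task sleeps
 * in TASK_PARKED and only a TASK_PARKED-targeted wakeup can bring
 * it out, the bind to the target CPU is guaranteed to happen
 * before the task can run anywhere.
 */
static void __kthread_unpark(struct task_struct *k, struct kthread *kthread)
{
	clear_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
	if (test_and_clear_bit(KTHREAD_IS_PARKED, &kthread->flags)) {
		if (test_bit(KTHREAD_IS_PER_CPU, &kthread->flags))
			__kthread_bind(k, kthread->cpu, TASK_PARKED);
		/* wakes the task only if it is in TASK_PARKED */
		wake_up_state(k, TASK_PARKED);
	}
}
```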
      
      Peter noticed: Also, since the task state is visible to userspace and
      all the parked tasks are still in the PID space, its a good hint in ps
      and friends that these tasks aren't really there for the moment.
      
      The migration thread has another related issue.
      
CPU0                     CPU1
Bring up CPU2
create_thread(T)
park(T)
 wait_for_completion()
                         parkme()
                         complete()
sched_set_stop_task()
                         schedule(TASK_PARKED)
      
      The sched_set_stop_task() call is issued while the task is on the
      runqueue of CPU1 and that confuses the hell out of the stop_task class
on that cpu. So we need the same synchronization before
sched_set_stop_task().
      Reported-by: NDave Jones <davej@redhat.com>
      Reported-and-tested-by: NDave Hansen <dave@sr71.net>
      Reported-and-tested-by: NBorislav Petkov <bp@alien8.de>
Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: dhillf@gmail.com
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      f2530dc7
    • A
      tty: serial/samsung: make register definitions global · 9ee51f01
Arnd Bergmann authored
      The registers for the Samsung S3C serial port are currently defined in
      the platform specific arch/arm/plat-samsung/include/plat/regs-serial.h
      file, which is not visible to multiplatform capable drivers.
      
      Unfortunately, it is not possible to move the file into a more local
      place as we should normally try to, because the same registers
      may be used in one of four places:
      
      * In the driver itself
      * In platform-independent ARM code for early debug output
      * In platform_data definitions
      * In the Samsung platform power management code
      
I have also found no way to logically split out a platform_data
file, other than possibly moving everything into
include/linux/platform_data, which also felt wrong. The only
part of this file that makes sense to keep specific to the s3c24xx
platform is the virtual and physical addresses defined here,
which are needed in no other location.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ee51f01