1. 18 3月, 2016 3 次提交
  2. 14 3月, 2016 2 次提交
    • A
      ipv6: Pass proto to csum_ipv6_magic as __u8 instead of unsigned short · 1e940829
      Alexander Duyck 提交于
      This patch updates csum_ipv6_magic so that it correctly recognizes that
      protocol is a unsigned 8 bit value.
      
      This will allow us to better understand what limitations may or may not be
      present in how we handle the data.  For example there are a number of
      places that call htonl on the protocol value.  This is likely not necessary
      and can be replaced with a multiplication by ntohl(1) which will be
      converted to a shift by the compiler.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1e940829
    • A
      ipv4: Update parameters for csum_tcpudp_magic to their original types · 01cfbad7
      Alexander Duyck 提交于
      This patch updates all instances of csum_tcpudp_magic and
      csum_tcpudp_nofold to reflect the types that are usually used as the source
      inputs.  For example the protocol field is populated based on nexthdr which
      is actually an unsigned 8 bit value.  The length is usually populated based
      on skb->len which is an unsigned integer.
      
      This addresses an issue in which the IPv6 function csum_ipv6_magic was
      generating a checksum using the full 32b of skb->len while
      csum_tcpudp_magic was only using the lower 16 bits.  As a result we could
      run into issues when attempting to adjust the checksum as there was no
      protocol agnostic way to update it.
      
      With this change the value is still truncated as many architectures use
      "(len + proto) << 8", however this truncation only occurs for values
      greater than 16776960 in length and as such is unlikely to occur as we stop
      the inner headers at ~64K in size.
      
      I did have to make a few minor changes in the arm, mn10300, nios2, and
      score versions of the function in order to support these changes as they
      were either using things such as an OR to combine the protocol and length,
      or were using ntohs to convert the length which would have truncated the
      value.
      
      I also updated a few spots in terms of whitespace and type differences for
      the addresses.  Most of this was just to make sure all of the definitions
      were in sync going forward.
      Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      01cfbad7
  3. 11 3月, 2016 11 次提交
    • M
      xtensa: support hardware breakpoints/watchpoints · c91e02bd
      Max Filippov 提交于
      Use perf framework to manage hardware instruction and data breakpoints.
      Add two new ptrace calls: PTRACE_GETHBPREGS and PTRACE_SETHBPREGS to
      query and set instruction and data breakpoints.
      Address bit 0 choose instruction (0) or data (1) break register, bits
      31..1 are the register number.
      Both calls transfer two 32-bit words: address (0) and control (1).
      Instruction breakpoint contorl word is 0 to clear breakpoint, 1 to set.
      Data breakpoint control word bit 31 is 'trigger on store', bit 30 is
      'trigger on load, bits 29..0 are length. Length 0 is used to clear a
      breakpoint. To set a breakpoint length must be a power of 2 in the range
      1..64 and the address must be length-aligned.
      
      Introduce new thread_info flag: TIF_DB_DISABLED. Set it if debug
      exception is raised by the kernel code accessing watched userspace
      address and disable corresponding data breakpoint. On exit to userspace
      check that flag and, if set, restore all data breakpoints.
      
      Handle debug exceptions raised with PS.EXCM set. This may happen when
      window overflow/underflow handler or fast exception handler hits data
      breakpoint, in which case save and disable all data breakpoints,
      single-step faulting instruction and restore data breakpoints.
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      c91e02bd
    • M
      xtensa: use context structure for debug exceptions · 6ec7026a
      Max Filippov 提交于
      With implementation of data breakpoints debug exceptions raised when
      PS.EXCM is set need to be handled, e.g. window overflow code can write
      to watched userspace address. Currently debug exception handler uses
      EXCSAVE and DEPC SRs to save temporary registers, but DEPC may not be
      available when PS.EXCM is set and more space will be needed to save
      additional state.
      Reorganize debug context: create per-CPU structure debug_table instance
      and store its address in the EXCSAVE<debug level> instead of
      debug_exception function address. Expand this structure when more save
      space is needed.
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      6ec7026a
    • M
      xtensa: remove remaining non-functional KGDB bits · 816aa588
      Max Filippov 提交于
      KGDB is not supported on xtensa, but there are bits of related code
      under arch/xtensa/kernel. Remove these bits.
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      816aa588
    • M
      xtensa: clear all DBREAKC registers on start · 7de7ac78
      Max Filippov 提交于
      There are XCHAL_NUM_DBREAK registers, clear them all.
      This also fixes cryptic assembler error message with binutils 2.25 when
      XCHAL_NUM_DBREAK is 0:
      
        as: out of memory allocating 18446744073709551575 bytes after a total
        of 495616 bytes
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      7de7ac78
    • M
      xtensa: xtfpga: fix earlycon endianness · 56b9f9d6
      Max Filippov 提交于
      Serial port is attached to XTFPGA boards as native endian device, now
      that earlycon parameter parser understands mmio32native put it into
      earlycon kernel parameter. This makes early console functional on both
      little- and big-endian CPUs with identical kernel command lines.
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      56b9f9d6
    • M
      xtensa: xtfpga: fix i2c controller register width and endianness · bce299ca
      Max Filippov 提交于
      I2C controller is attached to XTFPGA boards as native endian device, mark
      it as such in DTS.
      Set register width in DTS to 4, this way it works both for little- and
      big-endian CPUs.
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      bce299ca
    • M
      xtensa: xtfpga: fix ethernet controller endianness · d99434e1
      Max Filippov 提交于
      Ethernet controller is attached to XTFPGA boards as native endian device,
      mark it as such in DTS and pass correct endianness in platform data.
      This makes network functional on big-endian CPUs.
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      d99434e1
    • M
      xtensa: xtfpga: fix serial port register width and endianness · abfbd895
      Max Filippov 提交于
      Serial port is attached to XTFPGA boards as native endian device, mark
      it as such in DTS and pass correct endianness in platform data.
      Set register width in DTS to 4, this way it matches the platform data
      and works correctly on big-endian CPUs.
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      abfbd895
    • M
      xtensa: define CONFIG_CPU_{BIG,LITTLE}_ENDIAN · 4611bf7e
      Max Filippov 提交于
      Query compiler for the CPU endianness and add corresponding definition
      to KBUILD_CPPFLAGS. This allows using 'native-endian' property in DTS.
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      4611bf7e
    • M
      xtensa: fix preemption in {clear,copy}_user_highpage · a67cc9aa
      Max Filippov 提交于
      Disabling pagefault makes little sense there, preemption disabling is
      what was meant.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      a67cc9aa
    • M
      xtensa: ISS: don't hang if stdin EOF is reached · 362014c8
      Max Filippov 提交于
      Simulator stdin may be connected to a file, when its end is reached
      kernel hangs in infinite loop inside rs_poll, because simc_poll always
      signals that descriptor 0 is readable and simc_read always returns 0.
      Check simc_read return value and exit loop if it's not positive. Also
      don't rewind polling timer if it's zero.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      362014c8
  4. 09 3月, 2016 1 次提交
    • B
      PCI: Include pci/hotplug Kconfig directly from pci/Kconfig · e7e127e3
      Bjorn Helgaas 提交于
      Include pci/hotplug/Kconfig directly from pci/Kconfig, so arches don't
      have to source both pci/Kconfig and pci/hotplug/Kconfig.
      
      Note that this effectively adds pci/hotplug/Kconfig to the following
      arches, because they already sourced drivers/pci/Kconfig but they
      previously did not source drivers/pci/hotplug/Kconfig:
      
        alpha
        arm
        avr32
        frv
        m68k
        microblaze
        mn10300
        sparc
        unicore32
      
      Inspired-by-patch-from: Bogicevic Sasa <brutallesale@gmail.com>
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      e7e127e3
  5. 08 3月, 2016 1 次提交
  6. 02 3月, 2016 1 次提交
    • T
      arch/hotplug: Call into idle with a proper state · fc6d73d6
      Thomas Gleixner 提交于
      Let the non boot cpus call into idle with the corresponding hotplug state, so
      the hotplug core can handle the further bringup. That's a first step to
      convert the boot side of the hotplugged cpus to do all the synchronization
      with the other side through the state machine. For now it'll only start the
      hotplug thread and kick the full bringup of the cpu.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
      Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      fc6d73d6
  7. 26 2月, 2016 1 次提交
    • T
      net: Facility to report route quality of connected sockets · a87cb3e4
      Tom Herbert 提交于
      This patch add the SO_CNX_ADVICE socket option (setsockopt only). The
      purpose is to allow an application to give feedback to the kernel about
      the quality of the network path for a connected socket. The value
      argument indicates the type of quality report. For this initial patch
      the only supported advice is a value of 1 which indicates "bad path,
      please reroute"-- the action taken by the kernel is to call
      dst_negative_advice which will attempt to choose a different ECMP route,
      reset the TX hash for flow label and UDP source port in encapsulation,
      etc.
      
      This facility should be useful for connected UDP sockets where only the
      application can provide any feedback about path quality. It could also
      be useful for TCP applications that have additional knowledge about the
      path outside of the normal TCP control loop.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a87cb3e4
  8. 16 2月, 2016 1 次提交
  9. 07 2月, 2016 1 次提交
  10. 29 1月, 2016 1 次提交
  11. 21 1月, 2016 3 次提交
    • C
      dma-mapping: remove <asm-generic/dma-coherent.h> · 20d666e4
      Christoph Hellwig 提交于
      This wasn't an asm-generic header to start with, and can be merged into
      dma-mapping.h trivially.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      20d666e4
    • C
      dma-mapping: always provide the dma_map_ops based implementation · e1c7e324
      Christoph Hellwig 提交于
      Move the generic implementation to <linux/dma-mapping.h> now that all
      architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
      that everyone supports them.
      
      [valentinrothberg@gmail.com: remove leftovers in Kconfig]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: NValentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e1c7e324
    • G
      mm: arch: remove duplicate definitions of MADV_FREE · dcd6c87c
      Guenter Roeck 提交于
      Commits 21f55b01 ("arch/*/include/uapi/asm/mman.h: : let MADV_FREE
      have same value for all architectures") and ef58978f ("mm: define
      MADV_FREE for some arches") both defined MADV_FREE, but did not use the
      same values.  This results in build errors such as
      
        ./arch/alpha/include/uapi/asm/mman.h:53:0: error: "MADV_FREE" redefined
        ./arch/alpha/include/uapi/asm/mman.h:50:0: note: this is the location of the previous definition
      
      for the affected architectures.
      
      Fixes: 21f55b01 ("arch/*/include/uapi/asm/mman.h: : let MADV_FREE have same value for all architectures")
      Fixes: ef58978f ("mm: define MADV_FREE for some arches")
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Cc: Chen Gang <gang.chen.5i5j@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Acked-by: Helge Deller <deller@gmx.de>	[parisc]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dcd6c87c
  12. 16 1月, 2016 3 次提交
    • C
      arch/*/include/uapi/asm/mman.h: : let MADV_FREE have same value for all architectures · 21f55b01
      Chen Gang 提交于
      For uapi, need try to let all macros have same value, and MADV_FREE is
      added into main branch recently, so need redefine MADV_FREE for it.
      
      At present, '8' can be shared with all architectures, so redefine it to
      '8'.
      
      [sudipm.mukherjee@gmail.com: correct uniform value of MADV_FREE]
      Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com>
      Signed-off-by: NMinchan Kim <minchan@kernel.org>
      Acked-by: NHugh Dickins <hughd@google.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: <yalin.wang2010@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: Jason Evans <je@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mika Penttil <mika.penttila@nextfour.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: NSudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      21f55b01
    • M
      mm: define MADV_FREE for some arches · ef58978f
      Minchan Kim 提交于
      Most architectures use asm-generic, but alpha, mips, parisc, xtensa need
      their own definitions.
      
      This patch defines MADV_FREE for them so it should fix build break for
      their architectures.
      
      Maybe, I should split and feed pieces to arch maintainers but included
      here for mmotm convenience.
      
      [gang.chen.5i5j@gmail.com: let MADV_FREE have same value for all architectures]
      Signed-off-by: NMinchan Kim <minchan@kernel.org>
      Acked-by: NMax Filippov <jcmvbkbc@gmail.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: <yalin.wang2010@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Gang <gang.chen.5i5j@gmail.com>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jason Evans <je@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mika Penttil <mika.penttila@nextfour.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com>
      Acked-by: NHugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ef58978f
    • K
      mm: differentiate page_mapped() from page_mapcount() for compound pages · e1534ae9
      Kirill A. Shutemov 提交于
      Let's define page_mapped() to be true for compound pages if any
      sub-pages of the compound page is mapped (with PMD or PTE).
      
      On other hand page_mapcount() return mapcount for this particular small
      page.
      
      This will make cases like page_get_anon_vma() behave correctly once we
      allow huge pages to be mapped with PTE.
      
      Most users outside core-mm should use page_mapcount() instead of
      page_mapped().
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: NSasha Levin <sasha.levin@oracle.com>
      Tested-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: NJerome Marchand <jmarchan@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e1534ae9
  13. 13 1月, 2016 1 次提交
  14. 11 1月, 2016 3 次提交
  15. 05 1月, 2016 1 次提交
  16. 04 1月, 2016 1 次提交
  17. 09 11月, 2015 2 次提交
  18. 08 11月, 2015 1 次提交
  19. 07 11月, 2015 1 次提交
    • K
      mm: make compound_head() robust · 1d798ca3
      Kirill A. Shutemov 提交于
      Hugh has pointed that compound_head() call can be unsafe in some
      context. There's one example:
      
      	CPU0					CPU1
      
      isolate_migratepages_block()
        page_count()
          compound_head()
            !!PageTail() == true
      					put_page()
      					  tail->first_page = NULL
            head = tail->first_page
      					alloc_pages(__GFP_COMP)
      					   prep_compound_page()
      					     tail->first_page = head
      					     __SetPageTail(p);
            !!PageTail() == true
          <head == NULL dereferencing>
      
      The race is pure theoretical. I don't it's possible to trigger it in
      practice. But who knows.
      
      We can fix the race by changing how encode PageTail() and compound_head()
      within struct page to be able to update them in one shot.
      
      The patch introduces page->compound_head into third double word block in
      front of compound_dtor and compound_order. Bit 0 encodes PageTail() and
      the rest bits are pointer to head page if bit zero is set.
      
      The patch moves page->pmd_huge_pte out of word, just in case if an
      architecture defines pgtable_t into something what can have the bit 0
      set.
      
      hugetlb_cgroup uses page->lru.next in the second tail page to store
      pointer struct hugetlb_cgroup. The patch switch it to use page->private
      in the second tail page instead. The space is free since ->first_page is
      removed from the union.
      
      The patch also opens possibility to remove HUGETLB_CGROUP_MIN_ORDER
      limitation, since there's now space in first tail page to store struct
      hugetlb_cgroup pointer. But that's out of scope of the patch.
      
      That means page->compound_head shares storage space with:
      
       - page->lru.next;
       - page->next;
       - page->rcu_head.next;
      
      That's too long list to be absolutely sure, but looks like nobody uses
      bit 0 of the word.
      
      page->rcu_head.next guaranteed[1] to have bit 0 clean as long as we use
      call_rcu(), call_rcu_bh(), call_rcu_sched(), or call_srcu(). But future
      call_rcu_lazy() is not allowed as it makes use of the bit and we can
      get false positive PageTail().
      
      [1] http://lkml.kernel.org/g/20150827163634.GD4029@linux.vnet.ibm.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1d798ca3
  20. 06 11月, 2015 1 次提交
    • E
      mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage · b0f205c2
      Eric B Munson 提交于
      The previous patch introduced a flag that specified pages in a VMA should
      be placed on the unevictable LRU, but they should not be made present when
      the area is created.  This patch adds the ability to set this state via
      the new mlock system calls.
      
      We add MLOCK_ONFAULT for mlock2 and MCL_ONFAULT for mlockall.
      MLOCK_ONFAULT will set the VM_LOCKONFAULT modifier for VM_LOCKED.
      MCL_ONFAULT should be used as a modifier to the two other mlockall flags.
      When used with MCL_CURRENT, all current mappings will be marked with
      VM_LOCKED | VM_LOCKONFAULT.  When used with MCL_FUTURE, the mm->def_flags
      will be marked with VM_LOCKED | VM_LOCKONFAULT.  When used with both
      MCL_CURRENT and MCL_FUTURE, all current mappings and mm->def_flags will be
      marked with VM_LOCKED | VM_LOCKONFAULT.
      
      Prior to this patch, mlockall() will unconditionally clear the
      mm->def_flags any time it is called without MCL_FUTURE.  This behavior is
      maintained after adding MCL_ONFAULT.  If a call to mlockall(MCL_FUTURE) is
      followed by mlockall(MCL_CURRENT), the mm->def_flags will be cleared and
      new VMAs will be unlocked.  This remains true with or without MCL_ONFAULT
      in either mlockall() invocation.
      
      munlock() will unconditionally clear both vma flags.  munlockall()
      unconditionally clears for VMA flags on all VMAs and in the mm->def_flags
      field.
      Signed-off-by: NEric B Munson <emunson@akamai.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b0f205c2