1. 12 Dec, 2012 19 commits
    • memory-hotplug: skip HWPoisoned page when offlining pages · b023f468
      Committed by Wen Congyang
      The hwpoisoned flag may be set when we offline a page through the sysfs
      interface /sys/devices/system/memory/soft_offline_page or
      /sys/devices/system/memory/hard_offline_page. If this flag is not
      cleared when the pages are onlined, the page can't be freed and will
      not be in the free list, so these pages can't be offlined again. We
      should therefore skip such pages when offlining pages.
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory hotplug: suppress "Device memoryX does not have a release() function" warning · fa7194eb
      Committed by Yasuaki Ishimatsu
      When calling remove_memory_block(), the following message is shown at
      device_release():
      
      "Device 'memory528' does not have a release() function, it is broken and
      must be fixed."
      
      The reason is that memory_block's device struct does not have a release()
      function.
      
      So the patch registers memory_block_release() as the device's release()
      function to suppress the warning message.  Additionally, the patch
      moves kfree(mem) into the release function, since the release function is
      the intended means of freeing a memory_block struct.
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
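The release() pattern described in the memory hotplug commit above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: every `toy_*` name is hypothetical, and the reference counting is reduced to a plain integer. It only shows the idea that the object embedding a device is freed from the device's release callback rather than by the caller.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted device with a release() callback. */
struct toy_device {
    int refcount;
    void (*release)(struct toy_device *dev);
};

/* Toy stand-in for a memory block that embeds its device. */
struct toy_memory_block {
    int id;
    struct toy_device dev;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Counts how many blocks the release callback has freed (for the demo). */
static int toy_blocks_freed;

/* Release callback: the only place where the memory block is freed. */
static void toy_memory_block_release(struct toy_device *dev)
{
    struct toy_memory_block *mem =
        toy_container_of(dev, struct toy_memory_block, dev);

    free(mem);
    toy_blocks_freed++;
}

/* Allocate a block and register its release callback up front. */
static struct toy_memory_block *toy_memory_block_create(int id)
{
    struct toy_memory_block *mem = calloc(1, sizeof(*mem));

    if (!mem)
        return NULL;
    mem->id = id;
    mem->dev.refcount = 1;
    mem->dev.release = toy_memory_block_release;
    return mem;
}

/* Drop one reference; the last put triggers the release callback. */
static void toy_put_device(struct toy_device *dev)
{
    if (--dev->refcount == 0 && dev->release)
        dev->release(dev);
}
```

With this shape, callers never free the block directly; they drop their reference with `toy_put_device()` and the callback does the cleanup once the last reference is gone.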
    • thp: cleanup: introduce mk_huge_pmd() · b3092b3b
      Committed by Bob Liu
      Introduce mk_huge_pmd() to simplify the code.
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Ni zhan Chen <nizhan.chen@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: introduce hugepage_vma_check() · fa475e51
      Committed by Bob Liu
      Multiple places do the same check.
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Ni zhan Chen <nizhan.chen@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: introduce mm_find_pmd() · 6219049a
      Committed by Bob Liu
      Several places need to find the pmd by (mm_struct, address), so introduce a
      function to simplify it.
      
      [akpm@linux-foundation.org: fix warning]
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Ni zhan Chen <nizhan.chen@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: clean up __collapse_huge_page_isolate · 344aa35c
      Committed by Bob Liu
      There are duplicated places using release_pte_pages().
      And release_all_pte_pages() can be removed.
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Ni zhan Chen <nizhan.chen@gmail.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use IS_ENABLED(CONFIG_COMPACTION) instead of COMPACTION_BUILD · d84da3f9
      Committed by Kirill A. Shutemov
      We don't need the custom COMPACTION_BUILD anymore, since we have the handy
      IS_ENABLED().
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use IS_ENABLED(CONFIG_NUMA) instead of NUMA_BUILD · e5adfffc
      Committed by Kirill A. Shutemov
      We don't need the custom NUMA_BUILD anymore, since we have the handy
      IS_ENABLED().
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
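The two IS_ENABLED() conversions rely on a macro that turns "is this config symbol defined to 1?" into a constant 0 or 1 usable in ordinary C conditions. The sketch below shows one way such a macro can be built in plain C. It is an illustration only: all `SKETCH_*` names are hypothetical, it handles only built-in (`=1`) symbols and not modules, and the kernel's actual definition at the time of these commits may differ.

```c
/* If the symbol expands to 1, this placeholder expands to "0," and
 * shifts the arguments of SKETCH_TAKE_SECOND by one position. */
#define SKETCH_ARG_PLACEHOLDER_1 0,

/* Always yields its second argument. */
#define SKETCH_TAKE_SECOND(ignored, val, ...) val

/* Expand the symbol first, then paste the result onto the placeholder. */
#define SKETCH_IS_DEFINED(x)   SKETCH_IS_DEFINED_(x)
#define SKETCH_IS_DEFINED_(val) \
    SKETCH_IS_DEFINED__(SKETCH_ARG_PLACEHOLDER_##val)
/* Defined symbol:   SKETCH_TAKE_SECOND(0, 1, 0, 0)      evaluates to 1
 * Undefined symbol: SKETCH_TAKE_SECOND(junk 1, 0, 0)    evaluates to 0 */
#define SKETCH_IS_DEFINED__(arg1_or_junk) \
    SKETCH_TAKE_SECOND(arg1_or_junk 1, 0, 0)

/* Pretend configuration: one option on, the other left undefined. */
#define CONFIG_SKETCH_NUMA 1
/* CONFIG_SKETCH_COMPACTION is intentionally not defined. */

/* With the option on, honour the requested node; otherwise use node 0. */
static int sketch_pick_node(int requested_node)
{
    if (SKETCH_IS_DEFINED(CONFIG_SKETCH_NUMA))
        return requested_node;
    return 0;
}

/* The condition is a compile-time constant 0 here, so the compiler can
 * drop the branch while still type-checking the code inside it. */
static int sketch_should_compact(int order)
{
    if (SKETCH_IS_DEFINED(CONFIG_SKETCH_COMPACTION) && order > 0)
        return 1;
    return 0;
}
```

The benefit over per-option helper constants such as NUMA_BUILD is that one generic macro works for every config symbol, so no extra definition has to be maintained for each option.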
    • mm, memcg: make mem_cgroup_out_of_memory() static · 19965460
      Committed by David Rientjes
      mem_cgroup_out_of_memory() is only referenced from within file scope, so
      it can be marked static.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: show migration types in show_mem · 377e4f16
      Committed by Rabin Vincent
      This is useful to diagnose the reason for page allocation failure for
      cases where there appear to be several free pages.
      
      Example, with this alloc_pages(GFP_ATOMIC) failure:
      
       swapper/0: page allocation failure: order:0, mode:0x0
       ...
       Mem-info:
       Normal per-cpu:
       CPU    0: hi:   90, btch:  15 usd:  48
       CPU    1: hi:   90, btch:  15 usd:  21
       active_anon:0 inactive_anon:0 isolated_anon:0
        active_file:0 inactive_file:84 isolated_file:0
        unevictable:0 dirty:0 writeback:0 unstable:0
        free:4026 slab_reclaimable:75 slab_unreclaimable:484
        mapped:0 shmem:0 pagetables:0 bounce:0
       Normal free:16104kB min:2296kB low:2868kB high:3444kB active_anon:0kB
       inactive_anon:0kB active_file:0kB inactive_file:336kB unevictable:0kB
       isolated(anon):0kB isolated(file):0kB present:331776kB mlocked:0kB
       dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:300kB
       slab_unreclaimable:1936kB kernel_stack:328kB pagetables:0kB unstable:0kB
       bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
       lowmem_reserve[]: 0 0
      
      Before the patch, it's hard (for me, at least) to say why all these free
      chunks weren't considered for allocation:
      
       Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB
       1*1024kB 1*2048kB 3*4096kB = 16128kB
      
      After the patch, it's obvious that the reason is that all of these are
      in the MIGRATE_CMA (C) freelist:
      
       Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB (C) 1*512kB
       (C) 1*1024kB (C) 1*2048kB (C) 3*4096kB (C) = 16128kB
      Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
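The "after the patch" output format in the show_mem commit above, where each free-list entry is followed by letters naming the migration types that hold pages, can be mimicked with a small formatter. This is a userspace sketch under stated assumptions: the `MT_*` enum is a hypothetical three-entry subset, only the letter `C` for MIGRATE_CMA is taken from the commit message (the other letters are placeholders), and a 4 kB page size is assumed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of migration types for this sketch. */
enum { MT_UNMOVABLE, MT_MOVABLE, MT_CMA, MT_COUNT };

/* 'C' matches the commit message; 'U' and 'M' are assumed placeholders. */
static const char mt_letter[MT_COUNT] = { 'U', 'M', 'C' };

/*
 * Format one order of the free list as "<count>*<size>kB (<letters>)",
 * where <letters> lists every migration type with at least one free
 * block of this order. Assumes 4 kB pages.
 */
static void sketch_format_order(char *buf, size_t len, unsigned int order,
                                const unsigned long nr_free[MT_COUNT])
{
    unsigned long total = 0;
    char letters[MT_COUNT + 1];
    size_t n = 0;
    int t;

    for (t = 0; t < MT_COUNT; t++) {
        total += nr_free[t];
        if (nr_free[t])
            letters[n++] = mt_letter[t];
    }
    letters[n] = '\0';

    if (n)
        snprintf(buf, len, "%lu*%lukB (%s)", total, 4UL << order, letters);
    else
        snprintf(buf, len, "%lu*%lukB", total, 4UL << order);
}
```

For an order-6 entry whose only free block sits on the CMA list, this produces `1*256kB (C)`, the same shape as the sample output quoted in the commit message.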
    • writeback: remove nr_pages_dirtied arg from balance_dirty_pages_ratelimited_nr() · d0e1d66b
      Committed by Namjae Jeon
      There is no reason to pass the nr_pages_dirtied argument, because the
      nr_pages_dirtied value from the caller is unused in
      balance_dirty_pages_ratelimited_nr().
      Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
      Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6 · b58ed041
      Committed by Linus Torvalds
      Pull device tree changes from Grant Likely:
       "Here are the DT changes I've got queued up for v3.8.  As described
        below, there are a lot of bug fixes here and documentation updates but
        nothing major:
      
        Bug fixes, little cleanups, and documentation changes.  The most
        invasive thing here touches a bunch of the arch directories to use a
        common build rule for .dtb files.  There are no major changes to
        functionality here other than a few new helper functions."
      
      * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6: (34 commits)
        arm64: Fix the dtbs target building
        mtd: nand: davinci: fix the binding documentation
        rtc: rtc-mv: Add the device tree binding documentation
        devicetree/bindings: Move gpio-leds binding into leds directory
        of/vendor-prefixes: add Imagination Technologies
        microblaze: use new common dtc rule
        c6x: use new common dtc rule
        openrisc: use new common dtc rule
        arm64: Add dtbs target for building all the enabled dtb files
        arm64: use new common dtc rule
        ARM: dt: change .dtb build rules to build in dts directory
        kbuild: centralize .dts->.dtb rule
        Fix build when CONFIG_W1_MASTER_GPIO=m b exporting "allnodes"
        of/spi: Honour "status=disabled" property of device
        of_mdio: Honour "status=disabled" property of device
        of_i2c: Honour "status=disabled" property of device
        powerpc: Fix fallout from device_node->name constification
        of: add 'const' for of_parse_phandle parameter *np
        Documentation: correct of_platform_populate() argument list
        script: dtc: clean generated files
        ...
    • Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6 · 259cdbee
      Committed by Linus Torvalds
      Pull irqdomain changes from Grant Likely:
       "Trivial changes to irqdomain.  An update to the documentation and make
        one of the error paths not quite so obnoxious."
      
      * tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6:
        irqdomain: update documentation
        irqdomain: stop screaming about preallocated irqdescs
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · 9ada9fd5
      Committed by Linus Torvalds
      Pull EDAC fixes from Borislav Petkov:
      
       - EDAC core error path fix, from Denis Kirjanov.
      
       - Generalization of AMD MCE bank names and some minor error reporting
         improvements.
      
       - EDAC core cleanups and simplifications, from Wei Yongjun.
      
       - amd64_edac fixes for sysfs-reported values, from Josh Hunt.
      
       - some heavy amd64_edac error reporting path shaving, leading to
         removing a bunch of code.
      
       - amd64_edac error injection method improvements.
      
       - EDAC core cleanups and fixes
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (24 commits)
        EDAC, pci_sysfs: Use for_each_pci_dev to simplify the code
        EDAC: Handle error path in edac_mc_sysfs_init() properly
        MCE, AMD: Dump error status
        MCE, AMD: Report decoded error type first
        MCE, AMD: Dump CPU f/m/s triple with the error
        MCE, AMD: Remove functional unit references
        EDAC: Convert to use simple_open()
        EDAC, Calxeda highbank: Convert to use simple_open()
        EDAC: Fix mc size reported in sysfs
        EDAC: Fix csrow size reported in sysfs
        EDAC: Pass mci parent
        EDAC: Add memory controller flags
        amd64_edac: Fix csrows size and pages computation
        amd64_edac: Use DBAM_DIMM macro
        amd64_edac: Fix K8 chip select reporting
        amd64_edac: Reorganize error reporting path
        amd64_edac: Do not check whether error address is valid
        amd64_edac: Improve error injection
        amd64_edac: Cleanup error injection code
        amd64_edac: Small fixlets and cleanups
        ...
    • Merge branch 'for-v3.8' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping · c45564e9
      Committed by Linus Torvalds
      Pull CMA and DMA-mapping update from Marek Szyprowski:
       "Another set of Contiguous Memory Allocator and DMA-mapping framework
        updates for v3.8.
      
        This pull request consists only of two patches.  The first fixes a
        long standing issue with dmapools (the code predates current GIT
        history), which forced all allocations to use GFP_ATOMIC flag,
        ignoring the flags passed by the caller.  The second patch changes CMA
        code to correctly use phys_addr_t type, which enables support for LPAE
        systems."
      
      * 'for-v3.8' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
        drivers: cma: represent physical addresses as phys_addr_t
        mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls
    • Merge tag 'clk-for-linus' of git://git.linaro.org/people/mturquette/linux · 93874681
      Committed by Linus Torvalds
      Pull clock framework changes from Mike Turquette:
       "The common clock framework changes for 3.8 are comprised of lots of
        fixes for existing platforms as well as new ports for some ARM
        platforms.  In addition there are new clk drivers for audio devices
        and MFDs."
      
      Fix up trivial conflict in <linux/clk-provider.h> (removal of 'inline'
      clashing with return type fixes)
      
      * tag 'clk-for-linus' of git://git.linaro.org/people/mturquette/linux: (51 commits)
        MAINTAINERS: bad email address for Mike Turquette
        clk: introduce optional disable_unused callback
        clk: ux500: fix bit error
        clk: clock multiplexers may register out of order
        clk: ux500: Initial support for abx500 clock driver
        CLK: SPEAr: Remove unused dummy apb_pclk
        CLK: SPEAr: Correct index scanning done for clock synths
        CLK: SPEAr: Update clock rate table
        CLK: SPEAr: Add missing clocks
        CLK: SPEAr: Set CLK_SET_RATE_PARENT for few clocks
        CLK: SPEAr13xx: fix parent names of multiple clocks
        CLK: SPEAr13xx: Fix mux clock names
        CLK: SPEAr: Fix dev_id & con_id for multiple clocks
        clk: move IM-PD1 clocks to drivers/clk
        clk: make ICST driver handle the VCO registers
        clk: add GPLv2 headers to the Versatile clock files
        clk: mxs: Use a better name for the USB PHY clock
        clk: spear: Add stub functions for spear3[0|1|2]0_clk_init()
        CLK: clk-twl6040: fix return value check in twl6040_clk_probe()
        clk: ux500: Register nomadik keypad clock lookups for u8500
        ...
    • Merge tag 'pinctrl-for-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 505cbeda
      Committed by Linus Torvalds
      Pull pinctrl changes from Linus Walleij:
       "These are the first and major pinctrl changes for the v3.8 merge
        cycle.  Some of this is used as merge base for other trees so I better
        be early on the trigger.
      
        As can be seen from the diffstat the major changes are:
      
        - A big conversion of the AT91 pinctrl driver and the associated ACKed
          platform changes under arch/arm/mach-at91 and its device trees.  This
          has been coordinated with the AT91 maintainers to go in through the
          pinctrl tree.
      
        - A larger chunk of changes to the SPEAr drivers and the addition of
          the "plgpio" driver for the SPEAr as well.
      
        - The removal of the remnants of the Nomadik driver from the arch/arm
          tree and fusion of that into the Nomadik driver and platform data
          header files.
      
        - Some local movement in the Marvell MVEBU drivers, these now have
          their own subdirectory.
      
        - The addition of a chunk of code to gpiolib under drivers/gpio to
          register gpio-to-pin range mappings from the GPIO side of things.
          This has been requested by Grant Likely and is now implemented, it
          is particularly useful for device tree work.
      
        Then we have incremental updates all over the place, many of these are
        cleanups and fixes from Axel Lin who has done a great job of removing
        minor mistakes and compilation annoyances."
      
      * tag 'pinctrl-for-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (114 commits)
        ARM: mmp: select PINCTRL for ARCH_MMP
        pinctrl: Drop selecting PINCONF for MMP2, PXA168 and PXA910
        pinctrl: pinctrl-single: Fix error check condition
        pinctrl: SPEAr: Update error check for unsigned variables
        gpiolib: Fix use after free in gpiochip_add_pin_range
        gpiolib: rename pin range arguments
        pinctrl: single: support gpio request and free
        pinctrl: generic: add input schmitt disable parameter
        pinctrl/u300/coh901: stop spawning pinctrl from GPIO
        pinctrl/u300/coh901: let the gpio_chip register the range
        pinctrl: add function to retrieve range from pin
        gpiolib: return any error code from range creation
        pinctrl: make range registration defer properly
        gpiolib: rename find_pinctrl_*
        gpiolib: let gpiochip_add_pin_range() specify offset
        ARM: at91: pm9g45: add mmc support
        ARM: at91: Animeo IP: add mmc support
        ARM: at91: dt: add mmc pinctrl for Atmel reference boards
        ARM: at91: dt: at91sam9: add mmc pinctrl support
        ARM: at91/dts: add nodes for atmel hsmci controllers for atmel boards
        ...
    • Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · a8936db7
      Committed by Linus Torvalds
      Pull hwmon updates from Guenter Roeck:
       "New driver: DA9055
      
        Added/improved support for new chips in existing drivers: Z650/670,
        N550/570, ADS7830, AMD 16h family"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (da9055) Fix chan_mux[DA9055_ADC_ADCIN3] setting
        hwmon: DA9055 HWMON driver
        hwmon: (coretemp) List TjMax for Z650/670 and N550/570
        hwmon: (coretemp) Drop N4xx, N5xx, D4xx, D5xx CPUs from tjmax table
        hwmon: (coretemp) Use model table instead of if/else to identify CPU models
        hwmon: da9052: Use da9052_reg_update for rmw operations
        hwmon: (coretemp) Drop dependency on PCI for TjMax detection on Atom CPUs
        hwmon: (ina2xx) use module_i2c_driver to simplify the code
        hwmon: (ads7828) add support for ADS7830
        hwmon: (ads7828) driver cleanup
        x86,AMD: Power driver support for AMD's family 16h processors
    • Merge tag 'mmc-updates-for-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc · 11b84c58
      Committed by Linus Torvalds
      Pull MMC updates from Chris Ball:
       "MMC highlights for 3.8:
      
        Core:
         - Expose access to the eMMC RPMB ("Replay Protected Memory Block")
           area by extending the existing mmc_block ioctl.
         - Add SDIO powered-suspend DT properties to the core MMC DT binding.
         - Add no-1-8-v DT flag for boards where the SD controller reports
           that it supports 1.8V but the board itself has no way to switch to
           1.8V.
         - More work on switching to 1.8V UHS support using a vqmmc regulator.
         - Fix up a case where the slot-gpio helper may fail to reset the host
           controller properly if a card was removed during a transfer.
         - Fix several cases where a broken device could cause an infinite
           loop while we wait for a register to update.
      
        Drivers:
         - at91-mci: Remove obsolete driver, atmel-mci handles these devices
           now.
         - sdhci-dove: Allow using GPIOs for card-detect notifications.
         - sdhci-esdhc: Fix for recovering from ADMA errors on broken silicon.
         - sdhci-s3c: Add pinctrl support.
         - wmt-sdmmc: New driver for WonderMedia SD/MMC controllers."
      
      * tag 'mmc-updates-for-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (65 commits)
        mmc: sdhci: implement the .card_event() method
        mmc: extend the slot-gpio card-detection to use host's .card_event() method
        mmc: add a card-event host operation
        mmc: sdhci-s3c: Fix compilation warning
        mmc: sdhci-pci: Enable SDHCI_CAN_DO_HISPD for Ricoh SDHCI controller
        mmc: sdhci-dove: allow GPIOs to be used for card detection on Dove
        mmc: sdhci-dove: use two-stage initialization for sdhci-pltfm
        mmc: sdhci-dove: use devm_clk_get()
        mmc: eSDHC: Recover from ADMA errors
        mmc: dw_mmc: remove duplicated buswidth code
        mmc: dw_mmc: relocate where dw_mci_setup_bus() is called from
        mmc: Limit MMC speed to 52MHz if not HS200
        mmc: dw_mmc: use devres functions in dw_mmc
        mmc: sh_mmcif: remove unneeded clock connection ID
        mmc: sh_mobile_sdhi: remove unneeded clock connection ID
        mmc: sh_mobile_sdhi: fix clock frequency printing
        mmc: Remove redundant null check before kfree in bus.c
        mmc: Remove redundant null check before kfree in sdio_bus.c
        mmc: sdhci-imx-esdhc: use more devm_* functions
        mmc: dt: add no-1-8-v device tree flag
        ...
  2. 11 Dec, 2012 13 commits
  3. 10 Dec, 2012 4 commits
    • inet_diag: validate port comparison byte code to prevent unsafe reads · 5e1f5420
      Committed by Neal Cardwell
      Add logic to verify that a port comparison byte code operation
      actually has the second inet_diag_bc_op from which we read the port
      for such operations.
      
      Previously the code blindly referenced op[1] without first checking
      whether a second inet_diag_bc_op struct could fit there. So a
      malicious user could make the kernel read 4 bytes beyond the end of
      the bytecode array by claiming to have a whole port comparison byte
      code (2 inet_diag_bc_op structs) when in fact the bytecode was not
      long enough to hold both.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
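The check described in the port comparison commit above amounts to a bounds test before touching the second op. A minimal userspace sketch follows, assuming a hypothetical 4-byte op layout modelled on inet_diag_bc_op; the `sketch_*` names, the opcode values, and the audit function are illustrative and are not the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical op layout, modelled on struct inet_diag_bc_op. */
struct sketch_bc_op {
    unsigned char code;
    unsigned char yes;
    unsigned short no;
};

/* Hypothetical opcodes: a no-op and one port comparison. */
enum { SKETCH_BC_NOP, SKETCH_BC_S_GE };

/* A port comparison occupies two ops: the op itself plus a follow-on
 * op carrying the port. Both must fit in the remaining bytecode. */
static bool sketch_valid_port_comparison(size_t remaining)
{
    return remaining >= 2 * sizeof(struct sketch_bc_op);
}

/* Audit a single op given the number of bytes left in the bytecode,
 * counted from the start of this op. */
static bool sketch_audit_op(const struct sketch_bc_op *op, size_t remaining)
{
    /* The op itself must fit before we may read op->code. */
    if (remaining < sizeof(*op))
        return false;

    if (op->code == SKETCH_BC_S_GE)
        return sketch_valid_port_comparison(remaining);

    return true;
}
```

Bytecode that claims a port comparison but ends right after the first op is rejected here, so later code never dereferences `op[1]` past the end of the buffer.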
    • inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run() · f67caec9
      Committed by Neal Cardwell
      Add logic to check the address family of the user-supplied conditional
      and the address family of the connection entry. We now do not do
      prefix matching of addresses from different address families (AF_INET
      vs AF_INET6), except for the previously existing support for having an
      IPv4 prefix match an IPv4-mapped IPv6 address (which this commit
      maintains as-is).
      
      This change is needed for two reasons:
      
      (1) The addresses are different lengths, so comparing a 128-bit IPv6
      prefix match condition to a 32-bit IPv4 connection address can cause
      us to unwittingly walk off the end of the IPv4 address and read
      garbage or oops.
      
      (2) The IPv4 and IPv6 address spaces are semantically distinct, so a
      simple bit-wise comparison of the prefixes is not meaningful, and
      would lead to bogus results (except for the IPv4-mapped IPv6 case,
      which this commit maintains).
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
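The family-aware matching rule from the prefix match commit above can be sketched as follows. This is a userspace illustration, not the kernel implementation: the `pm_*` names and the family constants are hypothetical, addresses are plain byte arrays in network order, and the prefix length bound is included only so the sketch cannot overrun its inputs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical family tags for this sketch (not the real AF_* values). */
enum { PM_AF_INET = 4, PM_AF_INET6 = 6 };

/* Address length in bits for a family. */
static size_t pm_addr_bits(int family)
{
    return family == PM_AF_INET ? 32 : 128;
}

/* Compare the first `bits` bits of two byte strings. */
static bool pm_bits_match(const uint8_t *a, const uint8_t *b, size_t bits)
{
    size_t bytes = bits / 8;
    size_t rem = bits % 8;

    if (memcmp(a, b, bytes) != 0)
        return false;
    if (rem) {
        uint8_t mask = (uint8_t)(0xff << (8 - rem));
        return ((a[bytes] ^ b[bytes]) & mask) == 0;
    }
    return true;
}

/* Match a prefix condition against a connection address, refusing to
 * compare across families except IPv4 against IPv4-mapped IPv6. */
static bool pm_prefix_match(int cond_family, const uint8_t *cond_addr,
                            size_t prefix_bits,
                            int entry_family, const uint8_t *entry_addr)
{
    static const uint8_t v4mapped[12] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
    };

    /* Never compare more bits than the condition's address holds. */
    if (prefix_bits > pm_addr_bits(cond_family))
        return false;

    if (cond_family == entry_family)
        return pm_bits_match(entry_addr, cond_addr, prefix_bits);

    /* The one supported cross-family case: an IPv4 condition against
     * an IPv4-mapped IPv6 address (::ffff:a.b.c.d). */
    if (cond_family == PM_AF_INET && entry_family == PM_AF_INET6 &&
        memcmp(entry_addr, v4mapped, sizeof(v4mapped)) == 0)
        return pm_bits_match(entry_addr + 12, cond_addr, prefix_bits);

    return false;
}
```

Any other mix of families returns false without reading either address, which covers both reasons given in the commit message: no out-of-bounds read, and no meaningless bitwise comparison.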
    • inet_diag: validate byte code to prevent oops in inet_diag_bc_run() · 405c0059
      Committed by Neal Cardwell
      Add logic to validate INET_DIAG_BC_S_COND and INET_DIAG_BC_D_COND
      operations.
      
      Previously we did not validate the inet_diag_hostcond, address family,
      address length, and prefix length. So a malicious user could make the
      kernel read beyond the end of the bytecode array by claiming to have a
      whole inet_diag_hostcond when the bytecode was not long enough to
      contain a whole inet_diag_hostcond of the given address family. Or
      they could make the kernel read up to about 27 bytes beyond the end of
      a connection address by passing a prefix length that exceeded the
      length of addresses of the given family.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
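The host condition validation described above checks two things: that the bytecode is long enough for the header plus an address of the stated family, and that the prefix length does not exceed that address. A minimal sketch follows, assuming a hypothetical two-byte header modelled on inet_diag_hostcond; the `hc_*` names and family constants are illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical family tags for this sketch (not the real AF_* values). */
enum { HC_AF_UNSPEC = 0, HC_AF_INET = 4, HC_AF_INET6 = 6 };

/* Hypothetical header; the address bytes follow it in the bytecode. */
struct hc_cond {
    unsigned char family;
    unsigned char prefix_len;   /* in bits */
};

/* Validate a host condition given the bytes remaining in the bytecode,
 * counted from the start of the header. */
static bool hc_valid(const struct hc_cond *cond, size_t remaining)
{
    size_t addr_len;

    /* The header itself must fit before its fields are read. */
    if (remaining < sizeof(*cond))
        return false;

    switch (cond->family) {
    case HC_AF_UNSPEC:
        addr_len = 0;
        break;
    case HC_AF_INET:
        addr_len = 4;
        break;
    case HC_AF_INET6:
        addr_len = 16;
        break;
    default:
        return false;   /* unknown address family */
    }

    /* The whole address must be present in the bytecode. */
    if (remaining < sizeof(*cond) + addr_len)
        return false;

    /* The prefix may not be longer than the address it applies to. */
    if (cond->prefix_len > 8 * addr_len)
        return false;

    return true;
}
```

The last check is what prevents an oversized prefix length from driving a comparison past the end of a connection address.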
    • inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state · 1c95df85
      Committed by Neal Cardwell
      Fix inet_diag to be aware of the fact that AF_INET6 TCP connections
      instantiated for IPv4 traffic and in the SYN-RECV state were actually
      created with inet_reqsk_alloc(), instead of inet6_reqsk_alloc(). This
      means that for such connections inet6_rsk(req) returns a pointer to a
      random spot in memory up to roughly 64KB beyond the end of the
      request_sock.
      
      With this bug, for a server using AF_INET6 TCP sockets and serving
      IPv4 traffic, an inet_diag user like `ss state SYN-RECV` would lead to
      inet_diag_fill_req() causing an oops or the export to user space of 16
      bytes of kernel memory as a garbage IPv6 address, depending on where
      the garbage inet6_rsk(req) pointed.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 09 Dec, 2012 2 commits
    • mm: vmscan: fix inappropriate zone congestion clearing · ed23ec4f
      Committed by Johannes Weiner
      commit c702418f ("mm: vmscan: do not keep kswapd looping forever due
      to individual uncompactable zones") removed zone watermark checks from
      the compaction code in kswapd but left in the zone congestion clearing,
      which now happens unconditionally on higher order reclaim.
      
      This messes up the reclaim throttling logic for zones with
      dirty/writeback pages, where zones should only lose their congestion
      status when their watermarks have been restored.
      
      Remove the clearing from the zone compaction section entirely.  The
      preliminary zone check and the reclaim loop in kswapd will clear it if
      the zone is considered balanced.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vfs: fix O_DIRECT read past end of block device · 684c9aae
      Committed by Linus Torvalds
      The direct-IO write path already had the i_size checks in mm/filemap.c,
      but it turns out the read path did not, and removing the block size
      checks in fs/block_dev.c (commit bbec0270: "blkdev_max_block: make
      private to fs/buffer.c") removed the magic "shrink IO to past the end of
      the device" code there.
      
      Fix it by truncating the IO to the size of the block device, like the
      write path already does.
      
      NOTE! I suspect the write path would be *much* better off doing it this
      way in fs/block_dev.c, rather than hidden deep in mm/filemap.c.  The
      mm/filemap.c code is extremely hard to follow, and has various
      conditionals on the target being a block device (ie the flag passed in
      to 'generic_write_checks()', along with a conditional update of the
      inode timestamp etc).
      
      It is also quite possible that we should treat this whole block device
      size as a "s_maxbytes" issue, and try to make the logic even more
      generic.  However, in the meantime this is the fairly minimal targeted
      fix.
      
      Noted by Milan Broz thanks to a regression test for the cryptsetup
      reencrypt tool.
      Reported-and-tested-by: Milan Broz <mbroz@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
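The fix described in the O_DIRECT commit above, truncating the IO to the size of the block device, comes down to clamping the requested length against the bytes left after the current position. The sketch below shows only that arithmetic in userspace; `sketch_clamp_read()` is a hypothetical helper and not a function from the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return how many of `count` bytes may be read starting at byte offset
 * `pos` from a device that is `dev_size` bytes long. Reads that start
 * at or beyond the end of the device are shortened to zero bytes.
 */
static size_t sketch_clamp_read(uint64_t pos, size_t count, uint64_t dev_size)
{
    uint64_t left;

    if (pos >= dev_size)
        return 0;

    left = dev_size - pos;
    return count < left ? count : (size_t)left;
}
```

For an 8192-byte device, a 4096-byte read at offset 6144 is shortened to 2048 bytes, and a read starting at offset 8192 returns 0.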
  5. 08 Dec, 2012 2 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 1b3c393c
      Committed by Linus Torvalds
      Pull networking fixes from David Miller:
       "Two stragglers:
      
         1) The new code that adds new flushing semantics to GRO can cause SKB
            pointer list corruption, manage the lists differently to avoid the
            OOPS.  Fix from Eric Dumazet.
      
         2) When TCP fast open does a retransmit of data in a SYN-ACK or
            similar, we update retransmit state that we shouldn't, triggering a
            WARN_ON later.  Fix from Yuchung Cheng."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: gro: fix possible panic in skb_gro_receive()
        tcp: bug fix Fast Open client retransmission
    • net: gro: fix possible panic in skb_gro_receive() · c3c7c254
      Committed by Eric Dumazet
      commit 2e71a6f8 (net: gro: selective flush of packets) added
      a bug for skbs using frag_list. This part of the GRO stack is rarely
      used, as it needs skb not using a page fragment for their skb->head.
      
      Most drivers do use a page fragment, but some of them use GFP_KERNEL
      allocations for the initial fill of their RX ring buffer.
      
      napi_gro_flush() overwrites skb->prev, which was used for these skbs to
      point to the last skb in frag_list.
      
      Fix this using a separate field in struct napi_gro_cb to point to the
      last fragment.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>