1. 20 3月, 2015 1 次提交
    • S
      ARM: perf: reject groups spanning multiple hardware PMUs · e429817b
      Suzuki K. Poulose 提交于
      The perf core implicitly rejects events spanning multiple HW PMUs, as in
      these cases the event->ctx will differ. However this validation is
      performed after pmu::event_init() is called in perf_init_event(), and
      thus pmu::event_init() may be called with a group leader from a
      different HW PMU.
      
      The ARM PMU driver does not take this fact into account, and when
      validating groups assumes that it can call to_arm_pmu(event->pmu) for
      any HW event. When the event in question is from another HW PMU this is
      wrong, and results in dereferencing garbage.
      
      This patch updates the ARM PMU driver to first test for and reject
      events from other PMUs, moving the to_arm_pmu and related logic after
      this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with
      a CCI PMU present:
      
       ---
      CPU: 0 PID: 1527 Comm: perf_fuzzer Not tainted 4.0.0-rc2 #57
      Hardware name: ARM-Versatile Express
      task: bd8484c0 ti: be676000 task.ti: be676000
      PC is at 0xbf1bbc90
      LR is at validate_event+0x34/0x5c
      pc : [<bf1bbc90>]    lr : [<80016060>]    psr: 00000013
      ...
      [<80016060>] (validate_event) from [<80016198>] (validate_group+0x28/0x90)
      [<80016198>] (validate_group) from [<80016398>] (armpmu_event_init+0x150/0x218)
      [<80016398>] (armpmu_event_init) from [<800882e4>] (perf_try_init_event+0x30/0x48)
      [<800882e4>] (perf_try_init_event) from [<8008f544>] (perf_init_event+0x5c/0xf4)
      [<8008f544>] (perf_init_event) from [<8008f8a8>] (perf_event_alloc+0x2cc/0x35c)
      [<8008f8a8>] (perf_event_alloc) from [<8009015c>] (SyS_perf_event_open+0x498/0xa70)
      [<8009015c>] (SyS_perf_event_open) from [<8000e420>] (ret_fast_syscall+0x0/0x34)
      Code: bf1be000 bf1bb380 802a2664 00000000 (00000002)
      ---[ end trace 01aff0ff00926a0a ]---
      
      Also cleans up the code to use the arm_pmu only when we know that
      we are dealing with an arm pmu event.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NPeter Ziljstra (Intel) <peterz@infradead.org>
      Signed-off-by: NSuzuki K. Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      e429817b
  2. 18 3月, 2015 3 次提交
    • S
      ARM: perf: Add support for Scorpion PMUs · 341e42c4
      Stephen Boyd 提交于
      Scorpion supports a set of local performance monitor event
      selection registers (LPM) sitting behind a cp15 based interface
      that extend the architected PMU events to include Scorpion CPU
      and Venum VFP specific events. To use these events the user is
      expected to program the lpm register with the event code shifted
      into the group they care about and then point the PMNx event at
      that region+group combo by writing a LPMn_GROUPx event. Add
      support for this hardware.
      
      Note: the raw event number is a pure software construct that
      allows us to map the multi-dimensional number space of regions,
      groups, and event codes into a flat event number space suitable
      for use by the perf framework.
      
      This is based on code originally written by Sheetal Sahasrabudhe,
      Ashwin Chaugule, and Neil Leeder [1].
      
      [1] https://www.codeaurora.org/cgit/quic/la/kernel/msm/tree/arch/arm/kernel/perf_event_msm.c?h=msm-3.4
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Neil Leeder <nleeder@codeaurora.org>
      Cc: Ashwin Chaugule <ashwinc@codeaurora.org>
      Cc: Sheetal Sahasrabudhe <sheetals@codeaurora.org>
      Cc: <devicetree@vger.kernel.org>
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      341e42c4
    • S
      ARM: perf: Only reset PMxEVCNTCR registers on reset · 93499918
      Stephen Boyd 提交于
      The Krait specific PMxEVCNTCR register is unpredictable upon
      reset. Currently we clear the register before we setup an event,
      but we don't need to do that. Instead, we can iterate through all
      the events and clear them once when we reset the PMU, saving a
      write in the event setup path.
      
      Cc: Neil Leeder <nleeder@codeaurora.org>
      Cc: Ashwin Chaugule <ashwinc@codeaurora.org>
      Cc: Sheetal Sahasrabudhe <sheetals@codeaurora.org>
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      93499918
    • S
      ARM: perf: Preparatory work for Scorpion PMU support · 65bab451
      Stephen Boyd 提交于
      Do some things to make the Krait PMU support code generic enough
      to be used by the Scorpion PMU support code.
      
       * Rename the venum register functions to be venum instead of krait
         specific because the same registers exist on Scorpion
      
       * Add some macros to decode our Krait specific event encoding that's
         the same on Scorpion (modulo an extra region).
      
       * Drop 'krait' from krait_clear_pmresrn_group() so it can be used
         by Scorpion code
      Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      65bab451
  3. 16 3月, 2015 6 次提交
    • L
      Linux 4.0-rc4 · 06e5801b
      Linus Torvalds 提交于
      06e5801b
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 08352086
      Linus Torvalds 提交于
      Pull drm fix from Dave Airlie:
       "An oops snuck in in an -rc3 patch, this fixes it"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        [PATCH] drm/mm: Fix support 4 GiB and larger ranges
      08352086
    • L
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 1ee89c51
      Linus Torvalds 提交于
      Pull clock framework fixes from Michael Turquette:
       "The clk fixes for 4.0-rc4 comprise three themes.
      
        First are the usual driver fixes for new regressions since v3.19.
      
        Second are fixes to the common clock divider type caused by recent
        changes to how we round clock rates.  This affects many clock drivers
        that use this common code.
      
        Finally there are fixes for drivers that improperly compared struct
        clk pointers (drivers must not deref these pointers).  While some of
        these drivers have done this for a long time, this did not cause a
        problem until we started generating unique struct clk pointers for
        every consumer.  A new function, clk_is_match was introduced to get
        these drivers working again and they are fixed up to no longer deref
        the pointers themselves"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        ASoC: kirkwood: fix struct clk pointer comparing
        ASoC: fsl_spdif: fix struct clk pointer comparing
        ARM: imx: fix struct clk pointer comparing
        clk: introduce clk_is_match
        clk: don't export static symbol
        clk: divider: fix calculation of initial best divider when rounding to closest
        clk: divider: fix selection of divider when rounding to closest
        clk: divider: fix calculation of maximal parent rate for a given divider
        clk: divider: return real rate instead of divider value
        clk: qcom: fix platform_no_drv_owner.cocci warnings
        clk: qcom: fix platform_no_drv_owner.cocci warnings
        clk: qcom: Add PLL4 vote clock
        clk: qcom: lcc-msm8960: Fix PLL rate detection
        clk: qcom: Fix slimbus n and m val offsets
        clk: ti: Fix FAPLL parent enable bit handling
      1ee89c51
    • K
      [PATCH] drm/mm: Fix support 4 GiB and larger ranges · 046d669c
      Krzysztof Kolasa 提交于
      bad argument if(tmp)... in check_free_hole
      
      fix oops: kernel BUG at drivers/gpu/drm/drm_mm.c:305!
      
      [airlied: excellent, this was my task for today].
      Signed-off-by: NKrzysztof Kolasa <kkolasa@winsoft.pl>
      Reviewed-by: NChris wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      046d669c
    • L
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 6981e2af
      Linus Torvalds 提交于
      Pull ARM SoC fixes from Arnd Bergmann:
       "This is a rather unpleasantly large set of bug fixes for arm-soc, Most
        of them because of cross-tree dependencies for Exynos where we should
        have figured out the right path to merge things before the merge
        window, and then the maintainer being unable to sort things out in
        time during a business trip.
      
        The other changes contained here are the usual collection:
      
        MAINTAINERS file updates
         - Gregory Clement is now a co-maintainer for the legacy Marvell EBU
           platforms
         - A MAINTAINERS entry for the Freescale Vybrid platform that was
           added last year
         - Matt Porter no longer works as a maintainer on Broadcom SoCs
      
        Build-time issues
         - A compile-time error for at91
         - Several minor DT fixes on at91, imx, exynos, socfpga, and omap
         - The new digicolor platform was not correctly enabled at all
      
        Configuration issues
         - Two defconfig fix for regressions using USB on versatile express
           and on OMAP3
         - Enabling all 8 CPUs on Allwinner/SUNxi
         - Enabling the new STiH410 platform to be usable
      
        Bug fixes in platform code
         - A missing barrier for socfpga
         - Fixing LPDDR1 self-refresh mode on at91
         - Fixing RTC interrupt numbers on Exynos3250
         - Fixing a cache-coherency issues in CPU power-down on Exynos5
         - Multiple small OMAP power management fixes"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (69 commits)
        MAINTAINERS: Add myself as co-maintainer to the legacy support of the mvebu SoCs
        ARM: at91: pm_slowclock: fix the compilation error
        ARM: at91/dt: fix USB high-speed clock to select UTMI
        ARM: at91/dt: fix at91 udc compatible strings
        ARM: at91/dt: declare matrix node as a syscon device
        ARM: vexpress: update CONFIG_USB_ISP1760 option
        ARM: digicolor: add the machine directory to Makefile
        ARM: STi: Add STiH410 SoC support
        MAINTAINERS: add Freescale Vybrid SoC
        MAINTAINERS: Remove self as ARM mach-bcm co-maintainer
        ARM: imx6sl-evk: set swbst_reg as vbus's parent reg
        ARM: imx6qdl-sabresd: set swbst_reg as vbus's parent reg
        ARM: at91/dt: at91sam9261: fix clocks and clock-names in udc definition
        ARM: OMAP2+: Fix wl12xx on dm3730-evm with mainline u-boot
        ARM: OMAP: enable TWL4030_USB in omap2plus_defconfig
        ARM: dts: dra7x-evm: avoid possible contention while muxing on CAN lines
        ARM: dts: dra7x-evm: Don't use dcan1_rx.gpio1_15 in DCAN pinctrl
        ARM: dts: am43xx: fix SLEWCTRL_FAST pinctrl binding
        ARM: dts: am33xx: fix SLEWCTRL_FAST pinctrl binding
        ARM: dts: OMAP5: fix polling intervals for thermal zones
        ...
      6981e2af
    • L
      Merge tag 'irqchip-fixes-4.0' of git://git.infradead.org/users/jcooper/linux · 71c87bd0
      Linus Torvalds 提交于
      Pull irqchip fixes from Jason Cooper:
       "armada-370-xp:
         - Chained per-cpu interrupts
      
        gic{,-v3,v3-its}"
         - Various fixes for safer operation"
      
      * tag 'irqchip-fixes-4.0' of git://git.infradead.org/users/jcooper/linux:
        irqchip: gicv3-its: Support safe initialization
        irqchip: gicv3-its: Define macros for GITS_CTLR fields
        irqchip: gicv3-its: Add limitation to page order
        irqchip: gicv3-its: Use 64KB page as default granule
        irqchip: gicv3-its: Zero itt before handling to hardware
        irqchip: gic-v3: Fix out of bounds access to cpu_logical_map
        irqchip: gic: Fix unsafe locking reported by lockdep
        irqchip: gicv3-its: Fix unsafe locking reported by lockdep
        irqchip: gicv3-its: Iterate over PCI aliases to generate ITS configuration
        irqchip: gicv3-its: Allocate enough memory for the full range of DeviceID
        irqchip: gicv3-its: Fix ITS CPU init
        irqchip: armada-370-xp: Fix chained per-cpu interrupts
      71c87bd0
  4. 15 3月, 2015 7 次提交
  5. 14 3月, 2015 11 次提交
  6. 13 3月, 2015 12 次提交
    • G
      of/platform: Fix sparc:allmodconfig build · a697c2ef
      Guenter Roeck 提交于
      sparc:allmodconfig fails to build with:
      
      drivers/built-in.o: In function `platform_bus_init':
      (.init.text+0x3684): undefined reference to `of_platform_register_reconfig_notifier'
      
      of_platform_register_reconfig_notifier is only declared if both OF_ADDRESS
      and OF_DYNAMIC are configured. Yet, the include file only declares a dummy
      function if OF_DYNAMIC is not configured. The sparc architecture does not
      configure OF_ADDRESS, but does configure OF_DYNAMIC, causing above error.
      
      Fixes: 801d728c ("of/reconfig: Add OF_DYNAMIC notifier for platform_bus_type")
      Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NRob Herring <robh@kernel.org>
      a697c2ef
    • T
      ALSA: hda - Don't access stereo amps for mono channel widgets · ef403edb
      Takashi Iwai 提交于
      The current HDA generic parser initializes / modifies the amp values
      always in stereo, but this seems causing the problem on ALC3229 codec
      that has a few mono channel widgets: namely, these mono widgets react
      to actions for both channels equally.
      
      In the driver code, we do care the mono channel and create a control
      only for the left channel (as defined in HD-audio spec) for such a
      node.  When the control is updated, only the left channel value is
      changed.  However, in the resume, the right channel value is also
      restored from the initial value we took as stereo, and this overwrites
      the left channel value.  This ends up being the silent output as the
      right channel has been never touched and remains muted.
      
      This patch covers the places where unconditional stereo amp accesses
      are done and converts to the conditional accesses.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94581
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      ef403edb
    • L
      Merge branch 'akpm' (patches from Andrew) · c202baf0
      Linus Torvalds 提交于
      Merge misc fixes from Andrew Morton:
       "13 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        memcg: disable hierarchy support if bound to the legacy cgroup hierarchy
        mm: reorder can_do_mlock to fix audit denial
        kasan, module: move MODULE_ALIGN macro into <linux/moduleloader.h>
        kasan, module, vmalloc: rework shadow allocation for modules
        fanotify: fix event filtering with FAN_ONDIR set
        mm/nommu.c: export symbol max_mapnr
        arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU
        nilfs2: fix deadlock of segment constructor during recovery
        mm: cma: fix CMA aligned offset calculation
        mm, hugetlb: close race when setting PageTail for gigantic pages
        mm, oom: do not fail __GFP_NOFAIL allocation if oom killer is disabled
        drivers/rtc/rtc-s3c.c: add .needs_src_clk to s3c6410 RTC data
        ocfs2: make append_dio an incompat feature
      c202baf0
    • V
      memcg: disable hierarchy support if bound to the legacy cgroup hierarchy · 7feee590
      Vladimir Davydov 提交于
      If the memory cgroup controller is initially mounted in the scope of the
      default cgroup hierarchy and then remounted to a legacy hierarchy, it will
      still have hierarchy support enabled, which is incorrect.  We should
      disable hierarchy support if bound to the legacy cgroup hierarchy.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7feee590
    • J
      mm: reorder can_do_mlock to fix audit denial · a5a6579d
      Jeff Vander Stoep 提交于
      A userspace call to mmap(MAP_LOCKED) may result in the successful locking
      of memory while also producing a confusing audit log denial.  can_do_mlock
      checks capable and rlimit.  If either of these return positive
      can_do_mlock returns true.  The capable check leads to an LSM hook used by
      apparmour and selinux which produce the audit denial.  Reordering so
      rlimit is checked first eliminates the denial on success, only recording a
      denial when the lock is unsuccessful as a result of the denial.
      Signed-off-by: NJeff Vander Stoep <jeffv@google.com>
      Acked-by: NNick Kralevich <nnk@google.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Paul Cassella <cassella@cray.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a5a6579d
    • A
      kasan, module: move MODULE_ALIGN macro into <linux/moduleloader.h> · d3733e5c
      Andrey Ryabinin 提交于
      include/linux/moduleloader.h is more suitable place for this macro.
      Also change alignment to PAGE_SIZE for CONFIG_KASAN=n as such
      alignment already assumed in several places.
      Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d3733e5c
    • A
      kasan, module, vmalloc: rework shadow allocation for modules · a5af5aa8
      Andrey Ryabinin 提交于
      Current approach in handling shadow memory for modules is broken.
      
      Shadow memory could be freed only after memory shadow corresponds it is no
      longer used.  vfree() called from interrupt context could use memory its
      freeing to store 'struct llist_node' in it:
      
          void vfree(const void *addr)
          {
          ...
              if (unlikely(in_interrupt())) {
                  struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
                  if (llist_add((struct llist_node *)addr, &p->list))
                          schedule_work(&p->wq);
      
      Later this list node used in free_work() which actually frees memory.
      Currently module_memfree() called in interrupt context will free shadow
      before freeing module's memory which could provoke kernel crash.
      
      So shadow memory should be freed after module's memory.  However, such
      deallocation order could race with kasan_module_alloc() in module_alloc().
      
      Free shadow right before releasing vm area.  At this point vfree()'d
      memory is not used anymore and yet not available for other allocations.
      New VM_KASAN flag used to indicate that vm area has dynamically allocated
      shadow memory so kasan frees shadow only if it was previously allocated.
      Signed-off-by: NAndrey Ryabinin <a.ryabinin@samsung.com>
      Acked-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a5af5aa8
    • S
      fanotify: fix event filtering with FAN_ONDIR set · b3c1030d
      Suzuki K. Poulose 提交于
      With FAN_ONDIR set, the user can end up getting events, which it hasn't
      marked.  This was revealed with fanotify04 testcase failure on
      Linux-4.0-rc1, and is a regression from 3.19, revealed with 66ba93c0
      ("fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask").
      
         # /opt/ltp/testcases/bin/fanotify04
         [ ... ]
        fanotify04    7  TPASS  :  event generated properly for type 100000
        fanotify04    8  TFAIL  :  fanotify04.c:147: got unexpected event 30
        fanotify04    9  TPASS  :  No event as expected
      
      The testcase sets the adds the following marks : FAN_OPEN | FAN_ONDIR for
      a fanotify on a dir.  Then does an open(), followed by close() of the
      directory and expects to see an event FAN_OPEN(0x20).  However, the
      fanotify returns (FAN_OPEN|FAN_CLOSE_NOWRITE(0x10)).  This happens due to
      the flaw in the check for event_mask in fanotify_should_send_event() which
      does:
      
      	if (event_mask & marks_mask & ~marks_ignored_mask)
      		return true;
      
      where, event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE),
             marks_mask == (FAN_ONDIR | FAN_OPEN),
             marks_ignored_mask == 0
      
      Fix this by masking the outgoing events to the user, as we already take
      care of FAN_ONDIR and FAN_EVENT_ON_CHILD.
      Signed-off-by: NSuzuki K. Poulose <suzuki.poulose@arm.com>
      Tested-by: NLino Sanfilippo <LinoSanfilippo@gmx.de>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b3c1030d
    • G
      mm/nommu.c: export symbol max_mapnr · 5b8bf307
      gchen gchen 提交于
      Several modules may need max_mapnr, so export, the related error with
      allmodconfig under c6x:
      
        MODPOST 3327 modules
        ERROR: "max_mapnr" [fs/pstore/ramoops.ko] undefined!
        ERROR: "max_mapnr" [drivers/media/v4l2-core/videobuf2-dma-contig.ko] undefined!
      Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5b8bf307
    • C
      arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU · 65b9ab88
      Chen Gang 提交于
      When !MMU, asm-generic will not define default pgprot_writecombine, so c6x
      needs to define it by itself.  The related error:
      
          CC [M]  fs/pstore/ram_core.o
        fs/pstore/ram_core.c: In function 'persistent_ram_vmap':
        fs/pstore/ram_core.c:399:10: error: implicit declaration of function 'pgprot_writecombine' [-Werror=implicit-function-declaration]
           prot = pgprot_writecombine(PAGE_KERNEL);
                  ^
        fs/pstore/ram_core.c:399:8: error: incompatible types when assigning to type 'pgprot_t {aka struct <anonymous>}' from type 'int'
           prot = pgprot_writecombine(PAGE_KERNEL);
                ^
      Signed-off-by: NChen Gang <gang.chen.5i5j@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      65b9ab88
    • R
      nilfs2: fix deadlock of segment constructor during recovery · 283ee148
      Ryusuke Konishi 提交于
      According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
      during recovery at mount time.  The code path that caused the deadlock was
      as follows:
      
        nilfs_fill_super()
          load_nilfs()
            nilfs_salvage_orphan_logs()
              * Do roll-forwarding, attach segment constructor for recovery,
                and kick it.
      
              nilfs_segctor_thread()
                nilfs_segctor_thread_construct()
                 * A lock is held with nilfs_transaction_lock()
                   nilfs_segctor_do_construct()
                     nilfs_segctor_drop_written_files()
                       iput()
                         iput_final()
                           write_inode_now()
                             writeback_single_inode()
                               __writeback_single_inode()
                                 do_writepages()
                                   nilfs_writepage()
                                     nilfs_construct_dsync_segment()
                                       nilfs_transaction_lock() --> deadlock
      
      This can happen if commit 7ef3ff2f ("nilfs2: fix deadlock of segment
      constructor over I_SYNC flag") is applied and roll-forward recovery was
      performed at mount time.  The roll-forward recovery can happen if datasync
      write is done and the file system crashes immediately after that.  For
      instance, we can reproduce the issue with the following steps:
      
       < nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
       # dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
       # dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k
       count=1 && reboot -nfh
       < the system will immediately reboot >
       # mount -t nilfs2 /dev/sdb1 /nilfs
      
      The deadlock occurs because iput() can run segment constructor through
      writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags.  The
      above commit changed segment constructor so that it calls iput()
      asynchronously for inodes with i_nlink == 0, but that change was
      imperfect.
      
      This fixes the another deadlock by deferring iput() in segment constructor
      even for the case that mount is not finished, that is, for the case that
      MS_ACTIVE flag is not set.
      Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Reported-by: NYuxuan Shui <yshuiv7@gmail.com>
      Tested-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      283ee148
    • D
      mm: cma: fix CMA aligned offset calculation · 850fc430
      Danesh Petigara 提交于
      The CMA aligned offset calculation is incorrect for non-zero order_per_bit
      values.
      
      For example, if cma->order_per_bit=1, cma->base_pfn= 0x2f800000 and
      align_order=12, the function returns a value of 0x17c00 instead of 0x400.
      
      This patch fixes the CMA aligned offset calculation.
      
      The previous calculation was wrong and would return too-large values for
      the offset, so that when cma_alloc looks for free pages in the bitmap with
      the requested alignment > order_per_bit, it starts too far into the bitmap
      and so CMA allocations will fail despite there actually being plenty of
      free pages remaining.  It will also probably have the wrong alignment.
      With this change, we will get the correct offset into the bitmap.
      
      One affected user is powerpc KVM, which has kvm_cma->order_per_bit set to
      KVM_CMA_CHUNK_ORDER - PAGE_SHIFT, or 18 - 12 = 6.
      
      [gregory.0xf0@gmail.com: changelog additions]
      Signed-off-by: NDanesh Petigara <dpetigara@broadcom.com>
      Reviewed-by: NGregory Fong <gregory.0xf0@gmail.com>
      Acked-by: NMichal Nazarewicz <mina86@mina86.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      850fc430
新手
引导
客服 返回
顶部