1. 03 Oct, 2014 1 commit
  2. 01 Oct, 2014 1 commit
  3. 29 Sep, 2014 4 commits
    • cpufreq: Replace strnicmp with strncasecmp · 7c4f4539
      Rasmus Villemoes committed
      The kernel used to contain two functions for length-delimited,
      case-insensitive string comparison, strnicmp with correct semantics
      and a slightly buggy strncasecmp. The latter is the POSIX name, so
      strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper
      for the new strncasecmp to avoid breaking existing users.
      
      To allow the compat wrapper strnicmp to be removed at some point in
      the future, and to avoid the extra indirection cost, do
      s/strnicmp/strncasecmp/g.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
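      As a rough userspace illustration of the length-delimited, case-insensitive
      comparison discussed above, the sketch below uses the POSIX strncasecmp()
      from <strings.h>. It is not kernel code; the helper name name_matches() and
      the sample strings in the usage note are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* POSIX strncasecmp() */

/*
 * Hypothetical helper (not from the kernel): returns 1 when the first
 * `len` bytes of `input` and `name` are equal ignoring case, else 0.
 */
int name_matches(const char *input, const char *name, size_t len)
{
    assert(input != NULL && name != NULL);
    return strncasecmp(input, name, len) == 0;
}
```

      With this helper, name_matches("OnDemand", "ondemand", 8) is true, while
      name_matches("powersave", "performance", 11) is false because the strings
      differ within the compared length.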
    • cpufreq: powernv: Set the cpus to nominal frequency during reboot/kexec · cf30af76
      Shilpasri G Bhat committed
      This patch ensures that the cpus kexec/reboot at nominal frequency.
      Nominal frequency is the highest cpu frequency on PowerPC at
      which the cores can run without getting throttled.

      If the host kernel had set the cpus to a low pstate and then
      kexecs/reboots to a cpufreq-disabled kernel, the target kernel
      would perform poorly and take longer to boot. So set the cpus to
      a high pstate, in this case the nominal frequency, before
      rebooting to avoid such scenarios.

      The reboot notifier will set the cpus to nominal frequency.
      Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: powernv: Set the pstate of the last hotplugged out cpu in policy->cpus to minimum · b120339c
      Preeti U Murthy committed
      It is possible today for the pstate of a core to be held high even after the
      entire core is hotplugged out, if a load had just run on the hotplugged cpu. This is
      fair, since it is assumed that the pstate does not matter to a cpu in a deep idle
      state, which is the expected state of a hotplugged core on powerpc. However, on powerpc
      the pstate at a socket level is held at the maximum of the pstates of each core. Even
      if the pstates of the active cores on that socket are low, the socket pstate is held
      high due to the pstate of the hotplugged core in the above-mentioned scenario. This
      can waste a significant amount of power for no benefit.

      Besides, since it is a non-active core, nothing can be done from the kernel's end
      to set the frequency of the core right. Hence make use of the stop_cpu callback
      to explicitly set the pstate of the core to a minimum when the last cpu of the
      core gets hotplugged out.
      Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
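      The effect described above can be sketched with a toy userspace model, in
      which the socket pstate is the maximum of the per-core pstates and larger
      numbers mean higher frequency. This is only an illustration, not the
      powernv driver; the function names and the numbering of pstates are
      assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: the socket pstate is the maximum of the per-core pstates. */
int socket_pstate(const int *core_pstate, size_t ncores)
{
    assert(core_pstate != NULL && ncores > 0);

    int max = core_pstate[0];
    for (size_t i = 1; i < ncores; i++) {
        if (core_pstate[i] > max)
            max = core_pstate[i];
    }
    return max;
}

/*
 * Toy version of the fix: when the last cpu of a core goes offline,
 * explicitly drop that core's pstate to the minimum so it no longer
 * holds the socket pstate high.
 */
void core_offline(int *core_pstate, size_t core, int min_pstate)
{
    assert(core_pstate != NULL);
    core_pstate[core] = min_pstate;
}
```

      With per-core pstates {2, 9, 1}, the socket sits at 9 because of the second
      core alone. After core_offline() sets that core to 0, the socket pstate
      falls to 2, the level the active cores actually need.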
    • cpufreq: Allow stop CPU callback to be used by all cpufreq drivers · 789ca243
      Preeti U Murthy committed
      Commit 367dc4aa ("cpufreq: Add stop CPU callback to
      cpufreq_driver interface") introduced the stop CPU callback for
      intel_pstate drivers. During the CPU_DOWN_PREPARE stage, this
      callback is invoked so that drivers can take some action on the
      pstate of the cpu before it is taken offline. This callback was
      assumed to be useful only for those drivers which have implemented
      the set_policy CPU callback because they have no other way to take
      action about the cpufreq of a CPU which is being hotplugged out
      except in the exit callback which is called very late in the offline
      process.
      
      The drivers which implement the target/target_index callbacks were
      expected to take care of requirements like the ones that commit
      367dc4aa addresses in the GOV_STOP notification event. But there
      are disadvantages to restricting the usage of the stop CPU callback
      to cpufreq drivers that implement the set_policy callback and
      want to take explicit action on setting the cpufreq during a
      hotplug operation.
      
      1. GOV_STOP gets called for every CPU offline and drivers would usually
      want to take action when the last cpu in the policy->cpus mask
      is taken offline. As long as there is more than one cpu in the
      policy->cpus mask, cpufreq core itself makes sure that the freq
      for the other cpus in this mask is set according to the maximum load.
      This is sensible and drivers which implement the target_index callback
      would mostly not want to modify that. However the cpufreq core leaves a
      loose end when the cpu in the policy->cpus mask is the last one to go offline;
      it does nothing explicit to the frequency of the core. Drivers may need
      a way to take some action here and stop CPU callback mechanism is the
      best way to do it today.
      
      2. We cannot implement driver-specific actions in the GOV_STOP mechanism
      itself, so we would need another driver callback invoked from there,
      which is unnecessary.
      
      Therefore this patch extends the usage of stop CPU callback to be used
      by all cpufreq drivers as long as they have this callback implemented
      and irrespective of whether they are set_policy/target_index drivers.
      The assumption is that if drivers find the GOV_STOP path to be a suitable
      way of implementing what they want to do with the freq of the cpu
      going offline, they will not implement the stop CPU callback at all.
      Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
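      The dispatch rule this commit describes can be sketched as a small
      userspace model: the optional stop CPU callback runs when the last cpu in
      the policy goes offline, whenever the driver provides it and regardless of
      the driver type. The toy_* types and functions below are assumptions for
      illustration and do not mirror the real cpufreq structures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cpufreq policy (hypothetical fields). */
struct toy_policy {
    unsigned int online_cpus;  /* cpus still online in the policy mask */
    int pstate;                /* current pstate of the core */
};

/* Toy stand-in for a cpufreq driver; stop_cpu is optional (may be NULL). */
struct toy_driver {
    void (*stop_cpu)(struct toy_policy *policy);
};

/* Example callback: drop the core to a minimum pstate (0 in this model). */
void toy_set_min_pstate(struct toy_policy *policy)
{
    policy->pstate = 0;
}

/*
 * Take one cpu offline. The callback is invoked only for the last cpu
 * in the policy, and only if the driver implements it.
 */
void toy_cpu_offline(const struct toy_driver *drv, struct toy_policy *policy)
{
    assert(drv != NULL && policy != NULL && policy->online_cpus > 0);

    policy->online_cpus--;
    if (policy->online_cpus == 0 && drv->stop_cpu != NULL)
        drv->stop_cpu(policy);
}
```

      A driver that is content with the GOV_STOP path simply leaves stop_cpu as
      NULL, and in this model nothing happens to the pstate when its last cpu
      goes offline.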
  4. 09 Sep, 2014 8 commits
  5. 08 Sep, 2014 1 commit
  6. 03 Sep, 2014 1 commit
  7. 01 Sep, 2014 8 commits
    • Linux 3.17-rc3 · 69e273c0
      Linus Torvalds committed
    • Merge tag 'xtensa-20140830' of git://github.com/czankel/xtensa-linux · 05bdb8c9
      Linus Torvalds committed
      Pull Xtensa updates from Chris Zankel:
       "Xtensa improvements for 3.17:
         - support highmem on cores with aliasing data cache.  Enable highmem
           on kc705 by default
         - simplify addition of new core variants (no need to modify Kconfig /
           Makefiles)
         - improve robustness of unaligned access handler and its interaction
           with window overflow/underflow exception handlers
         - deprecate atomic and spill registers syscalls
         - clean up Kconfig: remove orphan MATH_EMULATION, sort 'select'
           statements
         - wire up renameat2 syscall.
      
        Various fixes:
         - fix address checks in dma_{alloc,free}_coherent (runtime BUG)
         - fix access to THREAD_RA/THREAD_SP/THREAD_DS (debug build breakage)
         - fix TLBTEMP_BASE_2 region handling in fast_second_level_miss
           (runtime unrecoverable exception)
         - fix a6 and a7 handling in fast_syscall_xtensa (runtime userspace
           register clobbering)
         - fix kernel/user jump out of fast_unaligned (potential runtime
           unrecoverable exception)
         - replace termios IOCTL code definitions with constants (userspace
           build breakage)"
      
      * tag 'xtensa-20140830' of git://github.com/czankel/xtensa-linux: (25 commits)
        xtensa: deprecate fast_xtensa and fast_spill_registers syscalls
        xtensa: don't allow overflow/underflow on unaligned stack
        xtensa: fix a6 and a7 handling in fast_syscall_xtensa
        xtensa: allow single-stepping through unaligned load/store
        xtensa: move invalid unaligned instruction handler closer to its users
        xtensa: make fast_unaligned store restartable
        xtensa: add double exception fixup handler for fast_unaligned
        xtensa: fix kernel/user jump out of fast_unaligned
        xtensa: configure kc705 for highmem
        xtensa: support highmem in aliasing cache flushing code
        xtensa: support aliasing cache in kmap
        xtensa: support aliasing cache in k[un]map_atomic
        xtensa: implement clear_user_highpage and copy_user_highpage
        xtensa: fix TLBTEMP_BASE_2 region handling in fast_second_level_miss
        xtensa: allow fixmap and kmap span more than one page table
        xtensa: make fixmap region addressing grow with index
        xtensa: fix access to THREAD_RA/THREAD_SP/THREAD_DS
        xtensa: add renameat2 syscall
        xtensa: fix address checks in dma_{alloc,free}_coherent
        xtensa: replace IOCTL code definitions with constants
        ...
    • unicore32: Fix build error · ca98565a
      Guenter Roeck committed
      unicore32 builds fail with
      
        arch/unicore32/kernel/signal.c: In function ‘setup_frame’:
        arch/unicore32/kernel/signal.c:257: error: ‘usig’ undeclared (first use in this function)
        arch/unicore32/kernel/signal.c:279: error: ‘usig’ undeclared (first use in this function)
        arch/unicore32/kernel/signal.c: In function ‘handle_signal’:
        arch/unicore32/kernel/signal.c:306: warning: unused variable ‘tsk’
        arch/unicore32/kernel/signal.c: In function ‘do_signal’:
        arch/unicore32/kernel/signal.c:376: error: implicit declaration of function ‘get_signsl’
        make[1]: *** [arch/unicore32/kernel/signal.o] Error 1
        make: *** [arch/unicore32/kernel/signal.o] Error 2
      
      Bisect points to commit 649671c9 ("unicore32: Use get_signal()
      signal_setup_done()").
      
      This code never even compiled.  Reverting the patch does not work, since
      previously used functions no longer exist, so try to fix it up.  Compile
      tested only.
      
      Fixes: 649671c9 ("unicore32: Use get_signal() signal_setup_done()")
      Cc: Richard Weinberger <richard@nod.at>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 94559a4a
      Linus Torvalds committed
      Pull ARM fixes from Russell King:
       "Various assorted fixes:
      
         - a couple of patches from Mark Rutland to resolve an erratum with
           Cortex-A15 CPUs.
         - fix cpuidle for the CPU part ID changes in the last merge window
         - add support for a relocation which ARM binutils is generating in
           some circumstances"
      
      * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
        ARM: 8130/1: cpuidle/cpuidle-big_little: fix reading cpu id part number
        ARM: 8129/1: errata: work around Cortex-A15 erratum 830321 using dummy strex
        ARM: 8128/1: abort: don't clear the exclusive monitors
        ARM: 8127/1: module: add support for R_ARM_TARGET1 relocations
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 19ed3eb9
      Linus Torvalds committed
      Pull ARM SoC fixes from Olof Johansson:
       "Here's the weekly batch of fixes from arm-soc.
      
        The delta is a largeish negative delta, due to revert of SMP support
        for Broadcom's STB SoC -- it was accidentally merged before some
        issues had been addressed, so they will make a new attempt for 3.18.
        I didn't see a need for a full revert of the whole platform due to
        this, we're keeping the rest enabled.
      
        The rest is mostly:
      
         - a handful of DT fixes for i.MX (Hummingboard/Cubox-i in particular)
         - some MTD/NAND fixes for OMAP
         - minor DT fixes for shmobile
         - warning fix for UP builds on vexpress/spc
      
        There's also a couple of patches that wire up hwmod on TI's DRA7 SoC
        so it can boot.  Drivers and the rest had landed for 3.17, and it's
        small and isolated so it made sense to pick up now even if it's not a
        bugfix"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits)
        vexpress/spc: fix a build warning on array bounds
        ARM: DRA7: hwmod: Add dra74x and dra72x specific ocp interface lists
        ARM: DRA7: Add support for soc_is_dra74x() and soc_is_dra72x() variants
        MAINTAINERS: catch special Rockchip code locations
        ARM: dts: microsom-ar8035: MDIO pad must be set open drain
        ARM: dts: omap54xx-clocks: Fix the l3 and l4 clock rates
        ARM: brcmstb: revert SMP support
        ARM: OMAP2+: hwmod: Rearm wake-up interrupts for DT when MUSB is idled
        ARM: dts: Enable UART wake-up events for beagleboard
        ARM: dts: Remove twl6030 clk32g "regulator"
        ARM: OMAP2+: omap_device: remove warning that clk alias already exists
        ARM: OMAP: fix %d confusingly prefixed with 0x in format string
        ARM: dts: DRA7: fix interrupt-cells for GPIO
        mtd: nand: omap: Fix 1-bit Hamming code scheme, omap_calculate_ecc()
        ARM: dts: omap3430-sdp: Revert to using software ECC for NAND
        ARM: OMAP2+: GPMC: Support Software ECC scheme via DT
        mtd: nand: omap: Revert to using software ECC by default
        ARM: dts: hummingboard/cubox-i: change SPDIF output to be more descriptive
        ARM: dts: hummingboard/cubox-i: add USB OC pinctrl configuration
        ARM: shmobile: r8a7791: add missing 0x0100 for SDCKCR
        ...
    • vexpress/spc: fix a build warning on array bounds · e160cc17
      Alex Shi committed
      With ARCH_VEXPRESS_SPC option, kernel build has the following
      warning:
      
      arch/arm/mach-vexpress/spc.c: In function ‘ve_spc_clk_init’:
      arch/arm/mach-vexpress/spc.c:431:38: warning: array subscript is below array bounds [-Warray-bounds]
        struct ve_spc_opp *opps = info->opps[cluster];
                                            ^
      since 'cluster' may be '-1' on a UP system. This patch adds an explicit
      check to fix this issue.
      Signed-off-by: Alex Shi <alex.shi@linaro.org>
      Acked-by: Pawel Moll <pawel.moll@arm.com>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
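      The kind of check this commit describes can be sketched in plain C as
      follows. This is a minimal illustration of validating a possibly negative
      index before using it as an array subscript, not the actual spc.c change;
      the toy_* names and the cluster count of 2 are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CLUSTERS 2   /* hypothetical cluster count */

struct toy_info {
    int opps[TOY_MAX_CLUSTERS];   /* one entry per cluster */
};

/*
 * Return a pointer to the entry for `cluster`, or NULL when the index
 * is out of range (for example -1 when no cluster could be determined).
 */
const int *toy_cluster_opp(const struct toy_info *info, int cluster)
{
    assert(info != NULL);

    if (cluster < 0 || cluster >= TOY_MAX_CLUSTERS)
        return NULL;
    return &info->opps[cluster];
}
```

      A caller then tests the result for NULL instead of indexing the array
      directly, so a cluster value of -1 never reaches the subscript.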
    • Merge tag 'for-v3.17-rc/omap-dra72x-d74x-support-a' of... · 98fd1508
      Olof Johansson committed
      Merge tag 'for-v3.17-rc/omap-dra72x-d74x-support-a' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes
      
      Pull "ARM: OMAP2+: DRA72x/DRA74x basic support" from Tony Lindgren:
      
      Add basic subarchitecture support for the DRA72x and DRA74x.  These
      are OMAP2+ derivative SoCs.  This should be low-risk to existing OMAP
      platforms.
      
      Basic build, boot, and PM test logs are available here:
      
      http://www.pwsan.com/omap/testlogs/hwmod-a-early-v3.17-rc/20140827194314/
      
      * tag 'for-v3.17-rc/omap-dra72x-d74x-support-a' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending:
        ARM: DRA7: hwmod: Add dra74x and dra72x specific ocp interface lists
        ARM: DRA7: Add support for soc_is_dra74x() and soc_is_dra72x() variants
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • Merge tag 'spi-v3.17-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 81bbadc6
      Linus Torvalds committed
      Pull spi bugfixes from Mark Brown:
       "A smattering of bug fixes for the SPI subsystem, all in driver code
        which has seen active work recently and none of them with any great
        global impact.
      
        There's also a new ACPI ID for the pxa2xx driver which required no
        code changes and the addition of kerneldoc for some structure fields
        that were missing it and generating warnings during documentation
        builds as a result"
      
      * tag 'spi-v3.17-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: sh-msiof: Fix transmit-only DMA transfers
        spi/rockchip: Avoid accidentally turning off the clock
        spi: dw: fix kernel crash due to NULL pointer dereference
        spi: dw-pci: fix bug when regs left uninitialized
        spi: davinci: fix SPI_NO_CS functionality
        spi/rockchip: fixup incorrect dma direction setting
        spi/pxa2xx: Add ACPI ID for Intel Braswell
        spi: spi-au1550: fix build failure
        spi: rspi: Fix leaking of unused DMA descriptors
        spi: sh-msiof: Fix leaking of unused DMA descriptors
        spi: Add missing kerneldoc bits
        spi/omap-mcspi: Fix the spi task hangs waiting dma_rx
  8. 31 Aug, 2014 6 commits
  9. 30 Aug, 2014 10 commits
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fd5984d7
      Linus Torvalds committed
      Pull x86 fixes from Peter Anvin:
       "One patch to avoid assigning interrupts we don't actually have on
        non-PC platforms, and two patches that address bugs in the new
        IOAPIC assignment code"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, irq, PCI: Keep IRQ assignment for runtime power management
        x86: irq: Fix bug in setting IOAPIC pin attributes
        x86: Fix non-PC platform kernel crash on boot due to NULL dereference
    • Merge tag 'pm+acpi-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · ad6ede80
      Linus Torvalds committed
      Pull ACPI and power management fixes from Rafael Wysocki:
      
       - Fix for an ACPI regression related to the handling of fixed events
         that caused netlink routines to be (incorrectly) run in interrupt
         context from Lan Tianyu
      
       - Fix for an ACPI EC driver regression on Acer Aspire V5-573G that
         caused AC/battery plug/unplug and video brightness change
         notifications to be delayed on that machine from Lv Zheng
      
       - Fix for an ACPI device enumeration regression that caused ACPI driver
         probe to fail for some devices where it succeeded before (Rafael J
         Wysocki)
      
       - intel_pstate driver fix to prevent it from printing an information
         message for every CPU in the system on every boot from Andi Kleen
      
       - s5pv210 cpufreq driver fix to remove an __init annotation from a
         routine that in fact can be called at any time after init too from
         Mark Brown
      
       - New Intel Braswell device ID for the ACPI LPSS (Low-Power Subsystem)
         driver from Alan Cox
      
       - New Intel Braswell CPU ID for intel_pstate from Mika Westerberg
      
      * tag 'pm+acpi-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: s5pv210: Remove spurious __init annotation
        cpufreq: intel_pstate: Add CPU ID for Braswell processor
        intel_pstate: Turn per cpu printk into pr_debug
        ACPI / LPSS: Add ACPI IDs for Intel Braswell
        ACPI / EC: Add support to disallow QR_EC to be issued before completing previous QR_EC
        ACPI / EC: Add support to disallow QR_EC to be issued when SCI_EVT isn't set
        ACPI: Run fixed event device notifications in process context
        ACPI / scan: Allow ACPI drivers to bind to PNP device objects
    • Merge branch 'akpm' (fixes from Andrew Morton) · 10f3291a
      Linus Torvalds committed
      Merge patches from Andrew Morton:
       "22 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
        kexec: purgatory: add clean-up for purgatory directory
        Documentation/kdump/kdump.txt: add ARM description
        flush_icache_range: export symbol to fix build errors
        tools: selftests: fix build issue with make kselftests target
        ocfs2: quorum: add a log for node not fenced
        ocfs2: o2net: set tcp user timeout to max value
        ocfs2: o2net: don't shutdown connection when idle timeout
        ocfs2: do not write error flag to user structure we cannot copy from/to
        x86/purgatory: use approprate -m64/-32 build flag for arch/x86/purgatory
        drivers/rtc/rtc-s5m.c: re-add support for devices without irq specified
        xattr: fix check for simultaneous glibc header inclusion
        kexec: remove CONFIG_KEXEC dependency on crypto
        kexec: create a new config option CONFIG_KEXEC_FILE for new syscall
        x86,mm: fix pte_special versus pte_numa
        hugetlb_cgroup: use lockdep_assert_held rather than spin_is_locked
        mm/zpool: use prefixed module loading
        zram: fix incorrect stat with failed_reads
        lib: turn CONFIG_STACKTRACE into an actual option.
        mm: actually clear pmd_numa before invalidating
        memblock, memhotplug: fix wrong type in memblock_find_in_range_node().
        ...
    • kexec: purgatory: add clean-up for purgatory directory · b0108f9e
      Michael Welling committed
      Without this patch the kexec-purgatory.c and purgatory.ro files are not
      removed after make mrproper.
      Signed-off-by: Michael Welling <mwelling@ieee.org>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Documentation/kdump/kdump.txt: add ARM description · 16b0371a
      HuKeping committed
      Add arm specific parts to kdump kernel documentation.
      Signed-off-by: Hu Keping <hukeping@huawei.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: Haren Myneni <hbabu@us.ibm.com>
      Cc: Rob Landley <rob@landley.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • flush_icache_range: export symbol to fix build errors · e3560305
      Pranith Kumar committed
      Fix build errors occurring due to a missing export of
      flush_icache_range() in
      
      kisskb.ellerman.id.au/kisskb/buildresult/11677809/
      
      ERROR: "flush_icache_range" [drivers/misc/lkdtm.ko] undefined!
      Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>	[arc]
      Acked-by: Richard Kuo <rkuo@codeaurora.org>	[hexagon]
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Chris Zankel <chris@zankel.net>
      Acked-by: Max Filippov <jcmvbkbc@gmail.com>	[xtensa]
      Cc: Noam Camus <noamc@ezchip.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Acked-by: Zhigang Lu <zlu@tilera.com>		[tile]
      Cc: Kirill Tkhai <tkhai@yandex.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools: selftests: fix build issue with make kselftests target · 498b473a
      Phong Tran committed
      Fix the typo of ARCH when running 'make kselftests'.  Change the 'X86'
      to 'x86'.  Tested by compilation.
      Signed-off-by: Phong Tran <tranmanphong@gmail.com>
      Cc: David Herrmann <dh.herrmann@gmail.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Shuah Khan <shuah.kh@samsung.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: quorum: add a log for node not fenced · 8c7b638c
      Junxiao Bi committed
      For debugging: the log shows whether the fence decision was made and
      why the node was not fenced.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: o2net: set tcp user timeout to max value · 8e9801df
      Junxiao Bi committed
      When the tcp retransmit timeout (15 mins) expires, the connection is closed
      and pending messages may be lost.  So we set the tcp user timeout to its
      max value to override the retransmit timeout.  This is OK for ocfs2 since
      we have the disk heartbeat: if a peer crashes, its disk heartbeat will
      time out and it will be evicted.  If the disk heartbeat does not time out
      and the connection is idle for a long time, the cluster has entered a
      split-brain state; since fencing can't happen, we'd better keep the
      connection and wait for the network to recover.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
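      For readers unfamiliar with the TCP user timeout, the sketch below shows
      how a userspace program on Linux can set and read back the TCP_USER_TIMEOUT
      socket option. It is a userspace analogue only: the ocfs2 change is made
      inside the kernel, and the helper names here, as well as any timeout value
      passed to them, are assumptions for illustration.

```c
#include <assert.h>
#include <netinet/in.h>    /* IPPROTO_TCP */
#include <netinet/tcp.h>   /* TCP_USER_TIMEOUT (Linux-specific) */
#include <sys/socket.h>

/*
 * Set the TCP user timeout on socket `fd`, in milliseconds.
 * Returns 0 on success and -1 on error (errno set by setsockopt).
 */
int set_tcp_user_timeout(int fd, unsigned int timeout_ms)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof(timeout_ms));
}

/* Read the current TCP user timeout in milliseconds, or -1 on error. */
long get_tcp_user_timeout(int fd)
{
    unsigned int timeout_ms = 0;
    socklen_t len = sizeof(timeout_ms);

    assert(fd >= 0);
    if (getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &timeout_ms, &len) != 0)
        return -1;
    return (long)timeout_ms;
}
```

      A caller would create a TCP socket with socket(AF_INET, SOCK_STREAM, 0),
      call set_tcp_user_timeout() with the desired number of milliseconds, and
      check the return value before relying on the setting.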
    • ocfs2: o2net: don't shutdown connection when idle timeout · c43c363d
      Junxiao Bi committed
      This patch series fixes a possible message loss bug in ocfs2 when the
      network goes bad.  This bug can cause ocfs2 to hang forever even after the
      network becomes good again.

      Messages may be lost in the following case.  After the tcp connection is
      established between two nodes, an idle timer is set to check its
      state periodically.  If no messages are received during this time, the idle
      timer times out, shuts down the connection and tries to
      reconnect, so pending messages in the tcp queues are lost.  These
      messages may be from dlm, so dlm may hang in this case, which may
      cause the whole ocfs2 cluster to hang.

      This is very likely to happen when the network state goes bad.  Reconnecting
      is useless, since it will fail if the network state is still bad.  Just
      waiting for the network to recover may be a good idea: no messages are
      lost, and some node will be fenced until the cluster goes into a
      split-brain state.  For that case, the tcp user timeout is used to override
      the tcp retransmit timeout.  It times out after 25 days; the user should
      have noticed the problem through the provided log and fixed the network by
      then.  If they have not, ocfs2 falls back to the original reconnect approach.

      This patch (of 3):

      Some messages in the tcp queue may be lost if we shut down the connection
      and reconnect on idle timeout.  If packets are lost and the reconnect succeeds,
      the ocfs2 cluster may hang.

      To fix this, we can leave the connection in place and make the fence decision
      on idle timeout.  If the network recovers before the fence decision is made, the
      connection survives without losing any messages.

      This bug can be seen when the network state goes bad.  It may cause ocfs2 to hang
      forever if some packets are lost.  With this fix, ocfs2 will recover from the
      hang if the network becomes good again.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>