1. 07 11月, 2014 1 次提交
    • G
      aio: fix uncorrent dirty pages accouting when truncating AIO ring buffer · 835f252c
      Gu Zheng 提交于
      https://bugzilla.kernel.org/show_bug.cgi?id=86831
      
      Markus reported that when shutting down mysqld (with AIO support,
      on a ext3 formatted Harddrive) leads to a negative number of dirty pages
      (underrun to the counter). The negative number results in a drastic reduction
      of the write performance because the page cache is not used, because the kernel
      thinks it is still 2 ^ 32 dirty pages open.
      
      Add a warn trace in __dec_zone_state will catch this easily:
      
      static inline void __dec_zone_state(struct zone *zone, enum
      	zone_stat_item item)
      {
           atomic_long_dec(&zone->vm_stat[item]);
      +    WARN_ON_ONCE(item == NR_FILE_DIRTY &&
      	atomic_long_read(&zone->vm_stat[item]) < 0);
           atomic_long_dec(&vm_stat[item]);
      }
      
      [   21.341632] ------------[ cut here ]------------
      [   21.346294] WARNING: CPU: 0 PID: 309 at include/linux/vmstat.h:242
      cancel_dirty_page+0x164/0x224()
      [   21.355296] Modules linked in: wutbox_cp sata_mv
      [   21.359968] CPU: 0 PID: 309 Comm: kworker/0:1 Not tainted 3.14.21-WuT #80
      [   21.366793] Workqueue: events free_ioctx
      [   21.370760] [<c0016a64>] (unwind_backtrace) from [<c0012f88>]
      (show_stack+0x20/0x24)
      [   21.378562] [<c0012f88>] (show_stack) from [<c03f8ccc>]
      (dump_stack+0x24/0x28)
      [   21.385840] [<c03f8ccc>] (dump_stack) from [<c0023ae4>]
      (warn_slowpath_common+0x84/0x9c)
      [   21.393976] [<c0023ae4>] (warn_slowpath_common) from [<c0023bb8>]
      (warn_slowpath_null+0x2c/0x34)
      [   21.402800] [<c0023bb8>] (warn_slowpath_null) from [<c00c0688>]
      (cancel_dirty_page+0x164/0x224)
      [   21.411524] [<c00c0688>] (cancel_dirty_page) from [<c00c080c>]
      (truncate_inode_page+0x8c/0x158)
      [   21.420272] [<c00c080c>] (truncate_inode_page) from [<c00c0a94>]
      (truncate_inode_pages_range+0x11c/0x53c)
      [   21.429890] [<c00c0a94>] (truncate_inode_pages_range) from
      [<c00c0f6c>] (truncate_pagecache+0x88/0xac)
      [   21.439252] [<c00c0f6c>] (truncate_pagecache) from [<c00c0fec>]
      (truncate_setsize+0x5c/0x74)
      [   21.447731] [<c00c0fec>] (truncate_setsize) from [<c013b3a8>]
      (put_aio_ring_file.isra.14+0x34/0x90)
      [   21.456826] [<c013b3a8>] (put_aio_ring_file.isra.14) from
      [<c013b424>] (aio_free_ring+0x20/0xcc)
      [   21.465660] [<c013b424>] (aio_free_ring) from [<c013b4f4>]
      (free_ioctx+0x24/0x44)
      [   21.473190] [<c013b4f4>] (free_ioctx) from [<c003d8d8>]
      (process_one_work+0x134/0x47c)
      [   21.481132] [<c003d8d8>] (process_one_work) from [<c003e988>]
      (worker_thread+0x130/0x414)
      [   21.489350] [<c003e988>] (worker_thread) from [<c00448ac>]
      (kthread+0xd4/0xec)
      [   21.496621] [<c00448ac>] (kthread) from [<c000ec18>]
      (ret_from_fork+0x14/0x20)
      [   21.503884] ---[ end trace 79c4bf42c038c9a1 ]---
      
      The cause is that we set the aio ring file pages as *DIRTY* via SetPageDirty
      (bypasses the VFS dirty pages increment) when init, and aio fs uses
      *default_backing_dev_info* as the backing dev, which does not disable
      the dirty pages accounting capability.
      So truncating aio ring file will contribute to accounting dirty pages (VFS
      dirty pages decrement), then error occurs.
      
      The original goal is keeping these pages in memory (can not be reclaimed
      or swapped) in life-time via marking it dirty. But thinking more, we have
      already pinned pages via elevating the page's refcount, which can already
      achieve the goal, so the SetPageDirty seems unnecessary.
      
      In order to fix the issue, using the __set_page_dirty_no_writeback instead
      of the nop .set_page_dirty, and dropped the SetPageDirty (don't manually
      set the dirty flags, don't disable set_page_dirty(), rely on default behaviour).
      
      With the above change, the dirty pages accounting can work well. But as we
      known, aio fs is an anonymous one, which should never cause any real write-back,
      we can ignore the dirty pages (write back) accounting by disabling the dirty
      pages (write back) accounting capability. So we introduce an aio private
      backing dev info (disabled the ACCT_DIRTY/WRITEBACK/ACCT_WB capabilities) to
      replace the default one.
      Reported-by: NMarkus Königshaus <m.koenigshaus@wut.de>
      Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: stable <stable@vger.kernel.org>
      Acked-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org>
      835f252c
  2. 05 9月, 2014 1 次提交
  3. 03 9月, 2014 1 次提交
    • J
      aio: add missing smp_rmb() in read_events_ring · 2ff396be
      Jeff Moyer 提交于
      We ran into a case on ppc64 running mariadb where io_getevents would
      return zeroed out I/O events.  After adding instrumentation, it became
      clear that there was some missing synchronization between reading the
      tail pointer and the events themselves.  This small patch fixes the
      problem in testing.
      
      Thanks to Zach for helping to look into this, and suggesting the fix.
      Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org>
      Cc: stable@vger.kernel.org
      2ff396be
  4. 25 8月, 2014 10 次提交
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7be141d0
      Linus Torvalds 提交于
      Pull x86 fixes from Ingo Molnar:
       "A couple of EFI fixes, plus misc fixes all around the map"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi/arm64: Store Runtime Services revision
        firmware: Do not use WARN_ON(!spin_is_locked())
        x86_32, entry: Clean up sysenter_badsys declaration
        x86/doc: Fix the 'tlb_single_page_flush_ceiling' sysconfig path
        x86/mm: Fix sparse 'tlb_single_page_flush_ceiling' warning and make the variable read-mostly
        x86/mm: Fix RCU splat from new TLB tracepoints
      7be141d0
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 44744bb3
      Linus Torvalds 提交于
      Pull perf fixes from Ingo Molnar:
       "A kprobes and a perf compat ioctl fix"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf: Handle compat ioctl
        kprobes: Skip kretprobe hit in NMI context to avoid deadlock
      44744bb3
    • L
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 959dc258
      Linus Torvalds 提交于
      Pull ARM SoC fixes from Olof Johansson:
       "A collection of fixes from this week, it's been pretty quiet and
        nothing really stands out as particularly noteworthy here -- mostly
        minor fixes across the field:
      
         - ODROID booting was fixed due to PMIC interrupts missing in DT
         - a collection of i.MX fixes
         - minor Tegra fix for regulators
         - Rockchip fix and addition of SoC-specific mailing list to make it
           easier to find posted patches"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        bus: arm-ccn: Fix warning message
        ARM: shmobile: koelsch: Remove non-existent i2c6 pinmux
        ARM: tegra: apalis/colibri t30: fix on-module 5v0 supplies
        MAINTAINERS: add new Rockchip SoC list
        ARM: dts: rockchip: readd missing mmc0 pinctrl settings
        ARM: dts: ODROID i2c improvements
        ARM: dts: Enable PMIC interrupts on ODROID
        ARM: dts: imx6sx: fix the pad setting for uart CTS_B
        ARM: dts: i.MX53: fix apparent bug in VPU clks
        ARM: imx: correct gpu2d_axi and gpu3d_axi clock setting
        ARM: dts: imx6: edmqmx6: change enet reset pin
        ARM: dts: vf610-twr: Fix pinctrl_esdhc1 pin definitions.
        ARM: imx: remove unnecessary ARCH_HAS_OPP select
        ARM: imx: fix TLB missing of IOMUXC base address during suspend
        ARM: imx6: fix SMP compilation again
        ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodes
      959dc258
    • L
      Merge tag 'gpio-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · fa7f78e0
      Linus Torvalds 提交于
      Pull gpio fixes from Linus Walleij:
      
       - a largeish fix for the IRQ handling in the new Zynq driver.  The
         quite verbose commit message gives the exact details.
       - move some defines for gpiod flags outside an ifdef to make stub
         functions work again.
       - various minor fixes that we can accept for -rc1.
      
      * tag 'gpio-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio-lynxpoint: enable input sensing in resume
        gpio: move GPIOD flags outside #ifdef
        gpio: delete unneeded test before of_node_put
        gpio: zynq: Fix IRQ handlers
        gpiolib: devres: use correct structure type name in sizeof
        MAINTAINERS: Change maintainer for gpio-bcm-kona.c
      fa7f78e0
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 5e30ca1e
      Linus Torvalds 提交于
      Pull drm fixes from Dave Airlie:
       "Intel and radeon fixes.
      
        Post KS/LC git requests from i915 and radeon stacked up.  They are all
        fixes along with some new pci ids for radeon, and one maintainers file
        entry.
      
         - i915: display fixes and irq fixes
         - radeon: pci ids, and misc gpuvm, dpm and hdp cache"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (29 commits)
        MAINTAINERS: Add entry for Renesas DRM drivers
        drm/radeon: add additional SI pci ids
        drm/radeon: add new bonaire pci ids
        drm/radeon: add new KV pci id
        Revert "drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe"
        drm/radeon: fix active_cu mask on SI and CIK after re-init (v3)
        drm/radeon: fix active cu count for SI and CIK
        drm/radeon: re-enable selective GPUVM flushing
        drm/radeon: Sync ME and PFP after CP semaphore waits v4
        drm/radeon: fix display handling in radeon_gpu_reset
        drm/radeon: fix pm handling in radeon_gpu_reset
        drm/radeon: Only flush HDP cache for indirect buffers from userspace
        drm/radeon: properly document reloc priority mask
        drm/i915: don't try to retrain a DP link on an inactive CRTC
        drm/i915: make sure VDD is turned off during system suspend
        drm/i915: cancel hotplug and dig_port work during suspend and unload
        drm/i915: fix HPD IRQ reenable work cancelation
        drm/i915: take display port power domain in DP HPD handler
        drm/i915: Don't try to enable cursor from setplane when crtc is disabled
        drm/i915: Skip load detect when intel_crtc->new_enable==true
        ...
      5e30ca1e
    • B
      aio: fix reqs_available handling · d856f32a
      Benjamin LaHaise 提交于
      As reported by Dan Aloni, commit f8567a38 ("aio: fix aio request
      leak when events are reaped by userspace") introduces a regression when
      user code attempts to perform io_submit() with more events than are
      available in the ring buffer.  Reverting that commit would reintroduce a
      regression when user space event reaping is used.
      
      Fixing this bug is a bit more involved than the previous attempts to fix
      this regression.  Since we do not have a single point at which we can
      count events as being reaped by user space and io_getevents(), we have
      to track event completion by looking at the number of events left in the
      event ring.  So long as there are as many events in the ring buffer as
      there have been completion events generate, we cannot call
      put_reqs_available().  The code to check for this is now placed in
      refill_reqs_available().
      
      A test program from Dan and modified by me for verifying this bug is available
      at http://www.kvack.org/~bcrl/20140824-aio_bug.c .
      Reported-by: NDan Aloni <dan@kernelim.com>
      Signed-off-by: NBenjamin LaHaise <bcrl@kvack.org>
      Acked-by: NDan Aloni <dan@kernelim.com>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Mateusz Guzik <mguzik@redhat.com>
      Cc: Petr Matousek <pmatouse@redhat.com>
      Cc: stable@vger.kernel.org      # v3.16 and anything that f8567a38 was backported to
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d856f32a
    • P
      bus: arm-ccn: Fix warning message · bf87bb12
      Pawel Moll 提交于
      A message warning a user about wrong vc value was printing
      out port instead.
      Reported-by: NDrew Richardson <drew.richardson@arm.com>
      Signed-off-by: NPawel Moll <pawel.moll@arm.com>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      bf87bb12
    • G
      ARM: shmobile: koelsch: Remove non-existent i2c6 pinmux · 12266db7
      Geert Uytterhoeven 提交于
      On r8a7791, i2c6 (aka iic3) doesn't need pinmux, but the koelsch dts
      refers to non-existent pinmux configuration data:
      
      pinmux core: sh-pfc does not support function i2c6
      sh-pfc e6060000.pfc: invalid function i2c6 in map table
      
      Remove it to fix this.
      
      Fixes: commit 1d41f36a ("ARM: shmobile:
             koelsch dts: Add VDD MPU regulator for DVFS")
      Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: NSimon Horman <horms+renesas@verge.net.au>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      12266db7
    • M
      ARM: tegra: apalis/colibri t30: fix on-module 5v0 supplies · caa9eac5
      Marcel Ziswiler 提交于
      Working on Gigabit/PCIe support in U-Boot for Apalis T30 I realised
      that the current device tree source includes for our modules only
      happen to work due to referencing the on-carrier 5v0 supply from USB
      which is not at all available on-module. The modules actually contain
      TPS60150 charge pumps to generate the PMIC required 5 volts from the
      one and only 3.3 volt module supply. This patch fixes this.
      
      (Note: When back-porting this to v3.16 stable releases, simply drop the
      change to tegra30-apalis.dtsi; that file was added in v3.17)
      
      Cc: <stable@vger.kernel.org> #v3.16+
      Signed-off-by: NMarcel Ziswiler <marcel@ziswiler.com>
      Signed-off-by: NStephen Warren <swarren@nvidia.com>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      caa9eac5
    • O
      Merge tag 'v3.17-rockchip-fixes1' of... · 9d0b1f34
      Olof Johansson 提交于
      Merge tag 'v3.17-rockchip-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes
      
      Merge "ARM: rockchip: fix for 3.17" from Heiko Stubner:
      
      Pinctrl that got accidentially dropped when reorganizing the
      dts files and addition of the new Rockchip list to MAINTAINERS.
      
      * tag 'v3.17-rockchip-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
        MAINTAINERS: add new Rockchip SoC list
        ARM: dts: rockchip: readd missing mmc0 pinctrl settings
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      9d0b1f34
  5. 24 8月, 2014 2 次提交
  6. 23 8月, 2014 17 次提交
    • H
      MAINTAINERS: add new Rockchip SoC list · 00250b52
      Heiko Stuebner 提交于
      Add the new list that Rockchip-specific patches should also be directed to.
      Signed-off-by: NHeiko Stuebner <heiko@sntech.de>
      00250b52
    • H
      ARM: dts: rockchip: readd missing mmc0 pinctrl settings · 1302d32c
      Heiko Stuebner 提交于
      During the restructuring of the Rockchip Cortex-A9 dtsi files it seems
      like the pinctrl settings vanished at some point from the mmc0 support.
      
      This of course renders them unusable, so readd the necessary pinctrl
      properties.
      Signed-off-by: NHeiko Stuebner <heiko@sntech.de>
      1302d32c
    • O
      Merge tag 'sunxi-dt-for-3.17-2' of... · 2136edf3
      Olof Johansson 提交于
      Merge tag 'sunxi-dt-for-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes
      
      Merge "Allwinner DT changes, take 2" from Maxime Ripard:
      
      Only a single patch in here that fixes a DTC warning.
      
      * tag 'sunxi-dt-for-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
        ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodes
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      2136edf3
    • L
      Merge tag 'pwm/for-3.17-rc2' of... · 451fd722
      Linus Torvalds 提交于
      Merge tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm fix from Thierry Reding:
       "Just one bugfix for the PWM lookup table code that would cause a PWM
        channel to be set to the wrong period and polarity for non-perfect
        matches"
      
      * tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
        pwm: Fix period and polarity in pwm_get() for non-perfect matches
      451fd722
    • M
      mac80211: fix channel switch for chanctx-based drivers · 47e4df94
      Michal Kazior 提交于
      The new_ctx pointer is set only for non-chanctx drivers.  This yielded a
      crash for chanctx-based drivers during channel switch finalization:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
        IP: ieee80211_vif_use_reserved_switch+0x71c/0xb00 [mac80211]
      
      Use an adequate chanctx pointer to fix this.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NMichal Kazior <michal.kazior@tieto.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      47e4df94
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 433ab34d
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
       "Here are some bug fixes that have piled up during ksummit/linuxcon.
      
         1) Fix endian problems in ibmveth, from Anton Blanchard.
      
         2) IPV6 routing code does GFP_KERNEL allocation in atomic, fix from
            Benjamin Block.
      
         3) SCTP association fixes from Daniel Borkmann.
      
         4) When multiple VLAN headers are present we have to make sure the
            second and subsequent ones are pullable in the SKB otherwise we
            blindly dereference garbage.  From Jiri Benc.
      
         5) The argument adjustment of the signature of hlist_add_after*()
            introduced a regression in the batman-adv code, fix from Sven
            Eckelmann.
      
         6) Fix TX hang handling to avoid a panic in i40e, from Anjali Singhai
            Jain.
      
         7) PTP flag test is inverted in i40e driver, from Jesse Brandeburg.
      
         8) ATM LEC driver needs to hold RTNL mutex over MTU changes, from
            Chas Williams.
      
         9) Truncate packets larger then the TPACKET_V3 format configured
            buffers, otherwise we overwrite past the end of said buffers.
            From Eric Dumazet.
      
        10) Fix endianness bugs in qlcnic firmware handling, from Rajesh
            Borundia and Shahed Shaikh.
      
        11) CXGB4 sometimes doesn't get all of the TX completion events it
            should resulting in SKBs getting stuck in the TX queue, from
            Hariprasad Shenai.
      
        12) When the FEC chip's PTP clock is disabled, you can't access the
            register.  Add necessary checks to avoid the resulting hang, from
            Fugang Duan"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits)
        drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef
        net: sctp: fix suboptimal edge-case on non-active active/retrans path selection
        net: sctp: spare unnecessary comparison in sctp_trans_elect_best
        net: ethernet: broadcom: bnx2x: Remove redundant #ifdef
        ibmveth: Fix endian issues with rx_no_buffer statistic
        net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings()
        openvswitch: fix panic with multiple vlan headers
        net: ipv6: fib: don't sleep inside atomic lock
        net: fec: ptp: avoid register access when ipg clock is disabled
        cxgb4: Free completed tx skbs promptly
        cxgb4: Fix race condition in cleanup
        sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe
        bnx2x: Revert UNDI flushing mechanism
        qlcnic: Fix endianess issue in firmware load from file operation
        qlcnic: Fix endianess issue in FW dump template header
        qlcnic: Fix flash access interface to application
        MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver
        macvlan: Allow setting multicast filter on all macvlan types
        packet: handle too big packets for PACKET_V3
        MAINTAINERS: add entry for ec_bhf driver
        ...
      433ab34d
    • R
      drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef · faaa5524
      Rasmus Villemoes 提交于
      Test for definedness of the macro which is actually defined (the
      change is hard to see: it is s/SSS/SSA/).
      Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      faaa5524
    • D
      net: sctp: fix suboptimal edge-case on non-active active/retrans path selection · aa4a83ee
      Daniel Borkmann 提交于
      In SCTP, selection of active (T.ACT) and retransmission (T.RET)
      transports is being done whenever transport control operations
      (UP, DOWN, PF, ...) are engaged through sctp_assoc_control_transport().
      
      Commits 4c47af4d ("net: sctp: rework multihoming retransmission
      path selection to rfc4960") and a7288c4d ("net: sctp: improve
      sctp_select_active_and_retran_path selection") have both improved
      it towards a more fine-grained and optimal path selection.
      
      Currently, the selection algorithm for T.ACT and T.RET is as follows:
      
      1) Elect the two most recently used ACTIVE transports T1, T2 for
         T.ACT, T.RET, where T.ACT<-T1 and T1 is most recently used
      2) In case primary path T.PRI not in {T1, T2} but ACTIVE, set
         T.ACT<-T.PRI and T.RET<-T1
      3) If only T1 is ACTIVE from the set, set T.ACT<-T1 and T.RET<-T1
      4) If none is ACTIVE, set T.ACT<-best(T.PRI, T.RET, T3) where
         T3 is the most recently used (if avail) in PF, set T.RET<-T.PRI
      
      Prior to above commits, 4) was simply a camp on T.ACT<-T.PRI and
      T.RET<-T.PRI, ignoring possible paths in PF. Camping on T.PRI is
      still slightly suboptimal as it can lead to the following scenario:
      
      Setup:
              <A>                                <B>
          T1: p1p1 (10.0.10.10) <==>  .'`)  <==> p1p1 (10.0.10.12)  <= T.PRI
          T2: p1p2 (10.0.10.20) <==> (_ . ) <==> p1p2 (10.0.10.22)
      
          net.sctp.rto_min = 1000
          net.sctp.path_max_retrans = 2
          net.sctp.pf_retrans = 0
          net.sctp.hb_interval = 1000
      
      T.PRI is permanently down, T2 is put briefly into PF state (e.g. due to
      link flapping). Here, the first time transmission is sent over PF path
      T2 as it's the only non-INACTIVE path, but the retransmitted data-chunks
      are sent over the INACTIVE path T1 (T.PRI), which is not good.
      
      After the patch, it's choosing better transports in both cases by
      modifying step 4):
      
      4) If none is ACTIVE, set T.ACT_new<-best(T.ACT_old, T3) where T3 is
         the most recently used (if avail) in PF, set T.RET<-T.ACT_new
      
      This will still select a best possible path in PF if available (which
      can also include T.PRI/T.RET), and set both T.ACT/T.RET to it.
      
      In case sctp_assoc_control_transport() *just* put T.ACT_old into INACTIVE
      as it transitioned from ACTIVE->PF->INACTIVE and stays in INACTIVE just
      for a very short while before going back ACTIVE, it will guarantee that
      this path will be reselected for T.ACT/T.RET since T3 (PF) is not
      available.
      
      Previously, this was not possible, as we would only select between T.PRI
      and T.RET, and a possible T3 would be NULL due to the fact that we have
      just transitioned T3 in sctp_assoc_control_transport() from PF->INACTIVE
      and would select a suboptimal path when T.PRI/T.RET have worse properties.
      
      In the case that T.ACT_old permanently went to INACTIVE during this
      transition and there's no PF path available, plus T.PRI and T.RET are
      INACTIVE as well, we would now camp on T.ACT_old, but if everything is
      being INACTIVE there's really not much we can do except hoping for a
      successful HB to bring one of the transports back up again and, thus
      cause a new selection through sctp_assoc_control_transport().
      
      Now both tests work fine:
      
      Case 1:
      
       1. T1 S(ACTIVE) T.ACT
          T2 S(ACTIVE) T.RET
      
       2. T1 S(ACTIVE) T.ACT, T.RET
          T2 S(PF)
      
       3. T1 S(ACTIVE) T.ACT, T.RET
          T2 S(INACTIVE)
      
       5. T1 S(PF) T.ACT, T.RET
          T2 S(INACTIVE)
      
      [ 5.1 T1 S(INACTIVE) T.ACT, T.RET
            T2 S(INACTIVE) ]
      
       6. T1 S(ACTIVE) T.ACT, T.RET
          T2 S(INACTIVE)
      
       7. T1 S(ACTIVE) T.ACT
          T2 S(ACTIVE) T.RET
      
      Case 2:
      
       1. T1 S(ACTIVE) T.ACT
          T2 S(ACTIVE) T.RET
      
       2. T1 S(PF)
          T2 S(ACTIVE) T.ACT, T.RET
      
       3. T1 S(INACTIVE)
          T2 S(ACTIVE) T.ACT, T.RET
      
       5. T1 S(INACTIVE)
          T2 S(PF) T.ACT, T.RET
      
      [ 5.1 T1 S(INACTIVE)
            T2 S(INACTIVE) T.ACT, T.RET ]
      
       6. T1 S(INACTIVE)
          T2 S(ACTIVE) T.ACT, T.RET
      
       7. T1 S(ACTIVE) T.ACT
          T2 S(ACTIVE) T.RET
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      aa4a83ee
    • D
      net: sctp: spare unnecessary comparison in sctp_trans_elect_best · ea4f19c1
      Daniel Borkmann 提交于
      When both transports are the same, we don't have to go down that
      road only to realize that we will return the very same transport.
      We are guaranteed that curr is always non-NULL. Therefore, just
      short-circuit this special case.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ea4f19c1
    • R
      net: ethernet: broadcom: bnx2x: Remove redundant #ifdef · 7d149c52
      Rasmus Villemoes 提交于
      Nothing defines _ASM_GENERIC_INT_L64_H, it is a weird way to check for
      64 bit longs, and u64 should be printed using %llx anyway.
      Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d149c52
    • A
      ibmveth: Fix endian issues with rx_no_buffer statistic · cbd52281
      Anton Blanchard 提交于
      Hidden away in the last 8 bytes of the buffer_list page is a solitary
      statistic. It needs to be byte swapped or else ethtool -S will
      produce numbers that terrify the user.
      
      Since we do this in multiple places, create a helper function with a
      comment explaining what is going on.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cbd52281
    • I
      net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings() · c10e4caf
      Iyappan Subramanian 提交于
      A NULL pointer dereference is possible for the argument ring->buf_pool
      which is passed to xgene_enet_free_desc_ring(), as ring could be NULL.
      
      And now since NULL pointers are being checked for before the calls to
      xgene_enet_free_desc_ring(), might as well take advantage of them and
      not call the function if the argument would be NULL.
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NIyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c10e4caf
    • J
      openvswitch: fix panic with multiple vlan headers · 2ba5af42
      Jiri Benc 提交于
      When there are multiple vlan headers present in a received frame, the first
      one is put into vlan_tci and protocol is set to ETH_P_8021Q. Anything in the
      skb beyond the VLAN TPID may be still non-linear, including the inner TCI
      and ethertype. While ovs_flow_extract takes care of IP and IPv6 headers, it
      does nothing with ETH_P_8021Q. Later, if OVS_ACTION_ATTR_POP_VLAN is
      executed, __pop_vlan_tci pulls the next vlan header into vlan_tci.
      
      This leads to two things:
      
      1. Part of the resulting ethernet header is in the non-linear part of the
         skb. When eth_type_trans is called later as the result of
         OVS_ACTION_ATTR_OUTPUT, kernel BUGs in __skb_pull. Also, __pop_vlan_tci
         is in fact accessing random data when it reads past the TPID.
      
      2. network_header points into the ethernet header instead of behind it.
         mac_len is set to a wrong value (10), too.
      Reported-by: NYulong Pei <ypei@redhat.com>
      Signed-off-by: NJiri Benc <jbenc@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ba5af42
    • B
      net: ipv6: fib: don't sleep inside atomic lock · 793c3b40
      Benjamin Block 提交于
      The function fib6_commit_metrics() allocates a piece of memory in mode
      GFP_KERNEL while holding an atomic lock from higher up in the stack, in
      the function __ip6_ins_rt(). This produces the following BUG:
      
      > BUG: sleeping function called from invalid context at mm/slub.c:1250
      > in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: dhcpcd
      > 2 locks held by dhcpcd/2909:
      >  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81978e67>] rtnl_lock+0x17/0x20
      >  #1:  (&tb->tb6_lock){++--+.}, at: [<ffffffff81a6951a>] ip6_route_add+0x65a/0x800
      > CPU: 1 PID: 2909 Comm: dhcpcd Not tainted 3.17.0-rc1 #1
      > Hardware name: ASUS All Series/Q87T, BIOS 0216 10/16/2013
      >  0000000000000008 ffff8800c8f13858 ffffffff81af135a 0000000000000000
      >  ffff880212202430 ffff8800c8f13878 ffffffff810f8d3a ffff880212202c98
      >  0000000000000010 ffff8800c8f138c8 ffffffff8121ad0e 0000000000000001
      > Call Trace:
      >  [<ffffffff81af135a>] dump_stack+0x4e/0x68
      >  [<ffffffff810f8d3a>] __might_sleep+0x10a/0x120
      >  [<ffffffff8121ad0e>] kmem_cache_alloc_trace+0x4e/0x190
      >  [<ffffffff81a6bcd6>] ? fib6_commit_metrics+0x66/0x110
      >  [<ffffffff81a6bcd6>] fib6_commit_metrics+0x66/0x110
      >  [<ffffffff81a6cbf3>] fib6_add+0x883/0xa80
      >  [<ffffffff81a6951a>] ? ip6_route_add+0x65a/0x800
      >  [<ffffffff81a69535>] ip6_route_add+0x675/0x800
      >  [<ffffffff81a68f2a>] ? ip6_route_add+0x6a/0x800
      >  [<ffffffff81a6990c>] inet6_rtm_newroute+0x5c/0x80
      >  [<ffffffff8197cf01>] rtnetlink_rcv_msg+0x211/0x260
      >  [<ffffffff81978e67>] ? rtnl_lock+0x17/0x20
      >  [<ffffffff81119708>] ? lock_release_holdtime+0x28/0x180
      >  [<ffffffff81978e67>] ? rtnl_lock+0x17/0x20
      >  [<ffffffff8197ccf0>] ? __rtnl_unlock+0x20/0x20
      >  [<ffffffff819a989e>] netlink_rcv_skb+0x6e/0xd0
      >  [<ffffffff81978ee5>] rtnetlink_rcv+0x25/0x40
      >  [<ffffffff819a8e59>] netlink_unicast+0xd9/0x180
      >  [<ffffffff819a9600>] netlink_sendmsg+0x700/0x770
      >  [<ffffffff81103735>] ? local_clock+0x25/0x30
      >  [<ffffffff8194e83c>] sock_sendmsg+0x6c/0x90
      >  [<ffffffff811f98e3>] ? might_fault+0xa3/0xb0
      >  [<ffffffff8195ca6d>] ? verify_iovec+0x7d/0xf0
      >  [<ffffffff8194ec3e>] ___sys_sendmsg+0x37e/0x3b0
      >  [<ffffffff8111ef15>] ? trace_hardirqs_on_caller+0x185/0x220
      >  [<ffffffff81af979e>] ? mutex_unlock+0xe/0x10
      >  [<ffffffff819a55ec>] ? netlink_insert+0xbc/0xe0
      >  [<ffffffff819a65e5>] ? netlink_autobind.isra.30+0x125/0x150
      >  [<ffffffff819a6520>] ? netlink_autobind.isra.30+0x60/0x150
      >  [<ffffffff819a84f9>] ? netlink_bind+0x159/0x230
      >  [<ffffffff811f989a>] ? might_fault+0x5a/0xb0
      >  [<ffffffff8194f25e>] ? SYSC_bind+0x7e/0xd0
      >  [<ffffffff8194f8cd>] __sys_sendmsg+0x4d/0x80
      >  [<ffffffff8194f912>] SyS_sendmsg+0x12/0x20
      >  [<ffffffff81afc692>] system_call_fastpath+0x16/0x1b
      
      Fixing this by replacing the mode GFP_KERNEL with GFP_ATOMIC.
      Signed-off-by: NBenjamin Block <bebl@mageta.org>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      793c3b40
    • N
      net: fec: ptp: avoid register access when ipg clock is disabled · 91c0d987
      Nimrod Andy 提交于
      The current kernel hang on i.MX6SX with rootfs mount from MMC.
      The root cause is that ptp uses a periodic timer to access enet register
      even if ipg clock is disabled.
      
      FEC ptp driver start one period timer to read 1588 counter register in the
      ptp init function that is called after FEC driver is probed.
      
      To save power, after FEC probe finish, FEC driver disable all clocks including
      ipg clock that is needed for register access.
      
      i.MX5x, i.MX6q/dl/sl FEC register access don't cause system hang when ipg clock
      is disabled, just return zero value. But for i.MX6sx SOC, it cause system hang.
      
      To avoid the issue, we need to check ptp clock status before ptp timer count access.
      Signed-off-by: NFugang Duan <B38611@freescale.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      91c0d987
    • L
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 26d189b8
      Linus Torvalds 提交于
      Pull arm64 fixes from Will Deacon:
       "This small set of fixes addresses a few issues introduced during the
        merge window, including:
      
         - fix typo in I-cache detection that was causing us to treat all
           I-caches as aliasing
         - hook up memfd_create and getrandom syscalls for native and compat
         - revert a temporary hack for defconfig builds in -next (the audit
           tree changes didn't make it in this merge window)
         - a couple of UEFI fixes for TEXT_OFFSET fuzzing and /memreserve/
         - a simple sparsemem fix for 48-bit physical addressing
         - small defconfig updates to get autotesters working with X-gene"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        Revert "arm64: Do not invoke audit_syscall_* functions if !CONFIG_AUDIT_SYSCALL"
        arm64: mm: update max pa bits to 48
        arm64: ignore DT memreserve entries when booting in UEFI mode
        arm64: configs: Enable X-Gene SATA and ethernet in defconfig
        arm64: align randomized TEXT_OFFSET on 4 kB boundary
        asm-generic: add memfd_create system call to unistd.h
        arm64: compat: wire up memfd_create and getrandom syscalls for aarch32
        arm64: fix typo in I-cache policy detection
      26d189b8
    • L
      Merge tag 'iommu-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 1ae45cf0
      Linus Torvalds 提交于
      Pull IOMMU fixes from Joerg Roedel:
       "The fixes include:
      
         - fix a crash in the VT-d driver when devices with a driver attached
           are hot-unplugged
      
         - fix a AMD IOMMU driver crash with device assignment of 32 bit PCI
           devices to KVM guests
      
         - fix for a copy&paste error in generic IOMMU code.  Now the right
           function pointer is checked before calling"
      
      * tag 'iommu-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/core: Check for the right function pointer in iommu_map()
        iommu/amd: Fix cleanup_domain for mass device removal
        iommu/vt-d: Defer domain removal if device is assigned to a driver
      1ae45cf0
  7. 22 8月, 2014 8 次提交