1. 15 7月, 2015 16 次提交
    • Y
      IB/ipoib: Scatter-Gather support in connected mode · c4268778
      Yuval Shaia 提交于
      By default, IPoIB-CM driver uses 64k MTU. Larger MTU gives better
      performance.
      This MTU plus overhead puts the memory allocation for IP based packets at
      32 4k pages (order 5), which have to be contiguous.
      When the system memory under pressure, it was observed that allocating 128k
      contiguous physical memory is difficult and causes serious errors (such as
      system becomes unusable).
      
      This enhancement resolve the issue by removing the physically contiguous
      memory requirement using Scatter/Gather feature that exists in Linux stack.
      
      With this fix Scatter-Gather will be supported also in connected mode.
      
      This change reverts some of the change made in commit e112373f
      ("IPoIB/cm: Reduce connected mode TX object size").
      
      The ability to use SG in IPoIB CM is possible because the coupling
      between NETIF_F_SG and NETIF_F_CSUM was removed in commit
      ec5f0615 ("net: Kill link between CSUM and SG features.")
      Signed-off-by: NYuval Shaia <yuval.shaia@oracle.com>
      Acked-by: NChristian Marie <christian@ponies.io>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      c4268778
    • C
      IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES · 59d40dd9
      Carol L Soto 提交于
      ib_ucm_release_dev clears the wrong bit if devnum is greater
      than IB_UCM_MAX_DEVICES.
      Signed-off-by: NCarol L Soto <clsoto@linux.vnet.ibm.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      59d40dd9
    • H
      IB/ipoib: Prevent lockdep warning in __ipoib_ib_dev_flush · 8b7cce0d
      Haggai Eran 提交于
      __ipoib_ib_dev_flush calls itself recursively on child devices, and lockdep
      complains about locking vlan_rwsem twice (see below). Use down_read_nested
      instead of down_read to prevent the warning.
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.1.0-rc4+ #36 Tainted: G           O
       ---------------------------------------------
       kworker/u20:2/261 is trying to acquire lock:
        (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
      
       but task is already holding lock:
        (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&priv->vlan_rwsem);
         lock(&priv->vlan_rwsem);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       3 locks held by kworker/u20:2/261:
        #0:  ("%s""ipoib_flush"){.+.+..}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760
        #1:  ((&priv->flush_heavy)){+.+...}, at: [<ffffffff810827cc>] process_one_work+0x15c/0x760
        #2:  (&priv->vlan_rwsem){.+.+..}, at: [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
      
       stack backtrace:
       CPU: 3 PID: 261 Comm: kworker/u20:2 Tainted: G           O    4.1.0-rc4+ #36
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
       Workqueue: ipoib_flush ipoib_ib_dev_flush_heavy [ib_ipoib]
        ffff8801c6c54790 ffff8801c9927af8 ffffffff81665238 0000000000000001
        ffffffff825b5b30 ffff8801c9927bd8 ffffffff810bba51 ffff880100000000
        ffffffff00000001 ffff880100000001 ffff8801c6c55428 ffff8801c6c54790
       Call Trace:
        [<ffffffff81665238>] dump_stack+0x4f/0x6f
        [<ffffffff810bba51>] __lock_acquire+0x741/0x1820
        [<ffffffff810bcbf8>] lock_acquire+0xc8/0x240
        [<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
        [<ffffffff81669d2c>] down_read+0x4c/0x70
        [<ffffffffa0791e2a>] ? __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
        [<ffffffffa0791e2a>] __ipoib_ib_dev_flush+0x3a/0x2b0 [ib_ipoib]
        [<ffffffffa0791e4a>] __ipoib_ib_dev_flush+0x5a/0x2b0 [ib_ipoib]
        [<ffffffffa07920ba>] ipoib_ib_dev_flush_heavy+0x1a/0x20 [ib_ipoib]
        [<ffffffff81082871>] process_one_work+0x201/0x760
        [<ffffffff810827cc>] ? process_one_work+0x15c/0x760
        [<ffffffff81082ef0>] worker_thread+0x120/0x4d0
        [<ffffffff81082dd0>] ? process_one_work+0x760/0x760
        [<ffffffff81082dd0>] ? process_one_work+0x760/0x760
        [<ffffffff81088b7e>] kthread+0xfe/0x120
        [<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70
        [<ffffffff8166c6e2>] ret_from_fork+0x42/0x70
        [<ffffffff81088a80>] ? __init_kthread_worker+0x70/0x70
      Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      8b7cce0d
    • H
      IB/ucma: Fix lockdep warning in ucma_lock_files · 31b57b87
      Haggai Eran 提交于
      The ucma_lock_files() locks the mut mutex on two files, e.g. for migrating
      an ID. Use mutex_lock_nested() to prevent the warning below.
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.1.0-rc6-hmm+ #40 Tainted: G           O
       ---------------------------------------------
       pingpong_rpc_se/10260 is trying to acquire lock:
        (&file->mut){+.+.+.}, at: [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
      
       but task is already holding lock:
        (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&file->mut);
         lock(&file->mut);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by pingpong_rpc_se/10260:
        #0:  (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
      
       stack backtrace:
       CPU: 0 PID: 10260 Comm: pingpong_rpc_se Tainted: G           O    4.1.0-rc6-hmm+ #40
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
        ffff8801f85b63d0 ffff880195677b58 ffffffff81668f49 0000000000000001
        ffffffff825cbbe0 ffff880195677c38 ffffffff810bb991 ffff880100000000
        ffff880100000000 ffff880100000001 ffff8801f85b7010 ffffffff8121bee9
       Call Trace:
        [<ffffffff81668f49>] dump_stack+0x4f/0x6e
        [<ffffffff810bb991>] __lock_acquire+0x741/0x1820
        [<ffffffff8121bee9>] ? dput+0x29/0x320
        [<ffffffff810bcb38>] lock_acquire+0xc8/0x240
        [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffff8166b901>] ? mutex_lock_nested+0x291/0x3e0
        [<ffffffff8166b6d5>] mutex_lock_nested+0x65/0x3e0
        [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffff810baeed>] ? trace_hardirqs_on+0xd/0x10
        [<ffffffff8166b66e>] ? mutex_unlock+0xe/0x10
        [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffffa0478474>] ucma_write+0xa4/0xb0 [rdma_ucm]
        [<ffffffff81200674>] __vfs_write+0x34/0x100
        [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
        [<ffffffff810ec055>] ? current_kernel_time+0xc5/0xe0
        [<ffffffff812aa4d3>] ? security_file_permission+0x23/0x90
        [<ffffffff8120088d>] ? rw_verify_area+0x5d/0xe0
        [<ffffffff812009bb>] vfs_write+0xab/0x120
        [<ffffffff81201519>] SyS_write+0x59/0xd0
        [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
        [<ffffffff8166ffee>] system_call_fastpath+0x12/0x76
      Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      31b57b87
    • W
      rds: rds_ib_device.refcount overflow · 4fabb594
      Wengang Wang 提交于
      Fixes: 3e0249f9 ("RDS/IB: add refcount tracking to struct rds_ib_device")
      
      There lacks a dropping on rds_ib_device.refcount in case rds_ib_alloc_fmr
      failed(mr pool running out). this lead to the refcount overflow.
      
      A complain in line 117(see following) is seen. From vmcore:
      s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is -2147475448.
      That is the evidence the mr pool is used up. so rds_ib_alloc_fmr is very likely
      to return ERR_PTR(-EAGAIN).
      
      115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
      116 {
      117         BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
      118         if (atomic_dec_and_test(&rds_ibdev->refcount))
      119                 queue_work(rds_wq, &rds_ibdev->free_work);
      120 }
      
      fix is to drop refcount when rds_ib_alloc_fmr failed.
      Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: NHaggai Eran <haggaie@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      4fabb594
    • T
      RDMA/nes: Fix for incorrect recording of the MAC address · 0a691272
      Tatyana Nikolova 提交于
      Fix for incorrect recording of the MAC address
      Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      0a691272
    • T
      RDMA/nes: Fix for resolving the neigh · 165d6822
      Tatyana Nikolova 提交于
      Neighbor resolution doesn't work without this fix
      Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      165d6822
    • T
      RDMA/core: Fixes for port mapper client registration · a7f2f24c
      Tatyana Nikolova 提交于
      Fixes to allow clients to make remove mapping requests, after
      they have provided the user space service with the mapping
      information, they are using when the service is restarted.
      
      1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF
         registration types for the port mapper clients and functions
         to set/check the registration type.
      2) If the port mapper user space service is not available to register
         the client, then its registration stays IWPM_REG_UNDEF and the
         registration isn't checked until the service becomes available
         (no mappings are possible, if the user space service isn't running).
      3) After the service is restarted, the user space port mapper pid is set
         to valid and the client registration is set to IWPM_REG_INCOMPL
         to allow the client to make remove mapping requests.
      Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
      Tested-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      a7f2f24c
    • A
      IB/IPoIB: Fix bad error flow in ipoib_add_port() · 58e9cc90
      Amir Vadai 提交于
      Error values of ib_query_port() and ib_query_device() weren't propagated
      correctly. Because of that, ipoib_add_port() could return NULL value,
      which escaped the IS_ERR() check in ipoib_add_one() and we crashed.
      Signed-off-by: NAmir Vadai <amirv@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      58e9cc90
    • M
      IB/mlx4: Do not attemp to report HCA clock offset on VFs · 8a7ff14d
      Matan Barak 提交于
      mlx4 VFs can provide CQE raw time-stamping services, but they
      don't have the hca core clock mapped to their PCI bars.
      
      As such, we should not attempt to query and report the clock offset
      to user space for VFs. Doing so causes query_device over VFs to fail
      with -ENOSUPP.
      
      Fixes: 4b664c43 ('IB/mlx4: Add support for CQ time-stamping')
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      8a7ff14d
    • E
      IB/cm: Do not queue work to a device that's going away · be4b4993
      Erez Shitrit 提交于
      Whenever ib_cm gets remove_one call, like when there is a hot-unplug
      event, the driver should mark itself as going_down and confirm that no
      new works are going to be queued for that device.
      so, the order of the actions are:
      1. mark the going_down bit.
      2. flush the wq.
      3. [make sure no new works for that device.]
      4. unregister mad agent.
      
      otherwise, works that are already queued can be scheduled after the mad
      agent was freed.
      Signed-off-by: NErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      be4b4993
    • S
      IB/srp: Avoid using uninitialized variable · 3fdf70ac
      Sagi Grimberg 提交于
      We might return res which is not initialized. Also
      reduce code duplication by exporting srp_parse_tmo so
      srp_tmo_set can reuse it.
      
      Detected by Coverity.
      Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: NJenny Falkovich <jennyf@mellanox.com>
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      3fdf70ac
    • V
      IB/srpt: Convert use of __constant_cpu_to_beXX to cpu_to_beXX · b356c1c1
      Vaishali Thakkar 提交于
      In little endian cases, the macro cpu_to_be{16,32,64} unfolds to
      __swab{16,32,64} which provides special case for constants. In
      big endian cases, __constant_cpu_to_be{16,32,64} and
      cpu_to_be{16,32,64} expand directly to the same expression. So,
      replace __constant_cpu_to_be{16,32,64} with cpu_to_be{16,32,64}
      with the goal of getting rid of the definitions of
      __constant_cpu_to_be{16,32,64} completely.
      
      The Coccinelle semantic patch that performs this transformation
      is as follows:
      
      @@expression x;@@
      
      (
      - __constant_cpu_to_be16(x)
      + cpu_to_be16(x)
      |
      - __constant_cpu_to_be32(x)
      + cpu_to_be32(x)
      |
      - __constant_cpu_to_be64(x)
      + cpu_to_be64(x)
      )
      Signed-off-by: NVaishali Thakkar <vthakkar1994@gmail.com>
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      b356c1c1
    • I
      IB/mad: Remove improper use of BUG_ON · 3b8ab700
      Ira Weiny 提交于
      We recently added BUG_ON's which were inappropriate for a condition which
      should never happen. Change these to be WARN_ON_ONCE as a debugging aid.
      
      Fixes: 4cd7c947 ('IB/mad: Add support for additional MAD info to/from drivers')
      Signed-off-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      3b8ab700
    • I
      IB/mad: Fix compare between big endian and cpu endian · cd4cd565
      Ira Weiny 提交于
      The define OPA_LID_PERMISSIVE is big endian and was compared to the
      cpu endian variable opa_drslid.
      
      Problem caught by 0-day build infrastructure.
      
      Fixes: 8e4349d1 (IB/mad: Add final OPA MAD processing)
      Signed-off-by: NIra Weiny <ira.weiny@intel.com>
      Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: NJohn, Jubin <jubin.john@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      cd4cd565
    • H
      IB: Add rdma_cap_ib_switch helper and use where appropriate · 4139032b
      Hal Rosenstock 提交于
      Persuant to Liran's comments on node_type on linux-rdma
      mailing list:
      
      In an effort to reform the RDMA core and ULPs to minimize use of
      node_type in struct ib_device, an additional bit is added to
      struct ib_device for is_switch (IB switch). This is needed
      to be initialized by any IB switch device driver. This is a
      NEW requirement on such device drivers which are all
      "out of tree".
      
      In addition, an ib_switch helper was added to ib_verbs.h
      based on the is_switch device bit rather than node_type
      (although those should be consistent).
      
      The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs)
      as well as (IPoIB and SRP) ULPs are updated where
      appropriate to use this new helper. In some cases,
      the helper is now used under the covers of using
      rdma_[start end]_port rather than the open coding
      previously used.
      Reviewed-by: NSean Hefty <sean.hefty@intel.com>
      Reviewed-By: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Reviewed-by: NIra Weiny <ira.weiny@intel.com>
      Tested-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NHal Rosenstock <hal@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      4139032b
  2. 13 7月, 2015 7 次提交
    • L
      Linux 4.2-rc2 · bc0195aa
      Linus Torvalds 提交于
      bc0195aa
    • L
      Revert "drm/i915: Use crtc_state->active in primary check_plane func" · 01e2d062
      Linus Torvalds 提交于
      This reverts commit dec4f799.
      
      Jörg Otte reports a NULL pointder dereference due to this commit, as
      'crtc_state' very much can be NULL:
      
              crtc_state = state->base.state ?
                      intel_atomic_get_crtc_state(state->base.state, intel_crtc) : NULL;
      
      So the change to test 'crtc_state->base.active' cannot possibly be
      correct as-is.
      
      There may be some other minimal fix (like just checking crtc_state for
      NULL), but I'm just reverting it now for the rc2 release, and people
      like Daniel Vetter who actually know this code will figure out what the
      right solution is in the longer term.
      Reported-and-bisected-by: NJörg Otte <jrg.otte@gmail.com>
      Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@intel.com>
      CC: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      01e2d062
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · c83727a6
      Linus Torvalds 提交于
      Pull VFS fixes from Al Viro:
       "Fixes for this cycle regression in overlayfs and a couple of
        long-standing (== all the way back to 2.6.12, at least) bugs"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        freeing unlinked file indefinitely delayed
        fix a braino in ovl_d_select_inode()
        9p: don't leave a half-initialized inode sitting around
      c83727a6
    • L
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 7fbb58a0
      Linus Torvalds 提交于
      Pull MIPS fixes from Ralf Baechle:
       "A fair number of 4.2 fixes also because Markos opened the flood gates.
      
         - Patch up the math used calculate the location for the page bitmap.
      
         - The FDC (Not what you think, FDC stands for Fast Debug Channel) IRQ
           around was causing issues on non-Malta platforms, so move the code
           to a Malta specific location.
      
         - A spelling fix replicated through several files.
      
         - Fix to the emulation of an R2 instruction for R6 cores.
      
         - Fix the JR emulation for R6.
      
         - Further patching of mindless 64 bit issues.
      
         - Ensure the kernel won't crash on CPUs with L2 caches with >= 8
           ways.
      
         - Use compat_sys_getsockopt for O32 ABI on 64 bit kernels.
      
         - Fix cache flushing for multithreaded cores.
      
         - A build fix"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: O32: Use compat_sys_getsockopt.
        MIPS: c-r4k: Extend way_string array
        MIPS: Pistachio: Support CDMM & Fast Debug Channel
        MIPS: Malta: Make GIC FDC IRQ workaround Malta specific
        MIPS: c-r4k: Fix cache flushing for MT cores
        Revert "MIPS: Kconfig: Disable SMP/CPS for 64-bit"
        MIPS: cps-vec: Use macros for various arithmetics and memory operations
        MIPS: kernel: cps-vec: Replace KSEG0 with CKSEG0
        MIPS: kernel: cps-vec: Use ta0-ta3 pseudo-registers for 64-bit
        MIPS: kernel: cps-vec: Replace mips32r2 ISA level with mips64r2
        MIPS: kernel: cps-vec: Replace 'la' macro with PTR_LA
        MIPS: kernel: smp-cps: Fix 64-bit compatibility errors due to pointer casting
        MIPS: Fix erroneous JR emulation for MIPS R6
        MIPS: Fix branch emulation for BLTC and BGEC instructions
        MIPS: kernel: traps: Fix broken indentation
        MIPS: bootmem: Don't use memory holes for page bitmap
        MIPS: O32: Do not handle require 32 bytes from the stack to be readable.
        MIPS, CPUFREQ: Fix spelling of Institute.
        MIPS: Lemote 2F: Fix build caused by recent mass rename.
      7fbb58a0
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1daa1cfb
      Linus Torvalds 提交于
      Pull x86 fixes from Thomas Gleixner:
      
       - the high latency PIT detection fix, which slipped through the cracks
         for rc1
      
       - a regression fix for the early printk mechanism
      
       - the x86 part to plug irq/vector related hotplug races
      
       - move the allocation of the espfix pages on cpu hotplug to non atomic
         context.  The current code triggers a might_sleep() warning.
      
       - a series of KASAN fixes addressing boot crashes and usability
      
       - a trivial typo fix for Kconfig help text
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/kconfig: Fix typo in the CONFIG_CMDLINE_BOOL help text
        x86/irq: Retrieve irq data after locking irq_desc
        x86/irq: Use proper locking in check_irq_vectors_for_cpu_disable()
        x86/irq: Plug irq vector hotplug race
        x86/earlyprintk: Allow early_printk() to use console style parameters like '115200n8'
        x86/espfix: Init espfix on the boot CPU side
        x86/espfix: Add 'cpu' parameter to init_espfix_ap()
        x86/kasan: Move KASAN_SHADOW_OFFSET to the arch Kconfig
        x86/kasan: Add message about KASAN being initialized
        x86/kasan: Fix boot crash on AMD processors
        x86/kasan: Flush TLBs after switching CR3
        x86/kasan: Fix KASAN shadow region page tables
        x86/init: Clear 'init_level4_pgt' earlier
        x86/tsc: Let high latency PIT fail fast in quick_pit_calibrate()
      1daa1cfb
    • L
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7b732169
      Linus Torvalds 提交于
      Pull timer fixes from Thomas Gleixner:
       "This update from the timer departement contains:
      
         - A series of patches which address a shortcoming in the tick
           broadcast code.
      
           If the broadcast device is not available or an hrtimer emulated
           broadcast device, some of the original assumptions lead to boot
           failures.  I rather plugged all of the corner cases instead of only
           addressing the issue reported, so the change got a little larger.
      
           Has been extensivly tested on x86 and arm.
      
         - Get rid of the last holdouts using do_posix_clock_monotonic_gettime()
      
         - A regression fix for the imx clocksource driver
      
         - An update to the new state callbacks mechanism for clockevents.
           This is required to simplify the conversion, which will take place
           in 4.3"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tick/broadcast: Prevent NULL pointer dereference
        time: Get rid of do_posix_clock_monotonic_gettime
        cris: Replace do_posix_clock_monotonic_gettime()
        tick/broadcast: Unbreak CONFIG_GENERIC_CLOCKEVENTS=n build
        tick/broadcast: Handle spurious interrupts gracefully
        tick/broadcast: Check for hrtimer broadcast active early
        tick/broadcast: Return busy when IPI is pending
        tick/broadcast: Return busy if periodic mode and hrtimer broadcast
        tick/broadcast: Move the check for periodic mode inside state handling
        tick/broadcast: Prevent deep idle if no broadcast device available
        tick/broadcast: Make idle check independent from mode and config
        tick/broadcast: Sanity check the shutdown of the local clock_event
        tick/broadcast: Prevent hrtimer recursion
        clockevents: Allow set-state callbacks to be optional
        clocksource/imx: Define clocksource for mx27
      7b732169
    • L
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c4bc680c
      Linus Torvalds 提交于
      Pull irq fix from Thomas Gleixner:
       "A single fix for a cpu hotplug race vs. interrupt descriptors:
      
        Prevent irq setup/teardown across the cpu starting/dying parts of cpu
        hotplug so that the starting/dying cpu has a stable view of the
        descriptor space.  This has been an issue for all architectures in the
        cpu dying phase, where interrupts are migrated away from the dying
        cpu.  In the starting phase its mostly a x86 issue vs the vector space
        update"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        hotplug: Prevent alloc/free of irq descriptors during cpu up/down
      c4bc680c
  3. 12 7月, 2015 11 次提交
    • A
      freeing unlinked file indefinitely delayed · 75a6f82a
      Al Viro 提交于
      	Normally opening a file, unlinking it and then closing will have
      the inode freed upon close() (provided that it's not otherwise busy and
      has no remaining links, of course).  However, there's one case where that
      does *not* happen.  Namely, if you open it by fhandle with cold dcache,
      then unlink() and close().
      
      	In normal case you get d_delete() in unlink(2) notice that dentry
      is busy and unhash it; on the final dput() it will be forcibly evicted from
      dcache, triggering iput() and inode removal.  In this case, though, we end
      up with *two* dentries - disconnected (created by open-by-fhandle) and
      regular one (used by unlink()).  The latter will have its reference to inode
      dropped just fine, but the former will not - it's considered hashed (it
      is on the ->s_anon list), so it will stay around until the memory pressure
      will finally do it in.  As the result, we have the final iput() delayed
      indefinitely.  It's trivial to reproduce -
      
      void flush_dcache(void)
      {
              system("mount -o remount,rw /");
      }
      
      static char buf[20 * 1024 * 1024];
      
      main()
      {
              int fd;
              union {
                      struct file_handle f;
                      char buf[MAX_HANDLE_SZ];
              } x;
              int m;
      
              x.f.handle_bytes = sizeof(x);
              chdir("/root");
              mkdir("foo", 0700);
              fd = open("foo/bar", O_CREAT | O_RDWR, 0600);
              close(fd);
              name_to_handle_at(AT_FDCWD, "foo/bar", &x.f, &m, 0);
              flush_dcache();
              fd = open_by_handle_at(AT_FDCWD, &x.f, O_RDWR);
              unlink("foo/bar");
              write(fd, buf, sizeof(buf));
              system("df .");			/* 20Mb eaten */
              close(fd);
              system("df .");			/* should've freed those 20Mb */
              flush_dcache();
              system("df .");			/* should be the same as #2 */
      }
      
      will spit out something like
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 303843      1131 100% /
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 303843      1131 100% /
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 283282     21692  93% /
      - inode gets freed only when dentry is finally evicted (here we trigger
      than by remount; normally it would've happened in response to memory
      pressure hell knows when).
      
      Cc: stable@vger.kernel.org # v2.6.38+; earlier ones need s/kill_it/unhash_it/
      Acked-by: NJ. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      75a6f82a
    • A
      fix a braino in ovl_d_select_inode() · 9391dd00
      Al Viro 提交于
      when opening a directory we want the overlayfs inode, not one from
      the topmost layer.
      Reported-By: NAndrey Jr. Melnikov <temnota.am@gmail.com>
      Tested-By: NAndrey Jr. Melnikov <temnota.am@gmail.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      9391dd00
    • A
      9p: don't leave a half-initialized inode sitting around · 0a73d0a2
      Al Viro 提交于
      Cc: stable@vger.kernel.org # all branches
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      0a73d0a2
    • L
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm · 59c3cb55
      Linus Torvalds 提交于
      Pull libnvdimm fixes from Dan Williams:
       "1) Fixes for a handful of smatch reports (Thanks Dan C.!) and minor
           bug fixes (patches 1-6)
      
        2) Correctness fixes to the BLK-mode nvdimm driver (patches 7-10).
      
           Granted these are slightly large for a -rc update.  They have been
           out for review in one form or another since the end of May and were
           deferred from the merge window while we settled on the "PMEM API"
           for the PMEM-mode nvdimm driver (ie memremap_pmem, memcpy_to_pmem,
           and wmb_pmem).
      
           Now that those apis are merged we implement them in the BLK driver
           to guarantee that mmio aperture moves stay ordered with respect to
           incoming read/write requests, and that writes are flushed through
           those mmio-windows and platform-buffers to be persistent on media.
      
        These pass the sub-system unit tests with the updates to
        tools/testing/nvdimm, and have received a successful build-report from
        the kbuild robot (468 configs).
      
        With acks from Rafael for the touches to drivers/acpi/"
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm:
        nfit: add support for NVDIMM "latch" flag
        nfit: update block I/O path to use PMEM API
        tools/testing/nvdimm: add mock acpi_nfit_flush_address entries to nfit_test
        tools/testing/nvdimm: fix return code for unimplemented commands
        tools/testing/nvdimm: mock ioremap_wt
        pmem: add maintainer for include/linux/pmem.h
        nfit: fix smatch "use after null check" report
        nvdimm: Fix return value of nvdimm_bus_init() if class_create() fails
        libnvdimm: smatch cleanups in __nd_ioctl
        sparse: fix misplaced __pmem definition
      59c3cb55
    • L
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · e4925198
      Linus Torvalds 提交于
      Pull i2c fixes from Wolfram Sang:
       "Mostly slight adjusments for new drivers, but also one core fix for
        which finally the dependencies are now available as well"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: Mark instantiated device nodes with OF_POPULATE
        i2c: jz4780: Fix return value if probe fails
        i2c: xgene-slimpro: Fix missing mbox_free_channel call in probe error path
        i2c: I2C_MT65XX should depend on HAS_DMA
      e4925198
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 8a7b8ff4
      Linus Torvalds 提交于
      Pull input fixes from Dmitry Torokhov:
       "A fix (revert) for a recent regression in Synaptics driver and a fix
        for Elan i2c touchpad driver"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Revert "Input: synaptics - allocate 3 slots to keep stability in image sensors"
        Input: elan_i2c - change the hover event from MT to ST
      8a7b8ff4
    • L
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 4322f028
      Linus Torvalds 提交于
      Pull clk fixes from Stephen Boyd:
       "A small set of fixes for problems found by smatch in new drivers that
        we added this rc and a handful of driver fixes that came in during the
        merge window"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        drivers: clk: st: Incorrect register offset used for lock_status
        clk: mediatek: mt8173: Fix enabling of critical clocks
        drivers: clk: st: Fix mux bit-setting for Cortex A9 clocks
        drivers: clk: st: Add CLK_GET_RATE_NOCACHE flag to clocks
        drivers: clk: st: Fix flexgen lock init
        drivers: clk: st: Fix FSYN channel values
        drivers: clk: st: Remove unused code
        clk: qcom: Use parent rate when set rate to pixel RCG clock
        clk: at91: do not leak resources
        clk: stm32: Fix out-by-one error path in the index lookup
        clk: iproc: fix bit manipulation arithmetic
        clk: iproc: fix memory leak from clock name
      4322f028
    • L
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 9cb1680c
      Linus Torvalds 提交于
      Pull drm fixes from Dave Airlie:
       "A bunch of fixes for radeon, intel, omap and one amdkfd fix.
      
        Radeon fixes are all over, but it does fix some cursor corruption
        across suspend/resume.  i915 should fix the second warn you were
        seeing, so let us know if not.  omap is a bunch of small fixes"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (28 commits)
        drm/radeon: disable vce init on cayman (v2)
        drm/amdgpu: fix timeout calculation
        drm/radeon: check if BO_VA is set before adding it to the invalidation list
        drm/radeon: allways add the VM clear duplicate
        Revert "Revert "drm/radeon: dont switch vt on suspend""
        drm/radeon: Fold radeon_set_cursor() into radeon_show_cursor()
        drm/radeon: unpin cursor BOs on suspend and pin them again on resume (v2)
        drm/radeon: Clean up reference counting and pinning of the cursor BOs
        drm/amdkfd: validate pdd where it acquired first
        Revert "drm/i915: Allocate context objects from stolen"
        drm/i915: Declare the swizzling unknown for L-shaped configurations
        drm/radeon: fix underflow in r600_cp_dispatch_texture()
        drm/radeon: default to 2048 MB GART size on SI+
        drm/radeon: fix HDP flushing
        drm/radeon: use RCU query for GEM_BUSY syscall
        drm/amdgpu: Handle irqs only based on irq ring, not irq status regs.
        drm/radeon: Handle irqs only based on irq ring, not irq status regs.
        drm/i915: Use crtc_state->active in primary check_plane func
        drm/i915: Check crtc->active in intel_crtc_disable_planes
        drm/i915: Restore all GGTT VMAs on resume
        ...
      9cb1680c
    • L
      Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 2278cb0b
      Linus Torvalds 提交于
      Pull selinux fixes from James Morris.
      
      * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        selinux: fix mprotect PROT_EXEC regression caused by mm change
        selinux: don't waste ebitmap space when importing NetLabel categories
      2278cb0b
    • L
      Merge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 31b7a57c
      Linus Torvalds 提交于
      Pull btrfs fixes from Chris Mason:
       "This is an assortment of fixes.  Most of the commits are from Filipe
        (fsync, the inode allocation cache and a few others).  Mark kicked in
        a series fixing corners in the extent sharing ioctls, and everyone
        else fixed up on assorted other problems"
      
      * 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        Btrfs: fix wrong check for btrfs_force_chunk_alloc()
        Btrfs: fix warning of bytes_may_use
        Btrfs: fix hang when failing to submit bio of directIO
        Btrfs: fix a comment in inode.c:evict_inode_truncate_pages()
        Btrfs: fix memory corruption on failure to submit bio for direct IO
        btrfs: don't update mtime/ctime on deduped inodes
        btrfs: allow dedupe of same inode
        btrfs: fix deadlock with extent-same and readpage
        btrfs: pass unaligned length to btrfs_cmp_data()
        Btrfs: fix fsync after truncate when no_holes feature is enabled
        Btrfs: fix fsync xattr loss in the fast fsync path
        Btrfs: fix fsync data loss after append write
        Btrfs: fix crash on close_ctree() if cleaner starts new transaction
        Btrfs: fix race between caching kthread and returning inode to inode cache
        Btrfs: use kmem_cache_free when freeing entry in inode cache
        Btrfs: fix race between balance and unused block group deletion
        btrfs: add error handling for scrub_workers_get()
        btrfs: cleanup noused initialization of dev in btrfs_end_bio()
        btrfs: qgroup: allow user to clear the limitation on qgroup
      31b7a57c
    • L
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 84e3e9d0
      Linus Torvalds 提交于
      Pull ARM SoC fixes from Kevin Hilman:
       "A fairly random colletion of fixes based on -rc1 for OMAP, sunxi and
        prima2 as well as a few arm64-specific DT fixes.
      
        This series also includes a late to support a new Allwinner (sunxi)
        SoC, but since it's rather simple and isolated to the
        platform-specific code, it's included it for this -rc"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arm64: dts: add device tree for ARM SMM-A53x2 on LogicTile Express 20MG
        arm: dts: vexpress: add missing CCI PMU device node to TC2
        arm: dts: vexpress: describe all PMUs in TC2 dts
        GICv3: Add ITS entry to THUNDER dts
        arm64: dts: Add poweroff button device node for APM X-Gene platform
        ARM: dts: am4372.dtsi: disable rfbi
        ARM: dts: am57xx-beagle-x15: Provide supply for usb2_phy2
        ARM: dts: am4372: Add emif node
        Revert "ARM: dts: am335x-boneblack: disable RTC-only sleep"
        ARM: sunxi: Enable simplefb in the defconfig
        ARM: Remove deprecated symbol from defconfig files
        ARM: sunxi: Add Machine support for A33
        ARM: sunxi: Introduce Allwinner H3 support
        Documentation: sunxi: Update Allwinner SoC documentation
        ARM: prima2: move to use REGMAP APIs for rtciobrg
        ARM: dts: atlas7: add pinctrl and gpio descriptions
        ARM: OMAP2+: Remove unnessary return statement from the void function, omap2_show_dma_caps
        memory: omap-gpmc: Fix parsing of devices
      84e3e9d0
  4. 11 7月, 2015 6 次提交
    • T
      tick/broadcast: Prevent NULL pointer dereference · c4d029f2
      Thomas Gleixner 提交于
      Dan reported that the recent changes to the broadcast code introduced
      a potential NULL dereference.
      
      Add the proper check.
      
      Fixes: e0454311 "tick/broadcast: Sanity check the shutdown of the local clock_event"
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      c4d029f2
    • L
      Merge branch 'parisc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · b9243b5a
      Linus Torvalds 提交于
      Pull parisc fixes from Helge Deller:
       "We have one important patch from Dave Anglin and myself which fixes
        PTE/TLB race conditions which caused random segmentation faults on our
        debian buildd servers, and one patch from Alex Ivanov which speeds up
        the graphical text console on the STI framebuffer driver"
      
      * 'parisc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Fix some PTE/TLB race conditions and optimize __flush_tlb_range based on timing results
        stifb: Implement hardware accelerated copyarea
      b9243b5a
    • J
    • S
      selinux: fix mprotect PROT_EXEC regression caused by mm change · 892e8cac
      Stephen Smalley 提交于
      commit 66fc1303 ("mm: shmem_zero_setup
      skip security check and lockdep conflict with XFS") caused a regression
      for SELinux by disabling any SELinux checking of mprotect PROT_EXEC on
      shared anonymous mappings.  However, even before that regression, the
      checking on such mprotect PROT_EXEC calls was inconsistent with the
      checking on a mmap PROT_EXEC call for a shared anonymous mapping.  On a
      mmap, the security hook is passed a NULL file and knows it is dealing
      with an anonymous mapping and therefore applies an execmem check and no
      file checks.  On a mprotect, the security hook is passed a vma with a
      non-NULL vm_file (as this was set from the internally-created shmem
      file during mmap) and therefore applies the file-based execute check
      and no execmem check.  Since the aforementioned commit now marks the
      shmem zero inode with the S_PRIVATE flag, the file checks are disabled
      and we have no checking at all on mprotect PROT_EXEC.  Add a test to
      the mprotect hook logic for such private inodes, and apply an execmem
      check in that case.  This makes the mmap and mprotect checking
      consistent for shared anonymous mappings, as well as for /dev/zero and
      ashmem.
      
      Cc: <stable@vger.kernel.org> # 4.1.x
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: NPaul Moore <pmoore@redhat.com>
      892e8cac
    • L
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 1604f871
      Linus Torvalds 提交于
      Pull arm64 fixes and clean-up from Catalin Marinas:
       - ACPI fix when checking the validity of the GICC MADT subtable
       - handle debug exceptions in the el*_inv exception entries
       - remove pointless register assignment in two compat syscall wrappers
       - unnecessary include path
       - defconfig update
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: entry32: remove pointless register assignment
        arm64: entry: handle debug exceptions in el*_inv
        arm64: Keep the ARM64 Kconfig selects sorted
        ACPI / ARM64 : use the new BAD_MADT_GICC_ENTRY macro
        ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro
        arm64: defconfig: Add Ceva ahci to the defconfig
        arm64: remove another unnecessary libfdt include path
      1604f871
    • J
      parisc: Fix some PTE/TLB race conditions and optimize __flush_tlb_range based on timing results · 01ab6057
      John David Anglin 提交于
      The increased use of pdtlb/pitlb instructions seemed to increase the
      frequency of random segmentation faults building packages. Further, we
      had a number of cases where TLB inserts would repeatedly fail and all
      forward progress would stop. The Haskell ghc package caused a lot of
      trouble in this area. The final indication of a race in pte handling was
      this syslog entry on sibaris (C8000):
      
       swap_free: Unused swap offset entry 00000004
       BUG: Bad page map in process mysqld  pte:00000100 pmd:019bbec5
       addr:00000000ec464000 vm_flags:00100073 anon_vma:0000000221023828 mapping: (null) index:ec464
       CPU: 1 PID: 9176 Comm: mysqld Not tainted 4.0.0-2-parisc64-smp #1 Debian 4.0.5-1
       Backtrace:
        [<0000000040173eb0>] show_stack+0x20/0x38
        [<0000000040444424>] dump_stack+0x9c/0x110
        [<00000000402a0d38>] print_bad_pte+0x1a8/0x278
        [<00000000402a28b8>] unmap_single_vma+0x3d8/0x770
        [<00000000402a4090>] zap_page_range+0xf0/0x198
        [<00000000402ba2a4>] SyS_madvise+0x404/0x8c0
      
      Note that the pte value is 0 except for the accessed bit 0x100. This bit
      shouldn't be set without the present bit.
      
      It should be noted that the madvise system call is probably a trigger for many
      of the random segmentation faults.
      
      In looking at the kernel code, I found the following problems:
      
      1) The pte_clear define didn't take TLB lock when clearing a pte.
      2) We didn't test pte present bit inside lock in exception support.
      3) The pte and tlb locks needed to merged in order to ensure consistency
      between page table and TLB. This also has the effect of serializing TLB
      broadcasts on SMP systems.
      
      The attached change implements the above and a few other tweaks to try
      to improve performance. Based on the timing code, TLB purges are very
      slow (e.g., ~ 209 cycles per page on rp3440). Thus, I think it
      beneficial to test the split_tlb variable to avoid duplicate purges.
      Probably, all PA 2.0 machines have combined TLBs.
      
      I dropped using __flush_tlb_range in flush_tlb_mm as I realized all
      applications and most threads have a stack size that is too large to
      make this useful. I added some comments to this effect.
      
      Since implementing 1 through 3, I haven't had any random segmentation
      faults on mx3210 (rp3440) in about one week of building code and running
      as a Debian buildd.
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org # v3.18+
      Signed-off-by: NHelge Deller <deller@gmx.de>
      01ab6057