1. 31 Aug, 2015 3 commits
    • IB/core: Find the network device matching connection parameters · 9268f72d
      Yotam Kenneth committed
      In the case of IPoIB, and maybe in other cases, the network device is
      managed by an upper-layer protocol (ULP). In order to expose this
      network device to other users of the IB device, let ULPs implement
      a callback that returns the network device matching the connection parameters.
      
      The IB device and port, together with the P_Key and the GID should
      be enough to uniquely identify the ULP net device. However, in current
      kernels there can be multiple IPoIB interfaces created with the same GID.
      Furthermore, such a configuration may be desirable to support ipvlan-like
      configurations for RDMA CM with IPoIB.  To resolve the device in these
      cases the code will also take the IP address as an additional input.
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Yotam Kenneth <yotamke@mellanox.com>
      Signed-off-by: Shachar Raindel <raindel@mellanox.com>
      Signed-off-by: Guy Shapiro <guysh@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
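The matching rule described in this commit message can be sketched in userspace C. This is only an illustration: `struct ulp_netdev` and `find_netdev()` are hypothetical names, the sketch handles IPv4 only, and it does not reproduce the kernel's actual callback signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a network interface owned by a ULP such as IPoIB. */
struct ulp_netdev {
    const char *name;
    uint8_t     port;     /* IB port number */
    uint16_t    pkey;     /* partition key (P_Key) */
    uint8_t     gid[16];  /* port GID */
    uint32_t    ipv4;     /* address configured on the interface */
};

/*
 * Return the interface matching port, P_Key and GID. Several interfaces
 * may share one GID, so the IP address acts as the final tie-breaker.
 * Returns NULL when nothing matches.
 */
static const struct ulp_netdev *
find_netdev(const struct ulp_netdev *devs, size_t n, uint8_t port,
            uint16_t pkey, const uint8_t gid[16], uint32_t ipv4)
{
    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct ulp_netdev *d = &devs[i];

        if (d->port == port && d->pkey == pkey &&
            memcmp(d->gid, gid, sizeof(d->gid)) == 0 && d->ipv4 == ipv4)
            return d;
    }
    return NULL;
}
```

With two interfaces that share port, P_Key and GID, only the IP address distinguishes them, which is the situation the commit message says the extra input is meant to resolve.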
    • IB/core: lock client data with lists_rwsem · 7c1eb45a
      Haggai Eran committed
      An ib_client callback that is called with the lists_rwsem locked only for
      read is protected from changes to the IB client lists, but not from
      ib_unregister_device() freeing its client data. This is because
      ib_unregister_device() will remove the device from the device list with
      lists_rwsem locked for write, but perform the rest of the cleanup,
      including the call to remove() without that lock.
      
      Mark client data that is undergoing de-registration with a new going_down
      flag in the client data context. Lock the client data list with lists_rwsem
      for write in addition to using the spinlock, so that functions calling the
      callback would be able to lock only lists_rwsem for read and let callbacks
      sleep.
      
      Since ib_unregister_client() now marks the client data context, no need for
      remove() to search the context again, so pass the client data directly to
      remove() callbacks.
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add rwsem to allow reading device list or client list · 5aa44bb9
      Haggai Eran committed
      Currently the RDMA subsystem's device list and client list are protected by
      a single mutex. This prevents adding user-facing APIs that iterate these
      lists, since using them may cause a deadlock. The patch attempts to solve
      this problem by adding a read-write semaphore to protect the lists. Readers
      now don't need the mutex, and are safe just by read-locking the semaphore.
      
      The ib_register_device, ib_register_client, ib_unregister_device, and
      ib_unregister_client functions are modified to lock the semaphore for write
      during their respective list modification. Also, in order to make sure
      client callbacks are called only between add() and remove() calls, the code
      is changed to only add items to the lists after the add() calls and remove
      from the lists before the remove() calls.
      
      This patch attempts to solve a similar need [1] that was seen in the RoCE
      v2 patch series.
      
      [1] http://www.spinics.net/lists/linux-rdma/msg24733.html
      
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Cc: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  2. 29 Aug, 2015 17 commits
  3. 22 Aug, 2015 1 commit
    • mm: make page pfmemalloc check more robust · 2f064f34
      Michal Hocko committed
      Commit c48a11c7 ("netvm: propagate page->pfmemalloc to skb") added
      checks for page->pfmemalloc to __skb_fill_page_desc():
      
              if (page->pfmemalloc && !page->mapping)
                      skb->pfmemalloc = true;
      
      It assumes page->mapping == NULL implies that page->pfmemalloc can be
      trusted.  However, __delete_from_page_cache() can set page->mapping
      to NULL and leave the page->index value alone.  Because the two fields
      share a union, a non-zero page->index will be read as a true
      page->pfmemalloc.
      
      So the assumption is invalid if the networking code can see such a page,
      and it seems it can.  We have encountered this with an NFS over loopback
      setup when such a page is attached to a new skbuf.  There is no copying
      going on in this case, so the page confuses __skb_fill_page_desc(), which
      interprets the index as the pfmemalloc flag.  The network stack drops
      packets that have been allocated using the reserves unless they are to
      be queued on sockets handling the swapping.  The packets are dropped
      here, which leads to hangs when the NFS client waits for a response from
      the server that has been dropped and thus never arrives.
      
      The struct page is already heavily packed so rather than finding another
      hole to put it in, let's do a trick instead.  We can reuse the index
      again but define it to an impossible value (-1UL).  This is the page
      index so it should never see a value that large.  Replace all direct
      users of page->pfmemalloc by page_is_pfmemalloc which will hide this
      nastiness from unspoiled eyes.
      
      The information will get lost if somebody wants to use page->index
      obviously but that was the case before and the original code expected
      that the information should be persisted somewhere else if that is
      really needed (e.g.  what SLAB and SLUB do).
      
      [akpm@linux-foundation.org: fix blooper in slub]
      Fixes: c48a11c7 ("netvm: propagate page->pfmemalloc to skb")
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Debugged-by: Vlastimil Babka <vbabka@suse.com>
      Debugged-by: Jiri Bohac <jbohac@suse.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: <stable@vger.kernel.org>	[3.6+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
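The sentinel trick from this commit message can be sketched in a few lines of userspace C. This is a simplified model, not the kernel code: `struct page` here has only two fields, `old_pfmemalloc_check()` is a hypothetical name that approximates the old union read as "index is non-zero", and the set/clear helpers are assumed companions to the `page_is_pfmemalloc` helper named in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model: the real struct page has many more fields. */
struct page {
    void         *mapping;
    unsigned long index;   /* the old code overlaid a pfmemalloc flag here */
};

/* Old check (approximation): a stale non-zero index looks like pfmemalloc. */
static bool old_pfmemalloc_check(const struct page *page)
{
    assert(page != NULL);
    return page->index != 0 && page->mapping == NULL;
}

/* New check: only the impossible index value -1UL means pfmemalloc. */
static bool page_is_pfmemalloc(const struct page *page)
{
    assert(page != NULL);
    return page->index == -1UL;
}

/* Assumed helpers that write and clear the sentinel. */
static void set_page_pfmemalloc(struct page *page)
{
    page->index = -1UL;
}

static void clear_page_pfmemalloc(struct page *page)
{
    page->index = 0;
}
```

A page that left the page cache with `mapping == NULL` and a leftover `index` of, say, 42 trips the old check but not the new one, which is the false positive the commit describes.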
  4. 21 Aug, 2015 3 commits
    • drm/radeon: fix hotplug race at startup · 7f98ca45
      Dave Airlie committed
      We apparently get a hotplug irq before we've initialised
      modesetting:
      
      [drm] Loading R100 Microcode
      BUG: unable to handle kernel NULL pointer dereference at   (null)
      IP: [<c125f56f>] __mutex_lock_slowpath+0x23/0x91
      *pde = 00000000
      Oops: 0002 [#1]
      Modules linked in: radeon(+) drm_kms_helper ttm drm i2c_algo_bit backlight pcspkr psmouse evdev sr_mod input_leds led_class cdrom sg parport_pc parport floppy intel_agp intel_gtt lpc_ich acpi_cpufreq processor button mfd_core agpgart uhci_hcd ehci_hcd rng_core snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm usbcore usb_common i2c_i801 i2c_core snd_timer snd soundcore thermal_sys
      CPU: 0 PID: 15 Comm: kworker/0:1 Not tainted 4.2.0-rc7-00015-gbf674028 #111
      Hardware name: MicroLink                               /D850MV                         , BIOS MV85010A.86A.0067.P24.0304081124 04/08/2003
      Workqueue: events radeon_hotplug_work_func [radeon]
      task: f6ca5900 ti: f6d3e000 task.ti: f6d3e000
      EIP: 0060:[<c125f56f>] EFLAGS: 00010282 CPU: 0
      EIP is at __mutex_lock_slowpath+0x23/0x91
      EAX: 00000000 EBX: f5e900fc ECX: 00000000 EDX: fffffffe
      ESI: f6ca5900 EDI: f5e90100 EBP: f5e90000 ESP: f6d3ff0c
       DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
      CR0: 8005003b CR2: 00000000 CR3: 36f61000 CR4: 000006d0
      Stack:
       f5e90100 00000000 c103c4c1 f6d2a5a0 f5e900fc f6df394c c125f162 f8b0faca
       f6d2a5a0 c138ca00 f6df394c f7395600 c1034741 00d40000 00000000 f6d2a5a0
       c138ca00 f6d2a5b8 c138ca10 c1034b58 00000001 f6d40000 f6ca5900 f6d0c940
      Call Trace:
       [<c103c4c1>] ? dequeue_task_fair+0xa4/0xb7
       [<c125f162>] ? mutex_lock+0x9/0xa
       [<f8b0faca>] ? radeon_hotplug_work_func+0x17/0x57 [radeon]
       [<c1034741>] ? process_one_work+0xfc/0x194
       [<c1034b58>] ? worker_thread+0x18d/0x218
       [<c10349cb>] ? rescuer_thread+0x1d5/0x1d5
       [<c103742a>] ? kthread+0x7b/0x80
       [<c12601c0>] ? ret_from_kernel_thread+0x20/0x30
       [<c10373af>] ? init_completion+0x18/0x18
      Code: 42 08 e8 8e a6 dd ff c3 57 56 53 83 ec 0c 8b 35 48 f7 37 c1 8b 10 4a 74 1a 89 c3 8d 78 04 8b 40 08 89 63
      Reported-and-Tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • PCI: Don't use 64-bit bus addresses on PA-RISC · 45ea2a5f
      Bjorn Helgaas committed
      Meelis and Helge reported that 3a9ad0b4 ("PCI: Add pci_bus_addr_t")
      caused HPMCs on A500 and hangs on rp5470.
      
      PA-RISC does not set ARCH_DMA_ADDR_T_64BIT, even for 64-bit kernels, so
      prior to 3a9ad0b4, we always used 32-bit PCI addresses.  After
      3a9ad0b4, we do use 64-bit PCI addresses in 64-bit kernels, and
      apparently there's some PA-RISC problem related to them.
      
      Fixes: 3a9ad0b4 ("PCI: Add pci_bus_addr_t")
      Link: http://lkml.kernel.org/r/alpine.LRH.2.11.1507260929000.30065@math.ut.ee
      Reported-by: Meelis Roos <mroos@linux.ee>
      Reported-by: Helge Deller <deller@gmx.de>
      Tested-by: Helge Deller <deller@gmx.de>
      Based-on-idea-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      CC: stable@vger.kernel.org	# v3.19+
    • Input: gpio_keys_polled - request GPIO pin as input. · 1ae5ddb6
      Vincent Pelletier committed
      The GPIOF_IN flag was lost in commit 633a21d8 ("input: gpio_keys_polled:
      Add support for GPIO descriptors").
      
      Without this flag, the legacy code path (for non-descriptor GPIO
      declarations) would configure the GPIO as an output (0 meaning
      GPIOF_DIR_OUT | GPIOF_INIT_LOW).
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
      Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
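Why a flags value of 0 ends up requesting an output can be shown with a small decoder. The flag values below are reproduced from memory of the legacy `include/linux/gpio.h` header and should be treated as an assumption; `enum gpio_setup` and `decode_flags()` are hypothetical names used only for this sketch.

```c
#include <assert.h>

/* Legacy GPIO request flags (values assumed, see the note above). */
#define GPIOF_DIR_OUT   (0 << 0)
#define GPIOF_DIR_IN    (1 << 0)
#define GPIOF_INIT_LOW  (0 << 1)
#define GPIOF_INIT_HIGH (1 << 1)
#define GPIOF_IN        (GPIOF_DIR_IN)

/* "Output, initially low" is encoded as all bits clear. */
static_assert((GPIOF_DIR_OUT | GPIOF_INIT_LOW) == 0,
              "a zero flags word means output-low");

/* Hypothetical result type describing how a pin would be configured. */
enum gpio_setup { GPIO_OUTPUT_LOW, GPIO_OUTPUT_HIGH, GPIO_INPUT };

/* Decode a flags word the way a legacy request path would interpret it. */
static enum gpio_setup decode_flags(unsigned long flags)
{
    if (flags & GPIOF_DIR_IN)
        return GPIO_INPUT;
    return (flags & GPIOF_INIT_HIGH) ? GPIO_OUTPUT_HIGH : GPIO_OUTPUT_LOW;
}
```

Passing `GPIOF_IN` explicitly is therefore the only way for a key pin to be requested as an input in this model; omitting it silently selects output-low.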
  5. 20 Aug, 2015 5 commits
    • clocksource/imx: Fix boot with non-DT systems · be3b0f9b
      Guenter Roeck committed
      Commit 6dd74782 ("ARM: imx: move timer resources into a structure")
      moved initialization parameters into a data structure, but neglected to set
      the irq field in that data structure for non-DT boots. This causes the system
      to hang if a non-DT boot is attempted.
      
      Fixes: 6dd74782 ("ARM: imx: move timer resources into a structure")
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: http://lkml.kernel.org/r/1440066441-13930-1-git-send-email-linux@roeck-us.net
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • irqchip/crossbar: Restore set_wake functionality · 8200fe43
      Grygorii Strashko committed
      The TI crossbar irqchip doesn't provide any facility to configure the
      wakeup sources, but the conversion to hierarchical irqdomains set the
      irq_set_wake callback to irq_chip_set_wake_parent. The parent chip
      (OMAP wakeupgen) has no irq_set_wake function either so the call will
      fail with -ENOSYS. As a result the irq_set_wake() call in the resume
      path will trigger an 'Unbalanced wake disable' warning.
      
      Before the conversion the GIC irqchip was the top level irqchip and
      correctly flagged with IRQCHIP_SKIP_SET_WAKE.
      
      Restore the correct behaviour by removing the irq_set_wake callback
      from the crossbar irqchip and setting the IRQCHIP_SKIP_SET_WAKE flag,
      which lets the irq_set_irq_wake() call from the driver succeed.
      
      [ tglx: Massaged changelog ]
      
      Fixes: 783d3186 ('irqchip: crossbar: Convert dra7 crossbar...')
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: <linux@arm.linux.org.uk>
      Cc: <nsekhar@ti.com>
      Cc: <jason@lakedaemon.net>
      Cc: <balbi@ti.com>
      Cc: <linux-arm-kernel@lists.infradead.org>
      Cc: <tony@atomide.com>
      Cc: <marc.zyngier@arm.com>
      Cc: stable@vger.kernel.org # 4.1
      Link: http://lkml.kernel.org/r/1439554830-19502-7-git-send-email-grygorii.strashko@ti.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
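The failure and the fix described in this commit message can be modelled in userspace C. This is a sketch under stated assumptions: the `struct irq_chip` below is a stripped-down hypothetical model with an explicit `parent` pointer, `set_wake_parent()` and `set_irq_wake()` only approximate the roles of `irq_chip_set_wake_parent` and the core wake path, the flag's bit value is illustrative, and the `-ENXIO` fallback is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define IRQCHIP_SKIP_SET_WAKE (1u << 4)   /* bit position is illustrative */

/* Hypothetical, minimal model of an irqchip in a two-level hierarchy. */
struct irq_chip {
    unsigned int           flags;
    int                  (*irq_set_wake)(const struct irq_chip *chip,
                                         unsigned int on);
    const struct irq_chip *parent;
};

/* Forward the request to the parent; fail if the parent cannot handle it. */
static int set_wake_parent(const struct irq_chip *chip, unsigned int on)
{
    const struct irq_chip *parent = chip->parent;

    if (parent && parent->irq_set_wake)
        return parent->irq_set_wake(parent, on);
    return -ENOSYS;
}

/* Decision made for the top-level chip when a driver asks for wakeup. */
static int set_irq_wake(const struct irq_chip *chip, unsigned int on)
{
    assert(chip != NULL);

    if (chip->flags & IRQCHIP_SKIP_SET_WAKE)
        return 0;                       /* nothing to program: succeed */
    if (chip->irq_set_wake)
        return chip->irq_set_wake(chip, on);
    return -ENXIO;                      /* assumed fallback error */
}
```

In this model, a crossbar chip that forwards to a parent without a wake callback returns `-ENOSYS`, while a crossbar chip with no callback and the skip flag set returns success.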
    • irqchip/crossbar: Restore the mask on suspend behaviour · 4fd8f47e
      Grygorii Strashko committed
      The ARM GIC requires that all interrupts which are not used as a
      wakeup source have to be masked during suspend.
      
      The conversion of the crossbar irqchip to hierarchical irq domains
      failed to mark the crossbar irqchip with the IRQCHIP_MASK_ON_SUSPEND
      flag and therefore broke the suspend requirement of the GIC.
      
      Before the conversion the flags were visible because the GIC was the
      top level irqchip. After the conversion the crossbar irqchip is the
      top level irq chip whose flags are evaluated in suspend_device_irq().
      As the flag is not set the masking of the non-wakeup irqs is not
      invoked which breaks suspend.
      
      Add the IRQCHIP_MASK_ON_SUSPEND flag to the crossbar irqchip, so the
      GIC interrupts get masked properly.
      
      [ tglx: Massaged changelog ]
      
      Fixes: 783d3186 ('irqchip: crossbar: Convert dra7 crossbar...')
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: <linux@arm.linux.org.uk>
      Cc: <nsekhar@ti.com>
      Cc: <jason@lakedaemon.net>
      Cc: <balbi@ti.com>
      Cc: <linux-arm-kernel@lists.infradead.org>
      Cc: <tony@atomide.com>
      Cc: <marc.zyngier@arm.com>
      Cc: stable@vger.kernel.org # 4.1
      Link: http://lkml.kernel.org/r/1439554830-19502-6-git-send-email-grygorii.strashko@ti.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • irqchip/crossbar: Restore the irq_set_type() mechanism · e269ec42
      Grygorii Strashko committed
      The conversion of the crossbar irqchip to hierarchical irq domains
      failed to provide a mechanism to properly set the trigger type of an
      interrupt.
      
      The crossbar irq chip itself has no mechanism and therefore no
      irq_set_type() callback. The code before the conversion relayed the
      trigger configuration directly to the underlying GIC.
      
      Restore the correct behaviour by setting the crossbar irq_set_type
      callback to irq_chip_set_type_parent(). This propagates the
      set_trigger() call to the underlying GIC irqchip.
      
      [ tglx: Massaged changelog ]
      
      Fixes: 783d3186 ('irqchip: crossbar: Convert dra7 crossbar...')
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: <linux@arm.linux.org.uk>
      Cc: <nsekhar@ti.com>
      Cc: <jason@lakedaemon.net>
      Cc: <balbi@ti.com>
      Cc: <linux-arm-kernel@lists.infradead.org>
      Cc: <tony@atomide.com>
      Cc: <marc.zyngier@arm.com>
      Cc: stable@vger.kernel.org # 4.1
      Link: http://lkml.kernel.org/r/1439554830-19502-4-git-send-email-grygorii.strashko@ti.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
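Delegating the trigger configuration to the parent chip, as this commit message describes, can be sketched as follows. The structures are hypothetical simplifications of a two-level irq hierarchy; `set_type_parent()` stands in for the role of `irq_chip_set_type_parent()`, and `gic_set_type()` is an invented leaf callback that merely records the requested trigger.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct irq_data;

/* Hypothetical, minimal irqchip: only the set_type callback is modelled. */
struct irq_chip {
    int (*irq_set_type)(struct irq_data *data, unsigned int type);
};

/* Per-level interrupt state; parent_data links to the next chip down. */
struct irq_data {
    struct irq_chip *chip;
    struct irq_data *parent_data;
    unsigned int     trigger;   /* trigger type recorded by the leaf chip */
};

/* Invented leaf callback: the chip that really programs the trigger. */
static int gic_set_type(struct irq_data *data, unsigned int type)
{
    assert(data != NULL);
    data->trigger = type;
    return 0;
}

/* Relay the request one level down; fail if no parent can handle it. */
static int set_type_parent(struct irq_data *data, unsigned int type)
{
    data = data->parent_data;
    if (data && data->chip->irq_set_type)
        return data->chip->irq_set_type(data, type);
    return -ENOSYS;
}
```

Installing `set_type_parent` as the crossbar's callback means a request made on the crossbar level ends up recorded on the GIC level, which is the propagation the message describes.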
    • PCI: Tolerate hierarchies with no Root Port · b35b1df5
      Yijing Wang committed
      We should not assume any particular hardware topology.  Commit d0751b98
      ("PCI: Add dev->has_secondary_link to track downstream PCIe links") relied
      on the assumption that every PCIe hierarchy is rooted at a Root Port.  But
      we can't rely on any assumption about what hardware we will find; we just
      have to deal with the world as it is.
      
      On some platforms, PCIe devices (endpoints, switch upstream ports, etc.)
      appear directly on the root bus, and there is no Root Port in the PCI bus
      hierarchy.  For example, Meelis observed these top-level devices on a
      Sparc V245:
      
        0000:02:00.0 PCI bridge to [bus 03-0d]    Switch Upstream Port
        0001:02:00.0 PCI bridge to [bus 03]       PCIe to PCI/PCI-X Bridge
      
      These devices *look* like they have links going upstream, but there really
      are no upstream devices.
      
      In set_pcie_port_type(), we used the parent device to figure out which side
      of a switch port has a link, so if the parent device did not exist, we
      dereferenced a NULL parent pointer.
      
      Check whether the parent device exists before dereferencing it.
      
      Meelis observed this oops on Sparc V245 and T2000.  Ben Herrenschmidt says
      this is also possible on IBM PowerVM guests on PowerPC.
      
      [bhelgaas: changelog, comment]
      Link: http://lkml.kernel.org/r/alpine.LRH.2.20.1508122118210.18637@math.ut.ee
      Reported-by: Meelis Roos <mroos@linux.ee>
      Tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Yijing Wang <wangyijing@huawei.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: David S. Miller <davem@davemloft.net>
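The NULL-parent check can be illustrated with a simplified model of the link-detection logic. Everything here is hypothetical: `enum port_type`, the reduced `struct pci_dev`, and `set_port_type()` only approximate what the message says about `set_pcie_port_type()`, namely that the parent decides which side of a switch port has the link and that the parent may not exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of PCIe port types. */
enum port_type { ROOT_PORT, SWITCH_UPSTREAM, SWITCH_DOWNSTREAM, ENDPOINT };

/* Reduced device model; the real struct pci_dev is far larger. */
struct pci_dev {
    struct pci_dev *upstream_bridge;   /* NULL when on the root bus */
    enum port_type  type;
    bool            has_secondary_link;
};

/* Decide whether the link is on the secondary (downstream) side. */
static void set_port_type(struct pci_dev *pdev)
{
    assert(pdev != NULL);
    pdev->has_secondary_link = false;

    if (pdev->type == ROOT_PORT) {
        pdev->has_secondary_link = true;
    } else if (pdev->type == SWITCH_UPSTREAM ||
               pdev->type == SWITCH_DOWNSTREAM) {
        struct pci_dev *parent = pdev->upstream_bridge;

        /* Usually a parent exists, but it must not be assumed. */
        if (parent && !parent->has_secondary_link)
            pdev->has_secondary_link = true;
    }
}
```

A Switch Upstream Port sitting directly on the root bus has no parent in this model; the `parent &&` test is what keeps that case from dereferencing NULL.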
  6. 19 Aug, 2015 11 commits