1. 26 Sep, 2013 1 commit
  2. 25 Sep, 2013 13 commits
  3. 23 Sep, 2013 4 commits
    • dm: add reserved_bio_based_ios module parameter · e8603136
      Mike Snitzer committed
      Allow user to change the number of IOs that are reserved by
      bio-based DM's mempools by writing to this file:
      /sys/module/dm_mod/parameters/reserved_bio_based_ios
      
      The default value is RESERVED_BIO_BASED_IOS (16).  The maximum allowed
      value is RESERVED_MAX_IOS (1024).
      
      Export dm_get_reserved_bio_based_ios() for use by DM targets and core
      code.  Switch to sizing dm-io's mempool and bioset using DM core's
      configurable 'reserved_bio_based_ios'.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
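The default and maximum described in this commit amount to a clamp on whatever value the user writes to the parameter file. The following is a minimal sketch of that rule, not the kernel's implementation: the helper name `clamp_reserved_ios` is hypothetical, and treating a written value of 0 as "use the default" is an assumption, since the commit message only states the default (16) and the maximum (1024).

```c
/* Values quoted in the commit message. */
#define RESERVED_BIO_BASED_IOS 16u
#define RESERVED_MAX_IOS       1024u

/*
 * Hypothetical helper: map the raw value written by the user to the
 * value that would actually be used for sizing the mempools.
 * Assumption: 0 means "fall back to the default".
 */
static unsigned clamp_reserved_ios(unsigned requested, unsigned def, unsigned max)
{
    if (requested == 0)
        return def;      /* unset: use the default */
    if (requested > max)
        return max;      /* too large: cap at the maximum */
    return requested;    /* in range: use as written */
}
```

Under these assumptions, writing 4096 to /sys/module/dm_mod/parameters/reserved_bio_based_ios would result in 1024 reserved IOs, while writing 64 would be used unchanged.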
    • dm: add reserved_rq_based_ios module parameter · f4790826
      Mike Snitzer committed
      Allow user to change the number of IOs that are reserved by
      request-based DM's mempools by writing to this file:
      /sys/module/dm_mod/parameters/reserved_rq_based_ios
      
      The default value is RESERVED_REQUEST_BASED_IOS (256).  The maximum
      allowed value is RESERVED_MAX_IOS (1024).
      
      Export dm_get_reserved_rq_based_ios() for use by DM targets and core
      code.  Switch to sizing dm-mpath's mempool using DM core's configurable
      'reserved_rq_based_ios'.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
      Acked-by: Mikulas Patocka <mpatocka@redhat.com>
    • dm: lower bio-based mempool reservation · 6cfa5857
      Mike Snitzer committed
      Bio-based device mapper processing doesn't need larger mempools (like
      request-based DM does), so lower the number of reserved entries for
      bio-based operation.  16 was already used for bio-based DM's bioset
      but mistakenly wasn't used for its _io_cache.
      
      Formalize difference between bio-based and request-based defaults by
      introducing RESERVED_BIO_BASED_IOS and RESERVED_REQUEST_BASED_IOS.
      
      (based on older code from Mikulas Patocka)
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
      Acked-by: Mikulas Patocka <mpatocka@redhat.com>
    • dm thin: do not expose non-zero discard limits if discards disabled · b60ab990
      Mike Snitzer committed
      Fix issue where the block layer would stack the discard limits of the
      pool's data device even if the "ignore_discard" pool feature was
      specified.
      
      The pool and thin device(s) still had discards disabled because the
      QUEUE_FLAG_DISCARD request_queue flag wasn't set.  But to avoid user
      confusion when "ignore_discard" is used, both the pool device and the
      thin device(s) now report zeroes for all discard limits.
      
      Also, always set discard_zeroes_data_unsupported in targets because they
      should never advertise the 'discard_zeroes_data' capability (even if the
      pool's data device supports it).
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
  4. 21 Sep, 2013 8 commits
  5. 20 Sep, 2013 10 commits
    • dm mpath: disable WRITE SAME if it fails · f84cb8a4
      Mike Snitzer committed
      Work around the SCSI layer's problematic WRITE SAME heuristics by
      disabling WRITE SAME in the DM multipath device's queue_limits if an
      underlying device disabled it.
      
      The WRITE SAME heuristics, with both the original commit 5db44863
      ("[SCSI] sd: Implement support for WRITE SAME") and the updated commit
      66c28f97 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling
      WRITE SAME(10) even without successfully determining it is supported.
      After the first failed WRITE SAME the SCSI layer will disable WRITE SAME
      for the device (by setting sdkp->device->no_write_same, which results in
      'max_write_same_sectors' in the device's queue_limits being set to 0).
      
      When a device is stacked on top of such a SCSI device any changes to that
      SCSI device's queue_limits do not automatically propagate up the stack.
      As such, a DM multipath device will not have its WRITE SAME support
      disabled.  This causes the block layer to continue to issue WRITE SAME
      requests to the mpath device which causes paths to fail and (if mpath IO
      isn't configured to queue when no paths are available) it will result in
      actual IO errors to the upper layers.
      
      This fix doesn't help configurations that have additional devices
      stacked on top of the mpath device (e.g. LVM created linear DM devices
      on top).  A proper fix that restacks all the queue_limits from the bottom
      of the device stack up will need to be explored if SCSI will continue to
      use this model of optimistically allowing op codes and then disabling
      them after they fail for the first time.
      
      Before this patch:
      
      EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
      device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
      device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121
      end_request: critical target error, dev dm-6, sector 528
      dm-6: WRITE SAME failed. Manually zeroing.
      device-mapper: multipath: Failing path 8:112.
      end_request: I/O error, dev dm-6, sector 4616
      dm-6: WRITE SAME failed. Manually zeroing.
      end_request: I/O error, dev dm-6, sector 4616
      end_request: I/O error, dev dm-6, sector 5640
      end_request: I/O error, dev dm-6, sector 6664
      end_request: I/O error, dev dm-6, sector 7688
      end_request: I/O error, dev dm-6, sector 524288
      Buffer I/O error on device dm-6, logical block 65536
      lost page write due to I/O error on dm-6
      JBD2: Error -5 detected when updating journal superblock for dm-6-8.
      end_request: I/O error, dev dm-6, sector 524296
      Aborting journal on device dm-6-8.
      end_request: I/O error, dev dm-6, sector 524288
      Buffer I/O error on device dm-6, logical block 65536
      lost page write due to I/O error on dm-6
      JBD2: Error -5 detected when updating journal superblock for dm-6-8.
      
      # cat /sys/block/sdh/queue/write_same_max_bytes
      0
      # cat /sys/block/dm-6/queue/write_same_max_bytes
      33553920
      
      After this patch:
      
      EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
      device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
      device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121
      end_request: critical target error, dev dm-6, sector 528
      dm-6: WRITE SAME failed. Manually zeroing.
      
      # cat /sys/block/sdh/queue/write_same_max_bytes
      0
      # cat /sys/block/dm-6/queue/write_same_max_bytes
      0
      
      It should be noted that WRITE SAME support wasn't enabled in DM
      multipath until v3.10.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: stable@vger.kernel.org # 3.10+
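The stale-limit problem this commit describes can be sketched in user space as follows. This is an illustration under stated assumptions, not the dm-mpath code: `struct limits_sketch` and both function names are hypothetical stand-ins, and the only behaviour taken from the commit message is that limits are copied upward once, do not propagate afterwards, and are zeroed on the multipath device after a WRITE SAME failure on an underlying device that has disabled it.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for the block layer's queue_limits. */
struct limits_sketch {
    unsigned max_write_same_sectors;
};

/* Stacking copies the lower device's limit once; later changes do not follow. */
static void stack_limits_once(struct limits_sketch *upper,
                              const struct limits_sketch *lower)
{
    upper->max_write_same_sectors = lower->max_write_same_sectors;
}

/*
 * Called when a WRITE SAME request completes with an error.  If the lower
 * device has disabled WRITE SAME in the meantime, mirror that in the
 * stacked device.  Returns true if the upper limit was changed.
 */
static bool disable_write_same_if_failed(struct limits_sketch *upper,
                                         const struct limits_sketch *lower,
                                         int error)
{
    if (error != 0 &&
        lower->max_write_same_sectors == 0 &&
        upper->max_write_same_sectors != 0) {
        upper->max_write_same_sectors = 0;
        return true;
    }
    return false;
}
```

With a starting limit of 65535 sectors (33553920 bytes, the value shown for dm-6 in the "before" output), the stacked device keeps advertising WRITE SAME after the lower device drops to 0 until the first failed request triggers the check.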
    • dm-snapshot: fix performance degradation due to small hash size · 60e356f3
      Mikulas Patocka committed
      LVM2, since version 2.02.96, creates origin with zero size, then loads
      the snapshot driver and then loads the origin.  Consequently, the
      snapshot driver sees the origin size zero and sets the hash size to the
      lower bound 64.  Such a small hash table causes performance degradation.
      
      This patch changes it so that the hash size is determined by the size of
      the snapshot volume, not the minimum of the origin and snapshot sizes.  It doesn't
      make sense to set the snapshot size significantly larger than the origin
      size, so we do not need to take origin size into account when
      calculating the hash size.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
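The effect of a zero-sized origin on the hash size can be shown with a small sketch. It assumes one hash bucket per chunk, computed by shifting the size in sectors by a chunk shift; that formula, the function names, and the omission of any upper bound or power-of-two rounding are simplifications for illustration. Only the lower bound of 64 and the switch from "minimum of origin and snapshot size" to "snapshot size" come from the commit message.

```c
#include <stdint.h>

#define MIN_HASH_BUCKETS 64u  /* lower bound quoted in the commit message */

/* Assumed sizing rule: one bucket per chunk, never fewer than the minimum. */
static uint64_t clamp_buckets(uint64_t sectors, unsigned chunk_shift)
{
    uint64_t buckets = sectors >> chunk_shift;
    return buckets < MIN_HASH_BUCKETS ? MIN_HASH_BUCKETS : buckets;
}

/* Old behaviour: size the table from the smaller of origin and snapshot. */
static uint64_t hash_size_old(uint64_t origin_sectors, uint64_t cow_sectors,
                              unsigned chunk_shift)
{
    uint64_t smaller = origin_sectors < cow_sectors ? origin_sectors : cow_sectors;
    return clamp_buckets(smaller, chunk_shift);
}

/* New behaviour: size the table from the snapshot volume alone. */
static uint64_t hash_size_new(uint64_t cow_sectors, unsigned chunk_shift)
{
    return clamp_buckets(cow_sectors, chunk_shift);
}
```

Under these assumptions, a snapshot volume of 2097152 sectors with a chunk shift of 4 gets 64 buckets when the origin still reports size zero, and 131072 buckets once the origin size is ignored.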
    • dm snapshot: workaround for a false positive lockdep warning · 5ea330a7
      Mikulas Patocka committed
      The kernel reports a lockdep warning if a snapshot is invalidated because
      it runs out of space.
      
      The lockdep warning was triggered by commit 0976dfc1
      ("workqueue: Catch more locking problems with flush_work()") in v3.5.
      
      The warning is a false positive.  The real cause for the warning is that
      the lockdep engine treats different instances of md->lock as a single
      lock.
      
      This patch is a workaround - we use flush_workqueue instead of flush_work.
      This code path is not performance sensitive (it is called only on
      initialization or invalidation), thus it doesn't matter that we flush the
      whole workqueue.
      
      The real fix for the problem would be to teach the lockdep engine to treat
      different instances of md->lock as separate locks.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Acked-by: Alasdair G Kergon <agk@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org # 3.5+
    • cpufreq: return EEXIST instead of EBUSY for second registering · 4dea5806
      Yinghai Lu committed
      On systems that support intel_pstate, acpi_cpufreq fails to load, and
      udev keeps retrying until the trace fills up and the kernel crashes.
      
      The root cause is that the driver passes on the return value of
      cpufreq_register_driver(): when another driver has already taken over,
      that call returns EBUSY and udev keeps retrying.
      
      cpufreq_register_driver() should return EEXIST instead so that the
      system can boot without appending intel_pstate=disable and still use
      intel_pstate.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
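The change in error code can be illustrated with a single-slot registry. This is a sketch only: `struct driver_sketch`, `active_driver` and `register_driver_sketch` are hypothetical names, and the real cpufreq_register_driver() does considerably more. What it shows is the one rule taken from the commit message, namely that a second registration attempt reports EEXIST rather than EBUSY.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a cpufreq driver description. */
struct driver_sketch {
    const char *name;
};

/* Only one driver may be active at a time; NULL until one registers. */
static const struct driver_sketch *active_driver;

/* Returns 0 on success or a negative errno value. */
static int register_driver_sketch(const struct driver_sketch *drv)
{
    if (drv == NULL)
        return -EINVAL;
    if (active_driver != NULL)
        return -EEXIST;  /* a driver is already registered (was -EBUSY) */
    active_driver = drv;
    return 0;
}
```

In this sketch, registering a first driver succeeds and a later attempt by a second driver fails with -EEXIST while the first stays active.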
    • Revert "drm: mark context support as a legacy subsystem" · c21eb21c
      Dave Airlie committed
      This reverts commit 7c510133.
      
      It looks like not enough digging was done: libdrm_nouveau before 2.4.33
      used contexts. Commit
      
      292da616fe1f936ca78a3fa8e1b1b19883e343b6 nouveau: pull in major libdrm rewrite
      
      got rid of them.
      Reported-by: Paul Zimmerman <Paul.Zimmerman@synopsys.com>
      Reported-by: Mikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • PCI / ACPI / PM: Clear pme_poll for devices in D3cold on wakeup · 83414515
      Rafael J. Wysocki committed
      Commit 448bd857 (PCI/PM: add PCIe runtime D3cold support) added a
      piece of code to pci_acpi_wake_dev() causing that function to behave
      in a special way for devices in D3cold (so that their configuration
      registers are not accessed before those devices are resumed).
      However, it didn't take the clearing of the pme_poll flag into
      account.  That has to be done for all devices, even if they are in
      D3cold, or pci_pme_list_scan() will not know that wakeup has been
      signaled for the device and will poll its PME Status bit
      unnecessarily.
      
      Fix the problem by moving the clearing of the pme_poll flag in
      pci_acpi_wake_dev() before the code introduced by commit 448bd857.
      Reported-and-tested-by: David E. Box <david.e.box@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: 3.6+ <stable@vger.kernel.org> # 3.6+
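The ordering fix described above can be sketched with a reduced device structure. The struct, its fields other than `pme_poll`, and the function name are hypothetical; the sketch only models the point from the commit message that the `pme_poll` flag is cleared before the early return taken for devices in D3cold, so that no device is left being polled.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of the state touched on a wakeup event. */
struct pci_dev_sketch {
    bool pme_poll;           /* device is on the PME polling list */
    bool in_d3cold;          /* configuration registers are not accessible */
    bool config_space_read;  /* set if the handler touched config space */
    bool resume_requested;   /* set if the handler asked for a resume */
};

static void wake_dev_sketch(struct pci_dev_sketch *dev)
{
    /* Clear the polling flag first, for every device. */
    dev->pme_poll = false;

    if (dev->in_d3cold) {
        /* D3cold: skip config space access and just request a resume. */
        dev->resume_requested = true;
        return;
    }

    /* Other states: config space may be read, e.g. to check PME status. */
    dev->config_space_read = true;
    dev->resume_requested = true;
}
```

If the flag were cleared after the D3cold branch instead, a device in D3cold would return early with `pme_poll` still set, which is the unnecessary polling the commit removes.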
    • netconsole: fix a deadlock with rtnl and netconsole's mutex · c71380ff
      Nikolay Aleksandrov committed
      This bug was introduced by commit 7a163bfb ("netconsole: avoid a crash
      with multiple sysfs writers"). In store_enabled() we have the following
      sequence: acquire nt->mutex then rtnl, but in the netconsole netdev
      notifier we have rtnl then nt->mutex, effectively leading to a deadlock.
      The NULL pointer dereference that the above commit tries to fix is
      actually due to another bug in netpoll_cleanup(). This is fixed by dropping
      the mutex from the netdev notifier as it's already protected by rtnl.
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • skge: fix broken driver · c194992c
      Mikulas Patocka committed
      The patch 136d8f37 broke the skge driver.
      Note this part of the patch:
      +               if (skge_rx_setup(skge, e, nskb, skge->rx_buf_size) < 0) {
      +                       dev_kfree_skb(nskb);
      +                       goto resubmit;
      +               }
      +
                      pci_unmap_single(skge->hw->pdev,
                                       dma_unmap_addr(e, mapaddr),
                                       dma_unmap_len(e, maplen),
                                       PCI_DMA_FROMDEVICE);
                      skb = e->skb;
                      prefetch(skb->data);
      -               skge_rx_setup(skge, e, nskb, skge->rx_buf_size);
      
      The function skge_rx_setup modifies e->skb to point to the new skb. Thus,
      after this change, the new buffer, not the old, is returned to the
      networking stack.
      
      This bug is present in kernels 3.11, 3.11.1 and 3.12-rc1. The patch should
      be queued for 3.11-stable.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Reported-by: Mikulas Patocka <mpatocka@redhat.com>
      Reported-by: Vasiliy Glazov <vascom2@gmail.com>
      Tested-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
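The ordering bug described in this commit can be reproduced with a small model. All types and function names below are hypothetical; the sketch relies only on the fact stated in the message that the setup function overwrites the ring element's buffer pointer. Saving the old pointer before calling setup is one possible way to avoid the problem, shown here for illustration rather than as the exact change made in the driver.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a socket buffer and a receive ring element. */
struct buf_sketch {
    int id;
};

struct ring_elem_sketch {
    struct buf_sketch *skb;
};

/* Attaches a fresh buffer to the element, overwriting e->skb. */
static int rx_setup_sketch(struct ring_elem_sketch *e, struct buf_sketch *nskb)
{
    e->skb = nskb;
    return 0;
}

/* Buggy order: e->skb is read after setup, so the new buffer is returned. */
static struct buf_sketch *rx_buggy(struct ring_elem_sketch *e,
                                   struct buf_sketch *nskb)
{
    if (rx_setup_sketch(e, nskb) < 0)
        return NULL;
    return e->skb;
}

/* Safe order: remember the filled buffer before setup replaces it. */
static struct buf_sketch *rx_fixed(struct ring_elem_sketch *e,
                                   struct buf_sketch *nskb)
{
    struct buf_sketch *received = e->skb;

    if (rx_setup_sketch(e, nskb) < 0)
        return NULL;
    return received;
}
```

In the buggy variant the caller hands the empty replacement buffer to the networking stack; in the safe variant it hands over the buffer that actually holds the received data, and the ring element ends up pointing at the replacement.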
    • ip: generate unique IP identificator if local fragmentation is allowed · 703133de
      Ansis Atteka committed
      If local fragmentation is allowed, then ip_select_ident() and
      ip_select_ident_more() need to generate unique IDs to ensure
      correct defragmentation on the peer.
      
      For example, if IPsec (tunnel mode) has to encrypt large skbs
      that have the local_df bit set, then all IP fragments that belonged
      to different ESP datagrams would have used the same identifier.
      If one of these IP fragments were lost or reordered, the peer
      could stitch together IP fragments that did not belong to the
      same datagram, leading to packet loss or data corruption.
      Signed-off-by: Ansis Atteka <aatteka@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
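The rule this commit introduces can be sketched as a decision on two flags. This is a deliberately simplified model: the function name, the boolean parameters, the constant ID of 0 for packets that can never be fragmented, and the global counter are all assumptions for illustration. Only the rule that a unique ID is generated whenever local fragmentation is allowed comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical global counter standing in for a real ID generator. */
static uint16_t next_ident = 1;

/*
 * Pick the IP identification value for an outgoing packet.
 * df_set:   the Don't Fragment bit is set in the header.
 * local_df: local fragmentation is allowed despite DF.
 */
static uint16_t select_ident_sketch(bool df_set, bool local_df)
{
    /* Assumed: a packet that can never be fragmented may reuse one ID. */
    if (df_set && !local_df)
        return 0;

    /* The packet may end up fragmented, so it needs its own ID. */
    return next_ident++;
}
```

Without the `local_df` check, two datagrams that are later fragmented locally would share an ID, and the peer could not tell their fragments apart.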
    • xen-netback: Don't destroy the netdev until the vif is shut down · 279f438e
      Paul Durrant committed
      Without this patch, if a frontend cycles through states Closing
      and Closed (which Windows frontends need to do) then the netdev
      will be destroyed and requires re-invocation of hotplug scripts
      to restore state before the frontend can move to Connected. Thus
      when udev is not in use the backend gets stuck in InitWait.
      
      With this patch, the netdev is left alone whilst the backend is
      still online and is only de-registered and freed just prior to
      destroying the vif (which is also nicely symmetrical with the
      netdev allocation and registration being done during probe) so
      no re-invocation of hotplug scripts is required.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 19 Sep, 2013 4 commits