1. 14 1月, 2014 2 次提交
  2. 06 11月, 2013 5 次提交
  3. 29 10月, 2013 1 次提交
  4. 22 10月, 2013 1 次提交
  5. 18 10月, 2013 1 次提交
    • C
      drm: Pad drm_mode_get_connector to 64-bit boundary · bc5bd37c
      Chris Wilson 提交于
      Pavel Roskin reported that DRM_IOCTL_MODE_GETCONNECTOR was overwritting
      the 4 bytes beyond the end of its structure with a 32-bit userspace
      running on a 64-bit kernel. This is due to the padding gcc inserts as
      the drm_mode_get_connector struct includes a u64 and its size is not a
      natural multiple of u64s.
      
      64-bit kernel:
      
      sizeof(drm_mode_get_connector)=80, alignof=8
      sizeof(drm_mode_get_encoder)=20, alignof=4
      sizeof(drm_mode_modeinfo)=68, alignof=4
      
      32-bit userspace:
      
      sizeof(drm_mode_get_connector)=76, alignof=4
      sizeof(drm_mode_get_encoder)=20, alignof=4
      sizeof(drm_mode_modeinfo)=68, alignof=4
      
      Fortuituously we can insert explicit padding to the tail of our
      structures without breaking ABI.
      Reported-by: NPavel Roskin <proski@gnu.org>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: dri-devel@lists.freedesktop.org
      Cc: stable@vger.kernel.org
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      bc5bd37c
  6. 03 10月, 2013 1 次提交
  7. 21 9月, 2013 1 次提交
  8. 20 9月, 2013 2 次提交
    • P
      perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page' · fa731587
      Peter Zijlstra 提交于
      Solve the problems around the broken definition of perf_event_mmap_page::
      cap_usr_time and cap_usr_rdpmc fields which used to overlap, partially
      fixed by:
      
        860f085b ("perf: Fix broken union in 'struct perf_event_mmap_page'")
      
      The problem with the fix (merged in v3.12-rc1 and not yet released
      officially), noticed by Vince Weaver is that the new behavior is
      not detectable by new user-space, and that due to the reuse of the
      field names it's easy to mis-compile a binary if old headers are used
      on a new kernel or new headers are used on an old kernel.
      
      To solve all that make this change explicit, detectable and self-contained,
      by iterating the ABI the following way:
      
       - Always clear bit 0, and rename it to usrpage->cap_bit0, to at least not
         confuse old user-space binaries. RDPMC will be marked as unavailable
         to old binaries but that's within the ABI, this is a capability bit.
      
       - Rename bit 1 to ->cap_bit0_is_deprecated and always set it to 1, so new
         libraries can reliably detect that bit 0 is deprecated and perma-zero
         without having to check the kernel version.
      
       - Use bits 2, 3, 4 for the newly defined, correct functionality:
      
      	cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
      	cap_user_time		: 1, /* The time_* fields are used */
      	cap_user_time_zero	: 1, /* The time_zero field is used */
      
       - Rename all the bitfield names in perf_event.h to be different from the
         old names, to make sure it's not possible to mis-compile it
         accidentally with old assumptions.
      
      The 'size' field can then be used in the future to add new fields and it
      will act as a natural ABI version indicator as well.
      
      Also adjust tools/perf/ userspace for the new definitions, noticed by
      Adrian Hunter.
      Reported-by: NVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Also-Fixed-by: NAdrian Hunter <adrian.hunter@intel.com>
      Link: http://lkml.kernel.org/n/tip-zr03yxjrpXesOzzupszqglbv@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fa731587
    • P
      perf: Update ABI comment · c5ecceef
      Peter Zijlstra 提交于
      For some mysterious reason the sample_id field of PERF_RECORD_MMAP went AWOL.
      Reported-by: NVince Weaver <vince@deater.net>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c5ecceef
  9. 18 9月, 2013 1 次提交
  10. 11 9月, 2013 1 次提交
    • G
      fs: bump inode and dentry counters to long · 3942c07c
      Glauber Costa 提交于
      This series reworks our current object cache shrinking infrastructure in
      two main ways:
      
       * Noticing that a lot of users copy and paste their own version of LRU
         lists for objects, we put some effort in providing a generic version.
         It is modeled after the filesystem users: dentries, inodes, and xfs
         (for various tasks), but we expect that other users could benefit in
         the near future with little or no modification.  Let us know if you
         have any issues.
      
       * The underlying list_lru being proposed automatically and
         transparently keeps the elements in per-node lists, and is able to
         manipulate the node lists individually.  Given this infrastructure, we
         are able to modify the up-to-now hammer called shrink_slab to proceed
         with node-reclaim instead of always searching memory from all over like
         it has been doing.
      
      Per-node lru lists are also expected to lead to less contention in the lru
      locks on multi-node scans, since we are now no longer fighting for a
      global lock.  The locks usually disappear from the profilers with this
      change.
      
      Although we have no official benchmarks for this version - be our guest to
      independently evaluate this - earlier versions of this series were
      performance tested (details at
      http://permalink.gmane.org/gmane.linux.kernel.mm/100537) yielding no
      visible performance regressions while yielding a better qualitative
      behavior in NUMA machines.
      
      With this infrastructure in place, we can use the list_lru entry point to
      provide memcg isolation and per-memcg targeted reclaim.  Historically,
      those two pieces of work have been posted together.  This version presents
      only the infrastructure work, deferring the memcg work for a later time,
      so we can focus on getting this part tested.  You can see more about the
      history of such work at http://lwn.net/Articles/552769/
      
      Dave Chinner (18):
        dcache: convert dentry_stat.nr_unused to per-cpu counters
        dentry: move to per-sb LRU locks
        dcache: remove dentries from LRU before putting on dispose list
        mm: new shrinker API
        shrinker: convert superblock shrinkers to new API
        list: add a new LRU list type
        inode: convert inode lru list to generic lru list code.
        dcache: convert to use new lru list infrastructure
        list_lru: per-node list infrastructure
        shrinker: add node awareness
        fs: convert inode and dentry shrinking to be node aware
        xfs: convert buftarg LRU to generic code
        xfs: rework buffer dispose list tracking
        xfs: convert dquot cache lru to list_lru
        fs: convert fs shrinkers to new scan/count API
        drivers: convert shrinkers to new count/scan API
        shrinker: convert remaining shrinkers to count/scan API
        shrinker: Kill old ->shrink API.
      
      Glauber Costa (7):
        fs: bump inode and dentry counters to long
        super: fix calculation of shrinkable objects for small numbers
        list_lru: per-node API
        vmscan: per-node deferred work
        i915: bail out earlier when shrinker cannot acquire mutex
        hugepage: convert huge zero page shrinker to new shrinker API
        list_lru: dynamically adjust node arrays
      
      This patch:
      
      There are situations in very large machines in which we can have a large
      quantity of dirty inodes, unused dentries, etc.  This is particularly true
      when umounting a filesystem, where eventually since every live object will
      eventually be discarded.
      
      Dave Chinner reported a problem with this while experimenting with the
      shrinker revamp patchset.  So we believe it is time for a change.  This
      patch just moves int to longs.  Machines where it matters should have a
      big long anyway.
      Signed-off-by: NGlauber Costa <glommer@openvz.org>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      3942c07c
  11. 09 9月, 2013 3 次提交
  12. 08 9月, 2013 2 次提交
    • D
      Input: evdev - add EVIOCREVOKE ioctl · c7dc6573
      David Herrmann 提交于
      If we have multiple sessions on a system, we normally don't want
      background sessions to read input events. Otherwise, it could capture
      passwords and more entered by the user on the foreground session. This is
      a real world problem as the recent XMir development showed:
        http://mjg59.dreamwidth.org/27327.html
      
      We currently rely on sessions to release input devices when being
      deactivated. This relies on trust across sessions. But that's not given on
      usual systems. We therefore need a way to control which processes have
      access to input devices.
      
      With VTs the kernel simply routed them through the active /dev/ttyX. This
      is not possible with evdev devices, though. Moreover, we want to avoid
      routing input-devices through some dispatcher-daemon in userspace (which
      would add some latency).
      
      This patch introduces EVIOCREVOKE. If called on an evdev fd, this revokes
      device-access irrecoverably for that *single* open-file. Hence, once you
      call EVIOCREVOKE on any dup()ed fd, all fds for that open-file will be
      rather useless now (but still valid compared to close()!). This allows us
      to pass fds directly to session-processes from a trusted source. The
      source keeps a dup()ed fd and revokes access once the session-process is
      no longer active.
      Compared to the EVIOCMUTE proposal, we can avoid the CAP_SYS_ADMIN
      restriction now as there is no way to revive the fd again. Hence, a user
      is free to call EVIOCREVOKE themself to kill the fd.
      
      Additionally, this ioctl allows multi-layer access-control (again compared
      to EVIOCMUTE which was limited to one layer via CAP_SYS_ADMIN). A middle
      layer can simply request a new open-file from the layer above and pass it
      to the layer below. Now each layer can call EVIOCREVOKE on the fds to
      revoke access for all layers below, at the expense of one fd per layer.
      
      There's already ongoing experimental user-space work which demonstrates
      how it can be used:
        http://lists.freedesktop.org/archives/systemd-devel/2013-August/012897.htmlSigned-off-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      c7dc6573
    • L
      Revert "Input: introduce BTN/ABS bits for drums and guitars" · b04c99e3
      Linus Torvalds 提交于
      This reverts commits 61e00655, 73f8645d and 8e22ecb6:
        "Input: introduce BTN/ABS bits for drums and guitars"
        "HID: wiimote: add support for Guitar-Hero drums"
        "HID: wiimote: add support for Guitar-Hero guitars"
      
      The extra new ABS_xx values resulted in ABS_MAX no longer being a
      power-of-two, which broke the comparison logic.  It also caused the
      ioctl numbers to overflow into the next byte, causing problems for that.
      
      We'll try again for 3.13.
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Acked-by: NDavid Herrmann <dh.herrmann@gmail.com>
      Acked-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: Benjamin Tissoires <benjamin.tissoires@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b04c99e3
  13. 06 9月, 2013 1 次提交
    • M
      dm: add statistics support · fd2ed4d2
      Mikulas Patocka 提交于
      Support the collection of I/O statistics on user-defined regions of
      a DM device.  If no regions are defined no statistics are collected so
      there isn't any performance impact.  Only bio-based DM devices are
      currently supported.
      
      Each user-defined region specifies a starting sector, length and step.
      Individual statistics will be collected for each step-sized area within
      the range specified.
      
      The I/O statistics counters for each step-sized area of a region are
      in the same format as /sys/block/*/stat or /proc/diskstats but extra
      counters (12 and 13) are provided: total time spent reading and
      writing in milliseconds.  All these counters may be accessed by sending
      the @stats_print message to the appropriate DM device via dmsetup.
      
      The creation of DM statistics will allocate memory via kmalloc or
      fallback to using vmalloc space.  At most, 1/4 of the overall system
      memory may be allocated by DM statistics.  The admin can see how much
      memory is used by reading
      /sys/module/dm_mod/parameters/stats_current_allocated_bytes
      
      See Documentation/device-mapper/statistics.txt for more details.
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
      fd2ed4d2
  14. 05 9月, 2013 2 次提交
    • A
      vfio-pci: PCI hot reset interface · 8b27ee60
      Alex Williamson 提交于
      The current VFIO_DEVICE_RESET interface only maps to PCI use cases
      where we can isolate the reset to the individual PCI function.  This
      means the device must support FLR (PCIe or AF), PM reset on D3hot->D0
      transition, device specific reset, or be a singleton device on a bus
      for a secondary bus reset.  FLR does not have widespread support,
      PM reset is not very reliable, and bus topology is dictated by the
      system and device design.  We need to provide a means for a user to
      induce a bus reset in cases where the existing mechanisms are not
      available or not reliable.
      
      This device specific extension to VFIO provides the user with this
      ability.  Two new ioctls are introduced:
       - VFIO_DEVICE_PCI_GET_HOT_RESET_INFO
       - VFIO_DEVICE_PCI_HOT_RESET
      
      The first provides the user with information about the extent of
      devices affected by a hot reset.  This is essentially a list of
      devices and the IOMMU groups they belong to.  The user may then
      initiate a hot reset by calling the second ioctl.  We must be
      careful that the user has ownership of all the affected devices
      found via the first ioctl, so the second ioctl takes a list of file
      descriptors for the VFIO groups affected by the reset.  Each group
      must have IOMMU protection established for the ioctl to succeed.
      Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
      8b27ee60
    • C
      net: sync some IP headers with glibc · cfd280c9
      Carlos O'Donell 提交于
      Solution:
      =========
      
      - Synchronize linux's `include/uapi/linux/in6.h'
        with glibc's `inet/netinet/in.h'.
      - Synchronize glibc's `inet/netinet/in.h with linux's
        `include/uapi/linux/in6.h'.
      - Allow including the headers in either other.
      - First header included defines the structures and macros.
      
      Details:
      ========
      
      The kernel promises not to break the UAPI ABI so I don't
      see why we can't just have the two userspace headers
      coordinate?
      
      If you include the kernel headers first you get those,
      and if you include the glibc headers first you get those,
      and the following patch arranges a coordination and
      synchronization between the two.
      
      Let's handle `include/uapi/linux/in6.h' from linux,
      and `inet/netinet/in.h' from glibc and ensure they compile
      in any order and preserve the required ABI.
      
      These two patches pass the following compile tests:
      
      cat >> test1.c <<EOF
      int main (void) {
        return 0;
      }
      EOF
      gcc -c test1.c
      
      cat >> test2.c <<EOF
      int main (void) {
        return 0;
      }
      EOF
      gcc -c test2.c
      
      One wrinkle is that the kernel has a different name for one of
      the members in ipv6_mreq. In the kernel patch we create a macro
      to cover the uses of the old name, and while that's not entirely
      clean it's one of the best solutions (aside from an anonymous
      union which has other issues).
      
      I've reviewed the code and it looks to me like the ABI is
      assured and everything matches on both sides.
      
      Notes:
      - You want netinet/in.h to include bits/in.h as early as possible,
        but it needs in_addr so define in_addr early.
      - You want bits/in.h included as early as possible so you can use
        the linux specific code to define __USE_KERNEL_DEFS based on
        the _UAPI_* macro definition and use those to cull in.h.
      - glibc was missing IPPROTO_MH, added here.
      
      Compile tested and inspected.
      Reported-by: NThomas Backlund <tmb@mageia.org>
      Cc: Thomas Backlund <tmb@mageia.org>
      Cc: libc-alpha@sourceware.org
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Cc: David S. Miller <davem@davemloft.net>
      Tested-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NCarlos O'Donell <carlos@redhat.com>
      Signed-off-by: NCong Wang <amwang@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cfd280c9
  15. 04 9月, 2013 5 次提交
  16. 03 9月, 2013 1 次提交
  17. 02 9月, 2013 3 次提交
  18. 01 9月, 2013 3 次提交
  19. 31 8月, 2013 1 次提交
    • T
      drm/radeon/si: Add support for CP DMA to CS checker for compute v2 · e5b9e750
      Tom Stellard 提交于
      Also add a new RADEON_INFO query to check that CP DMA packets are
      supported on the compute ring.
      
      CP DMA has been supported since the 3.8 kernel, but due to an oversight
      we forgot to teach the CS checker that the CP DMA packet was legal for
      the compute ring on Southern Islands GPUs.
      
      This patch fixes a bug where the radeon driver will incorrectly reject a legal
      CP DMA packet from user space.  I would like to have the patch
      backported to stable so that we don't have to require Mesa users to use a
      bleeding edge kernel in order to take advantage of this feature which
      is already present in the stable kernels (3.8 and newer).
      
      v2:
        - Don't bump kms version, so this patch can be backported to stable
          kernels.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NTom Stellard <thomas.stellard@amd.com>
      Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
      e5b9e750
  20. 30 8月, 2013 3 次提交
    • E
      pkt_sched: fq: Fair Queue packet scheduler · afe4fd06
      Eric Dumazet 提交于
      - Uses perfect flow match (not stochastic hash like SFQ/FQ_codel)
      - Uses the new_flow/old_flow separation from FQ_codel
      - New flows get an initial credit allowing IW10 without added delay.
      - Special FIFO queue for high prio packets (no need for PRIO + FQ)
      - Uses a hash table of RB trees to locate the flows at enqueue() time
      - Smart on demand gc (at enqueue() time, RB tree lookup evicts old
        unused flows)
      - Dynamic memory allocations.
      - Designed to allow millions of concurrent flows per Qdisc.
      - Small memory footprint : ~8K per Qdisc, and 104 bytes per flow.
      - Single high resolution timer for throttled flows (if any).
      - One RB tree to link throttled flows.
      - Ability to have a max rate per flow. We might add a socket option
        to add per socket limitation.
      
      Attempts have been made to add TCP pacing in TCP stack, but this
      seems to add complex code to an already complex stack.
      
      TCP pacing is welcomed for flows having idle times, as the cwnd
      permits TCP stack to queue a possibly large number of packets.
      
      This removes the 'slow start after idle' choice, hitting badly
      large BDP flows, and applications delivering chunks of data
      as video streams.
      
      Nicely spaced packets :
      Here interface is 10Gbit, but flow bottleneck is ~20Mbit
      
      cwin is big, yet FQ avoids the typical bursts generated by TCP
      (as in netperf TCP_RR -- -r 100000,100000)
      
      15:01:23.545279 IP A > B: . 78193:81089(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.545394 IP B > A: . ack 81089 win 3668 <nop,nop,timestamp 11597985 1115>
      15:01:23.546488 IP A > B: . 81089:83985(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.546565 IP B > A: . ack 83985 win 3668 <nop,nop,timestamp 11597986 1115>
      15:01:23.547713 IP A > B: . 83985:86881(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.547778 IP B > A: . ack 86881 win 3668 <nop,nop,timestamp 11597987 1115>
      15:01:23.548911 IP A > B: . 86881:89777(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.548949 IP B > A: . ack 89777 win 3668 <nop,nop,timestamp 11597988 1115>
      15:01:23.550116 IP A > B: . 89777:92673(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.550182 IP B > A: . ack 92673 win 3668 <nop,nop,timestamp 11597989 1115>
      15:01:23.551333 IP A > B: . 92673:95569(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.551406 IP B > A: . ack 95569 win 3668 <nop,nop,timestamp 11597991 1115>
      15:01:23.552539 IP A > B: . 95569:98465(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.552576 IP B > A: . ack 98465 win 3668 <nop,nop,timestamp 11597992 1115>
      15:01:23.553756 IP A > B: . 98465:99913(1448) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.554138 IP A > B: P 99913:100001(88) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
      15:01:23.554204 IP B > A: . ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
      15:01:23.554234 IP B > A: . 65248:68144(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
      15:01:23.555620 IP B > A: . 68144:71040(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
      15:01:23.557005 IP B > A: . 71040:73936(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
      15:01:23.558390 IP B > A: . 73936:76832(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
      15:01:23.559773 IP B > A: . 76832:79728(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
      15:01:23.561158 IP B > A: . 79728:82624(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.562543 IP B > A: . 82624:85520(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.563928 IP B > A: . 85520:88416(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.565313 IP B > A: . 88416:91312(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.566698 IP B > A: . 91312:94208(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.568083 IP B > A: . 94208:97104(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.569467 IP B > A: . 97104:100000(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.570852 IP B > A: . 100000:102896(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.572237 IP B > A: . 102896:105792(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.573639 IP B > A: . 105792:108688(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.575024 IP B > A: . 108688:111584(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.576408 IP B > A: . 111584:114480(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      15:01:23.577793 IP B > A: . 114480:117376(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
      
      TCP timestamps show that most packets from B were queued in the same ms
      timeframe (TSval 1159799{3,4}), but FQ managed to send them right
      in time to avoid a big burst.
      
      In slow start or steady state, very few packets are throttled [1]
      
      FQ gets a bunch of tunables as :
      
        limit : max number of packets on whole Qdisc (default 10000)
      
        flow_limit : max number of packets per flow (default 100)
      
        quantum : the credit per RR round (default is 2 MTU)
      
        initial_quantum : initial credit for new flows (default is 10 MTU)
      
        maxrate : max per flow rate (default : unlimited)
      
        buckets : number of RB trees (default : 1024) in hash table.
                     (consumes 8 bytes per bucket)
      
        [no]pacing : disable/enable pacing (default is enable)
      
      All of them can be changed on a live qdisc.
      
      $ tc qd add dev eth0 root fq help
      Usage: ... fq [ limit PACKETS ] [ flow_limit PACKETS ]
                    [ quantum BYTES ] [ initial_quantum BYTES ]
                    [ maxrate RATE  ] [ buckets NUMBER ]
                    [ [no]pacing ]
      
      $ tc -s -d qd
      qdisc fq 8002: dev eth0 root refcnt 32 limit 10000p flow_limit 100p buckets 256 quantum 3028 initial_quantum 15140
       Sent 216532416 bytes 148395 pkt (dropped 0, overlimits 0 requeues 14)
       backlog 0b 0p requeues 14
        511 flows, 511 inactive, 0 throttled
        110 gc, 0 highprio, 0 retrans, 1143 throttled, 0 flows_plimit
      
      [1] Except if initial srtt is overestimated, as if using
      cached srtt in tcp metrics. We'll provide a fix for this issue.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      afe4fd06
    • K
      drm: Advertise async page flip ability through GETCAP ioctl · 62f2104f
      Keith Packard 提交于
      Let applications know whether the kernel supports asynchronous page
      flipping.
      Signed-off-by: NKeith Packard <keithp@keithp.com>
      Signed-off-by: NDave Airlie <airlied@gmail.com>
      62f2104f
    • K
      drm: Add DRM_MODE_PAGE_FLIP_ASYNC flag definition · 9bba0c42
      Keith Packard 提交于
      This requests that the driver perform the page flip as soon as
      possible, not necessarily waiting for vblank.
      Signed-off-by: NKeith Packard <keithp@keithp.com>
      Signed-off-by: NDave Airlie <airlied@gmail.com>
      9bba0c42