1. 20 Aug, 2013 1 commit
  2. 15 Aug, 2013 1 commit
    • net_sched: restore "linklayer atm" handling · 8a8e3d84
      Committed by Jesper Dangaard Brouer
      commit 56b765b7 ("htb: improved accuracy at high rates")
      broke the "linklayer atm" handling.
      
       tc class add ... htb rate X ceil Y linklayer atm
      
      The linklayer setting is implemented by modifying the rate table
      that is sent to the kernel.  No direct parameter was
      transferred to the kernel to indicate the linklayer setting.
      
      The commit 56b765b7 ("htb: improved accuracy at high rates")
      removed the use of the rate table system.
      
      To stay compatible with older iproute2 utils, this patch detects
      the linklayer by parsing the rate table.  It also allows future
      versions of iproute2 to send the linklayer parameter to the
      kernel directly. This is done by using the __reserved field in
      struct tc_ratespec to convey the chosen linklayer option,
      using only the lower 4 bits of this field.
      
      Linklayer detection is limited to speeds below 100Mbit/s, because
      at high rates the rtab gets too inaccurate: so inaccurate that
      several fields contain the same values, thus resembling what the ATM
      detection looks for.  Fields even start to contain "0" time to send,
      e.g. at 1000Mbit/s sending a 96-byte packet costs "0", so the rtab
      has been more broken than we first realized.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
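The "lower 4 bits" convention from the commit message above can be sketched as follows. This is a minimal illustration, not the kernel's code: the enum names and values are assumptions modelled on the description, and `atm_wire_size` only shows the well-known ATM framing (48 payload bytes per 53-byte cell) that makes neighbouring rate-table slots share a value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed names and values for illustration; check your kernel headers. */
enum linklayer {
    LINKLAYER_UNAWARE  = 0, /* old iproute2: nothing sent, kernel must detect */
    LINKLAYER_ETHERNET = 1,
    LINKLAYER_ATM      = 2
};

#define LINKLAYER_MASK 0x0F /* only the lower 4 bits carry the linklayer */

/* Extract the linklayer option from the byte formerly called __reserved. */
static enum linklayer linklayer_from_reserved(uint8_t reserved)
{
    return (enum linklayer)(reserved & LINKLAYER_MASK);
}

/* Bytes on the wire for an ATM link: each 48-byte payload chunk travels
 * in a 53-byte cell, so packet sizes round up to whole cells. */
static unsigned int atm_wire_size(unsigned int len)
{
    assert(len <= 65535u); /* keep the arithmetic far from overflow */
    return (len + 47u) / 48u * 53u;
}
```

Because every packet size within one cell maps to the same wire size, an ATM rate table contains runs of identical transmit times, which is the pattern the detection relies on.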
  3. 14 Aug, 2013 2 commits
    • ipv6: make unsolicited report intervals configurable for mld · fc4eba58
      Committed by Hannes Frederic Sowa
      Commits cab70040 ("net: igmp: Reduce Unsolicited report interval to
      1s when using IGMPv3") and 2690048c ("net: igmp: Allow user-space
      configuration of igmp unsolicited report interval") by William Manley made
      igmp unsolicited report intervals configurable per interface and corrected
      the resend interval of unsolicited igmpv3 report messages to 1s.

      The same needs to be done for IPv6:
      
      MLDv1 (RFC2710 7.10.): 10 seconds
      MLDv2 (RFC3810 9.11.): 1 second
      
      Both intervals are configurable via new procfs knobs
      mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval.
      
      (also added .force_mld_version to ipv6_devconf_dflt to bring structs in
      line without semantic changes)
      
      v2:
      a) Joined documentation update for IPv4 and IPv6 MLD/IGMP
         unsolicited_report_interval procfs knobs.
      b) incorporate stylistic feedback from William Manley
      
      v3:
      a) add new DEVCONF_* values to the end of the enum (thanks to David
         Miller)
      
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: William Manley <william.manley@youview.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pptp: fix byte order warnings · ebd8b934
      Committed by stephen hemminger
      Pptp driver has lots of byte order warnings from sparse.
      This was because the on-the-wire header is in network byte order (obviously)
      but the definition did not reflect that.
      
      Also, the address structure to user space actually put the call id
      in host order. Rather than break ABI compatibility, just acknowledge
      the existing design.
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 10 Aug, 2013 1 commit
  5. 09 Aug, 2013 1 commit
    • net: add SNMP counters tracking incoming ECN bits · 1f07d03e
      Committed by Eric Dumazet
      With GRO/LRO processing, there is a problem because Ip[6]InReceives SNMP
      counters do not count the number of frames, but number of aggregated
      segments.
      
      It's probably too late to change this now.
      
      This patch adds four new counters, tracking number of frames, regardless
      of LRO/GRO, and on a per ECN status basis, for IPv4 and IPv6.
      
      Ip[6]NoECTPkts : Number of packets received with NOECT
      Ip[6]ECT1Pkts  : Number of packets received with ECT(1)
      Ip[6]ECT0Pkts  : Number of packets received with ECT(0)
      Ip[6]CEPkts    : Number of packets received with Congestion Experienced
      
      lph37:~# nstat | egrep "Pkts|InReceive"
      IpInReceives                    1634137            0.0
      Ip6InReceives                   3714107            0.0
      Ip6InNoECTPkts                  19205              0.0
      Ip6InECT0Pkts                   52651828           0.0
      IpExtInNoECTPkts                33630              0.0
      IpExtInECT0Pkts                 15581379           0.0
      IpExtInCEPkts                   6                  0.0
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
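The four per-ECN counters described above map onto the two low-order bits of the IPv4 TOS / IPv6 traffic class byte, as defined in RFC 3168. The sketch below is a userspace model of that classification, assuming hypothetical names (`ecn_of`, `count_frame`, `struct ecn_counters`); it is not the kernel's accounting code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ECN codepoints from RFC 3168 (low two bits of TOS / traffic class). */
enum ecn_codepoint {
    ECN_NOT_ECT = 0, /* 00: not ECN-capable transport */
    ECN_ECT_1   = 1, /* 01: ECT(1) */
    ECN_ECT_0   = 2, /* 10: ECT(0) */
    ECN_CE      = 3  /* 11: congestion experienced */
};

/* Hypothetical counter block: one slot per codepoint. */
struct ecn_counters {
    unsigned long pkts[4];
};

/* Classify a frame by the ECN bits of its TOS / traffic class byte. */
static enum ecn_codepoint ecn_of(uint8_t tos)
{
    return (enum ecn_codepoint)(tos & 0x03);
}

/* Count one received frame, regardless of any later GRO/LRO merging. */
static void count_frame(struct ecn_counters *c, uint8_t tos)
{
    assert(c != NULL);
    c->pkts[ecn_of(tos)]++;
}
```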
  6. 04 Aug, 2013 1 commit
    • fib_rules: fix suppressor names and default values · 73f5698e
      Committed by Stefan Tomanek
      This change brings the suppressor attribute names into line; it also changes
      the data types to provide a more consistent interface.
      
      While -1 indicates that the suppressor is not enabled, values >= 0 for
      suppress_prefixlen or suppress_ifgroup  reject routing decisions violating the
      constraint.
      
      This changes the previously presented behaviour of suppress_prefixlen, where a
      prefix length _less_ than the attribute value was rejected. After this change,
      a prefix length less than *or* equal to the value is considered a violation of
      the rule constraint.
      
      It also changes the default values for default and newly added rules (disabling
      any suppression for those).
      Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
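The suppress_prefixlen rule after this change can be modelled in a few lines. This is a sketch under the semantics stated in the message above (-1 disables the suppressor; a route whose prefix length is less than or equal to the value is rejected); the function name is hypothetical, and treating every negative value as "disabled" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when a routing decision should be suppressed.
 * route_prefixlen:    prefix length of the matched route (0..128)
 * suppress_prefixlen: rule attribute; negative means "not enabled" */
static bool route_suppressed(int route_prefixlen, int suppress_prefixlen)
{
    assert(route_prefixlen >= 0 && route_prefixlen <= 128);
    if (suppress_prefixlen < 0)
        return false; /* suppressor disabled (the new default) */
    return route_prefixlen <= suppress_prefixlen;
}
```

With a value of 0, only the default route (/0) is hidden, while any more specific route is still used.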
  7. 03 Aug, 2013 2 commits
  8. 02 Aug, 2013 1 commit
  9. 01 Aug, 2013 1 commit
    • fib_rules: add .suppress operation · 7764a45a
      Committed by Stefan Tomanek
      This change adds a new operation to the fib_rules_ops struct; it allows the
      suppression of routing decisions if certain criteria are not met by its
      results.
      
      The first implemented constraint is a minimum prefix length added to the
      structures of routing rules. If a rule is added with a minimum prefix length
      >0, only routes meeting this threshold will be considered. Any other (more
      general) routing table entries will be ignored.
      
      When configuring a system with multiple network uplinks and default routes, it
      is often convenient to reference the main routing table multiple times while
      omitting the default route. Using this patch and a modified "ip" utility, this
      can be achieved with the following command sequence:
      
        $ ip route add table secuplink default via 10.42.23.1
      
        $ ip rule add pref 100            table main prefixlength 1
        $ ip rule add pref 150 fwmark 0xA table secuplink
      
      With this setup, packets marked 0xA will be processed by the additional routing
      table "secuplink", but only if no suitable route in the main routing table can
      be found. By using a minimal prefixlength of 1, the default route (/0) of the
      table "main" is hidden to packets processed by rule 100; packets traveling to
      destinations with more specific routing entries are processed as usual.
      Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 31 Jul, 2013 2 commits
  11. 28 Jul, 2013 2 commits
  12. 25 Jul, 2013 2 commits
    • tcp: TCP_NOTSENT_LOWAT socket option · c9bee3b7
      Committed by Eric Dumazet
      The idea of this patch is to add an optional limit on the number of
      unsent bytes in TCP sockets, to reduce usage of kernel memory.
      
      TCP receiver might announce a big window, and TCP sender autotuning
      might allow a large amount of bytes in write queue, but this has little
      performance impact if a large part of this buffering is wasted :
      
      Write queue needs to be large only to deal with large BDP, not
      necessarily to cope with scheduling delays (incoming ACKS make room
      for the application to queue more bytes)
      
      For most workloads, using a value of 128 KB or less is OK to give
      applications enough time to react to POLLOUT events
      (or to be awakened in a blocking sendmsg())
      
      This patch adds two ways to set the limit :
      
      1) Per socket option TCP_NOTSENT_LOWAT
      
      2) A sysctl (/proc/sys/net/ipv4/tcp_notsent_lowat) for sockets
      not using TCP_NOTSENT_LOWAT socket option (or setting a zero value)
      Default value being UINT_MAX (0xFFFFFFFF), meaning this has no effect.
      
      This changes poll()/select()/epoll() to report POLLOUT
      only if the number of unsent bytes is below tp->notsent_lowat
      
      Note this might increase number of sendmsg()/sendfile() calls
      when using non blocking sockets,
      and increase number of context switches for blocking sockets.
      
      Note this is not related to SO_SNDLOWAT (as SO_SNDLOWAT is
      defined as :
       Specify the minimum number of bytes in the buffer until
       the socket layer will pass the data to the protocol)
      
      Tested:
      
      netperf sessions, and watching /proc/net/protocols "memory" column for TCP
      
      With 200 concurrent netperf -t TCP_STREAM sessions, amount of kernel memory
      used by TCP buffers shrinks by ~55 % (20567 pages instead of 45458)
      
      lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat
      lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols
      TCPv6     1880      2   45458   no     208   yes  ipv6        y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y
      TCP       1696    508   45458   no     208   yes  kernel      y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y
      
      lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat
      lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols
      TCPv6     1880      2   20567   no     208   yes  ipv6        y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y
      TCP       1696    508   20567   no     208   yes  kernel      y  y  y  y  y  y  y  y  y  y  y  y  y  n  y  y  y  y  y
      
      Using 128KB has no bad effect on the throughput or cpu usage
      of a single flow, although there is an increase of context switches.
      
      A bonus is that we hold socket lock for a shorter amount
      of time and should improve latencies of ACK processing.
      
      lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat
      lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3
      OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf.
      Local       Remote      Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service
      Send Socket Recv Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand
      Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units
      Final       Final                                             %     Method %      Method
      1651584     6291456     16384  20.00   17447.90   10^6bits/s  3.13  S      -1.00  U      0.353   -1.000  usec/KB
      
       Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3':
      
                 412,514 context-switches
      
           200.034645535 seconds time elapsed
      
      lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat
      lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3
      OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf.
      Local       Remote      Local  Elapsed Throughput Throughput  Local Local  Remote Remote Local   Remote  Service
      Send Socket Recv Socket Send   Time               Units       CPU   CPU    CPU    CPU    Service Service Demand
      Size        Size        Size   (sec)                          Util  Util   Util   Util   Demand  Demand  Units
      Final       Final                                             %     Method %      Method
      1593240     6291456     16384  20.00   17321.16   10^6bits/s  3.35  S      -1.00  U      0.381   -1.000  usec/KB
      
       Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3':
      
               2,675,818 context-switches
      
           200.029651391 seconds time elapsed
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-By: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
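From an application's point of view, the per-socket limit described above is set with an ordinary setsockopt() call. The sketch below shows that call plus a small model of the writability rule from the commit message (a zero per-socket value falls back to the sysctl; POLLOUT is reported only while unsent bytes are below the limit). The helper names are hypothetical, and the fallback define of 25 is the Linux value, only used if the libc headers predate the option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#ifndef TCP_NOTSENT_LOWAT
#define TCP_NOTSENT_LOWAT 25 /* Linux value; fallback for older libc headers */
#endif

/* Set the per-socket unsent-bytes limit.
 * Returns 0 on success, -1 with errno set on failure. */
static int set_notsent_lowat(int fd, unsigned int bytes)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                      &bytes, sizeof(bytes));
}

/* A per-socket value of 0 means "not set": use the sysctl instead. */
static uint32_t effective_lowat(uint32_t sock_lowat, uint32_t sysctl_lowat)
{
    return sock_lowat ? sock_lowat : sysctl_lowat;
}

/* Model of the rule: writable only while unsent bytes are below the limit. */
static bool reports_pollout(uint32_t unsent, uint32_t sock_lowat,
                            uint32_t sysctl_lowat)
{
    return unsent < effective_lowat(sock_lowat, sysctl_lowat);
}
```

A caller would typically invoke `set_notsent_lowat(fd, 131072)` right after creating the socket, matching the 128 KB value used in the tests above, and then rely on poll() or epoll() as usual.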
    • net: sctp: trivial: update mailing list address · 91705c61
      Committed by Daniel Borkmann
      The SCTP mailing list address to send patches or questions
      to is linux-sctp@vger.kernel.org and not
      lksctp-developers@lists.sourceforge.net anymore. Therefore,
      update all occurrences.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 23 Jul, 2013 1 commit
  14. 20 Jul, 2013 1 commit
  15. 17 Jul, 2013 1 commit
  16. 16 Jul, 2013 3 commits
  17. 13 Jul, 2013 1 commit
    • Safer ABI for O_TMPFILE · bb458c64
      Committed by Al Viro
      [suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
      that will fail on old kernels in a lot more cases than what I came up with.
      And make sure O_CREAT doesn't get there...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  18. 11 Jul, 2013 2 commits
    • net: rename busy poll socket op and globals · 64b0dc51
      Committed by Eliezer Tamir
      Rename LL_SO to BUSY_POLL_SO
      Rename sysctl_net_ll_{read,poll} to sysctl_busy_{read,poll}
      Fix up users of these variables.
      Fix documentation for sysctl.
      
      A patch for the socket.7 man page will follow separately,
      because of limitations of my mail setup.
      Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dm: optimize use SRCU and RCU · 83d5e5b0
      Committed by Mikulas Patocka
      This patch removes "io_lock" and "map_lock" in struct mapped_device and
      "holders" in struct dm_table and replaces these mechanisms with
      sleepable-rcu.
      
      Previously, the code would call "dm_get_live_table" and "dm_table_put" to
      get and release table. Now, the code is changed to call "dm_get_live_table"
      and "dm_put_live_table". dm_get_live_table locks sleepable-rcu and
      dm_put_live_table unlocks it.
      
      dm_get_live_table_fast/dm_put_live_table_fast can be used instead of
      dm_get_live_table/dm_put_live_table. These *_fast functions use
      non-sleepable RCU, so the caller must not block between them.
      
      If the code changes active or inactive dm table, it must call
      dm_sync_table before destroying the old table.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
  19. 10 Jul, 2013 1 commit
    • fatfs: add FAT_IOCTL_GET_VOLUME_ID · 6e5b93ee
      Committed by Mike Lockwood
      This patch, originally from the Android kernel, adds the vfat ioctl command
      FAT_IOCTL_GET_VOLUME_ID. With this command we can get the vfat volume ID
      using the following code:
      
      	ioctl(fd, FAT_IOCTL_GET_VOLUME_ID, &volume_ID)
      
      This patch is a modified version of the patch by Mike Lockwood, with
      changes from Dmitry Pervushin, who noticed that the original patch makes some
      volume IDs ambiguous with error returns: for example, if the volume ID is
      0xFFFFFDAD, which matches -ENOIOCTLCMD, user space gets "FFFFFFFF"
      instead.
      
      So add a parameter to ioctl to get the correct volume ID.
      
      Android uses the vfat volume ID to tell SD cards apart: when a new SD
      card is inserted into the device, Android can scan the media on it and
      present the new contents.
      Signed-off-by: Bintian Wang <bintian.wang@linaro.org>
      Cc: dmitry pervushin <dpervushin@gmail.com>
      Cc: Mike Lockwood <lockwood@android.com>
      Cc: Colin Cross <ccross@android.com>
      Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Sean McNeil <sean@mcneil.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
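A fuller version of the one-line ioctl call above might look like the sketch below. It assumes the installed kernel headers provide FAT_IOCTL_GET_VOLUME_ID in <linux/msdos_fs.h> and that `path` names a file or directory on a mounted vfat volume; both helper names are hypothetical. `looks_like_errno` illustrates why the ID is passed back through a pointer: any ID with the top bit set would read as a negative number if it were returned directly.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/msdos_fs.h> /* FAT_IOCTL_GET_VOLUME_ID */

/* True if this ID would look like a negative error code when squeezed
 * into a signed 32-bit return value. */
static int looks_like_errno(uint32_t volume_id)
{
    return volume_id > (uint32_t)INT32_MAX;
}

/* Read the volume ID of the vfat filesystem that contains `path`.
 * Returns 0 and stores the ID in *id_out on success, -1 on failure. */
static int read_fat_volume_id(const char *path, uint32_t *id_out)
{
    assert(path != NULL && id_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    uint32_t id = 0;
    int rc = ioctl(fd, FAT_IOCTL_GET_VOLUME_ID, &id);
    close(fd);
    if (rc < 0)
        return -1; /* e.g. not a FAT filesystem, or kernel too old */

    *id_out = id;
    return 0;
}
```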
  20. 09 Jul, 2013 2 commits
  21. 04 Jul, 2013 1 commit
    • ptrace: add ability to get/set signal-blocked mask · 29000cae
      Committed by Andrey Vagin
      crtools uses parasite code for dumping processes.  The parasite code is
      injected into a process with the help of PTRACE_SEIZE.

      Currently crtools blocks signals from the parasite code.  If a process has
      pending signals, crtools waits while the process handles these signals.

      This method is not suitable for stopped tasks.  A stopped task can have a
      few pending signals; when we try to execute the parasite code, we
      need to drop SIGSTOP, but all other signals must remain pending, because the
      state of processes must not be changed during checkpointing.

      This patch adds two ptrace commands to set/get the signal-blocked mask.

      I think gdb can use these commands too.
      
      [akpm@linux-foundation.org: be consistent with brace layout]
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 02 Jul, 2013 1 commit
  23. 01 Jul, 2013 2 commits
  24. 29 Jun, 2013 2 commits
  25. 28 Jun, 2013 2 commits
    • ALSA: Replace the magic number 44 with const · 975cc02a
      Committed by Takashi Iwai
      The char arrays with size 44 are for the name string of
      snd_ctl_elem_id.  Define a constant and replace the raw numbers with
      it for clarity.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • drm: add hotspot support for cursors. · 4c813d4d
      Committed by Dave Airlie
      So it looks like for virtual hw cursors on QXL we need to inform
      the "hw" device what the cursor hotspot parameters are. This
      makes sense if you think the host has to draw the cursor and interpret
      clicks from it. However the current modesetting interface doesn't support
      passing the hotspot information from userspace.
      
      This implements a new cursor ioctl that takes the hotspot info as well.
      Userspace can try calling the new interface, and if it gets -ENOSYS it means
      it's on an older kernel and can just fall back.
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
  26. 27 Jun, 2013 2 commits
  27. 26 Jun, 2013 1 commit
    • ipvs: SH fallback and L4 hashing · eba3b5a7
      Committed by Alexander Frolkin
      By default the SH scheduler rejects connections that are hashed onto a
      realserver of weight 0.  This patch adds a flag to make SH choose a
      different realserver in this case, instead of rejecting the connection.
      
      The patch also adds a flag to make SH include the source port (TCP, UDP,
      SCTP) in the hash as well as the source address.  This basically allows
      for deterministic round-robin load balancing (i.e., where any director
      in a cluster of directors with identical config will send the same
      packet the same way).
      
      The flags are service flags (IP_VS_SVC_F_SCHED*) so that these options
      can be set per service.  They are set using a new option to ipvsadm.
      Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
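The two behaviours described above (optional fallback when the hashed realserver has weight 0, and optionally mixing the source port into the hash) can be modelled with a toy scheduler. Everything here is an illustrative assumption: the flag names, the `struct realserver` layout, and the multiplicative hash are stand-ins, not the kernel's SH implementation or its IP_VS_SVC_F_SCHED* values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for the per-service scheduler flags. */
#define SH_FLAG_FALLBACK 0x1u /* pick another server if the hit has weight 0 */
#define SH_FLAG_PORT     0x2u /* include the source port in the hash */

struct realserver {
    int weight; /* 0 means "do not send new connections here" */
};

/* Toy hash for illustration only; the kernel uses its own function. */
static uint32_t toy_hash(uint32_t saddr, uint16_t sport, unsigned int flags)
{
    uint32_t key = saddr;
    if (flags & SH_FLAG_PORT)
        key ^= ((uint32_t)sport << 16) | sport;
    return key * 2654435761u;
}

/* Returns the index of the chosen realserver, or -1 to reject. */
static int sh_pick(const struct realserver *rs, size_t n,
                   uint32_t saddr, uint16_t sport, unsigned int flags)
{
    assert(rs != NULL && n > 0);

    size_t start = toy_hash(saddr, sport, flags) % n;
    if (rs[start].weight > 0)
        return (int)start;
    if (!(flags & SH_FLAG_FALLBACK))
        return -1; /* default behaviour: reject the connection */

    /* Fallback: walk forward to the next server with a non-zero weight. */
    for (size_t off = 1; off < n; off++) {
        size_t i = (start + off) % n;
        if (rs[i].weight > 0)
            return (int)i;
    }
    return -1;
}
```

Because the choice depends only on the packet's source address (and port) and the shared configuration, every director with the same server list makes the same decision, which is the determinism the message refers to.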